This is the start of the stable review cycle for the 6.6.70 release. There are 222 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Wed, 08 Jan 2025 15:11:04 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v6.x/stable-review/patch-6.6.70-rc1.... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-6.6.y and the diffstat can be found below.
thanks,
greg k-h
------------- Pseudo-Shortlog of commits:
Greg Kroah-Hartman gregkh@linuxfoundation.org Linux 6.6.70-rc1
Yihang Li liyihang9@huawei.com scsi: hisi_sas: Remove redundant checks for automatic debugfs dump
Kashyap Desai kashyap.desai@broadcom.com RDMA/bnxt_re: Fix max SGEs for the Work Request
Paolo Abeni pabeni@redhat.com mptcp: don't always assume copied data in mptcp_cleanup_rbuf()
Paolo Abeni pabeni@redhat.com mptcp: fix recvbuffer adjust on sleeping rcvmsg
Paolo Abeni pabeni@redhat.com mptcp: fix TCP options overflow.
Seiji Nishikawa snishika@redhat.com mm: vmscan: account for free pages to prevent infinite Loop in throttle_direct_reclaim()
Alessandro Carminati acarmina@redhat.com mm/kmemleak: fix sleeping function called from invalid context at print message
Yafang Shao laoar.shao@gmail.com mm/readahead: fix large folio support in async readahead
Joshua Washington joshwash@google.com gve: guard XDP xmit NDO on existence of xdp queues
Joshua Washington joshwash@google.com gve: guard XSK operations on the existence of queues
David Hildenbrand david@redhat.com fs/proc/task_mmu: fix pagemap flags with PMD THP entries on 32bit
Biju Das biju.das.jz@bp.renesas.com drm: adv7511: Fix use-after-free in adv7533_attach_dsi()
Biju Das biju.das.jz@bp.renesas.com dt-bindings: display: adi,adv7533: Drop single lane support
Biju Das biju.das.jz@bp.renesas.com drm: adv7511: Drop dsi single lane support
Nikolay Kuratov kniv@yandex-team.ru net/sctp: Prevent autoclose integer overflow in sctp_association_init()
Pascal Hambourg pascal@plouf.fr.eu.org sky2: Add device ID 11ab:4373 for Marvell 88E8075
Evgenii Shatokhin e.shatokhin@yadro.com pinctrl: mcp23s08: Fix sleeping in atomic context due to regmap locking
Dan Carpenter dan.carpenter@linaro.org RDMA/uverbs: Prevent integer overflow issue
Kuan-Wei Chiu visitorckw@gmail.com scripts/sorttable: fix orc_sort_cmp() to maintain symmetry and transitivity
Arnd Bergmann arnd@arndb.de kcov: mark in_softirq_really() as __always_inline
Dennis Lam dennis.lamerice@gmail.com ocfs2: fix slab-use-after-free due to dangling pointer dqi_priv
Takashi Iwai tiwai@suse.de ALSA: seq: oss: Fix races at processing SysEx messages
Daniel Schaefer dhs@frame.work ALSA hda/realtek: Add quirk for Framework F111:000C
Takashi Iwai tiwai@suse.de ALSA: seq: Check UMP support for midi_version change
Shung-Hsi Yu shung-hsi.yu@suse.com Revert "bpf: support non-r10 register spill/fill to/from stack in precision tracking"
Masahiro Yamada masahiroy@kernel.org modpost: fix the missed iteration for the max bit in do_input()
Masahiro Yamada masahiroy@kernel.org modpost: fix input MODULE_DEVICE_TABLE() built for 64-bit on 32-bit host
Selvin Xavier selvin.xavier@broadcom.com RDMA/bnxt_re: Fix the max WQE size for static WQE support
Nathan Lynch nathanl@linux.ibm.com seq_buf: Make DECLARE_SEQ_BUF() usable
Leon Romanovsky leon@kernel.org ARC: build: Try to guess GCC variant of cross compiler
Uros Bizjak ubizjak@gmail.com irqchip/gic: Correct declaration of *percpu_base pointer in union gic_base
Luiz Augusto von Dentz luiz.von.dentz@intel.com Bluetooth: hci_core: Fix sleeping function called from invalid context
Daniele Palmas dnlplm@gmail.com net: usb: qmi_wwan: add Telit FE910C04 compositions
Enzo Matsumiya ematsumiya@suse.de smb: client: destroy cfid_put_wq on module exit
Namjae Jeon linkinjeon@kernel.org ksmbd: set ATTR_CTIME flags when setting mtime
Hobin Woo hobin.woo@samsung.com ksmbd: retry iterate_dir in smb2_query_dir
Anton Protopopov aspsk@isovalent.com bpf: fix potential error return
Adrian Ratiu adrian.ratiu@collabora.com sound: usb: format: don't warn that raw DSD is unsupported
Adrian Ratiu adrian.ratiu@collabora.com sound: usb: enable DSD output for ddHiFi TC44C
Vasiliy Kovalev kovalev@altlinux.org ALSA: hda/realtek: Add new alc2xx-fixup-headset-mic model
Takashi Iwai tiwai@suse.de ALSA: hda/ca0132: Use standard HD-audio quirk matching helpers
Filipe Manana fdmanana@suse.com btrfs: flush delalloc workers queue before stopping cleaner kthread during unmount
Prike Liang Prike.Liang@amd.com drm/amdkfd: Correct the migration DMA map direction
Emmanuel Grumbach emmanuel.grumbach@intel.com wifi: mac80211: wake the queues in case of failure in resume
Issam Hamdi ih@simonwunderlich.de wifi: mac80211: fix mbss changed flags corruption on 32 bit systems
Meghana Malladi m-malladi@ti.com net: ti: icssg-prueth: Fix clearing of IEP_CMP_CFG registers during iep_init
Eric Dumazet edumazet@google.com ila: serialize calls to nf_register_net_hooks()
Eric Dumazet edumazet@google.com af_packet: fix vlan_get_protocol_dgram() vs MSG_PEEK
Eric Dumazet edumazet@google.com af_packet: fix vlan_get_tci() vs MSG_PEEK
Maciej S. Szmigiero mail@maciej.szmigiero.name net: wwan: iosm: Properly check for valid exec stage in ipc_mmio_init()
Eric Dumazet edumazet@google.com net: restrict SO_REUSEPORT to inet sockets
Willem de Bruijn willemb@google.com net: reenable NETIF_F_IPV6_CSUM offload for BIG TCP packets
Liang Jie liangjie@lixiang.com net: sfc: Correct key_len for efx_tc_ct_zone_ht_params
Li Zhijian lizhijian@fujitsu.com RDMA/rtrs: Ensure 'ib_sge list' is accessible
Jinjian Song jinjian.song@fibocom.com net: wwan: t7xx: Fix FSM command timeout issue
Joe Hattori joe@pf.is.s.u-tokyo.ac.jp net: mv643xx_eth: fix an OF node reference leak
Vitalii Mordan mordan@ispras.ru eth: bcmsysport: fix call balance of priv->clk handling routines
Tanya Agarwal tanyaagarwal25699@gmail.com ALSA: usb-audio: US16x08: Initialize array before use
Antonio Pastor antonio.pastor@gmail.com net: llc: reset skb->transport_header
Pablo Neira Ayuso pablo@netfilter.org netfilter: nft_set_hash: unaligned atomic read on struct nft_set_ext
Rodrigo Vivi rodrigo.vivi@intel.com drm/i915/dg1: Fix power gate sequence.
Jianbo Liu jianbol@nvidia.com net/mlx5e: Skip restore TC rules for vport rep without loaded flag
Dragos Tatulea dtatulea@nvidia.com net/mlx5e: macsec: Maintain TX SA from encoding_sa
Shahar Shitrit shshitrit@nvidia.com net/mlx5: DR, select MSIX vector 0 for completion queue creation
Ilya Shchipletsov rabbelkin@mail.ru netrom: check buffer length before accessing it
Xiao Liang shaw.leon@gmail.com net: Fix netns for ip_tunnel_init_flow()
Ido Schimmel idosch@nvidia.com ipv4: ip_tunnel: Unmask upper DSCP bits in ip_tunnel_xmit()
Ido Schimmel idosch@nvidia.com ipv4: ip_tunnel: Unmask upper DSCP bits in ip_md_tunnel_xmit()
Ido Schimmel idosch@nvidia.com ipv4: ip_tunnel: Unmask upper DSCP bits in ip_tunnel_bind_dev()
Eric Dumazet edumazet@google.com ip_tunnel: annotate data-races around t->parms.link
Wang Liang wangliang74@huawei.com net: fix memory leak in tcp_conn_request()
Joe Hattori joe@pf.is.s.u-tokyo.ac.jp net: stmmac: restructure the error path of stmmac_probe_config_dt()
Andrew Halaney ahalaney@redhat.com net: stmmac: don't create a MDIO bus if unnecessary
Chengchang Tang tangchengchang@huawei.com RDMA/hns: Fix missing flush CQE for DWQE
Chengchang Tang tangchengchang@huawei.com RDMA/hns: Fix warning storm caused by invalid input in IO path
wenglianfa wenglianfa@huawei.com RDMA/hns: Fix mapping error of zero-hop WQE buffer
Chengchang Tang tangchengchang@huawei.com RDMA/hns: Remove unused parameters and variables
Chengchang Tang tangchengchang@huawei.com RDMA/hns: Refactor mtr find
Tristram Ha tristram.ha@microchip.com net: dsa: microchip: Fix LAN937X set_ageing_time function
Tristram Ha tristram.ha@microchip.com net: dsa: microchip: Fix KSZ9477 set_ageing_time function
Stefan Ekenberg stefan.ekenberg@axis.com drm/bridge: adv7511_audio: Update Audio InfoFrame properly
Selvin Xavier selvin.xavier@broadcom.com RDMA/bnxt_re: Fix the locking while accessing the QP table
Damodharam Ammepalli damodharam.ammepalli@broadcom.com RDMA/bnxt_re: Fix MSN table size for variable wqe mode
Damodharam Ammepalli damodharam.ammepalli@broadcom.com RDMA/bnxt_re: Add send queue size check for variable wqe
Kalesh AP kalesh-anakkur.purayil@broadcom.com RDMA/bnxt_re: Disable use of reserved wqes
Selvin Xavier selvin.xavier@broadcom.com RDMA/bnxt_re: Add support for Variable WQE in Genp7 adapters
Selvin Xavier selvin.xavier@broadcom.com RDMA/bnxt_re: Fix max_qp_wrs reported
Kalesh AP kalesh-anakkur.purayil@broadcom.com RDMA/bnxt_re: Fix reporting hw_ver in query_device
Saravanan Vajravel saravanan.vajravel@broadcom.com RDMA/bnxt_re: Add check for path mtu in modify_qp
Kalesh AP kalesh-anakkur.purayil@broadcom.com RDMA/bnxt_re: Fix the check for 9060 condition
Robert Beckett bob.beckett@collabora.com nvme-pci: 512 byte aligned dma pool segment quirk
Kashyap Desai kashyap.desai@broadcom.com RDMA/bnxt_re: Avoid sending the modify QP workaround for latest adapters
Selvin Xavier selvin.xavier@broadcom.com RDMA/bnxt_re: Avoid initializing the software queue for user queues
Patrisious Haddad phaddad@nvidia.com RDMA/mlx5: Enforce same type port association for multiport RoCE
Leon Romanovsky leon@kernel.org RDMA/bnxt_re: Remove always true dattr validity check
Selvin Xavier selvin.xavier@broadcom.com RDMA/bnxt_re: Allow MSN table capability check
Steven Rostedt rostedt@goodmis.org tracing: Check "%s" dereference via the field and not the TP_printk format
Steven Rostedt rostedt@goodmis.org tracing: Fix trace_check_vprintf() when tp_printk is used
Steven Rostedt (Google) rostedt@goodmis.org tracing: Handle old buffer mappings for event strings and functions
Kees Cook keescook@chromium.org seq_buf: Introduce DECLARE_SEQ_BUF and seq_buf_str()
Matthew Wilcox (Oracle) willy@infradead.org powerpc: Remove initialisation of readpos
Matthew Wilcox (Oracle) willy@infradead.org tracing: Move readpos from seq_buf to trace_seq
Jeremy Kerr jk@codeconstruct.com.au net: mctp: handle skb cleanup on sock_queue failures
Max Kellermann max.kellermann@ionos.com ceph: give up on paths longer than PATH_MAX
Xiubo Li xiubli@redhat.com ceph: print cluster fsid and client global_id in all debug logs
Xiubo Li xiubli@redhat.com libceph: add doutc and *_client debug macros support
Steven Rostedt rostedt@goodmis.org tracing: Have process_string() also allow arrays
Eric Biggers ebiggers@google.com mmc: sdhci-msm: fix crypto key eviction
Johannes Thumshirn johannes.thumshirn@wdc.com btrfs: fix use-after-free in btrfs_encoded_read_endio()
Thiébaud Weksteen tweek@google.com selinux: ignore unknown extended permissions
Chao Yu chao@kernel.org f2fs: fix to wait dio completion
Joe Hattori joe@pf.is.s.u-tokyo.ac.jp platform/x86: mlx-platform: call pci_dev_put() to balance the refcount
Takashi Iwai tiwai@suse.de ALSA: ump: Shut up truncated string warning
Michal Pecio michal.pecio@gmail.com usb: xhci: Avoid queuing redundant Stop Endpoint commands
Dmitry Baryshkov dmitry.baryshkov@linaro.org usb: typec: ucsi: glink: fix off-by-one in connector_status
Yihang Li liyihang9@huawei.com scsi: hisi_sas: Fix a deadlock issue related to automatic dump
Uros Bizjak ubizjak@gmail.com cleanup: Remove address space of returned pointer
Stefan Berger stefanb@linux.ibm.com crypto: ecc - Prevent ecc_digits_from_bytes from reading too many bytes
Jan Beulich jbeulich@suse.com memblock: make memblock_set_node() also warn about use of MAX_NUMNODES
Chris Lu chris.lu@mediatek.com Bluetooth: btusb: mediatek: add callback function in btusb_disconnect
Chris Lu chris.lu@mediatek.com Bluetooth: btusb: add callback function in btusb suspend/resume
Filipe Manana fdmanana@suse.com btrfs: fix use-after-free when COWing tree bock and tracing is enabled
Filipe Manana fdmanana@suse.com btrfs: rename and export __btrfs_cow_block()
Xin Li (Intel) xin@zytor.com x86/fred: Clear WFE in missing-ENDBRANCH #CPs
Xin Li xin3.li@intel.com x86/ptrace: Add FRED additional information to the pt_regs structure
Xin Li xin3.li@intel.com x86/ptrace: Cleanup the definition of the pt_regs structure
Qinxin Xia xiaqinxin@huawei.com ACPI/IORT: Add PMCG platform information for HiSilicon HIP09A
Yicong Yang yangyicong@hisilicon.com ACPI/IORT: Add PMCG platform information for HiSilicon HIP10/11
Ranjan Kumar ranjan.kumar@broadcom.com scsi: mpi3mr: Start controller indexing from 0
Guixin Liu kanie@linux.alibaba.com scsi: mpi3mr: Use ida to manage mrioc ID
Takashi Iwai tiwai@suse.de ALSA: ump: Update legacy substream names upon FB info update
Takashi Iwai tiwai@suse.de ALSA: ump: Indicate the inactive group in legacy substream names
Takashi Iwai tiwai@suse.de ALSA: ump: Don't open legacy substream for an inactive group
Takashi Iwai tiwai@suse.de ALSA: ump: Use guard() for locking
Jan Kara jack@suse.cz udf: Verify inode link counts before performing rename
Al Viro viro@zeniv.linux.org.uk udf_rename(): only access the child content on cross-directory rename
Claudiu Beznea claudiu.beznea.uj@bp.renesas.com watchdog: rzg2l_wdt: Power on the watchdog domain in the restart handler
Claudiu Beznea claudiu.beznea.uj@bp.renesas.com watchdog: rzg2l_wdt: Rely on the reset driver for doing proper reset
Claudiu Beznea claudiu.beznea.uj@bp.renesas.com watchdog: rzg2l_wdt: Remove reset de-assert from probe
Andrea della Porta andrea.porta@suse.com of: address: Preserve the flags portion on 1:1 dma-ranges mapping
Rob Herring robh@kernel.org of: address: Store number of bus flag cells rather than bool
Herve Codina herve.codina@bootlin.com of: address: Remove duplicated functions
Naman Jain namjain@linux.microsoft.com x86/hyperv: Fix hv tsc page based sched_clock for hibernation
Baoquan He bhe@redhat.com x86, crash: wrap crash dumping code into crash related ifdefs
Mario Limonciello mario.limonciello@amd.com thunderbolt: Don't display nvm_version unless upgrade supported
Mika Westerberg mika.westerberg@linux.intel.com thunderbolt: Add support for Intel Panther Lake-M/P
Mika Westerberg mika.westerberg@linux.intel.com thunderbolt: Add support for Intel Lunar Lake
Mathias Nyman mathias.nyman@linux.intel.com xhci: Turn NEC specific quirk for handling Stop Endpoint errors generic
Michal Pecio michal.pecio@gmail.com usb: xhci: Limit Stop Endpoint retries
Michal Pecio michal.pecio@gmail.com xhci: retry Stop Endpoint on buggy NEC controllers
Nikita Yushchenko nikita.yoush@cogentembedded.com net: renesas: rswitch: fix possible early skb release
K Prateek Nayak kprateek.nayak@amd.com softirq: Allow raising SCHED_SOFTIRQ from SMP-call-function on RT kernel
Sebastian Ott sebott@redhat.com net/mlx5: unique names for per device caches
Nilay Shroff nilay@linux.ibm.com Revert "nvme: make keep-alive synchronous operation"
Nilay Shroff nilay@linux.ibm.com nvme: use helper nvme_ctrl_state in nvme_keep_alive_finish function
Dmitry Baryshkov dmitry.baryshkov@linaro.org usb: typec: ucsi: glink: be more precise on orientation-aware ports
Dmitry Baryshkov dmitry.baryshkov@linaro.org usb: typec: ucsi: glink: set orientation aware if supported
Dmitry Baryshkov dmitry.baryshkov@linaro.org usb: typec: ucsi: add update_connector callback
Dmitry Baryshkov dmitry.baryshkov@linaro.org usb: typec: ucsi: glink: move GPIO reading into connector_status callback
Dmitry Baryshkov dmitry.baryshkov@linaro.org usb: typec: ucsi: add callback for connector status updates
Nuno Sa nuno.sa@analog.com iio: adc: ad7192: properly check spi_get_device_match_data()
Jonathan Cameron Jonathan.Cameron@huawei.com iio: adc: ad7192: Convert from of specific to fwnode property handling
Xu Yang xu.yang_2@nxp.com usb: chipidea: udc: limit usb request length to max 16KB
Xu Yang xu.yang_2@nxp.com usb: chipidea: add CI_HDRC_HAS_SHORT_PKT_LIMIT flag
Tomer Maimon tmaimon77@gmail.com usb: chipidea: add CI_HDRC_FORCE_VBUS_ACTIVE_ALWAYS flag
Konstantin Komarov almaz.alexandrovich@paragon-software.com fs/ntfs3: Fix warning in ni_fiemap
Konstantin Komarov almaz.alexandrovich@paragon-software.com fs/ntfs3: Implement fallocate for compressed files
Dmitry Baryshkov dmitry.baryshkov@linaro.org remoteproc: qcom: pas: enable SAR2130P audio DSP support
Tengfei Fan quic_tengfan@quicinc.com remoteproc: qcom: pas: Add support for SA8775p ADSP, CDSP and GPDSP
Nikita Travkin nikita@trvn.ru remoteproc: qcom: pas: Add sc7180 adsp
Adam Young admiyo@os.amperecomputing.com mailbox: pcc: Check before sending MCTP PCC response ACK
Sudeep Holla sudeep.holla@arm.com ACPI: PCC: Add PCC shared memory region command and status bitfields
Huisong Li lihuisong@huawei.com mailbox: pcc: Support shared interrupt for multiple subspaces
Huisong Li lihuisong@huawei.com mailbox: pcc: Add support for platform notification handling
Devi Priya quic_devipriy@quicinc.com clk: qcom: clk-alpha-pll: Add NSS HUAYRA ALPHA PLL support for ipq9574
Rajendra Nayak quic_rjendra@quicinc.com clk: qcom: clk-alpha-pll: Add support for zonda ole pll configure
Yihang Li liyihang9@huawei.com scsi: hisi_sas: Create all dump files during debugfs initialization
Yihang Li liyihang9@huawei.com scsi: hisi_sas: Allocate DFX memory during dump trigger
Yihang Li liyihang9@huawei.com scsi: hisi_sas: Directly call register snapshot instead of using workqueue
Hao Qin hao.qin@mediatek.com Bluetooth: btusb: Add new VID/PID 0489/e111 for MT7925
Jiande Lu jiande.lu@mediatek.com Bluetooth: btusb: Add USB HW IDs for MT7921/MT7922/MT7925
Ulrik Strid ulrik@strid.tech Bluetooth: btusb: Add new VID/PID 13d3/3602 for MT7925
Jingyang Wang wjy7717@126.com Bluetooth: Add support ITTIM PE50-M75C
Markus Elfring elfring@users.sourceforge.net Bluetooth: hci_conn: Reduce hci_conn_drop() calls in two functions
Jarkko Nikula jarkko.nikula@linux.intel.com i2c: i801: Add support for Intel Panther Lake
Jarkko Nikula jarkko.nikula@linux.intel.com i2c: i801: Add support for Intel Arrow Lake-H
Kang Yang quic_kangyang@quicinc.com wifi: ath10k: avoid NULL pointer error during sdio remove
Jeff Johnson quic_jjohnson@quicinc.com wifi: ath10k: Update Qualcomm Innovation Center, Inc. copyrights
Kalle Valo quic_kvalo@quicinc.com wifi: ath12k: fix atomic calls in ath12k_mac_op_set_bitrate_mask()
Rory Little rory@candelatech.com wifi: mac80211: Add non-atomic station iterator
Karthikeyan Periyasamy quic_periyasa@quicinc.com wifi: ath12k: Optimize the mac80211 hw data access
Ping-Ke Shih pkshih@realtek.com wifi: rtw88: use ieee80211_purge_tx_queue() to purge TX skb
Ping-Ke Shih pkshih@realtek.com wifi: mac80211: export ieee80211_purge_tx_queue() for drivers
Ricardo Ribalda ribalda@chromium.org media: uvcvideo: Force UVC version to 1.0a for 0408:4033
Laurent Pinchart laurent.pinchart@ideasonboard.com media: uvcvideo: Force UVC version to 1.0a for 0408:4035
Przemek Kitszel przemyslaw.kitszel@intel.com cleanup: Adjust scoped_guard() macros to avoid potential warning
Peter Zijlstra peterz@infradead.org cleanup: Add conditional guard support
Lukas Wunner lukas@wunner.de crypto: ecdsa - Avoid signed integer overflow on signature decoding
Stefan Berger stefanb@linux.ibm.com crypto: ecdsa - Use ecc_digits_from_bytes to convert signature
Stefan Berger stefanb@linux.ibm.com crypto: ecdsa - Rename keylen to bufsize where necessary
Stefan Berger stefanb@linux.ibm.com crypto: ecdsa - Convert byte arrays with key coordinates to digits
Brian Foster bfoster@redhat.com ext4: partial zero eof block on unaligned inode size extension
Jeff Layton jlayton@kernel.org ext4: convert to new timestamp accessors
Mike Rapoport (Microsoft) rppt@kernel.org memblock: allow zero threshold in validate_numa_converage()
Liam Ni zhiguangni01@gmail.com NUMA: optimize detection of memory with no node id assigned by firmware
Miguel Ojeda ojeda@kernel.org rust: allow `clippy::needless_lifetimes`
Miguel Ojeda ojeda@kernel.org rust: relax most deny-level lints to warnings
Alex Deucher alexander.deucher@amd.com Revert "drm/radeon: Delay Connector detecting when HPD singals is unstable"
Shixiong Ou oushixiong@kylinos.cn drm/radeon: Delay Connector detecting when HPD singals is unstable
Thomas Gleixner tglx@linutronix.de sched: Initialize idle tasks only once
Selvarasu Ganesan selvarasu.g@samsung.com usb: dwc3: gadget: Add missing check for single port RAM in TxFIFO resizing logic
Paulo Alcantara pc@manguebit.com smb: client: fix use-after-free of signing key
Paulo Alcantara pc@manguebit.com smb: client: stop flooding dmesg in smb2_calc_signature()
Ralph Boehme slow@samba.org fs/smb/client: implement chmod() for SMB3 POSIX Extensions
ChenXiaoSong chenxiaosong@kylinos.cn smb/client: rename cifs_ace to smb_ace
ChenXiaoSong chenxiaosong@kylinos.cn smb/client: rename cifs_acl to smb_acl
ChenXiaoSong chenxiaosong@kylinos.cn smb/client: rename cifs_sid to smb_sid
ChenXiaoSong chenxiaosong@kylinos.cn smb/client: rename cifs_ntsd to smb_ntsd
Borislav Petkov (AMD) bp@alien8.de x86/mm: Carve out INVLPG inline asm for use by others
Mauro Carvalho Chehab mchehab+huawei@kernel.org docs: media: update location of the media patches
Fangzhi Zuo Jerry.Zuo@amd.com drm/amd/display: Fix incorrect DSC recompute trigger
Agustin Gutierrez agustin.gutierrez@amd.com drm/amd/display: Fix DSC-re-computing
-------------
Diffstat:
Documentation/admin-guide/media/building.rst | 2 +- Documentation/admin-guide/media/saa7134.rst | 2 +- Documentation/arch/arm64/silicon-errata.rst | 5 +- .../bindings/display/bridge/adi,adv7533.yaml | 2 +- Documentation/i2c/busses/i2c-i801.rst | 2 + Makefile | 29 +- arch/arc/Makefile | 2 +- arch/loongarch/kernel/numa.c | 28 +- arch/powerpc/kernel/setup-common.c | 1 - arch/x86/entry/vsyscall/vsyscall_64.c | 2 +- arch/x86/include/asm/ptrace.h | 104 ++- arch/x86/include/asm/tlb.h | 4 + arch/x86/kernel/Makefile | 4 +- arch/x86/kernel/cet.c | 30 + arch/x86/kernel/cpu/mshyperv.c | 68 +- arch/x86/kernel/kexec-bzimage64.c | 4 + arch/x86/kernel/kvm.c | 4 +- arch/x86/kernel/machine_kexec_64.c | 3 + arch/x86/kernel/process_64.c | 2 +- arch/x86/kernel/reboot.c | 4 +- arch/x86/kernel/setup.c | 2 +- arch/x86/kernel/smp.c | 2 +- arch/x86/mm/numa.c | 34 +- arch/x86/mm/tlb.c | 3 +- arch/x86/xen/enlighten_hvm.c | 4 + arch/x86/xen/mmu_pv.c | 2 +- crypto/ecc.c | 22 + crypto/ecdsa.c | 51 +- drivers/acpi/arm64/iort.c | 9 + drivers/bluetooth/btusb.c | 47 ++ drivers/clk/qcom/clk-alpha-pll.c | 27 + drivers/clk/qcom/clk-alpha-pll.h | 5 + drivers/clocksource/hyperv_timer.c | 14 +- drivers/gpu/drm/amd/amdkfd/kfd_migrate.c | 4 +- .../amd/display/amdgpu_dm/amdgpu_dm_mst_types.c | 16 +- drivers/gpu/drm/bridge/adv7511/adv7511_audio.c | 14 +- drivers/gpu/drm/bridge/adv7511/adv7511_drv.c | 10 +- drivers/gpu/drm/bridge/adv7511/adv7533.c | 4 +- drivers/gpu/drm/i915/gt/intel_rc6.c | 2 +- drivers/i2c/busses/Kconfig | 2 + drivers/i2c/busses/i2c-i801.c | 9 + drivers/iio/adc/ad7192.c | 39 +- drivers/infiniband/core/uverbs_cmd.c | 16 +- drivers/infiniband/hw/bnxt_re/ib_verbs.c | 63 +- drivers/infiniband/hw/bnxt_re/main.c | 21 +- drivers/infiniband/hw/bnxt_re/qplib_fp.c | 93 +-- drivers/infiniband/hw/bnxt_re/qplib_fp.h | 19 +- drivers/infiniband/hw/bnxt_re/qplib_res.h | 6 + drivers/infiniband/hw/bnxt_re/qplib_sp.c | 26 +- drivers/infiniband/hw/bnxt_re/qplib_sp.h | 9 + drivers/infiniband/hw/bnxt_re/roce_hsi.h | 30 +- drivers/infiniband/hw/hns/hns_roce_alloc.c | 3 +- drivers/infiniband/hw/hns/hns_roce_cq.c | 11 +- drivers/infiniband/hw/hns/hns_roce_device.h | 12 +- drivers/infiniband/hw/hns/hns_roce_hem.c | 54 +- drivers/infiniband/hw/hns/hns_roce_hw_v2.c | 130 ++-- drivers/infiniband/hw/hns/hns_roce_mr.c | 95 +-- drivers/infiniband/hw/hns/hns_roce_qp.c | 4 +- drivers/infiniband/hw/hns/hns_roce_srq.c | 4 +- drivers/infiniband/hw/mlx5/main.c | 6 +- drivers/infiniband/ulp/rtrs/rtrs-srv.c | 2 +- drivers/irqchip/irq-gic.c | 2 +- drivers/mailbox/pcc.c | 136 +++- drivers/media/usb/uvc/uvc_driver.c | 22 + drivers/mmc/host/sdhci-msm.c | 16 +- drivers/net/dsa/microchip/ksz9477.c | 47 +- drivers/net/dsa/microchip/ksz9477_reg.h | 4 +- drivers/net/dsa/microchip/lan937x_main.c | 62 +- drivers/net/dsa/microchip/lan937x_reg.h | 9 +- drivers/net/ethernet/broadcom/bcmsysport.c | 21 +- drivers/net/ethernet/google/gve/gve_main.c | 25 +- drivers/net/ethernet/google/gve/gve_tx.c | 5 +- drivers/net/ethernet/marvell/mv643xx_eth.c | 14 +- drivers/net/ethernet/marvell/sky2.c | 1 + .../ethernet/mellanox/mlx5/core/en_accel/macsec.c | 4 + .../net/ethernet/mellanox/mlx5/core/esw/ipsec_fs.c | 6 +- drivers/net/ethernet/mellanox/mlx5/core/eswitch.h | 3 + .../ethernet/mellanox/mlx5/core/eswitch_offloads.c | 3 - drivers/net/ethernet/mellanox/mlx5/core/fs_core.c | 7 +- .../ethernet/mellanox/mlx5/core/steering/dr_send.c | 4 +- .../net/ethernet/mellanox/mlxsw/spectrum_span.c | 3 +- drivers/net/ethernet/renesas/rswitch.c | 5 +- drivers/net/ethernet/sfc/tc_conntrack.c | 2 +- .../net/ethernet/stmicro/stmmac/stmmac_platform.c | 120 ++-- drivers/net/ethernet/ti/icssg/icss_iep.c | 8 + drivers/net/usb/qmi_wwan.c | 3 + drivers/net/wireless/ath/ath10k/bmi.c | 1 + drivers/net/wireless/ath/ath10k/ce.c | 1 + drivers/net/wireless/ath/ath10k/core.c | 1 + drivers/net/wireless/ath/ath10k/core.h | 1 + drivers/net/wireless/ath/ath10k/coredump.c | 1 + drivers/net/wireless/ath/ath10k/coredump.h | 1 + drivers/net/wireless/ath/ath10k/debug.c | 1 + drivers/net/wireless/ath/ath10k/debugfs_sta.c | 1 + drivers/net/wireless/ath/ath10k/htc.c | 1 + drivers/net/wireless/ath/ath10k/htt.h | 1 + drivers/net/wireless/ath/ath10k/htt_rx.c | 1 + drivers/net/wireless/ath/ath10k/htt_tx.c | 1 + drivers/net/wireless/ath/ath10k/hw.c | 1 + drivers/net/wireless/ath/ath10k/hw.h | 1 + drivers/net/wireless/ath/ath10k/mac.c | 1 + drivers/net/wireless/ath/ath10k/pci.c | 1 + drivers/net/wireless/ath/ath10k/pci.h | 1 + drivers/net/wireless/ath/ath10k/qmi.c | 1 + drivers/net/wireless/ath/ath10k/qmi_wlfw_v01.c | 1 + drivers/net/wireless/ath/ath10k/qmi_wlfw_v01.h | 1 + drivers/net/wireless/ath/ath10k/rx_desc.h | 1 + drivers/net/wireless/ath/ath10k/sdio.c | 5 +- drivers/net/wireless/ath/ath10k/thermal.c | 1 + drivers/net/wireless/ath/ath10k/usb.h | 1 + drivers/net/wireless/ath/ath10k/wmi-tlv.h | 1 + drivers/net/wireless/ath/ath10k/wmi.c | 1 + drivers/net/wireless/ath/ath10k/wmi.h | 1 + drivers/net/wireless/ath/ath10k/wow.c | 1 + drivers/net/wireless/ath/ath12k/mac.c | 26 +- drivers/net/wireless/ath/ath12k/reg.c | 6 +- drivers/net/wireless/realtek/rtw88/sdio.c | 6 +- drivers/net/wireless/realtek/rtw88/usb.c | 5 +- drivers/net/wwan/iosm/iosm_ipc_mmio.c | 2 +- drivers/net/wwan/t7xx/t7xx_state_monitor.c | 26 +- drivers/net/wwan/t7xx/t7xx_state_monitor.h | 5 +- drivers/nvme/host/core.c | 27 +- drivers/nvme/host/nvme.h | 5 + drivers/nvme/host/pci.c | 9 +- drivers/of/address.c | 30 +- drivers/pinctrl/pinctrl-mcp23s08.c | 6 + drivers/platform/x86/mlx-platform.c | 2 + drivers/remoteproc/qcom_q6v5_pas.c | 94 +++ drivers/scsi/hisi_sas/hisi_sas.h | 3 +- drivers/scsi/hisi_sas/hisi_sas_main.c | 17 +- drivers/scsi/hisi_sas/hisi_sas_v3_hw.c | 200 +++--- drivers/scsi/mpi3mr/mpi3mr_os.c | 12 +- drivers/thunderbolt/nhi.c | 12 + drivers/thunderbolt/nhi.h | 6 + drivers/thunderbolt/retimer.c | 17 +- drivers/usb/chipidea/ci.h | 2 + drivers/usb/chipidea/ci_hdrc_imx.c | 1 + drivers/usb/chipidea/core.c | 2 + drivers/usb/chipidea/otg.c | 5 +- drivers/usb/chipidea/udc.c | 6 + drivers/usb/dwc3/core.h | 4 + drivers/usb/dwc3/gadget.c | 54 +- drivers/usb/host/xhci-ring.c | 42 +- drivers/usb/host/xhci.c | 21 +- drivers/usb/host/xhci.h | 2 + drivers/usb/typec/ucsi/ucsi.c | 9 + drivers/usb/typec/ucsi/ucsi.h | 5 + drivers/usb/typec/ucsi/ucsi_glink.c | 60 +- drivers/watchdog/rzg2l_wdt.c | 83 ++- fs/btrfs/ctree.c | 37 +- fs/btrfs/ctree.h | 7 + fs/btrfs/disk-io.c | 9 + fs/btrfs/inode.c | 2 +- fs/ceph/acl.c | 6 +- fs/ceph/addr.c | 279 ++++---- fs/ceph/caps.c | 708 ++++++++++++--------- fs/ceph/crypto.c | 39 +- fs/ceph/debugfs.c | 6 +- fs/ceph/dir.c | 218 ++++--- fs/ceph/export.c | 39 +- fs/ceph/file.c | 245 ++++--- fs/ceph/inode.c | 485 +++++++------- fs/ceph/ioctl.c | 13 +- fs/ceph/locks.c | 57 +- fs/ceph/mds_client.c | 563 +++++++++------- fs/ceph/mdsmap.c | 24 +- fs/ceph/metric.c | 5 +- fs/ceph/quota.c | 29 +- fs/ceph/snap.c | 174 ++--- fs/ceph/super.c | 70 +- fs/ceph/super.h | 6 + fs/ceph/xattr.c | 96 +-- fs/ext4/ext4.h | 20 +- fs/ext4/extents.c | 18 +- fs/ext4/ialloc.c | 4 +- fs/ext4/inline.c | 4 +- fs/ext4/inode.c | 72 ++- fs/ext4/ioctl.c | 13 +- fs/ext4/namei.c | 10 +- fs/ext4/super.c | 2 +- fs/ext4/xattr.c | 8 +- fs/f2fs/file.c | 13 + fs/ntfs3/attrib.c | 32 +- fs/ntfs3/frecord.c | 103 +-- fs/ntfs3/inode.c | 3 +- fs/ntfs3/ntfs_fs.h | 3 +- fs/ocfs2/quota_global.c | 2 +- fs/ocfs2/quota_local.c | 1 + fs/proc/task_mmu.c | 2 +- fs/smb/client/cifsacl.c | 266 ++++---- fs/smb/client/cifsacl.h | 20 +- fs/smb/client/cifsfs.c | 1 + fs/smb/client/cifsglob.h | 22 +- fs/smb/client/cifsproto.h | 26 +- fs/smb/client/cifssmb.c | 6 +- fs/smb/client/inode.c | 4 +- fs/smb/client/smb2inode.c | 4 +- fs/smb/client/smb2ops.c | 14 +- fs/smb/client/smb2pdu.c | 12 +- fs/smb/client/smb2pdu.h | 8 +- fs/smb/client/smb2proto.h | 4 +- fs/smb/client/smb2transport.c | 56 +- fs/smb/client/xattr.c | 4 +- fs/smb/server/smb2pdu.c | 22 +- fs/smb/server/vfs.h | 1 + fs/udf/namei.c | 17 +- include/acpi/pcc.h | 20 + include/clocksource/hyperv_timer.h | 2 + include/crypto/internal/ecc.h | 10 + include/linux/bpf_verifier.h | 31 +- include/linux/ceph/ceph_debug.h | 38 ++ include/linux/cleanup.h | 88 ++- include/linux/if_vlan.h | 16 +- include/linux/memblock.h | 1 + include/linux/mlx5/driver.h | 6 + include/linux/mutex.h | 3 +- include/linux/rwsem.h | 8 +- include/linux/seq_buf.h | 25 +- include/linux/spinlock.h | 15 + include/linux/trace_events.h | 6 +- include/linux/trace_seq.h | 2 + include/linux/usb/chipidea.h | 2 + include/net/bluetooth/hci_core.h | 108 ++-- include/net/mac80211.h | 31 + include/net/netfilter/nf_tables.h | 7 +- kernel/bpf/core.c | 6 +- kernel/bpf/verifier.c | 175 +++-- kernel/kcov.c | 2 +- kernel/sched/core.c | 12 +- kernel/softirq.c | 15 +- kernel/trace/trace.c | 231 ++----- kernel/trace/trace.h | 6 +- kernel/trace/trace_events.c | 44 +- kernel/trace/trace_output.c | 6 +- kernel/trace/trace_seq.c | 6 +- lib/seq_buf.c | 26 +- mm/kmemleak.c | 2 +- mm/memblock.c | 38 ++ mm/readahead.c | 6 +- mm/vmscan.c | 9 +- net/bluetooth/hci_conn.c | 13 +- net/bluetooth/hci_core.c | 10 +- net/bluetooth/iso.c | 6 + net/bluetooth/l2cap_core.c | 12 +- net/bluetooth/rfcomm/core.c | 6 + net/bluetooth/sco.c | 12 +- net/core/dev.c | 4 +- net/core/sock.c | 5 +- net/ipv4/ip_tunnel.c | 38 +- net/ipv4/tcp_input.c | 1 + net/ipv6/ila/ila_xlat.c | 16 +- net/llc/llc_input.c | 2 +- net/mac80211/ieee80211_i.h | 2 - net/mac80211/mesh.c | 6 +- net/mac80211/status.c | 1 + net/mac80211/util.c | 19 +- net/mctp/route.c | 36 +- net/mptcp/options.c | 7 + net/mptcp/protocol.c | 22 +- net/netrom/nr_route.c | 6 + net/packet/af_packet.c | 28 +- net/sctp/associola.c | 3 +- rust/Makefile | 4 +- scripts/mod/file2alias.c | 4 +- scripts/sorttable.h | 5 +- security/selinux/ss/services.c | 8 +- sound/core/seq/oss/seq_oss_synth.c | 2 + sound/core/seq/seq_clientmgr.c | 14 +- sound/core/ump.c | 61 +- sound/pci/hda/patch_ca0132.c | 37 +- sound/pci/hda/patch_realtek.c | 2 + sound/usb/format.c | 7 +- sound/usb/mixer_us16x08.c | 2 +- sound/usb/quirks.c | 2 + .../bpf/progs/verifier_subprog_precision.c | 23 +- tools/testing/selftests/bpf/verifier/precise.c | 38 +- 276 files changed, 4885 insertions(+), 3100 deletions(-)
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Agustin Gutierrez agustin.gutierrez@amd.com
[ Upstream commit b9b5a82c532109a09f4340ef5cabdfdbb0691a9d ]
[Why] This fixes a bug introduced by commit c53655545141 ("drm/amd/display: dsc mst re-compute pbn for changes on hub"). The change caused light-up issues with a second display that required DSC on some MST docks.
[How] Use Virtual DPCD for DSC caps in MST case.
[Limitations] This change only affects MST DSC devices that follow specifications additional changes are required to check for old MST DSC devices such as ones which do not check for Virtual DPCD registers.
Reviewed-by: Swapnil Patel swapnil.patel@amd.com Reviewed-by: Hersen Wu hersenxs.wu@amd.com Acked-by: Tom Chung chiahsuan.chung@amd.com Signed-off-by: Agustin Gutierrez agustin.gutierrez@amd.com Tested-by: Daniel Wheeler daniel.wheeler@amd.com Signed-off-by: Alex Deucher alexander.deucher@amd.com Stable-dep-of: 4641169a8c95 ("drm/amd/display: Fix incorrect DSC recompute trigger") Signed-off-by: Sasha Levin sashal@kernel.org --- .../display/amdgpu_dm/amdgpu_dm_mst_types.c | 18 +++++++++++++----- 1 file changed, 13 insertions(+), 5 deletions(-)
diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_mst_types.c b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_mst_types.c index 9ec9792f115a..b4bbd3be35a6 100644 --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_mst_types.c +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_mst_types.c @@ -1219,10 +1219,6 @@ static bool is_dsc_need_re_compute( if (dc_link->type != dc_connection_mst_branch) return false;
- if (!(dc_link->dpcd_caps.dsc_caps.dsc_basic_caps.fields.dsc_support.DSC_SUPPORT || - dc_link->dpcd_caps.dsc_caps.dsc_basic_caps.fields.dsc_support.DSC_PASSTHROUGH_SUPPORT)) - return false; - for (i = 0; i < MAX_PIPES; i++) stream_on_link[i] = NULL;
@@ -1240,7 +1236,19 @@ static bool is_dsc_need_re_compute( continue;
aconnector = (struct amdgpu_dm_connector *) stream->dm_stream_context; - if (!aconnector) + if (!aconnector || !aconnector->dsc_aux) + continue; + + /* + * Check if cached virtual MST DSC caps are available and DSC is supported + * this change takes care of newer MST DSC capable devices that report their + * DPCD caps as per specifications in their Virtual DPCD registers. + + * TODO: implement the check for older MST DSC devices that do not conform to + * specifications. + */ + if (!(aconnector->dc_sink->dsc_caps.dsc_dec_caps.is_dsc_supported || + aconnector->dc_link->dpcd_caps.dsc_caps.dsc_basic_caps.fields.dsc_support.DSC_PASSTHROUGH_SUPPORT)) continue;
stream_on_link[new_stream_on_link_num] = aconnector;
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Fangzhi Zuo Jerry.Zuo@amd.com
[ Upstream commit 4641169a8c95d9efc35d2d3c55c3948f3b375ff9 ]
A stream without dsc_aux should not be eliminated from the dsc determination. Whether it needs a dsc recompute depends on whether its mode has changed or not. Eliminating such a no-dsc stream from the dsc determination policy will end up with inconsistencies in the new dc_state when compared to the current dc_state, triggering a dsc recompute that should not have happened.
Reviewed-by: Rodrigo Siqueira rodrigo.siqueira@amd.com Signed-off-by: Fangzhi Zuo Jerry.Zuo@amd.com Signed-off-by: Aurabindo Pillai aurabindo.pillai@amd.com Tested-by: Daniel Wheeler daniel.wheeler@amd.com Signed-off-by: Alex Deucher alexander.deucher@amd.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_mst_types.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_mst_types.c b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_mst_types.c index b4bbd3be35a6..385a5a75fdf8 100644 --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_mst_types.c +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_mst_types.c @@ -1236,7 +1236,7 @@ static bool is_dsc_need_re_compute( continue;
aconnector = (struct amdgpu_dm_connector *) stream->dm_stream_context; - if (!aconnector || !aconnector->dsc_aux) + if (!aconnector) continue;
/*
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Mauro Carvalho Chehab mchehab+huawei@kernel.org
[ Upstream commit 72ad4ff638047bbbdf3232178fea4bec1f429319 ]
Due to recent changes on the way we're maintaining media, the location of the main tree was updated.
Change docs accordingly.
Cc: stable@vger.kernel.org Signed-off-by: Mauro Carvalho Chehab mchehab+huawei@kernel.org Reviewed-by: Hans Verkuil hverkuil@xs4all.nl Signed-off-by: Sasha Levin sashal@kernel.org --- Documentation/admin-guide/media/building.rst | 2 +- Documentation/admin-guide/media/saa7134.rst | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-)
diff --git a/Documentation/admin-guide/media/building.rst b/Documentation/admin-guide/media/building.rst index a06473429916..7a413ba07f93 100644 --- a/Documentation/admin-guide/media/building.rst +++ b/Documentation/admin-guide/media/building.rst @@ -15,7 +15,7 @@ Please notice, however, that, if:
you should use the main media development tree ``master`` branch:
- https://git.linuxtv.org/media_tree.git/ + https://git.linuxtv.org/media.git/
In this case, you may find some useful information at the `LinuxTv wiki pages https://linuxtv.org/wiki`_: diff --git a/Documentation/admin-guide/media/saa7134.rst b/Documentation/admin-guide/media/saa7134.rst index 51eae7eb5ab7..18d7cbc897db 100644 --- a/Documentation/admin-guide/media/saa7134.rst +++ b/Documentation/admin-guide/media/saa7134.rst @@ -67,7 +67,7 @@ Changes / Fixes Please mail to linux-media AT vger.kernel.org unified diffs against the linux media git tree:
- https://git.linuxtv.org/media_tree.git/ + https://git.linuxtv.org/media.git/
This is done by committing a patch at a clone of the git tree and submitting the patch using ``git send-email``. Don't forget to
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Borislav Petkov (AMD) bp@alien8.de
[ Upstream commit f1d84b59cbb9547c243d93991acf187fdbe9fbe9 ]
No functional changes.
Signed-off-by: Borislav Petkov (AMD) bp@alien8.de Link: https://lore.kernel.org/r/ZyulbYuvrkshfsd2@antipodes Signed-off-by: Sasha Levin sashal@kernel.org --- arch/x86/include/asm/tlb.h | 4 ++++ arch/x86/mm/tlb.c | 3 ++- 2 files changed, 6 insertions(+), 1 deletion(-)
diff --git a/arch/x86/include/asm/tlb.h b/arch/x86/include/asm/tlb.h index 580636cdc257..4d3c9d00d6b6 100644 --- a/arch/x86/include/asm/tlb.h +++ b/arch/x86/include/asm/tlb.h @@ -34,4 +34,8 @@ static inline void __tlb_remove_table(void *table) free_page_and_swap_cache(table); }
+static inline void invlpg(unsigned long addr) +{ + asm volatile("invlpg (%0)" ::"r" (addr) : "memory"); +} #endif /* _ASM_X86_TLB_H */ diff --git a/arch/x86/mm/tlb.c b/arch/x86/mm/tlb.c index 2fbae48f0b47..64f594826a28 100644 --- a/arch/x86/mm/tlb.c +++ b/arch/x86/mm/tlb.c @@ -19,6 +19,7 @@ #include <asm/cacheflush.h> #include <asm/apic.h> #include <asm/perf_event.h> +#include <asm/tlb.h>
#include "mm_internal.h"
@@ -1145,7 +1146,7 @@ STATIC_NOPV void native_flush_tlb_one_user(unsigned long addr) bool cpu_pcide;
/* Flush 'addr' from the kernel PCID: */ - asm volatile("invlpg (%0)" ::"r" (addr) : "memory"); + invlpg(addr);
/* If PTI is off there is no user PCID and nothing to flush. */ if (!static_cpu_has(X86_FEATURE_PTI))
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: ChenXiaoSong chenxiaosong@kylinos.cn
[ Upstream commit 3651487607ae778df1051a0a38bb34a5bd34e3b7 ]
Preparation for moving acl definitions to new common header file.
Use the following shell command to rename:
find fs/smb/client -type f -exec sed -i \ 's/struct cifs_ntsd/struct smb_ntsd/g' {} +
Signed-off-by: ChenXiaoSong chenxiaosong@kylinos.cn Reviewed-by: Namjae Jeon linkinjeon@kernel.org Signed-off-by: Steve French stfrench@microsoft.com Stable-dep-of: d413eabff18d ("fs/smb/client: implement chmod() for SMB3 POSIX Extensions") Signed-off-by: Sasha Levin sashal@kernel.org --- fs/smb/client/cifsacl.c | 36 ++++++++++++++++++------------------ fs/smb/client/cifsacl.h | 6 +++--- fs/smb/client/cifsglob.h | 12 ++++++------ fs/smb/client/cifsproto.h | 16 ++++++++-------- fs/smb/client/cifssmb.c | 6 +++--- fs/smb/client/smb2ops.c | 14 +++++++------- fs/smb/client/smb2pdu.c | 2 +- fs/smb/client/smb2proto.h | 2 +- fs/smb/client/xattr.c | 4 ++-- 9 files changed, 49 insertions(+), 49 deletions(-)
diff --git a/fs/smb/client/cifsacl.c b/fs/smb/client/cifsacl.c index f5b6df82e857..3f7657475cd9 100644 --- a/fs/smb/client/cifsacl.c +++ b/fs/smb/client/cifsacl.c @@ -515,8 +515,8 @@ exit_cifs_idmap(void) }
/* copy ntsd, owner sid, and group sid from a security descriptor to another */ -static __u32 copy_sec_desc(const struct cifs_ntsd *pntsd, - struct cifs_ntsd *pnntsd, +static __u32 copy_sec_desc(const struct smb_ntsd *pntsd, + struct smb_ntsd *pnntsd, __u32 sidsoffset, struct cifs_sid *pownersid, struct cifs_sid *pgrpsid) @@ -527,7 +527,7 @@ static __u32 copy_sec_desc(const struct cifs_ntsd *pntsd, /* copy security descriptor control portion */ pnntsd->revision = pntsd->revision; pnntsd->type = pntsd->type; - pnntsd->dacloffset = cpu_to_le32(sizeof(struct cifs_ntsd)); + pnntsd->dacloffset = cpu_to_le32(sizeof(struct smb_ntsd)); pnntsd->sacloffset = 0; pnntsd->osidoffset = cpu_to_le32(sidsoffset); pnntsd->gsidoffset = cpu_to_le32(sidsoffset + sizeof(struct cifs_sid)); @@ -1191,7 +1191,7 @@ static int parse_sid(struct cifs_sid *psid, char *end_of_acl)
/* Convert CIFS ACL to POSIX form */ static int parse_sec_desc(struct cifs_sb_info *cifs_sb, - struct cifs_ntsd *pntsd, int acl_len, struct cifs_fattr *fattr, + struct smb_ntsd *pntsd, int acl_len, struct cifs_fattr *fattr, bool get_mode_from_special_sid) { int rc = 0; @@ -1249,7 +1249,7 @@ static int parse_sec_desc(struct cifs_sb_info *cifs_sb, }
/* Convert permission bits from mode to equivalent CIFS ACL */ -static int build_sec_desc(struct cifs_ntsd *pntsd, struct cifs_ntsd *pnntsd, +static int build_sec_desc(struct smb_ntsd *pntsd, struct smb_ntsd *pnntsd, __u32 secdesclen, __u32 *pnsecdesclen, __u64 *pnmode, kuid_t uid, kgid_t gid, bool mode_from_sid, bool id_from_sid, int *aclflag) { @@ -1279,7 +1279,7 @@ static int build_sec_desc(struct cifs_ntsd *pntsd, struct cifs_ntsd *pnntsd, le32_to_cpu(pntsd->gsidoffset));
if (pnmode && *pnmode != NO_CHANGE_64) { /* chmod */ - ndacloffset = sizeof(struct cifs_ntsd); + ndacloffset = sizeof(struct smb_ntsd); ndacl_ptr = (struct cifs_acl *)((char *)pnntsd + ndacloffset); ndacl_ptr->revision = dacloffset ? dacl_ptr->revision : cpu_to_le16(ACL_REVISION); @@ -1297,7 +1297,7 @@ static int build_sec_desc(struct cifs_ntsd *pntsd, struct cifs_ntsd *pnntsd,
*aclflag |= CIFS_ACL_DACL; } else { - ndacloffset = sizeof(struct cifs_ntsd); + ndacloffset = sizeof(struct smb_ntsd); ndacl_ptr = (struct cifs_acl *)((char *)pnntsd + ndacloffset); ndacl_ptr->revision = dacloffset ? dacl_ptr->revision : cpu_to_le16(ACL_REVISION); @@ -1385,11 +1385,11 @@ static int build_sec_desc(struct cifs_ntsd *pntsd, struct cifs_ntsd *pnntsd, }
#ifdef CONFIG_CIFS_ALLOW_INSECURE_LEGACY -struct cifs_ntsd *get_cifs_acl_by_fid(struct cifs_sb_info *cifs_sb, +struct smb_ntsd *get_cifs_acl_by_fid(struct cifs_sb_info *cifs_sb, const struct cifs_fid *cifsfid, u32 *pacllen, u32 __maybe_unused unused) { - struct cifs_ntsd *pntsd = NULL; + struct smb_ntsd *pntsd = NULL; unsigned int xid; int rc; struct tcon_link *tlink = cifs_sb_tlink(cifs_sb); @@ -1410,10 +1410,10 @@ struct cifs_ntsd *get_cifs_acl_by_fid(struct cifs_sb_info *cifs_sb, return pntsd; }
-static struct cifs_ntsd *get_cifs_acl_by_path(struct cifs_sb_info *cifs_sb, +static struct smb_ntsd *get_cifs_acl_by_path(struct cifs_sb_info *cifs_sb, const char *path, u32 *pacllen) { - struct cifs_ntsd *pntsd = NULL; + struct smb_ntsd *pntsd = NULL; int oplock = 0; unsigned int xid; int rc; @@ -1454,11 +1454,11 @@ static struct cifs_ntsd *get_cifs_acl_by_path(struct cifs_sb_info *cifs_sb, }
/* Retrieve an ACL from the server */ -struct cifs_ntsd *get_cifs_acl(struct cifs_sb_info *cifs_sb, +struct smb_ntsd *get_cifs_acl(struct cifs_sb_info *cifs_sb, struct inode *inode, const char *path, u32 *pacllen, u32 info) { - struct cifs_ntsd *pntsd = NULL; + struct smb_ntsd *pntsd = NULL; struct cifsFileInfo *open_file = NULL;
if (inode) @@ -1472,7 +1472,7 @@ struct cifs_ntsd *get_cifs_acl(struct cifs_sb_info *cifs_sb, }
/* Set an ACL on the server */ -int set_cifs_acl(struct cifs_ntsd *pnntsd, __u32 acllen, +int set_cifs_acl(struct smb_ntsd *pnntsd, __u32 acllen, struct inode *inode, const char *path, int aclflag) { int oplock = 0; @@ -1528,7 +1528,7 @@ cifs_acl_to_fattr(struct cifs_sb_info *cifs_sb, struct cifs_fattr *fattr, struct inode *inode, bool mode_from_special_sid, const char *path, const struct cifs_fid *pfid) { - struct cifs_ntsd *pntsd = NULL; + struct smb_ntsd *pntsd = NULL; u32 acllen = 0; int rc = 0; struct tcon_link *tlink = cifs_sb_tlink(cifs_sb); @@ -1581,8 +1581,8 @@ id_mode_to_cifs_acl(struct inode *inode, const char *path, __u64 *pnmode, __u32 nsecdesclen = 0; __u32 dacloffset = 0; struct cifs_acl *dacl_ptr = NULL; - struct cifs_ntsd *pntsd = NULL; /* acl obtained from server */ - struct cifs_ntsd *pnntsd = NULL; /* modified acl to be sent to server */ + struct smb_ntsd *pntsd = NULL; /* acl obtained from server */ + struct smb_ntsd *pnntsd = NULL; /* modified acl to be sent to server */ struct cifs_sb_info *cifs_sb = CIFS_SB(inode->i_sb); struct tcon_link *tlink = cifs_sb_tlink(cifs_sb); struct smb_version_operations *ops; @@ -1630,7 +1630,7 @@ id_mode_to_cifs_acl(struct inode *inode, const char *path, __u64 *pnmode, nsecdesclen += 5 * sizeof(struct cifs_ace); } else { /* chown */ /* When ownership changes, changes new owner sid length could be different */ - nsecdesclen = sizeof(struct cifs_ntsd) + (sizeof(struct cifs_sid) * 2); + nsecdesclen = sizeof(struct smb_ntsd) + (sizeof(struct cifs_sid) * 2); dacloffset = le32_to_cpu(pntsd->dacloffset); if (dacloffset) { dacl_ptr = (struct cifs_acl *)((char *)pntsd + dacloffset); diff --git a/fs/smb/client/cifsacl.h b/fs/smb/client/cifsacl.h index ccbfc754bd3c..1516545d7f67 100644 --- a/fs/smb/client/cifsacl.h +++ b/fs/smb/client/cifsacl.h @@ -33,7 +33,7 @@ * Security Descriptor length containing DACL with 3 ACEs (one each for * owner, group and world). */ -#define DEFAULT_SEC_DESC_LEN (sizeof(struct cifs_ntsd) + \ +#define DEFAULT_SEC_DESC_LEN (sizeof(struct smb_ntsd) + \ sizeof(struct cifs_acl) + \ (sizeof(struct cifs_ace) * 4))
@@ -55,7 +55,7 @@ #define SID_STRING_BASE_SIZE (2 + 3 + 15 + 1) #define SID_STRING_SUBAUTH_SIZE (11) /* size of a single subauth string */
-struct cifs_ntsd { +struct smb_ntsd { __le16 revision; /* revision level */ __le16 type; __le32 osidoffset; @@ -194,6 +194,6 @@ struct owner_group_sids { * Minimum security descriptor can be one without any SACL and DACL and can * consist of revision, type, and two sids of minimum size for owner and group */ -#define MIN_SEC_DESC_LEN (sizeof(struct cifs_ntsd) + (2 * MIN_SID_LEN)) +#define MIN_SEC_DESC_LEN (sizeof(struct smb_ntsd) + (2 * MIN_SID_LEN))
#endif /* _CIFSACL_H */ diff --git a/fs/smb/client/cifsglob.h b/fs/smb/client/cifsglob.h index 6b57b167a49d..cf22629bf90b 100644 --- a/fs/smb/client/cifsglob.h +++ b/fs/smb/client/cifsglob.h @@ -539,12 +539,12 @@ struct smb_version_operations { int (*set_EA)(const unsigned int, struct cifs_tcon *, const char *, const char *, const void *, const __u16, const struct nls_table *, struct cifs_sb_info *); - struct cifs_ntsd * (*get_acl)(struct cifs_sb_info *, struct inode *, - const char *, u32 *, u32); - struct cifs_ntsd * (*get_acl_by_fid)(struct cifs_sb_info *, - const struct cifs_fid *, u32 *, u32); - int (*set_acl)(struct cifs_ntsd *, __u32, struct inode *, const char *, - int); + struct smb_ntsd * (*get_acl)(struct cifs_sb_info *cifssb, struct inode *ino, + const char *patch, u32 *plen, u32 info); + struct smb_ntsd * (*get_acl_by_fid)(struct cifs_sb_info *cifssmb, + const struct cifs_fid *pfid, u32 *plen, u32 info); + int (*set_acl)(struct smb_ntsd *pntsd, __u32 len, struct inode *ino, const char *path, + int flag); /* writepages retry size */ unsigned int (*wp_retry_size)(struct inode *); /* get mtu credits */ diff --git a/fs/smb/client/cifsproto.h b/fs/smb/client/cifsproto.h index 83692bf60007..f34c533efe49 100644 --- a/fs/smb/client/cifsproto.h +++ b/fs/smb/client/cifsproto.h @@ -231,16 +231,16 @@ extern int cifs_acl_to_fattr(struct cifs_sb_info *cifs_sb, const char *path, const struct cifs_fid *pfid); extern int id_mode_to_cifs_acl(struct inode *inode, const char *path, __u64 *pnmode, kuid_t uid, kgid_t gid); -extern struct cifs_ntsd *get_cifs_acl(struct cifs_sb_info *, struct inode *, - const char *, u32 *, u32); -extern struct cifs_ntsd *get_cifs_acl_by_fid(struct cifs_sb_info *, - const struct cifs_fid *, u32 *, u32); +extern struct smb_ntsd *get_cifs_acl(struct cifs_sb_info *cifssmb, struct inode *ino, + const char *path, u32 *plen, u32 info); +extern struct smb_ntsd *get_cifs_acl_by_fid(struct cifs_sb_info *cifssb, + const struct cifs_fid *pfid, u32 *plen, u32 info); extern struct posix_acl *cifs_get_acl(struct mnt_idmap *idmap, struct dentry *dentry, int type); extern int cifs_set_acl(struct mnt_idmap *idmap, struct dentry *dentry, struct posix_acl *acl, int type); -extern int set_cifs_acl(struct cifs_ntsd *, __u32, struct inode *, - const char *, int); +extern int set_cifs_acl(struct smb_ntsd *pntsd, __u32 len, struct inode *ino, + const char *path, int flag); extern unsigned int setup_authusers_ACE(struct cifs_ace *pace); extern unsigned int setup_special_mode_ACE(struct cifs_ace *pace, __u64 nmode); extern unsigned int setup_special_user_owner_ACE(struct cifs_ace *pace); @@ -568,9 +568,9 @@ extern int CIFSSMBSetEA(const unsigned int xid, struct cifs_tcon *tcon, const struct nls_table *nls_codepage, struct cifs_sb_info *cifs_sb); extern int CIFSSMBGetCIFSACL(const unsigned int xid, struct cifs_tcon *tcon, - __u16 fid, struct cifs_ntsd **acl_inf, __u32 *buflen); + __u16 fid, struct smb_ntsd **acl_inf, __u32 *buflen); extern int CIFSSMBSetCIFSACL(const unsigned int, struct cifs_tcon *, __u16, - struct cifs_ntsd *, __u32, int); + struct smb_ntsd *pntsd, __u32 len, int aclflag); extern int cifs_do_get_acl(const unsigned int xid, struct cifs_tcon *tcon, const unsigned char *searchName, struct posix_acl **acl, const int acl_type, diff --git a/fs/smb/client/cifssmb.c b/fs/smb/client/cifssmb.c index a34db419e46f..2f8745736dbb 100644 --- a/fs/smb/client/cifssmb.c +++ b/fs/smb/client/cifssmb.c @@ -3385,7 +3385,7 @@ validate_ntransact(char *buf, char **ppparm, char **ppdata, /* Get Security Descriptor (by handle) from remote server for a file or dir */ int CIFSSMBGetCIFSACL(const unsigned int xid, struct cifs_tcon *tcon, __u16 fid, - struct cifs_ntsd **acl_inf, __u32 *pbuflen) + struct smb_ntsd **acl_inf, __u32 *pbuflen) { int rc = 0; int buf_type = 0; @@ -3455,7 +3455,7 @@ CIFSSMBGetCIFSACL(const unsigned int xid, struct cifs_tcon *tcon, __u16 fid,
/* check if buffer is big enough for the acl header followed by the smallest SID */ - if ((*pbuflen < sizeof(struct cifs_ntsd) + 8) || + if ((*pbuflen < sizeof(struct smb_ntsd) + 8) || (*pbuflen >= 64 * 1024)) { cifs_dbg(VFS, "bad acl length %d\n", *pbuflen); rc = -EINVAL; @@ -3475,7 +3475,7 @@ CIFSSMBGetCIFSACL(const unsigned int xid, struct cifs_tcon *tcon, __u16 fid,
int CIFSSMBSetCIFSACL(const unsigned int xid, struct cifs_tcon *tcon, __u16 fid, - struct cifs_ntsd *pntsd, __u32 acllen, int aclflag) + struct smb_ntsd *pntsd, __u32 acllen, int aclflag) { __u16 byte_count, param_count, data_count, param_offset, data_offset; int rc = 0; diff --git a/fs/smb/client/smb2ops.c b/fs/smb/client/smb2ops.c index 6645f147d57c..fc6d00344c50 100644 --- a/fs/smb/client/smb2ops.c +++ b/fs/smb/client/smb2ops.c @@ -3001,11 +3001,11 @@ smb2_get_dfs_refer(const unsigned int xid, struct cifs_ses *ses, return rc; }
-static struct cifs_ntsd * +static struct smb_ntsd * get_smb2_acl_by_fid(struct cifs_sb_info *cifs_sb, const struct cifs_fid *cifsfid, u32 *pacllen, u32 info) { - struct cifs_ntsd *pntsd = NULL; + struct smb_ntsd *pntsd = NULL; unsigned int xid; int rc = -EOPNOTSUPP; struct tcon_link *tlink = cifs_sb_tlink(cifs_sb); @@ -3030,11 +3030,11 @@ get_smb2_acl_by_fid(struct cifs_sb_info *cifs_sb,
}
-static struct cifs_ntsd * +static struct smb_ntsd * get_smb2_acl_by_path(struct cifs_sb_info *cifs_sb, const char *path, u32 *pacllen, u32 info) { - struct cifs_ntsd *pntsd = NULL; + struct smb_ntsd *pntsd = NULL; u8 oplock = SMB2_OPLOCK_LEVEL_NONE; unsigned int xid; int rc; @@ -3097,7 +3097,7 @@ get_smb2_acl_by_path(struct cifs_sb_info *cifs_sb, }
static int -set_smb2_acl(struct cifs_ntsd *pnntsd, __u32 acllen, +set_smb2_acl(struct smb_ntsd *pnntsd, __u32 acllen, struct inode *inode, const char *path, int aclflag) { u8 oplock = SMB2_OPLOCK_LEVEL_NONE; @@ -3155,12 +3155,12 @@ set_smb2_acl(struct cifs_ntsd *pnntsd, __u32 acllen, }
/* Retrieve an ACL from the server */ -static struct cifs_ntsd * +static struct smb_ntsd * get_smb2_acl(struct cifs_sb_info *cifs_sb, struct inode *inode, const char *path, u32 *pacllen, u32 info) { - struct cifs_ntsd *pntsd = NULL; + struct smb_ntsd *pntsd = NULL; struct cifsFileInfo *open_file = NULL;
if (inode && !(info & SACL_SECINFO)) diff --git a/fs/smb/client/smb2pdu.c b/fs/smb/client/smb2pdu.c index 38b26468eb0c..0a4985bba55f 100644 --- a/fs/smb/client/smb2pdu.c +++ b/fs/smb/client/smb2pdu.c @@ -5626,7 +5626,7 @@ SMB2_set_eof(const unsigned int xid, struct cifs_tcon *tcon, u64 persistent_fid, int SMB2_set_acl(const unsigned int xid, struct cifs_tcon *tcon, u64 persistent_fid, u64 volatile_fid, - struct cifs_ntsd *pnntsd, int pacllen, int aclflag) + struct smb_ntsd *pnntsd, int pacllen, int aclflag) { return send_set_info(xid, tcon, persistent_fid, volatile_fid, current->tgid, 0, SMB2_O_INFO_SECURITY, aclflag, diff --git a/fs/smb/client/smb2proto.h b/fs/smb/client/smb2proto.h index 613667b46c58..fd90d8e5a1d1 100644 --- a/fs/smb/client/smb2proto.h +++ b/fs/smb/client/smb2proto.h @@ -247,7 +247,7 @@ extern int SMB2_set_info_init(struct cifs_tcon *tcon, extern void SMB2_set_info_free(struct smb_rqst *rqst); extern int SMB2_set_acl(const unsigned int xid, struct cifs_tcon *tcon, u64 persistent_fid, u64 volatile_fid, - struct cifs_ntsd *pnntsd, int pacllen, int aclflag); + struct smb_ntsd *pnntsd, int pacllen, int aclflag); extern int SMB2_set_ea(const unsigned int xid, struct cifs_tcon *tcon, u64 persistent_fid, u64 volatile_fid, struct smb2_file_full_ea_info *buf, int len); diff --git a/fs/smb/client/xattr.c b/fs/smb/client/xattr.c index c2bf829310be..e8696ad4da99 100644 --- a/fs/smb/client/xattr.c +++ b/fs/smb/client/xattr.c @@ -162,7 +162,7 @@ static int cifs_xattr_set(const struct xattr_handler *handler, case XATTR_CIFS_ACL: case XATTR_CIFS_NTSD: case XATTR_CIFS_NTSD_FULL: { - struct cifs_ntsd *pacl; + struct smb_ntsd *pacl;
if (!value) goto out; @@ -315,7 +315,7 @@ static int cifs_xattr_get(const struct xattr_handler *handler, * fetch owner and DACL otherwise */ u32 acllen, extra_info; - struct cifs_ntsd *pacl; + struct smb_ntsd *pacl;
if (pTcon->ses->server->ops->get_acl == NULL) goto out; /* rc already EOPNOTSUPP */
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: ChenXiaoSong chenxiaosong@kylinos.cn
[ Upstream commit 7f599d8fb3e087aff5be4e1392baaae3f8d42419 ]
Preparation for moving acl definitions to new common header file.
Use the following shell command to rename:
find fs/smb/client -type f -exec sed -i \ 's/struct cifs_sid/struct smb_sid/g' {} +
Signed-off-by: ChenXiaoSong chenxiaosong@kylinos.cn Reviewed-by: Namjae Jeon linkinjeon@kernel.org Signed-off-by: Steve French stfrench@microsoft.com Stable-dep-of: d413eabff18d ("fs/smb/client: implement chmod() for SMB3 POSIX Extensions") Signed-off-by: Sasha Levin sashal@kernel.org --- fs/smb/client/cifsacl.c | 96 +++++++++++++++++++-------------------- fs/smb/client/cifsacl.h | 6 +-- fs/smb/client/cifsglob.h | 8 ++-- fs/smb/client/cifsproto.h | 2 +- fs/smb/client/smb2inode.c | 4 +- fs/smb/client/smb2pdu.c | 2 +- fs/smb/client/smb2pdu.h | 8 ++-- 7 files changed, 63 insertions(+), 63 deletions(-)
diff --git a/fs/smb/client/cifsacl.c b/fs/smb/client/cifsacl.c index 3f7657475cd9..dd399f9a7424 100644 --- a/fs/smb/client/cifsacl.c +++ b/fs/smb/client/cifsacl.c @@ -27,18 +27,18 @@ #include "cifs_unicode.h"
/* security id for everyone/world system group */ -static const struct cifs_sid sid_everyone = { +static const struct smb_sid sid_everyone = { 1, 1, {0, 0, 0, 0, 0, 1}, {0} }; /* security id for Authenticated Users system group */ -static const struct cifs_sid sid_authusers = { +static const struct smb_sid sid_authusers = { 1, 1, {0, 0, 0, 0, 0, 5}, {cpu_to_le32(11)} };
/* S-1-22-1 Unmapped Unix users */ -static const struct cifs_sid sid_unix_users = {1, 1, {0, 0, 0, 0, 0, 22}, +static const struct smb_sid sid_unix_users = {1, 1, {0, 0, 0, 0, 0, 22}, {cpu_to_le32(1), 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0} };
/* S-1-22-2 Unmapped Unix groups */ -static const struct cifs_sid sid_unix_groups = { 1, 1, {0, 0, 0, 0, 0, 22}, +static const struct smb_sid sid_unix_groups = { 1, 1, {0, 0, 0, 0, 0, 22}, {cpu_to_le32(2), 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0} };
/* @@ -48,17 +48,17 @@ static const struct cifs_sid sid_unix_groups = { 1, 1, {0, 0, 0, 0, 0, 22}, /* S-1-5-88 MS NFS and Apple style UID/GID/mode */
/* S-1-5-88-1 Unix uid */ -static const struct cifs_sid sid_unix_NFS_users = { 1, 2, {0, 0, 0, 0, 0, 5}, +static const struct smb_sid sid_unix_NFS_users = { 1, 2, {0, 0, 0, 0, 0, 5}, {cpu_to_le32(88), cpu_to_le32(1), 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0} };
/* S-1-5-88-2 Unix gid */ -static const struct cifs_sid sid_unix_NFS_groups = { 1, 2, {0, 0, 0, 0, 0, 5}, +static const struct smb_sid sid_unix_NFS_groups = { 1, 2, {0, 0, 0, 0, 0, 5}, {cpu_to_le32(88), cpu_to_le32(2), 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0} };
/* S-1-5-88-3 Unix mode */ -static const struct cifs_sid sid_unix_NFS_mode = { 1, 2, {0, 0, 0, 0, 0, 5}, +static const struct smb_sid sid_unix_NFS_mode = { 1, 2, {0, 0, 0, 0, 0, 5}, {cpu_to_le32(88), cpu_to_le32(3), 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0} };
@@ -106,7 +106,7 @@ static struct key_type cifs_idmap_key_type = { };
static char * -sid_to_key_str(struct cifs_sid *sidptr, unsigned int type) +sid_to_key_str(struct smb_sid *sidptr, unsigned int type) { int i, len; unsigned int saval; @@ -158,7 +158,7 @@ sid_to_key_str(struct cifs_sid *sidptr, unsigned int type) * the same returns zero, if they do not match returns non-zero. */ static int -compare_sids(const struct cifs_sid *ctsid, const struct cifs_sid *cwsid) +compare_sids(const struct smb_sid *ctsid, const struct smb_sid *cwsid) { int i; int num_subauth, num_sat, num_saw; @@ -204,11 +204,11 @@ compare_sids(const struct cifs_sid *ctsid, const struct cifs_sid *cwsid) }
static bool -is_well_known_sid(const struct cifs_sid *psid, uint32_t *puid, bool is_group) +is_well_known_sid(const struct smb_sid *psid, uint32_t *puid, bool is_group) { int i; int num_subauth; - const struct cifs_sid *pwell_known_sid; + const struct smb_sid *pwell_known_sid;
if (!psid || (puid == NULL)) return false; @@ -260,7 +260,7 @@ is_well_known_sid(const struct cifs_sid *psid, uint32_t *puid, bool is_group) }
static __u16 -cifs_copy_sid(struct cifs_sid *dst, const struct cifs_sid *src) +cifs_copy_sid(struct smb_sid *dst, const struct smb_sid *src) { int i; __u16 size = 1 + 1 + 6; @@ -277,11 +277,11 @@ cifs_copy_sid(struct cifs_sid *dst, const struct cifs_sid *src) }
static int -id_to_sid(unsigned int cid, uint sidtype, struct cifs_sid *ssid) +id_to_sid(unsigned int cid, uint sidtype, struct smb_sid *ssid) { int rc; struct key *sidkey; - struct cifs_sid *ksid; + struct smb_sid *ksid; unsigned int ksid_size; char desc[3 + 10 + 1]; /* 3 byte prefix + 10 bytes for value + NULL */ const struct cred *saved_cred; @@ -312,8 +312,8 @@ id_to_sid(unsigned int cid, uint sidtype, struct cifs_sid *ssid) * it could be. */ ksid = sidkey->datalen <= sizeof(sidkey->payload) ? - (struct cifs_sid *)&sidkey->payload : - (struct cifs_sid *)sidkey->payload.data[0]; + (struct smb_sid *)&sidkey->payload : + (struct smb_sid *)sidkey->payload.data[0];
ksid_size = CIFS_SID_BASE_SIZE + (ksid->num_subauth * sizeof(__le32)); if (ksid_size > sidkey->datalen) { @@ -336,7 +336,7 @@ id_to_sid(unsigned int cid, uint sidtype, struct cifs_sid *ssid) }
int -sid_to_id(struct cifs_sb_info *cifs_sb, struct cifs_sid *psid, +sid_to_id(struct cifs_sb_info *cifs_sb, struct smb_sid *psid, struct cifs_fattr *fattr, uint sidtype) { int rc = 0; @@ -518,11 +518,11 @@ exit_cifs_idmap(void) static __u32 copy_sec_desc(const struct smb_ntsd *pntsd, struct smb_ntsd *pnntsd, __u32 sidsoffset, - struct cifs_sid *pownersid, - struct cifs_sid *pgrpsid) + struct smb_sid *pownersid, + struct smb_sid *pgrpsid) { - struct cifs_sid *owner_sid_ptr, *group_sid_ptr; - struct cifs_sid *nowner_sid_ptr, *ngroup_sid_ptr; + struct smb_sid *owner_sid_ptr, *group_sid_ptr; + struct smb_sid *nowner_sid_ptr, *ngroup_sid_ptr;
/* copy security descriptor control portion */ pnntsd->revision = pntsd->revision; @@ -530,28 +530,28 @@ static __u32 copy_sec_desc(const struct smb_ntsd *pntsd, pnntsd->dacloffset = cpu_to_le32(sizeof(struct smb_ntsd)); pnntsd->sacloffset = 0; pnntsd->osidoffset = cpu_to_le32(sidsoffset); - pnntsd->gsidoffset = cpu_to_le32(sidsoffset + sizeof(struct cifs_sid)); + pnntsd->gsidoffset = cpu_to_le32(sidsoffset + sizeof(struct smb_sid));
/* copy owner sid */ if (pownersid) owner_sid_ptr = pownersid; else - owner_sid_ptr = (struct cifs_sid *)((char *)pntsd + + owner_sid_ptr = (struct smb_sid *)((char *)pntsd + le32_to_cpu(pntsd->osidoffset)); - nowner_sid_ptr = (struct cifs_sid *)((char *)pnntsd + sidsoffset); + nowner_sid_ptr = (struct smb_sid *)((char *)pnntsd + sidsoffset); cifs_copy_sid(nowner_sid_ptr, owner_sid_ptr);
/* copy group sid */ if (pgrpsid) group_sid_ptr = pgrpsid; else - group_sid_ptr = (struct cifs_sid *)((char *)pntsd + + group_sid_ptr = (struct smb_sid *)((char *)pntsd + le32_to_cpu(pntsd->gsidoffset)); - ngroup_sid_ptr = (struct cifs_sid *)((char *)pnntsd + sidsoffset + - sizeof(struct cifs_sid)); + ngroup_sid_ptr = (struct smb_sid *)((char *)pnntsd + sidsoffset + + sizeof(struct smb_sid)); cifs_copy_sid(ngroup_sid_ptr, group_sid_ptr);
- return sidsoffset + (2 * sizeof(struct cifs_sid)); + return sidsoffset + (2 * sizeof(struct smb_sid)); }
@@ -666,7 +666,7 @@ static void mode_to_access_flags(umode_t mode, umode_t bits_to_use, return; }
-static __u16 cifs_copy_ace(struct cifs_ace *dst, struct cifs_ace *src, struct cifs_sid *psid) +static __u16 cifs_copy_ace(struct cifs_ace *dst, struct cifs_ace *src, struct smb_sid *psid) { __u16 size = 1 + 1 + 2 + 4;
@@ -686,7 +686,7 @@ static __u16 cifs_copy_ace(struct cifs_ace *dst, struct cifs_ace *src, struct ci }
static __u16 fill_ace_for_sid(struct cifs_ace *pntace, - const struct cifs_sid *psid, __u64 nmode, + const struct smb_sid *psid, __u64 nmode, umode_t bits, __u8 access_type, bool allow_delete_child) { @@ -759,7 +759,7 @@ static void dump_ace(struct cifs_ace *pace, char *end_of_acl) #endif
static void parse_dacl(struct cifs_acl *pdacl, char *end_of_acl, - struct cifs_sid *pownersid, struct cifs_sid *pgrpsid, + struct smb_sid *pownersid, struct smb_sid *pgrpsid, struct cifs_fattr *fattr, bool mode_from_special_sid) { int i; @@ -930,8 +930,8 @@ unsigned int setup_special_user_owner_ACE(struct cifs_ace *pntace) }
static void populate_new_aces(char *nacl_base, - struct cifs_sid *pownersid, - struct cifs_sid *pgrpsid, + struct smb_sid *pownersid, + struct smb_sid *pgrpsid, __u64 *pnmode, u32 *pnum_aces, u16 *pnsize, bool modefromsid) { @@ -967,7 +967,7 @@ static void populate_new_aces(char *nacl_base, * updated in the inode. */
- if (!memcmp(pownersid, pgrpsid, sizeof(struct cifs_sid))) { + if (!memcmp(pownersid, pgrpsid, sizeof(struct smb_sid))) { /* * Case when owner and group SIDs are the same. * Set the more restrictive of the two modes. @@ -1035,8 +1035,8 @@ static void populate_new_aces(char *nacl_base, }
static __u16 replace_sids_and_copy_aces(struct cifs_acl *pdacl, struct cifs_acl *pndacl, - struct cifs_sid *pownersid, struct cifs_sid *pgrpsid, - struct cifs_sid *pnownersid, struct cifs_sid *pngrpsid) + struct smb_sid *pownersid, struct smb_sid *pgrpsid, + struct smb_sid *pnownersid, struct smb_sid *pngrpsid) { int i; u16 size = 0; @@ -1075,7 +1075,7 @@ static __u16 replace_sids_and_copy_aces(struct cifs_acl *pdacl, struct cifs_acl }
static int set_chmod_dacl(struct cifs_acl *pdacl, struct cifs_acl *pndacl, - struct cifs_sid *pownersid, struct cifs_sid *pgrpsid, + struct smb_sid *pownersid, struct smb_sid *pgrpsid, __u64 *pnmode, bool mode_from_sid) { int i; @@ -1156,7 +1156,7 @@ static int set_chmod_dacl(struct cifs_acl *pdacl, struct cifs_acl *pndacl, return 0; }
-static int parse_sid(struct cifs_sid *psid, char *end_of_acl) +static int parse_sid(struct smb_sid *psid, char *end_of_acl) { /* BB need to add parm so we can store the SID BB */
@@ -1195,7 +1195,7 @@ static int parse_sec_desc(struct cifs_sb_info *cifs_sb, bool get_mode_from_special_sid) { int rc = 0; - struct cifs_sid *owner_sid_ptr, *group_sid_ptr; + struct smb_sid *owner_sid_ptr, *group_sid_ptr; struct cifs_acl *dacl_ptr; /* no need for SACL ptr */ char *end_of_acl = ((char *)pntsd) + acl_len; __u32 dacloffset; @@ -1203,9 +1203,9 @@ static int parse_sec_desc(struct cifs_sb_info *cifs_sb, if (pntsd == NULL) return -EIO;
- owner_sid_ptr = (struct cifs_sid *)((char *)pntsd + + owner_sid_ptr = (struct smb_sid *)((char *)pntsd + le32_to_cpu(pntsd->osidoffset)); - group_sid_ptr = (struct cifs_sid *)((char *)pntsd + + group_sid_ptr = (struct smb_sid *)((char *)pntsd + le32_to_cpu(pntsd->gsidoffset)); dacloffset = le32_to_cpu(pntsd->dacloffset); dacl_ptr = (struct cifs_acl *)((char *)pntsd + dacloffset); @@ -1257,8 +1257,8 @@ static int build_sec_desc(struct smb_ntsd *pntsd, struct smb_ntsd *pnntsd, __u32 dacloffset; __u32 ndacloffset; __u32 sidsoffset; - struct cifs_sid *owner_sid_ptr, *group_sid_ptr; - struct cifs_sid *nowner_sid_ptr = NULL, *ngroup_sid_ptr = NULL; + struct smb_sid *owner_sid_ptr, *group_sid_ptr; + struct smb_sid *nowner_sid_ptr = NULL, *ngroup_sid_ptr = NULL; struct cifs_acl *dacl_ptr = NULL; /* no need for SACL ptr */ struct cifs_acl *ndacl_ptr = NULL; /* no need for SACL ptr */ char *end_of_acl = ((char *)pntsd) + secdesclen; @@ -1273,9 +1273,9 @@ static int build_sec_desc(struct smb_ntsd *pntsd, struct smb_ntsd *pnntsd, } }
- owner_sid_ptr = (struct cifs_sid *)((char *)pntsd + + owner_sid_ptr = (struct smb_sid *)((char *)pntsd + le32_to_cpu(pntsd->osidoffset)); - group_sid_ptr = (struct cifs_sid *)((char *)pntsd + + group_sid_ptr = (struct smb_sid *)((char *)pntsd + le32_to_cpu(pntsd->gsidoffset));
if (pnmode && *pnmode != NO_CHANGE_64) { /* chmod */ @@ -1305,7 +1305,7 @@ static int build_sec_desc(struct smb_ntsd *pntsd, struct smb_ntsd *pnntsd,
if (uid_valid(uid)) { /* chown */ uid_t id; - nowner_sid_ptr = kzalloc(sizeof(struct cifs_sid), + nowner_sid_ptr = kzalloc(sizeof(struct smb_sid), GFP_KERNEL); if (!nowner_sid_ptr) { rc = -ENOMEM; @@ -1334,7 +1334,7 @@ static int build_sec_desc(struct smb_ntsd *pntsd, struct smb_ntsd *pnntsd, } if (gid_valid(gid)) { /* chgrp */ gid_t id; - ngroup_sid_ptr = kzalloc(sizeof(struct cifs_sid), + ngroup_sid_ptr = kzalloc(sizeof(struct smb_sid), GFP_KERNEL); if (!ngroup_sid_ptr) { rc = -ENOMEM; @@ -1630,7 +1630,7 @@ id_mode_to_cifs_acl(struct inode *inode, const char *path, __u64 *pnmode, nsecdesclen += 5 * sizeof(struct cifs_ace); } else { /* chown */ /* When ownership changes, changes new owner sid length could be different */ - nsecdesclen = sizeof(struct smb_ntsd) + (sizeof(struct cifs_sid) * 2); + nsecdesclen = sizeof(struct smb_ntsd) + (sizeof(struct smb_sid) * 2); dacloffset = le32_to_cpu(pntsd->dacloffset); if (dacloffset) { dacl_ptr = (struct cifs_acl *)((char *)pntsd + dacloffset); diff --git a/fs/smb/client/cifsacl.h b/fs/smb/client/cifsacl.h index 1516545d7f67..6a38718220fc 100644 --- a/fs/smb/client/cifsacl.h +++ b/fs/smb/client/cifsacl.h @@ -64,14 +64,14 @@ struct smb_ntsd { __le32 dacloffset; } __attribute__((packed));
-struct cifs_sid { +struct smb_sid { __u8 revision; /* revision level */ __u8 num_subauth; __u8 authority[NUM_AUTHS]; __le32 sub_auth[SID_MAX_SUB_AUTHORITIES]; /* sub_auth[num_subauth] */ } __attribute__((packed));
-/* size of a struct cifs_sid, sans sub_auth array */ +/* size of a struct smb_sid, sans sub_auth array */ #define CIFS_SID_BASE_SIZE (1 + 1 + NUM_AUTHS)
struct cifs_acl { @@ -116,7 +116,7 @@ struct cifs_ace { __u8 flags; __le16 size; __le32 access_req; - struct cifs_sid sid; /* ie UUID of user or group who gets these perms */ + struct smb_sid sid; /* ie UUID of user or group who gets these perms */ } __attribute__((packed));
/* diff --git a/fs/smb/client/cifsglob.h b/fs/smb/client/cifsglob.h index cf22629bf90b..69d850b6b37f 100644 --- a/fs/smb/client/cifsglob.h +++ b/fs/smb/client/cifsglob.h @@ -202,8 +202,8 @@ struct cifs_cred { int gid; int mode; int cecount; - struct cifs_sid osid; - struct cifs_sid gsid; + struct smb_sid osid; + struct smb_sid gsid; struct cifs_ntace *ntaces; struct cifs_ace *aces; }; @@ -231,8 +231,8 @@ struct cifs_open_info_data { unsigned int eas_len; } wsl; char *symlink_target; - struct cifs_sid posix_owner; - struct cifs_sid posix_group; + struct smb_sid posix_owner; + struct smb_sid posix_group; union { struct smb2_file_all_info fi; struct smb311_posix_qinfo posix_fi; diff --git a/fs/smb/client/cifsproto.h b/fs/smb/client/cifsproto.h index f34c533efe49..059e506ccf5b 100644 --- a/fs/smb/client/cifsproto.h +++ b/fs/smb/client/cifsproto.h @@ -223,7 +223,7 @@ extern int cifs_set_file_info(struct inode *inode, struct iattr *attrs, extern int cifs_rename_pending_delete(const char *full_path, struct dentry *dentry, const unsigned int xid); -extern int sid_to_id(struct cifs_sb_info *cifs_sb, struct cifs_sid *psid, +extern int sid_to_id(struct cifs_sb_info *cifs_sb, struct smb_sid *psid, struct cifs_fattr *fattr, uint sidtype); extern int cifs_acl_to_fattr(struct cifs_sb_info *cifs_sb, struct cifs_fattr *fattr, struct inode *inode, diff --git a/fs/smb/client/smb2inode.c b/fs/smb/client/smb2inode.c index 2a292736c89a..e695df1dbb23 100644 --- a/fs/smb/client/smb2inode.c +++ b/fs/smb/client/smb2inode.c @@ -315,7 +315,7 @@ static int smb2_compound_op(const unsigned int xid, struct cifs_tcon *tcon, SMB2_O_INFO_FILE, 0, sizeof(struct smb311_posix_qinfo *) + (PATH_MAX * 2) + - (sizeof(struct cifs_sid) * 2), 0, NULL); + (sizeof(struct smb_sid) * 2), 0, NULL); } else { rc = SMB2_query_info_init(tcon, server, &rqst[num_rqst], @@ -325,7 +325,7 @@ static int smb2_compound_op(const unsigned int xid, struct cifs_tcon *tcon, SMB2_O_INFO_FILE, 0, sizeof(struct smb311_posix_qinfo *) + (PATH_MAX * 2) + - (sizeof(struct cifs_sid) * 2), 0, NULL); + (sizeof(struct smb_sid) * 2), 0, NULL); } if (!rc && (!cfile || num_rqst > 1)) { smb2_set_next_command(tcon, &rqst[num_rqst]); diff --git a/fs/smb/client/smb2pdu.c b/fs/smb/client/smb2pdu.c index 0a4985bba55f..42f950ae10fb 100644 --- a/fs/smb/client/smb2pdu.c +++ b/fs/smb/client/smb2pdu.c @@ -3915,7 +3915,7 @@ SMB311_posix_query_info(const unsigned int xid, struct cifs_tcon *tcon, u64 persistent_fid, u64 volatile_fid, struct smb311_posix_qinfo *data, u32 *plen) { size_t output_len = sizeof(struct smb311_posix_qinfo *) + - (sizeof(struct cifs_sid) * 2) + (PATH_MAX * 2); + (sizeof(struct smb_sid) * 2) + (PATH_MAX * 2); *plen = 0;
return query_info(xid, tcon, persistent_fid, volatile_fid, diff --git a/fs/smb/client/smb2pdu.h b/fs/smb/client/smb2pdu.h index 5c458ab3b05a..076d9e83e1a0 100644 --- a/fs/smb/client/smb2pdu.h +++ b/fs/smb/client/smb2pdu.h @@ -364,8 +364,8 @@ struct create_posix_rsp { u32 nlink; u32 reparse_tag; u32 mode; - struct cifs_sid owner; /* var-sized on the wire */ - struct cifs_sid group; /* var-sized on the wire */ + struct smb_sid owner; /* var-sized on the wire */ + struct smb_sid group; /* var-sized on the wire */ } __packed;
#define SMB2_QUERY_DIRECTORY_IOV_SIZE 2 @@ -408,8 +408,8 @@ struct smb2_posix_info { struct smb2_posix_info_parsed { const struct smb2_posix_info *base; size_t size; - struct cifs_sid owner; - struct cifs_sid group; + struct smb_sid owner; + struct smb_sid group; int name_len; const u8 *name; };
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: ChenXiaoSong chenxiaosong@kylinos.cn
[ Upstream commit 251b93ae73805b216e84ed2190b525f319da4c87 ]
Preparation for moving acl definitions to new common header file.
Use the following shell command to rename:
find fs/smb/client -type f -exec sed -i \ 's/struct cifs_acl/struct smb_acl/g' {} +
Signed-off-by: ChenXiaoSong chenxiaosong@kylinos.cn Reviewed-by: Namjae Jeon linkinjeon@kernel.org Signed-off-by: Steve French stfrench@microsoft.com Stable-dep-of: d413eabff18d ("fs/smb/client: implement chmod() for SMB3 POSIX Extensions") Signed-off-by: Sasha Levin sashal@kernel.org --- fs/smb/client/cifsacl.c | 34 +++++++++++++++++----------------- fs/smb/client/cifsacl.h | 4 ++-- 2 files changed, 19 insertions(+), 19 deletions(-)
diff --git a/fs/smb/client/cifsacl.c b/fs/smb/client/cifsacl.c index dd399f9a7424..2e1c9b528dde 100644 --- a/fs/smb/client/cifsacl.c +++ b/fs/smb/client/cifsacl.c @@ -758,7 +758,7 @@ static void dump_ace(struct cifs_ace *pace, char *end_of_acl) } #endif
-static void parse_dacl(struct cifs_acl *pdacl, char *end_of_acl, +static void parse_dacl(struct smb_acl *pdacl, char *end_of_acl, struct smb_sid *pownersid, struct smb_sid *pgrpsid, struct cifs_fattr *fattr, bool mode_from_special_sid) { @@ -793,7 +793,7 @@ static void parse_dacl(struct cifs_acl *pdacl, char *end_of_acl, fattr->cf_mode &= ~(0777);
acl_base = (char *)pdacl; - acl_size = sizeof(struct cifs_acl); + acl_size = sizeof(struct smb_acl);
num_aces = le32_to_cpu(pdacl->num_aces); if (num_aces > 0) { @@ -1034,7 +1034,7 @@ static void populate_new_aces(char *nacl_base, *pnsize = nsize; }
-static __u16 replace_sids_and_copy_aces(struct cifs_acl *pdacl, struct cifs_acl *pndacl, +static __u16 replace_sids_and_copy_aces(struct smb_acl *pdacl, struct smb_acl *pndacl, struct smb_sid *pownersid, struct smb_sid *pgrpsid, struct smb_sid *pnownersid, struct smb_sid *pngrpsid) { @@ -1049,11 +1049,11 @@ static __u16 replace_sids_and_copy_aces(struct cifs_acl *pdacl, struct cifs_acl u16 ace_size = 0;
acl_base = (char *)pdacl; - size = sizeof(struct cifs_acl); + size = sizeof(struct smb_acl); src_num_aces = le32_to_cpu(pdacl->num_aces);
nacl_base = (char *)pndacl; - nsize = sizeof(struct cifs_acl); + nsize = sizeof(struct smb_acl);
/* Go through all the ACEs */ for (i = 0; i < src_num_aces; ++i) { @@ -1074,7 +1074,7 @@ static __u16 replace_sids_and_copy_aces(struct cifs_acl *pdacl, struct cifs_acl return nsize; }
-static int set_chmod_dacl(struct cifs_acl *pdacl, struct cifs_acl *pndacl, +static int set_chmod_dacl(struct smb_acl *pdacl, struct smb_acl *pndacl, struct smb_sid *pownersid, struct smb_sid *pgrpsid, __u64 *pnmode, bool mode_from_sid) { @@ -1091,7 +1091,7 @@ static int set_chmod_dacl(struct cifs_acl *pdacl, struct cifs_acl *pndacl,
/* Assuming that pndacl and pnmode are never NULL */ nacl_base = (char *)pndacl; - nsize = sizeof(struct cifs_acl); + nsize = sizeof(struct smb_acl);
/* If pdacl is NULL, we don't have a src. Simply populate new ACL. */ if (!pdacl) { @@ -1103,7 +1103,7 @@ static int set_chmod_dacl(struct cifs_acl *pdacl, struct cifs_acl *pndacl, }
acl_base = (char *)pdacl; - size = sizeof(struct cifs_acl); + size = sizeof(struct smb_acl); src_num_aces = le32_to_cpu(pdacl->num_aces);
/* Retain old ACEs which we can retain */ @@ -1196,7 +1196,7 @@ static int parse_sec_desc(struct cifs_sb_info *cifs_sb, { int rc = 0; struct smb_sid *owner_sid_ptr, *group_sid_ptr; - struct cifs_acl *dacl_ptr; /* no need for SACL ptr */ + struct smb_acl *dacl_ptr; /* no need for SACL ptr */ char *end_of_acl = ((char *)pntsd) + acl_len; __u32 dacloffset;
@@ -1208,7 +1208,7 @@ static int parse_sec_desc(struct cifs_sb_info *cifs_sb, group_sid_ptr = (struct smb_sid *)((char *)pntsd + le32_to_cpu(pntsd->gsidoffset)); dacloffset = le32_to_cpu(pntsd->dacloffset); - dacl_ptr = (struct cifs_acl *)((char *)pntsd + dacloffset); + dacl_ptr = (struct smb_acl *)((char *)pntsd + dacloffset); cifs_dbg(NOISY, "revision %d type 0x%x ooffset 0x%x goffset 0x%x sacloffset 0x%x dacloffset 0x%x\n", pntsd->revision, pntsd->type, le32_to_cpu(pntsd->osidoffset), le32_to_cpu(pntsd->gsidoffset), @@ -1259,14 +1259,14 @@ static int build_sec_desc(struct smb_ntsd *pntsd, struct smb_ntsd *pnntsd, __u32 sidsoffset; struct smb_sid *owner_sid_ptr, *group_sid_ptr; struct smb_sid *nowner_sid_ptr = NULL, *ngroup_sid_ptr = NULL; - struct cifs_acl *dacl_ptr = NULL; /* no need for SACL ptr */ - struct cifs_acl *ndacl_ptr = NULL; /* no need for SACL ptr */ + struct smb_acl *dacl_ptr = NULL; /* no need for SACL ptr */ + struct smb_acl *ndacl_ptr = NULL; /* no need for SACL ptr */ char *end_of_acl = ((char *)pntsd) + secdesclen; u16 size = 0;
dacloffset = le32_to_cpu(pntsd->dacloffset); if (dacloffset) { - dacl_ptr = (struct cifs_acl *)((char *)pntsd + dacloffset); + dacl_ptr = (struct smb_acl *)((char *)pntsd + dacloffset); if (end_of_acl < (char *)dacl_ptr + le16_to_cpu(dacl_ptr->size)) { cifs_dbg(VFS, "Server returned illegal ACL size\n"); return -EINVAL; @@ -1280,7 +1280,7 @@ static int build_sec_desc(struct smb_ntsd *pntsd, struct smb_ntsd *pnntsd,
if (pnmode && *pnmode != NO_CHANGE_64) { /* chmod */ ndacloffset = sizeof(struct smb_ntsd); - ndacl_ptr = (struct cifs_acl *)((char *)pnntsd + ndacloffset); + ndacl_ptr = (struct smb_acl *)((char *)pnntsd + ndacloffset); ndacl_ptr->revision = dacloffset ? dacl_ptr->revision : cpu_to_le16(ACL_REVISION);
@@ -1298,7 +1298,7 @@ static int build_sec_desc(struct smb_ntsd *pntsd, struct smb_ntsd *pnntsd, *aclflag |= CIFS_ACL_DACL; } else { ndacloffset = sizeof(struct smb_ntsd); - ndacl_ptr = (struct cifs_acl *)((char *)pnntsd + ndacloffset); + ndacl_ptr = (struct smb_acl *)((char *)pnntsd + ndacloffset); ndacl_ptr->revision = dacloffset ? dacl_ptr->revision : cpu_to_le16(ACL_REVISION); ndacl_ptr->num_aces = dacl_ptr ? dacl_ptr->num_aces : 0; @@ -1580,7 +1580,7 @@ id_mode_to_cifs_acl(struct inode *inode, const char *path, __u64 *pnmode, __u32 secdesclen = 0; __u32 nsecdesclen = 0; __u32 dacloffset = 0; - struct cifs_acl *dacl_ptr = NULL; + struct smb_acl *dacl_ptr = NULL; struct smb_ntsd *pntsd = NULL; /* acl obtained from server */ struct smb_ntsd *pnntsd = NULL; /* modified acl to be sent to server */ struct cifs_sb_info *cifs_sb = CIFS_SB(inode->i_sb); @@ -1633,7 +1633,7 @@ id_mode_to_cifs_acl(struct inode *inode, const char *path, __u64 *pnmode, nsecdesclen = sizeof(struct smb_ntsd) + (sizeof(struct smb_sid) * 2); dacloffset = le32_to_cpu(pntsd->dacloffset); if (dacloffset) { - dacl_ptr = (struct cifs_acl *)((char *)pntsd + dacloffset); + dacl_ptr = (struct smb_acl *)((char *)pntsd + dacloffset); if (mode_from_sid) nsecdesclen += le32_to_cpu(dacl_ptr->num_aces) * sizeof(struct cifs_ace); diff --git a/fs/smb/client/cifsacl.h b/fs/smb/client/cifsacl.h index 6a38718220fc..a23d59987828 100644 --- a/fs/smb/client/cifsacl.h +++ b/fs/smb/client/cifsacl.h @@ -34,7 +34,7 @@ * owner, group and world). */ #define DEFAULT_SEC_DESC_LEN (sizeof(struct smb_ntsd) + \ - sizeof(struct cifs_acl) + \ + sizeof(struct smb_acl) + \ (sizeof(struct cifs_ace) * 4))
/* @@ -74,7 +74,7 @@ struct smb_sid { /* size of a struct smb_sid, sans sub_auth array */ #define CIFS_SID_BASE_SIZE (1 + 1 + NUM_AUTHS)
-struct cifs_acl { +struct smb_acl { __le16 revision; /* revision level */ __le16 size; __le32 num_aces;
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: ChenXiaoSong chenxiaosong@kylinos.cn
[ Upstream commit 09bedafc1e2c5c82aad3cbfe1359e2b0bf752f3a ]
Preparation for moving acl definitions to new common header file.
Use the following shell command to rename:
find fs/smb/client -type f -exec sed -i \ 's/struct cifs_ace/struct smb_ace/g' {} +
Signed-off-by: ChenXiaoSong chenxiaosong@kylinos.cn Reviewed-by: Namjae Jeon linkinjeon@kernel.org Signed-off-by: Steve French stfrench@microsoft.com Stable-dep-of: d413eabff18d ("fs/smb/client: implement chmod() for SMB3 POSIX Extensions") Signed-off-by: Sasha Levin sashal@kernel.org --- fs/smb/client/cifsacl.c | 62 +++++++++++++++++++-------------------- fs/smb/client/cifsacl.h | 4 +-- fs/smb/client/cifsglob.h | 2 +- fs/smb/client/cifsproto.h | 6 ++-- fs/smb/client/smb2pdu.c | 8 ++--- 5 files changed, 41 insertions(+), 41 deletions(-)
diff --git a/fs/smb/client/cifsacl.c b/fs/smb/client/cifsacl.c index 2e1c9b528dde..e2ec1d934335 100644 --- a/fs/smb/client/cifsacl.c +++ b/fs/smb/client/cifsacl.c @@ -666,7 +666,7 @@ static void mode_to_access_flags(umode_t mode, umode_t bits_to_use, return; }
-static __u16 cifs_copy_ace(struct cifs_ace *dst, struct cifs_ace *src, struct smb_sid *psid) +static __u16 cifs_copy_ace(struct smb_ace *dst, struct smb_ace *src, struct smb_sid *psid) { __u16 size = 1 + 1 + 2 + 4;
@@ -685,7 +685,7 @@ static __u16 cifs_copy_ace(struct cifs_ace *dst, struct cifs_ace *src, struct sm return size; }
-static __u16 fill_ace_for_sid(struct cifs_ace *pntace, +static __u16 fill_ace_for_sid(struct smb_ace *pntace, const struct smb_sid *psid, __u64 nmode, umode_t bits, __u8 access_type, bool allow_delete_child) @@ -723,7 +723,7 @@ static __u16 fill_ace_for_sid(struct cifs_ace *pntace,
#ifdef CONFIG_CIFS_DEBUG2 -static void dump_ace(struct cifs_ace *pace, char *end_of_acl) +static void dump_ace(struct smb_ace *pace, char *end_of_acl) { int num_subauth;
@@ -766,7 +766,7 @@ static void parse_dacl(struct smb_acl *pdacl, char *end_of_acl, int num_aces = 0; int acl_size; char *acl_base; - struct cifs_ace **ppace; + struct smb_ace **ppace;
/* BB need to add parm so we can store the SID BB */
@@ -799,15 +799,15 @@ static void parse_dacl(struct smb_acl *pdacl, char *end_of_acl, if (num_aces > 0) { umode_t denied_mode = 0;
- if (num_aces > ULONG_MAX / sizeof(struct cifs_ace *)) + if (num_aces > ULONG_MAX / sizeof(struct smb_ace *)) return; - ppace = kmalloc_array(num_aces, sizeof(struct cifs_ace *), + ppace = kmalloc_array(num_aces, sizeof(struct smb_ace *), GFP_KERNEL); if (!ppace) return;
for (i = 0; i < num_aces; ++i) { - ppace[i] = (struct cifs_ace *) (acl_base + acl_size); + ppace[i] = (struct smb_ace *) (acl_base + acl_size); #ifdef CONFIG_CIFS_DEBUG2 dump_ace(ppace[i], end_of_acl); #endif @@ -849,7 +849,7 @@ static void parse_dacl(struct smb_acl *pdacl, char *end_of_acl,
/* memcpy((void *)(&(cifscred->aces[i])), (void *)ppace[i], - sizeof(struct cifs_ace)); */ + sizeof(struct smb_ace)); */
acl_base = (char *)ppace[i]; acl_size = le16_to_cpu(ppace[i]->size); @@ -861,7 +861,7 @@ static void parse_dacl(struct smb_acl *pdacl, char *end_of_acl, return; }
-unsigned int setup_authusers_ACE(struct cifs_ace *pntace) +unsigned int setup_authusers_ACE(struct smb_ace *pntace) { int i; unsigned int ace_size = 20; @@ -885,7 +885,7 @@ unsigned int setup_authusers_ACE(struct cifs_ace *pntace) * Fill in the special SID based on the mode. See * https://technet.microsoft.com/en-us/library/hh509017(v=ws.10).aspx */ -unsigned int setup_special_mode_ACE(struct cifs_ace *pntace, __u64 nmode) +unsigned int setup_special_mode_ACE(struct smb_ace *pntace, __u64 nmode) { int i; unsigned int ace_size = 28; @@ -907,7 +907,7 @@ unsigned int setup_special_mode_ACE(struct cifs_ace *pntace, __u64 nmode) return ace_size; }
-unsigned int setup_special_user_owner_ACE(struct cifs_ace *pntace) +unsigned int setup_special_user_owner_ACE(struct smb_ace *pntace) { int i; unsigned int ace_size = 28; @@ -944,17 +944,17 @@ static void populate_new_aces(char *nacl_base, __u64 deny_user_mode = 0; __u64 deny_group_mode = 0; bool sticky_set = false; - struct cifs_ace *pnntace = NULL; + struct smb_ace *pnntace = NULL;
nmode = *pnmode; num_aces = *pnum_aces; nsize = *pnsize;
if (modefromsid) { - pnntace = (struct cifs_ace *) (nacl_base + nsize); + pnntace = (struct smb_ace *) (nacl_base + nsize); nsize += setup_special_mode_ACE(pnntace, nmode); num_aces++; - pnntace = (struct cifs_ace *) (nacl_base + nsize); + pnntace = (struct smb_ace *) (nacl_base + nsize); nsize += setup_authusers_ACE(pnntace); num_aces++; goto set_size; @@ -992,7 +992,7 @@ static void populate_new_aces(char *nacl_base, sticky_set = true;
if (deny_user_mode) { - pnntace = (struct cifs_ace *) (nacl_base + nsize); + pnntace = (struct smb_ace *) (nacl_base + nsize); nsize += fill_ace_for_sid(pnntace, pownersid, deny_user_mode, 0700, ACCESS_DENIED, false); num_aces++; @@ -1000,31 +1000,31 @@ static void populate_new_aces(char *nacl_base,
/* Group DENY ACE does not conflict with owner ALLOW ACE. Keep in preferred order*/ if (deny_group_mode && !(deny_group_mode & (user_mode >> 3))) { - pnntace = (struct cifs_ace *) (nacl_base + nsize); + pnntace = (struct smb_ace *) (nacl_base + nsize); nsize += fill_ace_for_sid(pnntace, pgrpsid, deny_group_mode, 0070, ACCESS_DENIED, false); num_aces++; }
- pnntace = (struct cifs_ace *) (nacl_base + nsize); + pnntace = (struct smb_ace *) (nacl_base + nsize); nsize += fill_ace_for_sid(pnntace, pownersid, user_mode, 0700, ACCESS_ALLOWED, true); num_aces++;
/* Group DENY ACE conflicts with owner ALLOW ACE. So keep it after. */ if (deny_group_mode && (deny_group_mode & (user_mode >> 3))) { - pnntace = (struct cifs_ace *) (nacl_base + nsize); + pnntace = (struct smb_ace *) (nacl_base + nsize); nsize += fill_ace_for_sid(pnntace, pgrpsid, deny_group_mode, 0070, ACCESS_DENIED, false); num_aces++; }
- pnntace = (struct cifs_ace *) (nacl_base + nsize); + pnntace = (struct smb_ace *) (nacl_base + nsize); nsize += fill_ace_for_sid(pnntace, pgrpsid, group_mode, 0070, ACCESS_ALLOWED, !sticky_set); num_aces++;
- pnntace = (struct cifs_ace *) (nacl_base + nsize); + pnntace = (struct smb_ace *) (nacl_base + nsize); nsize += fill_ace_for_sid(pnntace, &sid_everyone, other_mode, 0007, ACCESS_ALLOWED, !sticky_set); num_aces++; @@ -1040,11 +1040,11 @@ static __u16 replace_sids_and_copy_aces(struct smb_acl *pdacl, struct smb_acl *p { int i; u16 size = 0; - struct cifs_ace *pntace = NULL; + struct smb_ace *pntace = NULL; char *acl_base = NULL; u32 src_num_aces = 0; u16 nsize = 0; - struct cifs_ace *pnntace = NULL; + struct smb_ace *pnntace = NULL; char *nacl_base = NULL; u16 ace_size = 0;
@@ -1057,8 +1057,8 @@ static __u16 replace_sids_and_copy_aces(struct smb_acl *pdacl, struct smb_acl *p
/* Go through all the ACEs */ for (i = 0; i < src_num_aces; ++i) { - pntace = (struct cifs_ace *) (acl_base + size); - pnntace = (struct cifs_ace *) (nacl_base + nsize); + pntace = (struct smb_ace *) (acl_base + size); + pnntace = (struct smb_ace *) (nacl_base + nsize);
if (pnownersid && compare_sids(&pntace->sid, pownersid) == 0) ace_size = cifs_copy_ace(pnntace, pntace, pnownersid); @@ -1080,11 +1080,11 @@ static int set_chmod_dacl(struct smb_acl *pdacl, struct smb_acl *pndacl, { int i; u16 size = 0; - struct cifs_ace *pntace = NULL; + struct smb_ace *pntace = NULL; char *acl_base = NULL; u32 src_num_aces = 0; u16 nsize = 0; - struct cifs_ace *pnntace = NULL; + struct smb_ace *pnntace = NULL; char *nacl_base = NULL; u32 num_aces = 0; bool new_aces_set = false; @@ -1108,7 +1108,7 @@ static int set_chmod_dacl(struct smb_acl *pdacl, struct smb_acl *pndacl,
/* Retain old ACEs which we can retain */ for (i = 0; i < src_num_aces; ++i) { - pntace = (struct cifs_ace *) (acl_base + size); + pntace = (struct smb_ace *) (acl_base + size);
if (!new_aces_set && (pntace->flags & INHERITED_ACE)) { /* Place the new ACEs in between existing explicit and inherited */ @@ -1130,7 +1130,7 @@ static int set_chmod_dacl(struct smb_acl *pdacl, struct smb_acl *pndacl, }
/* update the pointer to the next ACE to populate*/ - pnntace = (struct cifs_ace *) (nacl_base + nsize); + pnntace = (struct smb_ace *) (nacl_base + nsize);
nsize += cifs_copy_ace(pnntace, pntace, NULL); num_aces++; @@ -1625,9 +1625,9 @@ id_mode_to_cifs_acl(struct inode *inode, const char *path, __u64 *pnmode, nsecdesclen = secdesclen; if (pnmode && *pnmode != NO_CHANGE_64) { /* chmod */ if (mode_from_sid) - nsecdesclen += 2 * sizeof(struct cifs_ace); + nsecdesclen += 2 * sizeof(struct smb_ace); else /* cifsacl */ - nsecdesclen += 5 * sizeof(struct cifs_ace); + nsecdesclen += 5 * sizeof(struct smb_ace); } else { /* chown */ /* When ownership changes, changes new owner sid length could be different */ nsecdesclen = sizeof(struct smb_ntsd) + (sizeof(struct smb_sid) * 2); @@ -1636,7 +1636,7 @@ id_mode_to_cifs_acl(struct inode *inode, const char *path, __u64 *pnmode, dacl_ptr = (struct smb_acl *)((char *)pntsd + dacloffset); if (mode_from_sid) nsecdesclen += - le32_to_cpu(dacl_ptr->num_aces) * sizeof(struct cifs_ace); + le32_to_cpu(dacl_ptr->num_aces) * sizeof(struct smb_ace); else /* cifsacl */ nsecdesclen += le16_to_cpu(dacl_ptr->size); } diff --git a/fs/smb/client/cifsacl.h b/fs/smb/client/cifsacl.h index a23d59987828..cbaed8038e36 100644 --- a/fs/smb/client/cifsacl.h +++ b/fs/smb/client/cifsacl.h @@ -35,7 +35,7 @@ */ #define DEFAULT_SEC_DESC_LEN (sizeof(struct smb_ntsd) + \ sizeof(struct smb_acl) + \ - (sizeof(struct cifs_ace) * 4)) + (sizeof(struct smb_ace) * 4))
/* * Maximum size of a string representation of a SID: @@ -111,7 +111,7 @@ struct smb_acl { #define SUCCESSFUL_ACCESS_ACE_FLAG 0x40 #define FAILED_ACCESS_ACE_FLAG 0x80
-struct cifs_ace { +struct smb_ace { __u8 type; /* see above and MS-DTYP 2.4.4.1 */ __u8 flags; __le16 size; diff --git a/fs/smb/client/cifsglob.h b/fs/smb/client/cifsglob.h index 69d850b6b37f..43b42eca6780 100644 --- a/fs/smb/client/cifsglob.h +++ b/fs/smb/client/cifsglob.h @@ -205,7 +205,7 @@ struct cifs_cred { struct smb_sid osid; struct smb_sid gsid; struct cifs_ntace *ntaces; - struct cifs_ace *aces; + struct smb_ace *aces; };
struct cifs_open_info_data { diff --git a/fs/smb/client/cifsproto.h b/fs/smb/client/cifsproto.h index 059e506ccf5b..6399dbd04625 100644 --- a/fs/smb/client/cifsproto.h +++ b/fs/smb/client/cifsproto.h @@ -241,9 +241,9 @@ extern int cifs_set_acl(struct mnt_idmap *idmap, struct dentry *dentry, struct posix_acl *acl, int type); extern int set_cifs_acl(struct smb_ntsd *pntsd, __u32 len, struct inode *ino, const char *path, int flag); -extern unsigned int setup_authusers_ACE(struct cifs_ace *pace); -extern unsigned int setup_special_mode_ACE(struct cifs_ace *pace, __u64 nmode); -extern unsigned int setup_special_user_owner_ACE(struct cifs_ace *pace); +extern unsigned int setup_authusers_ACE(struct smb_ace *pace); +extern unsigned int setup_special_mode_ACE(struct smb_ace *pace, __u64 nmode); +extern unsigned int setup_special_user_owner_ACE(struct smb_ace *pace);
extern void dequeue_mid(struct mid_q_entry *mid, bool malformed); extern int cifs_read_from_socket(struct TCP_Server_Info *server, char *buf, diff --git a/fs/smb/client/smb2pdu.c b/fs/smb/client/smb2pdu.c index 42f950ae10fb..101c80f22d77 100644 --- a/fs/smb/client/smb2pdu.c +++ b/fs/smb/client/smb2pdu.c @@ -2623,7 +2623,7 @@ create_sd_buf(umode_t mode, bool set_owner, unsigned int *len) unsigned int group_offset = 0; struct smb3_acl acl = {};
- *len = round_up(sizeof(struct crt_sd_ctxt) + (sizeof(struct cifs_ace) * 4), 8); + *len = round_up(sizeof(struct crt_sd_ctxt) + (sizeof(struct smb_ace) * 4), 8);
if (set_owner) { /* sizeof(struct owner_group_sids) is already multiple of 8 so no need to round */ @@ -2672,21 +2672,21 @@ create_sd_buf(umode_t mode, bool set_owner, unsigned int *len) ptr += sizeof(struct smb3_acl);
/* create one ACE to hold the mode embedded in reserved special SID */ - acelen = setup_special_mode_ACE((struct cifs_ace *)ptr, (__u64)mode); + acelen = setup_special_mode_ACE((struct smb_ace *)ptr, (__u64)mode); ptr += acelen; acl_size = acelen + sizeof(struct smb3_acl); ace_count = 1;
if (set_owner) { /* we do not need to reallocate buffer to add the two more ACEs. plenty of space */ - acelen = setup_special_user_owner_ACE((struct cifs_ace *)ptr); + acelen = setup_special_user_owner_ACE((struct smb_ace *)ptr); ptr += acelen; acl_size += acelen; ace_count += 1; }
/* and one more ACE to allow access for authenticated users */ - acelen = setup_authusers_ACE((struct cifs_ace *)ptr); + acelen = setup_authusers_ACE((struct smb_ace *)ptr); ptr += acelen; acl_size += acelen; ace_count += 1;
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Ralph Boehme slow@samba.org
[ Upstream commit d413eabff18d640031fc955d107ad9c03c3bf9f1 ]
The NT ACL format for an SMB3 POSIX Extensions chmod() is a single ACE with the magic S-1-5-88-3-mode SID:
NT Security Descriptor Revision: 1 Type: 0x8004, Self Relative, DACL Present Offset to owner SID: 56 Offset to group SID: 124 Offset to SACL: 0 Offset to DACL: 20 Owner: S-1-5-21-3177838999-3893657415-1037673384-1000 Group: S-1-22-2-1000 NT User (DACL) ACL Revision: NT4 (2) Size: 36 Num ACEs: 1 NT ACE: S-1-5-88-3-438, flags 0x00, Access Allowed, mask 0x00000000 Type: Access Allowed NT ACE Flags: 0x00 Size: 28 Access required: 0x00000000 SID: S-1-5-88-3-438
Owner and Group should be NULL, but the server is not required to fail the request if they are present.
Signed-off-by: Ralph Boehme slow@samba.org Cc: stable@vger.kernel.org Signed-off-by: Steve French stfrench@microsoft.com Signed-off-by: Sasha Levin sashal@kernel.org --- fs/smb/client/cifsacl.c | 50 +++++++++++++++++++++++---------------- fs/smb/client/cifsproto.h | 4 +++- fs/smb/client/inode.c | 4 +++- fs/smb/client/smb2pdu.c | 2 +- 4 files changed, 37 insertions(+), 23 deletions(-)
diff --git a/fs/smb/client/cifsacl.c b/fs/smb/client/cifsacl.c index e2ec1d934335..bff8d0dd74fe 100644 --- a/fs/smb/client/cifsacl.c +++ b/fs/smb/client/cifsacl.c @@ -885,12 +885,17 @@ unsigned int setup_authusers_ACE(struct smb_ace *pntace) * Fill in the special SID based on the mode. See * https://technet.microsoft.com/en-us/library/hh509017(v=ws.10).aspx */ -unsigned int setup_special_mode_ACE(struct smb_ace *pntace, __u64 nmode) +unsigned int setup_special_mode_ACE(struct smb_ace *pntace, + bool posix, + __u64 nmode) { int i; unsigned int ace_size = 28;
- pntace->type = ACCESS_DENIED_ACE_TYPE; + if (posix) + pntace->type = ACCESS_ALLOWED_ACE_TYPE; + else + pntace->type = ACCESS_DENIED_ACE_TYPE; pntace->flags = 0x0; pntace->access_req = 0; pntace->sid.num_subauth = 3; @@ -933,7 +938,8 @@ static void populate_new_aces(char *nacl_base, struct smb_sid *pownersid, struct smb_sid *pgrpsid, __u64 *pnmode, u32 *pnum_aces, u16 *pnsize, - bool modefromsid) + bool modefromsid, + bool posix) { __u64 nmode; u32 num_aces = 0; @@ -950,13 +956,15 @@ static void populate_new_aces(char *nacl_base, num_aces = *pnum_aces; nsize = *pnsize;
- if (modefromsid) { - pnntace = (struct smb_ace *) (nacl_base + nsize); - nsize += setup_special_mode_ACE(pnntace, nmode); - num_aces++; + if (modefromsid || posix) { pnntace = (struct smb_ace *) (nacl_base + nsize); - nsize += setup_authusers_ACE(pnntace); + nsize += setup_special_mode_ACE(pnntace, posix, nmode); num_aces++; + if (modefromsid) { + pnntace = (struct smb_ace *) (nacl_base + nsize); + nsize += setup_authusers_ACE(pnntace); + num_aces++; + } goto set_size; }
@@ -1076,7 +1084,7 @@ static __u16 replace_sids_and_copy_aces(struct smb_acl *pdacl, struct smb_acl *p
static int set_chmod_dacl(struct smb_acl *pdacl, struct smb_acl *pndacl, struct smb_sid *pownersid, struct smb_sid *pgrpsid, - __u64 *pnmode, bool mode_from_sid) + __u64 *pnmode, bool mode_from_sid, bool posix) { int i; u16 size = 0; @@ -1094,11 +1102,11 @@ static int set_chmod_dacl(struct smb_acl *pdacl, struct smb_acl *pndacl, nsize = sizeof(struct smb_acl);
/* If pdacl is NULL, we don't have a src. Simply populate new ACL. */ - if (!pdacl) { + if (!pdacl || posix) { populate_new_aces(nacl_base, pownersid, pgrpsid, pnmode, &num_aces, &nsize, - mode_from_sid); + mode_from_sid, posix); goto finalize_dacl; }
@@ -1115,7 +1123,7 @@ static int set_chmod_dacl(struct smb_acl *pdacl, struct smb_acl *pndacl, populate_new_aces(nacl_base, pownersid, pgrpsid, pnmode, &num_aces, &nsize, - mode_from_sid); + mode_from_sid, posix);
new_aces_set = true; } @@ -1144,7 +1152,7 @@ static int set_chmod_dacl(struct smb_acl *pdacl, struct smb_acl *pndacl, populate_new_aces(nacl_base, pownersid, pgrpsid, pnmode, &num_aces, &nsize, - mode_from_sid); + mode_from_sid, posix);
new_aces_set = true; } @@ -1251,7 +1259,7 @@ static int parse_sec_desc(struct cifs_sb_info *cifs_sb, /* Convert permission bits from mode to equivalent CIFS ACL */ static int build_sec_desc(struct smb_ntsd *pntsd, struct smb_ntsd *pnntsd, __u32 secdesclen, __u32 *pnsecdesclen, __u64 *pnmode, kuid_t uid, kgid_t gid, - bool mode_from_sid, bool id_from_sid, int *aclflag) + bool mode_from_sid, bool id_from_sid, bool posix, int *aclflag) { int rc = 0; __u32 dacloffset; @@ -1288,7 +1296,7 @@ static int build_sec_desc(struct smb_ntsd *pntsd, struct smb_ntsd *pnntsd, ndacl_ptr->num_aces = cpu_to_le32(0);
rc = set_chmod_dacl(dacl_ptr, ndacl_ptr, owner_sid_ptr, group_sid_ptr, - pnmode, mode_from_sid); + pnmode, mode_from_sid, posix);
sidsoffset = ndacloffset + le16_to_cpu(ndacl_ptr->size); /* copy the non-dacl portion of secdesc */ @@ -1587,6 +1595,7 @@ id_mode_to_cifs_acl(struct inode *inode, const char *path, __u64 *pnmode, struct tcon_link *tlink = cifs_sb_tlink(cifs_sb); struct smb_version_operations *ops; bool mode_from_sid, id_from_sid; + bool posix = tlink_tcon(tlink)->posix_extensions; const u32 info = 0;
if (IS_ERR(tlink)) @@ -1622,12 +1631,13 @@ id_mode_to_cifs_acl(struct inode *inode, const char *path, __u64 *pnmode, id_from_sid = false;
/* Potentially, five new ACEs can be added to the ACL for U,G,O mapping */ - nsecdesclen = secdesclen; if (pnmode && *pnmode != NO_CHANGE_64) { /* chmod */ - if (mode_from_sid) - nsecdesclen += 2 * sizeof(struct smb_ace); + if (posix) + nsecdesclen = 1 * sizeof(struct smb_ace); + else if (mode_from_sid) + nsecdesclen = secdesclen + (2 * sizeof(struct smb_ace)); else /* cifsacl */ - nsecdesclen += 5 * sizeof(struct smb_ace); + nsecdesclen = secdesclen + (5 * sizeof(struct smb_ace)); } else { /* chown */ /* When ownership changes, changes new owner sid length could be different */ nsecdesclen = sizeof(struct smb_ntsd) + (sizeof(struct smb_sid) * 2); @@ -1657,7 +1667,7 @@ id_mode_to_cifs_acl(struct inode *inode, const char *path, __u64 *pnmode, }
rc = build_sec_desc(pntsd, pnntsd, secdesclen, &nsecdesclen, pnmode, uid, gid, - mode_from_sid, id_from_sid, &aclflag); + mode_from_sid, id_from_sid, posix, &aclflag);
cifs_dbg(NOISY, "build_sec_desc rc: %d\n", rc);
diff --git a/fs/smb/client/cifsproto.h b/fs/smb/client/cifsproto.h index 6399dbd04625..a151ffffc6f3 100644 --- a/fs/smb/client/cifsproto.h +++ b/fs/smb/client/cifsproto.h @@ -242,7 +242,9 @@ extern int cifs_set_acl(struct mnt_idmap *idmap, extern int set_cifs_acl(struct smb_ntsd *pntsd, __u32 len, struct inode *ino, const char *path, int flag); extern unsigned int setup_authusers_ACE(struct smb_ace *pace); -extern unsigned int setup_special_mode_ACE(struct smb_ace *pace, __u64 nmode); +extern unsigned int setup_special_mode_ACE(struct smb_ace *pace, + bool posix, + __u64 nmode); extern unsigned int setup_special_user_owner_ACE(struct smb_ace *pace);
extern void dequeue_mid(struct mid_q_entry *mid, bool malformed); diff --git a/fs/smb/client/inode.c b/fs/smb/client/inode.c index ce7e0aed8f7d..b3e59a7c7120 100644 --- a/fs/smb/client/inode.c +++ b/fs/smb/client/inode.c @@ -3087,6 +3087,7 @@ cifs_setattr_nounix(struct dentry *direntry, struct iattr *attrs) int rc = -EACCES; __u32 dosattr = 0; __u64 mode = NO_CHANGE_64; + bool posix = cifs_sb_master_tcon(cifs_sb)->posix_extensions;
xid = get_xid();
@@ -3177,7 +3178,8 @@ cifs_setattr_nounix(struct dentry *direntry, struct iattr *attrs) mode = attrs->ia_mode; rc = 0; if ((cifs_sb->mnt_cifs_flags & CIFS_MOUNT_CIFS_ACL) || - (cifs_sb->mnt_cifs_flags & CIFS_MOUNT_MODE_FROM_SID)) { + (cifs_sb->mnt_cifs_flags & CIFS_MOUNT_MODE_FROM_SID) || + posix) { rc = id_mode_to_cifs_acl(inode, full_path, &mode, INVALID_UID, INVALID_GID); if (rc) { diff --git a/fs/smb/client/smb2pdu.c b/fs/smb/client/smb2pdu.c index 101c80f22d77..c012fbc2638e 100644 --- a/fs/smb/client/smb2pdu.c +++ b/fs/smb/client/smb2pdu.c @@ -2672,7 +2672,7 @@ create_sd_buf(umode_t mode, bool set_owner, unsigned int *len) ptr += sizeof(struct smb3_acl);
/* create one ACE to hold the mode embedded in reserved special SID */ - acelen = setup_special_mode_ACE((struct smb_ace *)ptr, (__u64)mode); + acelen = setup_special_mode_ACE((struct smb_ace *)ptr, false, (__u64)mode); ptr += acelen; acl_size = acelen + sizeof(struct smb3_acl); ace_count = 1;
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Paulo Alcantara pc@manguebit.com
[ Upstream commit a13ca780afab350f37f8be9eda2bf79d1aed9bdd ]
When having several mounts that share same credential and the client couldn't re-establish an SMB session due to an expired kerberos ticket or rotated password, smb2_calc_signature() will end up flooding dmesg when not finding SMB sessions to calculate signatures.
Signed-off-by: Paulo Alcantara (Red Hat) pc@manguebit.com Signed-off-by: Steve French stfrench@microsoft.com Stable-dep-of: 343d7fe6df9e ("smb: client: fix use-after-free of signing key") Signed-off-by: Sasha Levin sashal@kernel.org --- fs/smb/client/smb2transport.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/fs/smb/client/smb2transport.c b/fs/smb/client/smb2transport.c index 4ca04e62a993..73eae1b16034 100644 --- a/fs/smb/client/smb2transport.c +++ b/fs/smb/client/smb2transport.c @@ -242,7 +242,7 @@ smb2_calc_signature(struct smb_rqst *rqst, struct TCP_Server_Info *server,
ses = smb2_find_smb_ses(server, le64_to_cpu(shdr->SessionId)); if (unlikely(!ses)) { - cifs_server_dbg(VFS, "%s: Could not find session\n", __func__); + cifs_server_dbg(FYI, "%s: Could not find session\n", __func__); return -ENOENT; }
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Paulo Alcantara pc@manguebit.com
[ Upstream commit 343d7fe6df9e247671440a932b6a73af4fa86d95 ]
Customers have reported use-after-free in @ses->auth_key.response with SMB2.1 + sign mounts which occurs due to following race:
task A task B cifs_mount() dfs_mount_share() get_session() cifs_mount_get_session() cifs_send_recv() cifs_get_smb_ses() compound_send_recv() cifs_setup_session() smb2_setup_request() kfree_sensitive() smb2_calc_signature() crypto_shash_setkey() *UAF*
Fix this by ensuring that we have a valid @ses->auth_key.response by checking whether @ses->ses_status is SES_GOOD or SES_EXITING with @ses->ses_lock held. After commit 24a9799aa8ef ("smb: client: fix UAF in smb2_reconnect_server()"), we made sure to call ->logoff() only when @ses was known to be good (e.g. valid ->auth_key.response), so it's safe to access signing key when @ses->ses_status == SES_EXITING.
Cc: stable@vger.kernel.org Reported-by: Jay Shin jaeshin@redhat.com Signed-off-by: Paulo Alcantara (Red Hat) pc@manguebit.com Signed-off-by: Steve French stfrench@microsoft.com Signed-off-by: Sasha Levin sashal@kernel.org --- fs/smb/client/smb2proto.h | 2 -- fs/smb/client/smb2transport.c | 56 +++++++++++++++++++++++++---------- 2 files changed, 40 insertions(+), 18 deletions(-)
diff --git a/fs/smb/client/smb2proto.h b/fs/smb/client/smb2proto.h index fd90d8e5a1d1..750e4e397b13 100644 --- a/fs/smb/client/smb2proto.h +++ b/fs/smb/client/smb2proto.h @@ -37,8 +37,6 @@ extern struct mid_q_entry *smb2_setup_request(struct cifs_ses *ses, struct smb_rqst *rqst); extern struct mid_q_entry *smb2_setup_async_request( struct TCP_Server_Info *server, struct smb_rqst *rqst); -extern struct cifs_ses *smb2_find_smb_ses(struct TCP_Server_Info *server, - __u64 ses_id); extern struct cifs_tcon *smb2_find_smb_tcon(struct TCP_Server_Info *server, __u64 ses_id, __u32 tid); extern int smb2_calc_signature(struct smb_rqst *rqst, diff --git a/fs/smb/client/smb2transport.c b/fs/smb/client/smb2transport.c index 73eae1b16034..4a43802375b3 100644 --- a/fs/smb/client/smb2transport.c +++ b/fs/smb/client/smb2transport.c @@ -74,7 +74,7 @@ smb311_crypto_shash_allocate(struct TCP_Server_Info *server)
static -int smb2_get_sign_key(__u64 ses_id, struct TCP_Server_Info *server, u8 *key) +int smb3_get_sign_key(__u64 ses_id, struct TCP_Server_Info *server, u8 *key) { struct cifs_chan *chan; struct TCP_Server_Info *pserver; @@ -168,16 +168,41 @@ smb2_find_smb_ses_unlocked(struct TCP_Server_Info *server, __u64 ses_id) return NULL; }
-struct cifs_ses * -smb2_find_smb_ses(struct TCP_Server_Info *server, __u64 ses_id) +static int smb2_get_sign_key(struct TCP_Server_Info *server, + __u64 ses_id, u8 *key) { struct cifs_ses *ses; + int rc = -ENOENT; + + if (SERVER_IS_CHAN(server)) + server = server->primary_server;
spin_lock(&cifs_tcp_ses_lock); - ses = smb2_find_smb_ses_unlocked(server, ses_id); - spin_unlock(&cifs_tcp_ses_lock); + list_for_each_entry(ses, &server->smb_ses_list, smb_ses_list) { + if (ses->Suid != ses_id) + continue;
- return ses; + rc = 0; + spin_lock(&ses->ses_lock); + switch (ses->ses_status) { + case SES_EXITING: /* SMB2_LOGOFF */ + case SES_GOOD: + if (likely(ses->auth_key.response)) { + memcpy(key, ses->auth_key.response, + SMB2_NTLMV2_SESSKEY_SIZE); + } else { + rc = -EIO; + } + break; + default: + rc = -EAGAIN; + break; + } + spin_unlock(&ses->ses_lock); + break; + } + spin_unlock(&cifs_tcp_ses_lock); + return rc; }
static struct cifs_tcon * @@ -236,14 +261,16 @@ smb2_calc_signature(struct smb_rqst *rqst, struct TCP_Server_Info *server, unsigned char *sigptr = smb2_signature; struct kvec *iov = rqst->rq_iov; struct smb2_hdr *shdr = (struct smb2_hdr *)iov[0].iov_base; - struct cifs_ses *ses; struct shash_desc *shash = NULL; struct smb_rqst drqst; + __u64 sid = le64_to_cpu(shdr->SessionId); + u8 key[SMB2_NTLMV2_SESSKEY_SIZE];
- ses = smb2_find_smb_ses(server, le64_to_cpu(shdr->SessionId)); - if (unlikely(!ses)) { - cifs_server_dbg(FYI, "%s: Could not find session\n", __func__); - return -ENOENT; + rc = smb2_get_sign_key(server, sid, key); + if (unlikely(rc)) { + cifs_server_dbg(FYI, "%s: [sesid=0x%llx] couldn't find signing key: %d\n", + __func__, sid, rc); + return rc; }
memset(smb2_signature, 0x0, SMB2_HMACSHA256_SIZE); @@ -260,8 +287,7 @@ smb2_calc_signature(struct smb_rqst *rqst, struct TCP_Server_Info *server, shash = server->secmech.hmacsha256; }
- rc = crypto_shash_setkey(shash->tfm, ses->auth_key.response, - SMB2_NTLMV2_SESSKEY_SIZE); + rc = crypto_shash_setkey(shash->tfm, key, sizeof(key)); if (rc) { cifs_server_dbg(VFS, "%s: Could not update with response\n", @@ -303,8 +329,6 @@ smb2_calc_signature(struct smb_rqst *rqst, struct TCP_Server_Info *server, out: if (allocate_crypto) cifs_free_hash(&shash); - if (ses) - cifs_put_smb_ses(ses); return rc; }
@@ -570,7 +594,7 @@ smb3_calc_signature(struct smb_rqst *rqst, struct TCP_Server_Info *server, struct smb_rqst drqst; u8 key[SMB3_SIGN_KEY_SIZE];
- rc = smb2_get_sign_key(le64_to_cpu(shdr->SessionId), server, key); + rc = smb3_get_sign_key(le64_to_cpu(shdr->SessionId), server, key); if (unlikely(rc)) { cifs_server_dbg(FYI, "%s: Could not get signing key\n", __func__); return rc;
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Selvarasu Ganesan selvarasu.g@samsung.com
[ Upstream commit 61eb055cd3048ee01ca43d1be924167d33e16fdc ]
The existing implementation of the TxFIFO resizing logic only supports scenarios where more than one port RAM is used. However, there is a need to resize the TxFIFO in USB2.0-only mode where only a single port RAM is available. This commit introduces the necessary changes to support TxFIFO resizing in such scenarios by adding a missing check for single port RAM.
This fix addresses certain platform configurations where the existing TxFIFO resizing logic does not work properly due to the absence of support for single port RAM. By adding this missing check, we ensure that the TxFIFO resizing logic works correctly in all scenarios, including those with a single port RAM.
Fixes: 9f607a309fbe ("usb: dwc3: Resize TX FIFOs to meet EP bursting requirements") Cc: stable@vger.kernel.org # 6.12.x: fad16c82: usb: dwc3: gadget: Refine the logic for resizing Tx FIFOs Signed-off-by: Selvarasu Ganesan selvarasu.g@samsung.com Acked-by: Thinh Nguyen Thinh.Nguyen@synopsys.com Link: https://lore.kernel.org/r/20241112044807.623-1-selvarasu.g@samsung.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/usb/dwc3/core.h | 4 +++ drivers/usb/dwc3/gadget.c | 54 +++++++++++++++++++++++++++++++++------ 2 files changed, 50 insertions(+), 8 deletions(-)
diff --git a/drivers/usb/dwc3/core.h b/drivers/usb/dwc3/core.h index b118f4aab189..d00bf714a7cc 100644 --- a/drivers/usb/dwc3/core.h +++ b/drivers/usb/dwc3/core.h @@ -899,6 +899,7 @@ struct dwc3_hwparams { #define DWC3_MODE(n) ((n) & 0x7)
/* HWPARAMS1 */ +#define DWC3_SPRAM_TYPE(n) (((n) >> 23) & 1) #define DWC3_NUM_INT(n) (((n) & (0x3f << 15)) >> 15)
/* HWPARAMS3 */ @@ -909,6 +910,9 @@ struct dwc3_hwparams { #define DWC3_NUM_IN_EPS(p) (((p)->hwparams3 & \ (DWC3_NUM_IN_EPS_MASK)) >> 18)
+/* HWPARAMS6 */ +#define DWC3_RAM0_DEPTH(n) (((n) & (0xffff0000)) >> 16) + /* HWPARAMS7 */ #define DWC3_RAM1_DEPTH(n) ((n) & 0xffff)
diff --git a/drivers/usb/dwc3/gadget.c b/drivers/usb/dwc3/gadget.c index b560996bd421..656460c0c1dd 100644 --- a/drivers/usb/dwc3/gadget.c +++ b/drivers/usb/dwc3/gadget.c @@ -687,6 +687,44 @@ static int dwc3_gadget_calc_tx_fifo_size(struct dwc3 *dwc, int mult) return fifo_size; }
+/** + * dwc3_gadget_calc_ram_depth - calculates the ram depth for txfifo + * @dwc: pointer to the DWC3 context + */ +static int dwc3_gadget_calc_ram_depth(struct dwc3 *dwc) +{ + int ram_depth; + int fifo_0_start; + bool is_single_port_ram; + + /* Check supporting RAM type by HW */ + is_single_port_ram = DWC3_SPRAM_TYPE(dwc->hwparams.hwparams1); + + /* + * If a single port RAM is utilized, then allocate TxFIFOs from + * RAM0. otherwise, allocate them from RAM1. + */ + ram_depth = is_single_port_ram ? DWC3_RAM0_DEPTH(dwc->hwparams.hwparams6) : + DWC3_RAM1_DEPTH(dwc->hwparams.hwparams7); + + /* + * In a single port RAM configuration, the available RAM is shared + * between the RX and TX FIFOs. This means that the txfifo can begin + * at a non-zero address. + */ + if (is_single_port_ram) { + u32 reg; + + /* Check if TXFIFOs start at non-zero addr */ + reg = dwc3_readl(dwc->regs, DWC3_GTXFIFOSIZ(0)); + fifo_0_start = DWC3_GTXFIFOSIZ_TXFSTADDR(reg); + + ram_depth -= (fifo_0_start >> 16); + } + + return ram_depth; +} + /** * dwc3_gadget_clear_tx_fifos - Clears txfifo allocation * @dwc: pointer to the DWC3 context @@ -753,7 +791,7 @@ static int dwc3_gadget_resize_tx_fifos(struct dwc3_ep *dep) { struct dwc3 *dwc = dep->dwc; int fifo_0_start; - int ram1_depth; + int ram_depth; int fifo_size; int min_depth; int num_in_ep; @@ -773,7 +811,7 @@ static int dwc3_gadget_resize_tx_fifos(struct dwc3_ep *dep) if (dep->flags & DWC3_EP_TXFIFO_RESIZED) return 0;
- ram1_depth = DWC3_RAM1_DEPTH(dwc->hwparams.hwparams7); + ram_depth = dwc3_gadget_calc_ram_depth(dwc);
if ((dep->endpoint.maxburst > 1 && usb_endpoint_xfer_bulk(dep->endpoint.desc)) || @@ -794,7 +832,7 @@ static int dwc3_gadget_resize_tx_fifos(struct dwc3_ep *dep)
/* Reserve at least one FIFO for the number of IN EPs */ min_depth = num_in_ep * (fifo + 1); - remaining = ram1_depth - min_depth - dwc->last_fifo_depth; + remaining = ram_depth - min_depth - dwc->last_fifo_depth; remaining = max_t(int, 0, remaining); /* * We've already reserved 1 FIFO per EP, so check what we can fit in @@ -820,9 +858,9 @@ static int dwc3_gadget_resize_tx_fifos(struct dwc3_ep *dep) dwc->last_fifo_depth += DWC31_GTXFIFOSIZ_TXFDEP(fifo_size);
/* Check fifo size allocation doesn't exceed available RAM size. */ - if (dwc->last_fifo_depth >= ram1_depth) { + if (dwc->last_fifo_depth >= ram_depth) { dev_err(dwc->dev, "Fifosize(%d) > RAM size(%d) %s depth:%d\n", - dwc->last_fifo_depth, ram1_depth, + dwc->last_fifo_depth, ram_depth, dep->endpoint.name, fifo_size); if (DWC3_IP_IS(DWC3)) fifo_size = DWC3_GTXFIFOSIZ_TXFDEP(fifo_size); @@ -3078,7 +3116,7 @@ static int dwc3_gadget_check_config(struct usb_gadget *g) struct dwc3 *dwc = gadget_to_dwc(g); struct usb_ep *ep; int fifo_size = 0; - int ram1_depth; + int ram_depth; int ep_num = 0;
if (!dwc->do_fifo_resize) @@ -3101,8 +3139,8 @@ static int dwc3_gadget_check_config(struct usb_gadget *g) fifo_size += dwc->max_cfg_eps;
/* Check if we can fit a single fifo per endpoint */ - ram1_depth = DWC3_RAM1_DEPTH(dwc->hwparams.hwparams7); - if (fifo_size > ram1_depth) + ram_depth = dwc3_gadget_calc_ram_depth(dwc); + if (fifo_size > ram_depth) return -ENOMEM;
return 0;
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Thomas Gleixner tglx@linutronix.de
[ Upstream commit b23decf8ac9102fc52c4de5196f4dc0a5f3eb80b ]
Idle tasks are initialized via __sched_fork() twice:
fork_idle() copy_process() sched_fork() __sched_fork() init_idle() __sched_fork()
Instead of cleaning this up, sched_ext hacked around it. Even when analyis and solution were provided in a discussion, nobody cared to clean this up.
init_idle() is also invoked from sched_init() to initialize the boot CPU's idle task, which requires the __sched_fork() invocation. But this can be trivially solved by invoking __sched_fork() before init_idle() in sched_init() and removing the __sched_fork() invocation from init_idle().
Do so and clean up the comments explaining this historical leftover.
Signed-off-by: Thomas Gleixner tglx@linutronix.de Signed-off-by: Peter Zijlstra (Intel) peterz@infradead.org Link: https://lore.kernel.org/r/20241028103142.359584747@linutronix.de Signed-off-by: Sasha Levin sashal@kernel.org --- kernel/sched/core.c | 12 +++++------- 1 file changed, 5 insertions(+), 7 deletions(-)
diff --git a/kernel/sched/core.c b/kernel/sched/core.c index 228f7c07da72..86606fb9e6bc 100644 --- a/kernel/sched/core.c +++ b/kernel/sched/core.c @@ -4488,7 +4488,8 @@ int wake_up_state(struct task_struct *p, unsigned int state) * Perform scheduler related setup for a newly forked process p. * p is forked by current. * - * __sched_fork() is basic setup used by init_idle() too: + * __sched_fork() is basic setup which is also used by sched_init() to + * initialize the boot CPU's idle task. */ static void __sched_fork(unsigned long clone_flags, struct task_struct *p) { @@ -9257,8 +9258,6 @@ void __init init_idle(struct task_struct *idle, int cpu) struct rq *rq = cpu_rq(cpu); unsigned long flags;
- __sched_fork(0, idle); - raw_spin_lock_irqsave(&idle->pi_lock, flags); raw_spin_rq_lock(rq);
@@ -9273,10 +9272,8 @@ void __init init_idle(struct task_struct *idle, int cpu)
#ifdef CONFIG_SMP /* - * It's possible that init_idle() gets called multiple times on a task, - * in that case do_set_cpus_allowed() will not do the right thing. - * - * And since this is boot we can forgo the serialization. + * No validation and serialization required at boot time and for + * setting up the idle tasks of not yet online CPUs. */ set_cpus_allowed_common(idle, &ac); #endif @@ -10105,6 +10102,7 @@ void __init sched_init(void) * but because we are the idle thread, we just pick up running again * when this runqueue becomes "idle". */ + __sched_fork(0, current); init_idle(current, smp_processor_id());
calc_load_update = jiffies + LOAD_FREQ;
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Shixiong Ou oushixiong@kylinos.cn
[ Upstream commit 949658cb9b69ab9d22a42a662b2fdc7085689ed8 ]
In some causes, HPD signals will jitter when plugging in or unplugging HDMI.
Rescheduling the hotplug work for a second when EDID may still be readable but HDP is disconnected, and fixes this issue.
Signed-off-by: Shixiong Ou oushixiong@kylinos.cn Signed-off-by: Alex Deucher alexander.deucher@amd.com Stable-dep-of: 979bfe291b5b ("Revert "drm/radeon: Delay Connector detecting when HPD singals is unstable"") Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/radeon/radeon_connectors.c | 10 ++++++++++ 1 file changed, 10 insertions(+)
diff --git a/drivers/gpu/drm/radeon/radeon_connectors.c b/drivers/gpu/drm/radeon/radeon_connectors.c index b84b58926106..cf0114ca59a4 100644 --- a/drivers/gpu/drm/radeon/radeon_connectors.c +++ b/drivers/gpu/drm/radeon/radeon_connectors.c @@ -1267,6 +1267,16 @@ radeon_dvi_detect(struct drm_connector *connector, bool force) goto exit; } } + + if (dret && radeon_connector->hpd.hpd != RADEON_HPD_NONE && + !radeon_hpd_sense(rdev, radeon_connector->hpd.hpd) && + connector->connector_type == DRM_MODE_CONNECTOR_HDMIA) { + DRM_DEBUG_KMS("EDID is readable when HPD disconnected\n"); + schedule_delayed_work(&rdev->hotplug_work, msecs_to_jiffies(1000)); + ret = connector_status_disconnected; + goto exit; + } + if (dret) { radeon_connector->detected_by_load = false; radeon_connector_free_edid(connector);
[Public]
-----Original Message----- From: Greg Kroah-Hartman gregkh@linuxfoundation.org Sent: Monday, January 6, 2025 10:14 AM To: stable@vger.kernel.org Cc: Greg Kroah-Hartman gregkh@linuxfoundation.org; patches@lists.linux.dev; Shixiong Ou oushixiong@kylinos.cn; Deucher, Alexander Alexander.Deucher@amd.com; Sasha Levin sashal@kernel.org Subject: [PATCH 6.6 014/222] drm/radeon: Delay Connector detecting when HPD singals is unstable
6.6-stable review patch. If anyone has any objections, please let me know.
From: Shixiong Ou oushixiong@kylinos.cn
[ Upstream commit 949658cb9b69ab9d22a42a662b2fdc7085689ed8 ]
In some causes, HPD signals will jitter when plugging in or unplugging HDMI.
Rescheduling the hotplug work for a second when EDID may still be readable but HDP is disconnected, and fixes this issue.
Signed-off-by: Shixiong Ou oushixiong@kylinos.cn Signed-off-by: Alex Deucher alexander.deucher@amd.com Stable-dep-of: 979bfe291b5b ("Revert "drm/radeon: Delay Connector detecting when HPD singals is unstable"")
Please drop both of these patches. There is no need to pull back a patch just so that you can apply the revert.
Thanks,
Alex
Signed-off-by: Sasha Levin sashal@kernel.org
drivers/gpu/drm/radeon/radeon_connectors.c | 10 ++++++++++ 1 file changed, 10 insertions(+)
diff --git a/drivers/gpu/drm/radeon/radeon_connectors.c b/drivers/gpu/drm/radeon/radeon_connectors.c index b84b58926106..cf0114ca59a4 100644 --- a/drivers/gpu/drm/radeon/radeon_connectors.c +++ b/drivers/gpu/drm/radeon/radeon_connectors.c @@ -1267,6 +1267,16 @@ radeon_dvi_detect(struct drm_connector *connector, bool force) goto exit; } }
if (dret && radeon_connector->hpd.hpd != RADEON_HPD_NONE &&
!radeon_hpd_sense(rdev, radeon_connector->hpd.hpd) &&
connector->connector_type == DRM_MODE_CONNECTOR_HDMIA) {
DRM_DEBUG_KMS("EDID is readable when HPD
disconnected\n");
schedule_delayed_work(&rdev->hotplug_work,
msecs_to_jiffies(1000));
ret = connector_status_disconnected;
goto exit;
}
if (dret) { radeon_connector->detected_by_load = false; radeon_connector_free_edid(connector);
-- 2.39.5
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Alex Deucher alexander.deucher@amd.com
[ Upstream commit 979bfe291b5b30a9132c2fd433247e677b24c6aa ]
This reverts commit 949658cb9b69ab9d22a42a662b2fdc7085689ed8.
This causes a blank screen on boot.
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3696 Signed-off-by: Alex Deucher alexander.deucher@amd.com Cc: Shixiong Ou oushixiong@kylinos.cn Cc: stable@vger.kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/radeon/radeon_connectors.c | 10 ---------- 1 file changed, 10 deletions(-)
diff --git a/drivers/gpu/drm/radeon/radeon_connectors.c b/drivers/gpu/drm/radeon/radeon_connectors.c index cf0114ca59a4..b84b58926106 100644 --- a/drivers/gpu/drm/radeon/radeon_connectors.c +++ b/drivers/gpu/drm/radeon/radeon_connectors.c @@ -1267,16 +1267,6 @@ radeon_dvi_detect(struct drm_connector *connector, bool force) goto exit; } } - - if (dret && radeon_connector->hpd.hpd != RADEON_HPD_NONE && - !radeon_hpd_sense(rdev, radeon_connector->hpd.hpd) && - connector->connector_type == DRM_MODE_CONNECTOR_HDMIA) { - DRM_DEBUG_KMS("EDID is readable when HPD disconnected\n"); - schedule_delayed_work(&rdev->hotplug_work, msecs_to_jiffies(1000)); - ret = connector_status_disconnected; - goto exit; - } - if (dret) { radeon_connector->detected_by_load = false; radeon_connector_free_edid(connector);
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Miguel Ojeda ojeda@kernel.org
[ Upstream commit f8f88aa25a03ce1e0fc8a9842840988b870f0c37 ]
Since we are starting to support several Rust toolchains, lints (including Clippy ones) now may behave differently and lint groups may include new lints.
Therefore, to maximize the chances a given version works, relax some deny-level lints to warnings. It may also make our lives a bit easier while developing new code or refactoring.
To be clear, the requirements for in-tree code are still the same, since Rust code still needs to be warning-free (patches should be clean under `WERROR=y`) and the set of lints is not changed.
`unsafe_op_in_unsafe_fn` is left unmodified, i.e. as an error, since it is becoming the default in the language (warn-by-default in Rust 2024 [1] and ideally an error later on) and thus it should also be very well tested. In addition, it is simple enough that it should not have false positives (unlike e.g. `rust_2018_idioms`'s `explicit_outlives_requirements`).
`non_ascii_idents` is left unmodified as well, i.e. as an error, since it is unlikely one gains any productivity during development if it were a warning (in fact, it may be worse, since it is likely one made a typo). In addition, it should not have false positives.
Finally, put the two `-D` ones at the top and take the chance to do one per line.
Link: https://github.com/rust-lang/rust/pull/112038 [1] Reviewed-by: Finn Behrens me@kloenk.dev Tested-by: Benno Lossin benno.lossin@proton.me Tested-by: Andreas Hindborg a.hindborg@samsung.com Link: https://lore.kernel.org/r/20240709160615.998336-5-ojeda@kernel.org Signed-off-by: Miguel Ojeda ojeda@kernel.org Stable-dep-of: 60fc1e675013 ("rust: allow `clippy::needless_lifetimes`") Signed-off-by: Sasha Levin sashal@kernel.org --- Makefile | 24 +++++++++++++----------- rust/Makefile | 4 ++-- 2 files changed, 15 insertions(+), 13 deletions(-)
diff --git a/Makefile b/Makefile index ec4d9d1d9b7a..0be58613bfe0 100644 --- a/Makefile +++ b/Makefile @@ -457,17 +457,19 @@ KBUILD_USERLDFLAGS := $(USERLDFLAGS) # host programs. export rust_common_flags := --edition=2021 \ -Zbinary_dep_depinfo=y \ - -Dunsafe_op_in_unsafe_fn -Drust_2018_idioms \ - -Dunreachable_pub -Dnon_ascii_idents \ + -Dunsafe_op_in_unsafe_fn \ + -Dnon_ascii_idents \ + -Wrust_2018_idioms \ + -Wunreachable_pub \ -Wmissing_docs \ - -Drustdoc::missing_crate_level_docs \ - -Dclippy::correctness -Dclippy::style \ - -Dclippy::suspicious -Dclippy::complexity \ - -Dclippy::perf \ - -Dclippy::let_unit_value -Dclippy::mut_mut \ - -Dclippy::needless_bitwise_bool \ - -Dclippy::needless_continue \ - -Dclippy::no_mangle_with_rust_abi \ + -Wrustdoc::missing_crate_level_docs \ + -Wclippy::correctness -Wclippy::style \ + -Wclippy::suspicious -Wclippy::complexity \ + -Wclippy::perf \ + -Wclippy::let_unit_value -Wclippy::mut_mut \ + -Wclippy::needless_bitwise_bool \ + -Wclippy::needless_continue \ + -Wclippy::no_mangle_with_rust_abi \ -Wclippy::dbg_macro
KBUILD_HOSTCFLAGS := $(KBUILD_USERHOSTCFLAGS) $(HOST_LFS_CFLAGS) $(HOSTCFLAGS) @@ -572,7 +574,7 @@ KBUILD_RUSTFLAGS := $(rust_common_flags) \ -Csymbol-mangling-version=v0 \ -Crelocation-model=static \ -Zfunction-sections=n \ - -Dclippy::float_arithmetic + -Wclippy::float_arithmetic
KBUILD_AFLAGS_KERNEL := KBUILD_CFLAGS_KERNEL := diff --git a/rust/Makefile b/rust/Makefile index 333b9a482473..12b9d78fd25c 100644 --- a/rust/Makefile +++ b/rust/Makefile @@ -422,7 +422,7 @@ ifneq ($(or $(CONFIG_ARM64),$(and $(CONFIG_RISCV),$(CONFIG_64BIT))),) endif
$(obj)/core.o: private skip_clippy = 1 -$(obj)/core.o: private skip_flags = -Dunreachable_pub +$(obj)/core.o: private skip_flags = -Wunreachable_pub $(obj)/core.o: private rustc_objcopy = $(foreach sym,$(redirect-intrinsics),--redefine-sym $(sym)=__rust$(sym)) $(obj)/core.o: private rustc_target_flags = $(core-cfgs) $(obj)/core.o: $(RUST_LIB_SRC)/core/src/lib.rs scripts/target.json FORCE @@ -433,7 +433,7 @@ $(obj)/compiler_builtins.o: $(src)/compiler_builtins.rs $(obj)/core.o FORCE $(call if_changed_dep,rustc_library)
$(obj)/alloc.o: private skip_clippy = 1 -$(obj)/alloc.o: private skip_flags = -Dunreachable_pub +$(obj)/alloc.o: private skip_flags = -Wunreachable_pub $(obj)/alloc.o: private rustc_target_flags = $(alloc-cfgs) $(obj)/alloc.o: $(src)/alloc/lib.rs $(obj)/compiler_builtins.o FORCE $(call if_changed_dep,rustc_library)
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Miguel Ojeda ojeda@kernel.org
[ Upstream commit 60fc1e6750133620e404d40b93df5afe32e3e6c6 ]
In beta Clippy (i.e. Rust 1.83.0), the `needless_lifetimes` lint has been extended [1] to suggest eliding `impl` lifetimes, e.g.
error: the following explicit lifetimes could be elided: 'a --> rust/kernel/list.rs:647:6 | 647 | impl<'a, T: ?Sized + ListItem<ID>, const ID: u64> FusedIterator for Iter<'a, T, ID> {} | ^^ ^^ | = help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#needless_lifetimes = note: `-D clippy::needless-lifetimes` implied by `-D warnings` = help: to override `-D warnings` add `#[allow(clippy::needless_lifetimes)]` help: elide the lifetimes | 647 - impl<'a, T: ?Sized + ListItem<ID>, const ID: u64> FusedIterator for Iter<'a, T, ID> {} 647 + impl<T: ?Sized + ListItem<ID>, const ID: u64> FusedIterator for Iter<'_, T, ID> {}
A possibility would have been to clean them -- the RFC patch [2] did this, while asking if we wanted these cleanups. There is an open issue [3] in Clippy about being able to differentiate some of the new cases, e.g. those that do not involve introducing `'_`. Thus it seems others feel similarly.
Thus, for the time being, we decided to `allow` the lint.
Link: https://github.com/rust-lang/rust-clippy/pull/13286 [1] Link: https://lore.kernel.org/rust-for-linux/20241012231300.397010-1-ojeda@kernel.... [2] Link: https://github.com/rust-lang/rust-clippy/issues/13514 [3] Reviewed-by: Alice Ryhl aliceryhl@google.com Reviewed-by: Andreas Hindborg a.hindborg@kernel.org Link: https://lore.kernel.org/r/20241116181538.369355-1-ojeda@kernel.org Signed-off-by: Miguel Ojeda ojeda@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- Makefile | 1 + 1 file changed, 1 insertion(+)
diff --git a/Makefile b/Makefile index 0be58613bfe0..31b4fc9f5ccb 100644 --- a/Makefile +++ b/Makefile @@ -469,6 +469,7 @@ export rust_common_flags := --edition=2021 \ -Wclippy::let_unit_value -Wclippy::mut_mut \ -Wclippy::needless_bitwise_bool \ -Wclippy::needless_continue \ + -Aclippy::needless_lifetimes \ -Wclippy::no_mangle_with_rust_abi \ -Wclippy::dbg_macro
On Mon, Jan 6, 2025 at 4:23 PM Greg Kroah-Hartman gregkh@linuxfoundation.org wrote:
6.6-stable review patch. If anyone has any objections, please let me know.
This should not be needed (because in 6.6.y the Rust version is pinned, unlike in 6.12.y), so both Rust-related commits can be dropped, but they should not hurt either (built-tested).
See https://lore.kernel.org/stable/CANiq72k9A-adJy8uzog_NdrrfLh6+EgHY0kqPcA5Y45H...
Thanks!
Cheers, Miguel
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Liam Ni zhiguangni01@gmail.com
[ Upstream commit ff6c3d81f2e86b63a3a530683f89ef393882782a ]
Sanity check that makes sure the nodes cover all memory loops over numa_meminfo to count the pages that have node id assigned by the firmware, then loops again over memblock.memory to find the total amount of memory and in the end checks that the difference between the total memory and memory that covered by nodes is less than some threshold. Worse, the loop over numa_meminfo calls __absent_pages_in_range() that also partially traverses memblock.memory.
It's much simpler and more efficient to have a single traversal of memblock.memory that verifies that amount of memory not covered by nodes is less than a threshold.
Introduce memblock_validate_numa_coverage() that does exactly that and use it instead of numa_meminfo_cover_memory().
Link: https://lkml.kernel.org/r/20231026020329.327329-1-zhiguangni01@gmail.com Signed-off-by: Liam Ni zhiguangni01@gmail.com Reviewed-by: Mike Rapoport (IBM) rppt@kernel.org Cc: Andy Lutomirski luto@kernel.org Cc: Bibo Mao maobibo@loongson.cn Cc: Binbin Zhou zhoubinbin@loongson.cn Cc: Borislav Petkov bp@alien8.de Cc: Dave Hansen dave.hansen@linux.intel.com Cc: Feiyang Chen chenfeiyang@loongson.cn Cc: "H. Peter Anvin" hpa@zytor.com Cc: Huacai Chen chenhuacai@kernel.org Cc: Ingo Molnar mingo@redhat.com Cc: Peter Zijlstra peterz@infradead.org Cc: Thomas Gleixner tglx@linutronix.de Cc: WANG Xuerui kernel@xen0n.name Signed-off-by: Andrew Morton akpm@linux-foundation.org Stable-dep-of: 9cdc6423acb4 ("memblock: allow zero threshold in validate_numa_converage()") Signed-off-by: Sasha Levin sashal@kernel.org --- arch/loongarch/kernel/numa.c | 28 +--------------------------- arch/x86/mm/numa.c | 34 ++-------------------------------- include/linux/memblock.h | 1 + mm/memblock.c | 34 ++++++++++++++++++++++++++++++++++ 4 files changed, 38 insertions(+), 59 deletions(-)
diff --git a/arch/loongarch/kernel/numa.c b/arch/loongarch/kernel/numa.c index 6e65ff12d5c7..8fe21f868f72 100644 --- a/arch/loongarch/kernel/numa.c +++ b/arch/loongarch/kernel/numa.c @@ -226,32 +226,6 @@ static void __init node_mem_init(unsigned int node)
#ifdef CONFIG_ACPI_NUMA
-/* - * Sanity check to catch more bad NUMA configurations (they are amazingly - * common). Make sure the nodes cover all memory. - */ -static bool __init numa_meminfo_cover_memory(const struct numa_meminfo *mi) -{ - int i; - u64 numaram, biosram; - - numaram = 0; - for (i = 0; i < mi->nr_blks; i++) { - u64 s = mi->blk[i].start >> PAGE_SHIFT; - u64 e = mi->blk[i].end >> PAGE_SHIFT; - - numaram += e - s; - numaram -= __absent_pages_in_range(mi->blk[i].nid, s, e); - if ((s64)numaram < 0) - numaram = 0; - } - max_pfn = max_low_pfn; - biosram = max_pfn - absent_pages_in_range(0, max_pfn); - - BUG_ON((s64)(biosram - numaram) >= (1 << (20 - PAGE_SHIFT))); - return true; -} - static void __init add_node_intersection(u32 node, u64 start, u64 size, u32 type) { static unsigned long num_physpages; @@ -396,7 +370,7 @@ int __init init_numa_memory(void) return -EINVAL;
init_node_memblock(); - if (numa_meminfo_cover_memory(&numa_meminfo) == false) + if (!memblock_validate_numa_coverage(SZ_1M)) return -EINVAL;
for_each_node_mask(node, node_possible_map) { diff --git a/arch/x86/mm/numa.c b/arch/x86/mm/numa.c index c7fa5396c0f0..2c67bfc3cf32 100644 --- a/arch/x86/mm/numa.c +++ b/arch/x86/mm/numa.c @@ -448,37 +448,6 @@ int __node_distance(int from, int to) } EXPORT_SYMBOL(__node_distance);
-/* - * Sanity check to catch more bad NUMA configurations (they are amazingly - * common). Make sure the nodes cover all memory. - */ -static bool __init numa_meminfo_cover_memory(const struct numa_meminfo *mi) -{ - u64 numaram, e820ram; - int i; - - numaram = 0; - for (i = 0; i < mi->nr_blks; i++) { - u64 s = mi->blk[i].start >> PAGE_SHIFT; - u64 e = mi->blk[i].end >> PAGE_SHIFT; - numaram += e - s; - numaram -= __absent_pages_in_range(mi->blk[i].nid, s, e); - if ((s64)numaram < 0) - numaram = 0; - } - - e820ram = max_pfn - absent_pages_in_range(0, max_pfn); - - /* We seem to lose 3 pages somewhere. Allow 1M of slack. */ - if ((s64)(e820ram - numaram) >= (1 << (20 - PAGE_SHIFT))) { - printk(KERN_ERR "NUMA: nodes only cover %LuMB of your %LuMB e820 RAM. Not used.\n", - (numaram << PAGE_SHIFT) >> 20, - (e820ram << PAGE_SHIFT) >> 20); - return false; - } - return true; -} - /* * Mark all currently memblock-reserved physical memory (which covers the * kernel's own memory ranges) as hot-unswappable. @@ -584,7 +553,8 @@ static int __init numa_register_memblks(struct numa_meminfo *mi) return -EINVAL; } } - if (!numa_meminfo_cover_memory(mi)) + + if (!memblock_validate_numa_coverage(SZ_1M)) return -EINVAL;
/* Finally register nodes. */ diff --git a/include/linux/memblock.h b/include/linux/memblock.h index ed57c23f80ac..ed64240041e8 100644 --- a/include/linux/memblock.h +++ b/include/linux/memblock.h @@ -122,6 +122,7 @@ unsigned long memblock_addrs_overlap(phys_addr_t base1, phys_addr_t size1, phys_addr_t base2, phys_addr_t size2); bool memblock_overlaps_region(struct memblock_type *type, phys_addr_t base, phys_addr_t size); +bool memblock_validate_numa_coverage(unsigned long threshold_bytes); int memblock_mark_hotplug(phys_addr_t base, phys_addr_t size); int memblock_clear_hotplug(phys_addr_t base, phys_addr_t size); int memblock_mark_mirror(phys_addr_t base, phys_addr_t size); diff --git a/mm/memblock.c b/mm/memblock.c index d630f5c2bdb9..3a3ab73546f5 100644 --- a/mm/memblock.c +++ b/mm/memblock.c @@ -735,6 +735,40 @@ int __init_memblock memblock_add(phys_addr_t base, phys_addr_t size) return memblock_add_range(&memblock.memory, base, size, MAX_NUMNODES, 0); }
+/** + * memblock_validate_numa_coverage - check if amount of memory with + * no node ID assigned is less than a threshold + * @threshold_bytes: maximal number of pages that can have unassigned node + * ID (in bytes). + * + * A buggy firmware may report memory that does not belong to any node. + * Check if amount of such memory is below @threshold_bytes. + * + * Return: true on success, false on failure. + */ +bool __init_memblock memblock_validate_numa_coverage(unsigned long threshold_bytes) +{ + unsigned long nr_pages = 0; + unsigned long start_pfn, end_pfn, mem_size_mb; + int nid, i; + + /* calculate lose page */ + for_each_mem_pfn_range(i, MAX_NUMNODES, &start_pfn, &end_pfn, &nid) { + if (nid == NUMA_NO_NODE) + nr_pages += end_pfn - start_pfn; + } + + if ((nr_pages << PAGE_SHIFT) >= threshold_bytes) { + mem_size_mb = memblock_phys_mem_size() >> 20; + pr_err("NUMA: no nodes coverage for %luMB of %luMB RAM\n", + (nr_pages << PAGE_SHIFT) >> 20, mem_size_mb); + return false; + } + + return true; +} + + /** * memblock_isolate_range - isolate given range into disjoint memblocks * @type: memblock type to isolate range for
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Mike Rapoport (Microsoft) rppt@kernel.org
[ Upstream commit 9cdc6423acb49055efb444ecd895d853a70ef931 ]
Currently memblock validate_numa_converage() returns false negative when threshold set to zero.
Make the check if the memory size with invalid node ID is greater than the threshold exclusive to fix that.
Link: https://lore.kernel.org/all/Z0mIDBD4KLyxyOCm@kernel.org/ Signed-off-by: Mike Rapoport (Microsoft) rppt@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- mm/memblock.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/mm/memblock.c b/mm/memblock.c index 3a3ab73546f5..87a2b4340ce4 100644 --- a/mm/memblock.c +++ b/mm/memblock.c @@ -738,7 +738,7 @@ int __init_memblock memblock_add(phys_addr_t base, phys_addr_t size) /** * memblock_validate_numa_coverage - check if amount of memory with * no node ID assigned is less than a threshold - * @threshold_bytes: maximal number of pages that can have unassigned node + * @threshold_bytes: maximal memory size that can have unassigned node * ID (in bytes). * * A buggy firmware may report memory that does not belong to any node. @@ -758,7 +758,7 @@ bool __init_memblock memblock_validate_numa_coverage(unsigned long threshold_byt nr_pages += end_pfn - start_pfn; }
- if ((nr_pages << PAGE_SHIFT) >= threshold_bytes) { + if ((nr_pages << PAGE_SHIFT) > threshold_bytes) { mem_size_mb = memblock_phys_mem_size() >> 20; pr_err("NUMA: no nodes coverage for %luMB of %luMB RAM\n", (nr_pages << PAGE_SHIFT) >> 20, mem_size_mb);
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jeff Layton jlayton@kernel.org
[ Upstream commit b898ab233611f7903d88c0b10f8145e1c15d3642 ]
Convert to using the new inode timestamp accessor functions.
Signed-off-by: Jeff Layton jlayton@kernel.org Link: https://lore.kernel.org/r/20231004185347.80880-33-jlayton@kernel.org Signed-off-by: Christian Brauner brauner@kernel.org Stable-dep-of: c7fc0366c656 ("ext4: partial zero eof block on unaligned inode size extension") Signed-off-by: Sasha Levin sashal@kernel.org --- fs/ext4/ext4.h | 20 +++++++++++++++----- fs/ext4/extents.c | 11 ++++++----- fs/ext4/ialloc.c | 4 ++-- fs/ext4/inline.c | 4 ++-- fs/ext4/inode.c | 19 ++++++++++--------- fs/ext4/ioctl.c | 13 +++++++++++-- fs/ext4/namei.c | 10 +++++----- fs/ext4/super.c | 2 +- fs/ext4/xattr.c | 8 ++++---- 9 files changed, 56 insertions(+), 35 deletions(-)
diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h index 3db01b933c3e..60455c84a937 100644 --- a/fs/ext4/ext4.h +++ b/fs/ext4/ext4.h @@ -891,10 +891,13 @@ do { \ (raw_inode)->xtime = cpu_to_le32(clamp_t(int32_t, (ts).tv_sec, S32_MIN, S32_MAX)); \ } while (0)
-#define EXT4_INODE_SET_XTIME(xtime, inode, raw_inode) \ - EXT4_INODE_SET_XTIME_VAL(xtime, inode, raw_inode, (inode)->xtime) +#define EXT4_INODE_SET_ATIME(inode, raw_inode) \ + EXT4_INODE_SET_XTIME_VAL(i_atime, inode, raw_inode, inode_get_atime(inode))
-#define EXT4_INODE_SET_CTIME(inode, raw_inode) \ +#define EXT4_INODE_SET_MTIME(inode, raw_inode) \ + EXT4_INODE_SET_XTIME_VAL(i_mtime, inode, raw_inode, inode_get_mtime(inode)) + +#define EXT4_INODE_SET_CTIME(inode, raw_inode) \ EXT4_INODE_SET_XTIME_VAL(i_ctime, inode, raw_inode, inode_get_ctime(inode))
#define EXT4_EINODE_SET_XTIME(xtime, einode, raw_inode) \ @@ -910,9 +913,16 @@ do { \ .tv_sec = (signed)le32_to_cpu((raw_inode)->xtime) \ })
-#define EXT4_INODE_GET_XTIME(xtime, inode, raw_inode) \ +#define EXT4_INODE_GET_ATIME(inode, raw_inode) \ +do { \ + inode_set_atime_to_ts(inode, \ + EXT4_INODE_GET_XTIME_VAL(i_atime, inode, raw_inode)); \ +} while (0) + +#define EXT4_INODE_GET_MTIME(inode, raw_inode) \ do { \ - (inode)->xtime = EXT4_INODE_GET_XTIME_VAL(xtime, inode, raw_inode); \ + inode_set_mtime_to_ts(inode, \ + EXT4_INODE_GET_XTIME_VAL(i_mtime, inode, raw_inode)); \ } while (0)
#define EXT4_INODE_GET_CTIME(inode, raw_inode) \ diff --git a/fs/ext4/extents.c b/fs/ext4/extents.c index 5ea75af6ca22..f2341d99d09f 100644 --- a/fs/ext4/extents.c +++ b/fs/ext4/extents.c @@ -4532,7 +4532,8 @@ static int ext4_alloc_file_blocks(struct file *file, ext4_lblk_t offset, if (epos > new_size) epos = new_size; if (ext4_update_inode_size(inode, epos) & 0x1) - inode->i_mtime = inode_get_ctime(inode); + inode_set_mtime_to_ts(inode, + inode_get_ctime(inode)); } ret2 = ext4_mark_inode_dirty(handle, inode); ext4_update_inode_fsync_trans(handle, inode, 1); @@ -4670,7 +4671,7 @@ static long ext4_zero_range(struct file *file, loff_t offset,
/* Now release the pages and zero block aligned part of pages */ truncate_pagecache_range(inode, start, end - 1); - inode->i_mtime = inode_set_ctime_current(inode); + inode_set_mtime_to_ts(inode, inode_set_ctime_current(inode));
ret = ext4_alloc_file_blocks(file, lblk, max_blocks, new_size, flags); @@ -4695,7 +4696,7 @@ static long ext4_zero_range(struct file *file, loff_t offset, goto out_mutex; }
- inode->i_mtime = inode_set_ctime_current(inode); + inode_set_mtime_to_ts(inode, inode_set_ctime_current(inode)); if (new_size) ext4_update_inode_size(inode, new_size); ret = ext4_mark_inode_dirty(handle, inode); @@ -5431,7 +5432,7 @@ static int ext4_collapse_range(struct file *file, loff_t offset, loff_t len) up_write(&EXT4_I(inode)->i_data_sem); if (IS_SYNC(inode)) ext4_handle_sync(handle); - inode->i_mtime = inode_set_ctime_current(inode); + inode_set_mtime_to_ts(inode, inode_set_ctime_current(inode)); ret = ext4_mark_inode_dirty(handle, inode); ext4_update_inode_fsync_trans(handle, inode, 1);
@@ -5541,7 +5542,7 @@ static int ext4_insert_range(struct file *file, loff_t offset, loff_t len) /* Expand file to avoid data loss if there is error while shifting */ inode->i_size += len; EXT4_I(inode)->i_disksize += len; - inode->i_mtime = inode_set_ctime_current(inode); + inode_set_mtime_to_ts(inode, inode_set_ctime_current(inode)); ret = ext4_mark_inode_dirty(handle, inode); if (ret) goto out_stop; diff --git a/fs/ext4/ialloc.c b/fs/ext4/ialloc.c index d4d0ad689d3c..52f2959d29e6 100644 --- a/fs/ext4/ialloc.c +++ b/fs/ext4/ialloc.c @@ -1255,8 +1255,8 @@ struct inode *__ext4_new_inode(struct mnt_idmap *idmap, inode->i_ino = ino + group * EXT4_INODES_PER_GROUP(sb); /* This is the optimal IO size (for stat), not the fs block size */ inode->i_blocks = 0; - inode->i_mtime = inode->i_atime = inode_set_ctime_current(inode); - ei->i_crtime = inode->i_mtime; + simple_inode_init_ts(inode); + ei->i_crtime = inode_get_mtime(inode);
memset(ei->i_data, 0, sizeof(ei->i_data)); ei->i_dir_start_lookup = 0; diff --git a/fs/ext4/inline.c b/fs/ext4/inline.c index cb65052ee3de..3f363276ddd3 100644 --- a/fs/ext4/inline.c +++ b/fs/ext4/inline.c @@ -1037,7 +1037,7 @@ static int ext4_add_dirent_to_inline(handle_t *handle, * happen is that the times are slightly out of date * and/or different from the directory change time. */ - dir->i_mtime = inode_set_ctime_current(dir); + inode_set_mtime_to_ts(dir, inode_set_ctime_current(dir)); ext4_update_dx_flag(dir); inode_inc_iversion(dir); return 1; @@ -2010,7 +2010,7 @@ int ext4_inline_data_truncate(struct inode *inode, int *has_inline) ext4_orphan_del(handle, inode);
if (err == 0) { - inode->i_mtime = inode_set_ctime_current(inode); + inode_set_mtime_to_ts(inode, inode_set_ctime_current(inode)); err = ext4_mark_inode_dirty(handle, inode); if (IS_SYNC(inode)) ext4_handle_sync(handle); diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c index 18ec9106c5b0..04f25a04f35a 100644 --- a/fs/ext4/inode.c +++ b/fs/ext4/inode.c @@ -4055,7 +4055,7 @@ int ext4_punch_hole(struct file *file, loff_t offset, loff_t length) if (IS_SYNC(inode)) ext4_handle_sync(handle);
- inode->i_mtime = inode_set_ctime_current(inode); + inode_set_mtime_to_ts(inode, inode_set_ctime_current(inode)); ret2 = ext4_mark_inode_dirty(handle, inode); if (unlikely(ret2)) ret = ret2; @@ -4215,7 +4215,7 @@ int ext4_truncate(struct inode *inode) if (inode->i_nlink) ext4_orphan_del(handle, inode);
- inode->i_mtime = inode_set_ctime_current(inode); + inode_set_mtime_to_ts(inode, inode_set_ctime_current(inode)); err2 = ext4_mark_inode_dirty(handle, inode); if (unlikely(err2 && !err)) err = err2; @@ -4319,8 +4319,8 @@ static int ext4_fill_raw_inode(struct inode *inode, struct ext4_inode *raw_inode raw_inode->i_links_count = cpu_to_le16(inode->i_nlink);
EXT4_INODE_SET_CTIME(inode, raw_inode); - EXT4_INODE_SET_XTIME(i_mtime, inode, raw_inode); - EXT4_INODE_SET_XTIME(i_atime, inode, raw_inode); + EXT4_INODE_SET_MTIME(inode, raw_inode); + EXT4_INODE_SET_ATIME(inode, raw_inode); EXT4_EINODE_SET_XTIME(i_crtime, ei, raw_inode);
raw_inode->i_dtime = cpu_to_le32(ei->i_dtime); @@ -4928,8 +4928,8 @@ struct inode *__ext4_iget(struct super_block *sb, unsigned long ino, }
EXT4_INODE_GET_CTIME(inode, raw_inode); - EXT4_INODE_GET_XTIME(i_mtime, inode, raw_inode); - EXT4_INODE_GET_XTIME(i_atime, inode, raw_inode); + EXT4_INODE_GET_ATIME(inode, raw_inode); + EXT4_INODE_GET_MTIME(inode, raw_inode); EXT4_EINODE_GET_XTIME(i_crtime, ei, raw_inode);
if (likely(!test_opt2(inode->i_sb, HURD_COMPAT))) { @@ -5054,8 +5054,8 @@ static void __ext4_update_other_inode_time(struct super_block *sb,
spin_lock(&ei->i_raw_lock); EXT4_INODE_SET_CTIME(inode, raw_inode); - EXT4_INODE_SET_XTIME(i_mtime, inode, raw_inode); - EXT4_INODE_SET_XTIME(i_atime, inode, raw_inode); + EXT4_INODE_SET_MTIME(inode, raw_inode); + EXT4_INODE_SET_ATIME(inode, raw_inode); ext4_inode_csum_set(inode, raw_inode, ei); spin_unlock(&ei->i_raw_lock); trace_ext4_other_inode_update_time(inode, orig_ino); @@ -5451,7 +5451,8 @@ int ext4_setattr(struct mnt_idmap *idmap, struct dentry *dentry, * update c/mtime in shrink case below */ if (!shrink) - inode->i_mtime = inode_set_ctime_current(inode); + inode_set_mtime_to_ts(inode, + inode_set_ctime_current(inode));
if (shrink) ext4_fc_track_range(handle, inode, diff --git a/fs/ext4/ioctl.c b/fs/ext4/ioctl.c index 0bfe2ce589e2..4f931f80cb34 100644 --- a/fs/ext4/ioctl.c +++ b/fs/ext4/ioctl.c @@ -312,13 +312,22 @@ static void swap_inode_data(struct inode *inode1, struct inode *inode2) struct ext4_inode_info *ei1; struct ext4_inode_info *ei2; unsigned long tmp; + struct timespec64 ts1, ts2;
ei1 = EXT4_I(inode1); ei2 = EXT4_I(inode2);
swap(inode1->i_version, inode2->i_version); - swap(inode1->i_atime, inode2->i_atime); - swap(inode1->i_mtime, inode2->i_mtime); + + ts1 = inode_get_atime(inode1); + ts2 = inode_get_atime(inode2); + inode_set_atime_to_ts(inode1, ts2); + inode_set_atime_to_ts(inode2, ts1); + + ts1 = inode_get_mtime(inode1); + ts2 = inode_get_mtime(inode2); + inode_set_mtime_to_ts(inode1, ts2); + inode_set_mtime_to_ts(inode2, ts1);
memswap(ei1->i_data, ei2->i_data, sizeof(ei1->i_data)); tmp = ei1->i_flags & EXT4_FL_SHOULD_SWAP; diff --git a/fs/ext4/namei.c b/fs/ext4/namei.c index 4de1f61bba76..96a048d3f51b 100644 --- a/fs/ext4/namei.c +++ b/fs/ext4/namei.c @@ -2210,7 +2210,7 @@ static int add_dirent_to_buf(handle_t *handle, struct ext4_filename *fname, * happen is that the times are slightly out of date * and/or different from the directory change time. */ - dir->i_mtime = inode_set_ctime_current(dir); + inode_set_mtime_to_ts(dir, inode_set_ctime_current(dir)); ext4_update_dx_flag(dir); inode_inc_iversion(dir); err2 = ext4_mark_inode_dirty(handle, dir); @@ -3248,7 +3248,7 @@ static int ext4_rmdir(struct inode *dir, struct dentry *dentry) * recovery. */ inode->i_size = 0; ext4_orphan_add(handle, inode); - dir->i_mtime = inode_set_ctime_current(dir); + inode_set_mtime_to_ts(dir, inode_set_ctime_current(dir)); inode_set_ctime_current(inode); retval = ext4_mark_inode_dirty(handle, inode); if (retval) @@ -3323,7 +3323,7 @@ int __ext4_unlink(struct inode *dir, const struct qstr *d_name, retval = ext4_delete_entry(handle, dir, de, bh); if (retval) goto out_handle; - dir->i_mtime = inode_set_ctime_current(dir); + inode_set_mtime_to_ts(dir, inode_set_ctime_current(dir)); ext4_update_dx_flag(dir); retval = ext4_mark_inode_dirty(handle, dir); if (retval) @@ -3691,7 +3691,7 @@ static int ext4_setent(handle_t *handle, struct ext4_renament *ent, if (ext4_has_feature_filetype(ent->dir->i_sb)) ent->de->file_type = file_type; inode_inc_iversion(ent->dir); - ent->dir->i_mtime = inode_set_ctime_current(ent->dir); + inode_set_mtime_to_ts(ent->dir, inode_set_ctime_current(ent->dir)); retval = ext4_mark_inode_dirty(handle, ent->dir); BUFFER_TRACE(ent->bh, "call ext4_handle_dirty_metadata"); if (!ent->inlined) { @@ -4006,7 +4006,7 @@ static int ext4_rename(struct mnt_idmap *idmap, struct inode *old_dir, ext4_dec_count(new.inode); inode_set_ctime_current(new.inode); } - old.dir->i_mtime = inode_set_ctime_current(old.dir); + inode_set_mtime_to_ts(old.dir, inode_set_ctime_current(old.dir)); ext4_update_dx_flag(old.dir); if (old.dir_bh) { retval = ext4_rename_dir_finish(handle, &old, new.dir->i_ino); diff --git a/fs/ext4/super.c b/fs/ext4/super.c index 2346ef071b24..71ced0ada9a2 100644 --- a/fs/ext4/super.c +++ b/fs/ext4/super.c @@ -7180,7 +7180,7 @@ static int ext4_quota_off(struct super_block *sb, int type) } EXT4_I(inode)->i_flags &= ~(EXT4_NOATIME_FL | EXT4_IMMUTABLE_FL); inode_set_flags(inode, 0, S_NOATIME | S_IMMUTABLE); - inode->i_mtime = inode_set_ctime_current(inode); + inode_set_mtime_to_ts(inode, inode_set_ctime_current(inode)); err = ext4_mark_inode_dirty(handle, inode); ext4_journal_stop(handle); out_unlock: diff --git a/fs/ext4/xattr.c b/fs/ext4/xattr.c index f40785bc4e55..df5ab1a75fc4 100644 --- a/fs/ext4/xattr.c +++ b/fs/ext4/xattr.c @@ -356,7 +356,7 @@ ext4_xattr_inode_hash(struct ext4_sb_info *sbi, const void *buffer, size_t size)
static u64 ext4_xattr_inode_get_ref(struct inode *ea_inode) { - return ((u64) inode_get_ctime(ea_inode).tv_sec << 32) | + return ((u64) inode_get_ctime_sec(ea_inode) << 32) | (u32) inode_peek_iversion_raw(ea_inode); }
@@ -368,12 +368,12 @@ static void ext4_xattr_inode_set_ref(struct inode *ea_inode, u64 ref_count)
static u32 ext4_xattr_inode_get_hash(struct inode *ea_inode) { - return (u32)ea_inode->i_atime.tv_sec; + return (u32) inode_get_atime_sec(ea_inode); }
static void ext4_xattr_inode_set_hash(struct inode *ea_inode, u32 hash) { - ea_inode->i_atime.tv_sec = hash; + inode_set_atime(ea_inode, hash, 0); }
/* @@ -418,7 +418,7 @@ static int ext4_xattr_inode_read(struct inode *ea_inode, void *buf, size_t size) return ret; }
-#define EXT4_XATTR_INODE_GET_PARENT(inode) ((__u32)(inode)->i_mtime.tv_sec) +#define EXT4_XATTR_INODE_GET_PARENT(inode) ((__u32)(inode_get_mtime_sec(inode)))
static int ext4_xattr_inode_iget(struct inode *parent, unsigned long ea_ino, u32 ea_inode_hash, struct inode **ea_inode)
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Brian Foster bfoster@redhat.com
[ Upstream commit c7fc0366c65628fd69bfc310affec4918199aae2 ]
Using mapped writes, it's technically possible to expose stale post-eof data on a truncate up operation. Consider the following example:
$ xfs_io -fc "pwrite 0 2k" -c "mmap 0 4k" -c "mwrite 2k 2k" \ -c "truncate 8k" -c "pread -v 2k 16" <file> ... 00000800: 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 XXXXXXXXXXXXXXXX ...
This shows that the post-eof data written via mwrite lands within EOF after a truncate up. While this is deliberate of the test case, behavior is somewhat unpredictable because writeback does post-eof zeroing, and writeback can occur at any time in the background. For example, an fsync inserted between the mwrite and truncate causes the subsequent read to instead return zeroes. This basically means that there is a race window in this situation between any subsequent extending operation and writeback that dictates whether post-eof data is exposed to the file or zeroed.
To prevent this problem, perform partial block zeroing as part of the various inode size extending operations that are susceptible to it. For truncate extension, zero around the original eof similar to how truncate down does partial zeroing of the new eof. For extension via writes and fallocate related operations, zero the newly exposed range of the file to cover any partial zeroing that must occur at the original and new eof blocks.
Signed-off-by: Brian Foster bfoster@redhat.com Link: https://patch.msgid.link/20240919160741.208162-2-bfoster@redhat.com Signed-off-by: Theodore Ts'o tytso@mit.edu Signed-off-by: Sasha Levin sashal@kernel.org --- fs/ext4/extents.c | 7 ++++++- fs/ext4/inode.c | 51 +++++++++++++++++++++++++++++++++-------------- 2 files changed, 42 insertions(+), 16 deletions(-)
diff --git a/fs/ext4/extents.c b/fs/ext4/extents.c index f2341d99d09f..32218ac7f50f 100644 --- a/fs/ext4/extents.c +++ b/fs/ext4/extents.c @@ -4475,7 +4475,7 @@ static int ext4_alloc_file_blocks(struct file *file, ext4_lblk_t offset, int depth = 0; struct ext4_map_blocks map; unsigned int credits; - loff_t epos; + loff_t epos, old_size = i_size_read(inode);
BUG_ON(!ext4_test_inode_flag(inode, EXT4_INODE_EXTENTS)); map.m_lblk = offset; @@ -4534,6 +4534,11 @@ static int ext4_alloc_file_blocks(struct file *file, ext4_lblk_t offset, if (ext4_update_inode_size(inode, epos) & 0x1) inode_set_mtime_to_ts(inode, inode_get_ctime(inode)); + if (epos > old_size) { + pagecache_isize_extended(inode, old_size, epos); + ext4_zero_partial_blocks(handle, inode, + old_size, epos - old_size); + } } ret2 = ext4_mark_inode_dirty(handle, inode); ext4_update_inode_fsync_trans(handle, inode, 1); diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c index 04f25a04f35a..19d7bcf16ebb 100644 --- a/fs/ext4/inode.c +++ b/fs/ext4/inode.c @@ -1328,8 +1328,10 @@ static int ext4_write_end(struct file *file, folio_unlock(folio); folio_put(folio);
- if (old_size < pos && !verity) + if (old_size < pos && !verity) { pagecache_isize_extended(inode, old_size, pos); + ext4_zero_partial_blocks(handle, inode, old_size, pos - old_size); + } /* * Don't mark the inode dirty under folio lock. First, it unnecessarily * makes the holding time of folio lock longer. Second, it forces lock @@ -1445,8 +1447,10 @@ static int ext4_journalled_write_end(struct file *file, folio_unlock(folio); folio_put(folio);
- if (old_size < pos && !verity) + if (old_size < pos && !verity) { pagecache_isize_extended(inode, old_size, pos); + ext4_zero_partial_blocks(handle, inode, old_size, pos - old_size); + }
if (size_changed) { ret2 = ext4_mark_inode_dirty(handle, inode); @@ -2971,7 +2975,8 @@ static int ext4_da_do_write_end(struct address_space *mapping, struct inode *inode = mapping->host; loff_t old_size = inode->i_size; bool disksize_changed = false; - loff_t new_i_size; + loff_t new_i_size, zero_len = 0; + handle_t *handle;
if (unlikely(!folio_buffers(folio))) { folio_unlock(folio); @@ -3015,18 +3020,21 @@ static int ext4_da_do_write_end(struct address_space *mapping, folio_unlock(folio); folio_put(folio);
- if (old_size < pos) + if (pos > old_size) { pagecache_isize_extended(inode, old_size, pos); + zero_len = pos - old_size; + }
- if (disksize_changed) { - handle_t *handle; + if (!disksize_changed && !zero_len) + return copied;
- handle = ext4_journal_start(inode, EXT4_HT_INODE, 2); - if (IS_ERR(handle)) - return PTR_ERR(handle); - ext4_mark_inode_dirty(handle, inode); - ext4_journal_stop(handle); - } + handle = ext4_journal_start(inode, EXT4_HT_INODE, 2); + if (IS_ERR(handle)) + return PTR_ERR(handle); + if (zero_len) + ext4_zero_partial_blocks(handle, inode, old_size, zero_len); + ext4_mark_inode_dirty(handle, inode); + ext4_journal_stop(handle);
return copied; } @@ -5437,6 +5445,14 @@ int ext4_setattr(struct mnt_idmap *idmap, struct dentry *dentry, }
if (attr->ia_size != inode->i_size) { + /* attach jbd2 jinode for EOF folio tail zeroing */ + if (attr->ia_size & (inode->i_sb->s_blocksize - 1) || + oldsize & (inode->i_sb->s_blocksize - 1)) { + error = ext4_inode_attach_jinode(inode); + if (error) + goto err_out; + } + handle = ext4_journal_start(inode, EXT4_HT_INODE, 3); if (IS_ERR(handle)) { error = PTR_ERR(handle); @@ -5447,12 +5463,17 @@ int ext4_setattr(struct mnt_idmap *idmap, struct dentry *dentry, orphan = 1; } /* - * Update c/mtime on truncate up, ext4_truncate() will - * update c/mtime in shrink case below + * Update c/mtime and tail zero the EOF folio on + * truncate up. ext4_truncate() handles the shrink case + * below. */ - if (!shrink) + if (!shrink) { inode_set_mtime_to_ts(inode, inode_set_ctime_current(inode)); + if (oldsize & (inode->i_sb->s_blocksize - 1)) + ext4_block_truncate_page(handle, + inode->i_mapping, oldsize); + }
if (shrink) ext4_fc_track_range(handle, inode,
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Stefan Berger stefanb@linux.ibm.com
[ Upstream commit d67c96fb97b5811e15c881d5cb72e293faa5f8e1 ]
For NIST P192/256/384 the public key's x and y parameters could be copied directly from a given array since both parameters filled 'ndigits' of digits (a 'digit' is a u64). For support of NIST P521 the key parameters need to have leading zeros prepended to the most significant digit since only 2 bytes of the most significant digit are provided.
Therefore, implement ecc_digits_from_bytes to convert a byte array into an array of digits and use this function in ecdsa_set_pub_key where an input byte array needs to be converted into digits.
Suggested-by: Lukas Wunner lukas@wunner.de Tested-by: Lukas Wunner lukas@wunner.de Reviewed-by: Jarkko Sakkinen jarkko@kernel.org Signed-off-by: Stefan Berger stefanb@linux.ibm.com Signed-off-by: Herbert Xu herbert@gondor.apana.org.au Stable-dep-of: 3b0565c70350 ("crypto: ecdsa - Avoid signed integer overflow on signature decoding") Signed-off-by: Sasha Levin sashal@kernel.org --- crypto/ecdsa.c | 14 +++++++++----- include/crypto/internal/ecc.h | 21 +++++++++++++++++++++ 2 files changed, 30 insertions(+), 5 deletions(-)
diff --git a/crypto/ecdsa.c b/crypto/ecdsa.c index 3f9ec273a121..e819d0983bd3 100644 --- a/crypto/ecdsa.c +++ b/crypto/ecdsa.c @@ -222,9 +222,8 @@ static int ecdsa_ecc_ctx_reset(struct ecc_ctx *ctx) static int ecdsa_set_pub_key(struct crypto_akcipher *tfm, const void *key, unsigned int keylen) { struct ecc_ctx *ctx = akcipher_tfm_ctx(tfm); + unsigned int digitlen, ndigits; const unsigned char *d = key; - const u64 *digits = (const u64 *)&d[1]; - unsigned int ndigits; int ret;
ret = ecdsa_ecc_ctx_reset(ctx); @@ -238,12 +237,17 @@ static int ecdsa_set_pub_key(struct crypto_akcipher *tfm, const void *key, unsig return -EINVAL;
keylen--; - ndigits = (keylen >> 1) / sizeof(u64); + digitlen = keylen >> 1; + + ndigits = DIV_ROUND_UP(digitlen, sizeof(u64)); if (ndigits != ctx->curve->g.ndigits) return -EINVAL;
- ecc_swap_digits(digits, ctx->pub_key.x, ndigits); - ecc_swap_digits(&digits[ndigits], ctx->pub_key.y, ndigits); + d++; + + ecc_digits_from_bytes(d, digitlen, ctx->pub_key.x, ndigits); + ecc_digits_from_bytes(&d[digitlen], digitlen, ctx->pub_key.y, ndigits); + ret = ecc_is_pubkey_valid_full(ctx->curve, &ctx->pub_key);
ctx->pub_key_set = ret == 0; diff --git a/include/crypto/internal/ecc.h b/include/crypto/internal/ecc.h index 4f6c1a68882f..ab722a8986b7 100644 --- a/include/crypto/internal/ecc.h +++ b/include/crypto/internal/ecc.h @@ -56,6 +56,27 @@ static inline void ecc_swap_digits(const void *in, u64 *out, unsigned int ndigit out[i] = get_unaligned_be64(&src[ndigits - 1 - i]); }
+/** + * ecc_digits_from_bytes() - Create ndigits-sized digits array from byte array + * @in: Input byte array + * @nbytes Size of input byte array + * @out Output digits array + * @ndigits: Number of digits to create from byte array + */ +static inline void ecc_digits_from_bytes(const u8 *in, unsigned int nbytes, + u64 *out, unsigned int ndigits) +{ + unsigned int o = nbytes & 7; + __be64 msd = 0; + + if (o) { + memcpy((u8 *)&msd + sizeof(msd) - o, in, o); + out[--ndigits] = be64_to_cpu(msd); + in += o; + } + ecc_swap_digits(in, out, ndigits); +} + /** * ecc_is_key_valid() - Validate a given ECDH private key *
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Stefan Berger stefanb@linux.ibm.com
[ Upstream commit 703ca5cda1ea04735e48882a7cccff97d57656c3 ]
In cases where 'keylen' was referring to the size of the buffer used by a curve's digits, it does not reflect the purpose of the variable anymore once NIST P521 is used. What it refers to then is the size of the buffer, which may be a few bytes larger than the size a coordinate of a key. Therefore, rename keylen to bufsize where appropriate.
Tested-by: Lukas Wunner lukas@wunner.de Reviewed-by: Jarkko Sakkinen jarkko@kernel.org Signed-off-by: Stefan Berger stefanb@linux.ibm.com Signed-off-by: Herbert Xu herbert@gondor.apana.org.au Stable-dep-of: 3b0565c70350 ("crypto: ecdsa - Avoid signed integer overflow on signature decoding") Signed-off-by: Sasha Levin sashal@kernel.org --- crypto/ecdsa.c | 12 ++++++------ 1 file changed, 6 insertions(+), 6 deletions(-)
diff --git a/crypto/ecdsa.c b/crypto/ecdsa.c index e819d0983bd3..142bed98fa97 100644 --- a/crypto/ecdsa.c +++ b/crypto/ecdsa.c @@ -35,8 +35,8 @@ struct ecdsa_signature_ctx { static int ecdsa_get_signature_rs(u64 *dest, size_t hdrlen, unsigned char tag, const void *value, size_t vlen, unsigned int ndigits) { - size_t keylen = ndigits * sizeof(u64); - ssize_t diff = vlen - keylen; + size_t bufsize = ndigits * sizeof(u64); + ssize_t diff = vlen - bufsize; const char *d = value; u8 rs[ECC_MAX_BYTES];
@@ -58,7 +58,7 @@ static int ecdsa_get_signature_rs(u64 *dest, size_t hdrlen, unsigned char tag, if (diff) return -EINVAL; } - if (-diff >= keylen) + if (-diff >= bufsize) return -EINVAL;
if (diff) { @@ -138,7 +138,7 @@ static int ecdsa_verify(struct akcipher_request *req) { struct crypto_akcipher *tfm = crypto_akcipher_reqtfm(req); struct ecc_ctx *ctx = akcipher_tfm_ctx(tfm); - size_t keylen = ctx->curve->g.ndigits * sizeof(u64); + size_t bufsize = ctx->curve->g.ndigits * sizeof(u64); struct ecdsa_signature_ctx sig_ctx = { .curve = ctx->curve, }; @@ -165,14 +165,14 @@ static int ecdsa_verify(struct akcipher_request *req) goto error;
/* if the hash is shorter then we will add leading zeros to fit to ndigits */ - diff = keylen - req->dst_len; + diff = bufsize - req->dst_len; if (diff >= 0) { if (diff) memset(rawhash, 0, diff); memcpy(&rawhash[diff], buffer + req->src_len, req->dst_len); } else if (diff < 0) { /* given hash is longer, we take the left-most bytes */ - memcpy(&rawhash, buffer + req->src_len, keylen); + memcpy(&rawhash, buffer + req->src_len, bufsize); }
ecc_swap_digits((u64 *)rawhash, hash, ctx->curve->g.ndigits);
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Stefan Berger stefanb@linux.ibm.com
[ Upstream commit 546ce0bdc91afd9f5c4c67d9fc4733e0fc7086d1 ]
Since ecc_digits_from_bytes will provide zeros when an insufficient number of bytes are passed in the input byte array, use it to convert the r and s components of the signature to digits directly from the input byte array. This avoids going through an intermediate byte array that has the first few bytes filled with zeros.
Signed-off-by: Stefan Berger stefanb@linux.ibm.com Reviewed-by: Jarkko Sakkinen jarkko@kernel.org Signed-off-by: Herbert Xu herbert@gondor.apana.org.au Stable-dep-of: 3b0565c70350 ("crypto: ecdsa - Avoid signed integer overflow on signature decoding") Signed-off-by: Sasha Levin sashal@kernel.org --- crypto/ecdsa.c | 12 ++---------- 1 file changed, 2 insertions(+), 10 deletions(-)
diff --git a/crypto/ecdsa.c b/crypto/ecdsa.c index 142bed98fa97..28441311af36 100644 --- a/crypto/ecdsa.c +++ b/crypto/ecdsa.c @@ -38,7 +38,6 @@ static int ecdsa_get_signature_rs(u64 *dest, size_t hdrlen, unsigned char tag, size_t bufsize = ndigits * sizeof(u64); ssize_t diff = vlen - bufsize; const char *d = value; - u8 rs[ECC_MAX_BYTES];
if (!value || !vlen) return -EINVAL; @@ -46,7 +45,7 @@ static int ecdsa_get_signature_rs(u64 *dest, size_t hdrlen, unsigned char tag, /* diff = 0: 'value' has exacly the right size * diff > 0: 'value' has too many bytes; one leading zero is allowed that * makes the value a positive integer; error on more - * diff < 0: 'value' is missing leading zeros, which we add + * diff < 0: 'value' is missing leading zeros */ if (diff > 0) { /* skip over leading zeros that make 'value' a positive int */ @@ -61,14 +60,7 @@ static int ecdsa_get_signature_rs(u64 *dest, size_t hdrlen, unsigned char tag, if (-diff >= bufsize) return -EINVAL;
- if (diff) { - /* leading zeros not given in 'value' */ - memset(rs, 0, -diff); - } - - memcpy(&rs[-diff], d, vlen); - - ecc_swap_digits((u64 *)rs, dest, ndigits); + ecc_digits_from_bytes(d, vlen, dest, ndigits);
return 0; }
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Lukas Wunner lukas@wunner.de
[ Upstream commit 3b0565c703503f832d6cd7ba805aafa3b330cb9d ]
When extracting a signature component r or s from an ASN.1-encoded integer, ecdsa_get_signature_rs() subtracts the expected length "bufsize" from the ASN.1 length "vlen" (both of unsigned type size_t) and stores the result in "diff" (of signed type ssize_t).
This results in a signed integer overflow if vlen > SSIZE_MAX + bufsize.
The kernel is compiled with -fno-strict-overflow, which implies -fwrapv, meaning signed integer overflow is not undefined behavior. And the function does check for overflow:
if (-diff >= bufsize) return -EINVAL;
So the code is fine in principle but not very obvious. In the future it might trigger a false-positive with CONFIG_UBSAN_SIGNED_WRAP=y.
Avoid by comparing the two unsigned variables directly and erroring out if "vlen" is too large.
Signed-off-by: Lukas Wunner lukas@wunner.de Reviewed-by: Stefan Berger stefanb@linux.ibm.com Reviewed-by: Jonathan Cameron Jonathan.Cameron@huawei.com Signed-off-by: Herbert Xu herbert@gondor.apana.org.au Signed-off-by: Sasha Levin sashal@kernel.org --- crypto/ecdsa.c | 19 +++++++------------ 1 file changed, 7 insertions(+), 12 deletions(-)
diff --git a/crypto/ecdsa.c b/crypto/ecdsa.c index 28441311af36..da04df3c8ecf 100644 --- a/crypto/ecdsa.c +++ b/crypto/ecdsa.c @@ -36,29 +36,24 @@ static int ecdsa_get_signature_rs(u64 *dest, size_t hdrlen, unsigned char tag, const void *value, size_t vlen, unsigned int ndigits) { size_t bufsize = ndigits * sizeof(u64); - ssize_t diff = vlen - bufsize; const char *d = value;
- if (!value || !vlen) + if (!value || !vlen || vlen > bufsize + 1) return -EINVAL;
- /* diff = 0: 'value' has exacly the right size - * diff > 0: 'value' has too many bytes; one leading zero is allowed that - * makes the value a positive integer; error on more - * diff < 0: 'value' is missing leading zeros + /* + * vlen may be 1 byte larger than bufsize due to a leading zero byte + * (necessary if the most significant bit of the integer is set). */ - if (diff > 0) { + if (vlen > bufsize) { /* skip over leading zeros that make 'value' a positive int */ if (*d == 0) { vlen -= 1; - diff--; d++; - } - if (diff) + } else { return -EINVAL; + } } - if (-diff >= bufsize) - return -EINVAL;
ecc_digits_from_bytes(d, vlen, dest, ndigits);
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Peter Zijlstra peterz@infradead.org
[ Upstream commit e4ab322fbaaaf84b23d6cb0e3317a7f68baf36dc ]
Adds:
- DEFINE_GUARD_COND() / DEFINE_LOCK_GUARD_1_COND() to extend existing guards with conditional lock primitives, eg. mutex_trylock(), mutex_lock_interruptible().
nb. both primitives allow NULL 'locks', which cause the lock to fail (obviously).
- extends scoped_guard() to not take the body when the the conditional guard 'fails'. eg.
scoped_guard (mutex_intr, &task->signal_cred_guard_mutex) { ... }
will only execute the body when the mutex is held.
- provides scoped_cond_guard(name, fail, args...); which extends scoped_guard() to do fail when the lock-acquire fails.
Signed-off-by: Peter Zijlstra (Intel) peterz@infradead.org Link: https://lkml.kernel.org/r/20231102110706.460851167%40infradead.org Stable-dep-of: fcc22ac5baf0 ("cleanup: Adjust scoped_guard() macros to avoid potential warning") Signed-off-by: Sasha Levin sashal@kernel.org --- include/linux/cleanup.h | 52 +++++++++++++++++++++++++++++++++++++--- include/linux/mutex.h | 3 ++- include/linux/rwsem.h | 8 +++---- include/linux/spinlock.h | 15 ++++++++++++ 4 files changed, 70 insertions(+), 8 deletions(-)
diff --git a/include/linux/cleanup.h b/include/linux/cleanup.h index 53f1a7a932b0..6d7bfa899df0 100644 --- a/include/linux/cleanup.h +++ b/include/linux/cleanup.h @@ -92,25 +92,55 @@ static inline class_##_name##_t class_##_name##ext##_constructor(_init_args) \ * trivial wrapper around DEFINE_CLASS() above specifically * for locks. * + * DEFINE_GUARD_COND(name, ext, condlock) + * wrapper around EXTEND_CLASS above to add conditional lock + * variants to a base class, eg. mutex_trylock() or + * mutex_lock_interruptible(). + * * guard(name): - * an anonymous instance of the (guard) class + * an anonymous instance of the (guard) class, not recommended for + * conditional locks. * * scoped_guard (name, args...) { }: * similar to CLASS(name, scope)(args), except the variable (with the * explicit name 'scope') is declard in a for-loop such that its scope is * bound to the next (compound) statement. * + * for conditional locks the loop body is skipped when the lock is not + * acquired. + * + * scoped_cond_guard (name, fail, args...) { }: + * similar to scoped_guard(), except it does fail when the lock + * acquire fails. + * */
#define DEFINE_GUARD(_name, _type, _lock, _unlock) \ - DEFINE_CLASS(_name, _type, _unlock, ({ _lock; _T; }), _type _T) + DEFINE_CLASS(_name, _type, if (_T) { _unlock; }, ({ _lock; _T; }), _type _T); \ + static inline void * class_##_name##_lock_ptr(class_##_name##_t *_T) \ + { return *_T; } + +#define DEFINE_GUARD_COND(_name, _ext, _condlock) \ + EXTEND_CLASS(_name, _ext, \ + ({ void *_t = _T; if (_T && !(_condlock)) _t = NULL; _t; }), \ + class_##_name##_t _T) \ + static inline void * class_##_name##_ext##_lock_ptr(class_##_name##_t *_T) \ + { return class_##_name##_lock_ptr(_T); }
#define guard(_name) \ CLASS(_name, __UNIQUE_ID(guard))
+#define __guard_ptr(_name) class_##_name##_lock_ptr + #define scoped_guard(_name, args...) \ for (CLASS(_name, scope)(args), \ - *done = NULL; !done; done = (void *)1) + *done = NULL; __guard_ptr(_name)(&scope) && !done; done = (void *)1) + +#define scoped_cond_guard(_name, _fail, args...) \ + for (CLASS(_name, scope)(args), \ + *done = NULL; !done; done = (void *)1) \ + if (!__guard_ptr(_name)(&scope)) _fail; \ + else
/* * Additional helper macros for generating lock guards with types, either for @@ -119,6 +149,7 @@ static inline class_##_name##_t class_##_name##ext##_constructor(_init_args) \ * * DEFINE_LOCK_GUARD_0(name, lock, unlock, ...) * DEFINE_LOCK_GUARD_1(name, type, lock, unlock, ...) + * DEFINE_LOCK_GUARD_1_COND(name, ext, condlock) * * will result in the following type: * @@ -140,6 +171,11 @@ typedef struct { \ static inline void class_##_name##_destructor(class_##_name##_t *_T) \ { \ if (_T->lock) { _unlock; } \ +} \ + \ +static inline void *class_##_name##_lock_ptr(class_##_name##_t *_T) \ +{ \ + return _T->lock; \ }
@@ -168,4 +204,14 @@ __DEFINE_LOCK_GUARD_1(_name, _type, _lock) __DEFINE_UNLOCK_GUARD(_name, void, _unlock, __VA_ARGS__) \ __DEFINE_LOCK_GUARD_0(_name, _lock)
+#define DEFINE_LOCK_GUARD_1_COND(_name, _ext, _condlock) \ + EXTEND_CLASS(_name, _ext, \ + ({ class_##_name##_t _t = { .lock = l }, *_T = &_t;\ + if (_T->lock && !(_condlock)) _T->lock = NULL; \ + _t; }), \ + typeof_member(class_##_name##_t, lock) l) \ + static inline void * class_##_name##_ext##_lock_ptr(class_##_name##_t *_T) \ + { return class_##_name##_lock_ptr(_T); } + + #endif /* __LINUX_GUARDS_H */ diff --git a/include/linux/mutex.h b/include/linux/mutex.h index 5b5630e58407..e1c323c7d75b 100644 --- a/include/linux/mutex.h +++ b/include/linux/mutex.h @@ -248,6 +248,7 @@ extern void mutex_unlock(struct mutex *lock); extern int atomic_dec_and_mutex_lock(atomic_t *cnt, struct mutex *lock);
DEFINE_GUARD(mutex, struct mutex *, mutex_lock(_T), mutex_unlock(_T)) -DEFINE_FREE(mutex, struct mutex *, if (_T) mutex_unlock(_T)) +DEFINE_GUARD_COND(mutex, _try, mutex_trylock(_T)) +DEFINE_GUARD_COND(mutex, _intr, mutex_lock_interruptible(_T) == 0)
#endif /* __LINUX_MUTEX_H */ diff --git a/include/linux/rwsem.h b/include/linux/rwsem.h index 1dd530ce8b45..9c29689ff505 100644 --- a/include/linux/rwsem.h +++ b/include/linux/rwsem.h @@ -203,11 +203,11 @@ extern void up_read(struct rw_semaphore *sem); extern void up_write(struct rw_semaphore *sem);
DEFINE_GUARD(rwsem_read, struct rw_semaphore *, down_read(_T), up_read(_T)) -DEFINE_GUARD(rwsem_write, struct rw_semaphore *, down_write(_T), up_write(_T)) - -DEFINE_FREE(up_read, struct rw_semaphore *, if (_T) up_read(_T)) -DEFINE_FREE(up_write, struct rw_semaphore *, if (_T) up_write(_T)) +DEFINE_GUARD_COND(rwsem_read, _try, down_read_trylock(_T)) +DEFINE_GUARD_COND(rwsem_read, _intr, down_read_interruptible(_T) == 0)
+DEFINE_GUARD(rwsem_write, struct rw_semaphore *, down_write(_T), up_write(_T)) +DEFINE_GUARD_COND(rwsem_write, _try, down_write_trylock(_T))
/* * downgrade write lock to read lock diff --git a/include/linux/spinlock.h b/include/linux/spinlock.h index 31d3d747a9db..ceb56b39c70f 100644 --- a/include/linux/spinlock.h +++ b/include/linux/spinlock.h @@ -507,6 +507,8 @@ DEFINE_LOCK_GUARD_1(raw_spinlock, raw_spinlock_t, raw_spin_lock(_T->lock), raw_spin_unlock(_T->lock))
+DEFINE_LOCK_GUARD_1_COND(raw_spinlock, _try, raw_spin_trylock(_T->lock)) + DEFINE_LOCK_GUARD_1(raw_spinlock_nested, raw_spinlock_t, raw_spin_lock_nested(_T->lock, SINGLE_DEPTH_NESTING), raw_spin_unlock(_T->lock)) @@ -515,23 +517,36 @@ DEFINE_LOCK_GUARD_1(raw_spinlock_irq, raw_spinlock_t, raw_spin_lock_irq(_T->lock), raw_spin_unlock_irq(_T->lock))
+DEFINE_LOCK_GUARD_1_COND(raw_spinlock_irq, _try, raw_spin_trylock_irq(_T->lock)) + DEFINE_LOCK_GUARD_1(raw_spinlock_irqsave, raw_spinlock_t, raw_spin_lock_irqsave(_T->lock, _T->flags), raw_spin_unlock_irqrestore(_T->lock, _T->flags), unsigned long flags)
+DEFINE_LOCK_GUARD_1_COND(raw_spinlock_irqsave, _try, + raw_spin_trylock_irqsave(_T->lock, _T->flags)) + DEFINE_LOCK_GUARD_1(spinlock, spinlock_t, spin_lock(_T->lock), spin_unlock(_T->lock))
+DEFINE_LOCK_GUARD_1_COND(spinlock, _try, spin_trylock(_T->lock)) + DEFINE_LOCK_GUARD_1(spinlock_irq, spinlock_t, spin_lock_irq(_T->lock), spin_unlock_irq(_T->lock))
+DEFINE_LOCK_GUARD_1_COND(spinlock_irq, _try, + spin_trylock_irq(_T->lock)) + DEFINE_LOCK_GUARD_1(spinlock_irqsave, spinlock_t, spin_lock_irqsave(_T->lock, _T->flags), spin_unlock_irqrestore(_T->lock, _T->flags), unsigned long flags)
+DEFINE_LOCK_GUARD_1_COND(spinlock_irqsave, _try, + spin_trylock_irqsave(_T->lock, _T->flags)) + #undef __LINUX_INSIDE_SPINLOCK_H #endif /* __LINUX_SPINLOCK_H */
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Przemek Kitszel przemyslaw.kitszel@intel.com
[ Upstream commit fcc22ac5baf06dd17193de44b60dbceea6461983 ]
Change scoped_guard() and scoped_cond_guard() macros to make reasoning about them easier for static analysis tools (smatch, compiler diagnostics), especially to enable them to tell if the given usage of scoped_guard() is with a conditional lock class (interruptible-locks, try-locks) or not (like simple mutex_lock()).
Add compile-time error if scoped_cond_guard() is used for non-conditional lock class.
Beyond easier tooling and a little shrink reported by bloat-o-meter this patch enables developer to write code like:
int foo(struct my_drv *adapter) { scoped_guard(spinlock, &adapter->some_spinlock) return adapter->spinlock_protected_var; }
Current scoped_guard() implementation does not support that, due to compiler complaining: error: control reaches end of non-void function [-Werror=return-type]
Technical stuff about the change: scoped_guard() macro uses common idiom of using "for" statement to declare a scoped variable. Unfortunately, current logic is too hard for compiler diagnostics to be sure that there is exactly one loop step; fix that.
To make any loop so trivial that there is no above warning, it must not depend on any non-const variable to tell if there are more steps. There is no obvious solution for that in C, but one could use the compound statement expression with "goto" jumping past the "loop", effectively leaving only the subscope part of the loop semantics.
More impl details: one more level of macro indirection is now needed to avoid duplicating label names; I didn't spot any other place that is using the "for (...; goto label) if (0) label: break;" idiom, so it's not packed for reuse beyond scoped_guard() family, what makes actual macros code cleaner.
There was also a need to introduce const true/false variable per lock class, it is used to aid compiler diagnostics reasoning about "exactly 1 step" loops (note that converting that to function would undo the whole benefit).
Big thanks to Andy Shevchenko for help on this patch, both internal and public, ranging from whitespace/formatting, through commit message clarifications, general improvements, ending with presenting alternative approaches - all despite not even liking the idea.
Big thanks to Dmitry Torokhov for the idea of compile-time check for scoped_cond_guard() (to use it only with conditional locsk), and general improvements for the patch.
Big thanks to David Lechner for idea to cover also scoped_cond_guard().
Signed-off-by: Przemek Kitszel przemyslaw.kitszel@intel.com Signed-off-by: Peter Zijlstra (Intel) peterz@infradead.org Reviewed-by: Dmitry Torokhov dmitry.torokhov@gmail.com Link: https://lkml.kernel.org/r/20241018113823.171256-1-przemyslaw.kitszel@intel.c... Signed-off-by: Sasha Levin sashal@kernel.org --- include/linux/cleanup.h | 52 +++++++++++++++++++++++++++++++++-------- 1 file changed, 42 insertions(+), 10 deletions(-)
diff --git a/include/linux/cleanup.h b/include/linux/cleanup.h index 6d7bfa899df0..f0c6d1d45e67 100644 --- a/include/linux/cleanup.h +++ b/include/linux/cleanup.h @@ -113,14 +113,20 @@ static inline class_##_name##_t class_##_name##ext##_constructor(_init_args) \ * similar to scoped_guard(), except it does fail when the lock * acquire fails. * + * Only for conditional locks. */
+#define __DEFINE_CLASS_IS_CONDITIONAL(_name, _is_cond) \ +static __maybe_unused const bool class_##_name##_is_conditional = _is_cond + #define DEFINE_GUARD(_name, _type, _lock, _unlock) \ + __DEFINE_CLASS_IS_CONDITIONAL(_name, false); \ DEFINE_CLASS(_name, _type, if (_T) { _unlock; }, ({ _lock; _T; }), _type _T); \ static inline void * class_##_name##_lock_ptr(class_##_name##_t *_T) \ { return *_T; }
#define DEFINE_GUARD_COND(_name, _ext, _condlock) \ + __DEFINE_CLASS_IS_CONDITIONAL(_name##_ext, true); \ EXTEND_CLASS(_name, _ext, \ ({ void *_t = _T; if (_T && !(_condlock)) _t = NULL; _t; }), \ class_##_name##_t _T) \ @@ -131,17 +137,40 @@ static inline class_##_name##_t class_##_name##ext##_constructor(_init_args) \ CLASS(_name, __UNIQUE_ID(guard))
#define __guard_ptr(_name) class_##_name##_lock_ptr +#define __is_cond_ptr(_name) class_##_name##_is_conditional
-#define scoped_guard(_name, args...) \ - for (CLASS(_name, scope)(args), \ - *done = NULL; __guard_ptr(_name)(&scope) && !done; done = (void *)1) - -#define scoped_cond_guard(_name, _fail, args...) \ - for (CLASS(_name, scope)(args), \ - *done = NULL; !done; done = (void *)1) \ - if (!__guard_ptr(_name)(&scope)) _fail; \ - else - +/* + * Helper macro for scoped_guard(). + * + * Note that the "!__is_cond_ptr(_name)" part of the condition ensures that + * compiler would be sure that for the unconditional locks the body of the + * loop (caller-provided code glued to the else clause) could not be skipped. + * It is needed because the other part - "__guard_ptr(_name)(&scope)" - is too + * hard to deduce (even if could be proven true for unconditional locks). + */ +#define __scoped_guard(_name, _label, args...) \ + for (CLASS(_name, scope)(args); \ + __guard_ptr(_name)(&scope) || !__is_cond_ptr(_name); \ + ({ goto _label; })) \ + if (0) { \ +_label: \ + break; \ + } else + +#define scoped_guard(_name, args...) \ + __scoped_guard(_name, __UNIQUE_ID(label), args) + +#define __scoped_cond_guard(_name, _fail, _label, args...) \ + for (CLASS(_name, scope)(args); true; ({ goto _label; })) \ + if (!__guard_ptr(_name)(&scope)) { \ + BUILD_BUG_ON(!__is_cond_ptr(_name)); \ + _fail; \ +_label: \ + break; \ + } else + +#define scoped_cond_guard(_name, _fail, args...) \ + __scoped_cond_guard(_name, _fail, __UNIQUE_ID(label), args) /* * Additional helper macros for generating lock guards with types, either for * locks that don't have a native type (eg. RCU, preempt) or those that need a @@ -197,14 +226,17 @@ static inline class_##_name##_t class_##_name##_constructor(void) \ }
#define DEFINE_LOCK_GUARD_1(_name, _type, _lock, _unlock, ...) \ +__DEFINE_CLASS_IS_CONDITIONAL(_name, false); \ __DEFINE_UNLOCK_GUARD(_name, _type, _unlock, __VA_ARGS__) \ __DEFINE_LOCK_GUARD_1(_name, _type, _lock)
#define DEFINE_LOCK_GUARD_0(_name, _lock, _unlock, ...) \ +__DEFINE_CLASS_IS_CONDITIONAL(_name, false); \ __DEFINE_UNLOCK_GUARD(_name, void, _unlock, __VA_ARGS__) \ __DEFINE_LOCK_GUARD_0(_name, _lock)
#define DEFINE_LOCK_GUARD_1_COND(_name, _ext, _condlock) \ + __DEFINE_CLASS_IS_CONDITIONAL(_name##_ext, true); \ EXTEND_CLASS(_name, _ext, \ ({ class_##_name##_t _t = { .lock = l }, *_T = &_t;\ if (_T->lock && !(_condlock)) _T->lock = NULL; \
On 1/6/25 16:13, Greg Kroah-Hartman wrote:
6.6-stable review patch. If anyone has any objections, please let me know.
From: Przemek Kitszel przemyslaw.kitszel@intel.com
[ Upstream commit fcc22ac5baf06dd17193de44b60dbceea6461983 ]
Change scoped_guard() and scoped_cond_guard() macros to make reasoning about them easier for static analysis tools (smatch, compiler diagnostics), especially to enable them to tell if the given usage of scoped_guard() is with a conditional lock class (interruptible-locks, try-locks) or not (like simple mutex_lock()).
please also consider cherry pick of the commit 9a884bdb6e95 ("iio: magnetometer: fix if () scoped_guard() formatting") when picking the one in the subject
On Tue, Jan 07, 2025 at 11:01:23AM +0100, Przemek Kitszel wrote:
On 1/6/25 16:13, Greg Kroah-Hartman wrote:
6.6-stable review patch. If anyone has any objections, please let me know.
From: Przemek Kitszel przemyslaw.kitszel@intel.com
[ Upstream commit fcc22ac5baf06dd17193de44b60dbceea6461983 ]
Change scoped_guard() and scoped_cond_guard() macros to make reasoning about them easier for static analysis tools (smatch, compiler diagnostics), especially to enable them to tell if the given usage of scoped_guard() is with a conditional lock class (interruptible-locks, try-locks) or not (like simple mutex_lock()).
please also consider cherry pick of the commit 9a884bdb6e95 ("iio: magnetometer: fix if () scoped_guard() formatting") when picking the one in the subject
Did you try to apply that commit to the 6.6.y tree? If you have it working, please send us the correct backport because as-is, it does not apply at all.
thanks,
greg k-h
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Laurent Pinchart laurent.pinchart@ideasonboard.com
[ Upstream commit c397e8c45d911443b4ab60084fb723edf2a5b604 ]
The Quanta ACER HD User Facing camera reports a UVC 1.50 version, but implements UVC 1.0a as shown by the UVC probe control being 26 bytes long. Force the UVC version for that device.
Reported-by: Giuliano Lotta giuliano.lotta@gmail.com Closes: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/2000947 Link: https://lore.kernel.org/r/20230115205210.20077-1-laurent.pinchart@ideasonboa... Tested-by: Giuliano Lotta giuliano.lotta@gmail.com Signed-off-by: Laurent Pinchart laurent.pinchart@ideasonboard.com Stable-dep-of: c9df99302fff ("media: uvcvideo: Force UVC version to 1.0a for 0408:4033") Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/media/usb/uvc/uvc_driver.c | 11 +++++++++++ 1 file changed, 11 insertions(+)
diff --git a/drivers/media/usb/uvc/uvc_driver.c b/drivers/media/usb/uvc/uvc_driver.c index 5a3e933df633..adf1abce5333 100644 --- a/drivers/media/usb/uvc/uvc_driver.c +++ b/drivers/media/usb/uvc/uvc_driver.c @@ -2519,6 +2519,17 @@ static const struct usb_device_id uvc_ids[] = { .bInterfaceSubClass = 1, .bInterfaceProtocol = UVC_PC_PROTOCOL_15, .driver_info = (kernel_ulong_t)&uvc_ctrl_power_line_limited }, + /* Quanta ACER HD User Facing */ + { .match_flags = USB_DEVICE_ID_MATCH_DEVICE + | USB_DEVICE_ID_MATCH_INT_INFO, + .idVendor = 0x0408, + .idProduct = 0x4035, + .bInterfaceClass = USB_CLASS_VIDEO, + .bInterfaceSubClass = 1, + .bInterfaceProtocol = UVC_PC_PROTOCOL_15, + .driver_info = (kernel_ulong_t)&(const struct uvc_device_info){ + .uvc_version = 0x010a, + } }, /* LogiLink Wireless Webcam */ { .match_flags = USB_DEVICE_ID_MATCH_DEVICE | USB_DEVICE_ID_MATCH_INT_INFO,
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Ricardo Ribalda ribalda@chromium.org
[ Upstream commit c9df99302fff53b6007666136b9f43fbac7ee3d8 ]
The Quanta ACER HD User Facing camera reports a UVC 1.50 version, but implements UVC 1.0a as shown by the UVC probe control being 26 bytes long. Force the UVC version for that device.
Reported-by: Giuliano Lotta giuliano.lotta@gmail.com Closes: https://lore.kernel.org/linux-media/fce4f906-d69b-417d-9f13-bf69fe5c81e3@koy... Signed-off-by: Ricardo Ribalda ribalda@chromium.org Reviewed-by: Laurent Pinchart laurent.pinchart@ideasonboard.com Link: https://lore.kernel.org/r/20240924-uvc-quanta-v1-1-2de023863767@chromium.org Signed-off-by: Laurent Pinchart laurent.pinchart@ideasonboard.com Signed-off-by: Hans Verkuil hverkuil-cisco@xs4all.nl Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/media/usb/uvc/uvc_driver.c | 11 +++++++++++ 1 file changed, 11 insertions(+)
diff --git a/drivers/media/usb/uvc/uvc_driver.c b/drivers/media/usb/uvc/uvc_driver.c index adf1abce5333..1b05890f99f4 100644 --- a/drivers/media/usb/uvc/uvc_driver.c +++ b/drivers/media/usb/uvc/uvc_driver.c @@ -2520,6 +2520,17 @@ static const struct usb_device_id uvc_ids[] = { .bInterfaceProtocol = UVC_PC_PROTOCOL_15, .driver_info = (kernel_ulong_t)&uvc_ctrl_power_line_limited }, /* Quanta ACER HD User Facing */ + { .match_flags = USB_DEVICE_ID_MATCH_DEVICE + | USB_DEVICE_ID_MATCH_INT_INFO, + .idVendor = 0x0408, + .idProduct = 0x4033, + .bInterfaceClass = USB_CLASS_VIDEO, + .bInterfaceSubClass = 1, + .bInterfaceProtocol = UVC_PC_PROTOCOL_15, + .driver_info = (kernel_ulong_t)&(const struct uvc_device_info){ + .uvc_version = 0x010a, + } }, + /* Quanta ACER HD User Facing */ { .match_flags = USB_DEVICE_ID_MATCH_DEVICE | USB_DEVICE_ID_MATCH_INT_INFO, .idVendor = 0x0408,
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Ping-Ke Shih pkshih@realtek.com
[ Upstream commit 53bc1b73b67836ac9867f93dee7a443986b4a94f ]
Drivers need to purge TX SKB when stopping. Using skb_queue_purge() can't report TX status to mac80211, causing ieee80211_free_ack_frame() warns "Have pending ack frames!". Export ieee80211_purge_tx_queue() for drivers to not have to reimplement it.
Signed-off-by: Ping-Ke Shih pkshih@realtek.com Link: https://patch.msgid.link/20240822014255.10211-1-pkshih@realtek.com Signed-off-by: Johannes Berg johannes.berg@intel.com Stable-dep-of: 3e5e4a801aaf ("wifi: rtw88: use ieee80211_purge_tx_queue() to purge TX skb") Signed-off-by: Sasha Levin sashal@kernel.org --- include/net/mac80211.h | 13 +++++++++++++ net/mac80211/ieee80211_i.h | 2 -- net/mac80211/status.c | 1 + 3 files changed, 14 insertions(+), 2 deletions(-)
diff --git a/include/net/mac80211.h b/include/net/mac80211.h index 47ade676565d..901fe3ac139e 100644 --- a/include/net/mac80211.h +++ b/include/net/mac80211.h @@ -3039,6 +3039,19 @@ ieee80211_get_alt_retry_rate(const struct ieee80211_hw *hw, */ void ieee80211_free_txskb(struct ieee80211_hw *hw, struct sk_buff *skb);
+/** + * ieee80211_purge_tx_queue - purge TX skb queue + * @hw: the hardware + * @skbs: the skbs + * + * Free a set of transmit skbs. Use this function when device is going to stop + * but some transmit skbs without TX status are still queued. + * This function does not take the list lock and the caller must hold the + * relevant locks to use it. + */ +void ieee80211_purge_tx_queue(struct ieee80211_hw *hw, + struct sk_buff_head *skbs); + /** * DOC: Hardware crypto acceleration * diff --git a/net/mac80211/ieee80211_i.h b/net/mac80211/ieee80211_i.h index daea061d0fc1..04c876d78d3b 100644 --- a/net/mac80211/ieee80211_i.h +++ b/net/mac80211/ieee80211_i.h @@ -2057,8 +2057,6 @@ void __ieee80211_subif_start_xmit(struct sk_buff *skb, u32 info_flags, u32 ctrl_flags, u64 *cookie); -void ieee80211_purge_tx_queue(struct ieee80211_hw *hw, - struct sk_buff_head *skbs); struct sk_buff * ieee80211_build_data_template(struct ieee80211_sub_if_data *sdata, struct sk_buff *skb, u32 info_flags); diff --git a/net/mac80211/status.c b/net/mac80211/status.c index 44d83da60aee..9676ed15efec 100644 --- a/net/mac80211/status.c +++ b/net/mac80211/status.c @@ -1270,3 +1270,4 @@ void ieee80211_purge_tx_queue(struct ieee80211_hw *hw, while ((skb = __skb_dequeue(skbs))) ieee80211_free_txskb(hw, skb); } +EXPORT_SYMBOL(ieee80211_purge_tx_queue);
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Ping-Ke Shih pkshih@realtek.com
[ Upstream commit 3e5e4a801aaf4283390cc34959c6c48f910ca5ea ]
When removing kernel modules by: rmmod rtw88_8723cs rtw88_8703b rtw88_8723x rtw88_sdio rtw88_core
Driver uses skb_queue_purge() to purge TX skb, but not report tx status causing "Have pending ack frames!" warning. Use ieee80211_purge_tx_queue() to correct this.
Since ieee80211_purge_tx_queue() doesn't take locks, to prevent racing between TX work and purge TX queue, flush and destroy TX work in advance.
wlan0: deauthenticating from aa:f5:fd:60:4c:a8 by local choice (Reason: 3=DEAUTH_LEAVING) ------------[ cut here ]------------ Have pending ack frames! WARNING: CPU: 3 PID: 9232 at net/mac80211/main.c:1691 ieee80211_free_ack_frame+0x5c/0x90 [mac80211] CPU: 3 PID: 9232 Comm: rmmod Tainted: G C 6.10.1-200.fc40.aarch64 #1 Hardware name: pine64 Pine64 PinePhone Braveheart (1.1)/Pine64 PinePhone Braveheart (1.1), BIOS 2024.01 01/01/2024 pstate: 60400005 (nZCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--) pc : ieee80211_free_ack_frame+0x5c/0x90 [mac80211] lr : ieee80211_free_ack_frame+0x5c/0x90 [mac80211] sp : ffff80008c1b37b0 x29: ffff80008c1b37b0 x28: ffff000003be8000 x27: 0000000000000000 x26: 0000000000000000 x25: ffff000003dc14b8 x24: ffff80008c1b37d0 x23: ffff000000ff9f80 x22: 0000000000000000 x21: 000000007fffffff x20: ffff80007c7e93d8 x19: ffff00006e66f400 x18: 0000000000000000 x17: ffff7ffffd2b3000 x16: ffff800083fc0000 x15: 0000000000000000 x14: 0000000000000000 x13: 2173656d61726620 x12: 6b636120676e6964 x11: 0000000000000000 x10: 000000000000005d x9 : ffff8000802af2b0 x8 : ffff80008c1b3430 x7 : 0000000000000001 x6 : 0000000000000001 x5 : 0000000000000000 x4 : 0000000000000000 x3 : 0000000000000000 x2 : 0000000000000000 x1 : 0000000000000000 x0 : ffff000003be8000 Call trace: ieee80211_free_ack_frame+0x5c/0x90 [mac80211] idr_for_each+0x74/0x110 ieee80211_free_hw+0x44/0xe8 [mac80211] rtw_sdio_remove+0x9c/0xc0 [rtw88_sdio] sdio_bus_remove+0x44/0x180 device_remove+0x54/0x90 device_release_driver_internal+0x1d4/0x238 driver_detach+0x54/0xc0 bus_remove_driver+0x78/0x108 driver_unregister+0x38/0x78 sdio_unregister_driver+0x2c/0x40 rtw_8723cs_driver_exit+0x18/0x1000 [rtw88_8723cs] __do_sys_delete_module.isra.0+0x190/0x338 __arm64_sys_delete_module+0x1c/0x30 invoke_syscall+0x74/0x100 el0_svc_common.constprop.0+0x48/0xf0 do_el0_svc+0x24/0x38 el0_svc+0x3c/0x158 el0t_64_sync_handler+0x120/0x138 el0t_64_sync+0x194/0x198 ---[ end trace 0000000000000000 ]---
Reported-by: Peter Robinson pbrobinson@gmail.com Closes: https://lore.kernel.org/linux-wireless/CALeDE9OAa56KMzgknaCD3quOgYuEHFx9_hcT... Tested-by: Ping-Ke Shih pkshih@realtek.com # 8723DU Signed-off-by: Ping-Ke Shih pkshih@realtek.com Link: https://patch.msgid.link/20240822014255.10211-2-pkshih@realtek.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/wireless/realtek/rtw88/sdio.c | 6 +++--- drivers/net/wireless/realtek/rtw88/usb.c | 5 +++-- 2 files changed, 6 insertions(+), 5 deletions(-)
diff --git a/drivers/net/wireless/realtek/rtw88/sdio.c b/drivers/net/wireless/realtek/rtw88/sdio.c index 0cae5746f540..5bd1ee81210d 100644 --- a/drivers/net/wireless/realtek/rtw88/sdio.c +++ b/drivers/net/wireless/realtek/rtw88/sdio.c @@ -1295,12 +1295,12 @@ static void rtw_sdio_deinit_tx(struct rtw_dev *rtwdev) struct rtw_sdio *rtwsdio = (struct rtw_sdio *)rtwdev->priv; int i;
- for (i = 0; i < RTK_MAX_TX_QUEUE_NUM; i++) - skb_queue_purge(&rtwsdio->tx_queue[i]); - flush_workqueue(rtwsdio->txwq); destroy_workqueue(rtwsdio->txwq); kfree(rtwsdio->tx_handler_data); + + for (i = 0; i < RTK_MAX_TX_QUEUE_NUM; i++) + ieee80211_purge_tx_queue(rtwdev->hw, &rtwsdio->tx_queue[i]); }
int rtw_sdio_probe(struct sdio_func *sdio_func, diff --git a/drivers/net/wireless/realtek/rtw88/usb.c b/drivers/net/wireless/realtek/rtw88/usb.c index 04a64afcbf8a..8f1d653282b7 100644 --- a/drivers/net/wireless/realtek/rtw88/usb.c +++ b/drivers/net/wireless/realtek/rtw88/usb.c @@ -416,10 +416,11 @@ static void rtw_usb_tx_handler(struct work_struct *work)
static void rtw_usb_tx_queue_purge(struct rtw_usb *rtwusb) { + struct rtw_dev *rtwdev = rtwusb->rtwdev; int i;
for (i = 0; i < ARRAY_SIZE(rtwusb->tx_queue); i++) - skb_queue_purge(&rtwusb->tx_queue[i]); + ieee80211_purge_tx_queue(rtwdev->hw, &rtwusb->tx_queue[i]); }
static void rtw_usb_write_port_complete(struct urb *urb) @@ -801,9 +802,9 @@ static void rtw_usb_deinit_tx(struct rtw_dev *rtwdev) { struct rtw_usb *rtwusb = rtw_get_usb_priv(rtwdev);
- rtw_usb_tx_queue_purge(rtwusb); flush_workqueue(rtwusb->txwq); destroy_workqueue(rtwusb->txwq); + rtw_usb_tx_queue_purge(rtwusb); }
static int rtw_usb_intf_init(struct rtw_dev *rtwdev,
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Karthikeyan Periyasamy quic_periyasa@quicinc.com
[ Upstream commit 842addae02089fce4731be1c8d7d539449d4d009 ]
Currently mac80211 hw data is accessed by convert the hw to radio (ar) structure and then radio to hw structure which is not necessary in some places where mac80211 hw data is already present. So in that kind of places avoid the conversion and directly access the mac80211 hw data.
Tested-on: QCN9274 hw2.0 PCI WLAN.WBE.1.0.1-00029-QCAHKSWPL_SILICONZ-1
Signed-off-by: Karthikeyan Periyasamy quic_periyasa@quicinc.com Acked-by: Jeff Johnson quic_jjohnson@quicinc.com Signed-off-by: Kalle Valo quic_kvalo@quicinc.com Link: https://lore.kernel.org/r/20231120235812.2602198-2-quic_periyasa@quicinc.com Stable-dep-of: 8fac3266c68a ("wifi: ath12k: fix atomic calls in ath12k_mac_op_set_bitrate_mask()") Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/wireless/ath/ath12k/mac.c | 14 +++++++------- drivers/net/wireless/ath/ath12k/reg.c | 6 +++--- 2 files changed, 10 insertions(+), 10 deletions(-)
diff --git a/drivers/net/wireless/ath/ath12k/mac.c b/drivers/net/wireless/ath/ath12k/mac.c index f90191a290c2..e4b8c45898d2 100644 --- a/drivers/net/wireless/ath/ath12k/mac.c +++ b/drivers/net/wireless/ath/ath12k/mac.c @@ -4945,7 +4945,7 @@ static void ath12k_mac_op_tx(struct ieee80211_hw *hw, if (ret) { ath12k_warn(ar->ab, "failed to queue management frame %d\n", ret); - ieee80211_free_txskb(ar->hw, skb); + ieee80211_free_txskb(hw, skb); } return; } @@ -4953,7 +4953,7 @@ static void ath12k_mac_op_tx(struct ieee80211_hw *hw, ret = ath12k_dp_tx(ar, arvif, skb); if (ret) { ath12k_warn(ar->ab, "failed to transmit frame %d\n", ret); - ieee80211_free_txskb(ar->hw, skb); + ieee80211_free_txskb(hw, skb); } }
@@ -5496,7 +5496,7 @@ static int ath12k_mac_op_add_interface(struct ieee80211_hw *hw, goto err_peer_del;
param_id = WMI_VDEV_PARAM_RTS_THRESHOLD; - param_value = ar->hw->wiphy->rts_threshold; + param_value = hw->wiphy->rts_threshold; ret = ath12k_wmi_vdev_set_param_cmd(ar, arvif->vdev_id, param_id, param_value); if (ret) { @@ -6676,7 +6676,7 @@ ath12k_mac_op_set_bitrate_mask(struct ieee80211_hw *hw, arvif->vdev_id, ret); return ret; } - ieee80211_iterate_stations_atomic(ar->hw, + ieee80211_iterate_stations_atomic(hw, ath12k_mac_disable_peer_fixed_rate, arvif); } else if (ath12k_mac_bitrate_mask_get_single_nss(ar, band, mask, @@ -6722,14 +6722,14 @@ ath12k_mac_op_set_bitrate_mask(struct ieee80211_hw *hw, return -EINVAL; }
- ieee80211_iterate_stations_atomic(ar->hw, + ieee80211_iterate_stations_atomic(hw, ath12k_mac_disable_peer_fixed_rate, arvif);
mutex_lock(&ar->conf_mutex);
arvif->bitrate_mask = *mask; - ieee80211_iterate_stations_atomic(ar->hw, + ieee80211_iterate_stations_atomic(hw, ath12k_mac_set_bitrate_mask_iter, arvif);
@@ -6767,7 +6767,7 @@ ath12k_mac_op_reconfig_complete(struct ieee80211_hw *hw, ath12k_warn(ar->ab, "pdev %d successfully recovered\n", ar->pdev->pdev_id); ar->state = ATH12K_STATE_ON; - ieee80211_wake_queues(ar->hw); + ieee80211_wake_queues(hw);
if (ab->is_reset) { recovery_count = atomic_inc_return(&ab->recovery_count); diff --git a/drivers/net/wireless/ath/ath12k/reg.c b/drivers/net/wireless/ath/ath12k/reg.c index 32bdefeccc24..837a3e1ec3a4 100644 --- a/drivers/net/wireless/ath/ath12k/reg.c +++ b/drivers/net/wireless/ath/ath12k/reg.c @@ -28,11 +28,11 @@ static const struct ieee80211_regdomain ath12k_world_regd = { } };
-static bool ath12k_regdom_changes(struct ath12k *ar, char *alpha2) +static bool ath12k_regdom_changes(struct ieee80211_hw *hw, char *alpha2) { const struct ieee80211_regdomain *regd;
- regd = rcu_dereference_rtnl(ar->hw->wiphy->regd); + regd = rcu_dereference_rtnl(hw->wiphy->regd); /* This can happen during wiphy registration where the previous * user request is received before we update the regd received * from firmware. @@ -71,7 +71,7 @@ ath12k_reg_notifier(struct wiphy *wiphy, struct regulatory_request *request) return; }
- if (!ath12k_regdom_changes(ar, request->alpha2)) { + if (!ath12k_regdom_changes(hw, request->alpha2)) { ath12k_dbg(ar->ab, ATH12K_DBG_REG, "Country is already set\n"); return; }
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Rory Little rory@candelatech.com
[ Upstream commit 7c3b69eadea9e57c28bf914b0fd70f268f3682e1 ]
Drivers may at times want to iterate their stations with a function which requires some non-atomic operations.
ieee80211_iterate_stations_mtx() introduces an API to iterate stations while holding that wiphy's mutex. This allows the iterating function to do non-atomic operations safely.
Signed-off-by: Rory Little rory@candelatech.com Link: https://patch.msgid.link/20240806004024.2014080-2-rory@candelatech.com [unify internal list iteration functions] Signed-off-by: Johannes Berg johannes.berg@intel.com Stable-dep-of: 8fac3266c68a ("wifi: ath12k: fix atomic calls in ath12k_mac_op_set_bitrate_mask()") Signed-off-by: Sasha Levin sashal@kernel.org --- include/net/mac80211.h | 18 ++++++++++++++++++ net/mac80211/util.c | 16 +++++++++++++++- 2 files changed, 33 insertions(+), 1 deletion(-)
diff --git a/include/net/mac80211.h b/include/net/mac80211.h index 901fe3ac139e..835a58ce9ca5 100644 --- a/include/net/mac80211.h +++ b/include/net/mac80211.h @@ -6080,6 +6080,24 @@ void ieee80211_iterate_stations_atomic(struct ieee80211_hw *hw, void (*iterator)(void *data, struct ieee80211_sta *sta), void *data); + +/** + * ieee80211_iterate_stations_mtx - iterate stations + * + * This function iterates over all stations associated with a given + * hardware that are currently uploaded to the driver and calls the callback + * function for them. This version can only be used while holding the wiphy + * mutex. + * + * @hw: the hardware struct of which the interfaces should be iterated over + * @iterator: the iterator function to call + * @data: first argument of the iterator function + */ +void ieee80211_iterate_stations_mtx(struct ieee80211_hw *hw, + void (*iterator)(void *data, + struct ieee80211_sta *sta), + void *data); + /** * ieee80211_queue_work - add work onto the mac80211 workqueue * diff --git a/net/mac80211/util.c b/net/mac80211/util.c index d682c32821a1..cc3c46a82077 100644 --- a/net/mac80211/util.c +++ b/net/mac80211/util.c @@ -827,7 +827,8 @@ static void __iterate_stations(struct ieee80211_local *local, { struct sta_info *sta;
- list_for_each_entry_rcu(sta, &local->sta_list, list) { + list_for_each_entry_rcu(sta, &local->sta_list, list, + lockdep_is_held(&local->hw.wiphy->mtx)) { if (!sta->uploaded) continue;
@@ -848,6 +849,19 @@ void ieee80211_iterate_stations_atomic(struct ieee80211_hw *hw, } EXPORT_SYMBOL_GPL(ieee80211_iterate_stations_atomic);
+void ieee80211_iterate_stations_mtx(struct ieee80211_hw *hw, + void (*iterator)(void *data, + struct ieee80211_sta *sta), + void *data) +{ + struct ieee80211_local *local = hw_to_local(hw); + + lockdep_assert_wiphy(local->hw.wiphy); + + __iterate_stations(local, iterator, data); +} +EXPORT_SYMBOL_GPL(ieee80211_iterate_stations_mtx); + struct ieee80211_vif *wdev_to_ieee80211_vif(struct wireless_dev *wdev) { struct ieee80211_sub_if_data *sdata = IEEE80211_WDEV_TO_SUB_IF(wdev);
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Kalle Valo quic_kvalo@quicinc.com
[ Upstream commit 8fac3266c68a8e647240b8ac8d0b82f1821edf85 ]
When I try to manually set bitrates:
iw wlan0 set bitrates legacy-2.4 1
I get sleeping from invalid context error, see below. Fix that by switching to use recently introduced ieee80211_iterate_stations_mtx().
Do note that WCN6855 firmware is still crashing, I'm not sure if that firmware even supports bitrate WMI commands and should we consider disabling ath12k_mac_op_set_bitrate_mask() for WCN6855? But that's for another patch.
BUG: sleeping function called from invalid context at drivers/net/wireless/ath/ath12k/wmi.c:420 in_atomic(): 0, irqs_disabled(): 0, non_block: 0, pid: 2236, name: iw preempt_count: 0, expected: 0 RCU nest depth: 1, expected: 0 3 locks held by iw/2236: #0: ffffffffabc6f1d8 (cb_lock){++++}-{3:3}, at: genl_rcv+0x14/0x40 #1: ffff888138410810 (&rdev->wiphy.mtx){+.+.}-{3:3}, at: nl80211_pre_doit+0x54d/0x800 [cfg80211] #2: ffffffffab2cfaa0 (rcu_read_lock){....}-{1:2}, at: ieee80211_iterate_stations_atomic+0x2f/0x200 [mac80211] CPU: 3 UID: 0 PID: 2236 Comm: iw Not tainted 6.11.0-rc7-wt-ath+ #1772 Hardware name: Intel(R) Client Systems NUC8i7HVK/NUC8i7HVB, BIOS HNKBLi70.86A.0067.2021.0528.1339 05/28/2021 Call Trace: <TASK> dump_stack_lvl+0xa4/0xe0 dump_stack+0x10/0x20 __might_resched+0x363/0x5a0 ? __alloc_skb+0x165/0x340 __might_sleep+0xad/0x160 ath12k_wmi_cmd_send+0xb1/0x3d0 [ath12k] ? ath12k_wmi_init_wcn7850+0xa40/0xa40 [ath12k] ? __netdev_alloc_skb+0x45/0x7b0 ? __asan_memset+0x39/0x40 ? ath12k_wmi_alloc_skb+0xf0/0x150 [ath12k] ? reacquire_held_locks+0x4d0/0x4d0 ath12k_wmi_set_peer_param+0x340/0x5b0 [ath12k] ath12k_mac_disable_peer_fixed_rate+0xa3/0x110 [ath12k] ? ath12k_mac_vdev_stop+0x4f0/0x4f0 [ath12k] ieee80211_iterate_stations_atomic+0xd4/0x200 [mac80211] ath12k_mac_op_set_bitrate_mask+0x5d2/0x1080 [ath12k] ? ath12k_mac_vif_chan+0x320/0x320 [ath12k] drv_set_bitrate_mask+0x267/0x470 [mac80211] ieee80211_set_bitrate_mask+0x4cc/0x8a0 [mac80211] ? __this_cpu_preempt_check+0x13/0x20 nl80211_set_tx_bitrate_mask+0x2bc/0x530 [cfg80211] ? nl80211_parse_tx_bitrate_mask+0x2320/0x2320 [cfg80211] ? trace_contention_end+0xef/0x140 ? rtnl_unlock+0x9/0x10 ? nl80211_pre_doit+0x557/0x800 [cfg80211] genl_family_rcv_msg_doit+0x1f0/0x2e0 ? genl_family_rcv_msg_attrs_parse.isra.0+0x250/0x250 ? ns_capable+0x57/0xd0 genl_family_rcv_msg+0x34c/0x600 ? genl_family_rcv_msg_dumpit+0x310/0x310 ? __lock_acquire+0xc62/0x1de0 ? he_set_mcs_mask.isra.0+0x8d0/0x8d0 [cfg80211] ? nl80211_parse_tx_bitrate_mask+0x2320/0x2320 [cfg80211] ? cfg80211_external_auth_request+0x690/0x690 [cfg80211] genl_rcv_msg+0xa0/0x130 netlink_rcv_skb+0x14c/0x400 ? genl_family_rcv_msg+0x600/0x600 ? netlink_ack+0xd70/0xd70 ? rwsem_optimistic_spin+0x4f0/0x4f0 ? genl_rcv+0x14/0x40 ? down_read_killable+0x580/0x580 ? netlink_deliver_tap+0x13e/0x350 ? __this_cpu_preempt_check+0x13/0x20 genl_rcv+0x23/0x40 netlink_unicast+0x45e/0x790 ? netlink_attachskb+0x7f0/0x7f0 netlink_sendmsg+0x7eb/0xdb0 ? netlink_unicast+0x790/0x790 ? __this_cpu_preempt_check+0x13/0x20 ? selinux_socket_sendmsg+0x31/0x40 ? netlink_unicast+0x790/0x790 __sock_sendmsg+0xc9/0x160 ____sys_sendmsg+0x620/0x990 ? kernel_sendmsg+0x30/0x30 ? __copy_msghdr+0x410/0x410 ? __kasan_check_read+0x11/0x20 ? mark_lock+0xe6/0x1470 ___sys_sendmsg+0xe9/0x170 ? copy_msghdr_from_user+0x120/0x120 ? __lock_acquire+0xc62/0x1de0 ? do_fault_around+0x2c6/0x4e0 ? do_user_addr_fault+0x8c1/0xde0 ? reacquire_held_locks+0x220/0x4d0 ? do_user_addr_fault+0x8c1/0xde0 ? __kasan_check_read+0x11/0x20 ? __fdget+0x4e/0x1d0 ? sockfd_lookup_light+0x1a/0x170 __sys_sendmsg+0xd2/0x180 ? __sys_sendmsg_sock+0x20/0x20 ? reacquire_held_locks+0x4d0/0x4d0 ? debug_smp_processor_id+0x17/0x20 __x64_sys_sendmsg+0x72/0xb0 ? lockdep_hardirqs_on+0x7d/0x100 x64_sys_call+0x894/0x9f0 do_syscall_64+0x64/0x130 entry_SYSCALL_64_after_hwframe+0x4b/0x53 RIP: 0033:0x7f230fe04807 Code: 64 89 02 48 c7 c0 ff ff ff ff eb bb 0f 1f 80 00 00 00 00 f3 0f 1e fa 64 8b 04 25 18 00 00 00 85 c0 75 10 b8 2e 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 51 c3 48 83 ec 28 89 54 24 1c 48 89 74 24 10 RSP: 002b:00007ffe996a7ea8 EFLAGS: 00000246 ORIG_RAX: 000000000000002e RAX: ffffffffffffffda RBX: 0000556f9f9c3390 RCX: 00007f230fe04807 RDX: 0000000000000000 RSI: 00007ffe996a7ee0 RDI: 0000000000000003 RBP: 0000556f9f9c88c0 R08: 0000000000000002 R09: 0000000000000000 R10: 0000556f965ca190 R11: 0000000000000246 R12: 0000556f9f9c8780 R13: 00007ffe996a7ee0 R14: 0000556f9f9c87d0 R15: 0000556f9f9c88c0 </TASK>
Tested-on: WCN7850 hw2.0 PCI WLAN.HMT.1.0.c5-00481-QCAHMTSWPL_V1.0_V2.0_SILICONZ-3
Signed-off-by: Kalle Valo quic_kvalo@quicinc.com Link: https://patch.msgid.link/20241007165932.78081-2-kvalo@kernel.org Signed-off-by: Jeff Johnson quic_jjohnson@quicinc.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/wireless/ath/ath12k/mac.c | 18 +++++++++--------- 1 file changed, 9 insertions(+), 9 deletions(-)
diff --git a/drivers/net/wireless/ath/ath12k/mac.c b/drivers/net/wireless/ath/ath12k/mac.c index e4b8c45898d2..713899735ccc 100644 --- a/drivers/net/wireless/ath/ath12k/mac.c +++ b/drivers/net/wireless/ath/ath12k/mac.c @@ -6676,9 +6676,9 @@ ath12k_mac_op_set_bitrate_mask(struct ieee80211_hw *hw, arvif->vdev_id, ret); return ret; } - ieee80211_iterate_stations_atomic(hw, - ath12k_mac_disable_peer_fixed_rate, - arvif); + ieee80211_iterate_stations_mtx(hw, + ath12k_mac_disable_peer_fixed_rate, + arvif); } else if (ath12k_mac_bitrate_mask_get_single_nss(ar, band, mask, &single_nss)) { rate = WMI_FIXED_RATE_NONE; @@ -6722,16 +6722,16 @@ ath12k_mac_op_set_bitrate_mask(struct ieee80211_hw *hw, return -EINVAL; }
- ieee80211_iterate_stations_atomic(hw, - ath12k_mac_disable_peer_fixed_rate, - arvif); + ieee80211_iterate_stations_mtx(hw, + ath12k_mac_disable_peer_fixed_rate, + arvif);
mutex_lock(&ar->conf_mutex);
arvif->bitrate_mask = *mask; - ieee80211_iterate_stations_atomic(hw, - ath12k_mac_set_bitrate_mask_iter, - arvif); + ieee80211_iterate_stations_mtx(hw, + ath12k_mac_set_bitrate_mask_iter, + arvif);
mutex_unlock(&ar->conf_mutex); }
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jeff Johnson quic_jjohnson@quicinc.com
[ Upstream commit b1dc0ba41431147e55407140962c76f3e7a06753 ]
Update the copyright for all ath10k files modified on behalf of Qualcomm Innovation Center, Inc. in 2021 through 2023.
Signed-off-by: Jeff Johnson quic_jjohnson@quicinc.com Signed-off-by: Kalle Valo quic_kvalo@quicinc.com Link: https://lore.kernel.org/r/20231128-ath12kcopyrights-v1-3-be0b7408cbac@quicin... Stable-dep-of: 95c38953cb1e ("wifi: ath10k: avoid NULL pointer error during sdio remove") Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/wireless/ath/ath10k/bmi.c | 1 + drivers/net/wireless/ath/ath10k/ce.c | 1 + drivers/net/wireless/ath/ath10k/core.c | 1 + drivers/net/wireless/ath/ath10k/core.h | 1 + drivers/net/wireless/ath/ath10k/coredump.c | 1 + drivers/net/wireless/ath/ath10k/coredump.h | 1 + drivers/net/wireless/ath/ath10k/debug.c | 1 + drivers/net/wireless/ath/ath10k/debugfs_sta.c | 1 + drivers/net/wireless/ath/ath10k/htc.c | 1 + drivers/net/wireless/ath/ath10k/htt.h | 1 + drivers/net/wireless/ath/ath10k/htt_rx.c | 1 + drivers/net/wireless/ath/ath10k/htt_tx.c | 1 + drivers/net/wireless/ath/ath10k/hw.c | 1 + drivers/net/wireless/ath/ath10k/hw.h | 1 + drivers/net/wireless/ath/ath10k/mac.c | 1 + drivers/net/wireless/ath/ath10k/pci.c | 1 + drivers/net/wireless/ath/ath10k/pci.h | 1 + drivers/net/wireless/ath/ath10k/qmi.c | 1 + drivers/net/wireless/ath/ath10k/qmi_wlfw_v01.c | 1 + drivers/net/wireless/ath/ath10k/qmi_wlfw_v01.h | 1 + drivers/net/wireless/ath/ath10k/rx_desc.h | 1 + drivers/net/wireless/ath/ath10k/sdio.c | 1 + drivers/net/wireless/ath/ath10k/thermal.c | 1 + drivers/net/wireless/ath/ath10k/usb.h | 1 + drivers/net/wireless/ath/ath10k/wmi-tlv.h | 1 + drivers/net/wireless/ath/ath10k/wmi.c | 1 + drivers/net/wireless/ath/ath10k/wmi.h | 1 + drivers/net/wireless/ath/ath10k/wow.c | 1 + 28 files changed, 28 insertions(+)
diff --git a/drivers/net/wireless/ath/ath10k/bmi.c b/drivers/net/wireless/ath/ath10k/bmi.c index af6546572df2..9a4f8e815412 100644 --- a/drivers/net/wireless/ath/ath10k/bmi.c +++ b/drivers/net/wireless/ath/ath10k/bmi.c @@ -2,6 +2,7 @@ /* * Copyright (c) 2005-2011 Atheros Communications Inc. * Copyright (c) 2011-2014,2016-2017 Qualcomm Atheros, Inc. + * Copyright (c) 2022 Qualcomm Innovation Center, Inc. All rights reserved. */
#include "bmi.h" diff --git a/drivers/net/wireless/ath/ath10k/ce.c b/drivers/net/wireless/ath/ath10k/ce.c index c27b8204718a..afae4a8027f8 100644 --- a/drivers/net/wireless/ath/ath10k/ce.c +++ b/drivers/net/wireless/ath/ath10k/ce.c @@ -3,6 +3,7 @@ * Copyright (c) 2005-2011 Atheros Communications Inc. * Copyright (c) 2011-2017 Qualcomm Atheros, Inc. * Copyright (c) 2018 The Linux Foundation. All rights reserved. + * Copyright (c) 2022 Qualcomm Innovation Center, Inc. All rights reserved. */
#include "hif.h" diff --git a/drivers/net/wireless/ath/ath10k/core.c b/drivers/net/wireless/ath/ath10k/core.c index 81058be3598f..c3a8b3496be2 100644 --- a/drivers/net/wireless/ath/ath10k/core.c +++ b/drivers/net/wireless/ath/ath10k/core.c @@ -3,6 +3,7 @@ * Copyright (c) 2005-2011 Atheros Communications Inc. * Copyright (c) 2011-2017 Qualcomm Atheros, Inc. * Copyright (c) 2018-2019, The Linux Foundation. All rights reserved. + * Copyright (c) 2021-2023 Qualcomm Innovation Center, Inc. All rights reserved. */
#include <linux/module.h> diff --git a/drivers/net/wireless/ath/ath10k/core.h b/drivers/net/wireless/ath/ath10k/core.h index 4b5239de4018..cb2359d2ee0b 100644 --- a/drivers/net/wireless/ath/ath10k/core.h +++ b/drivers/net/wireless/ath/ath10k/core.h @@ -3,6 +3,7 @@ * Copyright (c) 2005-2011 Atheros Communications Inc. * Copyright (c) 2011-2017 Qualcomm Atheros, Inc. * Copyright (c) 2018-2019, The Linux Foundation. All rights reserved. + * Copyright (c) 2022 Qualcomm Innovation Center, Inc. All rights reserved. */
#ifndef _CORE_H_ diff --git a/drivers/net/wireless/ath/ath10k/coredump.c b/drivers/net/wireless/ath/ath10k/coredump.c index 2d1634a890dd..bb3a276b7ed5 100644 --- a/drivers/net/wireless/ath/ath10k/coredump.c +++ b/drivers/net/wireless/ath/ath10k/coredump.c @@ -2,6 +2,7 @@ /* * Copyright (c) 2011-2017 Qualcomm Atheros, Inc. * Copyright (c) 2018, The Linux Foundation. All rights reserved. + * Copyright (c) 2022 Qualcomm Innovation Center, Inc. All rights reserved. */
#include "coredump.h" diff --git a/drivers/net/wireless/ath/ath10k/coredump.h b/drivers/net/wireless/ath/ath10k/coredump.h index 437b9759f05d..e5ef0352e319 100644 --- a/drivers/net/wireless/ath/ath10k/coredump.h +++ b/drivers/net/wireless/ath/ath10k/coredump.h @@ -1,6 +1,7 @@ /* SPDX-License-Identifier: ISC */ /* * Copyright (c) 2011-2017 Qualcomm Atheros, Inc. + * Copyright (c) 2022 Qualcomm Innovation Center, Inc. All rights reserved. */
#ifndef _COREDUMP_H_ diff --git a/drivers/net/wireless/ath/ath10k/debug.c b/drivers/net/wireless/ath/ath10k/debug.c index fe89bc61e531..92ad0a04bcc7 100644 --- a/drivers/net/wireless/ath/ath10k/debug.c +++ b/drivers/net/wireless/ath/ath10k/debug.c @@ -3,6 +3,7 @@ * Copyright (c) 2005-2011 Atheros Communications Inc. * Copyright (c) 2011-2017 Qualcomm Atheros, Inc. * Copyright (c) 2018, The Linux Foundation. All rights reserved. + * Copyright (c) 2022 Qualcomm Innovation Center, Inc. All rights reserved. */
#include <linux/module.h> diff --git a/drivers/net/wireless/ath/ath10k/debugfs_sta.c b/drivers/net/wireless/ath/ath10k/debugfs_sta.c index 5598cf706daa..0f6de862c3a9 100644 --- a/drivers/net/wireless/ath/ath10k/debugfs_sta.c +++ b/drivers/net/wireless/ath/ath10k/debugfs_sta.c @@ -2,6 +2,7 @@ /* * Copyright (c) 2014-2017 Qualcomm Atheros, Inc. * Copyright (c) 2018, The Linux Foundation. All rights reserved. + * Copyright (c) 2022 Qualcomm Innovation Center, Inc. All rights reserved. */
#include "core.h" diff --git a/drivers/net/wireless/ath/ath10k/htc.c b/drivers/net/wireless/ath/ath10k/htc.c index 5bfeecb95fca..a6e21ce90bad 100644 --- a/drivers/net/wireless/ath/ath10k/htc.c +++ b/drivers/net/wireless/ath/ath10k/htc.c @@ -2,6 +2,7 @@ /* * Copyright (c) 2005-2011 Atheros Communications Inc. * Copyright (c) 2011-2017 Qualcomm Atheros, Inc. + * Copyright (c) 2022 Qualcomm Innovation Center, Inc. All rights reserved. */
#include "core.h" diff --git a/drivers/net/wireless/ath/ath10k/htt.h b/drivers/net/wireless/ath/ath10k/htt.h index 7b24297146e7..52f6dc6b81c5 100644 --- a/drivers/net/wireless/ath/ath10k/htt.h +++ b/drivers/net/wireless/ath/ath10k/htt.h @@ -3,6 +3,7 @@ * Copyright (c) 2005-2011 Atheros Communications Inc. * Copyright (c) 2011-2017 Qualcomm Atheros, Inc. * Copyright (c) 2018, The Linux Foundation. All rights reserved. + * Copyright (c) 2021, 2023 Qualcomm Innovation Center, Inc. All rights reserved. */
#ifndef _HTT_H_ diff --git a/drivers/net/wireless/ath/ath10k/htt_rx.c b/drivers/net/wireless/ath/ath10k/htt_rx.c index 438b0caaceb7..51855f23ea26 100644 --- a/drivers/net/wireless/ath/ath10k/htt_rx.c +++ b/drivers/net/wireless/ath/ath10k/htt_rx.c @@ -3,6 +3,7 @@ * Copyright (c) 2005-2011 Atheros Communications Inc. * Copyright (c) 2011-2017 Qualcomm Atheros, Inc. * Copyright (c) 2018, The Linux Foundation. All rights reserved. + * Copyright (c) 2022 Qualcomm Innovation Center, Inc. All rights reserved. */
#include "core.h" diff --git a/drivers/net/wireless/ath/ath10k/htt_tx.c b/drivers/net/wireless/ath/ath10k/htt_tx.c index bd603feb7953..60425d22d707 100644 --- a/drivers/net/wireless/ath/ath10k/htt_tx.c +++ b/drivers/net/wireless/ath/ath10k/htt_tx.c @@ -2,6 +2,7 @@ /* * Copyright (c) 2005-2011 Atheros Communications Inc. * Copyright (c) 2011-2017 Qualcomm Atheros, Inc. + * Copyright (c) 2022 Qualcomm Innovation Center, Inc. All rights reserved. */
#include <linux/etherdevice.h> diff --git a/drivers/net/wireless/ath/ath10k/hw.c b/drivers/net/wireless/ath/ath10k/hw.c index 6d32b43a4da6..8fafe096adff 100644 --- a/drivers/net/wireless/ath/ath10k/hw.c +++ b/drivers/net/wireless/ath/ath10k/hw.c @@ -1,6 +1,7 @@ // SPDX-License-Identifier: ISC /* * Copyright (c) 2014-2017 Qualcomm Atheros, Inc. + * Copyright (c) 2022 Qualcomm Innovation Center, Inc. All rights reserved. */
#include <linux/types.h> diff --git a/drivers/net/wireless/ath/ath10k/hw.h b/drivers/net/wireless/ath/ath10k/hw.h index 7ecdd0011cfa..afd336282615 100644 --- a/drivers/net/wireless/ath/ath10k/hw.h +++ b/drivers/net/wireless/ath/ath10k/hw.h @@ -3,6 +3,7 @@ * Copyright (c) 2005-2011 Atheros Communications Inc. * Copyright (c) 2011-2017 Qualcomm Atheros, Inc. * Copyright (c) 2018 The Linux Foundation. All rights reserved. + * Copyright (c) 2022 Qualcomm Innovation Center, Inc. All rights reserved. */
#ifndef _HW_H_ diff --git a/drivers/net/wireless/ath/ath10k/mac.c b/drivers/net/wireless/ath/ath10k/mac.c index d5e6e11f630b..655fb5cdf01f 100644 --- a/drivers/net/wireless/ath/ath10k/mac.c +++ b/drivers/net/wireless/ath/ath10k/mac.c @@ -3,6 +3,7 @@ * Copyright (c) 2005-2011 Atheros Communications Inc. * Copyright (c) 2011-2017 Qualcomm Atheros, Inc. * Copyright (c) 2018-2019, The Linux Foundation. All rights reserved. + * Copyright (c) 2021-2023 Qualcomm Innovation Center, Inc. All rights reserved. */
#include "mac.h" diff --git a/drivers/net/wireless/ath/ath10k/pci.c b/drivers/net/wireless/ath/ath10k/pci.c index 23f366221939..aaa240f3c08a 100644 --- a/drivers/net/wireless/ath/ath10k/pci.c +++ b/drivers/net/wireless/ath/ath10k/pci.c @@ -2,6 +2,7 @@ /* * Copyright (c) 2005-2011 Atheros Communications Inc. * Copyright (c) 2011-2017 Qualcomm Atheros, Inc. + * Copyright (c) 2022-2023 Qualcomm Innovation Center, Inc. All rights reserved. */
#include <linux/pci.h> diff --git a/drivers/net/wireless/ath/ath10k/pci.h b/drivers/net/wireless/ath/ath10k/pci.h index 480cd97ab739..27bb4cf2dfea 100644 --- a/drivers/net/wireless/ath/ath10k/pci.h +++ b/drivers/net/wireless/ath/ath10k/pci.h @@ -2,6 +2,7 @@ /* * Copyright (c) 2005-2011 Atheros Communications Inc. * Copyright (c) 2011-2017 Qualcomm Atheros, Inc. + * Copyright (c) 2022 Qualcomm Innovation Center, Inc. All rights reserved. */
#ifndef _PCI_H_ diff --git a/drivers/net/wireless/ath/ath10k/qmi.c b/drivers/net/wireless/ath/ath10k/qmi.c index 52c1a3de8da6..38e939f572a9 100644 --- a/drivers/net/wireless/ath/ath10k/qmi.c +++ b/drivers/net/wireless/ath/ath10k/qmi.c @@ -1,6 +1,7 @@ // SPDX-License-Identifier: ISC /* * Copyright (c) 2018 The Linux Foundation. All rights reserved. + * Copyright (c) 2022-2023 Qualcomm Innovation Center, Inc. All rights reserved. */
#include <linux/completion.h> diff --git a/drivers/net/wireless/ath/ath10k/qmi_wlfw_v01.c b/drivers/net/wireless/ath/ath10k/qmi_wlfw_v01.c index 1c81e454f943..0e85c75d2278 100644 --- a/drivers/net/wireless/ath/ath10k/qmi_wlfw_v01.c +++ b/drivers/net/wireless/ath/ath10k/qmi_wlfw_v01.c @@ -1,6 +1,7 @@ // SPDX-License-Identifier: ISC /* * Copyright (c) 2018 The Linux Foundation. All rights reserved. + * Copyright (c) 2022 Qualcomm Innovation Center, Inc. All rights reserved. */
#include <linux/soc/qcom/qmi.h> diff --git a/drivers/net/wireless/ath/ath10k/qmi_wlfw_v01.h b/drivers/net/wireless/ath/ath10k/qmi_wlfw_v01.h index f0db991408dc..9f311f3bc9e7 100644 --- a/drivers/net/wireless/ath/ath10k/qmi_wlfw_v01.h +++ b/drivers/net/wireless/ath/ath10k/qmi_wlfw_v01.h @@ -1,6 +1,7 @@ /* SPDX-License-Identifier: ISC */ /* * Copyright (c) 2018 The Linux Foundation. All rights reserved. + * Copyright (c) 2022 Qualcomm Innovation Center, Inc. All rights reserved. */
#ifndef WCN3990_QMI_SVC_V01_H diff --git a/drivers/net/wireless/ath/ath10k/rx_desc.h b/drivers/net/wireless/ath/ath10k/rx_desc.h index 777e53aa69dc..564293df1e9a 100644 --- a/drivers/net/wireless/ath/ath10k/rx_desc.h +++ b/drivers/net/wireless/ath/ath10k/rx_desc.h @@ -2,6 +2,7 @@ /* * Copyright (c) 2005-2011 Atheros Communications Inc. * Copyright (c) 2011-2017 Qualcomm Atheros, Inc. + * Copyright (c) 2022 Qualcomm Innovation Center, Inc. All rights reserved. */
#ifndef _RX_DESC_H_ diff --git a/drivers/net/wireless/ath/ath10k/sdio.c b/drivers/net/wireless/ath/ath10k/sdio.c index 56fbcfb80bf8..0ab5433f6cf6 100644 --- a/drivers/net/wireless/ath/ath10k/sdio.c +++ b/drivers/net/wireless/ath/ath10k/sdio.c @@ -3,6 +3,7 @@ * Copyright (c) 2004-2011 Atheros Communications Inc. * Copyright (c) 2011-2012,2017 Qualcomm Atheros, Inc. * Copyright (c) 2016-2017 Erik Stromdahl erik.stromdahl@gmail.com + * Copyright (c) 2022-2023 Qualcomm Innovation Center, Inc. All rights reserved. */
#include <linux/module.h> diff --git a/drivers/net/wireless/ath/ath10k/thermal.c b/drivers/net/wireless/ath/ath10k/thermal.c index cefd97323dfe..31c8d7fbb095 100644 --- a/drivers/net/wireless/ath/ath10k/thermal.c +++ b/drivers/net/wireless/ath/ath10k/thermal.c @@ -1,6 +1,7 @@ // SPDX-License-Identifier: ISC /* * Copyright (c) 2014-2015 Qualcomm Atheros, Inc. + * Copyright (c) 2022 Qualcomm Innovation Center, Inc. All rights reserved. */
#include <linux/device.h> diff --git a/drivers/net/wireless/ath/ath10k/usb.h b/drivers/net/wireless/ath/ath10k/usb.h index 48e066ba8162..7e4cfbb673c9 100644 --- a/drivers/net/wireless/ath/ath10k/usb.h +++ b/drivers/net/wireless/ath/ath10k/usb.h @@ -3,6 +3,7 @@ * Copyright (c) 2004-2011 Atheros Communications Inc. * Copyright (c) 2011-2012 Qualcomm Atheros, Inc. * Copyright (c) 2016-2017 Erik Stromdahl erik.stromdahl@gmail.com + * Copyright (c) 2022 Qualcomm Innovation Center, Inc. All rights reserved. */
#ifndef _USB_H_ diff --git a/drivers/net/wireless/ath/ath10k/wmi-tlv.h b/drivers/net/wireless/ath/ath10k/wmi-tlv.h index dbb48d70f2e9..83a8f07a687f 100644 --- a/drivers/net/wireless/ath/ath10k/wmi-tlv.h +++ b/drivers/net/wireless/ath/ath10k/wmi-tlv.h @@ -3,6 +3,7 @@ * Copyright (c) 2005-2011 Atheros Communications Inc. * Copyright (c) 2011-2017 Qualcomm Atheros, Inc. * Copyright (c) 2018-2019, The Linux Foundation. All rights reserved. + * Copyright (c) 2022 Qualcomm Innovation Center, Inc. All rights reserved. */ #ifndef _WMI_TLV_H #define _WMI_TLV_H diff --git a/drivers/net/wireless/ath/ath10k/wmi.c b/drivers/net/wireless/ath/ath10k/wmi.c index 1c21dbde77b8..818aea99f85e 100644 --- a/drivers/net/wireless/ath/ath10k/wmi.c +++ b/drivers/net/wireless/ath/ath10k/wmi.c @@ -3,6 +3,7 @@ * Copyright (c) 2005-2011 Atheros Communications Inc. * Copyright (c) 2011-2017 Qualcomm Atheros, Inc. * Copyright (c) 2018-2019, The Linux Foundation. All rights reserved. + * Copyright (c) 2021-2022 Qualcomm Innovation Center, Inc. All rights reserved. */
#include <linux/skbuff.h> diff --git a/drivers/net/wireless/ath/ath10k/wmi.h b/drivers/net/wireless/ath/ath10k/wmi.h index b112e8826093..9146df98fcee 100644 --- a/drivers/net/wireless/ath/ath10k/wmi.h +++ b/drivers/net/wireless/ath/ath10k/wmi.h @@ -3,6 +3,7 @@ * Copyright (c) 2005-2011 Atheros Communications Inc. * Copyright (c) 2011-2017 Qualcomm Atheros, Inc. * Copyright (c) 2018-2019, The Linux Foundation. All rights reserved. + * Copyright (c) 2021-2023 Qualcomm Innovation Center, Inc. All rights reserved. */
#ifndef _WMI_H_ diff --git a/drivers/net/wireless/ath/ath10k/wow.c b/drivers/net/wireless/ath/ath10k/wow.c index 20b9aa8ddf7d..aa7b2e703f3d 100644 --- a/drivers/net/wireless/ath/ath10k/wow.c +++ b/drivers/net/wireless/ath/ath10k/wow.c @@ -2,6 +2,7 @@ /* * Copyright (c) 2015-2017 Qualcomm Atheros, Inc. * Copyright (c) 2018, The Linux Foundation. All rights reserved. + * Copyright (c) 2022 Qualcomm Innovation Center, Inc. All rights reserved. */
#include "mac.h"
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Kang Yang quic_kangyang@quicinc.com
[ Upstream commit 95c38953cb1ecf40399a676a1f85dfe2b5780a9a ]
When running 'rmmod ath10k', ath10k_sdio_remove() will free sdio workqueue by destroy_workqueue(). But if CONFIG_INIT_ON_FREE_DEFAULT_ON is set to yes, kernel panic will happen: Call trace: destroy_workqueue+0x1c/0x258 ath10k_sdio_remove+0x84/0x94 sdio_bus_remove+0x50/0x16c device_release_driver_internal+0x188/0x25c device_driver_detach+0x20/0x2c
This is because during 'rmmod ath10k', ath10k_sdio_remove() will call ath10k_core_destroy() before destroy_workqueue(). wiphy_dev_release() will finally be called in ath10k_core_destroy(). This function will free struct cfg80211_registered_device *rdev and all its members, including wiphy, dev and the pointer of sdio workqueue. Then the pointer of sdio workqueue will be set to NULL due to CONFIG_INIT_ON_FREE_DEFAULT_ON.
After device release, destroy_workqueue() will use NULL pointer then the kernel panic happen.
Call trace: ath10k_sdio_remove ->ath10k_core_unregister …… ->ath10k_core_stop ->ath10k_hif_stop ->ath10k_sdio_irq_disable ->ath10k_hif_power_down ->del_timer_sync(&ar_sdio->sleep_timer) ->ath10k_core_destroy ->ath10k_mac_destroy ->ieee80211_free_hw ->wiphy_free …… ->wiphy_dev_release ->destroy_workqueue
Need to call destroy_workqueue() before ath10k_core_destroy(), free the work queue buffer first and then free pointer of work queue by ath10k_core_destroy(). This order matches the error path order in ath10k_sdio_probe().
No work will be queued on sdio workqueue between it is destroyed and ath10k_core_destroy() is called. Based on the call_stack above, the reason is: Only ath10k_sdio_sleep_timer_handler(), ath10k_sdio_hif_tx_sg() and ath10k_sdio_irq_disable() will queue work on sdio workqueue. Sleep timer will be deleted before ath10k_core_destroy() in ath10k_hif_power_down(). ath10k_sdio_irq_disable() only be called in ath10k_hif_stop(). ath10k_core_unregister() will call ath10k_hif_power_down() to stop hif bus, so ath10k_sdio_hif_tx_sg() won't be called anymore.
Tested-on: QCA6174 hw3.2 SDIO WLAN.RMH.4.4.1-00189
Signed-off-by: Kang Yang quic_kangyang@quicinc.com Tested-by: David Ruth druth@chromium.org Reviewed-by: David Ruth druth@chromium.org Link: https://patch.msgid.link/20241008022246.1010-1-quic_kangyang@quicinc.com Signed-off-by: Jeff Johnson quic_jjohnson@quicinc.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/wireless/ath/ath10k/sdio.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/drivers/net/wireless/ath/ath10k/sdio.c b/drivers/net/wireless/ath/ath10k/sdio.c index 0ab5433f6cf6..850d999615a2 100644 --- a/drivers/net/wireless/ath/ath10k/sdio.c +++ b/drivers/net/wireless/ath/ath10k/sdio.c @@ -3,7 +3,7 @@ * Copyright (c) 2004-2011 Atheros Communications Inc. * Copyright (c) 2011-2012,2017 Qualcomm Atheros, Inc. * Copyright (c) 2016-2017 Erik Stromdahl erik.stromdahl@gmail.com - * Copyright (c) 2022-2023 Qualcomm Innovation Center, Inc. All rights reserved. + * Copyright (c) 2022-2024 Qualcomm Innovation Center, Inc. All rights reserved. */
#include <linux/module.h> @@ -2648,9 +2648,9 @@ static void ath10k_sdio_remove(struct sdio_func *func)
netif_napi_del(&ar->napi);
- ath10k_core_destroy(ar); - destroy_workqueue(ar_sdio->workqueue); + + ath10k_core_destroy(ar); }
static const struct sdio_device_id ath10k_sdio_devices[] = {
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jarkko Nikula jarkko.nikula@linux.intel.com
[ Upstream commit f0eda4ddb2146a9f29d31b54c396f741bd0c82f1 ]
Add SMBus PCI ID on Intel Arrow Lake-H.
Signed-off-by: Jarkko Nikula jarkko.nikula@linux.intel.com Signed-off-by: Andi Shyti andi.shyti@kernel.org Stable-dep-of: bd492b583712 ("i2c: i801: Add support for Intel Panther Lake") Signed-off-by: Sasha Levin sashal@kernel.org --- Documentation/i2c/busses/i2c-i801.rst | 1 + drivers/i2c/busses/Kconfig | 1 + drivers/i2c/busses/i2c-i801.c | 3 +++ 3 files changed, 5 insertions(+)
diff --git a/Documentation/i2c/busses/i2c-i801.rst b/Documentation/i2c/busses/i2c-i801.rst index 10eced6c2e46..c840b597912c 100644 --- a/Documentation/i2c/busses/i2c-i801.rst +++ b/Documentation/i2c/busses/i2c-i801.rst @@ -48,6 +48,7 @@ Supported adapters: * Intel Raptor Lake (PCH) * Intel Meteor Lake (SOC and PCH) * Intel Birch Stream (SOC) + * Intel Arrow Lake (SOC)
Datasheets: Publicly available at the Intel website
diff --git a/drivers/i2c/busses/Kconfig b/drivers/i2c/busses/Kconfig index 97d27e01a6ee..0dc3c91f52a8 100644 --- a/drivers/i2c/busses/Kconfig +++ b/drivers/i2c/busses/Kconfig @@ -159,6 +159,7 @@ config I2C_I801 Raptor Lake (PCH) Meteor Lake (SOC and PCH) Birch Stream (SOC) + Arrow Lake (SOC)
This driver can also be built as a module. If so, the module will be called i2c-i801. diff --git a/drivers/i2c/busses/i2c-i801.c b/drivers/i2c/busses/i2c-i801.c index 2b8bcd121ffa..ed943f303cdb 100644 --- a/drivers/i2c/busses/i2c-i801.c +++ b/drivers/i2c/busses/i2c-i801.c @@ -80,6 +80,7 @@ * Meteor Lake SoC-S (SOC) 0xae22 32 hard yes yes yes * Meteor Lake PCH-S (PCH) 0x7f23 32 hard yes yes yes * Birch Stream (SOC) 0x5796 32 hard yes yes yes + * Arrow Lake-H (SOC) 0x7722 32 hard yes yes yes * * Features supported by this driver: * Software PEC no @@ -234,6 +235,7 @@ #define PCI_DEVICE_ID_INTEL_ALDER_LAKE_M_SMBUS 0x54a3 #define PCI_DEVICE_ID_INTEL_BIRCH_STREAM_SMBUS 0x5796 #define PCI_DEVICE_ID_INTEL_BROXTON_SMBUS 0x5ad4 +#define PCI_DEVICE_ID_INTEL_ARROW_LAKE_H_SMBUS 0x7722 #define PCI_DEVICE_ID_INTEL_RAPTOR_LAKE_S_SMBUS 0x7a23 #define PCI_DEVICE_ID_INTEL_ALDER_LAKE_S_SMBUS 0x7aa3 #define PCI_DEVICE_ID_INTEL_METEOR_LAKE_P_SMBUS 0x7e22 @@ -1046,6 +1048,7 @@ static const struct pci_device_id i801_ids[] = { { PCI_DEVICE_DATA(INTEL, METEOR_LAKE_SOC_S_SMBUS, FEATURES_ICH5 | FEATURE_TCO_CNL) }, { PCI_DEVICE_DATA(INTEL, METEOR_LAKE_PCH_S_SMBUS, FEATURES_ICH5 | FEATURE_TCO_CNL) }, { PCI_DEVICE_DATA(INTEL, BIRCH_STREAM_SMBUS, FEATURES_ICH5 | FEATURE_TCO_CNL) }, + { PCI_DEVICE_DATA(INTEL, ARROW_LAKE_H_SMBUS, FEATURES_ICH5 | FEATURE_TCO_CNL) }, { 0, } };
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jarkko Nikula jarkko.nikula@linux.intel.com
[ Upstream commit bd492b58371295d3ae26162b9666be584abad68a ]
Add SMBus PCI IDs on Intel Panther Lake-P and -U.
Signed-off-by: Jarkko Nikula jarkko.nikula@linux.intel.com Signed-off-by: Andi Shyti andi.shyti@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- Documentation/i2c/busses/i2c-i801.rst | 1 + drivers/i2c/busses/Kconfig | 1 + drivers/i2c/busses/i2c-i801.c | 6 ++++++ 3 files changed, 8 insertions(+)
diff --git a/Documentation/i2c/busses/i2c-i801.rst b/Documentation/i2c/busses/i2c-i801.rst index c840b597912c..47e8ac5b7099 100644 --- a/Documentation/i2c/busses/i2c-i801.rst +++ b/Documentation/i2c/busses/i2c-i801.rst @@ -49,6 +49,7 @@ Supported adapters: * Intel Meteor Lake (SOC and PCH) * Intel Birch Stream (SOC) * Intel Arrow Lake (SOC) + * Intel Panther Lake (SOC)
Datasheets: Publicly available at the Intel website
diff --git a/drivers/i2c/busses/Kconfig b/drivers/i2c/busses/Kconfig index 0dc3c91f52a8..982007a112c2 100644 --- a/drivers/i2c/busses/Kconfig +++ b/drivers/i2c/busses/Kconfig @@ -160,6 +160,7 @@ config I2C_I801 Meteor Lake (SOC and PCH) Birch Stream (SOC) Arrow Lake (SOC) + Panther Lake (SOC)
This driver can also be built as a module. If so, the module will be called i2c-i801. diff --git a/drivers/i2c/busses/i2c-i801.c b/drivers/i2c/busses/i2c-i801.c index ed943f303cdb..18c04f5e41d9 100644 --- a/drivers/i2c/busses/i2c-i801.c +++ b/drivers/i2c/busses/i2c-i801.c @@ -81,6 +81,8 @@ * Meteor Lake PCH-S (PCH) 0x7f23 32 hard yes yes yes * Birch Stream (SOC) 0x5796 32 hard yes yes yes * Arrow Lake-H (SOC) 0x7722 32 hard yes yes yes + * Panther Lake-H (SOC) 0xe322 32 hard yes yes yes + * Panther Lake-P (SOC) 0xe422 32 hard yes yes yes * * Features supported by this driver: * Software PEC no @@ -258,6 +260,8 @@ #define PCI_DEVICE_ID_INTEL_CANNONLAKE_H_SMBUS 0xa323 #define PCI_DEVICE_ID_INTEL_COMETLAKE_V_SMBUS 0xa3a3 #define PCI_DEVICE_ID_INTEL_METEOR_LAKE_SOC_S_SMBUS 0xae22 +#define PCI_DEVICE_ID_INTEL_PANTHER_LAKE_H_SMBUS 0xe322 +#define PCI_DEVICE_ID_INTEL_PANTHER_LAKE_P_SMBUS 0xe422
struct i801_mux_config { char *gpio_chip; @@ -1049,6 +1053,8 @@ static const struct pci_device_id i801_ids[] = { { PCI_DEVICE_DATA(INTEL, METEOR_LAKE_PCH_S_SMBUS, FEATURES_ICH5 | FEATURE_TCO_CNL) }, { PCI_DEVICE_DATA(INTEL, BIRCH_STREAM_SMBUS, FEATURES_ICH5 | FEATURE_TCO_CNL) }, { PCI_DEVICE_DATA(INTEL, ARROW_LAKE_H_SMBUS, FEATURES_ICH5 | FEATURE_TCO_CNL) }, + { PCI_DEVICE_DATA(INTEL, PANTHER_LAKE_H_SMBUS, FEATURES_ICH5 | FEATURE_TCO_CNL) }, + { PCI_DEVICE_DATA(INTEL, PANTHER_LAKE_P_SMBUS, FEATURES_ICH5 | FEATURE_TCO_CNL) }, { 0, } };
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Markus Elfring elfring@users.sourceforge.net
[ Upstream commit d96b543c6f3b78b6440b68b5a5bbface553eff28 ]
An hci_conn_drop() call was immediately used after a null pointer check for an hci_conn_link() call in two function implementations. Thus call such a function only once instead directly before the checks.
This issue was transformed by using the Coccinelle software.
Signed-off-by: Markus Elfring elfring@users.sourceforge.net Signed-off-by: Luiz Augusto von Dentz luiz.von.dentz@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- net/bluetooth/hci_conn.c | 13 +++---------- 1 file changed, 3 insertions(+), 10 deletions(-)
diff --git a/net/bluetooth/hci_conn.c b/net/bluetooth/hci_conn.c index 6178ae8feafc..549ee9e87d63 100644 --- a/net/bluetooth/hci_conn.c +++ b/net/bluetooth/hci_conn.c @@ -2178,13 +2178,9 @@ struct hci_conn *hci_bind_bis(struct hci_dev *hdev, bdaddr_t *dst, conn->iso_qos.bcast.big); if (parent && parent != conn) { link = hci_conn_link(parent, conn); - if (!link) { - hci_conn_drop(conn); - return ERR_PTR(-ENOLINK); - } - - /* Link takes the refcount */ hci_conn_drop(conn); + if (!link) + return ERR_PTR(-ENOLINK); }
return conn; @@ -2274,15 +2270,12 @@ struct hci_conn *hci_connect_cis(struct hci_dev *hdev, bdaddr_t *dst, }
link = hci_conn_link(le, cis); + hci_conn_drop(cis); if (!link) { hci_conn_drop(le); - hci_conn_drop(cis); return ERR_PTR(-ENOLINK); }
- /* Link takes the refcount */ - hci_conn_drop(cis); - cis->state = BT_CONNECT;
hci_le_create_cis_pending(hdev);
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jingyang Wang wjy7717@126.com
[ Upstream commit 00b1c3c4b682ba4b4f9fc46e047b537e668576af ]
-Device(35f5:7922) from /sys/kernel/debug/usb/devices P: Vendor=35f5 ProdID=7922 Rev= 1.00 S: Manufacturer=MediaTek Inc. S: Product=Wireless_Device S: SerialNumber=000000000 C:* #Ifs= 3 Cfg#= 1 Atr=e0 MxPwr=100mA A: FirstIf#= 0 IfCount= 3 Cls=e0(wlcon) Sub=01 Prot=01 I:* If#= 0 Alt= 0 #EPs= 3 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb E: Ad=81(I) Atr=03(Int.) MxPS= 16 Ivl=125us E: Ad=82(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=02(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms I:* If#= 1 Alt= 0 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb E: Ad=83(I) Atr=01(Isoc) MxPS= 0 Ivl=1ms E: Ad=03(O) Atr=01(Isoc) MxPS= 0 Ivl=1ms I: If#= 1 Alt= 1 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb E: Ad=83(I) Atr=01(Isoc) MxPS= 9 Ivl=1ms E: Ad=03(O) Atr=01(Isoc) MxPS= 9 Ivl=1ms I: If#= 1 Alt= 2 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb E: Ad=83(I) Atr=01(Isoc) MxPS= 17 Ivl=1ms E: Ad=03(O) Atr=01(Isoc) MxPS= 17 Ivl=1ms I: If#= 1 Alt= 3 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb E: Ad=83(I) Atr=01(Isoc) MxPS= 25 Ivl=1ms E: Ad=03(O) Atr=01(Isoc) MxPS= 25 Ivl=1ms I: If#= 1 Alt= 4 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb E: Ad=83(I) Atr=01(Isoc) MxPS= 33 Ivl=1ms E: Ad=03(O) Atr=01(Isoc) MxPS= 33 Ivl=1ms I: If#= 1 Alt= 5 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb E: Ad=83(I) Atr=01(Isoc) MxPS= 49 Ivl=1ms E: Ad=03(O) Atr=01(Isoc) MxPS= 49 Ivl=1ms I: If#= 1 Alt= 6 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb E: Ad=83(I) Atr=01(Isoc) MxPS= 63 Ivl=1ms E: Ad=03(O) Atr=01(Isoc) MxPS= 63 Ivl=1ms I:* If#= 2 Alt= 0 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=(none) E: Ad=8a(I) Atr=03(Int.) MxPS= 64 Ivl=125us E: Ad=0a(O) Atr=03(Int.) MxPS= 64 Ivl=125us I: If#= 2 Alt= 1 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=(none) E: Ad=8a(I) Atr=03(Int.) MxPS= 512 Ivl=125us E: Ad=0a(O) Atr=03(Int.) MxPS= 512 Ivl=125us
Signed-off-by: Jingyang Wang wjy7717@126.com Signed-off-by: Luiz Augusto von Dentz luiz.von.dentz@intel.com Stable-dep-of: faa5fd605d20 ("Bluetooth: btusb: Add new VID/PID 0489/e111 for MT7925") Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/bluetooth/btusb.c | 3 +++ 1 file changed, 3 insertions(+)
diff --git a/drivers/bluetooth/btusb.c b/drivers/bluetooth/btusb.c index fe5e30662017..6b0d9d9f3004 100644 --- a/drivers/bluetooth/btusb.c +++ b/drivers/bluetooth/btusb.c @@ -658,6 +658,9 @@ static const struct usb_device_id quirks_table[] = { { USB_DEVICE(0x04ca, 0x3804), .driver_info = BTUSB_MEDIATEK | BTUSB_WIDEBAND_SPEECH | BTUSB_VALID_LE_STATES }, + { USB_DEVICE(0x35f5, 0x7922), .driver_info = BTUSB_MEDIATEK | + BTUSB_WIDEBAND_SPEECH | + BTUSB_VALID_LE_STATES },
/* Additional Realtek 8723AE Bluetooth devices */ { USB_DEVICE(0x0930, 0x021d), .driver_info = BTUSB_REALTEK },
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Ulrik Strid ulrik@strid.tech
[ Upstream commit 560ff4bc99070bfb493018b6b7045af269a008a4 ]
Add VID 13d3 & PID 3602 for MediaTek MT7925 USB Bluetooth chip.
The information in /sys/kernel/debug/usb/devices about the Bluetooth device is listed as the below.
T: Bus=07 Lev=01 Prnt=01 Port=10 Cnt=02 Dev#= 2 Spd=480 MxCh= 0 D: Ver= 2.10 Cls=ef(misc ) Sub=02 Prot=01 MxPS=64 #Cfgs= 1 P: Vendor=13d3 ProdID=3602 Rev= 1.00 S: Manufacturer=MediaTek Inc. S: Product=Wireless_Device S: SerialNumber=000000000 C:* #Ifs= 3 Cfg#= 1 Atr=e0 MxPwr=100mA A: FirstIf#= 0 IfCount= 3 Cls=e0(wlcon) Sub=01 Prot=01 I:* If#= 0 Alt= 0 #EPs= 3 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb E: Ad=81(I) Atr=03(Int.) MxPS= 16 Ivl=125us E: Ad=82(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=02(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms I:* If#= 1 Alt= 0 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb E: Ad=83(I) Atr=01(Isoc) MxPS= 0 Ivl=1ms E: Ad=03(O) Atr=01(Isoc) MxPS= 0 Ivl=1ms I: If#= 1 Alt= 1 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb E: Ad=83(I) Atr=01(Isoc) MxPS= 9 Ivl=1ms E: Ad=03(O) Atr=01(Isoc) MxPS= 9 Ivl=1ms I: If#= 1 Alt= 2 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb E: Ad=83(I) Atr=01(Isoc) MxPS= 17 Ivl=1ms E: Ad=03(O) Atr=01(Isoc) MxPS= 17 Ivl=1ms I: If#= 1 Alt= 3 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb E: Ad=83(I) Atr=01(Isoc) MxPS= 25 Ivl=1ms E: Ad=03(O) Atr=01(Isoc) MxPS= 25 Ivl=1ms I: If#= 1 Alt= 4 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb E: Ad=83(I) Atr=01(Isoc) MxPS= 33 Ivl=1ms E: Ad=03(O) Atr=01(Isoc) MxPS= 33 Ivl=1ms I: If#= 1 Alt= 5 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb E: Ad=83(I) Atr=01(Isoc) MxPS= 49 Ivl=1ms E: Ad=03(O) Atr=01(Isoc) MxPS= 49 Ivl=1ms I: If#= 1 Alt= 6 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb E: Ad=83(I) Atr=01(Isoc) MxPS= 63 Ivl=1ms E: Ad=03(O) Atr=01(Isoc) MxPS= 63 Ivl=1ms I:* If#= 2 Alt= 0 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=(none) E: Ad=8a(I) Atr=03(Int.) MxPS= 64 Ivl=125us E: Ad=0a(O) Atr=03(Int.) MxPS= 64 Ivl=125us I: If#= 2 Alt= 1 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=(none) E: Ad=8a(I) Atr=03(Int.) MxPS= 512 Ivl=125us E: Ad=0a(O) Atr=03(Int.) MxPS= 512 Ivl=125us
Signed-off-by: Ulrik Strid ulrik@strid.tech Signed-off-by: Deren Wu deren.wu@mediatek.com Signed-off-by: Luiz Augusto von Dentz luiz.von.dentz@intel.com Stable-dep-of: faa5fd605d20 ("Bluetooth: btusb: Add new VID/PID 0489/e111 for MT7925") Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/bluetooth/btusb.c | 5 +++++ 1 file changed, 5 insertions(+)
diff --git a/drivers/bluetooth/btusb.c b/drivers/bluetooth/btusb.c index 6b0d9d9f3004..fd57a02046ae 100644 --- a/drivers/bluetooth/btusb.c +++ b/drivers/bluetooth/btusb.c @@ -662,6 +662,11 @@ static const struct usb_device_id quirks_table[] = { BTUSB_WIDEBAND_SPEECH | BTUSB_VALID_LE_STATES },
+ /* Additional MediaTek MT7925 Bluetooth devices */ + { USB_DEVICE(0x13d3, 0x3602), .driver_info = BTUSB_MEDIATEK | + BTUSB_WIDEBAND_SPEECH | + BTUSB_VALID_LE_STATES }, + /* Additional Realtek 8723AE Bluetooth devices */ { USB_DEVICE(0x0930, 0x021d), .driver_info = BTUSB_REALTEK }, { USB_DEVICE(0x13d3, 0x3394), .driver_info = BTUSB_REALTEK },
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jiande Lu jiande.lu@mediatek.com
[ Upstream commit 129d329286f624b0ccd66e904a2c27eb4d5196e5 ]
Add HW IDs for wireless module specific to Acer/ASUS notebook models to ensure proper recognition and functionality. These HW IDs are extracted from Windows driver inf file. Note some HW IDs without official drivers, still in testing phase. Thus, we update module HW ID and test ensure consistent boot success.
Signed-off-by: Jiande Lu jiande.lu@mediatek.com Signed-off-by: Luiz Augusto von Dentz luiz.von.dentz@intel.com Stable-dep-of: faa5fd605d20 ("Bluetooth: btusb: Add new VID/PID 0489/e111 for MT7925") Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/bluetooth/btusb.c | 24 ++++++++++++++++++++++++ 1 file changed, 24 insertions(+)
diff --git a/drivers/bluetooth/btusb.c b/drivers/bluetooth/btusb.c index fd57a02046ae..dc0bba7f4028 100644 --- a/drivers/bluetooth/btusb.c +++ b/drivers/bluetooth/btusb.c @@ -620,6 +620,9 @@ static const struct usb_device_id quirks_table[] = { { USB_DEVICE(0x0e8d, 0x0608), .driver_info = BTUSB_MEDIATEK | BTUSB_WIDEBAND_SPEECH | BTUSB_VALID_LE_STATES }, + { USB_DEVICE(0x13d3, 0x3606), .driver_info = BTUSB_MEDIATEK | + BTUSB_WIDEBAND_SPEECH | + BTUSB_VALID_LE_STATES },
/* MediaTek MT7922A Bluetooth devices */ { USB_DEVICE(0x0489, 0xe0d8), .driver_info = BTUSB_MEDIATEK | @@ -661,11 +664,32 @@ static const struct usb_device_id quirks_table[] = { { USB_DEVICE(0x35f5, 0x7922), .driver_info = BTUSB_MEDIATEK | BTUSB_WIDEBAND_SPEECH | BTUSB_VALID_LE_STATES }, + { USB_DEVICE(0x13d3, 0x3614), .driver_info = BTUSB_MEDIATEK | + BTUSB_WIDEBAND_SPEECH | + BTUSB_VALID_LE_STATES }, + { USB_DEVICE(0x13d3, 0x3615), .driver_info = BTUSB_MEDIATEK | + BTUSB_WIDEBAND_SPEECH | + BTUSB_VALID_LE_STATES }, + { USB_DEVICE(0x04ca, 0x38e4), .driver_info = BTUSB_MEDIATEK | + BTUSB_WIDEBAND_SPEECH | + BTUSB_VALID_LE_STATES }, + { USB_DEVICE(0x13d3, 0x3605), .driver_info = BTUSB_MEDIATEK | + BTUSB_WIDEBAND_SPEECH | + BTUSB_VALID_LE_STATES }, + { USB_DEVICE(0x13d3, 0x3607), .driver_info = BTUSB_MEDIATEK | + BTUSB_WIDEBAND_SPEECH | + BTUSB_VALID_LE_STATES },
/* Additional MediaTek MT7925 Bluetooth devices */ + { USB_DEVICE(0x0489, 0xe113), .driver_info = BTUSB_MEDIATEK | + BTUSB_WIDEBAND_SPEECH | + BTUSB_VALID_LE_STATES }, { USB_DEVICE(0x13d3, 0x3602), .driver_info = BTUSB_MEDIATEK | BTUSB_WIDEBAND_SPEECH | BTUSB_VALID_LE_STATES }, + { USB_DEVICE(0x13d3, 0x3603), .driver_info = BTUSB_MEDIATEK | + BTUSB_WIDEBAND_SPEECH | + BTUSB_VALID_LE_STATES },
/* Additional Realtek 8723AE Bluetooth devices */ { USB_DEVICE(0x0930, 0x021d), .driver_info = BTUSB_REALTEK },
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Hao Qin hao.qin@mediatek.com
[ Upstream commit faa5fd605d2081b6c9fa2355b59582d4ccd24b16 ]
Add VID 0489 & PID e111 for MediaTek MT7925 USB Bluetooth chip.
The information in /sys/kernel/debug/usb/devices about the Bluetooth device is listed as the below.
T: Bus=02 Lev=02 Prnt=02 Port=04 Cnt=02 Dev#= 4 Spd=480 MxCh= 0 D: Ver= 2.10 Cls=ef(misc ) Sub=02 Prot=01 MxPS=64 #Cfgs= 1 P: Vendor=0489 ProdID=e111 Rev= 1.00 S: Manufacturer=MediaTek Inc. S: Product=Wireless_Device S: SerialNumber=000000000 C:* #Ifs= 3 Cfg#= 1 Atr=e0 MxPwr=100mA A: FirstIf#= 0 IfCount= 3 Cls=e0(wlcon) Sub=01 Prot=01 I:* If#= 0 Alt= 0 #EPs= 3 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb E: Ad=81(I) Atr=03(Int.) MxPS= 16 Ivl=125us E: Ad=82(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=02(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms I:* If#= 1 Alt= 0 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb E: Ad=83(I) Atr=01(Isoc) MxPS= 0 Ivl=1ms E: Ad=03(O) Atr=01(Isoc) MxPS= 0 Ivl=1ms I: If#= 1 Alt= 1 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb E: Ad=83(I) Atr=01(Isoc) MxPS= 9 Ivl=1ms E: Ad=03(O) Atr=01(Isoc) MxPS= 9 Ivl=1ms I: If#= 1 Alt= 2 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb E: Ad=83(I) Atr=01(Isoc) MxPS= 17 Ivl=1ms E: Ad=03(O) Atr=01(Isoc) MxPS= 17 Ivl=1ms I: If#= 1 Alt= 3 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb E: Ad=83(I) Atr=01(Isoc) MxPS= 25 Ivl=1ms E: Ad=03(O) Atr=01(Isoc) MxPS= 25 Ivl=1ms I: If#= 1 Alt= 4 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb E: Ad=83(I) Atr=01(Isoc) MxPS= 33 Ivl=1ms E: Ad=03(O) Atr=01(Isoc) MxPS= 33 Ivl=1ms I: If#= 1 Alt= 5 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb E: Ad=83(I) Atr=01(Isoc) MxPS= 49 Ivl=1ms E: Ad=03(O) Atr=01(Isoc) MxPS= 49 Ivl=1ms I: If#= 1 Alt= 6 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb E: Ad=83(I) Atr=01(Isoc) MxPS= 63 Ivl=1ms E: Ad=03(O) Atr=01(Isoc) MxPS= 63 Ivl=1ms I:* If#= 2 Alt= 0 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=(none) E: Ad=8a(I) Atr=03(Int.) MxPS= 64 Ivl=125us E: Ad=0a(O) Atr=03(Int.) MxPS= 64 Ivl=125us I: If#= 2 Alt= 1 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=(none) E: Ad=8a(I) Atr=03(Int.) MxPS= 512 Ivl=125us E: Ad=0a(O) Atr=03(Int.) MxPS= 512 Ivl=125us
Signed-off-by: Hao Qin hao.qin@mediatek.com Signed-off-by: Luiz Augusto von Dentz luiz.von.dentz@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/bluetooth/btusb.c | 2 ++ 1 file changed, 2 insertions(+)
diff --git a/drivers/bluetooth/btusb.c b/drivers/bluetooth/btusb.c index dc0bba7f4028..67577933835f 100644 --- a/drivers/bluetooth/btusb.c +++ b/drivers/bluetooth/btusb.c @@ -681,6 +681,8 @@ static const struct usb_device_id quirks_table[] = { BTUSB_VALID_LE_STATES },
/* Additional MediaTek MT7925 Bluetooth devices */ + { USB_DEVICE(0x0489, 0xe111), .driver_info = BTUSB_MEDIATEK | + BTUSB_WIDEBAND_SPEECH }, { USB_DEVICE(0x0489, 0xe113), .driver_info = BTUSB_MEDIATEK | BTUSB_WIDEBAND_SPEECH | BTUSB_VALID_LE_STATES },
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Yihang Li liyihang9@huawei.com
[ Upstream commit 2ff07b5c6fe9173e7a7de3b23f300d71ad4d8fde ]
Currently, register information dump is performed via workqueue, regardless of the trigger mode (automatic or manual). There is a delay in dumping register through workqueue, the exact register information at trigger time cannot be obtained.
Call register snapshot directly instead of through a workqueue.
Signed-off-by: Yihang Li liyihang9@huawei.com Signed-off-by: Xiang Chen chenxiang66@hisilicon.com Link: https://lore.kernel.org/r/1694571327-78697-3-git-send-email-chenxiang66@hisi... Signed-off-by: Martin K. Petersen martin.petersen@oracle.com Stable-dep-of: 9f564f15f884 ("scsi: hisi_sas: Create all dump files during debugfs initialization") Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/scsi/hisi_sas/hisi_sas.h | 1 - drivers/scsi/hisi_sas/hisi_sas_main.c | 7 +++++-- drivers/scsi/hisi_sas/hisi_sas_v3_hw.c | 14 +++----------- 3 files changed, 8 insertions(+), 14 deletions(-)
diff --git a/drivers/scsi/hisi_sas/hisi_sas.h b/drivers/scsi/hisi_sas/hisi_sas.h index 9e73e9cbbcfc..3d511c44c02d 100644 --- a/drivers/scsi/hisi_sas/hisi_sas.h +++ b/drivers/scsi/hisi_sas/hisi_sas.h @@ -451,7 +451,6 @@ struct hisi_hba { const struct hisi_sas_hw *hw; /* Low level hw interface */ unsigned long sata_dev_bitmap[BITS_TO_LONGS(HISI_SAS_MAX_DEVICES)]; struct work_struct rst_work; - struct work_struct debugfs_work; u32 phy_state; u32 intr_coal_ticks; /* Time of interrupt coalesce in us */ u32 intr_coal_count; /* Interrupt count to coalesce */ diff --git a/drivers/scsi/hisi_sas/hisi_sas_main.c b/drivers/scsi/hisi_sas/hisi_sas_main.c index db9ae206974c..5fdba7b39a1b 100644 --- a/drivers/scsi/hisi_sas/hisi_sas_main.c +++ b/drivers/scsi/hisi_sas/hisi_sas_main.c @@ -1967,8 +1967,11 @@ static bool hisi_sas_internal_abort_timeout(struct sas_task *task, struct hisi_hba *hisi_hba = dev_to_hisi_hba(device); struct hisi_sas_internal_abort_data *timeout = data;
- if (hisi_sas_debugfs_enable && hisi_hba->debugfs_itct[0].itct) - queue_work(hisi_hba->wq, &hisi_hba->debugfs_work); + if (hisi_sas_debugfs_enable && hisi_hba->debugfs_itct[0].itct) { + down(&hisi_hba->sem); + hisi_hba->hw->debugfs_snapshot_regs(hisi_hba); + up(&hisi_hba->sem); + }
if (task->task_state_flags & SAS_TASK_STATE_DONE) { pr_err("Internal abort: timeout %016llx\n", diff --git a/drivers/scsi/hisi_sas/hisi_sas_v3_hw.c b/drivers/scsi/hisi_sas/hisi_sas_v3_hw.c index 4054659d48f7..10f048b5a489 100644 --- a/drivers/scsi/hisi_sas/hisi_sas_v3_hw.c +++ b/drivers/scsi/hisi_sas/hisi_sas_v3_hw.c @@ -558,7 +558,6 @@ static int experimental_iopoll_q_cnt; module_param(experimental_iopoll_q_cnt, int, 0444); MODULE_PARM_DESC(experimental_iopoll_q_cnt, "number of queues to be used as poll mode, def=0");
-static void debugfs_work_handler_v3_hw(struct work_struct *work); static void debugfs_snapshot_regs_v3_hw(struct hisi_hba *hisi_hba);
static u32 hisi_sas_read32(struct hisi_hba *hisi_hba, u32 off) @@ -3397,7 +3396,6 @@ hisi_sas_shost_alloc_pci(struct pci_dev *pdev) hisi_hba = shost_priv(shost);
INIT_WORK(&hisi_hba->rst_work, hisi_sas_rst_work_handler); - INIT_WORK(&hisi_hba->debugfs_work, debugfs_work_handler_v3_hw); hisi_hba->hw = &hisi_sas_v3_hw; hisi_hba->pci_dev = pdev; hisi_hba->dev = dev; @@ -3919,7 +3917,9 @@ static ssize_t debugfs_trigger_dump_v3_hw_write(struct file *file, if (buf[0] != '1') return -EFAULT;
- queue_work(hisi_hba->wq, &hisi_hba->debugfs_work); + down(&hisi_hba->sem); + debugfs_snapshot_regs_v3_hw(hisi_hba); + up(&hisi_hba->sem);
return count; } @@ -4670,14 +4670,6 @@ static void debugfs_fifo_init_v3_hw(struct hisi_hba *hisi_hba) } }
-static void debugfs_work_handler_v3_hw(struct work_struct *work) -{ - struct hisi_hba *hisi_hba = - container_of(work, struct hisi_hba, debugfs_work); - - debugfs_snapshot_regs_v3_hw(hisi_hba); -} - static void debugfs_release_v3_hw(struct hisi_hba *hisi_hba, int dump_index) { struct device *dev = hisi_hba->dev;
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Yihang Li liyihang9@huawei.com
[ Upstream commit 63f0733d07ce60252e885602b39571ade0441015 ]
Currently, if CONFIG_SCSI_HISI_SAS_DEBUGFS_DEFAULT_ENABLE is enabled, the memory space used by DFX is allocated during device initialization, which occupies a large number of memory resources. The memory usage before and after the driver is loaded is as follows:
Memory usage before the driver is loaded: $ free -m total used free shared buff/cache available Mem: 867352 2578 864037 11 735 861681 Swap: 4095 0 4095
Memory usage after the driver which include 4 HBAs is loaded: $ insmod hisi_sas_v3_hw.ko $ free -m total used free shared buff/cache available Mem: 867352 4760 861848 11 743 859495 Swap: 4095 0 4095
The driver with 4 HBAs connected will allocate about 110 MB of memory without enabling debugfs.
Therefore, to avoid wasting memory resources, DFX memory is allocated during dump triggering. The dump may fail due to memory allocation failure. After this change, each dump costs about 10 MB of memory, and each dump lasts about 100 ms.
Signed-off-by: Yihang Li liyihang9@huawei.com Signed-off-by: Xiang Chen chenxiang66@hisilicon.com Link: https://lore.kernel.org/r/1694571327-78697-4-git-send-email-chenxiang66@hisi... Signed-off-by: Martin K. Petersen martin.petersen@oracle.com Stable-dep-of: 9f564f15f884 ("scsi: hisi_sas: Create all dump files during debugfs initialization") Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/scsi/hisi_sas/hisi_sas.h | 2 +- drivers/scsi/hisi_sas/hisi_sas_v3_hw.c | 93 +++++++++++++------------- 2 files changed, 46 insertions(+), 49 deletions(-)
diff --git a/drivers/scsi/hisi_sas/hisi_sas.h b/drivers/scsi/hisi_sas/hisi_sas.h index 3d511c44c02d..1e4550156b73 100644 --- a/drivers/scsi/hisi_sas/hisi_sas.h +++ b/drivers/scsi/hisi_sas/hisi_sas.h @@ -343,7 +343,7 @@ struct hisi_sas_hw { u8 reg_index, u8 reg_count, u8 *write_data); void (*wait_cmds_complete_timeout)(struct hisi_hba *hisi_hba, int delay_ms, int timeout_ms); - void (*debugfs_snapshot_regs)(struct hisi_hba *hisi_hba); + int (*debugfs_snapshot_regs)(struct hisi_hba *hisi_hba); int complete_hdr_size; const struct scsi_host_template *sht; }; diff --git a/drivers/scsi/hisi_sas/hisi_sas_v3_hw.c b/drivers/scsi/hisi_sas/hisi_sas_v3_hw.c index 10f048b5a489..cea548655629 100644 --- a/drivers/scsi/hisi_sas/hisi_sas_v3_hw.c +++ b/drivers/scsi/hisi_sas/hisi_sas_v3_hw.c @@ -558,7 +558,7 @@ static int experimental_iopoll_q_cnt; module_param(experimental_iopoll_q_cnt, int, 0444); MODULE_PARM_DESC(experimental_iopoll_q_cnt, "number of queues to be used as poll mode, def=0");
-static void debugfs_snapshot_regs_v3_hw(struct hisi_hba *hisi_hba); +static int debugfs_snapshot_regs_v3_hw(struct hisi_hba *hisi_hba);
static u32 hisi_sas_read32(struct hisi_hba *hisi_hba, u32 off) { @@ -3867,37 +3867,6 @@ static void debugfs_create_files_v3_hw(struct hisi_hba *hisi_hba) &debugfs_ras_v3_hw_fops); }
-static void debugfs_snapshot_regs_v3_hw(struct hisi_hba *hisi_hba) -{ - int debugfs_dump_index = hisi_hba->debugfs_dump_index; - struct device *dev = hisi_hba->dev; - u64 timestamp = local_clock(); - - if (debugfs_dump_index >= hisi_sas_debugfs_dump_count) { - dev_warn(dev, "dump count exceeded!\n"); - return; - } - - do_div(timestamp, NSEC_PER_MSEC); - hisi_hba->debugfs_timestamp[debugfs_dump_index] = timestamp; - - debugfs_snapshot_prepare_v3_hw(hisi_hba); - - debugfs_snapshot_global_reg_v3_hw(hisi_hba); - debugfs_snapshot_port_reg_v3_hw(hisi_hba); - debugfs_snapshot_axi_reg_v3_hw(hisi_hba); - debugfs_snapshot_ras_reg_v3_hw(hisi_hba); - debugfs_snapshot_cq_reg_v3_hw(hisi_hba); - debugfs_snapshot_dq_reg_v3_hw(hisi_hba); - debugfs_snapshot_itct_reg_v3_hw(hisi_hba); - debugfs_snapshot_iost_reg_v3_hw(hisi_hba); - - debugfs_create_files_v3_hw(hisi_hba); - - debugfs_snapshot_restore_v3_hw(hisi_hba); - hisi_hba->debugfs_dump_index++; -} - static ssize_t debugfs_trigger_dump_v3_hw_write(struct file *file, const char __user *user_buf, size_t count, loff_t *ppos) @@ -3905,9 +3874,6 @@ static ssize_t debugfs_trigger_dump_v3_hw_write(struct file *file, struct hisi_hba *hisi_hba = file->f_inode->i_private; char buf[8];
- if (hisi_hba->debugfs_dump_index >= hisi_sas_debugfs_dump_count) - return -EFAULT; - if (count > 8) return -EFAULT;
@@ -3918,7 +3884,10 @@ static ssize_t debugfs_trigger_dump_v3_hw_write(struct file *file, return -EFAULT;
down(&hisi_hba->sem); - debugfs_snapshot_regs_v3_hw(hisi_hba); + if (debugfs_snapshot_regs_v3_hw(hisi_hba)) { + up(&hisi_hba->sem); + return -EFAULT; + } up(&hisi_hba->sem);
return count; @@ -4704,7 +4673,7 @@ static int debugfs_alloc_v3_hw(struct hisi_hba *hisi_hba, int dump_index) { const struct hisi_sas_hw *hw = hisi_hba->hw; struct device *dev = hisi_hba->dev; - int p, c, d, r, i; + int p, c, d, r; size_t sz;
for (r = 0; r < DEBUGFS_REGS_NUM; r++) { @@ -4784,11 +4753,48 @@ static int debugfs_alloc_v3_hw(struct hisi_hba *hisi_hba, int dump_index)
return 0; fail: - for (i = 0; i < hisi_sas_debugfs_dump_count; i++) - debugfs_release_v3_hw(hisi_hba, i); + debugfs_release_v3_hw(hisi_hba, dump_index); return -ENOMEM; }
+static int debugfs_snapshot_regs_v3_hw(struct hisi_hba *hisi_hba) +{ + int debugfs_dump_index = hisi_hba->debugfs_dump_index; + struct device *dev = hisi_hba->dev; + u64 timestamp = local_clock(); + + if (debugfs_dump_index >= hisi_sas_debugfs_dump_count) { + dev_warn(dev, "dump count exceeded!\n"); + return -EINVAL; + } + + if (debugfs_alloc_v3_hw(hisi_hba, debugfs_dump_index)) { + dev_warn(dev, "failed to alloc memory\n"); + return -ENOMEM; + } + + do_div(timestamp, NSEC_PER_MSEC); + hisi_hba->debugfs_timestamp[debugfs_dump_index] = timestamp; + + debugfs_snapshot_prepare_v3_hw(hisi_hba); + + debugfs_snapshot_global_reg_v3_hw(hisi_hba); + debugfs_snapshot_port_reg_v3_hw(hisi_hba); + debugfs_snapshot_axi_reg_v3_hw(hisi_hba); + debugfs_snapshot_ras_reg_v3_hw(hisi_hba); + debugfs_snapshot_cq_reg_v3_hw(hisi_hba); + debugfs_snapshot_dq_reg_v3_hw(hisi_hba); + debugfs_snapshot_itct_reg_v3_hw(hisi_hba); + debugfs_snapshot_iost_reg_v3_hw(hisi_hba); + + debugfs_create_files_v3_hw(hisi_hba); + + debugfs_snapshot_restore_v3_hw(hisi_hba); + hisi_hba->debugfs_dump_index++; + + return 0; +} + static void debugfs_phy_down_cnt_init_v3_hw(struct hisi_hba *hisi_hba) { struct dentry *dir = debugfs_create_dir("phy_down_cnt", @@ -4875,7 +4881,6 @@ static void debugfs_exit_v3_hw(struct hisi_hba *hisi_hba) static void debugfs_init_v3_hw(struct hisi_hba *hisi_hba) { struct device *dev = hisi_hba->dev; - int i;
hisi_hba->debugfs_dir = debugfs_create_dir(dev_name(dev), hisi_sas_debugfs_dir); @@ -4892,14 +4897,6 @@ static void debugfs_init_v3_hw(struct hisi_hba *hisi_hba)
debugfs_phy_down_cnt_init_v3_hw(hisi_hba); debugfs_fifo_init_v3_hw(hisi_hba); - - for (i = 0; i < hisi_sas_debugfs_dump_count; i++) { - if (debugfs_alloc_v3_hw(hisi_hba, i)) { - debugfs_exit_v3_hw(hisi_hba); - dev_dbg(dev, "failed to init debugfs!\n"); - break; - } - } }
static int
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Yihang Li liyihang9@huawei.com
[ Upstream commit 9f564f15f88490b484e02442dc4c4b11640ea172 ]
For the current debugfs of hisi_sas, after user triggers dump, the driver allocate memory space to save the register information and create debugfs files to display the saved information. In this process, the debugfs files created after each dump.
Therefore, when the dump is triggered while the driver is unbind, the following hang occurs:
[67840.853907] Unable to handle kernel NULL pointer dereference at virtual address 00000000000000a0 [67840.862947] Mem abort info: [67840.865855] ESR = 0x0000000096000004 [67840.869713] EC = 0x25: DABT (current EL), IL = 32 bits [67840.875125] SET = 0, FnV = 0 [67840.878291] EA = 0, S1PTW = 0 [67840.881545] FSC = 0x04: level 0 translation fault [67840.886528] Data abort info: [67840.889524] ISV = 0, ISS = 0x00000004, ISS2 = 0x00000000 [67840.895117] CM = 0, WnR = 0, TnD = 0, TagAccess = 0 [67840.900284] GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0 [67840.905709] user pgtable: 4k pages, 48-bit VAs, pgdp=0000002803a1f000 [67840.912263] [00000000000000a0] pgd=0000000000000000, p4d=0000000000000000 [67840.919177] Internal error: Oops: 0000000096000004 [#1] PREEMPT SMP [67840.996435] pstate: 80400009 (Nzcv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--) [67841.003628] pc : down_write+0x30/0x98 [67841.007546] lr : start_creating.part.0+0x60/0x198 [67841.012495] sp : ffff8000b979ba20 [67841.016046] x29: ffff8000b979ba20 x28: 0000000000000010 x27: 0000000000024b40 [67841.023412] x26: 0000000000000012 x25: ffff20202b355ae8 x24: ffff20202b35a8c8 [67841.030779] x23: ffffa36877928208 x22: ffffa368b4972240 x21: ffff8000b979bb18 [67841.038147] x20: ffff00281dc1e3c0 x19: fffffffffffffffe x18: 0000000000000020 [67841.045515] x17: 0000000000000000 x16: ffffa368b128a530 x15: ffffffffffffffff [67841.052888] x14: ffff8000b979bc18 x13: ffffffffffffffff x12: ffff8000b979bb18 [67841.060263] x11: 0000000000000000 x10: 0000000000000000 x9 : ffffa368b1289b18 [67841.067640] x8 : 0000000000000012 x7 : 0000000000000000 x6 : 00000000000003a9 [67841.075014] x5 : 0000000000000000 x4 : ffff002818c5cb00 x3 : 0000000000000001 [67841.082388] x2 : 0000000000000000 x1 : ffff002818c5cb00 x0 : 00000000000000a0 [67841.089759] Call trace: [67841.092456] down_write+0x30/0x98 [67841.096017] start_creating.part.0+0x60/0x198 [67841.100613] debugfs_create_dir+0x48/0x1f8 [67841.104950] debugfs_create_files_v3_hw+0x88/0x348 [hisi_sas_v3_hw] [67841.111447] debugfs_snapshot_regs_v3_hw+0x708/0x798 [hisi_sas_v3_hw] [67841.118111] debugfs_trigger_dump_v3_hw_write+0x9c/0x120 [hisi_sas_v3_hw] [67841.125115] full_proxy_write+0x68/0xc8 [67841.129175] vfs_write+0xd8/0x3f0 [67841.132708] ksys_write+0x70/0x108 [67841.136317] __arm64_sys_write+0x24/0x38 [67841.140440] invoke_syscall+0x50/0x128 [67841.144385] el0_svc_common.constprop.0+0xc8/0xf0 [67841.149273] do_el0_svc+0x24/0x38 [67841.152773] el0_svc+0x38/0xd8 [67841.156009] el0t_64_sync_handler+0xc0/0xc8 [67841.160361] el0t_64_sync+0x1a4/0x1a8 [67841.164189] Code: b9000882 d2800002 d2800023 f9800011 (c85ffc05) [67841.170443] ---[ end trace 0000000000000000 ]---
To fix this issue, create all directories and files during debugfs initialization. In this way, the driver only needs to allocate memory space to save information each time the user triggers dumping.
Signed-off-by: Yihang Li liyihang9@huawei.com Link: https://lore.kernel.org/r/20241008021822.2617339-13-liyihang9@huawei.com Reviewed-by: Xingui Yang yangxingui@huawei.com Signed-off-by: Martin K. Petersen martin.petersen@oracle.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/scsi/hisi_sas/hisi_sas_v3_hw.c | 99 ++++++++++++++++++++------ 1 file changed, 77 insertions(+), 22 deletions(-)
diff --git a/drivers/scsi/hisi_sas/hisi_sas_v3_hw.c b/drivers/scsi/hisi_sas/hisi_sas_v3_hw.c index cea548655629..ff5f86867dbf 100644 --- a/drivers/scsi/hisi_sas/hisi_sas_v3_hw.c +++ b/drivers/scsi/hisi_sas/hisi_sas_v3_hw.c @@ -3560,6 +3560,11 @@ debugfs_to_reg_name_v3_hw(int off, int base_off, return NULL; }
+static bool debugfs_dump_is_generated_v3_hw(void *p) +{ + return p ? true : false; +} + static void debugfs_print_reg_v3_hw(u32 *regs_val, struct seq_file *s, const struct hisi_sas_debugfs_reg *reg) { @@ -3585,6 +3590,9 @@ static int debugfs_global_v3_hw_show(struct seq_file *s, void *p) { struct hisi_sas_debugfs_regs *global = s->private;
+ if (!debugfs_dump_is_generated_v3_hw(global->data)) + return -EPERM; + debugfs_print_reg_v3_hw(global->data, s, &debugfs_global_reg);
@@ -3596,6 +3604,9 @@ static int debugfs_axi_v3_hw_show(struct seq_file *s, void *p) { struct hisi_sas_debugfs_regs *axi = s->private;
+ if (!debugfs_dump_is_generated_v3_hw(axi->data)) + return -EPERM; + debugfs_print_reg_v3_hw(axi->data, s, &debugfs_axi_reg);
@@ -3607,6 +3618,9 @@ static int debugfs_ras_v3_hw_show(struct seq_file *s, void *p) { struct hisi_sas_debugfs_regs *ras = s->private;
+ if (!debugfs_dump_is_generated_v3_hw(ras->data)) + return -EPERM; + debugfs_print_reg_v3_hw(ras->data, s, &debugfs_ras_reg);
@@ -3619,6 +3633,9 @@ static int debugfs_port_v3_hw_show(struct seq_file *s, void *p) struct hisi_sas_debugfs_port *port = s->private; const struct hisi_sas_debugfs_reg *reg_port = &debugfs_port_reg;
+ if (!debugfs_dump_is_generated_v3_hw(port->data)) + return -EPERM; + debugfs_print_reg_v3_hw(port->data, s, reg_port);
return 0; @@ -3674,6 +3691,9 @@ static int debugfs_cq_v3_hw_show(struct seq_file *s, void *p) struct hisi_sas_debugfs_cq *debugfs_cq = s->private; int slot;
+ if (!debugfs_dump_is_generated_v3_hw(debugfs_cq->complete_hdr)) + return -EPERM; + for (slot = 0; slot < HISI_SAS_QUEUE_SLOTS; slot++) debugfs_cq_show_slot_v3_hw(s, slot, debugfs_cq);
@@ -3695,8 +3715,12 @@ static void debugfs_dq_show_slot_v3_hw(struct seq_file *s, int slot,
static int debugfs_dq_v3_hw_show(struct seq_file *s, void *p) { + struct hisi_sas_debugfs_dq *debugfs_dq = s->private; int slot;
+ if (!debugfs_dump_is_generated_v3_hw(debugfs_dq->hdr)) + return -EPERM; + for (slot = 0; slot < HISI_SAS_QUEUE_SLOTS; slot++) debugfs_dq_show_slot_v3_hw(s, slot, s->private);
@@ -3710,6 +3734,9 @@ static int debugfs_iost_v3_hw_show(struct seq_file *s, void *p) struct hisi_sas_iost *iost = debugfs_iost->iost; int i, max_command_entries = HISI_SAS_MAX_COMMANDS;
+ if (!debugfs_dump_is_generated_v3_hw(iost)) + return -EPERM; + for (i = 0; i < max_command_entries; i++, iost++) { __le64 *data = &iost->qw0;
@@ -3729,6 +3756,9 @@ static int debugfs_iost_cache_v3_hw_show(struct seq_file *s, void *p) int i, tab_idx; __le64 *iost;
+ if (!debugfs_dump_is_generated_v3_hw(iost_cache)) + return -EPERM; + for (i = 0; i < HISI_SAS_IOST_ITCT_CACHE_NUM; i++, iost_cache++) { /* * Data struct of IOST cache: @@ -3752,6 +3782,9 @@ static int debugfs_itct_v3_hw_show(struct seq_file *s, void *p) struct hisi_sas_debugfs_itct *debugfs_itct = s->private; struct hisi_sas_itct *itct = debugfs_itct->itct;
+ if (!debugfs_dump_is_generated_v3_hw(itct)) + return -EPERM; + for (i = 0; i < HISI_SAS_MAX_ITCT_ENTRIES; i++, itct++) { __le64 *data = &itct->qw0;
@@ -3771,6 +3804,9 @@ static int debugfs_itct_cache_v3_hw_show(struct seq_file *s, void *p) int i, tab_idx; __le64 *itct;
+ if (!debugfs_dump_is_generated_v3_hw(itct_cache)) + return -EPERM; + for (i = 0; i < HISI_SAS_IOST_ITCT_CACHE_NUM; i++, itct_cache++) { /* * Data struct of ITCT cache: @@ -3788,10 +3824,9 @@ static int debugfs_itct_cache_v3_hw_show(struct seq_file *s, void *p) } DEFINE_SHOW_ATTRIBUTE(debugfs_itct_cache_v3_hw);
-static void debugfs_create_files_v3_hw(struct hisi_hba *hisi_hba) +static void debugfs_create_files_v3_hw(struct hisi_hba *hisi_hba, int index) { u64 *debugfs_timestamp; - int dump_index = hisi_hba->debugfs_dump_index; struct dentry *dump_dentry; struct dentry *dentry; char name[256]; @@ -3799,17 +3834,17 @@ static void debugfs_create_files_v3_hw(struct hisi_hba *hisi_hba) int c; int d;
- snprintf(name, 256, "%d", dump_index); + snprintf(name, 256, "%d", index);
dump_dentry = debugfs_create_dir(name, hisi_hba->debugfs_dump_dentry);
- debugfs_timestamp = &hisi_hba->debugfs_timestamp[dump_index]; + debugfs_timestamp = &hisi_hba->debugfs_timestamp[index];
debugfs_create_u64("timestamp", 0400, dump_dentry, debugfs_timestamp);
debugfs_create_file("global", 0400, dump_dentry, - &hisi_hba->debugfs_regs[dump_index][DEBUGFS_GLOBAL], + &hisi_hba->debugfs_regs[index][DEBUGFS_GLOBAL], &debugfs_global_v3_hw_fops);
/* Create port dir and files */ @@ -3818,7 +3853,7 @@ static void debugfs_create_files_v3_hw(struct hisi_hba *hisi_hba) snprintf(name, 256, "%d", p);
debugfs_create_file(name, 0400, dentry, - &hisi_hba->debugfs_port_reg[dump_index][p], + &hisi_hba->debugfs_port_reg[index][p], &debugfs_port_v3_hw_fops); }
@@ -3828,7 +3863,7 @@ static void debugfs_create_files_v3_hw(struct hisi_hba *hisi_hba) snprintf(name, 256, "%d", c);
debugfs_create_file(name, 0400, dentry, - &hisi_hba->debugfs_cq[dump_index][c], + &hisi_hba->debugfs_cq[index][c], &debugfs_cq_v3_hw_fops); }
@@ -3838,32 +3873,32 @@ static void debugfs_create_files_v3_hw(struct hisi_hba *hisi_hba) snprintf(name, 256, "%d", d);
debugfs_create_file(name, 0400, dentry, - &hisi_hba->debugfs_dq[dump_index][d], + &hisi_hba->debugfs_dq[index][d], &debugfs_dq_v3_hw_fops); }
debugfs_create_file("iost", 0400, dump_dentry, - &hisi_hba->debugfs_iost[dump_index], + &hisi_hba->debugfs_iost[index], &debugfs_iost_v3_hw_fops);
debugfs_create_file("iost_cache", 0400, dump_dentry, - &hisi_hba->debugfs_iost_cache[dump_index], + &hisi_hba->debugfs_iost_cache[index], &debugfs_iost_cache_v3_hw_fops);
debugfs_create_file("itct", 0400, dump_dentry, - &hisi_hba->debugfs_itct[dump_index], + &hisi_hba->debugfs_itct[index], &debugfs_itct_v3_hw_fops);
debugfs_create_file("itct_cache", 0400, dump_dentry, - &hisi_hba->debugfs_itct_cache[dump_index], + &hisi_hba->debugfs_itct_cache[index], &debugfs_itct_cache_v3_hw_fops);
debugfs_create_file("axi", 0400, dump_dentry, - &hisi_hba->debugfs_regs[dump_index][DEBUGFS_AXI], + &hisi_hba->debugfs_regs[index][DEBUGFS_AXI], &debugfs_axi_v3_hw_fops);
debugfs_create_file("ras", 0400, dump_dentry, - &hisi_hba->debugfs_regs[dump_index][DEBUGFS_RAS], + &hisi_hba->debugfs_regs[index][DEBUGFS_RAS], &debugfs_ras_v3_hw_fops); }
@@ -4645,22 +4680,34 @@ static void debugfs_release_v3_hw(struct hisi_hba *hisi_hba, int dump_index) int i;
devm_kfree(dev, hisi_hba->debugfs_iost_cache[dump_index].cache); + hisi_hba->debugfs_iost_cache[dump_index].cache = NULL; devm_kfree(dev, hisi_hba->debugfs_itct_cache[dump_index].cache); + hisi_hba->debugfs_itct_cache[dump_index].cache = NULL; devm_kfree(dev, hisi_hba->debugfs_iost[dump_index].iost); + hisi_hba->debugfs_iost[dump_index].iost = NULL; devm_kfree(dev, hisi_hba->debugfs_itct[dump_index].itct); + hisi_hba->debugfs_itct[dump_index].itct = NULL;
- for (i = 0; i < hisi_hba->queue_count; i++) + for (i = 0; i < hisi_hba->queue_count; i++) { devm_kfree(dev, hisi_hba->debugfs_dq[dump_index][i].hdr); + hisi_hba->debugfs_dq[dump_index][i].hdr = NULL; + }
- for (i = 0; i < hisi_hba->queue_count; i++) + for (i = 0; i < hisi_hba->queue_count; i++) { devm_kfree(dev, hisi_hba->debugfs_cq[dump_index][i].complete_hdr); + hisi_hba->debugfs_cq[dump_index][i].complete_hdr = NULL; + }
- for (i = 0; i < DEBUGFS_REGS_NUM; i++) + for (i = 0; i < DEBUGFS_REGS_NUM; i++) { devm_kfree(dev, hisi_hba->debugfs_regs[dump_index][i].data); + hisi_hba->debugfs_regs[dump_index][i].data = NULL; + }
- for (i = 0; i < hisi_hba->n_phy; i++) + for (i = 0; i < hisi_hba->n_phy; i++) { devm_kfree(dev, hisi_hba->debugfs_port_reg[dump_index][i].data); + hisi_hba->debugfs_port_reg[dump_index][i].data = NULL; + } }
static const struct hisi_sas_debugfs_reg *debugfs_reg_array_v3_hw[DEBUGFS_REGS_NUM] = { @@ -4787,8 +4834,6 @@ static int debugfs_snapshot_regs_v3_hw(struct hisi_hba *hisi_hba) debugfs_snapshot_itct_reg_v3_hw(hisi_hba); debugfs_snapshot_iost_reg_v3_hw(hisi_hba);
- debugfs_create_files_v3_hw(hisi_hba); - debugfs_snapshot_restore_v3_hw(hisi_hba); hisi_hba->debugfs_dump_index++;
@@ -4872,6 +4917,17 @@ static void debugfs_bist_init_v3_hw(struct hisi_hba *hisi_hba) hisi_hba->debugfs_bist_linkrate = SAS_LINK_RATE_1_5_GBPS; }
+static void debugfs_dump_init_v3_hw(struct hisi_hba *hisi_hba) +{ + int i; + + hisi_hba->debugfs_dump_dentry = + debugfs_create_dir("dump", hisi_hba->debugfs_dir); + + for (i = 0; i < hisi_sas_debugfs_dump_count; i++) + debugfs_create_files_v3_hw(hisi_hba, i); +} + static void debugfs_exit_v3_hw(struct hisi_hba *hisi_hba) { debugfs_remove_recursive(hisi_hba->debugfs_dir); @@ -4892,8 +4948,7 @@ static void debugfs_init_v3_hw(struct hisi_hba *hisi_hba) /* create bist structures */ debugfs_bist_init_v3_hw(hisi_hba);
- hisi_hba->debugfs_dump_dentry = - debugfs_create_dir("dump", hisi_hba->debugfs_dir); + debugfs_dump_init_v3_hw(hisi_hba);
debugfs_phy_down_cnt_init_v3_hw(hisi_hba); debugfs_fifo_init_v3_hw(hisi_hba);
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Rajendra Nayak quic_rjendra@quicinc.com
[ Upstream commit c32f4f4ae1c6035b44bb4ca7a41fa4fd51244597 ]
Zonda ole pll has as extra PLL_OFF_CONFIG_CTL_U2 register, hence add support for it.
Signed-off-by: Rajendra Nayak quic_rjendra@quicinc.com Signed-off-by: Abel Vesa abel.vesa@linaro.org Link: https://lore.kernel.org/r/20240202-x1e80100-clock-controllers-v4-6-7fb08c861... Signed-off-by: Bjorn Andersson andersson@kernel.org Stable-dep-of: 79dfed29aa3f ("clk: qcom: clk-alpha-pll: Add NSS HUAYRA ALPHA PLL support for ipq9574") Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/clk/qcom/clk-alpha-pll.c | 16 ++++++++++++++++ drivers/clk/qcom/clk-alpha-pll.h | 4 ++++ 2 files changed, 20 insertions(+)
diff --git a/drivers/clk/qcom/clk-alpha-pll.c b/drivers/clk/qcom/clk-alpha-pll.c index 8b3e5f84e89a..87040b949eb4 100644 --- a/drivers/clk/qcom/clk-alpha-pll.c +++ b/drivers/clk/qcom/clk-alpha-pll.c @@ -52,6 +52,7 @@ #define PLL_CONFIG_CTL(p) ((p)->offset + (p)->regs[PLL_OFF_CONFIG_CTL]) #define PLL_CONFIG_CTL_U(p) ((p)->offset + (p)->regs[PLL_OFF_CONFIG_CTL_U]) #define PLL_CONFIG_CTL_U1(p) ((p)->offset + (p)->regs[PLL_OFF_CONFIG_CTL_U1]) +#define PLL_CONFIG_CTL_U2(p) ((p)->offset + (p)->regs[PLL_OFF_CONFIG_CTL_U2]) #define PLL_TEST_CTL(p) ((p)->offset + (p)->regs[PLL_OFF_TEST_CTL]) #define PLL_TEST_CTL_U(p) ((p)->offset + (p)->regs[PLL_OFF_TEST_CTL_U]) #define PLL_TEST_CTL_U1(p) ((p)->offset + (p)->regs[PLL_OFF_TEST_CTL_U1]) @@ -227,6 +228,21 @@ const u8 clk_alpha_pll_regs[][PLL_OFF_MAX_REGS] = { [PLL_OFF_ALPHA_VAL] = 0x24, [PLL_OFF_ALPHA_VAL_U] = 0x28, }, + [CLK_ALPHA_PLL_TYPE_ZONDA_OLE] = { + [PLL_OFF_L_VAL] = 0x04, + [PLL_OFF_ALPHA_VAL] = 0x08, + [PLL_OFF_USER_CTL] = 0x0c, + [PLL_OFF_USER_CTL_U] = 0x10, + [PLL_OFF_CONFIG_CTL] = 0x14, + [PLL_OFF_CONFIG_CTL_U] = 0x18, + [PLL_OFF_CONFIG_CTL_U1] = 0x1c, + [PLL_OFF_CONFIG_CTL_U2] = 0x20, + [PLL_OFF_TEST_CTL] = 0x24, + [PLL_OFF_TEST_CTL_U] = 0x28, + [PLL_OFF_TEST_CTL_U1] = 0x2c, + [PLL_OFF_OPMODE] = 0x30, + [PLL_OFF_STATUS] = 0x3c, + }, }; EXPORT_SYMBOL_GPL(clk_alpha_pll_regs);
diff --git a/drivers/clk/qcom/clk-alpha-pll.h b/drivers/clk/qcom/clk-alpha-pll.h index 3fd0ef41c72c..f50de33a045d 100644 --- a/drivers/clk/qcom/clk-alpha-pll.h +++ b/drivers/clk/qcom/clk-alpha-pll.h @@ -21,6 +21,7 @@ enum { CLK_ALPHA_PLL_TYPE_LUCID = CLK_ALPHA_PLL_TYPE_TRION, CLK_ALPHA_PLL_TYPE_AGERA, CLK_ALPHA_PLL_TYPE_ZONDA, + CLK_ALPHA_PLL_TYPE_ZONDA_OLE, CLK_ALPHA_PLL_TYPE_LUCID_EVO, CLK_ALPHA_PLL_TYPE_LUCID_OLE, CLK_ALPHA_PLL_TYPE_RIVIAN_EVO, @@ -42,6 +43,7 @@ enum { PLL_OFF_CONFIG_CTL, PLL_OFF_CONFIG_CTL_U, PLL_OFF_CONFIG_CTL_U1, + PLL_OFF_CONFIG_CTL_U2, PLL_OFF_TEST_CTL, PLL_OFF_TEST_CTL_U, PLL_OFF_TEST_CTL_U1, @@ -119,6 +121,7 @@ struct alpha_pll_config { u32 config_ctl_val; u32 config_ctl_hi_val; u32 config_ctl_hi1_val; + u32 config_ctl_hi2_val; u32 user_ctl_val; u32 user_ctl_hi_val; u32 user_ctl_hi1_val; @@ -173,6 +176,7 @@ extern const struct clk_ops clk_alpha_pll_postdiv_lucid_5lpe_ops;
extern const struct clk_ops clk_alpha_pll_zonda_ops; #define clk_alpha_pll_postdiv_zonda_ops clk_alpha_pll_postdiv_fabia_ops +#define clk_alpha_pll_zonda_ole_ops clk_alpha_pll_zonda_ops
extern const struct clk_ops clk_alpha_pll_lucid_evo_ops; extern const struct clk_ops clk_alpha_pll_reset_lucid_evo_ops;
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Devi Priya quic_devipriy@quicinc.com
[ Upstream commit 79dfed29aa3f714e0a94a39b2bfe9ac14ce19a6a ]
Add support for NSS Huayra alpha pll found on ipq9574 SoCs. Programming sequence is the same as that of Huayra type Alpha PLL, so we can re-use the same.
Reviewed-by: Dmitry Baryshkov dmitry.baryshkov@linaro.org Signed-off-by: Devi Priya quic_devipriy@quicinc.com Link: https://lore.kernel.org/r/20241028060506.246606-2-quic_srichara@quicinc.com Signed-off-by: Bjorn Andersson andersson@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/clk/qcom/clk-alpha-pll.c | 11 +++++++++++ drivers/clk/qcom/clk-alpha-pll.h | 1 + 2 files changed, 12 insertions(+)
diff --git a/drivers/clk/qcom/clk-alpha-pll.c b/drivers/clk/qcom/clk-alpha-pll.c index 87040b949eb4..ce44dbfd47e2 100644 --- a/drivers/clk/qcom/clk-alpha-pll.c +++ b/drivers/clk/qcom/clk-alpha-pll.c @@ -243,6 +243,17 @@ const u8 clk_alpha_pll_regs[][PLL_OFF_MAX_REGS] = { [PLL_OFF_OPMODE] = 0x30, [PLL_OFF_STATUS] = 0x3c, }, + [CLK_ALPHA_PLL_TYPE_NSS_HUAYRA] = { + [PLL_OFF_L_VAL] = 0x04, + [PLL_OFF_ALPHA_VAL] = 0x08, + [PLL_OFF_TEST_CTL] = 0x0c, + [PLL_OFF_TEST_CTL_U] = 0x10, + [PLL_OFF_USER_CTL] = 0x14, + [PLL_OFF_CONFIG_CTL] = 0x18, + [PLL_OFF_CONFIG_CTL_U] = 0x1c, + [PLL_OFF_STATUS] = 0x20, + }, + }; EXPORT_SYMBOL_GPL(clk_alpha_pll_regs);
diff --git a/drivers/clk/qcom/clk-alpha-pll.h b/drivers/clk/qcom/clk-alpha-pll.h index f50de33a045d..52dc5b9b546a 100644 --- a/drivers/clk/qcom/clk-alpha-pll.h +++ b/drivers/clk/qcom/clk-alpha-pll.h @@ -29,6 +29,7 @@ enum { CLK_ALPHA_PLL_TYPE_BRAMMO_EVO, CLK_ALPHA_PLL_TYPE_STROMER, CLK_ALPHA_PLL_TYPE_STROMER_PLUS, + CLK_ALPHA_PLL_TYPE_NSS_HUAYRA, CLK_ALPHA_PLL_TYPE_MAX, };
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Huisong Li lihuisong@huawei.com
[ Upstream commit 60c40b06fa68694dd08a1a0038ea8b9de3f3b1ca ]
Currently, PCC driver doesn't support the processing of platform notification for type 4 PCC subspaces.
According to ACPI specification, if platform sends a notification to OSPM, it must clear the command complete bit and trigger platform interrupt. OSPM needs to check whether the command complete bit is cleared, clear platform interrupt, process command, and then set the command complete and ring doorbell to the Platform.
Let us stash the value of the pcc type and use the same while processing the interrupt of the channel. We also need to set the command complete bit and ring doorbell in the interrupt handler for the type 4 channel to complete the communication flow after processing the notification from the Platform.
Signed-off-by: Huisong Li lihuisong@huawei.com Reviewed-by: Hanjun Guo guohanjun@huawei.com Link: https://lore.kernel.org/r/20230801063827.25336-2-lihuisong@huawei.com Signed-off-by: Sudeep Holla sudeep.holla@arm.com Stable-dep-of: 7f9e19f207be ("mailbox: pcc: Check before sending MCTP PCC response ACK") Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/mailbox/pcc.c | 50 +++++++++++++++++++++++++++++++++++-------- 1 file changed, 41 insertions(+), 9 deletions(-)
diff --git a/drivers/mailbox/pcc.c b/drivers/mailbox/pcc.c index a44d4b3e5beb..80310b48bfb6 100644 --- a/drivers/mailbox/pcc.c +++ b/drivers/mailbox/pcc.c @@ -91,6 +91,7 @@ struct pcc_chan_reg { * @cmd_update: PCC register bundle for the command complete update register * @error: PCC register bundle for the error status register * @plat_irq: platform interrupt + * @type: PCC subspace type */ struct pcc_chan_info { struct pcc_mbox_chan chan; @@ -100,12 +101,15 @@ struct pcc_chan_info { struct pcc_chan_reg cmd_update; struct pcc_chan_reg error; int plat_irq; + u8 type; };
#define to_pcc_chan_info(c) container_of(c, struct pcc_chan_info, chan) static struct pcc_chan_info *chan_info; static int pcc_chan_count;
+static int pcc_send_data(struct mbox_chan *chan, void *data); + /* * PCC can be used with perf critical drivers such as CPPC * So it makes sense to locally cache the virtual address and @@ -221,6 +225,34 @@ static int pcc_map_interrupt(u32 interrupt, u32 flags) return acpi_register_gsi(NULL, interrupt, trigger, polarity); }
+static bool pcc_mbox_cmd_complete_check(struct pcc_chan_info *pchan) +{ + u64 val; + int ret; + + ret = pcc_chan_reg_read(&pchan->cmd_complete, &val); + if (ret) + return false; + + if (!pchan->cmd_complete.gas) + return true; + + /* + * Judge if the channel respond the interrupt based on the value of + * command complete. + */ + val &= pchan->cmd_complete.status_mask; + /* + * If this is PCC slave subspace channel, and the command complete + * bit 0 indicates that Platform is sending a notification and OSPM + * needs to respond this interrupt to process this command. + */ + if (pchan->type == ACPI_PCCT_TYPE_EXT_PCC_SLAVE_SUBSPACE) + return !val; + + return !!val; +} + /** * pcc_mbox_irq - PCC mailbox interrupt handler * @irq: interrupt number @@ -236,17 +268,9 @@ static irqreturn_t pcc_mbox_irq(int irq, void *p) int ret;
pchan = chan->con_priv; - - ret = pcc_chan_reg_read(&pchan->cmd_complete, &val); - if (ret) + if (!pcc_mbox_cmd_complete_check(pchan)) return IRQ_NONE;
- if (val) { /* Ensure GAS exists and value is non-zero */ - val &= pchan->cmd_complete.status_mask; - if (!val) - return IRQ_NONE; - } - ret = pcc_chan_reg_read(&pchan->error, &val); if (ret) return IRQ_NONE; @@ -262,6 +286,13 @@ static irqreturn_t pcc_mbox_irq(int irq, void *p)
mbox_chan_received_data(chan, NULL);
+ /* + * The PCC slave subspace channel needs to set the command complete bit + * and ring doorbell after processing message. + */ + if (pchan->type == ACPI_PCCT_TYPE_EXT_PCC_SLAVE_SUBSPACE) + pcc_send_data(chan, NULL); + return IRQ_HANDLED; }
@@ -698,6 +729,7 @@ static int pcc_mbox_probe(struct platform_device *pdev)
pcc_parse_subspace_shmem(pchan, pcct_entry);
+ pchan->type = pcct_entry->type; pcct_entry = (struct acpi_subtable_header *) ((unsigned long) pcct_entry + pcct_entry->length); }
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Huisong Li lihuisong@huawei.com
[ Upstream commit 3db174e478cb0bb34888c20a531608b70aec9c1f ]
If the platform acknowledge interrupt is level triggered, then it can be shared by multiple subspaces provided each one has a unique platform interrupt ack preserve and ack set masks.
If it can be shared, then we can request the irq with IRQF_SHARED and IRQF_ONESHOT flags. The first one indicating it can be shared and the latter one to keep the interrupt disabled until the hardirq handler finished.
Further, since there is no way to detect if the interrupt is for a given channel as the interrupt ack preserve and ack set masks are for clearing the interrupt and not for reading the status(in case Irq Ack register may be write-only on some platforms), we need a way to identify if the given channel is in use and expecting the interrupt.
PCC type0, type1 and type5 do not support shared level triggered interrupt. The methods of determining whether a given channel for remaining types should respond to an interrupt are as follows: - type2: Whether the interrupt belongs to a given channel is only determined by the status field in Generic Communications Channel Shared Memory Region, which is done in rx_callback of PCC client. - type3: This channel checks chan_in_use flag first and then checks the command complete bit(value '1' indicates that the command has been completed). - type4: Platform ensure that the default value of the command complete bit corresponding to the type4 channel is '1'. This command complete bit is '0' when receive a platform notification.
The new field, 'chan_in_use' is used by the type only support the communication from OSPM to Platform (like type3) and should be completely ignored by other types so as to avoid too many type unnecessary checks in IRQ handler.
Signed-off-by: Huisong Li lihuisong@huawei.com Reviewed-by: Hanjun Guo guohanjun@huawei.com Link: https://lore.kernel.org/r/20230801063827.25336-3-lihuisong@huawei.com Signed-off-by: Sudeep Holla sudeep.holla@arm.com Stable-dep-of: 7f9e19f207be ("mailbox: pcc: Check before sending MCTP PCC response ACK") Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/mailbox/pcc.c | 43 ++++++++++++++++++++++++++++++++++++++++--- 1 file changed, 40 insertions(+), 3 deletions(-)
diff --git a/drivers/mailbox/pcc.c b/drivers/mailbox/pcc.c index 80310b48bfb6..94885e411085 100644 --- a/drivers/mailbox/pcc.c +++ b/drivers/mailbox/pcc.c @@ -92,6 +92,13 @@ struct pcc_chan_reg { * @error: PCC register bundle for the error status register * @plat_irq: platform interrupt * @type: PCC subspace type + * @plat_irq_flags: platform interrupt flags + * @chan_in_use: this flag is used just to check if the interrupt needs + * handling when it is shared. Since only one transfer can occur + * at a time and mailbox takes care of locking, this flag can be + * accessed without a lock. Note: the type only support the + * communication from OSPM to Platform, like type3, use it, and + * other types completely ignore it. */ struct pcc_chan_info { struct pcc_mbox_chan chan; @@ -102,6 +109,8 @@ struct pcc_chan_info { struct pcc_chan_reg error; int plat_irq; u8 type; + unsigned int plat_irq_flags; + bool chan_in_use; };
#define to_pcc_chan_info(c) container_of(c, struct pcc_chan_info, chan) @@ -225,6 +234,12 @@ static int pcc_map_interrupt(u32 interrupt, u32 flags) return acpi_register_gsi(NULL, interrupt, trigger, polarity); }
+static bool pcc_chan_plat_irq_can_be_shared(struct pcc_chan_info *pchan) +{ + return (pchan->plat_irq_flags & ACPI_PCCT_INTERRUPT_MODE) == + ACPI_LEVEL_SENSITIVE; +} + static bool pcc_mbox_cmd_complete_check(struct pcc_chan_info *pchan) { u64 val; @@ -242,6 +257,7 @@ static bool pcc_mbox_cmd_complete_check(struct pcc_chan_info *pchan) * command complete. */ val &= pchan->cmd_complete.status_mask; + /* * If this is PCC slave subspace channel, and the command complete * bit 0 indicates that Platform is sending a notification and OSPM @@ -268,6 +284,10 @@ static irqreturn_t pcc_mbox_irq(int irq, void *p) int ret;
pchan = chan->con_priv; + if (pchan->type == ACPI_PCCT_TYPE_EXT_PCC_MASTER_SUBSPACE && + !pchan->chan_in_use) + return IRQ_NONE; + if (!pcc_mbox_cmd_complete_check(pchan)) return IRQ_NONE;
@@ -289,9 +309,12 @@ static irqreturn_t pcc_mbox_irq(int irq, void *p) /* * The PCC slave subspace channel needs to set the command complete bit * and ring doorbell after processing message. + * + * The PCC master subspace channel clears chan_in_use to free channel. */ if (pchan->type == ACPI_PCCT_TYPE_EXT_PCC_SLAVE_SUBSPACE) pcc_send_data(chan, NULL); + pchan->chan_in_use = false;
return IRQ_HANDLED; } @@ -371,7 +394,11 @@ static int pcc_send_data(struct mbox_chan *chan, void *data) if (ret) return ret;
- return pcc_chan_reg_read_modify_write(&pchan->db); + ret = pcc_chan_reg_read_modify_write(&pchan->db); + if (!ret && pchan->plat_irq > 0) + pchan->chan_in_use = true; + + return ret; }
/** @@ -384,11 +411,14 @@ static int pcc_send_data(struct mbox_chan *chan, void *data) static int pcc_startup(struct mbox_chan *chan) { struct pcc_chan_info *pchan = chan->con_priv; + unsigned long irqflags; int rc;
if (pchan->plat_irq > 0) { - rc = devm_request_irq(chan->mbox->dev, pchan->plat_irq, pcc_mbox_irq, 0, - MBOX_IRQ_NAME, chan); + irqflags = pcc_chan_plat_irq_can_be_shared(pchan) ? + IRQF_SHARED | IRQF_ONESHOT : 0; + rc = devm_request_irq(chan->mbox->dev, pchan->plat_irq, pcc_mbox_irq, + irqflags, MBOX_IRQ_NAME, chan); if (unlikely(rc)) { dev_err(chan->mbox->dev, "failed to register PCC interrupt %d\n", pchan->plat_irq); @@ -494,6 +524,7 @@ static int pcc_parse_subspace_irq(struct pcc_chan_info *pchan, pcct_ss->platform_interrupt); return -EINVAL; } + pchan->plat_irq_flags = pcct_ss->flags;
if (pcct_ss->header.type == ACPI_PCCT_TYPE_HW_REDUCED_SUBSPACE_TYPE2) { struct acpi_pcct_hw_reduced_type2 *pcct2_ss = (void *)pcct_ss; @@ -515,6 +546,12 @@ static int pcc_parse_subspace_irq(struct pcc_chan_info *pchan, "PLAT IRQ ACK"); }
+ if (pcc_chan_plat_irq_can_be_shared(pchan) && + !pchan->plat_irq_ack.gas) { + pr_err("PCC subspace has level IRQ with no ACK register\n"); + return -EINVAL; + } + return ret; }
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Sudeep Holla sudeep.holla@arm.com
[ Upstream commit 55d235ebb684b993b3247740c1c8e273f8af4a54 ]
Define the common macros to use when referring to various bitfields in the PCC generic communications channel command and status fields.
Currently different drivers that need to use these bitfields have defined these locally. This common macro is intended to consolidate and replace those.
Cc: "Rafael J. Wysocki" rafael.j.wysocki@intel.com Link: https://lore.kernel.org/r/20230927-pcc_defines-v2-1-0b8ffeaef2e5@arm.com Signed-off-by: Sudeep Holla sudeep.holla@arm.com Stable-dep-of: 7f9e19f207be ("mailbox: pcc: Check before sending MCTP PCC response ACK") Signed-off-by: Sasha Levin sashal@kernel.org --- include/acpi/pcc.h | 13 +++++++++++++ 1 file changed, 13 insertions(+)
diff --git a/include/acpi/pcc.h b/include/acpi/pcc.h index 73e806fe7ce7..9b373d172a77 100644 --- a/include/acpi/pcc.h +++ b/include/acpi/pcc.h @@ -18,7 +18,20 @@ struct pcc_mbox_chan { u16 min_turnaround_time; };
+/* Generic Communications Channel Shared Memory Region */ +#define PCC_SIGNATURE 0x50434300 +/* Generic Communications Channel Command Field */ +#define PCC_CMD_GENERATE_DB_INTR BIT(15) +/* Generic Communications Channel Status Field */ +#define PCC_STATUS_CMD_COMPLETE BIT(0) +#define PCC_STATUS_SCI_DOORBELL BIT(1) +#define PCC_STATUS_ERROR BIT(2) +#define PCC_STATUS_PLATFORM_NOTIFY BIT(3) +/* Initiator Responder Communications Channel Flags */ +#define PCC_CMD_COMPLETION_NOTIFY BIT(0) + #define MAX_PCC_SUBSPACES 256 + #ifdef CONFIG_PCC extern struct pcc_mbox_chan * pcc_mbox_request_channel(struct mbox_client *cl, int subspace_id);
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Adam Young admiyo@os.amperecomputing.com
[ Upstream commit 7f9e19f207be0c534d517d65e01417ba968cdd34 ]
Type 4 PCC channels have an option to send back a response to the platform when they are done processing the request. The flag to indicate whether or not to respond is inside the message body, and thus is not available to the pcc mailbox.
If the flag is not set, still set command completion bit after processing message.
In order to read the flag, this patch maps the shared buffer to virtual memory. To avoid duplication of mapping the shared buffer is then made available to be used by the driver that uses the mailbox.
Signed-off-by: Adam Young admiyo@os.amperecomputing.com Cc: Sudeep Holla sudeep.holla@arm.com Signed-off-by: Jassi Brar jassisinghbrar@gmail.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/mailbox/pcc.c | 61 +++++++++++++++++++++++++++++++++++++------ include/acpi/pcc.h | 7 +++++ 2 files changed, 60 insertions(+), 8 deletions(-)
diff --git a/drivers/mailbox/pcc.c b/drivers/mailbox/pcc.c index 94885e411085..82102a4c5d68 100644 --- a/drivers/mailbox/pcc.c +++ b/drivers/mailbox/pcc.c @@ -269,6 +269,35 @@ static bool pcc_mbox_cmd_complete_check(struct pcc_chan_info *pchan) return !!val; }
+static void check_and_ack(struct pcc_chan_info *pchan, struct mbox_chan *chan) +{ + struct acpi_pcct_ext_pcc_shared_memory pcc_hdr; + + if (pchan->type != ACPI_PCCT_TYPE_EXT_PCC_SLAVE_SUBSPACE) + return; + /* If the memory region has not been mapped, we cannot + * determine if we need to send the message, but we still + * need to set the cmd_update flag before returning. + */ + if (pchan->chan.shmem == NULL) { + pcc_chan_reg_read_modify_write(&pchan->cmd_update); + return; + } + memcpy_fromio(&pcc_hdr, pchan->chan.shmem, + sizeof(struct acpi_pcct_ext_pcc_shared_memory)); + /* + * The PCC slave subspace channel needs to set the command complete bit + * after processing message. If the PCC_ACK_FLAG is set, it should also + * ring the doorbell. + * + * The PCC master subspace channel clears chan_in_use to free channel. + */ + if (le32_to_cpup(&pcc_hdr.flags) & PCC_ACK_FLAG_MASK) + pcc_send_data(chan, NULL); + else + pcc_chan_reg_read_modify_write(&pchan->cmd_update); +} + /** * pcc_mbox_irq - PCC mailbox interrupt handler * @irq: interrupt number @@ -306,14 +335,7 @@ static irqreturn_t pcc_mbox_irq(int irq, void *p)
mbox_chan_received_data(chan, NULL);
- /* - * The PCC slave subspace channel needs to set the command complete bit - * and ring doorbell after processing message. - * - * The PCC master subspace channel clears chan_in_use to free channel. - */ - if (pchan->type == ACPI_PCCT_TYPE_EXT_PCC_SLAVE_SUBSPACE) - pcc_send_data(chan, NULL); + check_and_ack(pchan, chan); pchan->chan_in_use = false;
return IRQ_HANDLED; @@ -365,14 +387,37 @@ EXPORT_SYMBOL_GPL(pcc_mbox_request_channel); void pcc_mbox_free_channel(struct pcc_mbox_chan *pchan) { struct mbox_chan *chan = pchan->mchan; + struct pcc_chan_info *pchan_info; + struct pcc_mbox_chan *pcc_mbox_chan;
if (!chan || !chan->cl) return; + pchan_info = chan->con_priv; + pcc_mbox_chan = &pchan_info->chan; + if (pcc_mbox_chan->shmem) { + iounmap(pcc_mbox_chan->shmem); + pcc_mbox_chan->shmem = NULL; + }
mbox_free_channel(chan); } EXPORT_SYMBOL_GPL(pcc_mbox_free_channel);
+int pcc_mbox_ioremap(struct mbox_chan *chan) +{ + struct pcc_chan_info *pchan_info; + struct pcc_mbox_chan *pcc_mbox_chan; + + if (!chan || !chan->cl) + return -1; + pchan_info = chan->con_priv; + pcc_mbox_chan = &pchan_info->chan; + pcc_mbox_chan->shmem = ioremap(pcc_mbox_chan->shmem_base_addr, + pcc_mbox_chan->shmem_size); + return 0; +} +EXPORT_SYMBOL_GPL(pcc_mbox_ioremap); + /** * pcc_send_data - Called from Mailbox Controller code. Used * here only to ring the channel doorbell. The PCC client diff --git a/include/acpi/pcc.h b/include/acpi/pcc.h index 9b373d172a77..699c1a37b8e7 100644 --- a/include/acpi/pcc.h +++ b/include/acpi/pcc.h @@ -12,6 +12,7 @@ struct pcc_mbox_chan { struct mbox_chan *mchan; u64 shmem_base_addr; + void __iomem *shmem; u64 shmem_size; u32 latency; u32 max_access_rate; @@ -31,11 +32,13 @@ struct pcc_mbox_chan { #define PCC_CMD_COMPLETION_NOTIFY BIT(0)
#define MAX_PCC_SUBSPACES 256 +#define PCC_ACK_FLAG_MASK 0x1
#ifdef CONFIG_PCC extern struct pcc_mbox_chan * pcc_mbox_request_channel(struct mbox_client *cl, int subspace_id); extern void pcc_mbox_free_channel(struct pcc_mbox_chan *chan); +extern int pcc_mbox_ioremap(struct mbox_chan *chan); #else static inline struct pcc_mbox_chan * pcc_mbox_request_channel(struct mbox_client *cl, int subspace_id) @@ -43,6 +46,10 @@ pcc_mbox_request_channel(struct mbox_client *cl, int subspace_id) return ERR_PTR(-ENODEV); } static inline void pcc_mbox_free_channel(struct pcc_mbox_chan *chan) { } +static inline int pcc_mbox_ioremap(struct mbox_chan *chan) +{ + return 0; +}; #endif
#endif /* _PCC_H */
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Nikita Travkin nikita@trvn.ru
[ Upstream commit 8de60bbab994bf8165d7d10e974872852da47aa7 ]
sc7180 has a dedicated ADSP similar to the one found in sm8250. Add it's compatible to the driver reusing the existing config so the devices that use the adsp can probe it.
Signed-off-by: Nikita Travkin nikita@trvn.ru Reviewed-by: Stephen Boyd swboyd@chromium.org Link: https://lore.kernel.org/r/20230907-sc7180-adsp-rproc-v3-2-6515c3fbe0a3@trvn.... Signed-off-by: Bjorn Andersson andersson@kernel.org Stable-dep-of: 009e288c989b ("remoteproc: qcom: pas: enable SAR2130P audio DSP support") Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/remoteproc/qcom_q6v5_pas.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/drivers/remoteproc/qcom_q6v5_pas.c b/drivers/remoteproc/qcom_q6v5_pas.c index 6235721f2c1a..fd66bb8b23f8 100644 --- a/drivers/remoteproc/qcom_q6v5_pas.c +++ b/drivers/remoteproc/qcom_q6v5_pas.c @@ -1163,6 +1163,7 @@ static const struct of_device_id adsp_of_match[] = { { .compatible = "qcom,qcs404-adsp-pas", .data = &adsp_resource_init }, { .compatible = "qcom,qcs404-cdsp-pas", .data = &cdsp_resource_init }, { .compatible = "qcom,qcs404-wcss-pas", .data = &wcss_resource_init }, + { .compatible = "qcom,sc7180-adsp-pas", .data = &sm8250_adsp_resource}, { .compatible = "qcom,sc7180-mpss-pas", .data = &mpss_resource_init}, { .compatible = "qcom,sc7280-mpss-pas", .data = &mpss_resource_init}, { .compatible = "qcom,sc8180x-adsp-pas", .data = &sm8150_adsp_resource},
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Tengfei Fan quic_tengfan@quicinc.com
[ Upstream commit 9091225ba28c0106d3cd041c7abf5551a94bb524 ]
Add support for PIL loading on ADSP, CDSP0, CDSP1, GPDSP0 and GPDSP1 on SA8775p SoCs.
Signed-off-by: Tengfei Fan quic_tengfan@quicinc.com Co-developed-by: Bartosz Golaszewski bartosz.golaszewski@linaro.org Signed-off-by: Bartosz Golaszewski bartosz.golaszewski@linaro.org Link: https://lore.kernel.org/r/20240805-topic-sa8775p-iot-remoteproc-v4-3-86affdc... Signed-off-by: Bjorn Andersson andersson@kernel.org Stable-dep-of: 009e288c989b ("remoteproc: qcom: pas: enable SAR2130P audio DSP support") Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/remoteproc/qcom_q6v5_pas.c | 92 ++++++++++++++++++++++++++++++ 1 file changed, 92 insertions(+)
diff --git a/drivers/remoteproc/qcom_q6v5_pas.c b/drivers/remoteproc/qcom_q6v5_pas.c index fd66bb8b23f8..4a73723e375a 100644 --- a/drivers/remoteproc/qcom_q6v5_pas.c +++ b/drivers/remoteproc/qcom_q6v5_pas.c @@ -786,6 +786,23 @@ static const struct adsp_data adsp_resource_init = { .ssctl_id = 0x14, };
+static const struct adsp_data sa8775p_adsp_resource = { + .crash_reason_smem = 423, + .firmware_name = "adsp.mbn", + .pas_id = 1, + .minidump_id = 5, + .auto_boot = true, + .proxy_pd_names = (char*[]){ + "lcx", + "lmx", + NULL + }, + .load_state = "adsp", + .ssr_name = "lpass", + .sysmon_name = "adsp", + .ssctl_id = 0x14, +}; + static const struct adsp_data sdm845_adsp_resource_init = { .crash_reason_smem = 423, .firmware_name = "adsp.mdt", @@ -885,6 +902,42 @@ static const struct adsp_data cdsp_resource_init = { .ssctl_id = 0x17, };
+static const struct adsp_data sa8775p_cdsp0_resource = { + .crash_reason_smem = 601, + .firmware_name = "cdsp0.mbn", + .pas_id = 18, + .minidump_id = 7, + .auto_boot = true, + .proxy_pd_names = (char*[]){ + "cx", + "mxc", + "nsp", + NULL + }, + .load_state = "cdsp", + .ssr_name = "cdsp", + .sysmon_name = "cdsp", + .ssctl_id = 0x17, +}; + +static const struct adsp_data sa8775p_cdsp1_resource = { + .crash_reason_smem = 633, + .firmware_name = "cdsp1.mbn", + .pas_id = 30, + .minidump_id = 20, + .auto_boot = true, + .proxy_pd_names = (char*[]){ + "cx", + "mxc", + "nsp", + NULL + }, + .load_state = "nsp", + .ssr_name = "cdsp1", + .sysmon_name = "cdsp1", + .ssctl_id = 0x20, +}; + static const struct adsp_data sdm845_cdsp_resource_init = { .crash_reason_smem = 601, .firmware_name = "cdsp.mdt", @@ -987,6 +1040,40 @@ static const struct adsp_data sm8350_cdsp_resource = { .ssctl_id = 0x17, };
+static const struct adsp_data sa8775p_gpdsp0_resource = { + .crash_reason_smem = 640, + .firmware_name = "gpdsp0.mbn", + .pas_id = 39, + .minidump_id = 21, + .auto_boot = true, + .proxy_pd_names = (char*[]){ + "cx", + "mxc", + NULL + }, + .load_state = "gpdsp0", + .ssr_name = "gpdsp0", + .sysmon_name = "gpdsp0", + .ssctl_id = 0x21, +}; + +static const struct adsp_data sa8775p_gpdsp1_resource = { + .crash_reason_smem = 641, + .firmware_name = "gpdsp1.mbn", + .pas_id = 40, + .minidump_id = 22, + .auto_boot = true, + .proxy_pd_names = (char*[]){ + "cx", + "mxc", + NULL + }, + .load_state = "gpdsp1", + .ssr_name = "gpdsp1", + .sysmon_name = "gpdsp1", + .ssctl_id = 0x22, +}; + static const struct adsp_data mpss_resource_init = { .crash_reason_smem = 421, .firmware_name = "modem.mdt", @@ -1163,6 +1250,11 @@ static const struct of_device_id adsp_of_match[] = { { .compatible = "qcom,qcs404-adsp-pas", .data = &adsp_resource_init }, { .compatible = "qcom,qcs404-cdsp-pas", .data = &cdsp_resource_init }, { .compatible = "qcom,qcs404-wcss-pas", .data = &wcss_resource_init }, + { .compatible = "qcom,sa8775p-adsp-pas", .data = &sa8775p_adsp_resource}, + { .compatible = "qcom,sa8775p-cdsp0-pas", .data = &sa8775p_cdsp0_resource}, + { .compatible = "qcom,sa8775p-cdsp1-pas", .data = &sa8775p_cdsp1_resource}, + { .compatible = "qcom,sa8775p-gpdsp0-pas", .data = &sa8775p_gpdsp0_resource}, + { .compatible = "qcom,sa8775p-gpdsp1-pas", .data = &sa8775p_gpdsp1_resource}, { .compatible = "qcom,sc7180-adsp-pas", .data = &sm8250_adsp_resource}, { .compatible = "qcom,sc7180-mpss-pas", .data = &mpss_resource_init}, { .compatible = "qcom,sc7280-mpss-pas", .data = &mpss_resource_init},
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Dmitry Baryshkov dmitry.baryshkov@linaro.org
[ Upstream commit 009e288c989b3fe548a45c82da407d7bd00418a9 ]
Enable support for the Audio DSP on the Qualcomm SAR2130P platform, reusing the SM8350 resources.
Signed-off-by: Dmitry Baryshkov dmitry.baryshkov@linaro.org Reviewed-by: Neil Armstrong neil.armstrong@linaro.org Link: https://lore.kernel.org/r/20241027-sar2130p-adsp-v1-3-bd204e39d24e@linaro.or... Signed-off-by: Bjorn Andersson andersson@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/remoteproc/qcom_q6v5_pas.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/drivers/remoteproc/qcom_q6v5_pas.c b/drivers/remoteproc/qcom_q6v5_pas.c index 4a73723e375a..fd6bf9e77afc 100644 --- a/drivers/remoteproc/qcom_q6v5_pas.c +++ b/drivers/remoteproc/qcom_q6v5_pas.c @@ -1255,6 +1255,7 @@ static const struct of_device_id adsp_of_match[] = { { .compatible = "qcom,sa8775p-cdsp1-pas", .data = &sa8775p_cdsp1_resource}, { .compatible = "qcom,sa8775p-gpdsp0-pas", .data = &sa8775p_gpdsp0_resource}, { .compatible = "qcom,sa8775p-gpdsp1-pas", .data = &sa8775p_gpdsp1_resource}, + { .compatible = "qcom,sar2130p-adsp-pas", .data = &sm8350_adsp_resource}, { .compatible = "qcom,sc7180-adsp-pas", .data = &sm8250_adsp_resource}, { .compatible = "qcom,sc7180-mpss-pas", .data = &mpss_resource_init}, { .compatible = "qcom,sc7280-mpss-pas", .data = &mpss_resource_init},
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Konstantin Komarov almaz.alexandrovich@paragon-software.com
[ Upstream commit 9a2d6a40b8a1a6fa62eaf47ceee10a5eef62284c ]
Signed-off-by: Konstantin Komarov almaz.alexandrovich@paragon-software.com Stable-dep-of: e2705dd3d16d ("fs/ntfs3: Fix warning in ni_fiemap") Signed-off-by: Sasha Levin sashal@kernel.org --- fs/ntfs3/attrib.c | 25 +++++++++++++++---------- fs/ntfs3/inode.c | 3 ++- 2 files changed, 17 insertions(+), 11 deletions(-)
diff --git a/fs/ntfs3/attrib.c b/fs/ntfs3/attrib.c index fc6cea60044e..582628b9b796 100644 --- a/fs/ntfs3/attrib.c +++ b/fs/ntfs3/attrib.c @@ -977,15 +977,17 @@ int attr_data_get_block(struct ntfs_inode *ni, CLST vcn, CLST clen, CLST *lcn, goto out;
/* Check for compressed frame. */ - err = attr_is_frame_compressed(ni, attr, vcn >> NTFS_LZNT_CUNIT, &hint); + err = attr_is_frame_compressed(ni, attr_b, vcn >> NTFS_LZNT_CUNIT, + &hint); if (err) goto out;
if (hint) { /* if frame is compressed - don't touch it. */ *lcn = COMPRESSED_LCN; - *len = hint; - err = -EOPNOTSUPP; + /* length to the end of frame. */ + *len = NTFS_LZNT_CLUSTERS - (vcn & (NTFS_LZNT_CLUSTERS - 1)); + err = 0; goto out; }
@@ -1028,16 +1030,16 @@ int attr_data_get_block(struct ntfs_inode *ni, CLST vcn, CLST clen, CLST *lcn,
/* Check if 'vcn' and 'vcn0' in different attribute segments. */ if (vcn < svcn || evcn1 <= vcn) { - /* Load attribute for truncated vcn. */ - attr = ni_find_attr(ni, attr_b, &le, ATTR_DATA, NULL, 0, - &vcn, &mi); - if (!attr) { + struct ATTRIB *attr2; + /* Load runs for truncated vcn. */ + attr2 = ni_find_attr(ni, attr_b, &le_b, ATTR_DATA, NULL, + 0, &vcn, &mi); + if (!attr2) { err = -EINVAL; goto out; } - svcn = le64_to_cpu(attr->nres.svcn); - evcn1 = le64_to_cpu(attr->nres.evcn) + 1; - err = attr_load_runs(attr, ni, run, NULL); + evcn1 = le64_to_cpu(attr2->nres.evcn) + 1; + err = attr_load_runs(attr2, ni, run, NULL); if (err) goto out; } @@ -1530,6 +1532,9 @@ int attr_wof_frame_info(struct ntfs_inode *ni, struct ATTRIB *attr,
/* * attr_is_frame_compressed - Used to detect compressed frame. + * + * attr - base (primary) attribute segment. + * Only base segments contains valid 'attr->nres.c_unit' */ int attr_is_frame_compressed(struct ntfs_inode *ni, struct ATTRIB *attr, CLST frame, CLST *clst_data) diff --git a/fs/ntfs3/inode.c b/fs/ntfs3/inode.c index 52b80fd15914..af7c0cbba74e 100644 --- a/fs/ntfs3/inode.c +++ b/fs/ntfs3/inode.c @@ -604,7 +604,8 @@ static noinline int ntfs_get_block_vbo(struct inode *inode, u64 vbo,
bytes = ((u64)len << cluster_bits) - off;
- if (lcn == SPARSE_LCN) { + if (lcn >= sbi->used.bitmap.nbits) { + /* This case includes resident/compressed/sparse. */ if (!create) { if (bh->b_size > bytes) bh->b_size = bytes;
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Konstantin Komarov almaz.alexandrovich@paragon-software.com
[ Upstream commit e2705dd3d16d1000f1fd8193d82447065de8c899 ]
Use local runs_tree instead of cached. This way excludes rw_semaphore lock.
Reported-by: syzbot+1c25748a40fe79b8a119@syzkaller.appspotmail.com Signed-off-by: Konstantin Komarov almaz.alexandrovich@paragon-software.com Signed-off-by: Sasha Levin sashal@kernel.org --- fs/ntfs3/attrib.c | 9 ++-- fs/ntfs3/frecord.c | 103 +++++++-------------------------------------- fs/ntfs3/ntfs_fs.h | 3 +- 3 files changed, 21 insertions(+), 94 deletions(-)
diff --git a/fs/ntfs3/attrib.c b/fs/ntfs3/attrib.c index 582628b9b796..e25989dd2c6b 100644 --- a/fs/ntfs3/attrib.c +++ b/fs/ntfs3/attrib.c @@ -978,7 +978,7 @@ int attr_data_get_block(struct ntfs_inode *ni, CLST vcn, CLST clen, CLST *lcn,
/* Check for compressed frame. */ err = attr_is_frame_compressed(ni, attr_b, vcn >> NTFS_LZNT_CUNIT, - &hint); + &hint, run); if (err) goto out;
@@ -1534,16 +1534,16 @@ int attr_wof_frame_info(struct ntfs_inode *ni, struct ATTRIB *attr, * attr_is_frame_compressed - Used to detect compressed frame. * * attr - base (primary) attribute segment. + * run - run to use, usually == &ni->file.run. * Only base segments contains valid 'attr->nres.c_unit' */ int attr_is_frame_compressed(struct ntfs_inode *ni, struct ATTRIB *attr, - CLST frame, CLST *clst_data) + CLST frame, CLST *clst_data, struct runs_tree *run) { int err; u32 clst_frame; CLST clen, lcn, vcn, alen, slen, vcn_next; size_t idx; - struct runs_tree *run;
*clst_data = 0;
@@ -1555,7 +1555,6 @@ int attr_is_frame_compressed(struct ntfs_inode *ni, struct ATTRIB *attr,
clst_frame = 1u << attr->nres.c_unit; vcn = frame * clst_frame; - run = &ni->file.run;
if (!run_lookup_entry(run, vcn, &lcn, &clen, &idx)) { err = attr_load_runs_vcn(ni, attr->type, attr_name(attr), @@ -1691,7 +1690,7 @@ int attr_allocate_frame(struct ntfs_inode *ni, CLST frame, size_t compr_size, if (err) goto out;
- err = attr_is_frame_compressed(ni, attr_b, frame, &clst_data); + err = attr_is_frame_compressed(ni, attr_b, frame, &clst_data, run); if (err) goto out;
diff --git a/fs/ntfs3/frecord.c b/fs/ntfs3/frecord.c index 12e03feb3074..3c876c468c2c 100644 --- a/fs/ntfs3/frecord.c +++ b/fs/ntfs3/frecord.c @@ -1900,46 +1900,6 @@ enum REPARSE_SIGN ni_parse_reparse(struct ntfs_inode *ni, struct ATTRIB *attr, return REPARSE_LINK; }
-/* - * fiemap_fill_next_extent_k - a copy of fiemap_fill_next_extent - * but it uses 'fe_k' instead of fieinfo->fi_extents_start - */ -static int fiemap_fill_next_extent_k(struct fiemap_extent_info *fieinfo, - struct fiemap_extent *fe_k, u64 logical, - u64 phys, u64 len, u32 flags) -{ - struct fiemap_extent extent; - - /* only count the extents */ - if (fieinfo->fi_extents_max == 0) { - fieinfo->fi_extents_mapped++; - return (flags & FIEMAP_EXTENT_LAST) ? 1 : 0; - } - - if (fieinfo->fi_extents_mapped >= fieinfo->fi_extents_max) - return 1; - - if (flags & FIEMAP_EXTENT_DELALLOC) - flags |= FIEMAP_EXTENT_UNKNOWN; - if (flags & FIEMAP_EXTENT_DATA_ENCRYPTED) - flags |= FIEMAP_EXTENT_ENCODED; - if (flags & (FIEMAP_EXTENT_DATA_TAIL | FIEMAP_EXTENT_DATA_INLINE)) - flags |= FIEMAP_EXTENT_NOT_ALIGNED; - - memset(&extent, 0, sizeof(extent)); - extent.fe_logical = logical; - extent.fe_physical = phys; - extent.fe_length = len; - extent.fe_flags = flags; - - memcpy(fe_k + fieinfo->fi_extents_mapped, &extent, sizeof(extent)); - - fieinfo->fi_extents_mapped++; - if (fieinfo->fi_extents_mapped == fieinfo->fi_extents_max) - return 1; - return (flags & FIEMAP_EXTENT_LAST) ? 1 : 0; -} - /* * ni_fiemap - Helper for file_fiemap(). * @@ -1950,11 +1910,9 @@ int ni_fiemap(struct ntfs_inode *ni, struct fiemap_extent_info *fieinfo, __u64 vbo, __u64 len) { int err = 0; - struct fiemap_extent *fe_k = NULL; struct ntfs_sb_info *sbi = ni->mi.sbi; u8 cluster_bits = sbi->cluster_bits; - struct runs_tree *run; - struct rw_semaphore *run_lock; + struct runs_tree run; struct ATTRIB *attr; CLST vcn = vbo >> cluster_bits; CLST lcn, clen; @@ -1965,13 +1923,11 @@ int ni_fiemap(struct ntfs_inode *ni, struct fiemap_extent_info *fieinfo, u32 flags; bool ok;
+ run_init(&run); if (S_ISDIR(ni->vfs_inode.i_mode)) { - run = &ni->dir.alloc_run; attr = ni_find_attr(ni, NULL, NULL, ATTR_ALLOC, I30_NAME, ARRAY_SIZE(I30_NAME), NULL, NULL); - run_lock = &ni->dir.run_lock; } else { - run = &ni->file.run; attr = ni_find_attr(ni, NULL, NULL, ATTR_DATA, NULL, 0, NULL, NULL); if (!attr) { @@ -1986,7 +1942,6 @@ int ni_fiemap(struct ntfs_inode *ni, struct fiemap_extent_info *fieinfo, "fiemap is not supported for compressed file (cp -r)"); goto out; } - run_lock = &ni->file.run_lock; }
if (!attr || !attr->non_res) { @@ -1998,51 +1953,33 @@ int ni_fiemap(struct ntfs_inode *ni, struct fiemap_extent_info *fieinfo, goto out; }
- /* - * To avoid lock problems replace pointer to user memory by pointer to kernel memory. - */ - fe_k = kmalloc_array(fieinfo->fi_extents_max, - sizeof(struct fiemap_extent), - GFP_NOFS | __GFP_ZERO); - if (!fe_k) { - err = -ENOMEM; - goto out; - } - end = vbo + len; alloc_size = le64_to_cpu(attr->nres.alloc_size); if (end > alloc_size) end = alloc_size;
- down_read(run_lock);
while (vbo < end) { if (idx == -1) { - ok = run_lookup_entry(run, vcn, &lcn, &clen, &idx); + ok = run_lookup_entry(&run, vcn, &lcn, &clen, &idx); } else { CLST vcn_next = vcn;
- ok = run_get_entry(run, ++idx, &vcn, &lcn, &clen) && + ok = run_get_entry(&run, ++idx, &vcn, &lcn, &clen) && vcn == vcn_next; if (!ok) vcn = vcn_next; }
if (!ok) { - up_read(run_lock); - down_write(run_lock); - err = attr_load_runs_vcn(ni, attr->type, attr_name(attr), - attr->name_len, run, vcn); - - up_write(run_lock); - down_read(run_lock); + attr->name_len, &run, vcn);
if (err) break;
- ok = run_lookup_entry(run, vcn, &lcn, &clen, &idx); + ok = run_lookup_entry(&run, vcn, &lcn, &clen, &idx);
if (!ok) { err = -EINVAL; @@ -2067,8 +2004,9 @@ int ni_fiemap(struct ntfs_inode *ni, struct fiemap_extent_info *fieinfo, } else if (is_attr_compressed(attr)) { CLST clst_data;
- err = attr_is_frame_compressed( - ni, attr, vcn >> attr->nres.c_unit, &clst_data); + err = attr_is_frame_compressed(ni, attr, + vcn >> attr->nres.c_unit, + &clst_data, &run); if (err) break; if (clst_data < NTFS_LZNT_CLUSTERS) @@ -2097,8 +2035,8 @@ int ni_fiemap(struct ntfs_inode *ni, struct fiemap_extent_info *fieinfo, if (vbo + dlen >= end) flags |= FIEMAP_EXTENT_LAST;
- err = fiemap_fill_next_extent_k(fieinfo, fe_k, vbo, lbo, - dlen, flags); + err = fiemap_fill_next_extent(fieinfo, vbo, lbo, dlen, + flags);
if (err < 0) break; @@ -2119,8 +2057,7 @@ int ni_fiemap(struct ntfs_inode *ni, struct fiemap_extent_info *fieinfo, if (vbo + bytes >= end) flags |= FIEMAP_EXTENT_LAST;
- err = fiemap_fill_next_extent_k(fieinfo, fe_k, vbo, lbo, bytes, - flags); + err = fiemap_fill_next_extent(fieinfo, vbo, lbo, bytes, flags); if (err < 0) break; if (err == 1) { @@ -2131,19 +2068,8 @@ int ni_fiemap(struct ntfs_inode *ni, struct fiemap_extent_info *fieinfo, vbo += bytes; }
- up_read(run_lock); - - /* - * Copy to user memory out of lock - */ - if (copy_to_user(fieinfo->fi_extents_start, fe_k, - fieinfo->fi_extents_max * - sizeof(struct fiemap_extent))) { - err = -EFAULT; - } - out: - kfree(fe_k); + run_close(&run); return err; }
@@ -2674,7 +2600,8 @@ int ni_read_frame(struct ntfs_inode *ni, u64 frame_vbo, struct page **pages, down_write(&ni->file.run_lock); run_truncate_around(run, le64_to_cpu(attr->nres.svcn)); frame = frame_vbo >> (cluster_bits + NTFS_LZNT_CUNIT); - err = attr_is_frame_compressed(ni, attr, frame, &clst_data); + err = attr_is_frame_compressed(ni, attr, frame, &clst_data, + run); up_write(&ni->file.run_lock); if (err) goto out1; diff --git a/fs/ntfs3/ntfs_fs.h b/fs/ntfs3/ntfs_fs.h index cfe9d3bf07f9..c98e6868bfba 100644 --- a/fs/ntfs3/ntfs_fs.h +++ b/fs/ntfs3/ntfs_fs.h @@ -446,7 +446,8 @@ int attr_wof_frame_info(struct ntfs_inode *ni, struct ATTRIB *attr, struct runs_tree *run, u64 frame, u64 frames, u8 frame_bits, u32 *ondisk_size, u64 *vbo_data); int attr_is_frame_compressed(struct ntfs_inode *ni, struct ATTRIB *attr, - CLST frame, CLST *clst_data); + CLST frame, CLST *clst_data, + struct runs_tree *run); int attr_allocate_frame(struct ntfs_inode *ni, CLST frame, size_t compr_size, u64 new_valid); int attr_collapse_range(struct ntfs_inode *ni, u64 vbo, u64 bytes);
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Tomer Maimon tmaimon77@gmail.com
[ Upstream commit 2978cc1f285390c1bd4d9bfc665747adc6e4b19c ]
Adding CI_HDRC_FORCE_VBUS_ACTIVE_ALWAYS flag to modify the vbus_active parameter to active in case the ChipIdea USB IP role is device-only and there is no otgsc register.
Signed-off-by: Tomer Maimon tmaimon77@gmail.com Acked-by: Peter Chen peter.chen@kernel.org Link: https://lore.kernel.org/r/20231017195903.1665260-2-tmaimon77@gmail.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Stable-dep-of: ec841b8d73cf ("usb: chipidea: add CI_HDRC_HAS_SHORT_PKT_LIMIT flag") Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/usb/chipidea/otg.c | 5 ++++- include/linux/usb/chipidea.h | 1 + 2 files changed, 5 insertions(+), 1 deletion(-)
diff --git a/drivers/usb/chipidea/otg.c b/drivers/usb/chipidea/otg.c index f5490f2a5b6b..647e98f4e351 100644 --- a/drivers/usb/chipidea/otg.c +++ b/drivers/usb/chipidea/otg.c @@ -130,8 +130,11 @@ enum ci_role ci_otg_role(struct ci_hdrc *ci)
void ci_handle_vbus_change(struct ci_hdrc *ci) { - if (!ci->is_otg) + if (!ci->is_otg) { + if (ci->platdata->flags & CI_HDRC_FORCE_VBUS_ACTIVE_ALWAYS) + usb_gadget_vbus_connect(&ci->gadget); return; + }
if (hw_read_otgsc(ci, OTGSC_BSV) && !ci->vbus_active) usb_gadget_vbus_connect(&ci->gadget); diff --git a/include/linux/usb/chipidea.h b/include/linux/usb/chipidea.h index 0b4f2d5faa08..5a7f96684ea2 100644 --- a/include/linux/usb/chipidea.h +++ b/include/linux/usb/chipidea.h @@ -64,6 +64,7 @@ struct ci_hdrc_platform_data { #define CI_HDRC_PMQOS BIT(15) #define CI_HDRC_PHY_VBUS_CONTROL BIT(16) #define CI_HDRC_HAS_PORTSC_PEC_MISSED BIT(17) +#define CI_HDRC_FORCE_VBUS_ACTIVE_ALWAYS BIT(18) enum usb_dr_mode dr_mode; #define CI_HDRC_CONTROLLER_RESET_EVENT 0 #define CI_HDRC_CONTROLLER_STOPPED_EVENT 1
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Xu Yang xu.yang_2@nxp.com
[ Upstream commit ec841b8d73cff37f8960e209017efe1eb2fb21f2 ]
Currently, the imx deivice controller has below limitations:
1. can't generate short packet interrupt if IOC not set in dTD. So if one request span more than one dTDs and only the last dTD set IOC, the usb request will pending there if no more data comes. 2. the controller can't accurately deliver data to differtent usb requests in some cases due to short packet. For example: one usb request span 3 dTDs, then if the controller received a short packet the next packet will go to 2nd dTD of current request rather than the first dTD of next request. 3. can't build a bus packet use multiple dTDs. For example: controller needs to send one packet of 512 bytes use dTD1 (200 bytes) + dTD2 (312 bytes), actually the host side will see 200 bytes short packet.
Based on these limits, add CI_HDRC_HAS_SHORT_PKT_LIMIT flag and use it on imx platforms.
Signed-off-by: Xu Yang xu.yang_2@nxp.com Acked-by: Peter Chen peter.chen@kernel.org Link: https://lore.kernel.org/r/20240923081203.2851768-1-xu.yang_2@nxp.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/usb/chipidea/ci.h | 1 + drivers/usb/chipidea/ci_hdrc_imx.c | 1 + drivers/usb/chipidea/core.c | 2 ++ include/linux/usb/chipidea.h | 1 + 4 files changed, 5 insertions(+)
diff --git a/drivers/usb/chipidea/ci.h b/drivers/usb/chipidea/ci.h index 2a38e1eb6546..e4b003d060c2 100644 --- a/drivers/usb/chipidea/ci.h +++ b/drivers/usb/chipidea/ci.h @@ -260,6 +260,7 @@ struct ci_hdrc { bool b_sess_valid_event; bool imx28_write_fix; bool has_portsc_pec_bug; + bool has_short_pkt_limit; bool supports_runtime_pm; bool in_lpm; bool wakeup_int; diff --git a/drivers/usb/chipidea/ci_hdrc_imx.c b/drivers/usb/chipidea/ci_hdrc_imx.c index e28bb2f2612d..477af457c1a1 100644 --- a/drivers/usb/chipidea/ci_hdrc_imx.c +++ b/drivers/usb/chipidea/ci_hdrc_imx.c @@ -334,6 +334,7 @@ static int ci_hdrc_imx_probe(struct platform_device *pdev) struct ci_hdrc_platform_data pdata = { .name = dev_name(&pdev->dev), .capoffset = DEF_CAPOFFSET, + .flags = CI_HDRC_HAS_SHORT_PKT_LIMIT, .notify_event = ci_hdrc_imx_notify_event, }; int ret; diff --git a/drivers/usb/chipidea/core.c b/drivers/usb/chipidea/core.c index ca71df4f32e4..c161a4ee5290 100644 --- a/drivers/usb/chipidea/core.c +++ b/drivers/usb/chipidea/core.c @@ -1076,6 +1076,8 @@ static int ci_hdrc_probe(struct platform_device *pdev) CI_HDRC_SUPPORTS_RUNTIME_PM); ci->has_portsc_pec_bug = !!(ci->platdata->flags & CI_HDRC_HAS_PORTSC_PEC_MISSED); + ci->has_short_pkt_limit = !!(ci->platdata->flags & + CI_HDRC_HAS_SHORT_PKT_LIMIT); platform_set_drvdata(pdev, ci);
ret = hw_device_init(ci, base); diff --git a/include/linux/usb/chipidea.h b/include/linux/usb/chipidea.h index 5a7f96684ea2..ebdfef124b2b 100644 --- a/include/linux/usb/chipidea.h +++ b/include/linux/usb/chipidea.h @@ -65,6 +65,7 @@ struct ci_hdrc_platform_data { #define CI_HDRC_PHY_VBUS_CONTROL BIT(16) #define CI_HDRC_HAS_PORTSC_PEC_MISSED BIT(17) #define CI_HDRC_FORCE_VBUS_ACTIVE_ALWAYS BIT(18) +#define CI_HDRC_HAS_SHORT_PKT_LIMIT BIT(19) enum usb_dr_mode dr_mode; #define CI_HDRC_CONTROLLER_RESET_EVENT 0 #define CI_HDRC_CONTROLLER_STOPPED_EVENT 1
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Xu Yang xu.yang_2@nxp.com
[ Upstream commit ca8d18aa7b0f22d66a3ca9a90d8f73431b8eca89 ]
To let the device controller work properly on short packet limitations, one usb request should only correspond to one dTD. Then every dTD will set IOC. In theory, each dTD support up to 20KB data transfer if the offset is 0. Due to we cannot predetermine the offset, this will limit the usb request length to max 16KB. This should be fine since most of the user transfer data based on this size policy.
Signed-off-by: Xu Yang xu.yang_2@nxp.com Acked-by: Peter Chen peter.chen@kernel.org Link: https://lore.kernel.org/r/20240923081203.2851768-2-xu.yang_2@nxp.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/usb/chipidea/ci.h | 1 + drivers/usb/chipidea/udc.c | 6 ++++++ 2 files changed, 7 insertions(+)
diff --git a/drivers/usb/chipidea/ci.h b/drivers/usb/chipidea/ci.h index e4b003d060c2..97437de52ef6 100644 --- a/drivers/usb/chipidea/ci.h +++ b/drivers/usb/chipidea/ci.h @@ -25,6 +25,7 @@ #define TD_PAGE_COUNT 5 #define CI_HDRC_PAGE_SIZE 4096ul /* page size for TD's */ #define ENDPT_MAX 32 +#define CI_MAX_REQ_SIZE (4 * CI_HDRC_PAGE_SIZE) #define CI_MAX_BUF_SIZE (TD_PAGE_COUNT * CI_HDRC_PAGE_SIZE)
/****************************************************************************** diff --git a/drivers/usb/chipidea/udc.c b/drivers/usb/chipidea/udc.c index 9f7d003e467b..f2ae5f4c5828 100644 --- a/drivers/usb/chipidea/udc.c +++ b/drivers/usb/chipidea/udc.c @@ -959,6 +959,12 @@ static int _ep_queue(struct usb_ep *ep, struct usb_request *req, return -EMSGSIZE; }
+ if (ci->has_short_pkt_limit && + hwreq->req.length > CI_MAX_REQ_SIZE) { + dev_err(hwep->ci->dev, "request length too big (max 16KB)\n"); + return -EMSGSIZE; + } + /* first nuke then test link, e.g. previous status has not sent */ if (!list_empty(&hwreq->queue)) { dev_err(hwep->ci->dev, "request already in queue\n");
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jonathan Cameron Jonathan.Cameron@huawei.com
[ Upstream commit c3708c829a0662af429897a90aed46b70f14a50b ]
Enables use of with other firmwware types. Removes a case of device tree specific handlers that might get copied into new drivers.
Cc: Alisa-Dariana Roman alisa.roman@analog.com Reviewed-by: Andy Shevchenko andriy.shevchenko@linux.intel.com Link: https://lore.kernel.org/r/20240218172731.1023367-6-jic23@kernel.org Signed-off-by: Jonathan Cameron Jonathan.Cameron@huawei.com Stable-dep-of: b7f99fa1b64a ("iio: adc: ad7192: properly check spi_get_device_match_data()") Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/iio/adc/ad7192.c | 38 +++++++++++++++++++------------------- 1 file changed, 19 insertions(+), 19 deletions(-)
diff --git a/drivers/iio/adc/ad7192.c b/drivers/iio/adc/ad7192.c index b64fd365f83f..ecaf87af539b 100644 --- a/drivers/iio/adc/ad7192.c +++ b/drivers/iio/adc/ad7192.c @@ -16,7 +16,9 @@ #include <linux/err.h> #include <linux/sched.h> #include <linux/delay.h> -#include <linux/of.h> +#include <linux/module.h> +#include <linux/mod_devicetable.h> +#include <linux/property.h>
#include <linux/iio/iio.h> #include <linux/iio/sysfs.h> @@ -360,19 +362,19 @@ static inline bool ad7192_valid_external_frequency(u32 freq) freq <= AD7192_EXT_FREQ_MHZ_MAX); }
-static int ad7192_of_clock_select(struct ad7192_state *st) +static int ad7192_clock_select(struct ad7192_state *st) { - struct device_node *np = st->sd.spi->dev.of_node; + struct device *dev = &st->sd.spi->dev; unsigned int clock_sel;
clock_sel = AD7192_CLK_INT;
/* use internal clock */ if (!st->mclk) { - if (of_property_read_bool(np, "adi,int-clock-output-enable")) + if (device_property_read_bool(dev, "adi,int-clock-output-enable")) clock_sel = AD7192_CLK_INT_CO; } else { - if (of_property_read_bool(np, "adi,clock-xtal")) + if (device_property_read_bool(dev, "adi,clock-xtal")) clock_sel = AD7192_CLK_EXT_MCLK1_2; else clock_sel = AD7192_CLK_EXT_MCLK2; @@ -381,7 +383,7 @@ static int ad7192_of_clock_select(struct ad7192_state *st) return clock_sel; }
-static int ad7192_setup(struct iio_dev *indio_dev, struct device_node *np) +static int ad7192_setup(struct iio_dev *indio_dev, struct device *dev) { struct ad7192_state *st = iio_priv(indio_dev); bool rej60_en, refin2_en; @@ -403,7 +405,7 @@ static int ad7192_setup(struct iio_dev *indio_dev, struct device_node *np) id &= AD7192_ID_MASK;
if (id != st->chip_info->chip_id) - dev_warn(&st->sd.spi->dev, "device ID query failed (0x%X != 0x%X)\n", + dev_warn(dev, "device ID query failed (0x%X != 0x%X)\n", id, st->chip_info->chip_id);
st->mode = AD7192_MODE_SEL(AD7192_MODE_IDLE) | @@ -412,31 +414,31 @@ static int ad7192_setup(struct iio_dev *indio_dev, struct device_node *np)
st->conf = AD7192_CONF_GAIN(0);
- rej60_en = of_property_read_bool(np, "adi,rejection-60-Hz-enable"); + rej60_en = device_property_read_bool(dev, "adi,rejection-60-Hz-enable"); if (rej60_en) st->mode |= AD7192_MODE_REJ60;
- refin2_en = of_property_read_bool(np, "adi,refin2-pins-enable"); + refin2_en = device_property_read_bool(dev, "adi,refin2-pins-enable"); if (refin2_en && st->chip_info->chip_id != CHIPID_AD7195) st->conf |= AD7192_CONF_REFSEL;
st->conf &= ~AD7192_CONF_CHOP; st->f_order = AD7192_NO_SYNC_FILTER;
- buf_en = of_property_read_bool(np, "adi,buffer-enable"); + buf_en = device_property_read_bool(dev, "adi,buffer-enable"); if (buf_en) st->conf |= AD7192_CONF_BUF;
- bipolar = of_property_read_bool(np, "bipolar"); + bipolar = device_property_read_bool(dev, "bipolar"); if (!bipolar) st->conf |= AD7192_CONF_UNIPOLAR;
- burnout_curr_en = of_property_read_bool(np, - "adi,burnout-currents-enable"); + burnout_curr_en = device_property_read_bool(dev, + "adi,burnout-currents-enable"); if (burnout_curr_en && buf_en) { st->conf |= AD7192_CONF_BURN; } else if (burnout_curr_en) { - dev_warn(&st->sd.spi->dev, + dev_warn(dev, "Can't enable burnout currents: see CHOP or buffer\n"); }
@@ -1036,9 +1038,7 @@ static int ad7192_probe(struct spi_device *spi) } st->int_vref_mv = ret / 1000;
- st->chip_info = of_device_get_match_data(&spi->dev); - if (!st->chip_info) - st->chip_info = (void *)spi_get_device_id(spi)->driver_data; + st->chip_info = spi_get_device_match_data(spi); indio_dev->name = st->chip_info->name; indio_dev->modes = INDIO_DIRECT_MODE;
@@ -1065,7 +1065,7 @@ static int ad7192_probe(struct spi_device *spi) if (IS_ERR(st->mclk)) return PTR_ERR(st->mclk);
- st->clock_sel = ad7192_of_clock_select(st); + st->clock_sel = ad7192_clock_select(st);
if (st->clock_sel == AD7192_CLK_EXT_MCLK1_2 || st->clock_sel == AD7192_CLK_EXT_MCLK2) { @@ -1077,7 +1077,7 @@ static int ad7192_probe(struct spi_device *spi) } }
- ret = ad7192_setup(indio_dev, spi->dev.of_node); + ret = ad7192_setup(indio_dev, &spi->dev); if (ret) return ret;
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Nuno Sa nuno.sa@analog.com
[ Upstream commit b7f99fa1b64af2f696b13cec581cb4cd7d3982b8 ]
spi_get_device_match_data() can return a NULL pointer. Hence, let's check for it.
Signed-off-by: Nuno Sa nuno.sa@analog.com Link: https://patch.msgid.link/20241014-fix-error-check-v1-1-089e1003d12f@analog.c... Signed-off-by: Jonathan Cameron Jonathan.Cameron@huawei.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/iio/adc/ad7192.c | 3 +++ 1 file changed, 3 insertions(+)
diff --git a/drivers/iio/adc/ad7192.c b/drivers/iio/adc/ad7192.c index ecaf87af539b..fa6810aa6a4a 100644 --- a/drivers/iio/adc/ad7192.c +++ b/drivers/iio/adc/ad7192.c @@ -1039,6 +1039,9 @@ static int ad7192_probe(struct spi_device *spi) st->int_vref_mv = ret / 1000;
st->chip_info = spi_get_device_match_data(spi); + if (!st->chip_info) + return -ENODEV; + indio_dev->name = st->chip_info->name; indio_dev->modes = INDIO_DIRECT_MODE;
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Dmitry Baryshkov dmitry.baryshkov@linaro.org
[ Upstream commit 24bce22d09ec8e67022aab9a888acb56fb7a996a ]
Allow UCSI glue driver to perform addtional work to update connector status. For example, it might check the cable orientation. This call is performed after reading new connector statatus, so the platform driver can peek at new connection status bits.
The callback is called both when registering the port and when the connector change event is being handled.
Signed-off-by: Dmitry Baryshkov dmitry.baryshkov@linaro.org Reviewed-by: Heikki Krogerus heikki.krogerus@linux.intel.com Link: https://lore.kernel.org/r/20240411-ucsi-orient-aware-v2-1-d4b1cb22a33f@linar... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Stable-dep-of: de9df030ccb5 ("usb: typec: ucsi: glink: be more precise on orientation-aware ports") Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/usb/typec/ucsi/ucsi.c | 6 ++++++ drivers/usb/typec/ucsi/ucsi.h | 3 +++ 2 files changed, 9 insertions(+)
diff --git a/drivers/usb/typec/ucsi/ucsi.c b/drivers/usb/typec/ucsi/ucsi.c index f6fb5575d4f0..3f7039a711c7 100644 --- a/drivers/usb/typec/ucsi/ucsi.c +++ b/drivers/usb/typec/ucsi/ucsi.c @@ -903,6 +903,9 @@ static void ucsi_handle_connector_change(struct work_struct *work)
trace_ucsi_connector_change(con->num, &con->status);
+ if (ucsi->ops->connector_status) + ucsi->ops->connector_status(con); + role = !!(con->status.flags & UCSI_CONSTAT_PWR_DIR);
if (con->status.change & UCSI_CONSTAT_POWER_DIR_CHANGE) { @@ -1322,6 +1325,9 @@ static int ucsi_register_port(struct ucsi *ucsi, struct ucsi_connector *con) } ret = 0; /* ucsi_send_command() returns length on success */
+ if (ucsi->ops->connector_status) + ucsi->ops->connector_status(con); + switch (UCSI_CONSTAT_PARTNER_TYPE(con->status.flags)) { case UCSI_CONSTAT_PARTNER_TYPE_UFP: case UCSI_CONSTAT_PARTNER_TYPE_CABLE_AND_UFP: diff --git a/drivers/usb/typec/ucsi/ucsi.h b/drivers/usb/typec/ucsi/ucsi.h index 42c60eba5fb6..3d23b52cf5a9 100644 --- a/drivers/usb/typec/ucsi/ucsi.h +++ b/drivers/usb/typec/ucsi/ucsi.h @@ -15,6 +15,7 @@
struct ucsi; struct ucsi_altmode; +struct ucsi_connector; struct dentry;
/* UCSI offsets (Bytes) */ @@ -52,6 +53,7 @@ struct dentry; * @sync_write: Blocking write operation * @async_write: Non-blocking write operation * @update_altmodes: Squashes duplicate DP altmodes + * @connector_status: Updates connector status, called holding connector lock * * Read and write routines for UCSI interface. @sync_write must wait for the * Command Completion Event from the PPM before returning, and @async_write must @@ -66,6 +68,7 @@ struct ucsi_operations { const void *val, size_t val_len); bool (*update_altmodes)(struct ucsi *ucsi, struct ucsi_altmode *orig, struct ucsi_altmode *updated); + void (*connector_status)(struct ucsi_connector *con); };
struct ucsi *ucsi_create(struct device *dev, const struct ucsi_operations *ops);
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Dmitry Baryshkov dmitry.baryshkov@linaro.org
[ Upstream commit 76716fd5bf09725c2c6825264147f16c21e56853 ]
To simplify the platform code move Type-C orientation handling into the connector_status callback. As it is called both during connector registration and on connector change events, duplicated code from pmic_glink_ucsi_register() can be dropped.
Also this moves operations that can sleep into a worker thread, removing the only sleeping operation from pmic_glink_ucsi_notify().
Tested-by: Krishna Kurapati quic_kriskura@quicinc.com Signed-off-by: Dmitry Baryshkov dmitry.baryshkov@linaro.org Reviewed-by: Heikki Krogerus heikki.krogeurs@linux.intel.com Link: https://lore.kernel.org/r/20240411-ucsi-orient-aware-v2-2-d4b1cb22a33f@linar... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Stable-dep-of: de9df030ccb5 ("usb: typec: ucsi: glink: be more precise on orientation-aware ports") Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/usb/typec/ucsi/ucsi_glink.c | 48 ++++++++++++----------------- 1 file changed, 20 insertions(+), 28 deletions(-)
diff --git a/drivers/usb/typec/ucsi/ucsi_glink.c b/drivers/usb/typec/ucsi/ucsi_glink.c index 94f2df02f06e..4c9352cdd641 100644 --- a/drivers/usb/typec/ucsi/ucsi_glink.c +++ b/drivers/usb/typec/ucsi/ucsi_glink.c @@ -186,10 +186,28 @@ static int pmic_glink_ucsi_sync_write(struct ucsi *__ucsi, unsigned int offset, return ret; }
+static void pmic_glink_ucsi_connector_status(struct ucsi_connector *con) +{ + struct pmic_glink_ucsi *ucsi = ucsi_get_drvdata(con->ucsi); + int orientation; + + if (con->num >= PMIC_GLINK_MAX_PORTS || + !ucsi->port_orientation[con->num - 1]) + return; + + orientation = gpiod_get_value(ucsi->port_orientation[con->num - 1]); + if (orientation >= 0) { + typec_switch_set(ucsi->port_switch[con->num - 1], + orientation ? TYPEC_ORIENTATION_REVERSE + : TYPEC_ORIENTATION_NORMAL); + } +} + static const struct ucsi_operations pmic_glink_ucsi_ops = { .read = pmic_glink_ucsi_read, .sync_write = pmic_glink_ucsi_sync_write, - .async_write = pmic_glink_ucsi_async_write + .async_write = pmic_glink_ucsi_async_write, + .connector_status = pmic_glink_ucsi_connector_status, };
static void pmic_glink_ucsi_read_ack(struct pmic_glink_ucsi *ucsi, const void *data, int len) @@ -228,20 +246,8 @@ static void pmic_glink_ucsi_notify(struct work_struct *work) }
con_num = UCSI_CCI_CONNECTOR(cci); - if (con_num) { - if (con_num <= PMIC_GLINK_MAX_PORTS && - ucsi->port_orientation[con_num - 1]) { - int orientation = gpiod_get_value(ucsi->port_orientation[con_num - 1]); - - if (orientation >= 0) { - typec_switch_set(ucsi->port_switch[con_num - 1], - orientation ? TYPEC_ORIENTATION_REVERSE - : TYPEC_ORIENTATION_NORMAL); - } - } - + if (con_num) ucsi_connector_change(ucsi->ucsi, con_num); - }
if (ucsi->sync_pending && (cci & (UCSI_CCI_ACK_COMPLETE | UCSI_CCI_COMMAND_COMPLETE))) { @@ -252,20 +258,6 @@ static void pmic_glink_ucsi_notify(struct work_struct *work) static void pmic_glink_ucsi_register(struct work_struct *work) { struct pmic_glink_ucsi *ucsi = container_of(work, struct pmic_glink_ucsi, register_work); - int orientation; - int i; - - for (i = 0; i < PMIC_GLINK_MAX_PORTS; i++) { - if (!ucsi->port_orientation[i]) - continue; - orientation = gpiod_get_value(ucsi->port_orientation[i]); - - if (orientation >= 0) { - typec_switch_set(ucsi->port_switch[i], - orientation ? TYPEC_ORIENTATION_REVERSE - : TYPEC_ORIENTATION_NORMAL); - } - }
ucsi_register(ucsi->ucsi); }
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Dmitry Baryshkov dmitry.baryshkov@linaro.org
[ Upstream commit 62866465196228917f233aea68de73be6cdb9fae ]
Add a callback to allow glue drivers to update the connector before registering corresponding power supply and Type-C port. In particular this is useful if glue drivers want to touch the connector's Type-C capabilities structure.
Signed-off-by: Dmitry Baryshkov dmitry.baryshkov@linaro.org Reviewed-by: Heikki Krogerus heikki.krogerus@linux.intel.com Link: https://lore.kernel.org/r/20240411-ucsi-orient-aware-v2-4-d4b1cb22a33f@linar... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Stable-dep-of: de9df030ccb5 ("usb: typec: ucsi: glink: be more precise on orientation-aware ports") Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/usb/typec/ucsi/ucsi.c | 3 +++ drivers/usb/typec/ucsi/ucsi.h | 2 ++ 2 files changed, 5 insertions(+)
diff --git a/drivers/usb/typec/ucsi/ucsi.c b/drivers/usb/typec/ucsi/ucsi.c index 3f7039a711c7..d6a3fd00c3a5 100644 --- a/drivers/usb/typec/ucsi/ucsi.c +++ b/drivers/usb/typec/ucsi/ucsi.c @@ -1261,6 +1261,9 @@ static int ucsi_register_port(struct ucsi *ucsi, struct ucsi_connector *con) cap->driver_data = con; cap->ops = &ucsi_ops;
+ if (ucsi->ops->update_connector) + ucsi->ops->update_connector(con); + ret = ucsi_register_port_psy(con); if (ret) goto out; diff --git a/drivers/usb/typec/ucsi/ucsi.h b/drivers/usb/typec/ucsi/ucsi.h index 3d23b52cf5a9..921ef0e115cf 100644 --- a/drivers/usb/typec/ucsi/ucsi.h +++ b/drivers/usb/typec/ucsi/ucsi.h @@ -53,6 +53,7 @@ struct dentry; * @sync_write: Blocking write operation * @async_write: Non-blocking write operation * @update_altmodes: Squashes duplicate DP altmodes + * @update_connector: Update connector capabilities before registering * @connector_status: Updates connector status, called holding connector lock * * Read and write routines for UCSI interface. @sync_write must wait for the @@ -68,6 +69,7 @@ struct ucsi_operations { const void *val, size_t val_len); bool (*update_altmodes)(struct ucsi *ucsi, struct ucsi_altmode *orig, struct ucsi_altmode *updated); + void (*update_connector)(struct ucsi_connector *con); void (*connector_status)(struct ucsi_connector *con); };
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Dmitry Baryshkov dmitry.baryshkov@linaro.org
[ Upstream commit 3d1b6c9d47707d6a0f80bb5db6473b1f107b5baf ]
If the PMIC-GLINK device has orientation GPIOs declared, then it will report connection orientation. In this case set the flag to mark registered ports as orientation-aware.
Signed-off-by: Dmitry Baryshkov dmitry.baryshkov@linaro.org Reviewed-by: Heikki Krogerus heikki.krogerus@linux.intel.com Link: https://lore.kernel.org/r/20240411-ucsi-orient-aware-v2-5-d4b1cb22a33f@linar... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Stable-dep-of: de9df030ccb5 ("usb: typec: ucsi: glink: be more precise on orientation-aware ports") Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/usb/typec/ucsi/ucsi_glink.c | 12 ++++++++++++ 1 file changed, 12 insertions(+)
diff --git a/drivers/usb/typec/ucsi/ucsi_glink.c b/drivers/usb/typec/ucsi/ucsi_glink.c index 4c9352cdd641..f6c3af5846e6 100644 --- a/drivers/usb/typec/ucsi/ucsi_glink.c +++ b/drivers/usb/typec/ucsi/ucsi_glink.c @@ -186,6 +186,17 @@ static int pmic_glink_ucsi_sync_write(struct ucsi *__ucsi, unsigned int offset, return ret; }
+static void pmic_glink_ucsi_update_connector(struct ucsi_connector *con) +{ + struct pmic_glink_ucsi *ucsi = ucsi_get_drvdata(con->ucsi); + int i; + + for (i = 0; i < PMIC_GLINK_MAX_PORTS; i++) { + if (ucsi->port_orientation[i]) + con->typec_cap.orientation_aware = true; + } +} + static void pmic_glink_ucsi_connector_status(struct ucsi_connector *con) { struct pmic_glink_ucsi *ucsi = ucsi_get_drvdata(con->ucsi); @@ -207,6 +218,7 @@ static const struct ucsi_operations pmic_glink_ucsi_ops = { .read = pmic_glink_ucsi_read, .sync_write = pmic_glink_ucsi_sync_write, .async_write = pmic_glink_ucsi_async_write, + .update_connector = pmic_glink_ucsi_update_connector, .connector_status = pmic_glink_ucsi_connector_status, };
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Dmitry Baryshkov dmitry.baryshkov@linaro.org
[ Upstream commit de9df030ccb5d3e31ee0c715d74cd77c619748f8 ]
Instead of checking if any of the USB-C ports have orientation GPIO and thus is orientation-aware, check for the GPIO for the port being registered. There are no boards that are affected by this change at this moment, so the patch is not marked as a fix, but it might affect other boards in future.
Reviewed-by: Abel Vesa abel.vesa@linaro.org Reviewed-by: Neil Armstrong neil.armstrong@linaro.org Reviewed-by: Johan Hovold johan+linaro@kernel.org Tested-by: Johan Hovold johan+linaro@kernel.org Signed-off-by: Dmitry Baryshkov dmitry.baryshkov@linaro.org Link: https://lore.kernel.org/r/20241109-ucsi-glue-fixes-v2-2-8b21ff4f9fbe@linaro.... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/usb/typec/ucsi/ucsi_glink.c | 10 +++++----- 1 file changed, 5 insertions(+), 5 deletions(-)
diff --git a/drivers/usb/typec/ucsi/ucsi_glink.c b/drivers/usb/typec/ucsi/ucsi_glink.c index f6c3af5846e6..f0b4d0a4bb19 100644 --- a/drivers/usb/typec/ucsi/ucsi_glink.c +++ b/drivers/usb/typec/ucsi/ucsi_glink.c @@ -189,12 +189,12 @@ static int pmic_glink_ucsi_sync_write(struct ucsi *__ucsi, unsigned int offset, static void pmic_glink_ucsi_update_connector(struct ucsi_connector *con) { struct pmic_glink_ucsi *ucsi = ucsi_get_drvdata(con->ucsi); - int i;
- for (i = 0; i < PMIC_GLINK_MAX_PORTS; i++) { - if (ucsi->port_orientation[i]) - con->typec_cap.orientation_aware = true; - } + if (con->num > PMIC_GLINK_MAX_PORTS || + !ucsi->port_orientation[con->num - 1]) + return; + + con->typec_cap.orientation_aware = true; }
static void pmic_glink_ucsi_connector_status(struct ucsi_connector *con)
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Nilay Shroff nilay@linux.ibm.com
[ Upstream commit 599d9f3a10eec69ef28a90161763e4bd7c9c02bf ]
We no more need acquiring ctrl->lock before accessing the NVMe controller state and instead we can now use the helper nvme_ctrl_state. So replace the use of ctrl->lock from nvme_keep_alive_finish function with nvme_ctrl_state call.
Reviewed-by: Christoph Hellwig hch@lst.de Signed-off-by: Nilay Shroff nilay@linux.ibm.com Signed-off-by: Keith Busch kbusch@kernel.org Stable-dep-of: 84488282166d ("Revert "nvme: make keep-alive synchronous operation"") Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/nvme/host/core.c | 10 ++-------- 1 file changed, 2 insertions(+), 8 deletions(-)
diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c index 5b6a6bd4e6e8..ae494c799fc5 100644 --- a/drivers/nvme/host/core.c +++ b/drivers/nvme/host/core.c @@ -1181,10 +1181,9 @@ static void nvme_queue_keep_alive_work(struct nvme_ctrl *ctrl) static void nvme_keep_alive_finish(struct request *rq, blk_status_t status, struct nvme_ctrl *ctrl) { - unsigned long flags; - bool startka = false; unsigned long rtt = jiffies - (rq->deadline - rq->timeout); unsigned long delay = nvme_keep_alive_work_period(ctrl); + enum nvme_ctrl_state state = nvme_ctrl_state(ctrl);
/* * Subtract off the keepalive RTT so nvme_keep_alive_work runs @@ -1207,12 +1206,7 @@ static void nvme_keep_alive_finish(struct request *rq,
ctrl->ka_last_check_time = jiffies; ctrl->comp_seen = false; - spin_lock_irqsave(&ctrl->lock, flags); - if (ctrl->state == NVME_CTRL_LIVE || - ctrl->state == NVME_CTRL_CONNECTING) - startka = true; - spin_unlock_irqrestore(&ctrl->lock, flags); - if (startka) + if (state == NVME_CTRL_LIVE || state == NVME_CTRL_CONNECTING) queue_delayed_work(nvme_wq, &ctrl->ka_work, delay); }
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Nilay Shroff nilay@linux.ibm.com
[ Upstream commit 84488282166de6b6760ada8030e87aaa08bce3aa ]
This reverts commit d06923670b5a5f609603d4a9fee4dec02d38de9c.
It was realized that the fix implemented to contain the race condition among the keep alive task and the fabric shutdown code path in the commit d06923670b5ia ("nvme: make keep-alive synchronous operation") is not optimal. The reason being keep-alive runs under the workqueue and making it synchronous would waste a workqueue context. Furthermore, we later found that the above race condition is a regression caused due to the changes implemented in commit a54a93d0e359 ("nvme: move stopping keep-alive into nvme_uninit_ctrl()"). So we decided to revert the commit d06923670b5a ("nvme: make keep-alive synchronous operation") and then fix the regression.
Link: https://lore.kernel.org/all/196f4013-3bbf-43ff-98b4-9cb2a96c20c2@grimberg.me... Reviewed-by: Ming Lei ming.lei@redhat.com Signed-off-by: Nilay Shroff nilay@linux.ibm.com Signed-off-by: Keith Busch kbusch@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/nvme/host/core.c | 17 ++++++++++------- 1 file changed, 10 insertions(+), 7 deletions(-)
diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c index ae494c799fc5..4aad16390d47 100644 --- a/drivers/nvme/host/core.c +++ b/drivers/nvme/host/core.c @@ -1178,9 +1178,10 @@ static void nvme_queue_keep_alive_work(struct nvme_ctrl *ctrl) nvme_keep_alive_work_period(ctrl)); }
-static void nvme_keep_alive_finish(struct request *rq, - blk_status_t status, struct nvme_ctrl *ctrl) +static enum rq_end_io_ret nvme_keep_alive_end_io(struct request *rq, + blk_status_t status) { + struct nvme_ctrl *ctrl = rq->end_io_data; unsigned long rtt = jiffies - (rq->deadline - rq->timeout); unsigned long delay = nvme_keep_alive_work_period(ctrl); enum nvme_ctrl_state state = nvme_ctrl_state(ctrl); @@ -1197,17 +1198,20 @@ static void nvme_keep_alive_finish(struct request *rq, delay = 0; }
+ blk_mq_free_request(rq); + if (status) { dev_err(ctrl->device, "failed nvme_keep_alive_end_io error=%d\n", status); - return; + return RQ_END_IO_NONE; }
ctrl->ka_last_check_time = jiffies; ctrl->comp_seen = false; if (state == NVME_CTRL_LIVE || state == NVME_CTRL_CONNECTING) queue_delayed_work(nvme_wq, &ctrl->ka_work, delay); + return RQ_END_IO_NONE; }
static void nvme_keep_alive_work(struct work_struct *work) @@ -1216,7 +1220,6 @@ static void nvme_keep_alive_work(struct work_struct *work) struct nvme_ctrl, ka_work); bool comp_seen = ctrl->comp_seen; struct request *rq; - blk_status_t status;
ctrl->ka_last_check_time = jiffies;
@@ -1239,9 +1242,9 @@ static void nvme_keep_alive_work(struct work_struct *work) nvme_init_request(rq, &ctrl->ka_cmd);
rq->timeout = ctrl->kato * HZ; - status = blk_execute_rq(rq, false); - nvme_keep_alive_finish(rq, status, ctrl); - blk_mq_free_request(rq); + rq->end_io = nvme_keep_alive_end_io; + rq->end_io_data = ctrl; + blk_execute_rq_nowait(rq, false); }
static void nvme_start_keep_alive(struct nvme_ctrl *ctrl)
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Sebastian Ott sebott@redhat.com
[ Upstream commit 25872a079bbbe952eb660249cc9f40fa75623e68 ]
Add the device name to the per device kmem_cache names to ensure their uniqueness. This fixes warnings like this: "kmem_cache of name 'mlx5_fs_fgs' already exists".
Signed-off-by: Sebastian Ott sebott@redhat.com Reviewed-by: Breno Leitao leitao@debian.org Reviewed-by: Tariq Toukan tariqt@nvidia.com Link: https://patch.msgid.link/20241023134146.28448-1-sebott@redhat.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/mellanox/mlx5/core/fs_core.c | 7 +++++-- 1 file changed, 5 insertions(+), 2 deletions(-)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/fs_core.c b/drivers/net/ethernet/mellanox/mlx5/core/fs_core.c index 991250f44c2e..474e63d02ba4 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/fs_core.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/fs_core.c @@ -3478,6 +3478,7 @@ void mlx5_fs_core_free(struct mlx5_core_dev *dev) int mlx5_fs_core_alloc(struct mlx5_core_dev *dev) { struct mlx5_flow_steering *steering; + char name[80]; int err = 0;
err = mlx5_init_fc_stats(dev); @@ -3502,10 +3503,12 @@ int mlx5_fs_core_alloc(struct mlx5_core_dev *dev) else steering->mode = MLX5_FLOW_STEERING_MODE_DMFS;
- steering->fgs_cache = kmem_cache_create("mlx5_fs_fgs", + snprintf(name, sizeof(name), "%s-mlx5_fs_fgs", dev_name(dev->device)); + steering->fgs_cache = kmem_cache_create(name, sizeof(struct mlx5_flow_group), 0, 0, NULL); - steering->ftes_cache = kmem_cache_create("mlx5_fs_ftes", sizeof(struct fs_fte), 0, + snprintf(name, sizeof(name), "%s-mlx5_fs_ftes", dev_name(dev->device)); + steering->ftes_cache = kmem_cache_create(name, sizeof(struct fs_fte), 0, 0, NULL); if (!steering->ftes_cache || !steering->fgs_cache) { err = -ENOMEM;
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: K Prateek Nayak kprateek.nayak@amd.com
[ Upstream commit 6675ce20046d149e1e1ffe7e9577947dee17aad5 ]
do_softirq_post_smp_call_flush() on PREEMPT_RT kernels carries a WARN_ON_ONCE() for any SOFTIRQ being raised from an SMP-call-function. Since do_softirq_post_smp_call_flush() is called with preempt disabled, raising a SOFTIRQ during flush_smp_call_function_queue() can lead to longer preempt disabled sections.
Since commit b2a02fc43a1f ("smp: Optimize send_call_function_single_ipi()") IPIs to an idle CPU in TIF_POLLING_NRFLAG mode can be optimized out by instead setting TIF_NEED_RESCHED bit in idle task's thread_info and relying on the flush_smp_call_function_queue() in the idle-exit path to run the SMP-call-function.
To trigger an idle load balancing, the scheduler queues nohz_csd_function() responsible for triggering an idle load balancing on a target nohz idle CPU and sends an IPI. Only now, this IPI is optimized out and the SMP-call-function is executed from flush_smp_call_function_queue() in do_idle() which can raise a SCHED_SOFTIRQ to trigger the balancing.
So far, this went undetected since, the need_resched() check in nohz_csd_function() would make it bail out of idle load balancing early as the idle thread does not clear TIF_POLLING_NRFLAG before calling flush_smp_call_function_queue(). The need_resched() check was added with the intent to catch a new task wakeup, however, it has recently discovered to be unnecessary and will be removed in the subsequent commit after which nohz_csd_function() can raise a SCHED_SOFTIRQ from flush_smp_call_function_queue() to trigger an idle load balance on an idle target in TIF_POLLING_NRFLAG mode.
nohz_csd_function() bails out early if "idle_cpu()" check for the target CPU, and does not lock the target CPU's rq until the very end, once it has found tasks to run on the CPU and will not inhibit the wakeup of, or running of a newly woken up higher priority task. Account for this and prevent a WARN_ON_ONCE() when SCHED_SOFTIRQ is raised from flush_smp_call_function_queue().
Signed-off-by: K Prateek Nayak kprateek.nayak@amd.com Signed-off-by: Peter Zijlstra (Intel) peterz@infradead.org Link: https://lore.kernel.org/r/20241119054432.6405-2-kprateek.nayak@amd.com Signed-off-by: Sasha Levin sashal@kernel.org --- kernel/softirq.c | 15 +++++++++++---- 1 file changed, 11 insertions(+), 4 deletions(-)
diff --git a/kernel/softirq.c b/kernel/softirq.c index bd9716d7bb63..f24d80cf20bd 100644 --- a/kernel/softirq.c +++ b/kernel/softirq.c @@ -279,17 +279,24 @@ static inline void invoke_softirq(void) wakeup_softirqd(); }
+#define SCHED_SOFTIRQ_MASK BIT(SCHED_SOFTIRQ) + /* * flush_smp_call_function_queue() can raise a soft interrupt in a function - * call. On RT kernels this is undesired and the only known functionality - * in the block layer which does this is disabled on RT. If soft interrupts - * get raised which haven't been raised before the flush, warn so it can be + * call. On RT kernels this is undesired and the only known functionalities + * are in the block layer which is disabled on RT, and in the scheduler for + * idle load balancing. If soft interrupts get raised which haven't been + * raised before the flush, warn if it is not a SCHED_SOFTIRQ so it can be * investigated. */ void do_softirq_post_smp_call_flush(unsigned int was_pending) { - if (WARN_ON_ONCE(was_pending != local_softirq_pending())) + unsigned int is_pending = local_softirq_pending(); + + if (unlikely(was_pending != is_pending)) { + WARN_ON_ONCE(was_pending != (is_pending & ~SCHED_SOFTIRQ_MASK)); invoke_softirq(); + } }
#else /* CONFIG_PREEMPT_RT */
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Nikita Yushchenko nikita.yoush@cogentembedded.com
[ Upstream commit 5cb099902b6b6292b3a85ffa1bb844e0ba195945 ]
When sending frame split into multiple descriptors, hardware processes descriptors one by one, including writing back DT values. The first descriptor could be already marked as completed when processing of next descriptors for the same frame is still in progress.
Although only the last descriptor is configured to generate interrupt, completion of the first descriptor could be noticed by the driver when handling interrupt for the previous frame.
Currently, driver stores skb in the entry that corresponds to the first descriptor. This results into skb could be unmapped and freed when hardware did not complete the send yet. This opens a window for corrupting the data being sent.
Fix this by saving skb in the entry that corresponds to the last descriptor used to send the frame.
Fixes: d2c96b9d5f83 ("net: rswitch: Add jumbo frames handling for TX") Signed-off-by: Nikita Yushchenko nikita.yoush@cogentembedded.com Reviewed-by: Yoshihiro Shimoda yoshihiro.shimoda.uh@renesas.com Link: https://patch.msgid.link/20241208095004.69468-2-nikita.yoush@cogentembedded.... Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/renesas/rswitch.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-)
diff --git a/drivers/net/ethernet/renesas/rswitch.c b/drivers/net/ethernet/renesas/rswitch.c index 54aa56c84133..2f483531d95c 100644 --- a/drivers/net/ethernet/renesas/rswitch.c +++ b/drivers/net/ethernet/renesas/rswitch.c @@ -1632,8 +1632,9 @@ static netdev_tx_t rswitch_start_xmit(struct sk_buff *skb, struct net_device *nd if (dma_mapping_error(ndev->dev.parent, dma_addr_orig)) goto err_kfree;
- gq->skbs[gq->cur] = skb; - gq->unmap_addrs[gq->cur] = dma_addr_orig; + /* Stored the skb at the last descriptor to avoid skb free before hardware completes send */ + gq->skbs[(gq->cur + nr_desc - 1) % gq->ring_size] = skb; + gq->unmap_addrs[(gq->cur + nr_desc - 1) % gq->ring_size] = dma_addr_orig;
dma_wmb();
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Michal Pecio michal.pecio@gmail.com
[ Upstream commit fd9d55d190c0e5fefd3a9165ea361809427885a1 ]
Two NEC uPD720200 adapters have been observed to randomly misbehave: a Stop Endpoint command fails with Context Error, the Output Context indicates Stopped state, and the endpoint keeps running. Very often, Set TR Dequeue Pointer is seen to fail next with Context Error too, in addition to problems from unexpectedly completed cancelled work.
The pathology is common on fast running isoc endpoints like uvcvideo, but has also been reproduced on a full-speed bulk endpoint of pl2303. It seems all EPs are affected, with risk proportional to their load.
Reproduction involves receiving any kind of stream and closing it to make the device driver cancel URBs already queued in advance.
Deal with it by retrying the command like in the Running state.
Signed-off-by: Michal Pecio michal.pecio@gmail.com Signed-off-by: Mathias Nyman mathias.nyman@linux.intel.com Link: https://lore.kernel.org/r/20240229141438.619372-8-mathias.nyman@linux.intel.... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Stable-dep-of: e21ebe51af68 ("xhci: Turn NEC specific quirk for handling Stop Endpoint errors generic") Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/usb/host/xhci-ring.c | 9 +++++++++ 1 file changed, 9 insertions(+)
diff --git a/drivers/usb/host/xhci-ring.c b/drivers/usb/host/xhci-ring.c index 50f588011400..7c3e39482834 100644 --- a/drivers/usb/host/xhci-ring.c +++ b/drivers/usb/host/xhci-ring.c @@ -1180,6 +1180,15 @@ static void xhci_handle_cmd_stop_ep(struct xhci_hcd *xhci, int slot_id, break; ep->ep_state &= ~EP_STOP_CMD_PENDING; return; + case EP_STATE_STOPPED: + /* + * NEC uPD720200 sometimes sets this state and fails with + * Context Error while continuing to process TRBs. + * Be conservative and trust EP_CTX_STATE on other chips. + */ + if (!(xhci->quirks & XHCI_NEC_HOST)) + break; + fallthrough; case EP_STATE_RUNNING: /* Race, HW handled stop ep cmd before ep was running */ xhci_dbg(xhci, "Stop ep completion ctx error, ep is running\n");
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Michal Pecio michal.pecio@gmail.com
[ Upstream commit 42b7581376015c1bbcbe5831f043cd0ac119d028 ]
Some host controllers fail to atomically transition an endpoint to the Running state on a doorbell ring and enter a hidden "Restarting" state, which looks very much like Stopped, with the important difference that it will spontaneously transition to Running anytime soon.
A Stop Endpoint command queued in the Restarting state typically fails with Context State Error and the completion handler sees the Endpoint Context State as either still Stopped or already Running. Even a case of Halted was observed, when an error occurred right after the restart.
The Halted state is already recovered from by resetting the endpoint. The Running state is handled by retrying Stop Endpoint.
The Stopped state was recognized as a problem on NEC controllers and worked around also by retrying, because the endpoint soon restarts and then stops for good. But there is a risk: the command may fail if the endpoint is "stopped for good" already, and retries will fail forever.
The possibility of this was not realized at the time, but a number of cases were discovered later and reproduced. Some proved difficult to deal with, and it is outright impossible to predict if an endpoint may fail to ever start at all due to a hardware bug. One such bug (albeit on ASM3142, not on NEC) was found to be reliably triggered simply by toggling an AX88179 NIC up/down in a tight loop for a few seconds.
An endless retries storm is quite nasty. Besides putting needless load on the xHC and CPU, it causes URBs never to be given back, paralyzing the device and connection/disconnection logic for the whole bus if the device is unplugged. User processes waiting for URBs become unkillable, drivers and kworker threads lock up and xhci_hcd cannot be reloaded.
For peace of mind, impose a timeout on Stop Endpoint retries in this case. If they don't succeed in 100ms, consider the endpoint stopped permanently for some reason and just give back the unlinked URBs. This failure case is rare already and work is under way to make it rarer.
Start this work today by also handling one simple case of race with Reset Endpoint, because it costs just two lines to implement.
Fixes: fd9d55d190c0 ("xhci: retry Stop Endpoint on buggy NEC controllers") CC: stable@vger.kernel.org Signed-off-by: Michal Pecio michal.pecio@gmail.com Signed-off-by: Mathias Nyman mathias.nyman@linux.intel.com Link: https://lore.kernel.org/r/20241106101459.775897-32-mathias.nyman@linux.intel... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Stable-dep-of: e21ebe51af68 ("xhci: Turn NEC specific quirk for handling Stop Endpoint errors generic") Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/usb/host/xhci-ring.c | 28 ++++++++++++++++++++++++---- drivers/usb/host/xhci.c | 2 ++ drivers/usb/host/xhci.h | 1 + 3 files changed, 27 insertions(+), 4 deletions(-)
diff --git a/drivers/usb/host/xhci-ring.c b/drivers/usb/host/xhci-ring.c index 7c3e39482834..d318310e3135 100644 --- a/drivers/usb/host/xhci-ring.c +++ b/drivers/usb/host/xhci-ring.c @@ -52,6 +52,7 @@ * endpoint rings; it generates events on the event ring for these. */
+#include <linux/jiffies.h> #include <linux/scatterlist.h> #include <linux/slab.h> #include <linux/dma-mapping.h> @@ -1182,16 +1183,35 @@ static void xhci_handle_cmd_stop_ep(struct xhci_hcd *xhci, int slot_id, return; case EP_STATE_STOPPED: /* - * NEC uPD720200 sometimes sets this state and fails with - * Context Error while continuing to process TRBs. - * Be conservative and trust EP_CTX_STATE on other chips. + * Per xHCI 4.6.9, Stop Endpoint command on a Stopped + * EP is a Context State Error, and EP stays Stopped. + * + * But maybe it failed on Halted, and somebody ran Reset + * Endpoint later. EP state is now Stopped and EP_HALTED + * still set because Reset EP handler will run after us. + */ + if (ep->ep_state & EP_HALTED) + break; + /* + * On some HCs EP state remains Stopped for some tens of + * us to a few ms or more after a doorbell ring, and any + * new Stop Endpoint fails without aborting the restart. + * This handler may run quickly enough to still see this + * Stopped state, but it will soon change to Running. + * + * Assume this bug on unexpected Stop Endpoint failures. + * Keep retrying until the EP starts and stops again, on + * chips where this is known to help. Wait for 100ms. */ if (!(xhci->quirks & XHCI_NEC_HOST)) break; + if (time_is_before_jiffies(ep->stop_time + msecs_to_jiffies(100))) + break; fallthrough; case EP_STATE_RUNNING: /* Race, HW handled stop ep cmd before ep was running */ - xhci_dbg(xhci, "Stop ep completion ctx error, ep is running\n"); + xhci_dbg(xhci, "Stop ep completion ctx error, ctx_state %d\n", + GET_EP_CTX_STATE(ep_ctx));
command = xhci_alloc_command(xhci, false, GFP_ATOMIC); if (!command) { diff --git a/drivers/usb/host/xhci.c b/drivers/usb/host/xhci.c index 3bd70e6ad64b..0b3688478ed3 100644 --- a/drivers/usb/host/xhci.c +++ b/drivers/usb/host/xhci.c @@ -8,6 +8,7 @@ * Some code borrowed from the Linux EHCI driver. */
+#include <linux/jiffies.h> #include <linux/pci.h> #include <linux/iommu.h> #include <linux/iopoll.h> @@ -1746,6 +1747,7 @@ static int xhci_urb_dequeue(struct usb_hcd *hcd, struct urb *urb, int status) ret = -ENOMEM; goto done; } + ep->stop_time = jiffies; ep->ep_state |= EP_STOP_CMD_PENDING; xhci_queue_stop_endpoint(xhci, command, urb->dev->slot_id, ep_index, 0); diff --git a/drivers/usb/host/xhci.h b/drivers/usb/host/xhci.h index 4bbd12db7239..d89010ac5f8a 100644 --- a/drivers/usb/host/xhci.h +++ b/drivers/usb/host/xhci.h @@ -717,6 +717,7 @@ struct xhci_virt_ep { /* Bandwidth checking storage */ struct xhci_bw_info bw_info; struct list_head bw_endpoint_list; + unsigned long stop_time; /* Isoch Frame ID checking storage */ int next_frame_id; /* Use new Isoch TRB layout needed for extended TBC support */
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Mathias Nyman mathias.nyman@linux.intel.com
[ Upstream commit e21ebe51af688eb98fd6269240212a3c7300deea ]
xHC hosts from several vendors have the same issue where endpoints start so slowly that a later queued 'Stop Endpoint' command may complete before endpoint is up and running.
The 'Stop Endpoint' command fails with context state error as the endpoint still appears as stopped.
See commit 42b758137601 ("usb: xhci: Limit Stop Endpoint retries") for details
CC: stable@vger.kernel.org Signed-off-by: Mathias Nyman mathias.nyman@linux.intel.com Link: https://lore.kernel.org/r/20241217102122.2316814-2-mathias.nyman@linux.intel... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/usb/host/xhci-ring.c | 2 -- 1 file changed, 2 deletions(-)
diff --git a/drivers/usb/host/xhci-ring.c b/drivers/usb/host/xhci-ring.c index d318310e3135..729319d81753 100644 --- a/drivers/usb/host/xhci-ring.c +++ b/drivers/usb/host/xhci-ring.c @@ -1203,8 +1203,6 @@ static void xhci_handle_cmd_stop_ep(struct xhci_hcd *xhci, int slot_id, * Keep retrying until the EP starts and stops again, on * chips where this is known to help. Wait for 100ms. */ - if (!(xhci->quirks & XHCI_NEC_HOST)) - break; if (time_is_before_jiffies(ep->stop_time + msecs_to_jiffies(100))) break; fallthrough;
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Mika Westerberg mika.westerberg@linux.intel.com
[ Upstream commit 2cd3da4e37453019e21a486d9de3144f46b4fdf7 ]
Intel Lunar Lake has similar integrated Thunderbolt/USB4 controller as Intel Meteor Lake with some small differences in the host router (it has 3 DP IN adapters for instance). Add the Intel Lunar Lake PCI IDs to the driver list of supported devices.
Tested-by: Pengfei Xu pengfei.xu@intel.com Signed-off-by: Mika Westerberg mika.westerberg@linux.intel.com Stable-dep-of: 8644b48714dc ("thunderbolt: Add support for Intel Panther Lake-M/P") Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/thunderbolt/nhi.c | 4 ++++ drivers/thunderbolt/nhi.h | 2 ++ 2 files changed, 6 insertions(+)
diff --git a/drivers/thunderbolt/nhi.c b/drivers/thunderbolt/nhi.c index 1ec6f9c82aef..b22023fae60d 100644 --- a/drivers/thunderbolt/nhi.c +++ b/drivers/thunderbolt/nhi.c @@ -1524,6 +1524,10 @@ static struct pci_device_id nhi_ids[] = { .driver_data = (kernel_ulong_t)&icl_nhi_ops }, { PCI_VDEVICE(INTEL, PCI_DEVICE_ID_INTEL_MTL_P_NHI1), .driver_data = (kernel_ulong_t)&icl_nhi_ops }, + { PCI_VDEVICE(INTEL, PCI_DEVICE_ID_INTEL_LNL_NHI0), + .driver_data = (kernel_ulong_t)&icl_nhi_ops }, + { PCI_VDEVICE(INTEL, PCI_DEVICE_ID_INTEL_LNL_NHI1), + .driver_data = (kernel_ulong_t)&icl_nhi_ops }, { PCI_VDEVICE(INTEL, PCI_DEVICE_ID_INTEL_BARLOW_RIDGE_HOST_80G_NHI) }, { PCI_VDEVICE(INTEL, PCI_DEVICE_ID_INTEL_BARLOW_RIDGE_HOST_40G_NHI) },
diff --git a/drivers/thunderbolt/nhi.h b/drivers/thunderbolt/nhi.h index 0f029ce75882..7a07c7c1a9c2 100644 --- a/drivers/thunderbolt/nhi.h +++ b/drivers/thunderbolt/nhi.h @@ -90,6 +90,8 @@ extern const struct tb_nhi_ops icl_nhi_ops; #define PCI_DEVICE_ID_INTEL_TGL_H_NHI1 0x9a21 #define PCI_DEVICE_ID_INTEL_RPL_NHI0 0xa73e #define PCI_DEVICE_ID_INTEL_RPL_NHI1 0xa76d +#define PCI_DEVICE_ID_INTEL_LNL_NHI0 0xa833 +#define PCI_DEVICE_ID_INTEL_LNL_NHI1 0xa834
#define PCI_CLASS_SERIAL_USB_USB4 0x0c0340
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Mika Westerberg mika.westerberg@linux.intel.com
[ Upstream commit 8644b48714dca8bf2f42a4ff8311de8efc9bd8c3 ]
Intel Panther Lake-M/P has the same integrated Thunderbolt/USB4 controller as Lunar Lake. Add these PCI IDs to the driver list of supported devices.
Cc: stable@vger.kernel.org Signed-off-by: Mika Westerberg mika.westerberg@linux.intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/thunderbolt/nhi.c | 8 ++++++++ drivers/thunderbolt/nhi.h | 4 ++++ 2 files changed, 12 insertions(+)
diff --git a/drivers/thunderbolt/nhi.c b/drivers/thunderbolt/nhi.c index b22023fae60d..79f2bf5df19a 100644 --- a/drivers/thunderbolt/nhi.c +++ b/drivers/thunderbolt/nhi.c @@ -1528,6 +1528,14 @@ static struct pci_device_id nhi_ids[] = { .driver_data = (kernel_ulong_t)&icl_nhi_ops }, { PCI_VDEVICE(INTEL, PCI_DEVICE_ID_INTEL_LNL_NHI1), .driver_data = (kernel_ulong_t)&icl_nhi_ops }, + { PCI_VDEVICE(INTEL, PCI_DEVICE_ID_INTEL_PTL_M_NHI0), + .driver_data = (kernel_ulong_t)&icl_nhi_ops }, + { PCI_VDEVICE(INTEL, PCI_DEVICE_ID_INTEL_PTL_M_NHI1), + .driver_data = (kernel_ulong_t)&icl_nhi_ops }, + { PCI_VDEVICE(INTEL, PCI_DEVICE_ID_INTEL_PTL_P_NHI0), + .driver_data = (kernel_ulong_t)&icl_nhi_ops }, + { PCI_VDEVICE(INTEL, PCI_DEVICE_ID_INTEL_PTL_P_NHI1), + .driver_data = (kernel_ulong_t)&icl_nhi_ops }, { PCI_VDEVICE(INTEL, PCI_DEVICE_ID_INTEL_BARLOW_RIDGE_HOST_80G_NHI) }, { PCI_VDEVICE(INTEL, PCI_DEVICE_ID_INTEL_BARLOW_RIDGE_HOST_40G_NHI) },
diff --git a/drivers/thunderbolt/nhi.h b/drivers/thunderbolt/nhi.h index 7a07c7c1a9c2..16744f25a9a0 100644 --- a/drivers/thunderbolt/nhi.h +++ b/drivers/thunderbolt/nhi.h @@ -92,6 +92,10 @@ extern const struct tb_nhi_ops icl_nhi_ops; #define PCI_DEVICE_ID_INTEL_RPL_NHI1 0xa76d #define PCI_DEVICE_ID_INTEL_LNL_NHI0 0xa833 #define PCI_DEVICE_ID_INTEL_LNL_NHI1 0xa834 +#define PCI_DEVICE_ID_INTEL_PTL_M_NHI0 0xe333 +#define PCI_DEVICE_ID_INTEL_PTL_M_NHI1 0xe334 +#define PCI_DEVICE_ID_INTEL_PTL_P_NHI0 0xe433 +#define PCI_DEVICE_ID_INTEL_PTL_P_NHI1 0xe434
#define PCI_CLASS_SERIAL_USB_USB4 0x0c0340
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Mario Limonciello mario.limonciello@amd.com
[ Upstream commit e34f1717ef0632fcec5cb827e5e0e9f223d70c9b ]
The read will never succeed if NVM wasn't initialized due to an unknown format.
Add a new callback for visibility to only show when supported.
Cc: stable@vger.kernel.org Fixes: aef9c693e7e5 ("thunderbolt: Move vendor specific NVM handling into nvm.c") Reported-by: Richard Hughes hughsient@gmail.com Closes: https://github.com/fwupd/fwupd/issues/8200 Signed-off-by: Mario Limonciello mario.limonciello@amd.com Signed-off-by: Mika Westerberg mika.westerberg@linux.intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/thunderbolt/retimer.c | 17 +++++++++++++++-- 1 file changed, 15 insertions(+), 2 deletions(-)
diff --git a/drivers/thunderbolt/retimer.c b/drivers/thunderbolt/retimer.c index 47becb363ada..2ee8c5ebca7c 100644 --- a/drivers/thunderbolt/retimer.c +++ b/drivers/thunderbolt/retimer.c @@ -98,6 +98,7 @@ static int tb_retimer_nvm_add(struct tb_retimer *rt)
err_nvm: dev_dbg(&rt->dev, "NVM upgrade disabled\n"); + rt->no_nvm_upgrade = true; if (!IS_ERR(nvm)) tb_nvm_free(nvm);
@@ -177,8 +178,6 @@ static ssize_t nvm_authenticate_show(struct device *dev,
if (!rt->nvm) ret = -EAGAIN; - else if (rt->no_nvm_upgrade) - ret = -EOPNOTSUPP; else ret = sysfs_emit(buf, "%#x\n", rt->auth_status);
@@ -331,6 +330,19 @@ static ssize_t vendor_show(struct device *dev, struct device_attribute *attr, } static DEVICE_ATTR_RO(vendor);
+static umode_t retimer_is_visible(struct kobject *kobj, struct attribute *attr, + int n) +{ + struct device *dev = kobj_to_dev(kobj); + struct tb_retimer *rt = tb_to_retimer(dev); + + if (attr == &dev_attr_nvm_authenticate.attr || + attr == &dev_attr_nvm_version.attr) + return rt->no_nvm_upgrade ? 0 : attr->mode; + + return attr->mode; +} + static struct attribute *retimer_attrs[] = { &dev_attr_device.attr, &dev_attr_nvm_authenticate.attr, @@ -340,6 +352,7 @@ static struct attribute *retimer_attrs[] = { };
static const struct attribute_group retimer_group = { + .is_visible = retimer_is_visible, .attrs = retimer_attrs, };
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Baoquan He bhe@redhat.com
[ Upstream commit a4eeb2176d89fdf2785851521577b94b31690a60 ]
Now crash codes under kernel/ folder has been split out from kexec code, crash dumping can be separated from kexec reboot in config items on x86 with some adjustments.
Here, also change some ifdefs or IS_ENABLED() check to more appropriate ones, e,g - #ifdef CONFIG_KEXEC_CORE -> #ifdef CONFIG_CRASH_DUMP - (!IS_ENABLED(CONFIG_KEXEC_CORE)) - > (!IS_ENABLED(CONFIG_CRASH_RESERVE))
[bhe@redhat.com: don't nest CONFIG_CRASH_DUMP ifdef inside CONFIG_KEXEC_CODE ifdef scope] Link: https://lore.kernel.org/all/SN6PR02MB4157931105FA68D72E3D3DB8D47B2@SN6PR02MB... Link: https://lkml.kernel.org/r/20240124051254.67105-7-bhe@redhat.com Signed-off-by: Baoquan He bhe@redhat.com Cc: Al Viro viro@zeniv.linux.org.uk Cc: Eric W. Biederman ebiederm@xmission.com Cc: Hari Bathini hbathini@linux.ibm.com Cc: Pingfan Liu piliu@redhat.com Cc: Klara Modin klarasmodin@gmail.com Cc: Michael Kelley mhklinux@outlook.com Cc: Nathan Chancellor nathan@kernel.org Cc: Stephen Rothwell sfr@canb.auug.org.au Cc: Yang Li yang.lee@linux.alibaba.com Signed-off-by: Andrew Morton akpm@linux-foundation.org Stable-dep-of: bcc80dec91ee ("x86/hyperv: Fix hv tsc page based sched_clock for hibernation") Signed-off-by: Sasha Levin sashal@kernel.org --- arch/x86/kernel/Makefile | 4 ++-- arch/x86/kernel/cpu/mshyperv.c | 10 ++++++++-- arch/x86/kernel/kexec-bzimage64.c | 4 ++++ arch/x86/kernel/kvm.c | 4 ++-- arch/x86/kernel/machine_kexec_64.c | 3 +++ arch/x86/kernel/reboot.c | 4 ++-- arch/x86/kernel/setup.c | 2 +- arch/x86/kernel/smp.c | 2 +- arch/x86/xen/enlighten_hvm.c | 4 ++++ arch/x86/xen/mmu_pv.c | 2 +- 10 files changed, 28 insertions(+), 11 deletions(-)
diff --git a/arch/x86/kernel/Makefile b/arch/x86/kernel/Makefile index 3269a0e23d3a..15fc9fc3dcf0 100644 --- a/arch/x86/kernel/Makefile +++ b/arch/x86/kernel/Makefile @@ -99,9 +99,9 @@ obj-$(CONFIG_TRACING) += trace.o obj-$(CONFIG_RETHOOK) += rethook.o obj-$(CONFIG_CRASH_CORE) += crash_core_$(BITS).o obj-$(CONFIG_KEXEC_CORE) += machine_kexec_$(BITS).o -obj-$(CONFIG_KEXEC_CORE) += relocate_kernel_$(BITS).o crash.o +obj-$(CONFIG_KEXEC_CORE) += relocate_kernel_$(BITS).o obj-$(CONFIG_KEXEC_FILE) += kexec-bzimage64.o -obj-$(CONFIG_CRASH_DUMP) += crash_dump_$(BITS).o +obj-$(CONFIG_CRASH_DUMP) += crash_dump_$(BITS).o crash.o obj-y += kprobes/ obj-$(CONFIG_MODULES) += module.o obj-$(CONFIG_X86_32) += doublefault_32.o diff --git a/arch/x86/kernel/cpu/mshyperv.c b/arch/x86/kernel/cpu/mshyperv.c index bcb2d640a0cd..93e1cb4f7ff1 100644 --- a/arch/x86/kernel/cpu/mshyperv.c +++ b/arch/x86/kernel/cpu/mshyperv.c @@ -209,7 +209,9 @@ static void hv_machine_shutdown(void) if (kexec_in_progress) hyperv_cleanup(); } +#endif /* CONFIG_KEXEC_CORE */
+#ifdef CONFIG_CRASH_DUMP static void hv_machine_crash_shutdown(struct pt_regs *regs) { if (hv_crash_handler) @@ -221,7 +223,7 @@ static void hv_machine_crash_shutdown(struct pt_regs *regs) /* Disable the hypercall page when there is only 1 active CPU. */ hyperv_cleanup(); } -#endif /* CONFIG_KEXEC_CORE */ +#endif /* CONFIG_CRASH_DUMP */ #endif /* CONFIG_HYPERV */
static uint32_t __init ms_hyperv_platform(void) @@ -493,9 +495,13 @@ static void __init ms_hyperv_init_platform(void) no_timer_check = 1; #endif
-#if IS_ENABLED(CONFIG_HYPERV) && defined(CONFIG_KEXEC_CORE) +#if IS_ENABLED(CONFIG_HYPERV) +#if defined(CONFIG_KEXEC_CORE) machine_ops.shutdown = hv_machine_shutdown; +#endif +#if defined(CONFIG_CRASH_DUMP) machine_ops.crash_shutdown = hv_machine_crash_shutdown; +#endif #endif if (ms_hyperv.features & HV_ACCESS_TSC_INVARIANT) { /* diff --git a/arch/x86/kernel/kexec-bzimage64.c b/arch/x86/kernel/kexec-bzimage64.c index a61c12c01270..0de509c02d18 100644 --- a/arch/x86/kernel/kexec-bzimage64.c +++ b/arch/x86/kernel/kexec-bzimage64.c @@ -263,11 +263,13 @@ setup_boot_parameters(struct kimage *image, struct boot_params *params, memset(¶ms->hd0_info, 0, sizeof(params->hd0_info)); memset(¶ms->hd1_info, 0, sizeof(params->hd1_info));
+#ifdef CONFIG_CRASH_DUMP if (image->type == KEXEC_TYPE_CRASH) { ret = crash_setup_memmap_entries(image, params); if (ret) return ret; } else +#endif setup_e820_entries(params);
nr_e820_entries = params->e820_entries; @@ -428,12 +430,14 @@ static void *bzImage64_load(struct kimage *image, char *kernel, return ERR_PTR(-EINVAL); }
+#ifdef CONFIG_CRASH_DUMP /* Allocate and load backup region */ if (image->type == KEXEC_TYPE_CRASH) { ret = crash_load_segments(image); if (ret) return ERR_PTR(ret); } +#endif
/* * Load purgatory. For 64bit entry point, purgatory code can be diff --git a/arch/x86/kernel/kvm.c b/arch/x86/kernel/kvm.c index b8ab9ee5896c..38d88c8b56ec 100644 --- a/arch/x86/kernel/kvm.c +++ b/arch/x86/kernel/kvm.c @@ -769,7 +769,7 @@ static struct notifier_block kvm_pv_reboot_nb = { * won't be valid. In cases like kexec, in which you install a new kernel, this * means a random memory location will be kept being written. */ -#ifdef CONFIG_KEXEC_CORE +#ifdef CONFIG_CRASH_DUMP static void kvm_crash_shutdown(struct pt_regs *regs) { kvm_guest_cpu_offline(true); @@ -852,7 +852,7 @@ static void __init kvm_guest_init(void) kvm_guest_cpu_init(); #endif
-#ifdef CONFIG_KEXEC_CORE +#ifdef CONFIG_CRASH_DUMP machine_ops.crash_shutdown = kvm_crash_shutdown; #endif
diff --git a/arch/x86/kernel/machine_kexec_64.c b/arch/x86/kernel/machine_kexec_64.c index 2fa12d1dc676..aaeac2deb85d 100644 --- a/arch/x86/kernel/machine_kexec_64.c +++ b/arch/x86/kernel/machine_kexec_64.c @@ -545,6 +545,8 @@ int arch_kimage_file_post_load_cleanup(struct kimage *image) } #endif /* CONFIG_KEXEC_FILE */
+#ifdef CONFIG_CRASH_DUMP + static int kexec_mark_range(unsigned long start, unsigned long end, bool protect) { @@ -589,6 +591,7 @@ void arch_kexec_unprotect_crashkres(void) { kexec_mark_crashkres(false); } +#endif
/* * During a traditional boot under SME, SME will encrypt the kernel, diff --git a/arch/x86/kernel/reboot.c b/arch/x86/kernel/reboot.c index 830425e6d38e..f3130f762784 100644 --- a/arch/x86/kernel/reboot.c +++ b/arch/x86/kernel/reboot.c @@ -796,7 +796,7 @@ struct machine_ops machine_ops __ro_after_init = { .emergency_restart = native_machine_emergency_restart, .restart = native_machine_restart, .halt = native_machine_halt, -#ifdef CONFIG_KEXEC_CORE +#ifdef CONFIG_CRASH_DUMP .crash_shutdown = native_machine_crash_shutdown, #endif }; @@ -826,7 +826,7 @@ void machine_halt(void) machine_ops.halt(); }
-#ifdef CONFIG_KEXEC_CORE +#ifdef CONFIG_CRASH_DUMP void machine_crash_shutdown(struct pt_regs *regs) { machine_ops.crash_shutdown(regs); diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c index eb129277dcdd..8bcecabd475b 100644 --- a/arch/x86/kernel/setup.c +++ b/arch/x86/kernel/setup.c @@ -547,7 +547,7 @@ static void __init reserve_crashkernel(void) bool high = false; int ret;
- if (!IS_ENABLED(CONFIG_KEXEC_CORE)) + if (!IS_ENABLED(CONFIG_CRASH_RESERVE)) return;
total_mem = memblock_phys_mem_size(); diff --git a/arch/x86/kernel/smp.c b/arch/x86/kernel/smp.c index 96a771f9f930..52c3823b7211 100644 --- a/arch/x86/kernel/smp.c +++ b/arch/x86/kernel/smp.c @@ -282,7 +282,7 @@ struct smp_ops smp_ops = { .smp_cpus_done = native_smp_cpus_done,
.stop_other_cpus = native_stop_other_cpus, -#if defined(CONFIG_KEXEC_CORE) +#if defined(CONFIG_CRASH_DUMP) .crash_stop_other_cpus = kdump_nmi_shootdown_cpus, #endif .smp_send_reschedule = native_smp_send_reschedule, diff --git a/arch/x86/xen/enlighten_hvm.c b/arch/x86/xen/enlighten_hvm.c index 70be57e8f51c..ade22feee7ae 100644 --- a/arch/x86/xen/enlighten_hvm.c +++ b/arch/x86/xen/enlighten_hvm.c @@ -141,7 +141,9 @@ static void xen_hvm_shutdown(void) if (kexec_in_progress) xen_reboot(SHUTDOWN_soft_reset); } +#endif
+#ifdef CONFIG_CRASH_DUMP static void xen_hvm_crash_shutdown(struct pt_regs *regs) { native_machine_crash_shutdown(regs); @@ -229,6 +231,8 @@ static void __init xen_hvm_guest_init(void)
#ifdef CONFIG_KEXEC_CORE machine_ops.shutdown = xen_hvm_shutdown; +#endif +#ifdef CONFIG_CRASH_DUMP machine_ops.crash_shutdown = xen_hvm_crash_shutdown; #endif } diff --git a/arch/x86/xen/mmu_pv.c b/arch/x86/xen/mmu_pv.c index 6b201e64d8ab..bfd57d07f4b5 100644 --- a/arch/x86/xen/mmu_pv.c +++ b/arch/x86/xen/mmu_pv.c @@ -2517,7 +2517,7 @@ int xen_remap_pfn(struct vm_area_struct *vma, unsigned long addr, } EXPORT_SYMBOL_GPL(xen_remap_pfn);
-#ifdef CONFIG_KEXEC_CORE +#ifdef CONFIG_VMCORE_INFO phys_addr_t paddr_vmcoreinfo_note(void) { if (xen_pv_domain())
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Naman Jain namjain@linux.microsoft.com
[ Upstream commit bcc80dec91ee745b3d66f3e48f0ec2efdea97149 ]
read_hv_sched_clock_tsc() assumes that the Hyper-V clock counter is bigger than the variable hv_sched_clock_offset, which is cached during early boot, but depending on the timing this assumption may be false when a hibernated VM starts again (the clock counter starts from 0 again) and is resuming back (Note: hv_init_tsc_clocksource() is not called during hibernation/resume); consequently, read_hv_sched_clock_tsc() may return a negative integer (which is interpreted as a huge positive integer since the return type is u64) and new kernel messages are prefixed with huge timestamps before read_hv_sched_clock_tsc() grows big enough (which typically takes several seconds).
Fix the issue by saving the Hyper-V clock counter just before the suspend, and using it to correct the hv_sched_clock_offset in resume. This makes hv tsc page based sched_clock continuous and ensures that post resume, it starts from where it left off during suspend. Override x86_platform.save_sched_clock_state and x86_platform.restore_sched_clock_state routines to correct this as soon as possible.
Note: if Invariant TSC is available, the issue doesn't happen because 1) we don't register read_hv_sched_clock_tsc() for sched clock: See commit e5313f1c5404 ("clocksource/drivers/hyper-v: Rework clocksource and sched clock setup"); 2) the common x86 code adjusts TSC similarly: see __restore_processor_state() -> tsc_verify_tsc_adjust(true) and x86_platform.restore_sched_clock_state().
Cc: stable@vger.kernel.org Fixes: 1349401ff1aa ("clocksource/drivers/hyper-v: Suspend/resume Hyper-V clocksource for hibernation") Co-developed-by: Dexuan Cui decui@microsoft.com Signed-off-by: Dexuan Cui decui@microsoft.com Signed-off-by: Naman Jain namjain@linux.microsoft.com Reviewed-by: Michael Kelley mhklinux@outlook.com Link: https://lore.kernel.org/r/20240917053917.76787-1-namjain@linux.microsoft.com Signed-off-by: Wei Liu wei.liu@kernel.org Message-ID: 20240917053917.76787-1-namjain@linux.microsoft.com Signed-off-by: Sasha Levin sashal@kernel.org --- arch/x86/kernel/cpu/mshyperv.c | 58 ++++++++++++++++++++++++++++++ drivers/clocksource/hyperv_timer.c | 14 +++++++- include/clocksource/hyperv_timer.h | 2 ++ 3 files changed, 73 insertions(+), 1 deletion(-)
diff --git a/arch/x86/kernel/cpu/mshyperv.c b/arch/x86/kernel/cpu/mshyperv.c index 93e1cb4f7ff1..6328cf56e59b 100644 --- a/arch/x86/kernel/cpu/mshyperv.c +++ b/arch/x86/kernel/cpu/mshyperv.c @@ -224,6 +224,63 @@ static void hv_machine_crash_shutdown(struct pt_regs *regs) hyperv_cleanup(); } #endif /* CONFIG_CRASH_DUMP */ + +static u64 hv_ref_counter_at_suspend; +static void (*old_save_sched_clock_state)(void); +static void (*old_restore_sched_clock_state)(void); + +/* + * Hyper-V clock counter resets during hibernation. Save and restore clock + * offset during suspend/resume, while also considering the time passed + * before suspend. This is to make sure that sched_clock using hv tsc page + * based clocksource, proceeds from where it left off during suspend and + * it shows correct time for the timestamps of kernel messages after resume. + */ +static void save_hv_clock_tsc_state(void) +{ + hv_ref_counter_at_suspend = hv_read_reference_counter(); +} + +static void restore_hv_clock_tsc_state(void) +{ + /* + * Adjust the offsets used by hv tsc clocksource to + * account for the time spent before hibernation. + * adjusted value = reference counter (time) at suspend + * - reference counter (time) now. + */ + hv_adj_sched_clock_offset(hv_ref_counter_at_suspend - hv_read_reference_counter()); +} + +/* + * Functions to override save_sched_clock_state and restore_sched_clock_state + * functions of x86_platform. The Hyper-V clock counter is reset during + * suspend-resume and the offset used to measure time needs to be + * corrected, post resume. + */ +static void hv_save_sched_clock_state(void) +{ + old_save_sched_clock_state(); + save_hv_clock_tsc_state(); +} + +static void hv_restore_sched_clock_state(void) +{ + restore_hv_clock_tsc_state(); + old_restore_sched_clock_state(); +} + +static void __init x86_setup_ops_for_tsc_pg_clock(void) +{ + if (!(ms_hyperv.features & HV_MSR_REFERENCE_TSC_AVAILABLE)) + return; + + old_save_sched_clock_state = x86_platform.save_sched_clock_state; + x86_platform.save_sched_clock_state = hv_save_sched_clock_state; + + old_restore_sched_clock_state = x86_platform.restore_sched_clock_state; + x86_platform.restore_sched_clock_state = hv_restore_sched_clock_state; +} #endif /* CONFIG_HYPERV */
static uint32_t __init ms_hyperv_platform(void) @@ -578,6 +635,7 @@ static void __init ms_hyperv_init_platform(void)
/* Register Hyper-V specific clocksource */ hv_init_clocksource(); + x86_setup_ops_for_tsc_pg_clock(); hv_vtl_init_platform(); #endif /* diff --git a/drivers/clocksource/hyperv_timer.c b/drivers/clocksource/hyperv_timer.c index 8ff7cd4e20bb..5eec1457e139 100644 --- a/drivers/clocksource/hyperv_timer.c +++ b/drivers/clocksource/hyperv_timer.c @@ -27,7 +27,8 @@ #include <asm/mshyperv.h>
static struct clock_event_device __percpu *hv_clock_event; -static u64 hv_sched_clock_offset __ro_after_init; +/* Note: offset can hold negative values after hibernation. */ +static u64 hv_sched_clock_offset __read_mostly;
/* * If false, we're using the old mechanism for stimer0 interrupts @@ -456,6 +457,17 @@ static void resume_hv_clock_tsc(struct clocksource *arg) hv_set_register(HV_REGISTER_REFERENCE_TSC, tsc_msr.as_uint64); }
+/* + * Called during resume from hibernation, from overridden + * x86_platform.restore_sched_clock_state routine. This is to adjust offsets + * used to calculate time for hv tsc page based sched_clock, to account for + * time spent before hibernation. + */ +void hv_adj_sched_clock_offset(u64 offset) +{ + hv_sched_clock_offset -= offset; +} + #ifdef HAVE_VDSO_CLOCKMODE_HVCLOCK static int hv_cs_enable(struct clocksource *cs) { diff --git a/include/clocksource/hyperv_timer.h b/include/clocksource/hyperv_timer.h index 6cdc873ac907..aa5233b1eba9 100644 --- a/include/clocksource/hyperv_timer.h +++ b/include/clocksource/hyperv_timer.h @@ -38,6 +38,8 @@ extern void hv_remap_tsc_clocksource(void); extern unsigned long hv_get_tsc_pfn(void); extern struct ms_hyperv_tsc_page *hv_get_tsc_page(void);
+extern void hv_adj_sched_clock_offset(u64 offset); + static __always_inline bool hv_read_tsc_page_tsc(const struct ms_hyperv_tsc_page *tsc_pg, u64 *cur_tsc, u64 *time)
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Herve Codina herve.codina@bootlin.com
[ Upstream commit 3eb030c60835668997d5763b1a0c7938faf169f6 ]
The recently added of_bus_default_flags_translate() performs the exact same operation as of_bus_pci_translate() and of_bus_isa_translate().
Avoid duplicated code replacing both of_bus_pci_translate() and of_bus_isa_translate() with of_bus_default_flags_translate().
Signed-off-by: Herve Codina herve.codina@bootlin.com Link: https://lore.kernel.org/r/20231017110221.189299-3-herve.codina@bootlin.com Signed-off-by: Rob Herring robh@kernel.org Stable-dep-of: 7f05e20b989a ("of: address: Preserve the flags portion on 1:1 dma-ranges mapping") Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/of/address.c | 13 ++----------- 1 file changed, 2 insertions(+), 11 deletions(-)
diff --git a/drivers/of/address.c b/drivers/of/address.c index dfd05cb2b2fc..cfe5a11b620a 100644 --- a/drivers/of/address.c +++ b/drivers/of/address.c @@ -217,10 +217,6 @@ static u64 of_bus_pci_map(__be32 *addr, const __be32 *range, int na, int ns, return da - cp; }
-static int of_bus_pci_translate(__be32 *addr, u64 offset, int na) -{ - return of_bus_default_translate(addr + 1, offset, na - 1); -} #endif /* CONFIG_PCI */
/* @@ -344,11 +340,6 @@ static u64 of_bus_isa_map(__be32 *addr, const __be32 *range, int na, int ns, return da - cp; }
-static int of_bus_isa_translate(__be32 *addr, u64 offset, int na) -{ - return of_bus_default_translate(addr + 1, offset, na - 1); -} - static unsigned int of_bus_isa_get_flags(const __be32 *addr) { unsigned int flags = 0; @@ -379,7 +370,7 @@ static struct of_bus of_busses[] = { .match = of_bus_pci_match, .count_cells = of_bus_pci_count_cells, .map = of_bus_pci_map, - .translate = of_bus_pci_translate, + .translate = of_bus_default_flags_translate, .has_flags = true, .get_flags = of_bus_pci_get_flags, }, @@ -391,7 +382,7 @@ static struct of_bus of_busses[] = { .match = of_bus_isa_match, .count_cells = of_bus_isa_count_cells, .map = of_bus_isa_map, - .translate = of_bus_isa_translate, + .translate = of_bus_default_flags_translate, .has_flags = true, .get_flags = of_bus_isa_get_flags, },
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Rob Herring robh@kernel.org
[ Upstream commit 88696db08b7efa3b6bb722014ea7429e78f6be32 ]
It is more useful to know how many flags cells a bus has rather than whether a bus has flags or not as ultimately the number of cells is the information used. Replace 'has_flags' boolean with 'flag_cells' count.
Acked-by: Herve Codina herve.codina@bootlin.com Link: https://lore.kernel.org/r/20231026135358.3564307-2-robh@kernel.org Signed-off-by: Rob Herring robh@kernel.org Stable-dep-of: 7f05e20b989a ("of: address: Preserve the flags portion on 1:1 dma-ranges mapping") Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/of/address.c | 14 +++++--------- 1 file changed, 5 insertions(+), 9 deletions(-)
diff --git a/drivers/of/address.c b/drivers/of/address.c index cfe5a11b620a..cdefe5a89e5d 100644 --- a/drivers/of/address.c +++ b/drivers/of/address.c @@ -46,7 +46,7 @@ struct of_bus { u64 (*map)(__be32 *addr, const __be32 *range, int na, int ns, int pna); int (*translate)(__be32 *addr, u64 offset, int na); - bool has_flags; + int flag_cells; unsigned int (*get_flags)(const __be32 *addr); };
@@ -371,7 +371,7 @@ static struct of_bus of_busses[] = { .count_cells = of_bus_pci_count_cells, .map = of_bus_pci_map, .translate = of_bus_default_flags_translate, - .has_flags = true, + .flag_cells = 1, .get_flags = of_bus_pci_get_flags, }, #endif /* CONFIG_PCI */ @@ -383,7 +383,7 @@ static struct of_bus of_busses[] = { .count_cells = of_bus_isa_count_cells, .map = of_bus_isa_map, .translate = of_bus_default_flags_translate, - .has_flags = true, + .flag_cells = 1, .get_flags = of_bus_isa_get_flags, }, /* Default with flags cell */ @@ -394,7 +394,7 @@ static struct of_bus of_busses[] = { .count_cells = of_bus_default_count_cells, .map = of_bus_default_flags_map, .translate = of_bus_default_flags_translate, - .has_flags = true, + .flag_cells = 1, .get_flags = of_bus_default_flags_get_flags, }, /* Default */ @@ -827,7 +827,7 @@ struct of_pci_range *of_pci_range_parser_one(struct of_pci_range_parser *parser, int na = parser->na; int ns = parser->ns; int np = parser->pna + na + ns; - int busflag_na = 0; + int busflag_na = parser->bus->flag_cells;
if (!range) return NULL; @@ -837,10 +837,6 @@ struct of_pci_range *of_pci_range_parser_one(struct of_pci_range_parser *parser,
range->flags = parser->bus->get_flags(parser->range);
- /* A extra cell for resource flags */ - if (parser->bus->has_flags) - busflag_na = 1; - range->bus_addr = of_read_number(parser->range + busflag_na, na - busflag_na);
if (parser->dma)
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Andrea della Porta andrea.porta@suse.com
[ Upstream commit 7f05e20b989ac33c9c0f8c2028ec0a566493548f ]
A missing or empty dma-ranges in a DT node implies a 1:1 mapping for dma translations. In this specific case, the current behaviour is to zero out the entire specifier so that the translation could be carried on as an offset from zero. This includes address specifier that has flags (e.g. PCI ranges).
Once the flags portion has been zeroed, the translation chain is broken since the mapping functions will check the upcoming address specifier against mismatching flags, always failing the 1:1 mapping and its entire purpose of always succeeding.
Set to zero only the address portion while passing the flags through.
Fixes: dbbdee94734b ("of/address: Merge all of the bus translation code") Cc: stable@vger.kernel.org Signed-off-by: Andrea della Porta andrea.porta@suse.com Tested-by: Herve Codina herve.codina@bootlin.com Link: https://lore.kernel.org/r/e51ae57874e58a9b349c35e2e877425ebc075d7a.173244181... Signed-off-by: Rob Herring (Arm) robh@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/of/address.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/drivers/of/address.c b/drivers/of/address.c index cdefe5a89e5d..34d880a1be0a 100644 --- a/drivers/of/address.c +++ b/drivers/of/address.c @@ -476,7 +476,8 @@ static int of_translate_one(struct device_node *parent, struct of_bus *bus, } if (ranges == NULL || rlen == 0) { offset = of_read_number(addr, na); - memset(addr, 0, pna * 4); + /* set address to zero, pass flags through */ + memset(addr + pbus->flag_cells, 0, (pna - pbus->flag_cells) * 4); pr_debug("empty ranges; 1:1 translation\n"); goto finish; }
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Claudiu Beznea claudiu.beznea.uj@bp.renesas.com
[ Upstream commit 064319c3fac88e04f53f3460cd24ae90de2d9fb6 ]
There is no need to de-assert the reset signal on probe as the watchdog is not used prior executing start. Also, the clocks are not enabled in probe (pm_runtime_enable() doesn't do that), thus this is another indicator that the watchdog wasn't used previously like this. Instead, keep the watchdog hardware in its previous state at probe (by default it is in reset state), enable it when it is started and move it to reset state when it is stopped. This saves some extra power when the watchdog is unused.
Signed-off-by: Claudiu Beznea claudiu.beznea.uj@bp.renesas.com Reviewed-by: Guenter Roeck linux@roeck-us.net Link: https://lore.kernel.org/r/20240531065723.1085423-6-claudiu.beznea.uj@bp.rene... Signed-off-by: Guenter Roeck linux@roeck-us.net Signed-off-by: Wim Van Sebroeck wim@linux-watchdog.org Stable-dep-of: bad201b2ac4e ("watchdog: rzg2l_wdt: Power on the watchdog domain in the restart handler") Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/watchdog/rzg2l_wdt.c | 28 +++++++++++++++++----------- 1 file changed, 17 insertions(+), 11 deletions(-)
diff --git a/drivers/watchdog/rzg2l_wdt.c b/drivers/watchdog/rzg2l_wdt.c index 7bce093316c4..7aad66da138a 100644 --- a/drivers/watchdog/rzg2l_wdt.c +++ b/drivers/watchdog/rzg2l_wdt.c @@ -129,6 +129,12 @@ static int rzg2l_wdt_start(struct watchdog_device *wdev) if (ret) return ret;
+ ret = reset_control_deassert(priv->rstc); + if (ret) { + pm_runtime_put(wdev->parent); + return ret; + } + /* Initialize time out */ rzg2l_wdt_init_timeout(wdev);
@@ -146,7 +152,9 @@ static int rzg2l_wdt_stop(struct watchdog_device *wdev) struct rzg2l_wdt_priv *priv = watchdog_get_drvdata(wdev); int ret;
- rzg2l_wdt_reset(priv); + ret = reset_control_assert(priv->rstc); + if (ret) + return ret;
ret = pm_runtime_put(wdev->parent); if (ret < 0) @@ -186,6 +194,12 @@ static int rzg2l_wdt_restart(struct watchdog_device *wdev, clk_prepare_enable(priv->osc_clk);
if (priv->devtype == WDT_RZG2L) { + int ret; + + ret = reset_control_deassert(priv->rstc); + if (ret) + return ret; + /* Generate Reset (WDTRSTB) Signal on parity error */ rzg2l_wdt_write(priv, 0, PECR);
@@ -236,13 +250,11 @@ static const struct watchdog_ops rzg2l_wdt_ops = { .restart = rzg2l_wdt_restart, };
-static void rzg2l_wdt_reset_assert_pm_disable(void *data) +static void rzg2l_wdt_pm_disable(void *data) { struct watchdog_device *wdev = data; - struct rzg2l_wdt_priv *priv = watchdog_get_drvdata(wdev);
pm_runtime_disable(wdev->parent); - reset_control_assert(priv->rstc); }
static int rzg2l_wdt_probe(struct platform_device *pdev) @@ -285,10 +297,6 @@ static int rzg2l_wdt_probe(struct platform_device *pdev) return dev_err_probe(&pdev->dev, PTR_ERR(priv->rstc), "failed to get cpg reset");
- ret = reset_control_deassert(priv->rstc); - if (ret) - return dev_err_probe(dev, ret, "failed to deassert"); - priv->devtype = (uintptr_t)of_device_get_match_data(dev);
if (priv->devtype == WDT_RZV2M) { @@ -309,9 +317,7 @@ static int rzg2l_wdt_probe(struct platform_device *pdev) priv->wdev.timeout = WDT_DEFAULT_TIMEOUT;
watchdog_set_drvdata(&priv->wdev, priv); - ret = devm_add_action_or_reset(&pdev->dev, - rzg2l_wdt_reset_assert_pm_disable, - &priv->wdev); + ret = devm_add_action_or_reset(&pdev->dev, rzg2l_wdt_pm_disable, &priv->wdev); if (ret < 0) return ret;
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Claudiu Beznea claudiu.beznea.uj@bp.renesas.com
[ Upstream commit d8997ed79ed7c7c32b2ae571e0d99a58bbfd01fe ]
The reset driver has been adapted in commit da235d2fac21 ("clk: renesas: rzg2l: Check reset monitor registers") to check the reset monitor bits before declaring reset asserts/de-asserts as successful/failure operations. With that, there is no need to keep the reset workaround for RZ/V2M in place in the watchdog driver.
Signed-off-by: Claudiu Beznea claudiu.beznea.uj@bp.renesas.com Reviewed-by: Philipp Zabel p.zabel@pengutronix.de Reviewed-by: Guenter Roeck linux@roeck-us.net Link: https://lore.kernel.org/r/20240531065723.1085423-8-claudiu.beznea.uj@bp.rene... Signed-off-by: Guenter Roeck linux@roeck-us.net Signed-off-by: Wim Van Sebroeck wim@linux-watchdog.org Stable-dep-of: bad201b2ac4e ("watchdog: rzg2l_wdt: Power on the watchdog domain in the restart handler") Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/watchdog/rzg2l_wdt.c | 39 ++++-------------------------------- 1 file changed, 4 insertions(+), 35 deletions(-)
diff --git a/drivers/watchdog/rzg2l_wdt.c b/drivers/watchdog/rzg2l_wdt.c index 7aad66da138a..d09f938415fc 100644 --- a/drivers/watchdog/rzg2l_wdt.c +++ b/drivers/watchdog/rzg2l_wdt.c @@ -8,7 +8,6 @@ #include <linux/clk.h> #include <linux/delay.h> #include <linux/io.h> -#include <linux/iopoll.h> #include <linux/kernel.h> #include <linux/module.h> #include <linux/of.h> @@ -54,35 +53,11 @@ struct rzg2l_wdt_priv { struct reset_control *rstc; unsigned long osc_clk_rate; unsigned long delay; - unsigned long minimum_assertion_period; struct clk *pclk; struct clk *osc_clk; enum rz_wdt_type devtype; };
-static int rzg2l_wdt_reset(struct rzg2l_wdt_priv *priv) -{ - int err, status; - - if (priv->devtype == WDT_RZV2M) { - /* WDT needs TYPE-B reset control */ - err = reset_control_assert(priv->rstc); - if (err) - return err; - ndelay(priv->minimum_assertion_period); - err = reset_control_deassert(priv->rstc); - if (err) - return err; - err = read_poll_timeout(reset_control_status, status, - status != 1, 0, 1000, false, - priv->rstc); - } else { - err = reset_control_reset(priv->rstc); - } - - return err; -} - static void rzg2l_wdt_wait_delay(struct rzg2l_wdt_priv *priv) { /* delay timer when change the setting register */ @@ -189,13 +164,12 @@ static int rzg2l_wdt_restart(struct watchdog_device *wdev, unsigned long action, void *data) { struct rzg2l_wdt_priv *priv = watchdog_get_drvdata(wdev); + int ret;
clk_prepare_enable(priv->pclk); clk_prepare_enable(priv->osc_clk);
if (priv->devtype == WDT_RZG2L) { - int ret; - ret = reset_control_deassert(priv->rstc); if (ret) return ret; @@ -207,7 +181,9 @@ static int rzg2l_wdt_restart(struct watchdog_device *wdev, rzg2l_wdt_write(priv, PEEN_FORCE, PEEN); } else { /* RZ/V2M doesn't have parity error registers */ - rzg2l_wdt_reset(priv); + ret = reset_control_reset(priv->rstc); + if (ret) + return ret;
wdev->timeout = 0;
@@ -299,13 +275,6 @@ static int rzg2l_wdt_probe(struct platform_device *pdev)
priv->devtype = (uintptr_t)of_device_get_match_data(dev);
- if (priv->devtype == WDT_RZV2M) { - priv->minimum_assertion_period = RZV2M_A_NSEC + - 3 * F2CYCLE_NSEC(pclk_rate) + 5 * - max(F2CYCLE_NSEC(priv->osc_clk_rate), - F2CYCLE_NSEC(pclk_rate)); - } - pm_runtime_enable(&pdev->dev);
priv->wdev.info = &rzg2l_wdt_ident;
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Claudiu Beznea claudiu.beznea.uj@bp.renesas.com
[ Upstream commit bad201b2ac4e238c6d4b6966a220240e3861640c ]
On RZ/G3S the watchdog can be part of a software-controlled PM domain. In this case, the watchdog device need to be powered on in struct watchdog_ops::restart API. This can be done though pm_runtime_resume_and_get() API if the watchdog PM domain and watchdog device are marked as IRQ safe. We mark the watchdog PM domain as IRQ safe with GENPD_FLAG_IRQ_SAFE when the watchdog PM domain is registered and the watchdog device though pm_runtime_irq_safe().
Before commit e4cf89596c1f ("watchdog: rzg2l_wdt: Fix 'BUG: Invalid wait context'") pm_runtime_get_sync() was used in watchdog restart handler (which is similar to pm_runtime_resume_and_get() except the later one handles the runtime resume errors).
Commit e4cf89596c1f ("watchdog: rzg2l_wdt: Fix 'BUG: Invalid wait context'") dropped the pm_runtime_get_sync() and replaced it with clk_prepare_enable() to avoid invalid wait context due to genpd_lock() in genpd_runtime_resume() being called from atomic context. But clk_prepare_enable() doesn't fit for this either (as reported by Ulf Hansson) as clk_prepare() can also sleep (it just not throw invalid wait context warning as it is not written for this).
Because the watchdog device is marked now as IRQ safe (though this patch) the irq_safe_dev_in_sleep_domain() call from genpd_runtime_resume() returns 1 for devices not registering an IRQ safe PM domain for watchdog (as the watchdog device is IRQ safe, PM domain is not and watchdog PM domain is always-on), this being the case for RZ/G3S with old device trees and the rest of the SoCs that use this driver, we can now drop also the clk_prepare_enable() calls in restart handler and rely on pm_runtime_resume_and_get().
Thus, drop clk_prepare_enable() and use pm_runtime_resume_and_get() in watchdog restart handler.
Signed-off-by: Claudiu Beznea claudiu.beznea.uj@bp.renesas.com Reviewed-by: Ulf Hansson ulf.hansson@linaro.org Reviewed-by: Geert Uytterhoeven geert+renesas@glider.be Acked-by: Guenter Roeck linux@roeck-us.net Link: https://lore.kernel.org/r/20241015164732.4085249-5-claudiu.beznea.uj@bp.rene... Signed-off-by: Guenter Roeck linux@roeck-us.net Signed-off-by: Wim Van Sebroeck wim@linux-watchdog.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/watchdog/rzg2l_wdt.c | 20 ++++++++++++++++++-- 1 file changed, 18 insertions(+), 2 deletions(-)
diff --git a/drivers/watchdog/rzg2l_wdt.c b/drivers/watchdog/rzg2l_wdt.c index d09f938415fc..525a72d8d746 100644 --- a/drivers/watchdog/rzg2l_wdt.c +++ b/drivers/watchdog/rzg2l_wdt.c @@ -12,6 +12,7 @@ #include <linux/module.h> #include <linux/of.h> #include <linux/platform_device.h> +#include <linux/pm_domain.h> #include <linux/pm_runtime.h> #include <linux/reset.h> #include <linux/units.h> @@ -166,8 +167,22 @@ static int rzg2l_wdt_restart(struct watchdog_device *wdev, struct rzg2l_wdt_priv *priv = watchdog_get_drvdata(wdev); int ret;
- clk_prepare_enable(priv->pclk); - clk_prepare_enable(priv->osc_clk); + /* + * In case of RZ/G3S the watchdog device may be part of an IRQ safe power + * domain that is currently powered off. In this case we need to power + * it on before accessing registers. Along with this the clocks will be + * enabled. We don't undo the pm_runtime_resume_and_get() as the device + * need to be on for the reboot to happen. + * + * For the rest of SoCs not registering a watchdog IRQ safe power + * domain it is safe to call pm_runtime_resume_and_get() as the + * irq_safe_dev_in_sleep_domain() call in genpd_runtime_resume() + * returns non zero value and the genpd_lock() is avoided, thus, there + * will be no invalid wait context reported by lockdep. + */ + ret = pm_runtime_resume_and_get(wdev->parent); + if (ret) + return ret;
if (priv->devtype == WDT_RZG2L) { ret = reset_control_deassert(priv->rstc); @@ -275,6 +290,7 @@ static int rzg2l_wdt_probe(struct platform_device *pdev)
priv->devtype = (uintptr_t)of_device_get_match_data(dev);
+ pm_runtime_irq_safe(&pdev->dev); pm_runtime_enable(&pdev->dev);
priv->wdev.info = &rzg2l_wdt_ident;
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Al Viro viro@zeniv.linux.org.uk
[ Upstream commit 9d35cebb794bb7be93db76c3383979c7deacfef9 ]
We can't really afford locking the source on same-directory rename; currently vfs_rename() tries to do that, but it will have to be changed. The logics in udf_rename() is lazy and goes looking for ".." in source even in same-directory case. It's not hard to get rid of that, leaving that behaviour only for cross-directory case; that VFS can get locks safely (and will keep doing that after the coming changes).
Reviewed-by: Jan Kara jack@suse.cz Signed-off-by: Al Viro viro@zeniv.linux.org.uk Stable-dep-of: 6756af923e06 ("udf: Verify inode link counts before performing rename") Signed-off-by: Sasha Levin sashal@kernel.org --- fs/udf/namei.c | 7 ++++++- 1 file changed, 6 insertions(+), 1 deletion(-)
diff --git a/fs/udf/namei.c b/fs/udf/namei.c index b3f57ad2b869..0461a7b1e9b4 100644 --- a/fs/udf/namei.c +++ b/fs/udf/namei.c @@ -770,7 +770,7 @@ static int udf_rename(struct mnt_idmap *idmap, struct inode *old_dir, struct inode *old_inode = d_inode(old_dentry); struct inode *new_inode = d_inode(new_dentry); struct udf_fileident_iter oiter, niter, diriter; - bool has_diriter = false; + bool has_diriter = false, is_dir = false; int retval; struct kernel_lb_addr tloc;
@@ -793,6 +793,9 @@ static int udf_rename(struct mnt_idmap *idmap, struct inode *old_dir, if (!empty_dir(new_inode)) goto out_oiter; } + is_dir = true; + } + if (is_dir && old_dir != new_dir) { retval = udf_fiiter_find_entry(old_inode, &dotdot_name, &diriter); if (retval == -ENOENT) { @@ -880,7 +883,9 @@ static int udf_rename(struct mnt_idmap *idmap, struct inode *old_dir, cpu_to_lelb(UDF_I(new_dir)->i_location); udf_fiiter_write_fi(&diriter, NULL); udf_fiiter_release(&diriter); + }
+ if (is_dir) { inode_dec_link_count(old_dir); if (new_inode) inode_dec_link_count(new_inode);
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jan Kara jack@suse.cz
[ Upstream commit 6756af923e06aa33ad8894aaecbf9060953ba00f ]
During rename, we are updating link counts of various inodes either when rename deletes target or when moving directory across directories. Verify involved link counts are sane so that we don't trip warnings in VFS.
Reported-by: syzbot+3ff7365dc04a6bcafa66@syzkaller.appspotmail.com Signed-off-by: Jan Kara jack@suse.cz Signed-off-by: Sasha Levin sashal@kernel.org --- fs/udf/namei.c | 10 ++++++++++ 1 file changed, 10 insertions(+)
diff --git a/fs/udf/namei.c b/fs/udf/namei.c index 0461a7b1e9b4..8ac73f41d6eb 100644 --- a/fs/udf/namei.c +++ b/fs/udf/namei.c @@ -792,8 +792,18 @@ static int udf_rename(struct mnt_idmap *idmap, struct inode *old_dir, retval = -ENOTEMPTY; if (!empty_dir(new_inode)) goto out_oiter; + retval = -EFSCORRUPTED; + if (new_inode->i_nlink != 2) + goto out_oiter; } + retval = -EFSCORRUPTED; + if (old_dir->i_nlink < 3) + goto out_oiter; is_dir = true; + } else if (new_inode) { + retval = -EFSCORRUPTED; + if (new_inode->i_nlink < 1) + goto out_oiter; } if (is_dir && old_dir != new_dir) { retval = udf_fiiter_find_entry(old_inode, &dotdot_name,
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Takashi Iwai tiwai@suse.de
[ Upstream commit 631896f7eaaf8cf8c639b065d3c9fbaa66da5d32 ]
We can simplify the code gracefully with new guard() macro and co for automatic cleanup of locks.
Only the code refactoring, and no functional changes.
Signed-off-by: Takashi Iwai tiwai@suse.de Link: https://lore.kernel.org/r/20240227085306.9764-2-tiwai@suse.de Stable-dep-of: 3978d53df723 ("ALSA: ump: Don't open legacy substream for an inactive group") Signed-off-by: Sasha Levin sashal@kernel.org --- sound/core/ump.c | 35 ++++++++++++----------------------- 1 file changed, 12 insertions(+), 23 deletions(-)
diff --git a/sound/core/ump.c b/sound/core/ump.c index 83856b2f88b8..ada2625ce78f 100644 --- a/sound/core/ump.c +++ b/sound/core/ump.c @@ -1076,13 +1076,11 @@ static int snd_ump_legacy_open(struct snd_rawmidi_substream *substream) struct snd_ump_endpoint *ump = substream->rmidi->private_data; int dir = substream->stream; int group = ump->legacy_mapping[substream->number]; - int err = 0; + int err;
- mutex_lock(&ump->open_mutex); - if (ump->legacy_substreams[dir][group]) { - err = -EBUSY; - goto unlock; - } + guard(mutex)(&ump->open_mutex); + if (ump->legacy_substreams[dir][group]) + return -EBUSY; if (dir == SNDRV_RAWMIDI_STREAM_OUTPUT) { if (!ump->legacy_out_opens) { err = snd_rawmidi_kernel_open(&ump->core, 0, @@ -1090,17 +1088,14 @@ static int snd_ump_legacy_open(struct snd_rawmidi_substream *substream) SNDRV_RAWMIDI_LFLG_APPEND, &ump->legacy_out_rfile); if (err < 0) - goto unlock; + return err; } ump->legacy_out_opens++; snd_ump_convert_reset(&ump->out_cvts[group]); } - spin_lock_irq(&ump->legacy_locks[dir]); + guard(spinlock_irq)(&ump->legacy_locks[dir]); ump->legacy_substreams[dir][group] = substream; - spin_unlock_irq(&ump->legacy_locks[dir]); - unlock: - mutex_unlock(&ump->open_mutex); - return err; + return 0; }
static int snd_ump_legacy_close(struct snd_rawmidi_substream *substream) @@ -1109,15 +1104,13 @@ static int snd_ump_legacy_close(struct snd_rawmidi_substream *substream) int dir = substream->stream; int group = ump->legacy_mapping[substream->number];
- mutex_lock(&ump->open_mutex); - spin_lock_irq(&ump->legacy_locks[dir]); - ump->legacy_substreams[dir][group] = NULL; - spin_unlock_irq(&ump->legacy_locks[dir]); + guard(mutex)(&ump->open_mutex); + scoped_guard(spinlock_irq, &ump->legacy_locks[dir]) + ump->legacy_substreams[dir][group] = NULL; if (dir == SNDRV_RAWMIDI_STREAM_OUTPUT) { if (!--ump->legacy_out_opens) snd_rawmidi_kernel_release(&ump->legacy_out_rfile); } - mutex_unlock(&ump->open_mutex); return 0; }
@@ -1169,12 +1162,11 @@ static int process_legacy_output(struct snd_ump_endpoint *ump, const int dir = SNDRV_RAWMIDI_STREAM_OUTPUT; unsigned char c; int group, size = 0; - unsigned long flags;
if (!ump->out_cvts || !ump->legacy_out_opens) return 0;
- spin_lock_irqsave(&ump->legacy_locks[dir], flags); + guard(spinlock_irqsave)(&ump->legacy_locks[dir]); for (group = 0; group < SNDRV_UMP_MAX_GROUPS; group++) { substream = ump->legacy_substreams[dir][group]; if (!substream) @@ -1190,7 +1182,6 @@ static int process_legacy_output(struct snd_ump_endpoint *ump, break; } } - spin_unlock_irqrestore(&ump->legacy_locks[dir], flags); return size; }
@@ -1200,18 +1191,16 @@ static void process_legacy_input(struct snd_ump_endpoint *ump, const u32 *src, struct snd_rawmidi_substream *substream; unsigned char buf[16]; unsigned char group; - unsigned long flags; const int dir = SNDRV_RAWMIDI_STREAM_INPUT; int size;
size = snd_ump_convert_from_ump(src, buf, &group); if (size <= 0) return; - spin_lock_irqsave(&ump->legacy_locks[dir], flags); + guard(spinlock_irqsave)(&ump->legacy_locks[dir]); substream = ump->legacy_substreams[dir][group]; if (substream) snd_rawmidi_receive(substream, buf, size); - spin_unlock_irqrestore(&ump->legacy_locks[dir], flags); }
/* Fill ump->legacy_mapping[] for groups to be used for legacy rawmidi */
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Takashi Iwai tiwai@suse.de
[ Upstream commit 3978d53df7236f0a517c2abeb43ddf6ac162cdd8 ]
When a UMP Group is inactive, we shouldn't allow users to access it via the legacy MIDI access. Add the group active flag check and return -ENODEV if it's inactive.
Link: https://patch.msgid.link/20241129094546.32119-2-tiwai@suse.de Signed-off-by: Takashi Iwai tiwai@suse.de Signed-off-by: Sasha Levin sashal@kernel.org --- sound/core/ump.c | 2 ++ 1 file changed, 2 insertions(+)
diff --git a/sound/core/ump.c b/sound/core/ump.c index ada2625ce78f..5a4a7d0b7cca 100644 --- a/sound/core/ump.c +++ b/sound/core/ump.c @@ -1081,6 +1081,8 @@ static int snd_ump_legacy_open(struct snd_rawmidi_substream *substream) guard(mutex)(&ump->open_mutex); if (ump->legacy_substreams[dir][group]) return -EBUSY; + if (!ump->groups[group].active) + return -ENODEV; if (dir == SNDRV_RAWMIDI_STREAM_OUTPUT) { if (!ump->legacy_out_opens) { err = snd_rawmidi_kernel_open(&ump->core, 0,
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Takashi Iwai tiwai@suse.de
[ Upstream commit e29e504e7890b9ee438ca6370d0180d607c473f9 ]
Since the legacy rawmidi has no proper way to know the inactive group, indicate it in the rawmidi substream names with "[Inactive]" suffix when the corresponding UMP group is inactive.
Link: https://patch.msgid.link/20241129094546.32119-3-tiwai@suse.de Signed-off-by: Takashi Iwai tiwai@suse.de Signed-off-by: Sasha Levin sashal@kernel.org --- sound/core/ump.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-)
diff --git a/sound/core/ump.c b/sound/core/ump.c index 5a4a7d0b7cca..bb94f119869a 100644 --- a/sound/core/ump.c +++ b/sound/core/ump.c @@ -1245,8 +1245,9 @@ static void fill_substream_names(struct snd_ump_endpoint *ump, name = ump->groups[idx].name; if (!*name) name = ump->info.name; - snprintf(s->name, sizeof(s->name), "Group %d (%.16s)", - idx + 1, name); + snprintf(s->name, sizeof(s->name), "Group %d (%.16s)%s", + idx + 1, name, + ump->groups[idx].active ? "" : " [Inactive]"); } }
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Takashi Iwai tiwai@suse.de
[ Upstream commit edad3f9519fcacb926d0e3f3217aecaf628a593f ]
The legacy rawmidi substreams should be updated when UMP FB Info or UMP FB Name are received, too.
Link: https://patch.msgid.link/20241129094546.32119-4-tiwai@suse.de Signed-off-by: Takashi Iwai tiwai@suse.de Signed-off-by: Sasha Levin sashal@kernel.org --- sound/core/ump.c | 19 +++++++++++++++---- 1 file changed, 15 insertions(+), 4 deletions(-)
diff --git a/sound/core/ump.c b/sound/core/ump.c index bb94f119869a..4aec90dac07e 100644 --- a/sound/core/ump.c +++ b/sound/core/ump.c @@ -37,6 +37,7 @@ static int process_legacy_output(struct snd_ump_endpoint *ump, u32 *buffer, int count); static void process_legacy_input(struct snd_ump_endpoint *ump, const u32 *src, int words); +static void update_legacy_names(struct snd_ump_endpoint *ump); #else static inline int process_legacy_output(struct snd_ump_endpoint *ump, u32 *buffer, int count) @@ -47,6 +48,9 @@ static inline void process_legacy_input(struct snd_ump_endpoint *ump, const u32 *src, int words) { } +static inline void update_legacy_names(struct snd_ump_endpoint *ump) +{ +} #endif
static const struct snd_rawmidi_global_ops snd_ump_rawmidi_ops = { @@ -850,6 +854,7 @@ static int ump_handle_fb_info_msg(struct snd_ump_endpoint *ump, fill_fb_info(ump, &fb->info, buf); if (ump->parsed) { snd_ump_update_group_attrs(ump); + update_legacy_names(ump); seq_notify_fb_change(ump, fb); } } @@ -882,6 +887,7 @@ static int ump_handle_fb_name_msg(struct snd_ump_endpoint *ump, /* notify the FB name update to sequencer, too */ if (ret > 0 && ump->parsed) { snd_ump_update_group_attrs(ump); + update_legacy_names(ump); seq_notify_fb_change(ump, fb); } return ret; @@ -1251,6 +1257,14 @@ static void fill_substream_names(struct snd_ump_endpoint *ump, } }
+static void update_legacy_names(struct snd_ump_endpoint *ump) +{ + struct snd_rawmidi *rmidi = ump->legacy_rmidi; + + fill_substream_names(ump, rmidi, SNDRV_RAWMIDI_STREAM_INPUT); + fill_substream_names(ump, rmidi, SNDRV_RAWMIDI_STREAM_OUTPUT); +} + int snd_ump_attach_legacy_rawmidi(struct snd_ump_endpoint *ump, char *id, int device) { @@ -1287,10 +1301,7 @@ int snd_ump_attach_legacy_rawmidi(struct snd_ump_endpoint *ump, rmidi->ops = &snd_ump_legacy_ops; rmidi->private_data = ump; ump->legacy_rmidi = rmidi; - if (input) - fill_substream_names(ump, rmidi, SNDRV_RAWMIDI_STREAM_INPUT); - if (output) - fill_substream_names(ump, rmidi, SNDRV_RAWMIDI_STREAM_OUTPUT); + update_legacy_names(ump);
ump_dbg(ump, "Created a legacy rawmidi #%d (%s)\n", device, id); return 0;
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Guixin Liu kanie@linux.alibaba.com
[ Upstream commit 29b75184f721b16c51ef6e67eec0e40ed88381c7 ]
To ensure that the same ID is not obtained during concurrent execution of the probe, an ida is used to manage the mrioc's ID.
Signed-off-by: Guixin Liu kanie@linux.alibaba.com Link: https://lore.kernel.org/r/20231229040331.52518-1-kanie@linux.alibaba.com Reviewed-by: Lee Duncan lduncan@suse.com Reviewed-by: Martin Wilck mwilck@suse.com Signed-off-by: Martin K. Petersen martin.petersen@oracle.com Stable-dep-of: 0d32014f1e3e ("scsi: mpi3mr: Start controller indexing from 0") Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/scsi/mpi3mr/mpi3mr_os.c | 12 ++++++++++-- 1 file changed, 10 insertions(+), 2 deletions(-)
diff --git a/drivers/scsi/mpi3mr/mpi3mr_os.c b/drivers/scsi/mpi3mr/mpi3mr_os.c index 7f3261923469..3f86f1d0a9be 100644 --- a/drivers/scsi/mpi3mr/mpi3mr_os.c +++ b/drivers/scsi/mpi3mr/mpi3mr_os.c @@ -8,11 +8,12 @@ */
#include "mpi3mr.h" +#include <linux/idr.h>
/* global driver scop variables */ LIST_HEAD(mrioc_list); DEFINE_SPINLOCK(mrioc_list_lock); -static int mrioc_ids; +static DEFINE_IDA(mrioc_ida); static int warn_non_secure_ctlr; atomic64_t event_counter;
@@ -5065,7 +5066,10 @@ mpi3mr_probe(struct pci_dev *pdev, const struct pci_device_id *id) }
mrioc = shost_priv(shost); - mrioc->id = mrioc_ids++; + retval = ida_alloc_range(&mrioc_ida, 1, U8_MAX, GFP_KERNEL); + if (retval < 0) + goto id_alloc_failed; + mrioc->id = (u8)retval; sprintf(mrioc->driver_name, "%s", MPI3MR_DRIVER_NAME); sprintf(mrioc->name, "%s%d", mrioc->driver_name, mrioc->id); INIT_LIST_HEAD(&mrioc->list); @@ -5215,9 +5219,11 @@ mpi3mr_probe(struct pci_dev *pdev, const struct pci_device_id *id) resource_alloc_failed: destroy_workqueue(mrioc->fwevt_worker_thread); fwevtthread_failed: + ida_free(&mrioc_ida, mrioc->id); spin_lock(&mrioc_list_lock); list_del(&mrioc->list); spin_unlock(&mrioc_list_lock); +id_alloc_failed: scsi_host_put(shost); shost_failed: return retval; @@ -5303,6 +5309,7 @@ static void mpi3mr_remove(struct pci_dev *pdev) mrioc->sas_hba.num_phys = 0; }
+ ida_free(&mrioc_ida, mrioc->id); spin_lock(&mrioc_list_lock); list_del(&mrioc->list); spin_unlock(&mrioc_list_lock); @@ -5518,6 +5525,7 @@ static void __exit mpi3mr_exit(void) &driver_attr_event_counter); pci_unregister_driver(&mpi3mr_pci_driver); sas_release_transport(mpi3mr_transport_template); + ida_destroy(&mrioc_ida); }
module_init(mpi3mr_init);
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Ranjan Kumar ranjan.kumar@broadcom.com
[ Upstream commit 0d32014f1e3e7a7adf1583c45387f26b9bb3a49d ]
Instead of displaying the controller index starting from '1' make the driver display the controller index starting from '0'.
Signed-off-by: Sumit Saxena sumit.saxena@broadcom.com Signed-off-by: Ranjan Kumar ranjan.kumar@broadcom.com Link: https://lore.kernel.org/r/20241110194405.10108-4-ranjan.kumar@broadcom.com Signed-off-by: Martin K. Petersen martin.petersen@oracle.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/scsi/mpi3mr/mpi3mr_os.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/scsi/mpi3mr/mpi3mr_os.c b/drivers/scsi/mpi3mr/mpi3mr_os.c index 3f86f1d0a9be..7880675a68db 100644 --- a/drivers/scsi/mpi3mr/mpi3mr_os.c +++ b/drivers/scsi/mpi3mr/mpi3mr_os.c @@ -5066,7 +5066,7 @@ mpi3mr_probe(struct pci_dev *pdev, const struct pci_device_id *id) }
mrioc = shost_priv(shost); - retval = ida_alloc_range(&mrioc_ida, 1, U8_MAX, GFP_KERNEL); + retval = ida_alloc_range(&mrioc_ida, 0, U8_MAX, GFP_KERNEL); if (retval < 0) goto id_alloc_failed; mrioc->id = (u8)retval;
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Yicong Yang yangyicong@hisilicon.com
[ Upstream commit f3b78b470f28bb2a3a40e88bdf5c6de6a35a9b76 ]
HiSilicon HIP10/11 platforms using the same SMMU PMCG with HIP09 and thus suffers the same erratum. List them in the PMCG platform information list without introducing a new SMMU PMCG Model.
Update the silicon-errata.rst as well.
Signed-off-by: Yicong Yang yangyicong@hisilicon.com Link: https://lore.kernel.org/r/20240731092658.11012-1-yangyicong@huawei.com Signed-off-by: Will Deacon will@kernel.org Stable-dep-of: c2b46ae02270 ("ACPI/IORT: Add PMCG platform information for HiSilicon HIP09A") Signed-off-by: Sasha Levin sashal@kernel.org --- Documentation/arch/arm64/silicon-errata.rst | 4 ++-- drivers/acpi/arm64/iort.c | 7 +++++++ 2 files changed, 9 insertions(+), 2 deletions(-)
diff --git a/Documentation/arch/arm64/silicon-errata.rst b/Documentation/arch/arm64/silicon-errata.rst index 3cf806733083..f4e6afd59630 100644 --- a/Documentation/arch/arm64/silicon-errata.rst +++ b/Documentation/arch/arm64/silicon-errata.rst @@ -244,8 +244,8 @@ stable kernels. +----------------+-----------------+-----------------+-----------------------------+ | Hisilicon | Hip08 SMMU PMCG | #162001800 | N/A | +----------------+-----------------+-----------------+-----------------------------+ -| Hisilicon | Hip08 SMMU PMCG | #162001900 | N/A | -| | Hip09 SMMU PMCG | | | +| Hisilicon | Hip{08,09,10,10C| #162001900 | N/A | +| | ,11} SMMU PMCG | | | +----------------+-----------------+-----------------+-----------------------------+ +----------------+-----------------+-----------------+-----------------------------+ | Qualcomm Tech. | Kryo/Falkor v1 | E1003 | QCOM_FALKOR_ERRATUM_1003 | diff --git a/drivers/acpi/arm64/iort.c b/drivers/acpi/arm64/iort.c index 6496ff5a6ba2..b1f483845bc0 100644 --- a/drivers/acpi/arm64/iort.c +++ b/drivers/acpi/arm64/iort.c @@ -1712,6 +1712,13 @@ static struct acpi_platform_list pmcg_plat_info[] __initdata = { /* HiSilicon Hip09 Platform */ {"HISI ", "HIP09 ", 0, ACPI_SIG_IORT, greater_than_or_equal, "Erratum #162001900", IORT_SMMU_V3_PMCG_HISI_HIP09}, + /* HiSilicon Hip10/11 Platform uses the same SMMU IP with Hip09 */ + {"HISI ", "HIP10 ", 0, ACPI_SIG_IORT, greater_than_or_equal, + "Erratum #162001900", IORT_SMMU_V3_PMCG_HISI_HIP09}, + {"HISI ", "HIP10C ", 0, ACPI_SIG_IORT, greater_than_or_equal, + "Erratum #162001900", IORT_SMMU_V3_PMCG_HISI_HIP09}, + {"HISI ", "HIP11 ", 0, ACPI_SIG_IORT, greater_than_or_equal, + "Erratum #162001900", IORT_SMMU_V3_PMCG_HISI_HIP09}, { } };
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Qinxin Xia xiaqinxin@huawei.com
[ Upstream commit c2b46ae022704a2d845e59461fa24431ad627022 ]
HiSilicon HIP09A platforms using the same SMMU PMCG with HIP09 and thus suffers the same erratum. List them in the PMCG platform information list without introducing a new SMMU PMCG Model.
Update the silicon-errata.rst as well.
Reviewed-by: Yicong Yang yangyicong@hisilicon.com Acked-by: Hanjun Guo guohanjun@huawei.com Signed-off-by: Qinxin Xia xiaqinxin@huawei.com Link: https://lore.kernel.org/r/20241205013331.1484017-1-xiaqinxin@huawei.com Signed-off-by: Catalin Marinas catalin.marinas@arm.com Signed-off-by: Sasha Levin sashal@kernel.org --- Documentation/arch/arm64/silicon-errata.rst | 5 +++-- drivers/acpi/arm64/iort.c | 2 ++ 2 files changed, 5 insertions(+), 2 deletions(-)
diff --git a/Documentation/arch/arm64/silicon-errata.rst b/Documentation/arch/arm64/silicon-errata.rst index f4e6afd59630..8209c7a7c397 100644 --- a/Documentation/arch/arm64/silicon-errata.rst +++ b/Documentation/arch/arm64/silicon-errata.rst @@ -244,8 +244,9 @@ stable kernels. +----------------+-----------------+-----------------+-----------------------------+ | Hisilicon | Hip08 SMMU PMCG | #162001800 | N/A | +----------------+-----------------+-----------------+-----------------------------+ -| Hisilicon | Hip{08,09,10,10C| #162001900 | N/A | -| | ,11} SMMU PMCG | | | +| Hisilicon | Hip{08,09,09A,10| #162001900 | N/A | +| | ,10C,11} | | | +| | SMMU PMCG | | | +----------------+-----------------+-----------------+-----------------------------+ +----------------+-----------------+-----------------+-----------------------------+ | Qualcomm Tech. | Kryo/Falkor v1 | E1003 | QCOM_FALKOR_ERRATUM_1003 | diff --git a/drivers/acpi/arm64/iort.c b/drivers/acpi/arm64/iort.c index b1f483845bc0..1a31106a14e4 100644 --- a/drivers/acpi/arm64/iort.c +++ b/drivers/acpi/arm64/iort.c @@ -1712,6 +1712,8 @@ static struct acpi_platform_list pmcg_plat_info[] __initdata = { /* HiSilicon Hip09 Platform */ {"HISI ", "HIP09 ", 0, ACPI_SIG_IORT, greater_than_or_equal, "Erratum #162001900", IORT_SMMU_V3_PMCG_HISI_HIP09}, + {"HISI ", "HIP09A ", 0, ACPI_SIG_IORT, greater_than_or_equal, + "Erratum #162001900", IORT_SMMU_V3_PMCG_HISI_HIP09}, /* HiSilicon Hip10/11 Platform uses the same SMMU IP with Hip09 */ {"HISI ", "HIP10 ", 0, ACPI_SIG_IORT, greater_than_or_equal, "Erratum #162001900", IORT_SMMU_V3_PMCG_HISI_HIP09},
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Xin Li xin3.li@intel.com
[ Upstream commit ee63291aa8287cb7ded767d340155fe8681fc075 ]
struct pt_regs is hard to read because the member or section related comments are not aligned with the members.
The 'cs' and 'ss' members of pt_regs are type of 'unsigned long' while in reality they are only 16-bit wide. This works so far as the remaining space is unused, but FRED will use the remaining bits for other purposes.
To prepare for FRED:
- Cleanup the formatting - Convert 'cs' and 'ss' to u16 and embed them into an union with a u64 - Fixup the related printk() format strings
Suggested-by: Thomas Gleixner tglx@linutronix.de Originally-by: H. Peter Anvin (Intel) hpa@zytor.com Signed-off-by: Xin Li xin3.li@intel.com Signed-off-by: Thomas Gleixner tglx@linutronix.de Signed-off-by: Borislav Petkov (AMD) bp@alien8.de Tested-by: Shan Kang shan.kang@intel.com Link: https://lore.kernel.org/r/20231205105030.8698-14-xin3.li@intel.com Stable-dep-of: dc81e556f2a0 ("x86/fred: Clear WFE in missing-ENDBRANCH #CPs") Signed-off-by: Sasha Levin sashal@kernel.org --- arch/x86/entry/vsyscall/vsyscall_64.c | 2 +- arch/x86/include/asm/ptrace.h | 48 +++++++++++++++++++-------- arch/x86/kernel/process_64.c | 2 +- 3 files changed, 37 insertions(+), 15 deletions(-)
diff --git a/arch/x86/entry/vsyscall/vsyscall_64.c b/arch/x86/entry/vsyscall/vsyscall_64.c index 1245000a8792..2fb7d53cf333 100644 --- a/arch/x86/entry/vsyscall/vsyscall_64.c +++ b/arch/x86/entry/vsyscall/vsyscall_64.c @@ -76,7 +76,7 @@ static void warn_bad_vsyscall(const char *level, struct pt_regs *regs, if (!show_unhandled_signals) return;
- printk_ratelimited("%s%s[%d] %s ip:%lx cs:%lx sp:%lx ax:%lx si:%lx di:%lx\n", + printk_ratelimited("%s%s[%d] %s ip:%lx cs:%x sp:%lx ax:%lx si:%lx di:%lx\n", level, current->comm, task_pid_nr(current), message, regs->ip, regs->cs, regs->sp, regs->ax, regs->si, regs->di); diff --git a/arch/x86/include/asm/ptrace.h b/arch/x86/include/asm/ptrace.h index f4db78b09c8f..b268cd2a2d01 100644 --- a/arch/x86/include/asm/ptrace.h +++ b/arch/x86/include/asm/ptrace.h @@ -57,17 +57,19 @@ struct pt_regs { #else /* __i386__ */
struct pt_regs { -/* - * C ABI says these regs are callee-preserved. They aren't saved on kernel entry - * unless syscall needs a complete, fully filled "struct pt_regs". - */ + /* + * C ABI says these regs are callee-preserved. They aren't saved on + * kernel entry unless syscall needs a complete, fully filled + * "struct pt_regs". + */ unsigned long r15; unsigned long r14; unsigned long r13; unsigned long r12; unsigned long bp; unsigned long bx; -/* These regs are callee-clobbered. Always saved on kernel entry. */ + + /* These regs are callee-clobbered. Always saved on kernel entry. */ unsigned long r11; unsigned long r10; unsigned long r9; @@ -77,18 +79,38 @@ struct pt_regs { unsigned long dx; unsigned long si; unsigned long di; -/* - * On syscall entry, this is syscall#. On CPU exception, this is error code. - * On hw interrupt, it's IRQ number: - */ + + /* + * orig_ax is used on entry for: + * - the syscall number (syscall, sysenter, int80) + * - error_code stored by the CPU on traps and exceptions + * - the interrupt number for device interrupts + */ unsigned long orig_ax; -/* Return frame for iretq */ + + /* The IRETQ return frame starts here */ unsigned long ip; - unsigned long cs; + + union { + /* The full 64-bit data slot containing CS */ + u64 csx; + /* CS selector */ + u16 cs; + }; + unsigned long flags; unsigned long sp; - unsigned long ss; -/* top of stack page */ + + union { + /* The full 64-bit data slot containing SS */ + u64 ssx; + /* SS selector */ + u16 ss; + }; + + /* + * Top of stack on IDT systems. + */ };
#endif /* !__i386__ */ diff --git a/arch/x86/kernel/process_64.c b/arch/x86/kernel/process_64.c index d595ef7c1de0..dd19a4db741a 100644 --- a/arch/x86/kernel/process_64.c +++ b/arch/x86/kernel/process_64.c @@ -117,7 +117,7 @@ void __show_regs(struct pt_regs *regs, enum show_regs_mode mode,
printk("%sFS: %016lx(%04x) GS:%016lx(%04x) knlGS:%016lx\n", log_lvl, fs, fsindex, gs, gsindex, shadowgs); - printk("%sCS: %04lx DS: %04x ES: %04x CR0: %016lx\n", + printk("%sCS: %04x DS: %04x ES: %04x CR0: %016lx\n", log_lvl, regs->cs, ds, es, cr0); printk("%sCR2: %016lx CR3: %016lx CR4: %016lx\n", log_lvl, cr2, cr3, cr4);
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Xin Li xin3.li@intel.com
[ Upstream commit 3c77bf02d0c03beb3efdf7a5b427fb2e1a76c265 ]
FRED defines additional information in the upper 48 bits of cs/ss fields. Therefore add the information definitions into the pt_regs structure.
Specifically introduce a new structure fred_ss to denote the FRED flags above SS selector, which avoids FRED_SSX_ macros and makes the code simpler and easier to read.
Suggested-by: Thomas Gleixner tglx@linutronix.de Originally-by: H. Peter Anvin (Intel) hpa@zytor.com Signed-off-by: Xin Li xin3.li@intel.com Signed-off-by: Thomas Gleixner tglx@linutronix.de Signed-off-by: Borislav Petkov (AMD) bp@alien8.de Tested-by: Shan Kang shan.kang@intel.com Link: https://lore.kernel.org/r/20231205105030.8698-15-xin3.li@intel.com Stable-dep-of: dc81e556f2a0 ("x86/fred: Clear WFE in missing-ENDBRANCH #CPs") Signed-off-by: Sasha Levin sashal@kernel.org --- arch/x86/include/asm/ptrace.h | 66 ++++++++++++++++++++++++++++++++--- 1 file changed, 61 insertions(+), 5 deletions(-)
diff --git a/arch/x86/include/asm/ptrace.h b/arch/x86/include/asm/ptrace.h index b268cd2a2d01..5a83fbd9bc0b 100644 --- a/arch/x86/include/asm/ptrace.h +++ b/arch/x86/include/asm/ptrace.h @@ -56,6 +56,50 @@ struct pt_regs {
#else /* __i386__ */
+struct fred_cs { + /* CS selector */ + u64 cs : 16, + /* Stack level at event time */ + sl : 2, + /* IBT in WAIT_FOR_ENDBRANCH state */ + wfe : 1, + : 45; +}; + +struct fred_ss { + /* SS selector */ + u64 ss : 16, + /* STI state */ + sti : 1, + /* Set if syscall, sysenter or INT n */ + swevent : 1, + /* Event is NMI type */ + nmi : 1, + : 13, + /* Event vector */ + vector : 8, + : 8, + /* Event type */ + type : 4, + : 4, + /* Event was incident to enclave execution */ + enclave : 1, + /* CPU was in long mode */ + lm : 1, + /* + * Nested exception during FRED delivery, not set + * for #DF. + */ + nested : 1, + : 1, + /* + * The length of the instruction causing the event. + * Only set for INTO, INT1, INT3, INT n, SYSCALL + * and SYSENTER. 0 otherwise. + */ + insnlen : 4; +}; + struct pt_regs { /* * C ABI says these regs are callee-preserved. They aren't saved on @@ -85,6 +129,12 @@ struct pt_regs { * - the syscall number (syscall, sysenter, int80) * - error_code stored by the CPU on traps and exceptions * - the interrupt number for device interrupts + * + * A FRED stack frame starts here: + * 1) It _always_ includes an error code; + * + * 2) The return frame for ERET[US] starts here, but + * the content of orig_ax is ignored. */ unsigned long orig_ax;
@@ -92,24 +142,30 @@ struct pt_regs { unsigned long ip;
union { - /* The full 64-bit data slot containing CS */ - u64 csx; /* CS selector */ u16 cs; + /* The extended 64-bit data slot containing CS */ + u64 csx; + /* The FRED CS extension */ + struct fred_cs fred_cs; };
unsigned long flags; unsigned long sp;
union { - /* The full 64-bit data slot containing SS */ - u64 ssx; /* SS selector */ u16 ss; + /* The extended 64-bit data slot containing SS */ + u64 ssx; + /* The FRED SS extension */ + struct fred_ss fred_ss; };
/* - * Top of stack on IDT systems. + * Top of stack on IDT systems, while FRED systems have extra fields + * defined above for storing exception related information, e.g. CR2 or + * DR6. */ };
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Xin Li (Intel) xin@zytor.com
[ Upstream commit dc81e556f2a017d681251ace21bf06c126d5a192 ]
An indirect branch instruction sets the CPU indirect branch tracker (IBT) into WAIT_FOR_ENDBRANCH (WFE) state and WFE stays asserted across the instruction boundary. When the decoder finds an inappropriate instruction while WFE is set ENDBR, the CPU raises a #CP fault.
For the "kernel IBT no ENDBR" selftest where #CPs are deliberately triggered, the WFE state of the interrupted context needs to be cleared to let execution continue. Otherwise when the CPU resumes from the instruction that just caused the previous #CP, another missing-ENDBRANCH #CP is raised and the CPU enters a dead loop.
This is not a problem with IDT because it doesn't preserve WFE and IRET doesn't set WFE. But FRED provides space on the entry stack (in an expanded CS area) to save and restore the WFE state, thus the WFE state is no longer clobbered, so software must clear it.
Clear WFE to avoid dead looping in ibt_clear_fred_wfe() and the !ibt_fatal code path when execution is allowed to continue.
Clobbering WFE in any other circumstance is a security-relevant bug.
[ dhansen: changelog rewording ]
Fixes: a5f6c2ace997 ("x86/shstk: Add user control-protection fault handler") Signed-off-by: Xin Li (Intel) xin@zytor.com Signed-off-by: Dave Hansen dave.hansen@linux.intel.com Signed-off-by: Ingo Molnar mingo@kernel.org Acked-by: Dave Hansen dave.hansen@linux.intel.com Cc: stable@vger.kernel.org Link: https://lore.kernel.org/all/20241113175934.3897541-1-xin%40zytor.com Signed-off-by: Sasha Levin sashal@kernel.org --- arch/x86/kernel/cet.c | 30 ++++++++++++++++++++++++++++++ 1 file changed, 30 insertions(+)
diff --git a/arch/x86/kernel/cet.c b/arch/x86/kernel/cet.c index d2c732a34e5d..303bf74d175b 100644 --- a/arch/x86/kernel/cet.c +++ b/arch/x86/kernel/cet.c @@ -81,6 +81,34 @@ static void do_user_cp_fault(struct pt_regs *regs, unsigned long error_code)
static __ro_after_init bool ibt_fatal = true;
+/* + * By definition, all missing-ENDBRANCH #CPs are a result of WFE && !ENDBR. + * + * For the kernel IBT no ENDBR selftest where #CPs are deliberately triggered, + * the WFE state of the interrupted context needs to be cleared to let execution + * continue. Otherwise when the CPU resumes from the instruction that just + * caused the previous #CP, another missing-ENDBRANCH #CP is raised and the CPU + * enters a dead loop. + * + * This is not a problem with IDT because it doesn't preserve WFE and IRET doesn't + * set WFE. But FRED provides space on the entry stack (in an expanded CS area) + * to save and restore the WFE state, thus the WFE state is no longer clobbered, + * so software must clear it. + */ +static void ibt_clear_fred_wfe(struct pt_regs *regs) +{ + /* + * No need to do any FRED checks. + * + * For IDT event delivery, the high-order 48 bits of CS are pushed + * as 0s into the stack, and later IRET ignores these bits. + * + * For FRED, a test to check if fred_cs.wfe is set would be dropped + * by compilers. + */ + regs->fred_cs.wfe = 0; +} + static void do_kernel_cp_fault(struct pt_regs *regs, unsigned long error_code) { if ((error_code & CP_EC) != CP_ENDBR) { @@ -90,6 +118,7 @@ static void do_kernel_cp_fault(struct pt_regs *regs, unsigned long error_code)
if (unlikely(regs->ip == (unsigned long)&ibt_selftest_noendbr)) { regs->ax = 0; + ibt_clear_fred_wfe(regs); return; }
@@ -97,6 +126,7 @@ static void do_kernel_cp_fault(struct pt_regs *regs, unsigned long error_code) if (!ibt_fatal) { printk(KERN_DEFAULT CUT_HERE); __warn(__FILE__, __LINE__, (void *)regs->ip, TAINT_WARN, regs, NULL); + ibt_clear_fred_wfe(regs); return; } BUG();
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Filipe Manana fdmanana@suse.com
[ Upstream commit 95f93bc4cbcac6121a5ee85cd5019ee8e7447e0b ]
Rename and export __btrfs_cow_block() as btrfs_force_cow_block(). This is to allow to move defrag specific code out of ctree.c and into defrag.c in one of the next patches.
Signed-off-by: Filipe Manana fdmanana@suse.com Reviewed-by: David Sterba dsterba@suse.com Signed-off-by: David Sterba dsterba@suse.com Stable-dep-of: 44f52bbe96df ("btrfs: fix use-after-free when COWing tree bock and tracing is enabled") Signed-off-by: Sasha Levin sashal@kernel.org --- fs/btrfs/ctree.c | 30 +++++++++++++++--------------- fs/btrfs/ctree.h | 7 +++++++ 2 files changed, 22 insertions(+), 15 deletions(-)
diff --git a/fs/btrfs/ctree.c b/fs/btrfs/ctree.c index 25c902e7556d..62032d3fda85 100644 --- a/fs/btrfs/ctree.c +++ b/fs/btrfs/ctree.c @@ -526,13 +526,13 @@ static noinline int update_ref_for_cow(struct btrfs_trans_handle *trans, * bytes the allocator should try to find free next to the block it returns. * This is just a hint and may be ignored by the allocator. */ -static noinline int __btrfs_cow_block(struct btrfs_trans_handle *trans, - struct btrfs_root *root, - struct extent_buffer *buf, - struct extent_buffer *parent, int parent_slot, - struct extent_buffer **cow_ret, - u64 search_start, u64 empty_size, - enum btrfs_lock_nesting nest) +int btrfs_force_cow_block(struct btrfs_trans_handle *trans, + struct btrfs_root *root, + struct extent_buffer *buf, + struct extent_buffer *parent, int parent_slot, + struct extent_buffer **cow_ret, + u64 search_start, u64 empty_size, + enum btrfs_lock_nesting nest) { struct btrfs_fs_info *fs_info = root->fs_info; struct btrfs_disk_key disk_key; @@ -699,7 +699,7 @@ static inline int should_cow_block(struct btrfs_trans_handle *trans, }
/* - * cows a single block, see __btrfs_cow_block for the real work. + * COWs a single block, see btrfs_force_cow_block() for the real work. * This version of it has extra checks so that a block isn't COWed more than * once per transaction, as long as it hasn't been written yet */ @@ -752,8 +752,8 @@ noinline int btrfs_cow_block(struct btrfs_trans_handle *trans, * Also We don't care about the error, as it's handled internally. */ btrfs_qgroup_trace_subtree_after_cow(trans, root, buf); - ret = __btrfs_cow_block(trans, root, buf, parent, - parent_slot, cow_ret, search_start, 0, nest); + ret = btrfs_force_cow_block(trans, root, buf, parent, parent_slot, + cow_ret, search_start, 0, nest);
trace_btrfs_cow_block(root, buf, *cow_ret);
@@ -904,11 +904,11 @@ int btrfs_realloc_node(struct btrfs_trans_handle *trans, search_start = last_block;
btrfs_tree_lock(cur); - err = __btrfs_cow_block(trans, root, cur, parent, i, - &cur, search_start, - min(16 * blocksize, - (end_slot - i) * blocksize), - BTRFS_NESTING_COW); + err = btrfs_force_cow_block(trans, root, cur, parent, i, + &cur, search_start, + min(16 * blocksize, + (end_slot - i) * blocksize), + BTRFS_NESTING_COW); if (err) { btrfs_tree_unlock(cur); free_extent_buffer(cur); diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h index f7bb4c34b984..7df3ed2945b0 100644 --- a/fs/btrfs/ctree.h +++ b/fs/btrfs/ctree.h @@ -538,6 +538,13 @@ int btrfs_cow_block(struct btrfs_trans_handle *trans, struct extent_buffer *parent, int parent_slot, struct extent_buffer **cow_ret, enum btrfs_lock_nesting nest); +int btrfs_force_cow_block(struct btrfs_trans_handle *trans, + struct btrfs_root *root, + struct extent_buffer *buf, + struct extent_buffer *parent, int parent_slot, + struct extent_buffer **cow_ret, + u64 search_start, u64 empty_size, + enum btrfs_lock_nesting nest); int btrfs_copy_root(struct btrfs_trans_handle *trans, struct btrfs_root *root, struct extent_buffer *buf,
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Filipe Manana fdmanana@suse.com
[ Upstream commit 44f52bbe96dfdbe4aca3818a2534520082a07040 ]
When a COWing a tree block, at btrfs_cow_block(), and we have the tracepoint trace_btrfs_cow_block() enabled and preemption is also enabled (CONFIG_PREEMPT=y), we can trigger a use-after-free in the COWed extent buffer while inside the tracepoint code. This is because in some paths that call btrfs_cow_block(), such as btrfs_search_slot(), we are holding the last reference on the extent buffer @buf so btrfs_force_cow_block() drops the last reference on the @buf extent buffer when it calls free_extent_buffer_stale(buf), which schedules the release of the extent buffer with RCU. This means that if we are on a kernel with preemption, the current task may be preempted before calling trace_btrfs_cow_block() and the extent buffer already released by the time trace_btrfs_cow_block() is called, resulting in a use-after-free.
Fix this by moving the trace_btrfs_cow_block() from btrfs_cow_block() to btrfs_force_cow_block() before the COWed extent buffer is freed. This also has a side effect of invoking the tracepoint in the tree defrag code, at defrag.c:btrfs_realloc_node(), since btrfs_force_cow_block() is called there, but this is fine and it was actually missing there.
Reported-by: syzbot+8517da8635307182c8a5@syzkaller.appspotmail.com Link: https://lore.kernel.org/linux-btrfs/6759a9b9.050a0220.1ac542.000d.GAE@google... CC: stable@vger.kernel.org # 5.4+ Reviewed-by: Qu Wenruo wqu@suse.com Signed-off-by: Filipe Manana fdmanana@suse.com Signed-off-by: David Sterba dsterba@suse.com Signed-off-by: Sasha Levin sashal@kernel.org --- fs/btrfs/ctree.c | 11 ++++------- 1 file changed, 4 insertions(+), 7 deletions(-)
diff --git a/fs/btrfs/ctree.c b/fs/btrfs/ctree.c index 62032d3fda85..4b21ca49b666 100644 --- a/fs/btrfs/ctree.c +++ b/fs/btrfs/ctree.c @@ -660,6 +660,8 @@ int btrfs_force_cow_block(struct btrfs_trans_handle *trans, return ret; } } + + trace_btrfs_cow_block(root, buf, cow); if (unlock_orig) btrfs_tree_unlock(buf); free_extent_buffer_stale(buf); @@ -711,7 +713,6 @@ noinline int btrfs_cow_block(struct btrfs_trans_handle *trans, { struct btrfs_fs_info *fs_info = root->fs_info; u64 search_start; - int ret;
if (unlikely(test_bit(BTRFS_ROOT_DELETING, &root->state))) { btrfs_abort_transaction(trans, -EUCLEAN); @@ -752,12 +753,8 @@ noinline int btrfs_cow_block(struct btrfs_trans_handle *trans, * Also We don't care about the error, as it's handled internally. */ btrfs_qgroup_trace_subtree_after_cow(trans, root, buf); - ret = btrfs_force_cow_block(trans, root, buf, parent, parent_slot, - cow_ret, search_start, 0, nest); - - trace_btrfs_cow_block(root, buf, *cow_ret); - - return ret; + return btrfs_force_cow_block(trans, root, buf, parent, parent_slot, + cow_ret, search_start, 0, nest); } ALLOW_ERROR_INJECTION(btrfs_cow_block, ERRNO);
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Chris Lu chris.lu@mediatek.com
[ Upstream commit 95f92928ad2215b5f524903e67eebd8e14f99564 ]
Add suspend/resum callback function in btusb_data which are reserved for vendor specific usage during suspend/resume. hdev->suspend will be added before stop traffic in btusb_suspend and hdev-> resume will be added after resubmit urb in btusb_resume.
Signed-off-by: Chris Lu chris.lu@mediatek.com Signed-off-by: Sean Wang sean.wang@mediatek.com Signed-off-by: Luiz Augusto von Dentz luiz.von.dentz@intel.com Stable-dep-of: cea1805f165c ("Bluetooth: btusb: mediatek: add callback function in btusb_disconnect") Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/bluetooth/btusb.c | 9 +++++++++ 1 file changed, 9 insertions(+)
diff --git a/drivers/bluetooth/btusb.c b/drivers/bluetooth/btusb.c index 67577933835f..19d371aa8317 100644 --- a/drivers/bluetooth/btusb.c +++ b/drivers/bluetooth/btusb.c @@ -892,6 +892,9 @@ struct btusb_data {
int (*setup_on_usb)(struct hci_dev *hdev);
+ int (*suspend)(struct hci_dev *hdev); + int (*resume)(struct hci_dev *hdev); + int oob_wake_irq; /* irq for out-of-band wake-on-bt */ unsigned cmd_timeout_cnt;
@@ -4691,6 +4694,9 @@ static int btusb_suspend(struct usb_interface *intf, pm_message_t message)
cancel_work_sync(&data->work);
+ if (data->suspend) + data->suspend(data->hdev); + btusb_stop_traffic(data); usb_kill_anchored_urbs(&data->tx_anchor);
@@ -4794,6 +4800,9 @@ static int btusb_resume(struct usb_interface *intf) btusb_submit_isoc_urb(hdev, GFP_NOIO); }
+ if (data->resume) + data->resume(hdev); + spin_lock_irq(&data->txlock); play_deferred(data); clear_bit(BTUSB_SUSPENDING, &data->flags);
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Chris Lu chris.lu@mediatek.com
[ Upstream commit cea1805f165cdd783dd21f26df957118cb8641b4 ]
Add disconnect callback function in btusb_disconnect which is reserved for vendor specific usage before deregister hci in btusb_disconnect.
Signed-off-by: Chris Lu chris.lu@mediatek.com Signed-off-by: Luiz Augusto von Dentz luiz.von.dentz@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/bluetooth/btusb.c | 4 ++++ 1 file changed, 4 insertions(+)
diff --git a/drivers/bluetooth/btusb.c b/drivers/bluetooth/btusb.c index 19d371aa8317..c80b5aa7628a 100644 --- a/drivers/bluetooth/btusb.c +++ b/drivers/bluetooth/btusb.c @@ -894,6 +894,7 @@ struct btusb_data {
int (*suspend)(struct hci_dev *hdev); int (*resume)(struct hci_dev *hdev); + int (*disconnect)(struct hci_dev *hdev);
int oob_wake_irq; /* irq for out-of-band wake-on-bt */ unsigned cmd_timeout_cnt; @@ -4646,6 +4647,9 @@ static void btusb_disconnect(struct usb_interface *intf) if (data->diag) usb_set_intfdata(data->diag, NULL);
+ if (data->disconnect) + data->disconnect(hdev); + hci_unregister_dev(hdev);
if (intf == data->intf) {
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jan Beulich jbeulich@suse.com
[ Upstream commit e0eec24e2e199873f43df99ec39773ad3af2bff7 ]
On an (old) x86 system with SRAT just covering space above 4Gb:
ACPI: SRAT: Node 0 PXM 0 [mem 0x100000000-0xfffffffff] hotplug
the commit referenced below leads to this NUMA configuration no longer being refused by a CONFIG_NUMA=y kernel (previously
NUMA: nodes only cover 6144MB of your 8185MB e820 RAM. Not used. No NUMA configuration found Faking a node at [mem 0x0000000000000000-0x000000027fffffff]
was seen in the log directly after the message quoted above), because of memblock_validate_numa_coverage() checking for NUMA_NO_NODE (only). This in turn led to memblock_alloc_range_nid()'s warning about MAX_NUMNODES triggering, followed by a NULL deref in memmap_init() when trying to access node 64's (NODE_SHIFT=6) node data.
To compensate said change, make memblock_set_node() warn on and adjust a passed in value of MAX_NUMNODES, just like various other functions already do.
Fixes: ff6c3d81f2e8 ("NUMA: optimize detection of memory with no node id assigned by firmware") Signed-off-by: Jan Beulich jbeulich@suse.com Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/1c8a058c-5365-4f27-a9f1-3aeb7fb3e7b2@suse.com Signed-off-by: Mike Rapoport (IBM) rppt@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- mm/memblock.c | 4 ++++ 1 file changed, 4 insertions(+)
diff --git a/mm/memblock.c b/mm/memblock.c index 87a2b4340ce4..ba64b47b7c3b 100644 --- a/mm/memblock.c +++ b/mm/memblock.c @@ -1321,6 +1321,10 @@ int __init_memblock memblock_set_node(phys_addr_t base, phys_addr_t size, int start_rgn, end_rgn; int i, ret;
+ if (WARN_ONCE(nid == MAX_NUMNODES, + "Usage of MAX_NUMNODES is deprecated. Use NUMA_NO_NODE instead\n")) + nid = NUMA_NO_NODE; + ret = memblock_isolate_range(type, base, size, &start_rgn, &end_rgn); if (ret) return ret;
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Stefan Berger stefanb@linux.ibm.com
[ Upstream commit c6ab5c915da460c0397960af3c308386c3f3247b ]
Prevent ecc_digits_from_bytes from reading too many bytes from the input byte array in case an insufficient number of bytes is provided to fill the output digit array of ndigits. Therefore, initialize the most significant digits with 0 to avoid trying to read too many bytes later on. Convert the function into a regular function since it is getting too big for an inline function.
If too many bytes are provided on the input byte array the extra bytes are ignored since the input variable 'ndigits' limits the number of digits that will be filled.
Fixes: d67c96fb97b5 ("crypto: ecdsa - Convert byte arrays with key coordinates to digits") Reviewed-by: Jarkko Sakkinen jarkko@kernel.org Signed-off-by: Stefan Berger stefanb@linux.ibm.com Signed-off-by: Herbert Xu herbert@gondor.apana.org.au Signed-off-by: Sasha Levin sashal@kernel.org --- crypto/ecc.c | 22 ++++++++++++++++++++++ include/crypto/internal/ecc.h | 15 ++------------- 2 files changed, 24 insertions(+), 13 deletions(-)
diff --git a/crypto/ecc.c b/crypto/ecc.c index f53fb4d6af99..21504280aca2 100644 --- a/crypto/ecc.c +++ b/crypto/ecc.c @@ -66,6 +66,28 @@ const struct ecc_curve *ecc_get_curve(unsigned int curve_id) } EXPORT_SYMBOL(ecc_get_curve);
+void ecc_digits_from_bytes(const u8 *in, unsigned int nbytes, + u64 *out, unsigned int ndigits) +{ + int diff = ndigits - DIV_ROUND_UP(nbytes, sizeof(u64)); + unsigned int o = nbytes & 7; + __be64 msd = 0; + + /* diff > 0: not enough input bytes: set most significant digits to 0 */ + if (diff > 0) { + ndigits -= diff; + memset(&out[ndigits - 1], 0, diff * sizeof(u64)); + } + + if (o) { + memcpy((u8 *)&msd + sizeof(msd) - o, in, o); + out[--ndigits] = be64_to_cpu(msd); + in += o; + } + ecc_swap_digits(in, out, ndigits); +} +EXPORT_SYMBOL(ecc_digits_from_bytes); + static u64 *ecc_alloc_digits_space(unsigned int ndigits) { size_t len = ndigits * sizeof(u64); diff --git a/include/crypto/internal/ecc.h b/include/crypto/internal/ecc.h index ab722a8986b7..c0b8be63cbde 100644 --- a/include/crypto/internal/ecc.h +++ b/include/crypto/internal/ecc.h @@ -63,19 +63,8 @@ static inline void ecc_swap_digits(const void *in, u64 *out, unsigned int ndigit * @out Output digits array * @ndigits: Number of digits to create from byte array */ -static inline void ecc_digits_from_bytes(const u8 *in, unsigned int nbytes, - u64 *out, unsigned int ndigits) -{ - unsigned int o = nbytes & 7; - __be64 msd = 0; - - if (o) { - memcpy((u8 *)&msd + sizeof(msd) - o, in, o); - out[--ndigits] = be64_to_cpu(msd); - in += o; - } - ecc_swap_digits(in, out, ndigits); -} +void ecc_digits_from_bytes(const u8 *in, unsigned int nbytes, + u64 *out, unsigned int ndigits);
/** * ecc_is_key_valid() - Validate a given ECDH private key
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Uros Bizjak ubizjak@gmail.com
[ Upstream commit f730fd535fc51573f982fad629f2fc6b4a0cde2f ]
Guard functions in local_lock.h are defined using DEFINE_GUARD() and DEFINE_LOCK_GUARD_1() macros having lock type defined as pointer in the percpu address space. The functions, defined by these macros return value in generic address space, causing:
cleanup.h:157:18: error: return from pointer to non-enclosed address space
and
cleanup.h:214:18: error: return from pointer to non-enclosed address space
when strict percpu checks are enabled.
Add explicit casts to remove address space of the returned pointer.
Found by GCC's named address space checks.
Fixes: e4ab322fbaaa ("cleanup: Add conditional guard support") Signed-off-by: Uros Bizjak ubizjak@gmail.com Signed-off-by: Peter Zijlstra (Intel) peterz@infradead.org Link: https://lkml.kernel.org/r/20240819074124.143565-1-ubizjak@gmail.com Signed-off-by: Sasha Levin sashal@kernel.org --- include/linux/cleanup.h | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/include/linux/cleanup.h b/include/linux/cleanup.h index f0c6d1d45e67..64b8600eb8c0 100644 --- a/include/linux/cleanup.h +++ b/include/linux/cleanup.h @@ -123,7 +123,7 @@ static __maybe_unused const bool class_##_name##_is_conditional = _is_cond __DEFINE_CLASS_IS_CONDITIONAL(_name, false); \ DEFINE_CLASS(_name, _type, if (_T) { _unlock; }, ({ _lock; _T; }), _type _T); \ static inline void * class_##_name##_lock_ptr(class_##_name##_t *_T) \ - { return *_T; } + { return (void *)(__force unsigned long)*_T; }
#define DEFINE_GUARD_COND(_name, _ext, _condlock) \ __DEFINE_CLASS_IS_CONDITIONAL(_name##_ext, true); \ @@ -204,7 +204,7 @@ static inline void class_##_name##_destructor(class_##_name##_t *_T) \ \ static inline void *class_##_name##_lock_ptr(class_##_name##_t *_T) \ { \ - return _T->lock; \ + return (void *)(__force unsigned long)_T->lock; \ }
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Yihang Li liyihang9@huawei.com
[ Upstream commit 3c4f53b2c341ec6428b98cb51a89a09b025d0953 ]
If we issue a disabling PHY command, the device attached with it will go offline, if a 2 bit ECC error occurs at the same time, a hung task may be found:
[ 4613.652388] INFO: task kworker/u256:0:165233 blocked for more than 120 seconds. [ 4613.666297] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [ 4613.674809] task:kworker/u256:0 state:D stack: 0 pid:165233 ppid: 2 flags:0x00000208 [ 4613.683959] Workqueue: 0000:74:02.0_disco_q sas_revalidate_domain [libsas] [ 4613.691518] Call trace: [ 4613.694678] __switch_to+0xf8/0x17c [ 4613.698872] __schedule+0x660/0xee0 [ 4613.703063] schedule+0xac/0x240 [ 4613.706994] schedule_timeout+0x500/0x610 [ 4613.711705] __down+0x128/0x36c [ 4613.715548] down+0x240/0x2d0 [ 4613.719221] hisi_sas_internal_abort_timeout+0x1bc/0x260 [hisi_sas_main] [ 4613.726618] sas_execute_internal_abort+0x144/0x310 [libsas] [ 4613.732976] sas_execute_internal_abort_dev+0x44/0x60 [libsas] [ 4613.739504] hisi_sas_internal_task_abort_dev.isra.0+0xbc/0x1b0 [hisi_sas_main] [ 4613.747499] hisi_sas_dev_gone+0x174/0x250 [hisi_sas_main] [ 4613.753682] sas_notify_lldd_dev_gone+0xec/0x2e0 [libsas] [ 4613.759781] sas_unregister_common_dev+0x4c/0x7a0 [libsas] [ 4613.765962] sas_destruct_devices+0xb8/0x120 [libsas] [ 4613.771709] sas_do_revalidate_domain.constprop.0+0x1b8/0x31c [libsas] [ 4613.778930] sas_revalidate_domain+0x60/0xa4 [libsas] [ 4613.784716] process_one_work+0x248/0x950 [ 4613.789424] worker_thread+0x318/0x934 [ 4613.793878] kthread+0x190/0x200 [ 4613.797810] ret_from_fork+0x10/0x18 [ 4613.802121] INFO: task kworker/u256:4:316722 blocked for more than 120 seconds. [ 4613.816026] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [ 4613.824538] task:kworker/u256:4 state:D stack: 0 pid:316722 ppid: 2 flags:0x00000208 [ 4613.833670] Workqueue: 0000:74:02.0 hisi_sas_rst_work_handler [hisi_sas_main] [ 4613.841491] Call trace: [ 4613.844647] __switch_to+0xf8/0x17c [ 4613.848852] __schedule+0x660/0xee0 [ 4613.853052] schedule+0xac/0x240 [ 4613.856984] schedule_timeout+0x500/0x610 [ 4613.861695] __down+0x128/0x36c [ 4613.865542] down+0x240/0x2d0 [ 4613.869216] hisi_sas_controller_prereset+0x58/0x1fc [hisi_sas_main] [ 4613.876324] hisi_sas_rst_work_handler+0x40/0x8c [hisi_sas_main] [ 4613.883019] process_one_work+0x248/0x950 [ 4613.887732] worker_thread+0x318/0x934 [ 4613.892204] kthread+0x190/0x200 [ 4613.896118] ret_from_fork+0x10/0x18 [ 4613.900423] INFO: task kworker/u256:1:348985 blocked for more than 121 seconds. [ 4613.914341] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [ 4613.922852] task:kworker/u256:1 state:D stack: 0 pid:348985 ppid: 2 flags:0x00000208 [ 4613.931984] Workqueue: 0000:74:02.0_event_q sas_port_event_worker [libsas] [ 4613.939549] Call trace: [ 4613.942702] __switch_to+0xf8/0x17c [ 4613.946892] __schedule+0x660/0xee0 [ 4613.951083] schedule+0xac/0x240 [ 4613.955015] schedule_timeout+0x500/0x610 [ 4613.959725] wait_for_common+0x200/0x610 [ 4613.964349] wait_for_completion+0x3c/0x5c [ 4613.969146] flush_workqueue+0x198/0x790 [ 4613.973776] sas_porte_broadcast_rcvd+0x1e8/0x320 [libsas] [ 4613.979960] sas_port_event_worker+0x54/0xa0 [libsas] [ 4613.985708] process_one_work+0x248/0x950 [ 4613.990420] worker_thread+0x318/0x934 [ 4613.994868] kthread+0x190/0x200 [ 4613.998800] ret_from_fork+0x10/0x18
This is because when the device goes offline, we obtain the hisi_hba semaphore and send the ABORT_DEV command to the device. However, the internal abort timed out due to the 2 bit ECC error and triggers automatic dump. In addition, since the hisi_hba semaphore has been obtained, the dump cannot be executed and the controller cannot be reset.
Therefore, the deadlocks occur on the following circular dependencies: hisi_sas_dev_gone() -> down() -> hisi_sas_internal_task_abort_dev() -> ... -> hisi_sas_internal_abort_timeout() -> down().
The deadlock is triggered only when the timeout occurs during device goes offline. To fix this issue, use .rst_ha_timeout to distinguish the scenario where a device goes offline from other scenarios.
Fixes: 2ff07b5c6fe9 ("scsi: hisi_sas: Directly call register snapshot instead of using workqueue") Signed-off-by: Yihang Li liyihang9@huawei.com Signed-off-by: Xiang Chen chenxiang66@hisilicon.com Link: https://lore.kernel.org/r/1705904747-62186-2-git-send-email-chenxiang66@hisi... Signed-off-by: Martin K. Petersen martin.petersen@oracle.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/scsi/hisi_sas/hisi_sas_main.c | 12 ++++++++++-- 1 file changed, 10 insertions(+), 2 deletions(-)
diff --git a/drivers/scsi/hisi_sas/hisi_sas_main.c b/drivers/scsi/hisi_sas/hisi_sas_main.c index 5fdba7b39a1b..4ce737ddb058 100644 --- a/drivers/scsi/hisi_sas/hisi_sas_main.c +++ b/drivers/scsi/hisi_sas/hisi_sas_main.c @@ -1968,9 +1968,17 @@ static bool hisi_sas_internal_abort_timeout(struct sas_task *task, struct hisi_sas_internal_abort_data *timeout = data;
if (hisi_sas_debugfs_enable && hisi_hba->debugfs_itct[0].itct) { - down(&hisi_hba->sem); + /* + * If timeout occurs in device gone scenario, to avoid + * circular dependency like: + * hisi_sas_dev_gone() -> down() -> ... -> + * hisi_sas_internal_abort_timeout() -> down(). + */ + if (!timeout->rst_ha_timeout) + down(&hisi_hba->sem); hisi_hba->hw->debugfs_snapshot_regs(hisi_hba); - up(&hisi_hba->sem); + if (!timeout->rst_ha_timeout) + up(&hisi_hba->sem); }
if (task->task_state_flags & SAS_TASK_STATE_DONE) {
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Dmitry Baryshkov dmitry.baryshkov@linaro.org
[ Upstream commit 4a22918810980897393fa1776ea3877e4baf8cca ]
UCSI connector's indices start from 1 up to 3, PMIC_GLINK_MAX_PORTS. Correct the condition in the pmic_glink_ucsi_connector_status() callback, fixing Type-C orientation reporting for the third USB-C connector.
Fixes: 76716fd5bf09 ("usb: typec: ucsi: glink: move GPIO reading into connector_status callback") Cc: stable@vger.kernel.org Reported-by: Abel Vesa abel.vesa@linaro.org Reviewed-by: Neil Armstrong neil.armstrong@linaro.org Reviewed-by: Johan Hovold johan+linaro@kernel.org Tested-by: Johan Hovold johan+linaro@kernel.org Signed-off-by: Dmitry Baryshkov dmitry.baryshkov@linaro.org Link: https://lore.kernel.org/r/20241109-ucsi-glue-fixes-v2-1-8b21ff4f9fbe@linaro.... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/usb/typec/ucsi/ucsi_glink.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/usb/typec/ucsi/ucsi_glink.c b/drivers/usb/typec/ucsi/ucsi_glink.c index f0b4d0a4bb19..82a1081d44f1 100644 --- a/drivers/usb/typec/ucsi/ucsi_glink.c +++ b/drivers/usb/typec/ucsi/ucsi_glink.c @@ -202,7 +202,7 @@ static void pmic_glink_ucsi_connector_status(struct ucsi_connector *con) struct pmic_glink_ucsi *ucsi = ucsi_get_drvdata(con->ucsi); int orientation;
- if (con->num >= PMIC_GLINK_MAX_PORTS || + if (con->num > PMIC_GLINK_MAX_PORTS || !ucsi->port_orientation[con->num - 1]) return;
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Michal Pecio michal.pecio@gmail.com
[ Upstream commit 474538b8dd1cd9c666e56cfe8ef60fbb0fb513f4 ]
Stop Endpoint command on an already stopped endpoint fails and may be misinterpreted as a known hardware bug by the completion handler. This results in an unnecessary delay with repeated retries of the command.
Avoid queuing this command when endpoint state flags indicate that it's stopped or halted and the command will fail. If commands are pending on the endpoint, their completion handlers will process cancelled TDs so it's done. In case of waiting for external operations like clearing TT buffer, the endpoint is stopped and cancelled TDs can be processed now.
This eliminates practically all unnecessary retries because an endpoint with pending URBs is maintained in Running state by the driver, unless aforementioned commands or other operations are pending on it. This is guaranteed by xhci_ring_ep_doorbell() and by the fact that it is called every time any of those operations completes.
The only known exceptions are hardware bugs (the endpoint never starts at all) and Stream Protocol errors not associated with any TRB, which cause an endpoint reset not followed by restart. Sounds like a bug.
Generally, these retries are only expected to happen when the endpoint fails to start for unknown/no reason, which is a worse problem itself, and fixing the bug eliminates the retries too.
All cases were tested and found to work as expected. SET_DEQ_PENDING was produced by patching uvcvideo to unlink URBs in 100us intervals, which then runs into this case very often. EP_HALTED was produced by restarting 'cat /dev/ttyUSB0' on a serial dongle with broken cable. EP_CLEARING_TT by the same, with the dongle on an external hub.
Fixes: fd9d55d190c0 ("xhci: retry Stop Endpoint on buggy NEC controllers") CC: stable@vger.kernel.org Signed-off-by: Michal Pecio michal.pecio@gmail.com Signed-off-by: Mathias Nyman mathias.nyman@linux.intel.com Link: https://lore.kernel.org/r/20241106101459.775897-34-mathias.nyman@linux.intel... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/usb/host/xhci-ring.c | 13 +++++++++++++ drivers/usb/host/xhci.c | 19 +++++++++++++++---- drivers/usb/host/xhci.h | 1 + 3 files changed, 29 insertions(+), 4 deletions(-)
diff --git a/drivers/usb/host/xhci-ring.c b/drivers/usb/host/xhci-ring.c index 729319d81753..0d628af5c3ba 100644 --- a/drivers/usb/host/xhci-ring.c +++ b/drivers/usb/host/xhci-ring.c @@ -1092,6 +1092,19 @@ static int xhci_invalidate_cancelled_tds(struct xhci_virt_ep *ep) return 0; }
+/* + * Erase queued TDs from transfer ring(s) and give back those the xHC didn't + * stop on. If necessary, queue commands to move the xHC off cancelled TDs it + * stopped on. Those will be given back later when the commands complete. + * + * Call under xhci->lock on a stopped endpoint. + */ +void xhci_process_cancelled_tds(struct xhci_virt_ep *ep) +{ + xhci_invalidate_cancelled_tds(ep); + xhci_giveback_invalidated_tds(ep); +} + /* * Returns the TD the endpoint ring halted on. * Only call for non-running rings without streams. diff --git a/drivers/usb/host/xhci.c b/drivers/usb/host/xhci.c index 0b3688478ed3..70e6c240a540 100644 --- a/drivers/usb/host/xhci.c +++ b/drivers/usb/host/xhci.c @@ -1738,10 +1738,21 @@ static int xhci_urb_dequeue(struct usb_hcd *hcd, struct urb *urb, int status) } }
- /* Queue a stop endpoint command, but only if this is - * the first cancellation to be handled. - */ - if (!(ep->ep_state & EP_STOP_CMD_PENDING)) { + /* These completion handlers will sort out cancelled TDs for us */ + if (ep->ep_state & (EP_STOP_CMD_PENDING | EP_HALTED | SET_DEQ_PENDING)) { + xhci_dbg(xhci, "Not queuing Stop Endpoint on slot %d ep %d in state 0x%x\n", + urb->dev->slot_id, ep_index, ep->ep_state); + goto done; + } + + /* In this case no commands are pending but the endpoint is stopped */ + if (ep->ep_state & EP_CLEARING_TT) { + /* and cancelled TDs can be given back right away */ + xhci_dbg(xhci, "Invalidating TDs instantly on slot %d ep %d in state 0x%x\n", + urb->dev->slot_id, ep_index, ep->ep_state); + xhci_process_cancelled_tds(ep); + } else { + /* Otherwise, queue a new Stop Endpoint command */ command = xhci_alloc_command(xhci, false, GFP_ATOMIC); if (!command) { ret = -ENOMEM; diff --git a/drivers/usb/host/xhci.h b/drivers/usb/host/xhci.h index d89010ac5f8a..fddb3a90dae3 100644 --- a/drivers/usb/host/xhci.h +++ b/drivers/usb/host/xhci.h @@ -1952,6 +1952,7 @@ void xhci_ring_doorbell_for_active_rings(struct xhci_hcd *xhci, void xhci_cleanup_command_queue(struct xhci_hcd *xhci); void inc_deq(struct xhci_hcd *xhci, struct xhci_ring *ring); unsigned int count_trbs(u64 addr, u64 len); +void xhci_process_cancelled_tds(struct xhci_virt_ep *ep);
/* xHCI roothub code */ void xhci_set_link_state(struct xhci_hcd *xhci, struct xhci_port *port,
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Takashi Iwai tiwai@suse.de
[ Upstream commit ed990c07af70d286f5736021c6e25d8df6f2f7b0 ]
The recent change for the legacy substream name update brought a compile warning for some compilers due to the nature of snprintf(). Use scnprintf() to shut up the warning since the truncation is intentional.
Fixes: e29e504e7890 ("ALSA: ump: Indicate the inactive group in legacy substream names") Reported-by: kernel test robot lkp@intel.com Closes: https://lore.kernel.org/oe-kbuild-all/202411300103.FrGuTAYp-lkp@intel.com/ Link: https://patch.msgid.link/20241130090009.19849-1-tiwai@suse.de Signed-off-by: Takashi Iwai tiwai@suse.de Signed-off-by: Sasha Levin sashal@kernel.org --- sound/core/ump.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/sound/core/ump.c b/sound/core/ump.c index 4aec90dac07e..32d27e58416a 100644 --- a/sound/core/ump.c +++ b/sound/core/ump.c @@ -1251,9 +1251,9 @@ static void fill_substream_names(struct snd_ump_endpoint *ump, name = ump->groups[idx].name; if (!*name) name = ump->info.name; - snprintf(s->name, sizeof(s->name), "Group %d (%.16s)%s", - idx + 1, name, - ump->groups[idx].active ? "" : " [Inactive]"); + scnprintf(s->name, sizeof(s->name), "Group %d (%.16s)%s", + idx + 1, name, + ump->groups[idx].active ? "" : " [Inactive]"); } }
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Joe Hattori joe@pf.is.s.u-tokyo.ac.jp
[ Upstream commit 185e1b1d91e419445d3fd99c1c0376a970438acf ]
mlxplat_pci_fpga_device_init() calls pci_get_device() but does not release the refcount on error path. Call pci_dev_put() on the error path and in mlxplat_pci_fpga_device_exit() to fix this.
This bug was found by an experimental static analysis tool that I am developing.
Fixes: 02daa222fbdd ("platform: mellanox: Add initial support for PCIe based programming logic device") Signed-off-by: Joe Hattori joe@pf.is.s.u-tokyo.ac.jp Reviewed-by: Vadim Pasternak vadimp@nvidia.com Link: https://lore.kernel.org/r/20241216022538.381209-1-joe@pf.is.s.u-tokyo.ac.jp Reviewed-by: Ilpo Järvinen ilpo.jarvinen@linux.intel.com Signed-off-by: Ilpo Järvinen ilpo.jarvinen@linux.intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/platform/x86/mlx-platform.c | 2 ++ 1 file changed, 2 insertions(+)
diff --git a/drivers/platform/x86/mlx-platform.c b/drivers/platform/x86/mlx-platform.c index a2ffe4157df1..b8d77adc9ea1 100644 --- a/drivers/platform/x86/mlx-platform.c +++ b/drivers/platform/x86/mlx-platform.c @@ -6237,6 +6237,7 @@ mlxplat_pci_fpga_device_init(unsigned int device, const char *res_name, struct p fail_pci_request_regions: pci_disable_device(pci_dev); fail_pci_enable_device: + pci_dev_put(pci_dev); return err; }
@@ -6247,6 +6248,7 @@ mlxplat_pci_fpga_device_exit(struct pci_dev *pci_bridge, iounmap(pci_bridge_addr); pci_release_regions(pci_bridge); pci_disable_device(pci_bridge); + pci_dev_put(pci_bridge); }
static int
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Chao Yu chao@kernel.org
commit 96cfeb0389530ae32ade8a48ae3ae1ac3b6c009d upstream.
It should wait all existing dio write IOs before block removal, otherwise, previous direct write IO may overwrite data in the block which may be reused by other inode.
Cc: stable@vger.kernel.org Signed-off-by: Chao Yu chao@kernel.org Signed-off-by: Jaegeuk Kim jaegeuk@kernel.org [ Resolve line conflicts to make it work on 6.6.y ] Signed-off-by: Alva Lan alvalan9@foxmail.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/f2fs/file.c | 13 +++++++++++++ 1 file changed, 13 insertions(+)
--- a/fs/f2fs/file.c +++ b/fs/f2fs/file.c @@ -1037,6 +1037,13 @@ int f2fs_setattr(struct mnt_idmap *idmap return err; }
+ /* + * wait for inflight dio, blocks should be removed after + * IO completion. + */ + if (attr->ia_size < old_size) + inode_dio_wait(inode); + f2fs_down_write(&F2FS_I(inode)->i_gc_rwsem[WRITE]); filemap_invalidate_lock(inode->i_mapping);
@@ -1873,6 +1880,12 @@ static long f2fs_fallocate(struct file * if (ret) goto out;
+ /* + * wait for inflight dio, blocks should be removed after IO + * completion. + */ + inode_dio_wait(inode); + if (mode & FALLOC_FL_PUNCH_HOLE) { if (offset >= inode->i_size) goto out;
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Thiébaud Weksteen tweek@google.com
commit 900f83cf376bdaf798b6f5dcb2eae0c822e908b6 upstream.
When evaluating extended permissions, ignore unknown permissions instead of calling BUG(). This commit ensures that future permissions can be added without interfering with older kernels.
Cc: stable@vger.kernel.org Fixes: fa1aa143ac4a ("selinux: extended permissions for ioctls") Signed-off-by: Thiébaud Weksteen tweek@google.com Signed-off-by: Paul Moore paul@paul-moore.com Acked-by: Paul Moore paul@paul-moore.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- security/selinux/ss/services.c | 8 ++++++-- 1 file changed, 6 insertions(+), 2 deletions(-)
--- a/security/selinux/ss/services.c +++ b/security/selinux/ss/services.c @@ -956,7 +956,10 @@ void services_compute_xperms_decision(st xpermd->driver)) return; } else { - BUG(); + pr_warn_once( + "SELinux: unknown extended permission (%u) will be ignored\n", + node->datum.u.xperms->specified); + return; }
if (node->key.specified == AVTAB_XPERMS_ALLOWED) { @@ -993,7 +996,8 @@ void services_compute_xperms_decision(st node->datum.u.xperms->perms.p[i]; } } else { - BUG(); + pr_warn_once("SELinux: unknown specified key (%u)\n", + node->key.specified); } }
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Johannes Thumshirn johannes.thumshirn@wdc.com
commit 05b36b04d74a517d6675bf2f90829ff1ac7e28dc upstream.
Shinichiro reported the following use-after free that sometimes is happening in our CI system when running fstests' btrfs/284 on a TCMU runner device:
BUG: KASAN: slab-use-after-free in lock_release+0x708/0x780 Read of size 8 at addr ffff888106a83f18 by task kworker/u80:6/219
CPU: 8 UID: 0 PID: 219 Comm: kworker/u80:6 Not tainted 6.12.0-rc6-kts+ #15 Hardware name: Supermicro Super Server/X11SPi-TF, BIOS 3.3 02/21/2020 Workqueue: btrfs-endio btrfs_end_bio_work [btrfs] Call Trace: <TASK> dump_stack_lvl+0x6e/0xa0 ? lock_release+0x708/0x780 print_report+0x174/0x505 ? lock_release+0x708/0x780 ? __virt_addr_valid+0x224/0x410 ? lock_release+0x708/0x780 kasan_report+0xda/0x1b0 ? lock_release+0x708/0x780 ? __wake_up+0x44/0x60 lock_release+0x708/0x780 ? __pfx_lock_release+0x10/0x10 ? __pfx_do_raw_spin_lock+0x10/0x10 ? lock_is_held_type+0x9a/0x110 _raw_spin_unlock_irqrestore+0x1f/0x60 __wake_up+0x44/0x60 btrfs_encoded_read_endio+0x14b/0x190 [btrfs] btrfs_check_read_bio+0x8d9/0x1360 [btrfs] ? lock_release+0x1b0/0x780 ? trace_lock_acquire+0x12f/0x1a0 ? __pfx_btrfs_check_read_bio+0x10/0x10 [btrfs] ? process_one_work+0x7e3/0x1460 ? lock_acquire+0x31/0xc0 ? process_one_work+0x7e3/0x1460 process_one_work+0x85c/0x1460 ? __pfx_process_one_work+0x10/0x10 ? assign_work+0x16c/0x240 worker_thread+0x5e6/0xfc0 ? __pfx_worker_thread+0x10/0x10 kthread+0x2c3/0x3a0 ? __pfx_kthread+0x10/0x10 ret_from_fork+0x31/0x70 ? __pfx_kthread+0x10/0x10 ret_from_fork_asm+0x1a/0x30 </TASK>
Allocated by task 3661: kasan_save_stack+0x30/0x50 kasan_save_track+0x14/0x30 __kasan_kmalloc+0xaa/0xb0 btrfs_encoded_read_regular_fill_pages+0x16c/0x6d0 [btrfs] send_extent_data+0xf0f/0x24a0 [btrfs] process_extent+0x48a/0x1830 [btrfs] changed_cb+0x178b/0x2ea0 [btrfs] btrfs_ioctl_send+0x3bf9/0x5c20 [btrfs] _btrfs_ioctl_send+0x117/0x330 [btrfs] btrfs_ioctl+0x184a/0x60a0 [btrfs] __x64_sys_ioctl+0x12e/0x1a0 do_syscall_64+0x95/0x180 entry_SYSCALL_64_after_hwframe+0x76/0x7e
Freed by task 3661: kasan_save_stack+0x30/0x50 kasan_save_track+0x14/0x30 kasan_save_free_info+0x3b/0x70 __kasan_slab_free+0x4f/0x70 kfree+0x143/0x490 btrfs_encoded_read_regular_fill_pages+0x531/0x6d0 [btrfs] send_extent_data+0xf0f/0x24a0 [btrfs] process_extent+0x48a/0x1830 [btrfs] changed_cb+0x178b/0x2ea0 [btrfs] btrfs_ioctl_send+0x3bf9/0x5c20 [btrfs] _btrfs_ioctl_send+0x117/0x330 [btrfs] btrfs_ioctl+0x184a/0x60a0 [btrfs] __x64_sys_ioctl+0x12e/0x1a0 do_syscall_64+0x95/0x180 entry_SYSCALL_64_after_hwframe+0x76/0x7e
The buggy address belongs to the object at ffff888106a83f00 which belongs to the cache kmalloc-rnd-07-96 of size 96 The buggy address is located 24 bytes inside of freed 96-byte region [ffff888106a83f00, ffff888106a83f60)
The buggy address belongs to the physical page: page: refcount:1 mapcount:0 mapping:0000000000000000 index:0xffff888106a83800 pfn:0x106a83 flags: 0x17ffffc0000000(node=0|zone=2|lastcpupid=0x1fffff) page_type: f5(slab) raw: 0017ffffc0000000 ffff888100053680 ffffea0004917200 0000000000000004 raw: ffff888106a83800 0000000080200019 00000001f5000000 0000000000000000 page dumped because: kasan: bad access detected
Memory state around the buggy address: ffff888106a83e00: fa fb fb fb fb fb fb fb fb fb fb fb fc fc fc fc ffff888106a83e80: fa fb fb fb fb fb fb fb fb fb fb fb fc fc fc fc
ffff888106a83f00: fa fb fb fb fb fb fb fb fb fb fb fb fc fc fc fc
^ ffff888106a83f80: fa fb fb fb fb fb fb fb fb fb fb fb fc fc fc fc ffff888106a84000: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ==================================================================
Further analyzing the trace and the crash dump's vmcore file shows that the wake_up() call in btrfs_encoded_read_endio() is calling wake_up() on the wait_queue that is in the private data passed to the end_io handler.
Commit 4ff47df40447 ("btrfs: move priv off stack in btrfs_encoded_read_regular_fill_pages()") moved 'struct btrfs_encoded_read_private' off the stack.
Before that commit one can see a corruption of the private data when analyzing the vmcore after a crash:
*(struct btrfs_encoded_read_private *)0xffff88815626eec8 = { .wait = (wait_queue_head_t){ .lock = (spinlock_t){ .rlock = (struct raw_spinlock){ .raw_lock = (arch_spinlock_t){ .val = (atomic_t){ .counter = (int)-2005885696, }, .locked = (u8)0, .pending = (u8)157, .locked_pending = (u16)40192, .tail = (u16)34928, }, .magic = (unsigned int)536325682, .owner_cpu = (unsigned int)29, .owner = (void *)__SCT__tp_func_btrfs_transaction_commit+0x0 = 0x0, .dep_map = (struct lockdep_map){ .key = (struct lock_class_key *)0xffff8881575a3b6c, .class_cache = (struct lock_class *[2]){ 0xffff8882a71985c0, 0xffffea00066f5d40 }, .name = (const char *)0xffff88815626f100 = "", .wait_type_outer = (u8)37, .wait_type_inner = (u8)178, .lock_type = (u8)154, }, }, .__padding = (u8 [24]){ 0, 157, 112, 136, 50, 174, 247, 31, 29 }, .dep_map = (struct lockdep_map){ .key = (struct lock_class_key *)0xffff8881575a3b6c, .class_cache = (struct lock_class *[2]){ 0xffff8882a71985c0, 0xffffea00066f5d40 }, .name = (const char *)0xffff88815626f100 = "", .wait_type_outer = (u8)37, .wait_type_inner = (u8)178, .lock_type = (u8)154, }, }, .head = (struct list_head){ .next = (struct list_head *)0x112cca, .prev = (struct list_head *)0x47, }, }, .pending = (atomic_t){ .counter = (int)-1491499288, }, .status = (blk_status_t)130, }
Here we can see several indicators of in-memory data corruption, e.g. the large negative atomic values of ->pending or ->wait->lock->rlock->raw_lock->val, as well as the bogus spinlock magic 0x1ff7ae32 (decimal 536325682 above) instead of 0xdead4ead or the bogus pointer values for ->wait->head.
To fix this, change atomic_dec_return() to atomic_dec_and_test() to fix the corruption, as atomic_dec_return() is defined as two instructions on x86_64, whereas atomic_dec_and_test() is defined as a single atomic operation. This can lead to a situation where counter value is already decremented but the if statement in btrfs_encoded_read_endio() is not completely processed, i.e. the 0 test has not completed. If another thread continues executing btrfs_encoded_read_regular_fill_pages() the atomic_dec_return() there can see an already updated ->pending counter and continues by freeing the private data. Continuing in the endio handler the test for 0 succeeds and the wait_queue is woken up, resulting in a use-after-free.
Reported-by: Shinichiro Kawasaki shinichiro.kawasaki@wdc.com Suggested-by: Damien Le Moal Damien.LeMoal@wdc.com Fixes: 1881fba89bd5 ("btrfs: add BTRFS_IOC_ENCODED_READ ioctl") CC: stable@vger.kernel.org # 6.1+ Reviewed-by: Filipe Manana fdmanana@suse.com Reviewed-by: Qu Wenruo wqu@suse.com Signed-off-by: Johannes Thumshirn johannes.thumshirn@wdc.com Reviewed-by: David Sterba dsterba@suse.com Signed-off-by: David Sterba dsterba@suse.com Signed-off-by: Alva Lan alvalan9@foxmail.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/btrfs/inode.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/fs/btrfs/inode.c +++ b/fs/btrfs/inode.c @@ -9972,7 +9972,7 @@ static void btrfs_encoded_read_endio(str */ WRITE_ONCE(priv->status, bbio->bio.bi_status); } - if (!atomic_dec_return(&priv->pending)) + if (atomic_dec_and_test(&priv->pending)) wake_up(&priv->wait); bio_put(&bbio->bio); }
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Eric Biggers ebiggers@google.com
commit 8d90a86ed053226a297ce062f4d9f4f521e05c4c upstream.
Commit c7eed31e235c ("mmc: sdhci-msm: Switch to the new ICE API") introduced an incorrect check of the algorithm ID into the key eviction path, and thus qcom_ice_evict_key() is no longer ever called. Fix it.
Fixes: c7eed31e235c ("mmc: sdhci-msm: Switch to the new ICE API") Cc: stable@vger.kernel.org Cc: Abel Vesa abel.vesa@linaro.org Signed-off-by: Eric Biggers ebiggers@google.com Message-ID: 20241213041958.202565-6-ebiggers@kernel.org Signed-off-by: Ulf Hansson ulf.hansson@linaro.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/mmc/host/sdhci-msm.c | 16 ++++++++-------- 1 file changed, 8 insertions(+), 8 deletions(-)
--- a/drivers/mmc/host/sdhci-msm.c +++ b/drivers/mmc/host/sdhci-msm.c @@ -1867,20 +1867,20 @@ static int sdhci_msm_program_key(struct struct sdhci_msm_host *msm_host = sdhci_pltfm_priv(pltfm_host); union cqhci_crypto_cap_entry cap;
+ if (!(cfg->config_enable & CQHCI_CRYPTO_CONFIGURATION_ENABLE)) + return qcom_ice_evict_key(msm_host->ice, slot); + /* Only AES-256-XTS has been tested so far. */ cap = cq_host->crypto_cap_array[cfg->crypto_cap_idx]; if (cap.algorithm_id != CQHCI_CRYPTO_ALG_AES_XTS || cap.key_size != CQHCI_CRYPTO_KEY_SIZE_256) return -EINVAL;
- if (cfg->config_enable & CQHCI_CRYPTO_CONFIGURATION_ENABLE) - return qcom_ice_program_key(msm_host->ice, - QCOM_ICE_CRYPTO_ALG_AES_XTS, - QCOM_ICE_CRYPTO_KEY_SIZE_256, - cfg->crypto_key, - cfg->data_unit_size, slot); - else - return qcom_ice_evict_key(msm_host->ice, slot); + return qcom_ice_program_key(msm_host->ice, + QCOM_ICE_CRYPTO_ALG_AES_XTS, + QCOM_ICE_CRYPTO_KEY_SIZE_256, + cfg->crypto_key, + cfg->data_unit_size, slot); }
#else /* CONFIG_MMC_CRYPTO */
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Steven Rostedt rostedt@goodmis.org
commit afc6717628f959941d7b33728570568b4af1c4b8 upstream.
In order to catch a common bug where a TRACE_EVENT() TP_fast_assign() assigns an address of an allocated string to the ring buffer and then references it in TP_printk(), which can be executed hours later when the string is free, the function test_event_printk() runs on all events as they are registered to make sure there's no unwanted dereferencing.
It calls process_string() to handle cases in TP_printk() format that has "%s". It returns whether or not the string is safe. But it can have some false positives.
For instance, xe_bo_move() has:
TP_printk("move_lacks_source:%s, migrate object %p [size %zu] from %s to %s device_id:%s", __entry->move_lacks_source ? "yes" : "no", __entry->bo, __entry->size, xe_mem_type_to_name[__entry->old_placement], xe_mem_type_to_name[__entry->new_placement], __get_str(device_id))
Where the "%s" references into xe_mem_type_to_name[]. This is an array of pointers that should be safe for the event to access. Instead of flagging this as a bad reference, if a reference points to an array, where the record field is the index, consider it safe.
Link: https://lore.kernel.org/all/9dee19b6185d325d0e6fa5f7cbba81d007d99166.camel@s...
Cc: stable@vger.kernel.org Cc: Masami Hiramatsu mhiramat@kernel.org Cc: Mathieu Desnoyers mathieu.desnoyers@efficios.com Link: https://lore.kernel.org/20241231000646.324fb5f7@gandalf.local.home Fixes: 65a25d9f7ac02 ("tracing: Add "%s" check in test_event_printk()") Reported-by: Genes Lists lists@sapience.com Tested-by: Gene C arch@sapience.com Signed-off-by: Steven Rostedt (Google) rostedt@goodmis.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- kernel/trace/trace_events.c | 12 ++++++++++++ 1 file changed, 12 insertions(+)
--- a/kernel/trace/trace_events.c +++ b/kernel/trace/trace_events.c @@ -361,6 +361,18 @@ static bool process_string(const char *f } while (s < e);
/* + * Check for arrays. If the argument has: foo[REC->val] + * then it is very likely that foo is an array of strings + * that are safe to use. + */ + r = strstr(s, "["); + if (r && r < e) { + r = strstr(r, "REC->"); + if (r && r < e) + return true; + } + + /* * If there's any strings in the argument consider this arg OK as it * could be: REC->field ? "foo" : "bar" and we don't want to get into * verifying that logic here.
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Xiubo Li xiubli@redhat.com
[ Upstream commit 5c5f0d2b5f92c47baf82b9b211e27edd7d195158 ]
This will help print the fsid and client's global_id in debug logs, and also print the function names.
[ idryomov: %lld -> %llu, leading space for doutc(), don't include __func__ in pr_*() variants ]
Link: https://tracker.ceph.com/issues/61590 Signed-off-by: Xiubo Li xiubli@redhat.com Reviewed-by: Patrick Donnelly pdonnell@redhat.com Reviewed-by: Milind Changire mchangir@redhat.com Signed-off-by: Ilya Dryomov idryomov@gmail.com Stable-dep-of: 550f7ca98ee0 ("ceph: give up on paths longer than PATH_MAX") Signed-off-by: Sasha Levin sashal@kernel.org --- include/linux/ceph/ceph_debug.h | 38 +++++++++++++++++++++++++++++++++ 1 file changed, 38 insertions(+)
diff --git a/include/linux/ceph/ceph_debug.h b/include/linux/ceph/ceph_debug.h index d5a5da838caf..11a92a946016 100644 --- a/include/linux/ceph/ceph_debug.h +++ b/include/linux/ceph/ceph_debug.h @@ -19,12 +19,25 @@ pr_debug("%.*s %12.12s:%-4d : " fmt, \ 8 - (int)sizeof(KBUILD_MODNAME), " ", \ kbasename(__FILE__), __LINE__, ##__VA_ARGS__) +# define doutc(client, fmt, ...) \ + pr_debug("%.*s %12.12s:%-4d : [%pU %llu] " fmt, \ + 8 - (int)sizeof(KBUILD_MODNAME), " ", \ + kbasename(__FILE__), __LINE__, \ + &client->fsid, client->monc.auth->global_id, \ + ##__VA_ARGS__) # else /* faux printk call just to see any compiler warnings. */ # define dout(fmt, ...) do { \ if (0) \ printk(KERN_DEBUG fmt, ##__VA_ARGS__); \ } while (0) +# define doutc(client, fmt, ...) do { \ + if (0) \ + printk(KERN_DEBUG "[%pU %llu] " fmt, \ + &client->fsid, \ + client->monc.auth->global_id, \ + ##__VA_ARGS__); \ + } while (0) # endif
#else @@ -33,7 +46,32 @@ * or, just wrap pr_debug */ # define dout(fmt, ...) pr_debug(" " fmt, ##__VA_ARGS__) +# define doutc(client, fmt, ...) \ + pr_debug(" [%pU %llu] %s: " fmt, &client->fsid, \ + client->monc.auth->global_id, __func__, ##__VA_ARGS__)
#endif
+#define pr_notice_client(client, fmt, ...) \ + pr_notice("[%pU %llu]: " fmt, &client->fsid, \ + client->monc.auth->global_id, ##__VA_ARGS__) +#define pr_info_client(client, fmt, ...) \ + pr_info("[%pU %llu]: " fmt, &client->fsid, \ + client->monc.auth->global_id, ##__VA_ARGS__) +#define pr_warn_client(client, fmt, ...) \ + pr_warn("[%pU %llu]: " fmt, &client->fsid, \ + client->monc.auth->global_id, ##__VA_ARGS__) +#define pr_warn_once_client(client, fmt, ...) \ + pr_warn_once("[%pU %llu]: " fmt, &client->fsid, \ + client->monc.auth->global_id, ##__VA_ARGS__) +#define pr_err_client(client, fmt, ...) \ + pr_err("[%pU %llu]: " fmt, &client->fsid, \ + client->monc.auth->global_id, ##__VA_ARGS__) +#define pr_warn_ratelimited_client(client, fmt, ...) \ + pr_warn_ratelimited("[%pU %llu]: " fmt, &client->fsid, \ + client->monc.auth->global_id, ##__VA_ARGS__) +#define pr_err_ratelimited_client(client, fmt, ...) \ + pr_err_ratelimited("[%pU %llu]: " fmt, &client->fsid, \ + client->monc.auth->global_id, ##__VA_ARGS__) + #endif
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Xiubo Li xiubli@redhat.com
[ Upstream commit 38d46409c4639a1d659ebfa70e27a8bed6b8ee1d ]
Multiple CephFS mounts on a host is increasingly common so disambiguating messages like this is necessary and will make it easier to debug issues.
At the same this will improve the debug logs to make them easier to troubleshooting issues, such as print the ino# instead only printing the memory addresses of the corresponding inodes and print the dentry names instead of the corresponding memory addresses for the dentry,etc.
Link: https://tracker.ceph.com/issues/61590 Signed-off-by: Xiubo Li xiubli@redhat.com Reviewed-by: Patrick Donnelly pdonnell@redhat.com Reviewed-by: Milind Changire mchangir@redhat.com Signed-off-by: Ilya Dryomov idryomov@gmail.com Stable-dep-of: 550f7ca98ee0 ("ceph: give up on paths longer than PATH_MAX") Signed-off-by: Sasha Levin sashal@kernel.org --- fs/ceph/acl.c | 6 +- fs/ceph/addr.c | 279 +++++++++-------- fs/ceph/caps.c | 710 +++++++++++++++++++++++++------------------ fs/ceph/crypto.c | 39 ++- fs/ceph/debugfs.c | 6 +- fs/ceph/dir.c | 218 +++++++------ fs/ceph/export.c | 39 +-- fs/ceph/file.c | 245 ++++++++------- fs/ceph/inode.c | 485 ++++++++++++++++------------- fs/ceph/ioctl.c | 13 +- fs/ceph/locks.c | 57 ++-- fs/ceph/mds_client.c | 558 +++++++++++++++++++--------------- fs/ceph/mdsmap.c | 24 +- fs/ceph/metric.c | 5 +- fs/ceph/quota.c | 29 +- fs/ceph/snap.c | 174 ++++++----- fs/ceph/super.c | 70 +++-- fs/ceph/super.h | 6 + fs/ceph/xattr.c | 96 +++--- 19 files changed, 1747 insertions(+), 1312 deletions(-)
diff --git a/fs/ceph/acl.c b/fs/ceph/acl.c index c53a1d220622..e95841961397 100644 --- a/fs/ceph/acl.c +++ b/fs/ceph/acl.c @@ -15,6 +15,7 @@ #include <linux/slab.h>
#include "super.h" +#include "mds_client.h"
static inline void ceph_set_cached_acl(struct inode *inode, int type, struct posix_acl *acl) @@ -31,6 +32,7 @@ static inline void ceph_set_cached_acl(struct inode *inode,
struct posix_acl *ceph_get_acl(struct inode *inode, int type, bool rcu) { + struct ceph_client *cl = ceph_inode_to_client(inode); int size; unsigned int retry_cnt = 0; const char *name; @@ -72,8 +74,8 @@ struct posix_acl *ceph_get_acl(struct inode *inode, int type, bool rcu) } else if (size == -ENODATA || size == 0) { acl = NULL; } else { - pr_err_ratelimited("get acl %llx.%llx failed, err=%d\n", - ceph_vinop(inode), size); + pr_err_ratelimited_client(cl, "%llx.%llx failed, err=%d\n", + ceph_vinop(inode), size); acl = ERR_PTR(-EIO); }
diff --git a/fs/ceph/addr.c b/fs/ceph/addr.c index 2c92de964c5a..ffd3ff5ae99b 100644 --- a/fs/ceph/addr.c +++ b/fs/ceph/addr.c @@ -79,18 +79,18 @@ static inline struct ceph_snap_context *page_snap_context(struct page *page) */ static bool ceph_dirty_folio(struct address_space *mapping, struct folio *folio) { - struct inode *inode; + struct inode *inode = mapping->host; + struct ceph_client *cl = ceph_inode_to_client(inode); struct ceph_inode_info *ci; struct ceph_snap_context *snapc;
if (folio_test_dirty(folio)) { - dout("%p dirty_folio %p idx %lu -- already dirty\n", - mapping->host, folio, folio->index); + doutc(cl, "%llx.%llx %p idx %lu -- already dirty\n", + ceph_vinop(inode), folio, folio->index); VM_BUG_ON_FOLIO(!folio_test_private(folio), folio); return false; }
- inode = mapping->host; ci = ceph_inode(inode);
/* dirty the head */ @@ -110,12 +110,12 @@ static bool ceph_dirty_folio(struct address_space *mapping, struct folio *folio) if (ci->i_wrbuffer_ref == 0) ihold(inode); ++ci->i_wrbuffer_ref; - dout("%p dirty_folio %p idx %lu head %d/%d -> %d/%d " - "snapc %p seq %lld (%d snaps)\n", - mapping->host, folio, folio->index, - ci->i_wrbuffer_ref-1, ci->i_wrbuffer_ref_head-1, - ci->i_wrbuffer_ref, ci->i_wrbuffer_ref_head, - snapc, snapc->seq, snapc->num_snaps); + doutc(cl, "%llx.%llx %p idx %lu head %d/%d -> %d/%d " + "snapc %p seq %lld (%d snaps)\n", + ceph_vinop(inode), folio, folio->index, + ci->i_wrbuffer_ref-1, ci->i_wrbuffer_ref_head-1, + ci->i_wrbuffer_ref, ci->i_wrbuffer_ref_head, + snapc, snapc->seq, snapc->num_snaps); spin_unlock(&ci->i_ceph_lock);
/* @@ -136,23 +136,22 @@ static bool ceph_dirty_folio(struct address_space *mapping, struct folio *folio) static void ceph_invalidate_folio(struct folio *folio, size_t offset, size_t length) { - struct inode *inode; - struct ceph_inode_info *ci; + struct inode *inode = folio->mapping->host; + struct ceph_client *cl = ceph_inode_to_client(inode); + struct ceph_inode_info *ci = ceph_inode(inode); struct ceph_snap_context *snapc;
- inode = folio->mapping->host; - ci = ceph_inode(inode);
if (offset != 0 || length != folio_size(folio)) { - dout("%p invalidate_folio idx %lu partial dirty page %zu~%zu\n", - inode, folio->index, offset, length); + doutc(cl, "%llx.%llx idx %lu partial dirty page %zu~%zu\n", + ceph_vinop(inode), folio->index, offset, length); return; }
WARN_ON(!folio_test_locked(folio)); if (folio_test_private(folio)) { - dout("%p invalidate_folio idx %lu full dirty page\n", - inode, folio->index); + doutc(cl, "%llx.%llx idx %lu full dirty page\n", + ceph_vinop(inode), folio->index);
snapc = folio_detach_private(folio); ceph_put_wrbuffer_cap_refs(ci, 1, snapc); @@ -165,10 +164,10 @@ static void ceph_invalidate_folio(struct folio *folio, size_t offset, static bool ceph_release_folio(struct folio *folio, gfp_t gfp) { struct inode *inode = folio->mapping->host; + struct ceph_client *cl = ceph_inode_to_client(inode);
- dout("%llx:%llx release_folio idx %lu (%sdirty)\n", - ceph_vinop(inode), - folio->index, folio_test_dirty(folio) ? "" : "not "); + doutc(cl, "%llx.%llx idx %lu (%sdirty)\n", ceph_vinop(inode), + folio->index, folio_test_dirty(folio) ? "" : "not ");
if (folio_test_private(folio)) return false; @@ -244,6 +243,7 @@ static void finish_netfs_read(struct ceph_osd_request *req) { struct inode *inode = req->r_inode; struct ceph_fs_client *fsc = ceph_inode_to_fs_client(inode); + struct ceph_client *cl = fsc->client; struct ceph_osd_data *osd_data = osd_req_op_extent_osd_data(req, 0); struct netfs_io_subrequest *subreq = req->r_priv; struct ceph_osd_req_op *op = &req->r_ops[0]; @@ -253,8 +253,8 @@ static void finish_netfs_read(struct ceph_osd_request *req) ceph_update_read_metrics(&fsc->mdsc->metric, req->r_start_latency, req->r_end_latency, osd_data->length, err);
- dout("%s: result %d subreq->len=%zu i_size=%lld\n", __func__, req->r_result, - subreq->len, i_size_read(req->r_inode)); + doutc(cl, "result %d subreq->len=%zu i_size=%lld\n", req->r_result, + subreq->len, i_size_read(req->r_inode));
/* no object means success but no data */ if (err == -ENOENT) @@ -348,6 +348,7 @@ static void ceph_netfs_issue_read(struct netfs_io_subrequest *subreq) struct inode *inode = rreq->inode; struct ceph_inode_info *ci = ceph_inode(inode); struct ceph_fs_client *fsc = ceph_inode_to_fs_client(inode); + struct ceph_client *cl = fsc->client; struct ceph_osd_request *req = NULL; struct ceph_vino vino = ceph_vino(inode); struct iov_iter iter; @@ -384,7 +385,8 @@ static void ceph_netfs_issue_read(struct netfs_io_subrequest *subreq) goto out; }
- dout("%s: pos=%llu orig_len=%zu len=%llu\n", __func__, subreq->start, subreq->len, len); + doutc(cl, "%llx.%llx pos=%llu orig_len=%zu len=%llu\n", + ceph_vinop(inode), subreq->start, subreq->len, len);
iov_iter_xarray(&iter, ITER_DEST, &rreq->mapping->i_pages, subreq->start, len);
@@ -401,8 +403,8 @@ static void ceph_netfs_issue_read(struct netfs_io_subrequest *subreq)
err = iov_iter_get_pages_alloc2(&iter, &pages, len, &page_off); if (err < 0) { - dout("%s: iov_ter_get_pages_alloc returned %d\n", - __func__, err); + doutc(cl, "%llx.%llx failed to allocate pages, %d\n", + ceph_vinop(inode), err); goto out; }
@@ -430,12 +432,13 @@ static void ceph_netfs_issue_read(struct netfs_io_subrequest *subreq) ceph_osdc_put_request(req); if (err) netfs_subreq_terminated(subreq, err, false); - dout("%s: result %d\n", __func__, err); + doutc(cl, "%llx.%llx result %d\n", ceph_vinop(inode), err); }
static int ceph_init_request(struct netfs_io_request *rreq, struct file *file) { struct inode *inode = rreq->inode; + struct ceph_client *cl = ceph_inode_to_client(inode); int got = 0, want = CEPH_CAP_FILE_CACHE; struct ceph_netfs_request_data *priv; int ret = 0; @@ -467,12 +470,12 @@ static int ceph_init_request(struct netfs_io_request *rreq, struct file *file) */ ret = ceph_try_get_caps(inode, CEPH_CAP_FILE_RD, want, true, &got); if (ret < 0) { - dout("start_read %p, error getting cap\n", inode); + doutc(cl, "%llx.%llx, error getting cap\n", ceph_vinop(inode)); goto out; }
if (!(got & want)) { - dout("start_read %p, no cache cap\n", inode); + doutc(cl, "%llx.%llx, no cache cap\n", ceph_vinop(inode)); ret = -EACCES; goto out; } @@ -567,13 +570,14 @@ get_oldest_context(struct inode *inode, struct ceph_writeback_ctl *ctl, struct ceph_snap_context *page_snapc) { struct ceph_inode_info *ci = ceph_inode(inode); + struct ceph_client *cl = ceph_inode_to_client(inode); struct ceph_snap_context *snapc = NULL; struct ceph_cap_snap *capsnap = NULL;
spin_lock(&ci->i_ceph_lock); list_for_each_entry(capsnap, &ci->i_cap_snaps, ci_item) { - dout(" cap_snap %p snapc %p has %d dirty pages\n", capsnap, - capsnap->context, capsnap->dirty_pages); + doutc(cl, " capsnap %p snapc %p has %d dirty pages\n", + capsnap, capsnap->context, capsnap->dirty_pages); if (!capsnap->dirty_pages) continue;
@@ -605,8 +609,8 @@ get_oldest_context(struct inode *inode, struct ceph_writeback_ctl *ctl, } if (!snapc && ci->i_wrbuffer_ref_head) { snapc = ceph_get_snap_context(ci->i_head_snapc); - dout(" head snapc %p has %d dirty pages\n", - snapc, ci->i_wrbuffer_ref_head); + doutc(cl, " head snapc %p has %d dirty pages\n", snapc, + ci->i_wrbuffer_ref_head); if (ctl) { ctl->i_size = i_size_read(inode); ctl->truncate_size = ci->i_truncate_size; @@ -663,6 +667,7 @@ static int writepage_nounlock(struct page *page, struct writeback_control *wbc) struct inode *inode = page->mapping->host; struct ceph_inode_info *ci = ceph_inode(inode); struct ceph_fs_client *fsc = ceph_inode_to_fs_client(inode); + struct ceph_client *cl = fsc->client; struct ceph_snap_context *snapc, *oldest; loff_t page_off = page_offset(page); int err; @@ -674,7 +679,8 @@ static int writepage_nounlock(struct page *page, struct writeback_control *wbc) bool caching = ceph_is_cache_enabled(inode); struct page *bounce_page = NULL;
- dout("writepage %p idx %lu\n", page, page->index); + doutc(cl, "%llx.%llx page %p idx %lu\n", ceph_vinop(inode), page, + page->index);
if (ceph_inode_is_shutdown(inode)) return -EIO; @@ -682,13 +688,14 @@ static int writepage_nounlock(struct page *page, struct writeback_control *wbc) /* verify this is a writeable snap context */ snapc = page_snap_context(page); if (!snapc) { - dout("writepage %p page %p not dirty?\n", inode, page); + doutc(cl, "%llx.%llx page %p not dirty?\n", ceph_vinop(inode), + page); return 0; } oldest = get_oldest_context(inode, &ceph_wbc, snapc); if (snapc->seq > oldest->seq) { - dout("writepage %p page %p snapc %p not writeable - noop\n", - inode, page, snapc); + doutc(cl, "%llx.%llx page %p snapc %p not writeable - noop\n", + ceph_vinop(inode), page, snapc); /* we should only noop if called by kswapd */ WARN_ON(!(current->flags & PF_MEMALLOC)); ceph_put_snap_context(oldest); @@ -699,8 +706,8 @@ static int writepage_nounlock(struct page *page, struct writeback_control *wbc)
/* is this a partial page at end of file? */ if (page_off >= ceph_wbc.i_size) { - dout("folio at %lu beyond eof %llu\n", folio->index, - ceph_wbc.i_size); + doutc(cl, "%llx.%llx folio at %lu beyond eof %llu\n", + ceph_vinop(inode), folio->index, ceph_wbc.i_size); folio_invalidate(folio, 0, folio_size(folio)); return 0; } @@ -709,8 +716,9 @@ static int writepage_nounlock(struct page *page, struct writeback_control *wbc) len = ceph_wbc.i_size - page_off;
wlen = IS_ENCRYPTED(inode) ? round_up(len, CEPH_FSCRYPT_BLOCK_SIZE) : len; - dout("writepage %p page %p index %lu on %llu~%llu snapc %p seq %lld\n", - inode, page, page->index, page_off, wlen, snapc, snapc->seq); + doutc(cl, "%llx.%llx page %p index %lu on %llu~%llu snapc %p seq %lld\n", + ceph_vinop(inode), page, page->index, page_off, wlen, snapc, + snapc->seq);
if (atomic_long_inc_return(&fsc->writeback_count) > CONGESTION_ON_THRESH(fsc->mount_options->congestion_kb)) @@ -751,8 +759,9 @@ static int writepage_nounlock(struct page *page, struct writeback_control *wbc) osd_req_op_extent_osd_data_pages(req, 0, bounce_page ? &bounce_page : &page, wlen, 0, false, false); - dout("writepage %llu~%llu (%llu bytes, %sencrypted)\n", - page_off, len, wlen, IS_ENCRYPTED(inode) ? "" : "not "); + doutc(cl, "%llx.%llx %llu~%llu (%llu bytes, %sencrypted)\n", + ceph_vinop(inode), page_off, len, wlen, + IS_ENCRYPTED(inode) ? "" : "not ");
req->r_mtime = inode->i_mtime; ceph_osdc_start_request(osdc, req); @@ -771,19 +780,21 @@ static int writepage_nounlock(struct page *page, struct writeback_control *wbc) wbc = &tmp_wbc; if (err == -ERESTARTSYS) { /* killed by SIGKILL */ - dout("writepage interrupted page %p\n", page); + doutc(cl, "%llx.%llx interrupted page %p\n", + ceph_vinop(inode), page); redirty_page_for_writepage(wbc, page); end_page_writeback(page); return err; } if (err == -EBLOCKLISTED) fsc->blocklisted = true; - dout("writepage setting page/mapping error %d %p\n", - err, page); + doutc(cl, "%llx.%llx setting page/mapping error %d %p\n", + ceph_vinop(inode), err, page); mapping_set_error(&inode->i_data, err); wbc->pages_skipped++; } else { - dout("writepage cleaned page %p\n", page); + doutc(cl, "%llx.%llx cleaned page %p\n", + ceph_vinop(inode), page); err = 0; /* vfs expects us to return 0 */ } oldest = detach_page_private(page); @@ -835,6 +846,7 @@ static void writepages_finish(struct ceph_osd_request *req) { struct inode *inode = req->r_inode; struct ceph_inode_info *ci = ceph_inode(inode); + struct ceph_client *cl = ceph_inode_to_client(inode); struct ceph_osd_data *osd_data; struct page *page; int num_pages, total_pages = 0; @@ -846,7 +858,7 @@ static void writepages_finish(struct ceph_osd_request *req) unsigned int len = 0; bool remove_page;
- dout("writepages_finish %p rc %d\n", inode, rc); + doutc(cl, "%llx.%llx rc %d\n", ceph_vinop(inode), rc); if (rc < 0) { mapping_set_error(mapping, rc); ceph_set_error_write(ci); @@ -868,8 +880,10 @@ static void writepages_finish(struct ceph_osd_request *req) /* clean all pages */ for (i = 0; i < req->r_num_ops; i++) { if (req->r_ops[i].op != CEPH_OSD_OP_WRITE) { - pr_warn("%s incorrect op %d req %p index %d tid %llu\n", - __func__, req->r_ops[i].op, req, i, req->r_tid); + pr_warn_client(cl, + "%llx.%llx incorrect op %d req %p index %d tid %llu\n", + ceph_vinop(inode), req->r_ops[i].op, req, i, + req->r_tid); break; }
@@ -896,7 +910,7 @@ static void writepages_finish(struct ceph_osd_request *req)
ceph_put_snap_context(detach_page_private(page)); end_page_writeback(page); - dout("unlocking %p\n", page); + doutc(cl, "unlocking %p\n", page);
if (remove_page) generic_error_remove_page(inode->i_mapping, @@ -904,8 +918,9 @@ static void writepages_finish(struct ceph_osd_request *req)
unlock_page(page); } - dout("writepages_finish %p wrote %llu bytes cleaned %d pages\n", - inode, osd_data->length, rc >= 0 ? num_pages : 0); + doutc(cl, "%llx.%llx wrote %llu bytes cleaned %d pages\n", + ceph_vinop(inode), osd_data->length, + rc >= 0 ? num_pages : 0);
release_pages(osd_data->pages, num_pages); } @@ -933,6 +948,7 @@ static int ceph_writepages_start(struct address_space *mapping, struct inode *inode = mapping->host; struct ceph_inode_info *ci = ceph_inode(inode); struct ceph_fs_client *fsc = ceph_inode_to_fs_client(inode); + struct ceph_client *cl = fsc->client; struct ceph_vino vino = ceph_vino(inode); pgoff_t index, start_index, end = -1; struct ceph_snap_context *snapc = NULL, *last_snapc = NULL, *pgsnapc; @@ -950,15 +966,15 @@ static int ceph_writepages_start(struct address_space *mapping, fsc->write_congested) return 0;
- dout("writepages_start %p (mode=%s)\n", inode, - wbc->sync_mode == WB_SYNC_NONE ? "NONE" : - (wbc->sync_mode == WB_SYNC_ALL ? "ALL" : "HOLD")); + doutc(cl, "%llx.%llx (mode=%s)\n", ceph_vinop(inode), + wbc->sync_mode == WB_SYNC_NONE ? "NONE" : + (wbc->sync_mode == WB_SYNC_ALL ? "ALL" : "HOLD"));
if (ceph_inode_is_shutdown(inode)) { if (ci->i_wrbuffer_ref > 0) { - pr_warn_ratelimited( - "writepage_start %p %lld forced umount\n", - inode, ceph_ino(inode)); + pr_warn_ratelimited_client(cl, + "%llx.%llx %lld forced umount\n", + ceph_vinop(inode), ceph_ino(inode)); } mapping_set_error(mapping, -EIO); return -EIO; /* we're in a forced umount, don't write! */ @@ -982,11 +998,11 @@ static int ceph_writepages_start(struct address_space *mapping, if (!snapc) { /* hmm, why does writepages get called when there is no dirty data? */ - dout(" no snap context with dirty data?\n"); + doutc(cl, " no snap context with dirty data?\n"); goto out; } - dout(" oldest snapc is %p seq %lld (%d snaps)\n", - snapc, snapc->seq, snapc->num_snaps); + doutc(cl, " oldest snapc is %p seq %lld (%d snaps)\n", snapc, + snapc->seq, snapc->num_snaps);
should_loop = false; if (ceph_wbc.head_snapc && snapc != last_snapc) { @@ -996,13 +1012,13 @@ static int ceph_writepages_start(struct address_space *mapping, end = -1; if (index > 0) should_loop = true; - dout(" cyclic, start at %lu\n", index); + doutc(cl, " cyclic, start at %lu\n", index); } else { index = wbc->range_start >> PAGE_SHIFT; end = wbc->range_end >> PAGE_SHIFT; if (wbc->range_start == 0 && wbc->range_end == LLONG_MAX) range_whole = true; - dout(" not cyclic, %lu to %lu\n", index, end); + doutc(cl, " not cyclic, %lu to %lu\n", index, end); } } else if (!ceph_wbc.head_snapc) { /* Do not respect wbc->range_{start,end}. Dirty pages @@ -1011,7 +1027,7 @@ static int ceph_writepages_start(struct address_space *mapping, * associated with 'snapc' get written */ if (index > 0) should_loop = true; - dout(" non-head snapc, range whole\n"); + doutc(cl, " non-head snapc, range whole\n"); }
if (wbc->sync_mode == WB_SYNC_ALL || wbc->tagged_writepages) @@ -1034,12 +1050,12 @@ static int ceph_writepages_start(struct address_space *mapping, get_more_pages: nr_folios = filemap_get_folios_tag(mapping, &index, end, tag, &fbatch); - dout("pagevec_lookup_range_tag got %d\n", nr_folios); + doutc(cl, "pagevec_lookup_range_tag got %d\n", nr_folios); if (!nr_folios && !locked_pages) break; for (i = 0; i < nr_folios && locked_pages < max_pages; i++) { page = &fbatch.folios[i]->page; - dout("? %p idx %lu\n", page, page->index); + doutc(cl, "? %p idx %lu\n", page, page->index); if (locked_pages == 0) lock_page(page); /* first page */ else if (!trylock_page(page)) @@ -1048,15 +1064,15 @@ static int ceph_writepages_start(struct address_space *mapping, /* only dirty pages, or our accounting breaks */ if (unlikely(!PageDirty(page)) || unlikely(page->mapping != mapping)) { - dout("!dirty or !mapping %p\n", page); + doutc(cl, "!dirty or !mapping %p\n", page); unlock_page(page); continue; } /* only if matching snap context */ pgsnapc = page_snap_context(page); if (pgsnapc != snapc) { - dout("page snapc %p %lld != oldest %p %lld\n", - pgsnapc, pgsnapc->seq, snapc, snapc->seq); + doutc(cl, "page snapc %p %lld != oldest %p %lld\n", + pgsnapc, pgsnapc->seq, snapc, snapc->seq); if (!should_loop && !ceph_wbc.head_snapc && wbc->sync_mode != WB_SYNC_NONE) @@ -1067,8 +1083,8 @@ static int ceph_writepages_start(struct address_space *mapping, if (page_offset(page) >= ceph_wbc.i_size) { struct folio *folio = page_folio(page);
- dout("folio at %lu beyond eof %llu\n", - folio->index, ceph_wbc.i_size); + doutc(cl, "folio at %lu beyond eof %llu\n", + folio->index, ceph_wbc.i_size); if ((ceph_wbc.size_stable || folio_pos(folio) >= i_size_read(inode)) && folio_clear_dirty_for_io(folio)) @@ -1078,23 +1094,23 @@ static int ceph_writepages_start(struct address_space *mapping, continue; } if (strip_unit_end && (page->index > strip_unit_end)) { - dout("end of strip unit %p\n", page); + doutc(cl, "end of strip unit %p\n", page); unlock_page(page); break; } if (PageWriteback(page) || PageFsCache(page)) { if (wbc->sync_mode == WB_SYNC_NONE) { - dout("%p under writeback\n", page); + doutc(cl, "%p under writeback\n", page); unlock_page(page); continue; } - dout("waiting on writeback %p\n", page); + doutc(cl, "waiting on writeback %p\n", page); wait_on_page_writeback(page); wait_on_page_fscache(page); }
if (!clear_page_dirty_for_io(page)) { - dout("%p !clear_page_dirty_for_io\n", page); + doutc(cl, "%p !clear_page_dirty_for_io\n", page); unlock_page(page); continue; } @@ -1149,8 +1165,8 @@ static int ceph_writepages_start(struct address_space *mapping, }
/* note position of first page in fbatch */ - dout("%p will write page %p idx %lu\n", - inode, page, page->index); + doutc(cl, "%llx.%llx will write page %p idx %lu\n", + ceph_vinop(inode), page, page->index);
if (atomic_long_inc_return(&fsc->writeback_count) > CONGESTION_ON_THRESH( @@ -1164,8 +1180,9 @@ static int ceph_writepages_start(struct address_space *mapping, locked_pages ? GFP_NOWAIT : GFP_NOFS); if (IS_ERR(pages[locked_pages])) { if (PTR_ERR(pages[locked_pages]) == -EINVAL) - pr_err("%s: inode->i_blkbits=%hhu\n", - __func__, inode->i_blkbits); + pr_err_client(cl, + "inode->i_blkbits=%hhu\n", + inode->i_blkbits); /* better not fail on first page! */ BUG_ON(locked_pages == 0); pages[locked_pages] = NULL; @@ -1199,7 +1216,7 @@ static int ceph_writepages_start(struct address_space *mapping,
if (nr_folios && i == nr_folios && locked_pages < max_pages) { - dout("reached end fbatch, trying for more\n"); + doutc(cl, "reached end fbatch, trying for more\n"); folio_batch_release(&fbatch); goto get_more_pages; } @@ -1260,8 +1277,8 @@ static int ceph_writepages_start(struct address_space *mapping, /* Start a new extent */ osd_req_op_extent_dup_last(req, op_idx, cur_offset - offset); - dout("writepages got pages at %llu~%llu\n", - offset, len); + doutc(cl, "got pages at %llu~%llu\n", offset, + len); osd_req_op_extent_osd_data_pages(req, op_idx, data_pages, len, 0, from_pool, false); @@ -1294,12 +1311,13 @@ static int ceph_writepages_start(struct address_space *mapping, if (IS_ENCRYPTED(inode)) len = round_up(len, CEPH_FSCRYPT_BLOCK_SIZE);
- dout("writepages got pages at %llu~%llu\n", offset, len); + doutc(cl, "got pages at %llu~%llu\n", offset, len);
if (IS_ENCRYPTED(inode) && ((offset | len) & ~CEPH_FSCRYPT_BLOCK_MASK)) - pr_warn("%s: bad encrypted write offset=%lld len=%llu\n", - __func__, offset, len); + pr_warn_client(cl, + "bad encrypted write offset=%lld len=%llu\n", + offset, len);
osd_req_op_extent_osd_data_pages(req, op_idx, data_pages, len, 0, from_pool, false); @@ -1351,14 +1369,14 @@ static int ceph_writepages_start(struct address_space *mapping, done = true;
release_folios: - dout("folio_batch release on %d folios (%p)\n", (int)fbatch.nr, - fbatch.nr ? fbatch.folios[0] : NULL); + doutc(cl, "folio_batch release on %d folios (%p)\n", + (int)fbatch.nr, fbatch.nr ? fbatch.folios[0] : NULL); folio_batch_release(&fbatch); }
if (should_loop && !done) { /* more to do; loop back to beginning of file */ - dout("writepages looping back to beginning of file\n"); + doutc(cl, "looping back to beginning of file\n"); end = start_index - 1; /* OK even when start_index == 0 */
/* to write dirty pages associated with next snapc, @@ -1396,7 +1414,8 @@ static int ceph_writepages_start(struct address_space *mapping, out: ceph_osdc_put_request(req); ceph_put_snap_context(last_snapc); - dout("writepages dend - startone, rc = %d\n", rc); + doutc(cl, "%llx.%llx dend - startone, rc = %d\n", ceph_vinop(inode), + rc); return rc; }
@@ -1430,11 +1449,12 @@ static struct ceph_snap_context * ceph_find_incompatible(struct page *page) { struct inode *inode = page->mapping->host; + struct ceph_client *cl = ceph_inode_to_client(inode); struct ceph_inode_info *ci = ceph_inode(inode);
if (ceph_inode_is_shutdown(inode)) { - dout(" page %p %llx:%llx is shutdown\n", page, - ceph_vinop(inode)); + doutc(cl, " %llx.%llx page %p is shutdown\n", + ceph_vinop(inode), page); return ERR_PTR(-ESTALE); }
@@ -1455,13 +1475,15 @@ ceph_find_incompatible(struct page *page) if (snapc->seq > oldest->seq) { /* not writeable -- return it for the caller to deal with */ ceph_put_snap_context(oldest); - dout(" page %p snapc %p not current or oldest\n", page, snapc); + doutc(cl, " %llx.%llx page %p snapc %p not current or oldest\n", + ceph_vinop(inode), page, snapc); return ceph_get_snap_context(snapc); } ceph_put_snap_context(oldest);
/* yay, writeable, do it now (without dropping page lock) */ - dout(" page %p snapc %p not current, but oldest\n", page, snapc); + doutc(cl, " %llx.%llx page %p snapc %p not current, but oldest\n", + ceph_vinop(inode), page, snapc); if (clear_page_dirty_for_io(page)) { int r = writepage_nounlock(page, NULL); if (r < 0) @@ -1530,10 +1552,11 @@ static int ceph_write_end(struct file *file, struct address_space *mapping, { struct folio *folio = page_folio(subpage); struct inode *inode = file_inode(file); + struct ceph_client *cl = ceph_inode_to_client(inode); bool check_cap = false;
- dout("write_end file %p inode %p folio %p %d~%d (%d)\n", file, - inode, folio, (int)pos, (int)copied, (int)len); + doutc(cl, "%llx.%llx file %p folio %p %d~%d (%d)\n", ceph_vinop(inode), + file, folio, (int)pos, (int)copied, (int)len);
if (!folio_test_uptodate(folio)) { /* just return that nothing was copied on a short copy */ @@ -1593,6 +1616,7 @@ static vm_fault_t ceph_filemap_fault(struct vm_fault *vmf) struct vm_area_struct *vma = vmf->vma; struct inode *inode = file_inode(vma->vm_file); struct ceph_inode_info *ci = ceph_inode(inode); + struct ceph_client *cl = ceph_inode_to_client(inode); struct ceph_file_info *fi = vma->vm_file->private_data; loff_t off = (loff_t)vmf->pgoff << PAGE_SHIFT; int want, got, err; @@ -1604,8 +1628,8 @@ static vm_fault_t ceph_filemap_fault(struct vm_fault *vmf)
ceph_block_sigs(&oldset);
- dout("filemap_fault %p %llx.%llx %llu trying to get caps\n", - inode, ceph_vinop(inode), off); + doutc(cl, "%llx.%llx %llu trying to get caps\n", + ceph_vinop(inode), off); if (fi->fmode & CEPH_FILE_MODE_LAZY) want = CEPH_CAP_FILE_CACHE | CEPH_CAP_FILE_LAZYIO; else @@ -1616,8 +1640,8 @@ static vm_fault_t ceph_filemap_fault(struct vm_fault *vmf) if (err < 0) goto out_restore;
- dout("filemap_fault %p %llu got cap refs on %s\n", - inode, off, ceph_cap_string(got)); + doutc(cl, "%llx.%llx %llu got cap refs on %s\n", ceph_vinop(inode), + off, ceph_cap_string(got));
if ((got & (CEPH_CAP_FILE_CACHE | CEPH_CAP_FILE_LAZYIO)) || !ceph_has_inline_data(ci)) { @@ -1625,8 +1649,8 @@ static vm_fault_t ceph_filemap_fault(struct vm_fault *vmf) ceph_add_rw_context(fi, &rw_ctx); ret = filemap_fault(vmf); ceph_del_rw_context(fi, &rw_ctx); - dout("filemap_fault %p %llu drop cap refs %s ret %x\n", - inode, off, ceph_cap_string(got), ret); + doutc(cl, "%llx.%llx %llu drop cap refs %s ret %x\n", + ceph_vinop(inode), off, ceph_cap_string(got), ret); } else err = -EAGAIN;
@@ -1667,8 +1691,8 @@ static vm_fault_t ceph_filemap_fault(struct vm_fault *vmf) ret = VM_FAULT_MAJOR | VM_FAULT_LOCKED; out_inline: filemap_invalidate_unlock_shared(mapping); - dout("filemap_fault %p %llu read inline data ret %x\n", - inode, off, ret); + doutc(cl, "%llx.%llx %llu read inline data ret %x\n", + ceph_vinop(inode), off, ret); } out_restore: ceph_restore_sigs(&oldset); @@ -1682,6 +1706,7 @@ static vm_fault_t ceph_page_mkwrite(struct vm_fault *vmf) { struct vm_area_struct *vma = vmf->vma; struct inode *inode = file_inode(vma->vm_file); + struct ceph_client *cl = ceph_inode_to_client(inode); struct ceph_inode_info *ci = ceph_inode(inode); struct ceph_file_info *fi = vma->vm_file->private_data; struct ceph_cap_flush *prealloc_cf; @@ -1708,8 +1733,8 @@ static vm_fault_t ceph_page_mkwrite(struct vm_fault *vmf) else len = offset_in_thp(page, size);
- dout("page_mkwrite %p %llx.%llx %llu~%zd getting caps i_size %llu\n", - inode, ceph_vinop(inode), off, len, size); + doutc(cl, "%llx.%llx %llu~%zd getting caps i_size %llu\n", + ceph_vinop(inode), off, len, size); if (fi->fmode & CEPH_FILE_MODE_LAZY) want = CEPH_CAP_FILE_BUFFER | CEPH_CAP_FILE_LAZYIO; else @@ -1720,8 +1745,8 @@ static vm_fault_t ceph_page_mkwrite(struct vm_fault *vmf) if (err < 0) goto out_free;
- dout("page_mkwrite %p %llu~%zd got cap refs on %s\n", - inode, off, len, ceph_cap_string(got)); + doutc(cl, "%llx.%llx %llu~%zd got cap refs on %s\n", ceph_vinop(inode), + off, len, ceph_cap_string(got));
/* Update time before taking page lock */ file_update_time(vma->vm_file); @@ -1769,8 +1794,8 @@ static vm_fault_t ceph_page_mkwrite(struct vm_fault *vmf) __mark_inode_dirty(inode, dirty); }
- dout("page_mkwrite %p %llu~%zd dropping cap refs on %s ret %x\n", - inode, off, len, ceph_cap_string(got), ret); + doutc(cl, "%llx.%llx %llu~%zd dropping cap refs on %s ret %x\n", + ceph_vinop(inode), off, len, ceph_cap_string(got), ret); ceph_put_cap_refs_async(ci, got); out_free: ceph_restore_sigs(&oldset); @@ -1784,6 +1809,7 @@ static vm_fault_t ceph_page_mkwrite(struct vm_fault *vmf) void ceph_fill_inline_data(struct inode *inode, struct page *locked_page, char *data, size_t len) { + struct ceph_client *cl = ceph_inode_to_client(inode); struct address_space *mapping = inode->i_mapping; struct page *page;
@@ -1804,8 +1830,8 @@ void ceph_fill_inline_data(struct inode *inode, struct page *locked_page, } }
- dout("fill_inline_data %p %llx.%llx len %zu locked_page %p\n", - inode, ceph_vinop(inode), len, locked_page); + doutc(cl, "%p %llx.%llx len %zu locked_page %p\n", inode, + ceph_vinop(inode), len, locked_page);
if (len > 0) { void *kaddr = kmap_atomic(page); @@ -1830,6 +1856,7 @@ int ceph_uninline_data(struct file *file) struct inode *inode = file_inode(file); struct ceph_inode_info *ci = ceph_inode(inode); struct ceph_fs_client *fsc = ceph_inode_to_fs_client(inode); + struct ceph_client *cl = fsc->client; struct ceph_osd_request *req = NULL; struct ceph_cap_flush *prealloc_cf = NULL; struct folio *folio = NULL; @@ -1842,8 +1869,8 @@ int ceph_uninline_data(struct file *file) inline_version = ci->i_inline_version; spin_unlock(&ci->i_ceph_lock);
- dout("uninline_data %p %llx.%llx inline_version %llu\n", - inode, ceph_vinop(inode), inline_version); + doutc(cl, "%llx.%llx inline_version %llu\n", ceph_vinop(inode), + inline_version);
if (ceph_inode_is_shutdown(inode)) { err = -EIO; @@ -1955,8 +1982,8 @@ int ceph_uninline_data(struct file *file) } out: ceph_free_cap_flush(prealloc_cf); - dout("uninline_data %p %llx.%llx inline_version %llu = %d\n", - inode, ceph_vinop(inode), inline_version, err); + doutc(cl, "%llx.%llx inline_version %llu = %d\n", + ceph_vinop(inode), inline_version, err); return err; }
@@ -1985,6 +2012,7 @@ static int __ceph_pool_perm_get(struct ceph_inode_info *ci, { struct ceph_fs_client *fsc = ceph_inode_to_fs_client(&ci->netfs.inode); struct ceph_mds_client *mdsc = fsc->mdsc; + struct ceph_client *cl = fsc->client; struct ceph_osd_request *rd_req = NULL, *wr_req = NULL; struct rb_node **p, *parent; struct ceph_pool_perm *perm; @@ -2019,10 +2047,10 @@ static int __ceph_pool_perm_get(struct ceph_inode_info *ci, goto out;
if (pool_ns) - dout("__ceph_pool_perm_get pool %lld ns %.*s no perm cached\n", - pool, (int)pool_ns->len, pool_ns->str); + doutc(cl, "pool %lld ns %.*s no perm cached\n", pool, + (int)pool_ns->len, pool_ns->str); else - dout("__ceph_pool_perm_get pool %lld no perm cached\n", pool); + doutc(cl, "pool %lld no perm cached\n", pool);
down_write(&mdsc->pool_perm_rwsem); p = &mdsc->pool_perm_tree.rb_node; @@ -2147,15 +2175,16 @@ static int __ceph_pool_perm_get(struct ceph_inode_info *ci, if (!err) err = have; if (pool_ns) - dout("__ceph_pool_perm_get pool %lld ns %.*s result = %d\n", - pool, (int)pool_ns->len, pool_ns->str, err); + doutc(cl, "pool %lld ns %.*s result = %d\n", pool, + (int)pool_ns->len, pool_ns->str, err); else - dout("__ceph_pool_perm_get pool %lld result = %d\n", pool, err); + doutc(cl, "pool %lld result = %d\n", pool, err); return err; }
int ceph_pool_perm_check(struct inode *inode, int need) { + struct ceph_client *cl = ceph_inode_to_client(inode); struct ceph_inode_info *ci = ceph_inode(inode); struct ceph_string *pool_ns; s64 pool; @@ -2185,13 +2214,11 @@ int ceph_pool_perm_check(struct inode *inode, int need) check: if (flags & CEPH_I_POOL_PERM) { if ((need & CEPH_CAP_FILE_RD) && !(flags & CEPH_I_POOL_RD)) { - dout("ceph_pool_perm_check pool %lld no read perm\n", - pool); + doutc(cl, "pool %lld no read perm\n", pool); return -EPERM; } if ((need & CEPH_CAP_FILE_WR) && !(flags & CEPH_I_POOL_WR)) { - dout("ceph_pool_perm_check pool %lld no write perm\n", - pool); + doutc(cl, "pool %lld no write perm\n", pool); return -EPERM; } return 0; diff --git a/fs/ceph/caps.c b/fs/ceph/caps.c index 00045b8eadd1..a0e3f51b5d02 100644 --- a/fs/ceph/caps.c +++ b/fs/ceph/caps.c @@ -186,10 +186,10 @@ static void __ceph_unreserve_caps(struct ceph_mds_client *mdsc, int nr_caps) mdsc->caps_avail_count += nr_caps; }
- dout("%s: caps %d = %d used + %d resv + %d avail\n", - __func__, - mdsc->caps_total_count, mdsc->caps_use_count, - mdsc->caps_reserve_count, mdsc->caps_avail_count); + doutc(mdsc->fsc->client, + "caps %d = %d used + %d resv + %d avail\n", + mdsc->caps_total_count, mdsc->caps_use_count, + mdsc->caps_reserve_count, mdsc->caps_avail_count); BUG_ON(mdsc->caps_total_count != mdsc->caps_use_count + mdsc->caps_reserve_count + mdsc->caps_avail_count); @@ -202,6 +202,7 @@ static void __ceph_unreserve_caps(struct ceph_mds_client *mdsc, int nr_caps) int ceph_reserve_caps(struct ceph_mds_client *mdsc, struct ceph_cap_reservation *ctx, int need) { + struct ceph_client *cl = mdsc->fsc->client; int i, j; struct ceph_cap *cap; int have; @@ -212,7 +213,7 @@ int ceph_reserve_caps(struct ceph_mds_client *mdsc, struct ceph_mds_session *s; LIST_HEAD(newcaps);
- dout("reserve caps ctx=%p need=%d\n", ctx, need); + doutc(cl, "ctx=%p need=%d\n", ctx, need);
/* first reserve any caps that are already allocated */ spin_lock(&mdsc->caps_list_lock); @@ -272,8 +273,8 @@ int ceph_reserve_caps(struct ceph_mds_client *mdsc, continue; }
- pr_warn("reserve caps ctx=%p ENOMEM need=%d got=%d\n", - ctx, need, have + alloc); + pr_warn_client(cl, "ctx=%p ENOMEM need=%d got=%d\n", ctx, need, + have + alloc); err = -ENOMEM; break; } @@ -298,20 +299,21 @@ int ceph_reserve_caps(struct ceph_mds_client *mdsc,
spin_unlock(&mdsc->caps_list_lock);
- dout("reserve caps ctx=%p %d = %d used + %d resv + %d avail\n", - ctx, mdsc->caps_total_count, mdsc->caps_use_count, - mdsc->caps_reserve_count, mdsc->caps_avail_count); + doutc(cl, "ctx=%p %d = %d used + %d resv + %d avail\n", ctx, + mdsc->caps_total_count, mdsc->caps_use_count, + mdsc->caps_reserve_count, mdsc->caps_avail_count); return err; }
void ceph_unreserve_caps(struct ceph_mds_client *mdsc, struct ceph_cap_reservation *ctx) { + struct ceph_client *cl = mdsc->fsc->client; bool reclaim = false; if (!ctx->count) return;
- dout("unreserve caps ctx=%p count=%d\n", ctx, ctx->count); + doutc(cl, "ctx=%p count=%d\n", ctx, ctx->count); spin_lock(&mdsc->caps_list_lock); __ceph_unreserve_caps(mdsc, ctx->count); ctx->count = 0; @@ -328,6 +330,7 @@ void ceph_unreserve_caps(struct ceph_mds_client *mdsc, struct ceph_cap *ceph_get_cap(struct ceph_mds_client *mdsc, struct ceph_cap_reservation *ctx) { + struct ceph_client *cl = mdsc->fsc->client; struct ceph_cap *cap = NULL;
/* temporary, until we do something about cap import/export */ @@ -359,9 +362,9 @@ struct ceph_cap *ceph_get_cap(struct ceph_mds_client *mdsc, }
spin_lock(&mdsc->caps_list_lock); - dout("get_cap ctx=%p (%d) %d = %d used + %d resv + %d avail\n", - ctx, ctx->count, mdsc->caps_total_count, mdsc->caps_use_count, - mdsc->caps_reserve_count, mdsc->caps_avail_count); + doutc(cl, "ctx=%p (%d) %d = %d used + %d resv + %d avail\n", ctx, + ctx->count, mdsc->caps_total_count, mdsc->caps_use_count, + mdsc->caps_reserve_count, mdsc->caps_avail_count); BUG_ON(!ctx->count); BUG_ON(ctx->count > mdsc->caps_reserve_count); BUG_ON(list_empty(&mdsc->caps_list)); @@ -382,10 +385,12 @@ struct ceph_cap *ceph_get_cap(struct ceph_mds_client *mdsc,
void ceph_put_cap(struct ceph_mds_client *mdsc, struct ceph_cap *cap) { + struct ceph_client *cl = mdsc->fsc->client; + spin_lock(&mdsc->caps_list_lock); - dout("put_cap %p %d = %d used + %d resv + %d avail\n", - cap, mdsc->caps_total_count, mdsc->caps_use_count, - mdsc->caps_reserve_count, mdsc->caps_avail_count); + doutc(cl, "%p %d = %d used + %d resv + %d avail\n", cap, + mdsc->caps_total_count, mdsc->caps_use_count, + mdsc->caps_reserve_count, mdsc->caps_avail_count); mdsc->caps_use_count--; /* * Keep some preallocated caps around (ceph_min_count), to @@ -491,11 +496,13 @@ static void __insert_cap_node(struct ceph_inode_info *ci, static void __cap_set_timeouts(struct ceph_mds_client *mdsc, struct ceph_inode_info *ci) { + struct inode *inode = &ci->netfs.inode; struct ceph_mount_options *opt = mdsc->fsc->mount_options; + ci->i_hold_caps_max = round_jiffies(jiffies + opt->caps_wanted_delay_max * HZ); - dout("__cap_set_timeouts %p %lu\n", &ci->netfs.inode, - ci->i_hold_caps_max - jiffies); + doutc(mdsc->fsc->client, "%p %llx.%llx %lu\n", inode, + ceph_vinop(inode), ci->i_hold_caps_max - jiffies); }
/* @@ -509,8 +516,11 @@ static void __cap_set_timeouts(struct ceph_mds_client *mdsc, static void __cap_delay_requeue(struct ceph_mds_client *mdsc, struct ceph_inode_info *ci) { - dout("__cap_delay_requeue %p flags 0x%lx at %lu\n", &ci->netfs.inode, - ci->i_ceph_flags, ci->i_hold_caps_max); + struct inode *inode = &ci->netfs.inode; + + doutc(mdsc->fsc->client, "%p %llx.%llx flags 0x%lx at %lu\n", + inode, ceph_vinop(inode), ci->i_ceph_flags, + ci->i_hold_caps_max); if (!mdsc->stopping) { spin_lock(&mdsc->cap_delay_lock); if (!list_empty(&ci->i_cap_delay_list)) { @@ -533,7 +543,9 @@ static void __cap_delay_requeue(struct ceph_mds_client *mdsc, static void __cap_delay_requeue_front(struct ceph_mds_client *mdsc, struct ceph_inode_info *ci) { - dout("__cap_delay_requeue_front %p\n", &ci->netfs.inode); + struct inode *inode = &ci->netfs.inode; + + doutc(mdsc->fsc->client, "%p %llx.%llx\n", inode, ceph_vinop(inode)); spin_lock(&mdsc->cap_delay_lock); ci->i_ceph_flags |= CEPH_I_FLUSH; if (!list_empty(&ci->i_cap_delay_list)) @@ -550,7 +562,9 @@ static void __cap_delay_requeue_front(struct ceph_mds_client *mdsc, static void __cap_delay_cancel(struct ceph_mds_client *mdsc, struct ceph_inode_info *ci) { - dout("__cap_delay_cancel %p\n", &ci->netfs.inode); + struct inode *inode = &ci->netfs.inode; + + doutc(mdsc->fsc->client, "%p %llx.%llx\n", inode, ceph_vinop(inode)); if (list_empty(&ci->i_cap_delay_list)) return; spin_lock(&mdsc->cap_delay_lock); @@ -562,6 +576,9 @@ static void __cap_delay_cancel(struct ceph_mds_client *mdsc, static void __check_cap_issue(struct ceph_inode_info *ci, struct ceph_cap *cap, unsigned issued) { + struct inode *inode = &ci->netfs.inode; + struct ceph_client *cl = ceph_inode_to_client(inode); + unsigned had = __ceph_caps_issued(ci, NULL);
lockdep_assert_held(&ci->i_ceph_lock); @@ -586,7 +603,7 @@ static void __check_cap_issue(struct ceph_inode_info *ci, struct ceph_cap *cap, if (issued & CEPH_CAP_FILE_SHARED) atomic_inc(&ci->i_shared_gen); if (S_ISDIR(ci->netfs.inode.i_mode)) { - dout(" marking %p NOT complete\n", &ci->netfs.inode); + doutc(cl, " marking %p NOT complete\n", inode); __ceph_dir_clear_complete(ci); } } @@ -636,6 +653,7 @@ void ceph_add_cap(struct inode *inode, struct ceph_cap **new_cap) { struct ceph_mds_client *mdsc = ceph_inode_to_fs_client(inode)->mdsc; + struct ceph_client *cl = ceph_inode_to_client(inode); struct ceph_inode_info *ci = ceph_inode(inode); struct ceph_cap *cap; int mds = session->s_mds; @@ -644,8 +662,9 @@ void ceph_add_cap(struct inode *inode,
lockdep_assert_held(&ci->i_ceph_lock);
- dout("add_cap %p mds%d cap %llx %s seq %d\n", inode, - session->s_mds, cap_id, ceph_cap_string(issued), seq); + doutc(cl, "%p %llx.%llx mds%d cap %llx %s seq %d\n", inode, + ceph_vinop(inode), session->s_mds, cap_id, + ceph_cap_string(issued), seq);
gen = atomic_read(&session->s_cap_gen);
@@ -723,9 +742,9 @@ void ceph_add_cap(struct inode *inode, actual_wanted = __ceph_caps_wanted(ci); if ((wanted & ~actual_wanted) || (issued & ~actual_wanted & CEPH_CAP_ANY_WR)) { - dout(" issued %s, mds wanted %s, actual %s, queueing\n", - ceph_cap_string(issued), ceph_cap_string(wanted), - ceph_cap_string(actual_wanted)); + doutc(cl, "issued %s, mds wanted %s, actual %s, queueing\n", + ceph_cap_string(issued), ceph_cap_string(wanted), + ceph_cap_string(actual_wanted)); __cap_delay_requeue(mdsc, ci); }
@@ -742,9 +761,9 @@ void ceph_add_cap(struct inode *inode, WARN_ON(ci->i_auth_cap == cap); }
- dout("add_cap inode %p (%llx.%llx) cap %p %s now %s seq %d mds%d\n", - inode, ceph_vinop(inode), cap, ceph_cap_string(issued), - ceph_cap_string(issued|cap->issued), seq, mds); + doutc(cl, "inode %p %llx.%llx cap %p %s now %s seq %d mds%d\n", + inode, ceph_vinop(inode), cap, ceph_cap_string(issued), + ceph_cap_string(issued|cap->issued), seq, mds); cap->cap_id = cap_id; cap->issued = issued; cap->implemented |= issued; @@ -766,6 +785,8 @@ void ceph_add_cap(struct inode *inode, */ static int __cap_is_valid(struct ceph_cap *cap) { + struct inode *inode = &cap->ci->netfs.inode; + struct ceph_client *cl = cap->session->s_mdsc->fsc->client; unsigned long ttl; u32 gen;
@@ -773,9 +794,9 @@ static int __cap_is_valid(struct ceph_cap *cap) ttl = cap->session->s_cap_ttl;
if (cap->cap_gen < gen || time_after_eq(jiffies, ttl)) { - dout("__cap_is_valid %p cap %p issued %s " - "but STALE (gen %u vs %u)\n", &cap->ci->netfs.inode, - cap, ceph_cap_string(cap->issued), cap->cap_gen, gen); + doutc(cl, "%p %llx.%llx cap %p issued %s but STALE (gen %u vs %u)\n", + inode, ceph_vinop(inode), cap, + ceph_cap_string(cap->issued), cap->cap_gen, gen); return 0; }
@@ -789,6 +810,8 @@ static int __cap_is_valid(struct ceph_cap *cap) */ int __ceph_caps_issued(struct ceph_inode_info *ci, int *implemented) { + struct inode *inode = &ci->netfs.inode; + struct ceph_client *cl = ceph_inode_to_client(inode); int have = ci->i_snap_caps; struct ceph_cap *cap; struct rb_node *p; @@ -799,8 +822,8 @@ int __ceph_caps_issued(struct ceph_inode_info *ci, int *implemented) cap = rb_entry(p, struct ceph_cap, ci_node); if (!__cap_is_valid(cap)) continue; - dout("__ceph_caps_issued %p cap %p issued %s\n", - &ci->netfs.inode, cap, ceph_cap_string(cap->issued)); + doutc(cl, "%p %llx.%llx cap %p issued %s\n", inode, + ceph_vinop(inode), cap, ceph_cap_string(cap->issued)); have |= cap->issued; if (implemented) *implemented |= cap->implemented; @@ -843,16 +866,18 @@ int __ceph_caps_issued_other(struct ceph_inode_info *ci, struct ceph_cap *ocap) */ static void __touch_cap(struct ceph_cap *cap) { + struct inode *inode = &cap->ci->netfs.inode; struct ceph_mds_session *s = cap->session; + struct ceph_client *cl = s->s_mdsc->fsc->client;
spin_lock(&s->s_cap_lock); if (!s->s_cap_iterator) { - dout("__touch_cap %p cap %p mds%d\n", &cap->ci->netfs.inode, cap, - s->s_mds); + doutc(cl, "%p %llx.%llx cap %p mds%d\n", inode, + ceph_vinop(inode), cap, s->s_mds); list_move_tail(&cap->session_caps, &s->s_caps); } else { - dout("__touch_cap %p cap %p mds%d NOP, iterating over caps\n", - &cap->ci->netfs.inode, cap, s->s_mds); + doutc(cl, "%p %llx.%llx cap %p mds%d NOP, iterating over caps\n", + inode, ceph_vinop(inode), cap, s->s_mds); } spin_unlock(&s->s_cap_lock); } @@ -864,15 +889,16 @@ static void __touch_cap(struct ceph_cap *cap) */ int __ceph_caps_issued_mask(struct ceph_inode_info *ci, int mask, int touch) { + struct inode *inode = &ci->netfs.inode; + struct ceph_client *cl = ceph_inode_to_client(inode); struct ceph_cap *cap; struct rb_node *p; int have = ci->i_snap_caps;
if ((have & mask) == mask) { - dout("__ceph_caps_issued_mask ino 0x%llx snap issued %s" - " (mask %s)\n", ceph_ino(&ci->netfs.inode), - ceph_cap_string(have), - ceph_cap_string(mask)); + doutc(cl, "mask %p %llx.%llx snap issued %s (mask %s)\n", + inode, ceph_vinop(inode), ceph_cap_string(have), + ceph_cap_string(mask)); return 1; }
@@ -881,10 +907,10 @@ int __ceph_caps_issued_mask(struct ceph_inode_info *ci, int mask, int touch) if (!__cap_is_valid(cap)) continue; if ((cap->issued & mask) == mask) { - dout("__ceph_caps_issued_mask ino 0x%llx cap %p issued %s" - " (mask %s)\n", ceph_ino(&ci->netfs.inode), cap, - ceph_cap_string(cap->issued), - ceph_cap_string(mask)); + doutc(cl, "mask %p %llx.%llx cap %p issued %s (mask %s)\n", + inode, ceph_vinop(inode), cap, + ceph_cap_string(cap->issued), + ceph_cap_string(mask)); if (touch) __touch_cap(cap); return 1; @@ -893,10 +919,10 @@ int __ceph_caps_issued_mask(struct ceph_inode_info *ci, int mask, int touch) /* does a combination of caps satisfy mask? */ have |= cap->issued; if ((have & mask) == mask) { - dout("__ceph_caps_issued_mask ino 0x%llx combo issued %s" - " (mask %s)\n", ceph_ino(&ci->netfs.inode), - ceph_cap_string(cap->issued), - ceph_cap_string(mask)); + doutc(cl, "mask %p %llx.%llx combo issued %s (mask %s)\n", + inode, ceph_vinop(inode), + ceph_cap_string(cap->issued), + ceph_cap_string(mask)); if (touch) { struct rb_node *q;
@@ -954,13 +980,14 @@ int __ceph_caps_revoking_other(struct ceph_inode_info *ci, int ceph_caps_revoking(struct ceph_inode_info *ci, int mask) { struct inode *inode = &ci->netfs.inode; + struct ceph_client *cl = ceph_inode_to_client(inode); int ret;
spin_lock(&ci->i_ceph_lock); ret = __ceph_caps_revoking_other(ci, NULL, mask); spin_unlock(&ci->i_ceph_lock); - dout("ceph_caps_revoking %p %s = %d\n", inode, - ceph_cap_string(mask), ret); + doutc(cl, "%p %llx.%llx %s = %d\n", inode, ceph_vinop(inode), + ceph_cap_string(mask), ret); return ret; }
@@ -1107,19 +1134,21 @@ int ceph_is_any_caps(struct inode *inode) void __ceph_remove_cap(struct ceph_cap *cap, bool queue_release) { struct ceph_mds_session *session = cap->session; + struct ceph_client *cl = session->s_mdsc->fsc->client; struct ceph_inode_info *ci = cap->ci; + struct inode *inode = &ci->netfs.inode; struct ceph_mds_client *mdsc; int removed = 0;
/* 'ci' being NULL means the remove have already occurred */ if (!ci) { - dout("%s: cap inode is NULL\n", __func__); + doutc(cl, "inode is NULL\n"); return; }
lockdep_assert_held(&ci->i_ceph_lock);
- dout("__ceph_remove_cap %p from %p\n", cap, &ci->netfs.inode); + doutc(cl, "%p from %p %llx.%llx\n", cap, inode, ceph_vinop(inode));
mdsc = ceph_inode_to_fs_client(&ci->netfs.inode)->mdsc;
@@ -1132,8 +1161,8 @@ void __ceph_remove_cap(struct ceph_cap *cap, bool queue_release) spin_lock(&session->s_cap_lock); if (session->s_cap_iterator == cap) { /* not yet, we are iterating over this very cap */ - dout("__ceph_remove_cap delaying %p removal from session %p\n", - cap, cap->session); + doutc(cl, "delaying %p removal from session %p\n", cap, + cap->session); } else { list_del_init(&cap->session_caps); session->s_nr_caps--; @@ -1186,7 +1215,7 @@ void ceph_remove_cap(struct ceph_mds_client *mdsc, struct ceph_cap *cap,
/* 'ci' being NULL means the remove have already occurred */ if (!ci) { - dout("%s: cap inode is NULL\n", __func__); + doutc(mdsc->fsc->client, "inode is NULL\n"); return; }
@@ -1228,15 +1257,19 @@ static void encode_cap_msg(struct ceph_msg *msg, struct cap_msg_args *arg) { struct ceph_mds_caps *fc; void *p; - struct ceph_osd_client *osdc = &arg->session->s_mdsc->fsc->client->osdc; - - dout("%s %s %llx %llx caps %s wanted %s dirty %s seq %u/%u tid %llu/%llu mseq %u follows %lld size %llu/%llu xattr_ver %llu xattr_len %d\n", - __func__, ceph_cap_op_name(arg->op), arg->cid, arg->ino, - ceph_cap_string(arg->caps), ceph_cap_string(arg->wanted), - ceph_cap_string(arg->dirty), arg->seq, arg->issue_seq, - arg->flush_tid, arg->oldest_flush_tid, arg->mseq, arg->follows, - arg->size, arg->max_size, arg->xattr_version, - arg->xattr_buf ? (int)arg->xattr_buf->vec.iov_len : 0); + struct ceph_mds_client *mdsc = arg->session->s_mdsc; + struct ceph_osd_client *osdc = &mdsc->fsc->client->osdc; + + doutc(mdsc->fsc->client, + "%s %llx %llx caps %s wanted %s dirty %s seq %u/%u" + " tid %llu/%llu mseq %u follows %lld size %llu/%llu" + " xattr_ver %llu xattr_len %d\n", + ceph_cap_op_name(arg->op), arg->cid, arg->ino, + ceph_cap_string(arg->caps), ceph_cap_string(arg->wanted), + ceph_cap_string(arg->dirty), arg->seq, arg->issue_seq, + arg->flush_tid, arg->oldest_flush_tid, arg->mseq, arg->follows, + arg->size, arg->max_size, arg->xattr_version, + arg->xattr_buf ? (int)arg->xattr_buf->vec.iov_len : 0);
msg->hdr.version = cpu_to_le16(12); msg->hdr.tid = cpu_to_le64(arg->flush_tid); @@ -1373,6 +1406,7 @@ static void __prep_cap(struct cap_msg_args *arg, struct ceph_cap *cap, { struct ceph_inode_info *ci = cap->ci; struct inode *inode = &ci->netfs.inode; + struct ceph_client *cl = ceph_inode_to_client(inode); int held, revoking;
lockdep_assert_held(&ci->i_ceph_lock); @@ -1381,10 +1415,10 @@ static void __prep_cap(struct cap_msg_args *arg, struct ceph_cap *cap, revoking = cap->implemented & ~cap->issued; retain &= ~revoking;
- dout("%s %p cap %p session %p %s -> %s (revoking %s)\n", - __func__, inode, cap, cap->session, - ceph_cap_string(held), ceph_cap_string(held & retain), - ceph_cap_string(revoking)); + doutc(cl, "%p %llx.%llx cap %p session %p %s -> %s (revoking %s)\n", + inode, ceph_vinop(inode), cap, cap->session, + ceph_cap_string(held), ceph_cap_string(held & retain), + ceph_cap_string(revoking)); BUG_ON((retain & CEPH_CAP_PIN) == 0);
ci->i_ceph_flags &= ~CEPH_I_FLUSH; @@ -1500,13 +1534,16 @@ static void __send_cap(struct cap_msg_args *arg, struct ceph_inode_info *ci) { struct ceph_msg *msg; struct inode *inode = &ci->netfs.inode; + struct ceph_client *cl = ceph_inode_to_client(inode);
msg = ceph_msg_new(CEPH_MSG_CLIENT_CAPS, cap_msg_size(arg), GFP_NOFS, false); if (!msg) { - pr_err("error allocating cap msg: ino (%llx.%llx) flushing %s tid %llu, requeuing cap.\n", - ceph_vinop(inode), ceph_cap_string(arg->dirty), - arg->flush_tid); + pr_err_client(cl, + "error allocating cap msg: ino (%llx.%llx)" + " flushing %s tid %llu, requeuing cap.\n", + ceph_vinop(inode), ceph_cap_string(arg->dirty), + arg->flush_tid); spin_lock(&ci->i_ceph_lock); __cap_delay_requeue(arg->session->s_mdsc, ci); spin_unlock(&ci->i_ceph_lock); @@ -1596,11 +1633,13 @@ static void __ceph_flush_snaps(struct ceph_inode_info *ci, { struct inode *inode = &ci->netfs.inode; struct ceph_mds_client *mdsc = session->s_mdsc; + struct ceph_client *cl = mdsc->fsc->client; struct ceph_cap_snap *capsnap; u64 oldest_flush_tid = 0; u64 first_tid = 1, last_tid = 0;
- dout("__flush_snaps %p session %p\n", inode, session); + doutc(cl, "%p %llx.%llx session %p\n", inode, ceph_vinop(inode), + session);
list_for_each_entry(capsnap, &ci->i_cap_snaps, ci_item) { /* @@ -1615,7 +1654,7 @@ static void __ceph_flush_snaps(struct ceph_inode_info *ci,
/* only flush each capsnap once */ if (capsnap->cap_flush.tid > 0) { - dout(" already flushed %p, skipping\n", capsnap); + doutc(cl, "already flushed %p, skipping\n", capsnap); continue; }
@@ -1647,8 +1686,8 @@ static void __ceph_flush_snaps(struct ceph_inode_info *ci, int ret;
if (!(cap && cap->session == session)) { - dout("__flush_snaps %p auth cap %p not mds%d, " - "stop\n", inode, cap, session->s_mds); + doutc(cl, "%p %llx.%llx auth cap %p not mds%d, stop\n", + inode, ceph_vinop(inode), cap, session->s_mds); break; }
@@ -1669,15 +1708,17 @@ static void __ceph_flush_snaps(struct ceph_inode_info *ci, refcount_inc(&capsnap->nref); spin_unlock(&ci->i_ceph_lock);
- dout("__flush_snaps %p capsnap %p tid %llu %s\n", - inode, capsnap, cf->tid, ceph_cap_string(capsnap->dirty)); + doutc(cl, "%p %llx.%llx capsnap %p tid %llu %s\n", inode, + ceph_vinop(inode), capsnap, cf->tid, + ceph_cap_string(capsnap->dirty));
ret = __send_flush_snap(inode, session, capsnap, cap->mseq, oldest_flush_tid); if (ret < 0) { - pr_err("__flush_snaps: error sending cap flushsnap, " - "ino (%llx.%llx) tid %llu follows %llu\n", - ceph_vinop(inode), cf->tid, capsnap->follows); + pr_err_client(cl, "error sending cap flushsnap, " + "ino (%llx.%llx) tid %llu follows %llu\n", + ceph_vinop(inode), cf->tid, + capsnap->follows); }
ceph_put_cap_snap(capsnap); @@ -1690,27 +1731,28 @@ void ceph_flush_snaps(struct ceph_inode_info *ci, { struct inode *inode = &ci->netfs.inode; struct ceph_mds_client *mdsc = ceph_inode_to_fs_client(inode)->mdsc; + struct ceph_client *cl = ceph_inode_to_client(inode); struct ceph_mds_session *session = NULL; bool need_put = false; int mds;
- dout("ceph_flush_snaps %p\n", inode); + doutc(cl, "%p %llx.%llx\n", inode, ceph_vinop(inode)); if (psession) session = *psession; retry: spin_lock(&ci->i_ceph_lock); if (!(ci->i_ceph_flags & CEPH_I_FLUSH_SNAPS)) { - dout(" no capsnap needs flush, doing nothing\n"); + doutc(cl, " no capsnap needs flush, doing nothing\n"); goto out; } if (!ci->i_auth_cap) { - dout(" no auth cap (migrating?), doing nothing\n"); + doutc(cl, " no auth cap (migrating?), doing nothing\n"); goto out; }
mds = ci->i_auth_cap->session->s_mds; if (session && session->s_mds != mds) { - dout(" oops, wrong session %p mutex\n", session); + doutc(cl, " oops, wrong session %p mutex\n", session); ceph_put_mds_session(session); session = NULL; } @@ -1756,21 +1798,23 @@ int __ceph_mark_dirty_caps(struct ceph_inode_info *ci, int mask, struct ceph_mds_client *mdsc = ceph_sb_to_fs_client(ci->netfs.inode.i_sb)->mdsc; struct inode *inode = &ci->netfs.inode; + struct ceph_client *cl = ceph_inode_to_client(inode); int was = ci->i_dirty_caps; int dirty = 0;
lockdep_assert_held(&ci->i_ceph_lock);
if (!ci->i_auth_cap) { - pr_warn("__mark_dirty_caps %p %llx mask %s, " - "but no auth cap (session was closed?)\n", - inode, ceph_ino(inode), ceph_cap_string(mask)); + pr_warn_client(cl, "%p %llx.%llx mask %s, " + "but no auth cap (session was closed?)\n", + inode, ceph_vinop(inode), + ceph_cap_string(mask)); return 0; }
- dout("__mark_dirty_caps %p %s dirty %s -> %s\n", &ci->netfs.inode, - ceph_cap_string(mask), ceph_cap_string(was), - ceph_cap_string(was | mask)); + doutc(cl, "%p %llx.%llx %s dirty %s -> %s\n", inode, + ceph_vinop(inode), ceph_cap_string(mask), + ceph_cap_string(was), ceph_cap_string(was | mask)); ci->i_dirty_caps |= mask; if (was == 0) { struct ceph_mds_session *session = ci->i_auth_cap->session; @@ -1783,8 +1827,9 @@ int __ceph_mark_dirty_caps(struct ceph_inode_info *ci, int mask, ci->i_head_snapc = ceph_get_snap_context( ci->i_snap_realm->cached_context); } - dout(" inode %p now dirty snapc %p auth cap %p\n", - &ci->netfs.inode, ci->i_head_snapc, ci->i_auth_cap); + doutc(cl, "%p %llx.%llx now dirty snapc %p auth cap %p\n", + inode, ceph_vinop(inode), ci->i_head_snapc, + ci->i_auth_cap); BUG_ON(!list_empty(&ci->i_dirty_item)); spin_lock(&mdsc->cap_dirty_lock); list_add(&ci->i_dirty_item, &session->s_cap_dirty); @@ -1878,6 +1923,7 @@ static u64 __mark_caps_flushing(struct inode *inode, u64 *oldest_flush_tid) { struct ceph_mds_client *mdsc = ceph_sb_to_fs_client(inode->i_sb)->mdsc; + struct ceph_client *cl = ceph_inode_to_client(inode); struct ceph_inode_info *ci = ceph_inode(inode); struct ceph_cap_flush *cf = NULL; int flushing; @@ -1888,13 +1934,13 @@ static u64 __mark_caps_flushing(struct inode *inode, BUG_ON(!ci->i_prealloc_cap_flush);
flushing = ci->i_dirty_caps; - dout("__mark_caps_flushing flushing %s, flushing_caps %s -> %s\n", - ceph_cap_string(flushing), - ceph_cap_string(ci->i_flushing_caps), - ceph_cap_string(ci->i_flushing_caps | flushing)); + doutc(cl, "flushing %s, flushing_caps %s -> %s\n", + ceph_cap_string(flushing), + ceph_cap_string(ci->i_flushing_caps), + ceph_cap_string(ci->i_flushing_caps | flushing)); ci->i_flushing_caps |= flushing; ci->i_dirty_caps = 0; - dout(" inode %p now !dirty\n", inode); + doutc(cl, "%p %llx.%llx now !dirty\n", inode, ceph_vinop(inode));
swap(cf, ci->i_prealloc_cap_flush); cf->caps = flushing; @@ -1925,6 +1971,7 @@ static int try_nonblocking_invalidate(struct inode *inode) __releases(ci->i_ceph_lock) __acquires(ci->i_ceph_lock) { + struct ceph_client *cl = ceph_inode_to_client(inode); struct ceph_inode_info *ci = ceph_inode(inode); u32 invalidating_gen = ci->i_rdcache_gen;
@@ -1936,12 +1983,13 @@ static int try_nonblocking_invalidate(struct inode *inode) if (inode->i_data.nrpages == 0 && invalidating_gen == ci->i_rdcache_gen) { /* success. */ - dout("try_nonblocking_invalidate %p success\n", inode); + doutc(cl, "%p %llx.%llx success\n", inode, + ceph_vinop(inode)); /* save any racing async invalidate some trouble */ ci->i_rdcache_revoking = ci->i_rdcache_gen - 1; return 0; } - dout("try_nonblocking_invalidate %p failed\n", inode); + doutc(cl, "%p %llx.%llx failed\n", inode, ceph_vinop(inode)); return -1; }
@@ -1973,6 +2021,7 @@ void ceph_check_caps(struct ceph_inode_info *ci, int flags) { struct inode *inode = &ci->netfs.inode; struct ceph_mds_client *mdsc = ceph_sb_to_mdsc(inode->i_sb); + struct ceph_client *cl = ceph_inode_to_client(inode); struct ceph_cap *cap; u64 flush_tid, oldest_flush_tid; int file_wanted, used, cap_used; @@ -2047,9 +2096,9 @@ void ceph_check_caps(struct ceph_inode_info *ci, int flags) } }
- dout("check_caps %llx.%llx file_want %s used %s dirty %s flushing %s" - " issued %s revoking %s retain %s %s%s%s\n", ceph_vinop(inode), - ceph_cap_string(file_wanted), + doutc(cl, "%p %llx.%llx file_want %s used %s dirty %s " + "flushing %s issued %s revoking %s retain %s %s%s%s\n", + inode, ceph_vinop(inode), ceph_cap_string(file_wanted), ceph_cap_string(used), ceph_cap_string(ci->i_dirty_caps), ceph_cap_string(ci->i_flushing_caps), ceph_cap_string(issued), ceph_cap_string(revoking), @@ -2070,10 +2119,10 @@ void ceph_check_caps(struct ceph_inode_info *ci, int flags) (revoking & (CEPH_CAP_FILE_CACHE| CEPH_CAP_FILE_LAZYIO)) && /* or revoking cache */ !tried_invalidate) { - dout("check_caps trying to invalidate on %llx.%llx\n", - ceph_vinop(inode)); + doutc(cl, "trying to invalidate on %p %llx.%llx\n", + inode, ceph_vinop(inode)); if (try_nonblocking_invalidate(inode) < 0) { - dout("check_caps queuing invalidate\n"); + doutc(cl, "queuing invalidate\n"); queue_invalidate = true; ci->i_rdcache_revoking = ci->i_rdcache_gen; } @@ -2101,35 +2150,35 @@ void ceph_check_caps(struct ceph_inode_info *ci, int flags) cap_used &= ~ci->i_auth_cap->issued;
revoking = cap->implemented & ~cap->issued; - dout(" mds%d cap %p used %s issued %s implemented %s revoking %s\n", - cap->mds, cap, ceph_cap_string(cap_used), - ceph_cap_string(cap->issued), - ceph_cap_string(cap->implemented), - ceph_cap_string(revoking)); + doutc(cl, " mds%d cap %p used %s issued %s implemented %s revoking %s\n", + cap->mds, cap, ceph_cap_string(cap_used), + ceph_cap_string(cap->issued), + ceph_cap_string(cap->implemented), + ceph_cap_string(revoking));
if (cap == ci->i_auth_cap && (cap->issued & CEPH_CAP_FILE_WR)) { /* request larger max_size from MDS? */ if (ci->i_wanted_max_size > ci->i_max_size && ci->i_wanted_max_size > ci->i_requested_max_size) { - dout("requesting new max_size\n"); + doutc(cl, "requesting new max_size\n"); goto ack; }
/* approaching file_max? */ if (__ceph_should_report_size(ci)) { - dout("i_size approaching max_size\n"); + doutc(cl, "i_size approaching max_size\n"); goto ack; } } /* flush anything dirty? */ if (cap == ci->i_auth_cap) { if ((flags & CHECK_CAPS_FLUSH) && ci->i_dirty_caps) { - dout("flushing dirty caps\n"); + doutc(cl, "flushing dirty caps\n"); goto ack; } if (ci->i_ceph_flags & CEPH_I_FLUSH_SNAPS) { - dout("flushing snap caps\n"); + doutc(cl, "flushing snap caps\n"); goto ack; } } @@ -2137,7 +2186,7 @@ void ceph_check_caps(struct ceph_inode_info *ci, int flags) /* completed revocation? going down and there are no caps? */ if (revoking) { if ((revoking & cap_used) == 0) { - dout("completed revocation of %s\n", + doutc(cl, "completed revocation of %s\n", ceph_cap_string(cap->implemented & ~cap->issued)); goto ack; } @@ -2315,6 +2364,7 @@ static int caps_are_flushed(struct inode *inode, u64 flush_tid) static int flush_mdlog_and_wait_inode_unsafe_requests(struct inode *inode) { struct ceph_mds_client *mdsc = ceph_sb_to_fs_client(inode->i_sb)->mdsc; + struct ceph_client *cl = ceph_inode_to_client(inode); struct ceph_inode_info *ci = ceph_inode(inode); struct ceph_mds_request *req1 = NULL, *req2 = NULL; int ret, err = 0; @@ -2404,8 +2454,9 @@ static int flush_mdlog_and_wait_inode_unsafe_requests(struct inode *inode) kfree(sessions); }
- dout("%s %p wait on tid %llu %llu\n", __func__, - inode, req1 ? req1->r_tid : 0ULL, req2 ? req2->r_tid : 0ULL); + doutc(cl, "%p %llx.%llx wait on tid %llu %llu\n", inode, + ceph_vinop(inode), req1 ? req1->r_tid : 0ULL, + req2 ? req2->r_tid : 0ULL); if (req1) { ret = !wait_for_completion_timeout(&req1->r_safe_completion, ceph_timeout_jiffies(req1->r_timeout)); @@ -2431,11 +2482,13 @@ int ceph_fsync(struct file *file, loff_t start, loff_t end, int datasync) { struct inode *inode = file->f_mapping->host; struct ceph_inode_info *ci = ceph_inode(inode); + struct ceph_client *cl = ceph_inode_to_client(inode); u64 flush_tid; int ret, err; int dirty;
- dout("fsync %p%s\n", inode, datasync ? " datasync" : ""); + doutc(cl, "%p %llx.%llx%s\n", inode, ceph_vinop(inode), + datasync ? " datasync" : "");
ret = file_write_and_wait_range(file, start, end); if (datasync) @@ -2446,7 +2499,7 @@ int ceph_fsync(struct file *file, loff_t start, loff_t end, int datasync) goto out;
dirty = try_flush_caps(inode, &flush_tid); - dout("fsync dirty caps are %s\n", ceph_cap_string(dirty)); + doutc(cl, "dirty caps are %s\n", ceph_cap_string(dirty));
err = flush_mdlog_and_wait_inode_unsafe_requests(inode);
@@ -2467,7 +2520,8 @@ int ceph_fsync(struct file *file, loff_t start, loff_t end, int datasync) if (err < 0) ret = err; out: - dout("fsync %p%s result=%d\n", inode, datasync ? " datasync" : "", ret); + doutc(cl, "%p %llx.%llx%s result=%d\n", inode, ceph_vinop(inode), + datasync ? " datasync" : "", ret); return ret; }
@@ -2480,12 +2534,13 @@ int ceph_fsync(struct file *file, loff_t start, loff_t end, int datasync) int ceph_write_inode(struct inode *inode, struct writeback_control *wbc) { struct ceph_inode_info *ci = ceph_inode(inode); + struct ceph_client *cl = ceph_inode_to_client(inode); u64 flush_tid; int err = 0; int dirty; int wait = (wbc->sync_mode == WB_SYNC_ALL && !wbc->for_sync);
- dout("write_inode %p wait=%d\n", inode, wait); + doutc(cl, "%p %llx.%llx wait=%d\n", inode, ceph_vinop(inode), wait); ceph_fscache_unpin_writeback(inode, wbc); if (wait) { err = ceph_wait_on_async_create(inode); @@ -2515,6 +2570,7 @@ static void __kick_flushing_caps(struct ceph_mds_client *mdsc, __acquires(ci->i_ceph_lock) { struct inode *inode = &ci->netfs.inode; + struct ceph_client *cl = mdsc->fsc->client; struct ceph_cap *cap; struct ceph_cap_flush *cf; int ret; @@ -2540,8 +2596,8 @@ static void __kick_flushing_caps(struct ceph_mds_client *mdsc,
cap = ci->i_auth_cap; if (!(cap && cap->session == session)) { - pr_err("%p auth cap %p not mds%d ???\n", - inode, cap, session->s_mds); + pr_err_client(cl, "%p auth cap %p not mds%d ???\n", + inode, cap, session->s_mds); break; }
@@ -2550,8 +2606,9 @@ static void __kick_flushing_caps(struct ceph_mds_client *mdsc, if (!cf->is_capsnap) { struct cap_msg_args arg;
- dout("kick_flushing_caps %p cap %p tid %llu %s\n", - inode, cap, cf->tid, ceph_cap_string(cf->caps)); + doutc(cl, "%p %llx.%llx cap %p tid %llu %s\n", + inode, ceph_vinop(inode), cap, cf->tid, + ceph_cap_string(cf->caps)); __prep_cap(&arg, cap, CEPH_CAP_OP_FLUSH, (cf->tid < last_snap_flush ? CEPH_CLIENT_CAPS_PENDING_CAPSNAP : 0), @@ -2565,9 +2622,9 @@ static void __kick_flushing_caps(struct ceph_mds_client *mdsc, struct ceph_cap_snap *capsnap = container_of(cf, struct ceph_cap_snap, cap_flush); - dout("kick_flushing_caps %p capsnap %p tid %llu %s\n", - inode, capsnap, cf->tid, - ceph_cap_string(capsnap->dirty)); + doutc(cl, "%p %llx.%llx capsnap %p tid %llu %s\n", + inode, ceph_vinop(inode), capsnap, cf->tid, + ceph_cap_string(capsnap->dirty));
refcount_inc(&capsnap->nref); spin_unlock(&ci->i_ceph_lock); @@ -2575,11 +2632,10 @@ static void __kick_flushing_caps(struct ceph_mds_client *mdsc, ret = __send_flush_snap(inode, session, capsnap, cap->mseq, oldest_flush_tid); if (ret < 0) { - pr_err("kick_flushing_caps: error sending " - "cap flushsnap, ino (%llx.%llx) " - "tid %llu follows %llu\n", - ceph_vinop(inode), cf->tid, - capsnap->follows); + pr_err_client(cl, "error sending cap flushsnap," + " %p %llx.%llx tid %llu follows %llu\n", + inode, ceph_vinop(inode), cf->tid, + capsnap->follows); }
ceph_put_cap_snap(capsnap); @@ -2592,22 +2648,26 @@ static void __kick_flushing_caps(struct ceph_mds_client *mdsc, void ceph_early_kick_flushing_caps(struct ceph_mds_client *mdsc, struct ceph_mds_session *session) { + struct ceph_client *cl = mdsc->fsc->client; struct ceph_inode_info *ci; struct ceph_cap *cap; u64 oldest_flush_tid;
- dout("early_kick_flushing_caps mds%d\n", session->s_mds); + doutc(cl, "mds%d\n", session->s_mds);
spin_lock(&mdsc->cap_dirty_lock); oldest_flush_tid = __get_oldest_flush_tid(mdsc); spin_unlock(&mdsc->cap_dirty_lock);
list_for_each_entry(ci, &session->s_cap_flushing, i_flushing_item) { + struct inode *inode = &ci->netfs.inode; + spin_lock(&ci->i_ceph_lock); cap = ci->i_auth_cap; if (!(cap && cap->session == session)) { - pr_err("%p auth cap %p not mds%d ???\n", - &ci->netfs.inode, cap, session->s_mds); + pr_err_client(cl, "%p %llx.%llx auth cap %p not mds%d ???\n", + inode, ceph_vinop(inode), cap, + session->s_mds); spin_unlock(&ci->i_ceph_lock); continue; } @@ -2640,24 +2700,28 @@ void ceph_early_kick_flushing_caps(struct ceph_mds_client *mdsc, void ceph_kick_flushing_caps(struct ceph_mds_client *mdsc, struct ceph_mds_session *session) { + struct ceph_client *cl = mdsc->fsc->client; struct ceph_inode_info *ci; struct ceph_cap *cap; u64 oldest_flush_tid;
lockdep_assert_held(&session->s_mutex);
- dout("kick_flushing_caps mds%d\n", session->s_mds); + doutc(cl, "mds%d\n", session->s_mds);
spin_lock(&mdsc->cap_dirty_lock); oldest_flush_tid = __get_oldest_flush_tid(mdsc); spin_unlock(&mdsc->cap_dirty_lock);
list_for_each_entry(ci, &session->s_cap_flushing, i_flushing_item) { + struct inode *inode = &ci->netfs.inode; + spin_lock(&ci->i_ceph_lock); cap = ci->i_auth_cap; if (!(cap && cap->session == session)) { - pr_err("%p auth cap %p not mds%d ???\n", - &ci->netfs.inode, cap, session->s_mds); + pr_err_client(cl, "%p %llx.%llx auth cap %p not mds%d ???\n", + inode, ceph_vinop(inode), cap, + session->s_mds); spin_unlock(&ci->i_ceph_lock); continue; } @@ -2674,11 +2738,13 @@ void ceph_kick_flushing_inode_caps(struct ceph_mds_session *session, { struct ceph_mds_client *mdsc = session->s_mdsc; struct ceph_cap *cap = ci->i_auth_cap; + struct inode *inode = &ci->netfs.inode;
lockdep_assert_held(&ci->i_ceph_lock);
- dout("%s %p flushing %s\n", __func__, &ci->netfs.inode, - ceph_cap_string(ci->i_flushing_caps)); + doutc(mdsc->fsc->client, "%p %llx.%llx flushing %s\n", + inode, ceph_vinop(inode), + ceph_cap_string(ci->i_flushing_caps));
if (!list_empty(&ci->i_cap_flush_list)) { u64 oldest_flush_tid; @@ -2700,6 +2766,9 @@ void ceph_kick_flushing_inode_caps(struct ceph_mds_session *session, void ceph_take_cap_refs(struct ceph_inode_info *ci, int got, bool snap_rwsem_locked) { + struct inode *inode = &ci->netfs.inode; + struct ceph_client *cl = ceph_inode_to_client(inode); + lockdep_assert_held(&ci->i_ceph_lock);
if (got & CEPH_CAP_PIN) @@ -2720,10 +2789,10 @@ void ceph_take_cap_refs(struct ceph_inode_info *ci, int got, } if (got & CEPH_CAP_FILE_BUFFER) { if (ci->i_wb_ref == 0) - ihold(&ci->netfs.inode); + ihold(inode); ci->i_wb_ref++; - dout("%s %p wb %d -> %d (?)\n", __func__, - &ci->netfs.inode, ci->i_wb_ref-1, ci->i_wb_ref); + doutc(cl, "%p %llx.%llx wb %d -> %d (?)\n", inode, + ceph_vinop(inode), ci->i_wb_ref-1, ci->i_wb_ref); } }
@@ -2751,19 +2820,22 @@ static int try_get_cap_refs(struct inode *inode, int need, int want, { struct ceph_inode_info *ci = ceph_inode(inode); struct ceph_mds_client *mdsc = ceph_inode_to_fs_client(inode)->mdsc; + struct ceph_client *cl = ceph_inode_to_client(inode); int ret = 0; int have, implemented; bool snap_rwsem_locked = false;
- dout("get_cap_refs %p need %s want %s\n", inode, - ceph_cap_string(need), ceph_cap_string(want)); + doutc(cl, "%p %llx.%llx need %s want %s\n", inode, + ceph_vinop(inode), ceph_cap_string(need), + ceph_cap_string(want));
again: spin_lock(&ci->i_ceph_lock);
if ((flags & CHECK_FILELOCK) && (ci->i_ceph_flags & CEPH_I_ERROR_FILELOCK)) { - dout("try_get_cap_refs %p error filelock\n", inode); + doutc(cl, "%p %llx.%llx error filelock\n", inode, + ceph_vinop(inode)); ret = -EIO; goto out_unlock; } @@ -2783,8 +2855,8 @@ static int try_get_cap_refs(struct inode *inode, int need, int want,
if (have & need & CEPH_CAP_FILE_WR) { if (endoff >= 0 && endoff > (loff_t)ci->i_max_size) { - dout("get_cap_refs %p endoff %llu > maxsize %llu\n", - inode, endoff, ci->i_max_size); + doutc(cl, "%p %llx.%llx endoff %llu > maxsize %llu\n", + inode, ceph_vinop(inode), endoff, ci->i_max_size); if (endoff > ci->i_requested_max_size) ret = ci->i_auth_cap ? -EFBIG : -EUCLEAN; goto out_unlock; @@ -2794,7 +2866,8 @@ static int try_get_cap_refs(struct inode *inode, int need, int want, * can get a final snapshot value for size+mtime. */ if (__ceph_have_pending_cap_snap(ci)) { - dout("get_cap_refs %p cap_snap_pending\n", inode); + doutc(cl, "%p %llx.%llx cap_snap_pending\n", inode, + ceph_vinop(inode)); goto out_unlock; } } @@ -2812,9 +2885,9 @@ static int try_get_cap_refs(struct inode *inode, int need, int want, int not = want & ~(have & need); int revoking = implemented & ~have; int exclude = revoking & not; - dout("get_cap_refs %p have %s but not %s (revoking %s)\n", - inode, ceph_cap_string(have), ceph_cap_string(not), - ceph_cap_string(revoking)); + doutc(cl, "%p %llx.%llx have %s but not %s (revoking %s)\n", + inode, ceph_vinop(inode), ceph_cap_string(have), + ceph_cap_string(not), ceph_cap_string(revoking)); if (!exclude || !(exclude & CEPH_CAP_FILE_BUFFER)) { if (!snap_rwsem_locked && !ci->i_head_snapc && @@ -2854,28 +2927,31 @@ static int try_get_cap_refs(struct inode *inode, int need, int want, spin_unlock(&s->s_cap_lock); } if (session_readonly) { - dout("get_cap_refs %p need %s but mds%d readonly\n", - inode, ceph_cap_string(need), ci->i_auth_cap->mds); + doutc(cl, "%p %llx.%llx need %s but mds%d readonly\n", + inode, ceph_vinop(inode), ceph_cap_string(need), + ci->i_auth_cap->mds); ret = -EROFS; goto out_unlock; }
if (ceph_inode_is_shutdown(inode)) { - dout("get_cap_refs %p inode is shutdown\n", inode); + doutc(cl, "%p %llx.%llx inode is shutdown\n", + inode, ceph_vinop(inode)); ret = -ESTALE; goto out_unlock; } mds_wanted = __ceph_caps_mds_wanted(ci, false); if (need & ~mds_wanted) { - dout("get_cap_refs %p need %s > mds_wanted %s\n", - inode, ceph_cap_string(need), - ceph_cap_string(mds_wanted)); + doutc(cl, "%p %llx.%llx need %s > mds_wanted %s\n", + inode, ceph_vinop(inode), ceph_cap_string(need), + ceph_cap_string(mds_wanted)); ret = -EUCLEAN; goto out_unlock; }
- dout("get_cap_refs %p have %s need %s\n", inode, - ceph_cap_string(have), ceph_cap_string(need)); + doutc(cl, "%p %llx.%llx have %s need %s\n", inode, + ceph_vinop(inode), ceph_cap_string(have), + ceph_cap_string(need)); } out_unlock:
@@ -2890,8 +2966,8 @@ static int try_get_cap_refs(struct inode *inode, int need, int want, else if (ret == 1) ceph_update_cap_hit(&mdsc->metric);
- dout("get_cap_refs %p ret %d got %s\n", inode, - ret, ceph_cap_string(*got)); + doutc(cl, "%p %llx.%llx ret %d got %s\n", inode, + ceph_vinop(inode), ret, ceph_cap_string(*got)); return ret; }
@@ -2903,13 +2979,14 @@ static int try_get_cap_refs(struct inode *inode, int need, int want, static void check_max_size(struct inode *inode, loff_t endoff) { struct ceph_inode_info *ci = ceph_inode(inode); + struct ceph_client *cl = ceph_inode_to_client(inode); int check = 0;
/* do we need to explicitly request a larger max_size? */ spin_lock(&ci->i_ceph_lock); if (endoff >= ci->i_max_size && endoff > ci->i_wanted_max_size) { - dout("write %p at large endoff %llu, req max_size\n", - inode, endoff); + doutc(cl, "write %p %llx.%llx at large endoff %llu, req max_size\n", + inode, ceph_vinop(inode), endoff); ci->i_wanted_max_size = endoff; } /* duplicate ceph_check_caps()'s logic */ @@ -3119,10 +3196,12 @@ void ceph_get_cap_refs(struct ceph_inode_info *ci, int caps) static int ceph_try_drop_cap_snap(struct ceph_inode_info *ci, struct ceph_cap_snap *capsnap) { + struct inode *inode = &ci->netfs.inode; + struct ceph_client *cl = ceph_inode_to_client(inode); + if (!capsnap->need_flush && !capsnap->writing && !capsnap->dirty_pages) { - dout("dropping cap_snap %p follows %llu\n", - capsnap, capsnap->follows); + doutc(cl, "%p follows %llu\n", capsnap, capsnap->follows); BUG_ON(capsnap->cap_flush.tid > 0); ceph_put_snap_context(capsnap->context); if (!list_is_last(&capsnap->ci_item, &ci->i_cap_snaps)) @@ -3154,6 +3233,7 @@ static void __ceph_put_cap_refs(struct ceph_inode_info *ci, int had, enum put_cap_refs_mode mode) { struct inode *inode = &ci->netfs.inode; + struct ceph_client *cl = ceph_inode_to_client(inode); int last = 0, put = 0, flushsnaps = 0, wake = 0; bool check_flushsnaps = false;
@@ -3176,8 +3256,8 @@ static void __ceph_put_cap_refs(struct ceph_inode_info *ci, int had, put++; check_flushsnaps = true; } - dout("put_cap_refs %p wb %d -> %d (?)\n", - inode, ci->i_wb_ref+1, ci->i_wb_ref); + doutc(cl, "%p %llx.%llx wb %d -> %d (?)\n", inode, + ceph_vinop(inode), ci->i_wb_ref+1, ci->i_wb_ref); } if (had & CEPH_CAP_FILE_WR) { if (--ci->i_wr_ref == 0) { @@ -3217,8 +3297,8 @@ static void __ceph_put_cap_refs(struct ceph_inode_info *ci, int had, } spin_unlock(&ci->i_ceph_lock);
- dout("put_cap_refs %p had %s%s%s\n", inode, ceph_cap_string(had), - last ? " last" : "", put ? " put" : ""); + doutc(cl, "%p %llx.%llx had %s%s%s\n", inode, ceph_vinop(inode), + ceph_cap_string(had), last ? " last" : "", put ? " put" : "");
switch (mode) { case PUT_CAP_REFS_SYNC: @@ -3268,6 +3348,7 @@ void ceph_put_wrbuffer_cap_refs(struct ceph_inode_info *ci, int nr, struct ceph_snap_context *snapc) { struct inode *inode = &ci->netfs.inode; + struct ceph_client *cl = ceph_inode_to_client(inode); struct ceph_cap_snap *capsnap = NULL, *iter; int put = 0; bool last = false; @@ -3291,11 +3372,10 @@ void ceph_put_wrbuffer_cap_refs(struct ceph_inode_info *ci, int nr, ceph_put_snap_context(ci->i_head_snapc); ci->i_head_snapc = NULL; } - dout("put_wrbuffer_cap_refs on %p head %d/%d -> %d/%d %s\n", - inode, - ci->i_wrbuffer_ref+nr, ci->i_wrbuffer_ref_head+nr, - ci->i_wrbuffer_ref, ci->i_wrbuffer_ref_head, - last ? " LAST" : ""); + doutc(cl, "on %p %llx.%llx head %d/%d -> %d/%d %s\n", + inode, ceph_vinop(inode), ci->i_wrbuffer_ref+nr, + ci->i_wrbuffer_ref_head+nr, ci->i_wrbuffer_ref, + ci->i_wrbuffer_ref_head, last ? " LAST" : ""); } else { list_for_each_entry(iter, &ci->i_cap_snaps, ci_item) { if (iter->context == snapc) { @@ -3325,13 +3405,12 @@ void ceph_put_wrbuffer_cap_refs(struct ceph_inode_info *ci, int nr, } } } - dout("put_wrbuffer_cap_refs on %p cap_snap %p " - " snap %lld %d/%d -> %d/%d %s%s\n", - inode, capsnap, capsnap->context->seq, - ci->i_wrbuffer_ref+nr, capsnap->dirty_pages + nr, - ci->i_wrbuffer_ref, capsnap->dirty_pages, - last ? " (wrbuffer last)" : "", - complete_capsnap ? " (complete capsnap)" : ""); + doutc(cl, "%p %llx.%llx cap_snap %p snap %lld %d/%d -> %d/%d %s%s\n", + inode, ceph_vinop(inode), capsnap, capsnap->context->seq, + ci->i_wrbuffer_ref+nr, capsnap->dirty_pages + nr, + ci->i_wrbuffer_ref, capsnap->dirty_pages, + last ? " (wrbuffer last)" : "", + complete_capsnap ? " (complete capsnap)" : ""); }
unlock: @@ -3354,9 +3433,10 @@ void ceph_put_wrbuffer_cap_refs(struct ceph_inode_info *ci, int nr, */ static void invalidate_aliases(struct inode *inode) { + struct ceph_client *cl = ceph_inode_to_client(inode); struct dentry *dn, *prev = NULL;
- dout("invalidate_aliases inode %p\n", inode); + doutc(cl, "%p %llx.%llx\n", inode, ceph_vinop(inode)); d_prune_aliases(inode); /* * For non-directory inode, d_find_alias() only returns @@ -3415,6 +3495,7 @@ static void handle_cap_grant(struct inode *inode, __releases(ci->i_ceph_lock) __releases(session->s_mdsc->snap_rwsem) { + struct ceph_client *cl = ceph_inode_to_client(inode); struct ceph_inode_info *ci = ceph_inode(inode); int seq = le32_to_cpu(grant->seq); int newcaps = le32_to_cpu(grant->caps); @@ -3438,10 +3519,11 @@ static void handle_cap_grant(struct inode *inode, if (IS_ENCRYPTED(inode) && size) size = extra_info->fscrypt_file_size;
- dout("handle_cap_grant inode %p cap %p mds%d seq %d %s\n", - inode, cap, session->s_mds, seq, ceph_cap_string(newcaps)); - dout(" size %llu max_size %llu, i_size %llu\n", size, max_size, - i_size_read(inode)); + doutc(cl, "%p %llx.%llx cap %p mds%d seq %d %s\n", inode, + ceph_vinop(inode), cap, session->s_mds, seq, + ceph_cap_string(newcaps)); + doutc(cl, " size %llu max_size %llu, i_size %llu\n", size, + max_size, i_size_read(inode));
/* @@ -3501,15 +3583,17 @@ static void handle_cap_grant(struct inode *inode, inode->i_uid = make_kuid(&init_user_ns, le32_to_cpu(grant->uid)); inode->i_gid = make_kgid(&init_user_ns, le32_to_cpu(grant->gid)); ci->i_btime = extra_info->btime; - dout("%p mode 0%o uid.gid %d.%d\n", inode, inode->i_mode, - from_kuid(&init_user_ns, inode->i_uid), - from_kgid(&init_user_ns, inode->i_gid)); + doutc(cl, "%p %llx.%llx mode 0%o uid.gid %d.%d\n", inode, + ceph_vinop(inode), inode->i_mode, + from_kuid(&init_user_ns, inode->i_uid), + from_kgid(&init_user_ns, inode->i_gid)); #if IS_ENABLED(CONFIG_FS_ENCRYPTION) if (ci->fscrypt_auth_len != extra_info->fscrypt_auth_len || memcmp(ci->fscrypt_auth, extra_info->fscrypt_auth, ci->fscrypt_auth_len)) - pr_warn_ratelimited("%s: cap grant attempt to change fscrypt_auth on non-I_NEW inode (old len %d new len %d)\n", - __func__, ci->fscrypt_auth_len, + pr_warn_ratelimited_client(cl, + "cap grant attempt to change fscrypt_auth on non-I_NEW inode (old len %d new len %d)\n", + ci->fscrypt_auth_len, extra_info->fscrypt_auth_len); #endif } @@ -3527,8 +3611,8 @@ static void handle_cap_grant(struct inode *inode, u64 version = le64_to_cpu(grant->xattr_version);
if (version > ci->i_xattrs.version) { - dout(" got new xattrs v%llu on %p len %d\n", - version, inode, len); + doutc(cl, " got new xattrs v%llu on %p %llx.%llx len %d\n", + version, inode, ceph_vinop(inode), len); if (ci->i_xattrs.blob) ceph_buffer_put(ci->i_xattrs.blob); ci->i_xattrs.blob = ceph_buffer_get(xattr_buf); @@ -3579,8 +3663,8 @@ static void handle_cap_grant(struct inode *inode,
if (ci->i_auth_cap == cap && (newcaps & CEPH_CAP_ANY_FILE_WR)) { if (max_size != ci->i_max_size) { - dout("max_size %lld -> %llu\n", - ci->i_max_size, max_size); + doutc(cl, "max_size %lld -> %llu\n", ci->i_max_size, + max_size); ci->i_max_size = max_size; if (max_size >= ci->i_wanted_max_size) { ci->i_wanted_max_size = 0; /* reset */ @@ -3594,10 +3678,9 @@ static void handle_cap_grant(struct inode *inode, wanted = __ceph_caps_wanted(ci); used = __ceph_caps_used(ci); dirty = __ceph_caps_dirty(ci); - dout(" my wanted = %s, used = %s, dirty %s\n", - ceph_cap_string(wanted), - ceph_cap_string(used), - ceph_cap_string(dirty)); + doutc(cl, " my wanted = %s, used = %s, dirty %s\n", + ceph_cap_string(wanted), ceph_cap_string(used), + ceph_cap_string(dirty));
if ((was_stale || le32_to_cpu(grant->op) == CEPH_CAP_OP_IMPORT) && (wanted & ~(cap->mds_wanted | newcaps))) { @@ -3618,10 +3701,9 @@ static void handle_cap_grant(struct inode *inode, if (cap->issued & ~newcaps) { int revoking = cap->issued & ~newcaps;
- dout("revocation: %s -> %s (revoking %s)\n", - ceph_cap_string(cap->issued), - ceph_cap_string(newcaps), - ceph_cap_string(revoking)); + doutc(cl, "revocation: %s -> %s (revoking %s)\n", + ceph_cap_string(cap->issued), ceph_cap_string(newcaps), + ceph_cap_string(revoking)); if (S_ISREG(inode->i_mode) && (revoking & used & CEPH_CAP_FILE_BUFFER)) writeback = true; /* initiate writeback; will delay ack */ @@ -3639,11 +3721,12 @@ static void handle_cap_grant(struct inode *inode, cap->issued = newcaps; cap->implemented |= newcaps; } else if (cap->issued == newcaps) { - dout("caps unchanged: %s -> %s\n", - ceph_cap_string(cap->issued), ceph_cap_string(newcaps)); + doutc(cl, "caps unchanged: %s -> %s\n", + ceph_cap_string(cap->issued), + ceph_cap_string(newcaps)); } else { - dout("grant: %s -> %s\n", ceph_cap_string(cap->issued), - ceph_cap_string(newcaps)); + doutc(cl, "grant: %s -> %s\n", ceph_cap_string(cap->issued), + ceph_cap_string(newcaps)); /* non-auth MDS is revoking the newly grant caps ? */ if (cap == ci->i_auth_cap && __ceph_caps_revoking_other(ci, cap, newcaps)) @@ -3732,6 +3815,7 @@ static void handle_cap_flush_ack(struct inode *inode, u64 flush_tid, { struct ceph_inode_info *ci = ceph_inode(inode); struct ceph_mds_client *mdsc = ceph_sb_to_fs_client(inode->i_sb)->mdsc; + struct ceph_client *cl = mdsc->fsc->client; struct ceph_cap_flush *cf, *tmp_cf; LIST_HEAD(to_remove); unsigned seq = le32_to_cpu(m->seq); @@ -3768,11 +3852,11 @@ static void handle_cap_flush_ack(struct inode *inode, u64 flush_tid, } }
- dout("handle_cap_flush_ack inode %p mds%d seq %d on %s cleaned %s," - " flushing %s -> %s\n", - inode, session->s_mds, seq, ceph_cap_string(dirty), - ceph_cap_string(cleaned), ceph_cap_string(ci->i_flushing_caps), - ceph_cap_string(ci->i_flushing_caps & ~cleaned)); + doutc(cl, "%p %llx.%llx mds%d seq %d on %s cleaned %s, flushing %s -> %s\n", + inode, ceph_vinop(inode), session->s_mds, seq, + ceph_cap_string(dirty), ceph_cap_string(cleaned), + ceph_cap_string(ci->i_flushing_caps), + ceph_cap_string(ci->i_flushing_caps & ~cleaned));
if (list_empty(&to_remove) && !cleaned) goto out; @@ -3788,18 +3872,21 @@ static void handle_cap_flush_ack(struct inode *inode, u64 flush_tid, if (list_empty(&ci->i_cap_flush_list)) { list_del_init(&ci->i_flushing_item); if (!list_empty(&session->s_cap_flushing)) { - dout(" mds%d still flushing cap on %p\n", - session->s_mds, - &list_first_entry(&session->s_cap_flushing, - struct ceph_inode_info, - i_flushing_item)->netfs.inode); + struct inode *inode = + &list_first_entry(&session->s_cap_flushing, + struct ceph_inode_info, + i_flushing_item)->netfs.inode; + doutc(cl, " mds%d still flushing cap on %p %llx.%llx\n", + session->s_mds, inode, ceph_vinop(inode)); } } mdsc->num_cap_flushing--; - dout(" inode %p now !flushing\n", inode); + doutc(cl, " %p %llx.%llx now !flushing\n", inode, + ceph_vinop(inode));
if (ci->i_dirty_caps == 0) { - dout(" inode %p now clean\n", inode); + doutc(cl, " %p %llx.%llx now clean\n", inode, + ceph_vinop(inode)); BUG_ON(!list_empty(&ci->i_dirty_item)); drop = true; if (ci->i_wr_ref == 0 && @@ -3838,11 +3925,13 @@ void __ceph_remove_capsnap(struct inode *inode, struct ceph_cap_snap *capsnap, { struct ceph_inode_info *ci = ceph_inode(inode); struct ceph_mds_client *mdsc = ceph_sb_to_fs_client(inode->i_sb)->mdsc; + struct ceph_client *cl = mdsc->fsc->client; bool ret;
lockdep_assert_held(&ci->i_ceph_lock);
- dout("removing capsnap %p, inode %p ci %p\n", capsnap, inode, ci); + doutc(cl, "removing capsnap %p, %p %llx.%llx ci %p\n", capsnap, + inode, ceph_vinop(inode), ci);
list_del_init(&capsnap->ci_item); ret = __detach_cap_flush_from_ci(ci, &capsnap->cap_flush); @@ -3882,28 +3971,30 @@ static void handle_cap_flushsnap_ack(struct inode *inode, u64 flush_tid, { struct ceph_inode_info *ci = ceph_inode(inode); struct ceph_mds_client *mdsc = ceph_sb_to_fs_client(inode->i_sb)->mdsc; + struct ceph_client *cl = mdsc->fsc->client; u64 follows = le64_to_cpu(m->snap_follows); struct ceph_cap_snap *capsnap = NULL, *iter; bool wake_ci = false; bool wake_mdsc = false;
- dout("handle_cap_flushsnap_ack inode %p ci %p mds%d follows %lld\n", - inode, ci, session->s_mds, follows); + doutc(cl, "%p %llx.%llx ci %p mds%d follows %lld\n", inode, + ceph_vinop(inode), ci, session->s_mds, follows);
spin_lock(&ci->i_ceph_lock); list_for_each_entry(iter, &ci->i_cap_snaps, ci_item) { if (iter->follows == follows) { if (iter->cap_flush.tid != flush_tid) { - dout(" cap_snap %p follows %lld tid %lld !=" - " %lld\n", iter, follows, - flush_tid, iter->cap_flush.tid); + doutc(cl, " cap_snap %p follows %lld " + "tid %lld != %lld\n", iter, + follows, flush_tid, + iter->cap_flush.tid); break; } capsnap = iter; break; } else { - dout(" skipping cap_snap %p follows %lld\n", - iter, iter->follows); + doutc(cl, " skipping cap_snap %p follows %lld\n", + iter, iter->follows); } } if (capsnap) @@ -3932,6 +4023,7 @@ static bool handle_cap_trunc(struct inode *inode, struct cap_extra_info *extra_info) { struct ceph_inode_info *ci = ceph_inode(inode); + struct ceph_client *cl = ceph_inode_to_client(inode); int mds = session->s_mds; int seq = le32_to_cpu(trunc->seq); u32 truncate_seq = le32_to_cpu(trunc->truncate_seq); @@ -3954,8 +4046,8 @@ static bool handle_cap_trunc(struct inode *inode, if (IS_ENCRYPTED(inode) && size) size = extra_info->fscrypt_file_size;
- dout("%s inode %p mds%d seq %d to %lld truncate seq %d\n", - __func__, inode, mds, seq, truncate_size, truncate_seq); + doutc(cl, "%p %llx.%llx mds%d seq %d to %lld truncate seq %d\n", + inode, ceph_vinop(inode), mds, seq, truncate_size, truncate_seq); queue_trunc = ceph_fill_file_size(inode, issued, truncate_seq, truncate_size, size); return queue_trunc; @@ -3974,6 +4066,7 @@ static void handle_cap_export(struct inode *inode, struct ceph_mds_caps *ex, struct ceph_mds_session *session) { struct ceph_mds_client *mdsc = ceph_inode_to_fs_client(inode)->mdsc; + struct ceph_client *cl = mdsc->fsc->client; struct ceph_mds_session *tsession = NULL; struct ceph_cap *cap, *tcap, *new_cap = NULL; struct ceph_inode_info *ci = ceph_inode(inode); @@ -3993,8 +4086,8 @@ static void handle_cap_export(struct inode *inode, struct ceph_mds_caps *ex, target = -1; }
- dout("handle_cap_export inode %p ci %p mds%d mseq %d target %d\n", - inode, ci, mds, mseq, target); + doutc(cl, "%p %llx.%llx ci %p mds%d mseq %d target %d\n", + inode, ceph_vinop(inode), ci, mds, mseq, target); retry: down_read(&mdsc->snap_rwsem); spin_lock(&ci->i_ceph_lock); @@ -4014,12 +4107,13 @@ static void handle_cap_export(struct inode *inode, struct ceph_mds_caps *ex,
issued = cap->issued; if (issued != cap->implemented) - pr_err_ratelimited("handle_cap_export: issued != implemented: " - "ino (%llx.%llx) mds%d seq %d mseq %d " - "issued %s implemented %s\n", - ceph_vinop(inode), mds, cap->seq, cap->mseq, - ceph_cap_string(issued), - ceph_cap_string(cap->implemented)); + pr_err_ratelimited_client(cl, "issued != implemented: " + "%p %llx.%llx mds%d seq %d mseq %d" + " issued %s implemented %s\n", + inode, ceph_vinop(inode), mds, + cap->seq, cap->mseq, + ceph_cap_string(issued), + ceph_cap_string(cap->implemented));
tcap = __get_cap_for_mds(ci, target); @@ -4027,7 +4121,8 @@ static void handle_cap_export(struct inode *inode, struct ceph_mds_caps *ex, /* already have caps from the target */ if (tcap->cap_id == t_cap_id && ceph_seq_cmp(tcap->seq, t_seq) < 0) { - dout(" updating import cap %p mds%d\n", tcap, target); + doutc(cl, " updating import cap %p mds%d\n", tcap, + target); tcap->cap_id = t_cap_id; tcap->seq = t_seq - 1; tcap->issue_seq = t_seq - 1; @@ -4108,6 +4203,7 @@ static void handle_cap_import(struct ceph_mds_client *mdsc, struct ceph_cap **target_cap, int *old_issued) { struct ceph_inode_info *ci = ceph_inode(inode); + struct ceph_client *cl = mdsc->fsc->client; struct ceph_cap *cap, *ocap, *new_cap = NULL; int mds = session->s_mds; int issued; @@ -4128,8 +4224,8 @@ static void handle_cap_import(struct ceph_mds_client *mdsc, peer = -1; }
- dout("handle_cap_import inode %p ci %p mds%d mseq %d peer %d\n", - inode, ci, mds, mseq, peer); + doutc(cl, "%p %llx.%llx ci %p mds%d mseq %d peer %d\n", + inode, ceph_vinop(inode), ci, mds, mseq, peer); retry: cap = __get_cap_for_mds(ci, mds); if (!cap) { @@ -4155,17 +4251,17 @@ static void handle_cap_import(struct ceph_mds_client *mdsc,
ocap = peer >= 0 ? __get_cap_for_mds(ci, peer) : NULL; if (ocap && ocap->cap_id == p_cap_id) { - dout(" remove export cap %p mds%d flags %d\n", - ocap, peer, ph->flags); + doutc(cl, " remove export cap %p mds%d flags %d\n", + ocap, peer, ph->flags); if ((ph->flags & CEPH_CAP_FLAG_AUTH) && (ocap->seq != le32_to_cpu(ph->seq) || ocap->mseq != le32_to_cpu(ph->mseq))) { - pr_err_ratelimited("handle_cap_import: " - "mismatched seq/mseq: ino (%llx.%llx) " - "mds%d seq %d mseq %d importer mds%d " - "has peer seq %d mseq %d\n", - ceph_vinop(inode), peer, ocap->seq, - ocap->mseq, mds, le32_to_cpu(ph->seq), + pr_err_ratelimited_client(cl, "mismatched seq/mseq: " + "%p %llx.%llx mds%d seq %d mseq %d" + " importer mds%d has peer seq %d mseq %d\n", + inode, ceph_vinop(inode), peer, + ocap->seq, ocap->mseq, mds, + le32_to_cpu(ph->seq), le32_to_cpu(ph->mseq)); } ceph_remove_cap(mdsc, ocap, (ph->flags & CEPH_CAP_FLAG_RELEASE)); @@ -4231,6 +4327,7 @@ void ceph_handle_caps(struct ceph_mds_session *session, struct ceph_msg *msg) { struct ceph_mds_client *mdsc = session->s_mdsc; + struct ceph_client *cl = mdsc->fsc->client; struct inode *inode; struct ceph_inode_info *ci; struct ceph_cap *cap; @@ -4249,7 +4346,7 @@ void ceph_handle_caps(struct ceph_mds_session *session, bool close_sessions = false; bool do_cap_release = false;
- dout("handle_caps from mds%d\n", session->s_mds); + doutc(cl, "from mds%d\n", session->s_mds);
if (!ceph_inc_mds_stopping_blocker(mdsc, session)) return; @@ -4351,15 +4448,15 @@ void ceph_handle_caps(struct ceph_mds_session *session,
/* lookup ino */ inode = ceph_find_inode(mdsc->fsc->sb, vino); - dout(" op %s ino %llx.%llx inode %p\n", ceph_cap_op_name(op), vino.ino, - vino.snap, inode); + doutc(cl, " op %s ino %llx.%llx inode %p\n", ceph_cap_op_name(op), + vino.ino, vino.snap, inode);
mutex_lock(&session->s_mutex); - dout(" mds%d seq %lld cap seq %u\n", session->s_mds, session->s_seq, - (unsigned)seq); + doutc(cl, " mds%d seq %lld cap seq %u\n", session->s_mds, + session->s_seq, (unsigned)seq);
if (!inode) { - dout(" i don't have ino %llx\n", vino.ino); + doutc(cl, " i don't have ino %llx\n", vino.ino);
switch (op) { case CEPH_CAP_OP_IMPORT: @@ -4414,9 +4511,9 @@ void ceph_handle_caps(struct ceph_mds_session *session, spin_lock(&ci->i_ceph_lock); cap = __get_cap_for_mds(ceph_inode(inode), session->s_mds); if (!cap) { - dout(" no cap on %p ino %llx.%llx from mds%d\n", - inode, ceph_ino(inode), ceph_snap(inode), - session->s_mds); + doutc(cl, " no cap on %p ino %llx.%llx from mds%d\n", + inode, ceph_ino(inode), ceph_snap(inode), + session->s_mds); spin_unlock(&ci->i_ceph_lock); switch (op) { case CEPH_CAP_OP_REVOKE: @@ -4454,8 +4551,8 @@ void ceph_handle_caps(struct ceph_mds_session *session,
default: spin_unlock(&ci->i_ceph_lock); - pr_err("ceph_handle_caps: unknown cap op %d %s\n", op, - ceph_cap_op_name(op)); + pr_err_client(cl, "unknown cap op %d %s\n", op, + ceph_cap_op_name(op)); }
done: @@ -4496,7 +4593,7 @@ void ceph_handle_caps(struct ceph_mds_session *session, goto done;
bad: - pr_err("ceph_handle_caps: corrupt message\n"); + pr_err_client(cl, "corrupt message\n"); ceph_msg_dump(msg); goto out; } @@ -4510,6 +4607,7 @@ void ceph_handle_caps(struct ceph_mds_session *session, */ unsigned long ceph_check_delayed_caps(struct ceph_mds_client *mdsc) { + struct ceph_client *cl = mdsc->fsc->client; struct inode *inode; struct ceph_inode_info *ci; struct ceph_mount_options *opt = mdsc->fsc->mount_options; @@ -4517,14 +4615,14 @@ unsigned long ceph_check_delayed_caps(struct ceph_mds_client *mdsc) unsigned long loop_start = jiffies; unsigned long delay = 0;
- dout("check_delayed_caps\n"); + doutc(cl, "begin\n"); spin_lock(&mdsc->cap_delay_lock); while (!list_empty(&mdsc->cap_delay_list)) { ci = list_first_entry(&mdsc->cap_delay_list, struct ceph_inode_info, i_cap_delay_list); if (time_before(loop_start, ci->i_hold_caps_max - delay_max)) { - dout("%s caps added recently. Exiting loop", __func__); + doutc(cl, "caps added recently. Exiting loop"); delay = ci->i_hold_caps_max; break; } @@ -4536,13 +4634,15 @@ unsigned long ceph_check_delayed_caps(struct ceph_mds_client *mdsc) inode = igrab(&ci->netfs.inode); if (inode) { spin_unlock(&mdsc->cap_delay_lock); - dout("check_delayed_caps on %p\n", inode); + doutc(cl, "on %p %llx.%llx\n", inode, + ceph_vinop(inode)); ceph_check_caps(ci, 0); iput(inode); spin_lock(&mdsc->cap_delay_lock); } } spin_unlock(&mdsc->cap_delay_lock); + doutc(cl, "done\n");
return delay; } @@ -4553,17 +4653,18 @@ unsigned long ceph_check_delayed_caps(struct ceph_mds_client *mdsc) static void flush_dirty_session_caps(struct ceph_mds_session *s) { struct ceph_mds_client *mdsc = s->s_mdsc; + struct ceph_client *cl = mdsc->fsc->client; struct ceph_inode_info *ci; struct inode *inode;
- dout("flush_dirty_caps\n"); + doutc(cl, "begin\n"); spin_lock(&mdsc->cap_dirty_lock); while (!list_empty(&s->s_cap_dirty)) { ci = list_first_entry(&s->s_cap_dirty, struct ceph_inode_info, i_dirty_item); inode = &ci->netfs.inode; ihold(inode); - dout("flush_dirty_caps %llx.%llx\n", ceph_vinop(inode)); + doutc(cl, "%p %llx.%llx\n", inode, ceph_vinop(inode)); spin_unlock(&mdsc->cap_dirty_lock); ceph_wait_on_async_create(inode); ceph_check_caps(ci, CHECK_CAPS_FLUSH); @@ -4571,7 +4672,7 @@ static void flush_dirty_session_caps(struct ceph_mds_session *s) spin_lock(&mdsc->cap_dirty_lock); } spin_unlock(&mdsc->cap_dirty_lock); - dout("flush_dirty_caps done\n"); + doutc(cl, "done\n"); }
void ceph_flush_dirty_caps(struct ceph_mds_client *mdsc) @@ -4696,6 +4797,7 @@ int ceph_encode_inode_release(void **p, struct inode *inode, int mds, int drop, int unless, int force) { struct ceph_inode_info *ci = ceph_inode(inode); + struct ceph_client *cl = ceph_inode_to_client(inode); struct ceph_cap *cap; struct ceph_mds_request_release *rel = *p; int used, dirty; @@ -4705,9 +4807,9 @@ int ceph_encode_inode_release(void **p, struct inode *inode, used = __ceph_caps_used(ci); dirty = __ceph_caps_dirty(ci);
- dout("encode_inode_release %p mds%d used|dirty %s drop %s unless %s\n", - inode, mds, ceph_cap_string(used|dirty), ceph_cap_string(drop), - ceph_cap_string(unless)); + doutc(cl, "%p %llx.%llx mds%d used|dirty %s drop %s unless %s\n", + inode, ceph_vinop(inode), mds, ceph_cap_string(used|dirty), + ceph_cap_string(drop), ceph_cap_string(unless));
/* only drop unused, clean caps */ drop &= ~(used | dirty); @@ -4729,12 +4831,13 @@ int ceph_encode_inode_release(void **p, struct inode *inode, if (force || (cap->issued & drop)) { if (cap->issued & drop) { int wanted = __ceph_caps_wanted(ci); - dout("encode_inode_release %p cap %p " - "%s -> %s, wanted %s -> %s\n", inode, cap, - ceph_cap_string(cap->issued), - ceph_cap_string(cap->issued & ~drop), - ceph_cap_string(cap->mds_wanted), - ceph_cap_string(wanted)); + doutc(cl, "%p %llx.%llx cap %p %s -> %s, " + "wanted %s -> %s\n", inode, + ceph_vinop(inode), cap, + ceph_cap_string(cap->issued), + ceph_cap_string(cap->issued & ~drop), + ceph_cap_string(cap->mds_wanted), + ceph_cap_string(wanted));
cap->issued &= ~drop; cap->implemented &= ~drop; @@ -4743,9 +4846,9 @@ int ceph_encode_inode_release(void **p, struct inode *inode, !(wanted & CEPH_CAP_ANY_FILE_WR)) ci->i_requested_max_size = 0; } else { - dout("encode_inode_release %p cap %p %s" - " (force)\n", inode, cap, - ceph_cap_string(cap->issued)); + doutc(cl, "%p %llx.%llx cap %p %s (force)\n", + inode, ceph_vinop(inode), cap, + ceph_cap_string(cap->issued)); }
rel->ino = cpu_to_le64(ceph_ino(inode)); @@ -4760,8 +4863,9 @@ int ceph_encode_inode_release(void **p, struct inode *inode, *p += sizeof(*rel); ret = 1; } else { - dout("encode_inode_release %p cap %p %s (noop)\n", - inode, cap, ceph_cap_string(cap->issued)); + doutc(cl, "%p %llx.%llx cap %p %s (noop)\n", + inode, ceph_vinop(inode), cap, + ceph_cap_string(cap->issued)); } } spin_unlock(&ci->i_ceph_lock); @@ -4786,6 +4890,7 @@ int ceph_encode_dentry_release(void **p, struct dentry *dentry, { struct ceph_mds_request_release *rel = *p; struct ceph_dentry_info *di = ceph_dentry(dentry); + struct ceph_client *cl; int force = 0; int ret;
@@ -4805,10 +4910,11 @@ int ceph_encode_dentry_release(void **p, struct dentry *dentry,
ret = ceph_encode_inode_release(p, dir, mds, drop, unless, force);
+ cl = ceph_inode_to_client(dir); spin_lock(&dentry->d_lock); if (ret && di->lease_session && di->lease_session->s_mds == mds) { - dout("encode_dentry_release %p mds%d seq %d\n", - dentry, mds, (int)di->lease_seq); + doutc(cl, "%p mds%d seq %d\n", dentry, mds, + (int)di->lease_seq); rel->dname_seq = cpu_to_le32(di->lease_seq); __ceph_mdsc_drop_dentry_lease(dentry); spin_unlock(&dentry->d_lock); @@ -4834,12 +4940,14 @@ int ceph_encode_dentry_release(void **p, struct dentry *dentry, static int remove_capsnaps(struct ceph_mds_client *mdsc, struct inode *inode) { struct ceph_inode_info *ci = ceph_inode(inode); + struct ceph_client *cl = mdsc->fsc->client; struct ceph_cap_snap *capsnap; int capsnap_release = 0;
lockdep_assert_held(&ci->i_ceph_lock);
- dout("removing capsnaps, ci is %p, inode is %p\n", ci, inode); + doutc(cl, "removing capsnaps, ci is %p, %p %llx.%llx\n", + ci, inode, ceph_vinop(inode));
while (!list_empty(&ci->i_cap_snaps)) { capsnap = list_first_entry(&ci->i_cap_snaps, @@ -4858,6 +4966,7 @@ int ceph_purge_inode_cap(struct inode *inode, struct ceph_cap *cap, bool *invali { struct ceph_fs_client *fsc = ceph_inode_to_fs_client(inode); struct ceph_mds_client *mdsc = fsc->mdsc; + struct ceph_client *cl = fsc->client; struct ceph_inode_info *ci = ceph_inode(inode); bool is_auth; bool dirty_dropped = false; @@ -4865,8 +4974,8 @@ int ceph_purge_inode_cap(struct inode *inode, struct ceph_cap *cap, bool *invali
lockdep_assert_held(&ci->i_ceph_lock);
- dout("removing cap %p, ci is %p, inode is %p\n", - cap, ci, &ci->netfs.inode); + doutc(cl, "removing cap %p, ci is %p, %p %llx.%llx\n", + cap, ci, inode, ceph_vinop(inode));
is_auth = (cap == ci->i_auth_cap); __ceph_remove_cap(cap, false); @@ -4893,19 +5002,19 @@ int ceph_purge_inode_cap(struct inode *inode, struct ceph_cap *cap, bool *invali }
if (!list_empty(&ci->i_dirty_item)) { - pr_warn_ratelimited( - " dropping dirty %s state for %p %lld\n", + pr_warn_ratelimited_client(cl, + " dropping dirty %s state for %p %llx.%llx\n", ceph_cap_string(ci->i_dirty_caps), - inode, ceph_ino(inode)); + inode, ceph_vinop(inode)); ci->i_dirty_caps = 0; list_del_init(&ci->i_dirty_item); dirty_dropped = true; } if (!list_empty(&ci->i_flushing_item)) { - pr_warn_ratelimited( - " dropping dirty+flushing %s state for %p %lld\n", + pr_warn_ratelimited_client(cl, + " dropping dirty+flushing %s state for %p %llx.%llx\n", ceph_cap_string(ci->i_flushing_caps), - inode, ceph_ino(inode)); + inode, ceph_vinop(inode)); ci->i_flushing_caps = 0; list_del_init(&ci->i_flushing_item); mdsc->num_cap_flushing--; @@ -4928,8 +5037,9 @@ int ceph_purge_inode_cap(struct inode *inode, struct ceph_cap *cap, bool *invali if (atomic_read(&ci->i_filelock_ref) > 0) { /* make further file lock syscall return -EIO */ ci->i_ceph_flags |= CEPH_I_ERROR_FILELOCK; - pr_warn_ratelimited(" dropping file locks for %p %lld\n", - inode, ceph_ino(inode)); + pr_warn_ratelimited_client(cl, + " dropping file locks for %p %llx.%llx\n", + inode, ceph_vinop(inode)); }
if (!ci->i_dirty_caps && ci->i_prealloc_cap_flush) { diff --git a/fs/ceph/crypto.c b/fs/ceph/crypto.c index 08c385610731..d692ebfddedb 100644 --- a/fs/ceph/crypto.c +++ b/fs/ceph/crypto.c @@ -211,6 +211,7 @@ void ceph_fscrypt_as_ctx_to_req(struct ceph_mds_request *req, static struct inode *parse_longname(const struct inode *parent, const char *name, int *name_len) { + struct ceph_client *cl = ceph_inode_to_client(parent); struct inode *dir = NULL; struct ceph_vino vino = { .snap = CEPH_NOSNAP }; char *inode_number; @@ -222,12 +223,12 @@ static struct inode *parse_longname(const struct inode *parent, name++; name_end = strrchr(name, '_'); if (!name_end) { - dout("Failed to parse long snapshot name: %s\n", name); + doutc(cl, "failed to parse long snapshot name: %s\n", name); return ERR_PTR(-EIO); } *name_len = (name_end - name); if (*name_len <= 0) { - pr_err("Failed to parse long snapshot name\n"); + pr_err_client(cl, "failed to parse long snapshot name\n"); return ERR_PTR(-EIO); }
@@ -239,7 +240,7 @@ static struct inode *parse_longname(const struct inode *parent, return ERR_PTR(-ENOMEM); ret = kstrtou64(inode_number, 10, &vino.ino); if (ret) { - dout("Failed to parse inode number: %s\n", name); + doutc(cl, "failed to parse inode number: %s\n", name); dir = ERR_PTR(ret); goto out; } @@ -250,7 +251,7 @@ static struct inode *parse_longname(const struct inode *parent, /* This can happen if we're not mounting cephfs on the root */ dir = ceph_get_inode(parent->i_sb, vino, NULL); if (IS_ERR(dir)) - dout("Can't find inode %s (%s)\n", inode_number, name); + doutc(cl, "can't find inode %s (%s)\n", inode_number, name); }
out: @@ -261,6 +262,7 @@ static struct inode *parse_longname(const struct inode *parent, int ceph_encode_encrypted_dname(struct inode *parent, struct qstr *d_name, char *buf) { + struct ceph_client *cl = ceph_inode_to_client(parent); struct inode *dir = parent; struct qstr iname; u32 len; @@ -329,7 +331,7 @@ int ceph_encode_encrypted_dname(struct inode *parent, struct qstr *d_name,
/* base64 encode the encrypted name */ elen = ceph_base64_encode(cryptbuf, len, buf); - dout("base64-encoded ciphertext name = %.*s\n", elen, buf); + doutc(cl, "base64-encoded ciphertext name = %.*s\n", elen, buf);
/* To understand the 240 limit, see CEPH_NOHASH_NAME_MAX comments */ WARN_ON(elen > 240); @@ -504,7 +506,10 @@ int ceph_fscrypt_decrypt_block_inplace(const struct inode *inode, struct page *page, unsigned int len, unsigned int offs, u64 lblk_num) { - dout("%s: len %u offs %u blk %llu\n", __func__, len, offs, lblk_num); + struct ceph_client *cl = ceph_inode_to_client(inode); + + doutc(cl, "%p %llx.%llx len %u offs %u blk %llu\n", inode, + ceph_vinop(inode), len, offs, lblk_num); return fscrypt_decrypt_block_inplace(inode, page, len, offs, lblk_num); }
@@ -513,7 +518,10 @@ int ceph_fscrypt_encrypt_block_inplace(const struct inode *inode, unsigned int offs, u64 lblk_num, gfp_t gfp_flags) { - dout("%s: len %u offs %u blk %llu\n", __func__, len, offs, lblk_num); + struct ceph_client *cl = ceph_inode_to_client(inode); + + doutc(cl, "%p %llx.%llx len %u offs %u blk %llu\n", inode, + ceph_vinop(inode), len, offs, lblk_num); return fscrypt_encrypt_block_inplace(inode, page, len, offs, lblk_num, gfp_flags); } @@ -582,6 +590,7 @@ int ceph_fscrypt_decrypt_extents(struct inode *inode, struct page **page, u64 off, struct ceph_sparse_extent *map, u32 ext_cnt) { + struct ceph_client *cl = ceph_inode_to_client(inode); int i, ret = 0; struct ceph_inode_info *ci = ceph_inode(inode); u64 objno, objoff; @@ -589,7 +598,8 @@ int ceph_fscrypt_decrypt_extents(struct inode *inode, struct page **page,
/* Nothing to do for empty array */ if (ext_cnt == 0) { - dout("%s: empty array, ret 0\n", __func__); + doutc(cl, "%p %llx.%llx empty array, ret 0\n", inode, + ceph_vinop(inode)); return 0; }
@@ -603,14 +613,17 @@ int ceph_fscrypt_decrypt_extents(struct inode *inode, struct page **page, int fret;
if ((ext->off | ext->len) & ~CEPH_FSCRYPT_BLOCK_MASK) { - pr_warn("%s: bad encrypted sparse extent idx %d off %llx len %llx\n", - __func__, i, ext->off, ext->len); + pr_warn_client(cl, + "%p %llx.%llx bad encrypted sparse extent " + "idx %d off %llx len %llx\n", + inode, ceph_vinop(inode), i, ext->off, + ext->len); return -EIO; } fret = ceph_fscrypt_decrypt_pages(inode, &page[pgidx], off + pgsoff, ext->len); - dout("%s: [%d] 0x%llx~0x%llx fret %d\n", __func__, i, - ext->off, ext->len, fret); + doutc(cl, "%p %llx.%llx [%d] 0x%llx~0x%llx fret %d\n", inode, + ceph_vinop(inode), i, ext->off, ext->len, fret); if (fret < 0) { if (ret == 0) ret = fret; @@ -618,7 +631,7 @@ int ceph_fscrypt_decrypt_extents(struct inode *inode, struct page **page, } ret = pgsoff + fret; } - dout("%s: ret %d\n", __func__, ret); + doutc(cl, "ret %d\n", ret); return ret; }
diff --git a/fs/ceph/debugfs.c b/fs/ceph/debugfs.c index 2f1e7498cd74..24c08078f5aa 100644 --- a/fs/ceph/debugfs.c +++ b/fs/ceph/debugfs.c @@ -398,7 +398,7 @@ DEFINE_SIMPLE_ATTRIBUTE(congestion_kb_fops, congestion_kb_get,
void ceph_fs_debugfs_cleanup(struct ceph_fs_client *fsc) { - dout("ceph_fs_debugfs_cleanup\n"); + doutc(fsc->client, "begin\n"); debugfs_remove(fsc->debugfs_bdi); debugfs_remove(fsc->debugfs_congestion_kb); debugfs_remove(fsc->debugfs_mdsmap); @@ -407,13 +407,14 @@ void ceph_fs_debugfs_cleanup(struct ceph_fs_client *fsc) debugfs_remove(fsc->debugfs_status); debugfs_remove(fsc->debugfs_mdsc); debugfs_remove_recursive(fsc->debugfs_metrics_dir); + doutc(fsc->client, "done\n"); }
void ceph_fs_debugfs_init(struct ceph_fs_client *fsc) { char name[100];
- dout("ceph_fs_debugfs_init\n"); + doutc(fsc->client, "begin\n"); fsc->debugfs_congestion_kb = debugfs_create_file("writeback_congestion_kb", 0600, @@ -469,6 +470,7 @@ void ceph_fs_debugfs_init(struct ceph_fs_client *fsc) &metrics_size_fops); debugfs_create_file("caps", 0400, fsc->debugfs_metrics_dir, fsc, &metrics_caps_fops); + doutc(fsc->client, "done\n"); }
diff --git a/fs/ceph/dir.c b/fs/ceph/dir.c index 1395b71df5cc..6bb95b4e93a8 100644 --- a/fs/ceph/dir.c +++ b/fs/ceph/dir.c @@ -109,7 +109,9 @@ static int fpos_cmp(loff_t l, loff_t r) * regardless of what dir changes take place on the * server. */ -static int note_last_dentry(struct ceph_dir_file_info *dfi, const char *name, +static int note_last_dentry(struct ceph_fs_client *fsc, + struct ceph_dir_file_info *dfi, + const char *name, int len, unsigned next_offset) { char *buf = kmalloc(len+1, GFP_KERNEL); @@ -120,7 +122,7 @@ static int note_last_dentry(struct ceph_dir_file_info *dfi, const char *name, memcpy(dfi->last_name, name, len); dfi->last_name[len] = 0; dfi->next_offset = next_offset; - dout("note_last_dentry '%s'\n", dfi->last_name); + doutc(fsc->client, "'%s'\n", dfi->last_name); return 0; }
@@ -130,6 +132,7 @@ __dcache_find_get_entry(struct dentry *parent, u64 idx, struct ceph_readdir_cache_control *cache_ctl) { struct inode *dir = d_inode(parent); + struct ceph_client *cl = ceph_inode_to_client(dir); struct dentry *dentry; unsigned idx_mask = (PAGE_SIZE / sizeof(struct dentry *)) - 1; loff_t ptr_pos = idx * sizeof(struct dentry *); @@ -142,7 +145,7 @@ __dcache_find_get_entry(struct dentry *parent, u64 idx, ceph_readdir_cache_release(cache_ctl); cache_ctl->page = find_lock_page(&dir->i_data, ptr_pgoff); if (!cache_ctl->page) { - dout(" page %lu not found\n", ptr_pgoff); + doutc(cl, " page %lu not found\n", ptr_pgoff); return ERR_PTR(-EAGAIN); } /* reading/filling the cache are serialized by @@ -185,13 +188,16 @@ static int __dcache_readdir(struct file *file, struct dir_context *ctx, struct ceph_dir_file_info *dfi = file->private_data; struct dentry *parent = file->f_path.dentry; struct inode *dir = d_inode(parent); + struct ceph_fs_client *fsc = ceph_inode_to_fs_client(dir); + struct ceph_client *cl = ceph_inode_to_client(dir); struct dentry *dentry, *last = NULL; struct ceph_dentry_info *di; struct ceph_readdir_cache_control cache_ctl = {}; u64 idx = 0; int err = 0;
- dout("__dcache_readdir %p v%u at %llx\n", dir, (unsigned)shared_gen, ctx->pos); + doutc(cl, "%p %llx.%llx v%u at %llx\n", dir, ceph_vinop(dir), + (unsigned)shared_gen, ctx->pos);
/* search start position */ if (ctx->pos > 2) { @@ -221,7 +227,8 @@ static int __dcache_readdir(struct file *file, struct dir_context *ctx, dput(dentry); }
- dout("__dcache_readdir %p cache idx %llu\n", dir, idx); + doutc(cl, "%p %llx.%llx cache idx %llu\n", dir, + ceph_vinop(dir), idx); }
@@ -257,8 +264,8 @@ static int __dcache_readdir(struct file *file, struct dir_context *ctx, spin_unlock(&dentry->d_lock);
if (emit_dentry) { - dout(" %llx dentry %p %pd %p\n", di->offset, - dentry, dentry, d_inode(dentry)); + doutc(cl, " %llx dentry %p %pd %p\n", di->offset, + dentry, dentry, d_inode(dentry)); ctx->pos = di->offset; if (!dir_emit(ctx, dentry->d_name.name, dentry->d_name.len, ceph_present_inode(d_inode(dentry)), @@ -281,7 +288,8 @@ static int __dcache_readdir(struct file *file, struct dir_context *ctx, if (last) { int ret; di = ceph_dentry(last); - ret = note_last_dentry(dfi, last->d_name.name, last->d_name.len, + ret = note_last_dentry(fsc, dfi, last->d_name.name, + last->d_name.len, fpos_off(di->offset) + 1); if (ret < 0) err = ret; @@ -312,18 +320,21 @@ static int ceph_readdir(struct file *file, struct dir_context *ctx) struct ceph_inode_info *ci = ceph_inode(inode); struct ceph_fs_client *fsc = ceph_inode_to_fs_client(inode); struct ceph_mds_client *mdsc = fsc->mdsc; + struct ceph_client *cl = fsc->client; int i; int err; unsigned frag = -1; struct ceph_mds_reply_info_parsed *rinfo;
- dout("readdir %p file %p pos %llx\n", inode, file, ctx->pos); + doutc(cl, "%p %llx.%llx file %p pos %llx\n", inode, + ceph_vinop(inode), file, ctx->pos); if (dfi->file_info.flags & CEPH_F_ATEND) return 0;
/* always start with . and .. */ if (ctx->pos == 0) { - dout("readdir off 0 -> '.'\n"); + doutc(cl, "%p %llx.%llx off 0 -> '.'\n", inode, + ceph_vinop(inode)); if (!dir_emit(ctx, ".", 1, ceph_present_inode(inode), inode->i_mode >> 12)) return 0; @@ -337,7 +348,8 @@ static int ceph_readdir(struct file *file, struct dir_context *ctx) ino = ceph_present_inode(dentry->d_parent->d_inode); spin_unlock(&dentry->d_lock);
- dout("readdir off 1 -> '..'\n"); + doutc(cl, "%p %llx.%llx off 1 -> '..'\n", inode, + ceph_vinop(inode)); if (!dir_emit(ctx, "..", 2, ino, inode->i_mode >> 12)) return 0; ctx->pos = 2; @@ -391,8 +403,8 @@ static int ceph_readdir(struct file *file, struct dir_context *ctx) frag = fpos_frag(ctx->pos); }
- dout("readdir fetching %llx.%llx frag %x offset '%s'\n", - ceph_vinop(inode), frag, dfi->last_name); + doutc(cl, "fetching %p %llx.%llx frag %x offset '%s'\n", + inode, ceph_vinop(inode), frag, dfi->last_name); req = ceph_mdsc_create_request(mdsc, op, USE_AUTH_MDS); if (IS_ERR(req)) return PTR_ERR(req); @@ -446,12 +458,12 @@ static int ceph_readdir(struct file *file, struct dir_context *ctx) ceph_mdsc_put_request(req); return err; } - dout("readdir got and parsed readdir result=%d on " - "frag %x, end=%d, complete=%d, hash_order=%d\n", - err, frag, - (int)req->r_reply_info.dir_end, - (int)req->r_reply_info.dir_complete, - (int)req->r_reply_info.hash_order); + doutc(cl, "%p %llx.%llx got and parsed readdir result=%d" + "on frag %x, end=%d, complete=%d, hash_order=%d\n", + inode, ceph_vinop(inode), err, frag, + (int)req->r_reply_info.dir_end, + (int)req->r_reply_info.dir_complete, + (int)req->r_reply_info.hash_order);
rinfo = &req->r_reply_info; if (le32_to_cpu(rinfo->dir_dir->frag) != frag) { @@ -481,7 +493,8 @@ static int ceph_readdir(struct file *file, struct dir_context *ctx) dfi->dir_ordered_count = req->r_dir_ordered_cnt; } } else { - dout("readdir !did_prepopulate\n"); + doutc(cl, "%p %llx.%llx !did_prepopulate\n", inode, + ceph_vinop(inode)); /* disable readdir cache */ dfi->readdir_cache_idx = -1; /* preclude from marking dir complete */ @@ -494,8 +507,8 @@ static int ceph_readdir(struct file *file, struct dir_context *ctx) rinfo->dir_entries + (rinfo->dir_nr-1); unsigned next_offset = req->r_reply_info.dir_end ? 2 : (fpos_off(rde->offset) + 1); - err = note_last_dentry(dfi, rde->name, rde->name_len, - next_offset); + err = note_last_dentry(fsc, dfi, rde->name, + rde->name_len, next_offset); if (err) { ceph_mdsc_put_request(dfi->last_readdir); dfi->last_readdir = NULL; @@ -508,9 +521,9 @@ static int ceph_readdir(struct file *file, struct dir_context *ctx) }
rinfo = &dfi->last_readdir->r_reply_info; - dout("readdir frag %x num %d pos %llx chunk first %llx\n", - dfi->frag, rinfo->dir_nr, ctx->pos, - rinfo->dir_nr ? rinfo->dir_entries[0].offset : 0LL); + doutc(cl, "%p %llx.%llx frag %x num %d pos %llx chunk first %llx\n", + inode, ceph_vinop(inode), dfi->frag, rinfo->dir_nr, ctx->pos, + rinfo->dir_nr ? rinfo->dir_entries[0].offset : 0LL);
i = 0; /* search start position */ @@ -530,8 +543,9 @@ static int ceph_readdir(struct file *file, struct dir_context *ctx) struct ceph_mds_reply_dir_entry *rde = rinfo->dir_entries + i;
if (rde->offset < ctx->pos) { - pr_warn("%s: rde->offset 0x%llx ctx->pos 0x%llx\n", - __func__, rde->offset, ctx->pos); + pr_warn_client(cl, + "%p %llx.%llx rde->offset 0x%llx ctx->pos 0x%llx\n", + inode, ceph_vinop(inode), rde->offset, ctx->pos); return -EIO; }
@@ -539,9 +553,9 @@ static int ceph_readdir(struct file *file, struct dir_context *ctx) return -EIO;
ctx->pos = rde->offset; - dout("readdir (%d/%d) -> %llx '%.*s' %p\n", - i, rinfo->dir_nr, ctx->pos, - rde->name_len, rde->name, &rde->inode.in); + doutc(cl, "%p %llx.%llx (%d/%d) -> %llx '%.*s' %p\n", inode, + ceph_vinop(inode), i, rinfo->dir_nr, ctx->pos, + rde->name_len, rde->name, &rde->inode.in);
if (!dir_emit(ctx, rde->name, rde->name_len, ceph_present_ino(inode->i_sb, le64_to_cpu(rde->inode.in->ino)), @@ -552,7 +566,7 @@ static int ceph_readdir(struct file *file, struct dir_context *ctx) * doesn't have enough memory, etc. So for next readdir * it will continue. */ - dout("filldir stopping us...\n"); + doutc(cl, "filldir stopping us...\n"); return 0; }
@@ -583,7 +597,8 @@ static int ceph_readdir(struct file *file, struct dir_context *ctx) kfree(dfi->last_name); dfi->last_name = NULL; } - dout("readdir next frag is %x\n", frag); + doutc(cl, "%p %llx.%llx next frag is %x\n", inode, + ceph_vinop(inode), frag); goto more; } dfi->file_info.flags |= CEPH_F_ATEND; @@ -598,20 +613,23 @@ static int ceph_readdir(struct file *file, struct dir_context *ctx) spin_lock(&ci->i_ceph_lock); if (dfi->dir_ordered_count == atomic64_read(&ci->i_ordered_count)) { - dout(" marking %p complete and ordered\n", inode); + doutc(cl, " marking %p %llx.%llx complete and ordered\n", + inode, ceph_vinop(inode)); /* use i_size to track number of entries in * readdir cache */ BUG_ON(dfi->readdir_cache_idx < 0); i_size_write(inode, dfi->readdir_cache_idx * sizeof(struct dentry*)); } else { - dout(" marking %p complete\n", inode); + doutc(cl, " marking %llx.%llx complete\n", + ceph_vinop(inode)); } __ceph_dir_set_complete(ci, dfi->dir_release_count, dfi->dir_ordered_count); spin_unlock(&ci->i_ceph_lock); } - dout("readdir %p file %p done.\n", inode, file); + doutc(cl, "%p %llx.%llx file %p done.\n", inode, ceph_vinop(inode), + file); return 0; }
@@ -657,6 +675,7 @@ static loff_t ceph_dir_llseek(struct file *file, loff_t offset, int whence) { struct ceph_dir_file_info *dfi = file->private_data; struct inode *inode = file->f_mapping->host; + struct ceph_client *cl = ceph_inode_to_client(inode); loff_t retval;
inode_lock(inode); @@ -676,7 +695,8 @@ static loff_t ceph_dir_llseek(struct file *file, loff_t offset, int whence)
if (offset >= 0) { if (need_reset_readdir(dfi, offset)) { - dout("dir_llseek dropping %p content\n", file); + doutc(cl, "%p %llx.%llx dropping %p content\n", + inode, ceph_vinop(inode), file); reset_readdir(dfi); } else if (is_hash_order(offset) && offset > file->f_pos) { /* for hash offset, we don't know if a forward seek @@ -705,6 +725,7 @@ struct dentry *ceph_handle_snapdir(struct ceph_mds_request *req, { struct ceph_fs_client *fsc = ceph_sb_to_fs_client(dentry->d_sb); struct inode *parent = d_inode(dentry->d_parent); /* we hold i_rwsem */ + struct ceph_client *cl = ceph_inode_to_client(parent);
/* .snap dir? */ if (ceph_snap(parent) == CEPH_NOSNAP && @@ -713,8 +734,9 @@ struct dentry *ceph_handle_snapdir(struct ceph_mds_request *req, struct inode *inode = ceph_get_snapdir(parent);
res = d_splice_alias(inode, dentry); - dout("ENOENT on snapdir %p '%pd', linking to snapdir %p. Spliced dentry %p\n", - dentry, dentry, inode, res); + doutc(cl, "ENOENT on snapdir %p '%pd', linking to " + "snapdir %p %llx.%llx. Spliced dentry %p\n", + dentry, dentry, inode, ceph_vinop(inode), res); if (res) dentry = res; } @@ -735,12 +757,15 @@ struct dentry *ceph_handle_snapdir(struct ceph_mds_request *req, struct dentry *ceph_finish_lookup(struct ceph_mds_request *req, struct dentry *dentry, int err) { + struct ceph_client *cl = req->r_mdsc->fsc->client; + if (err == -ENOENT) { /* no trace? */ err = 0; if (!req->r_reply_info.head->is_dentry) { - dout("ENOENT and no trace, dentry %p inode %p\n", - dentry, d_inode(dentry)); + doutc(cl, + "ENOENT and no trace, dentry %p inode %llx.%llx\n", + dentry, ceph_vinop(d_inode(dentry))); if (d_really_is_positive(dentry)) { d_drop(dentry); err = -ENOENT; @@ -773,13 +798,14 @@ static struct dentry *ceph_lookup(struct inode *dir, struct dentry *dentry, { struct ceph_fs_client *fsc = ceph_sb_to_fs_client(dir->i_sb); struct ceph_mds_client *mdsc = ceph_sb_to_mdsc(dir->i_sb); + struct ceph_client *cl = fsc->client; struct ceph_mds_request *req; int op; int mask; int err;
- dout("lookup %p dentry %p '%pd'\n", - dir, dentry, dentry); + doutc(cl, "%p %llx.%llx/'%pd' dentry %p\n", dir, ceph_vinop(dir), + dentry, dentry);
if (dentry->d_name.len > NAME_MAX) return ERR_PTR(-ENAMETOOLONG); @@ -802,7 +828,8 @@ static struct dentry *ceph_lookup(struct inode *dir, struct dentry *dentry, struct ceph_dentry_info *di = ceph_dentry(dentry);
spin_lock(&ci->i_ceph_lock); - dout(" dir %p flags are 0x%lx\n", dir, ci->i_ceph_flags); + doutc(cl, " dir %llx.%llx flags are 0x%lx\n", + ceph_vinop(dir), ci->i_ceph_flags); if (strncmp(dentry->d_name.name, fsc->mount_options->snapdir_name, dentry->d_name.len) && @@ -812,7 +839,8 @@ static struct dentry *ceph_lookup(struct inode *dir, struct dentry *dentry, __ceph_caps_issued_mask_metric(ci, CEPH_CAP_FILE_SHARED, 1)) { __ceph_touch_fmode(ci, mdsc, CEPH_FILE_MODE_RD); spin_unlock(&ci->i_ceph_lock); - dout(" dir %p complete, -ENOENT\n", dir); + doutc(cl, " dir %llx.%llx complete, -ENOENT\n", + ceph_vinop(dir)); d_add(dentry, NULL); di->lease_shared_gen = atomic_read(&ci->i_shared_gen); return NULL; @@ -850,7 +878,7 @@ static struct dentry *ceph_lookup(struct inode *dir, struct dentry *dentry, } dentry = ceph_finish_lookup(req, dentry, err); ceph_mdsc_put_request(req); /* will dput(dentry) */ - dout("lookup result=%p\n", dentry); + doutc(cl, "result=%p\n", dentry); return dentry; }
@@ -885,6 +913,7 @@ static int ceph_mknod(struct mnt_idmap *idmap, struct inode *dir, struct dentry *dentry, umode_t mode, dev_t rdev) { struct ceph_mds_client *mdsc = ceph_sb_to_mdsc(dir->i_sb); + struct ceph_client *cl = mdsc->fsc->client; struct ceph_mds_request *req; struct ceph_acl_sec_ctx as_ctx = {}; int err; @@ -901,8 +930,8 @@ static int ceph_mknod(struct mnt_idmap *idmap, struct inode *dir, goto out; }
- dout("mknod in dir %p dentry %p mode 0%ho rdev %d\n", - dir, dentry, mode, rdev); + doutc(cl, "%p %llx.%llx/'%pd' dentry %p mode 0%ho rdev %d\n", + dir, ceph_vinop(dir), dentry, dentry, mode, rdev); req = ceph_mdsc_create_request(mdsc, CEPH_MDS_OP_MKNOD, USE_AUTH_MDS); if (IS_ERR(req)) { err = PTR_ERR(req); @@ -993,6 +1022,7 @@ static int ceph_symlink(struct mnt_idmap *idmap, struct inode *dir, struct dentry *dentry, const char *dest) { struct ceph_mds_client *mdsc = ceph_sb_to_mdsc(dir->i_sb); + struct ceph_client *cl = mdsc->fsc->client; struct ceph_mds_request *req; struct ceph_acl_sec_ctx as_ctx = {}; umode_t mode = S_IFLNK | 0777; @@ -1010,7 +1040,8 @@ static int ceph_symlink(struct mnt_idmap *idmap, struct inode *dir, goto out; }
- dout("symlink in dir %p dentry %p to '%s'\n", dir, dentry, dest); + doutc(cl, "%p %llx.%llx/'%pd' to '%s'\n", dir, ceph_vinop(dir), dentry, + dest); req = ceph_mdsc_create_request(mdsc, CEPH_MDS_OP_SYMLINK, USE_AUTH_MDS); if (IS_ERR(req)) { err = PTR_ERR(req); @@ -1064,6 +1095,7 @@ static int ceph_mkdir(struct mnt_idmap *idmap, struct inode *dir, struct dentry *dentry, umode_t mode) { struct ceph_mds_client *mdsc = ceph_sb_to_mdsc(dir->i_sb); + struct ceph_client *cl = mdsc->fsc->client; struct ceph_mds_request *req; struct ceph_acl_sec_ctx as_ctx = {}; int err; @@ -1076,10 +1108,11 @@ static int ceph_mkdir(struct mnt_idmap *idmap, struct inode *dir, if (ceph_snap(dir) == CEPH_SNAPDIR) { /* mkdir .snap/foo is a MKSNAP */ op = CEPH_MDS_OP_MKSNAP; - dout("mksnap dir %p snap '%pd' dn %p\n", dir, - dentry, dentry); + doutc(cl, "mksnap %llx.%llx/'%pd' dentry %p\n", + ceph_vinop(dir), dentry, dentry); } else if (ceph_snap(dir) == CEPH_NOSNAP) { - dout("mkdir dir %p dn %p mode 0%ho\n", dir, dentry, mode); + doutc(cl, "mkdir %llx.%llx/'%pd' dentry %p mode 0%ho\n", + ceph_vinop(dir), dentry, dentry, mode); op = CEPH_MDS_OP_MKDIR; } else { err = -EROFS; @@ -1144,6 +1177,7 @@ static int ceph_link(struct dentry *old_dentry, struct inode *dir, struct dentry *dentry) { struct ceph_mds_client *mdsc = ceph_sb_to_mdsc(dir->i_sb); + struct ceph_client *cl = mdsc->fsc->client; struct ceph_mds_request *req; int err;
@@ -1161,8 +1195,8 @@ static int ceph_link(struct dentry *old_dentry, struct inode *dir, if (err) return err;
- dout("link in dir %p %llx.%llx old_dentry %p:'%pd' dentry %p:'%pd'\n", - dir, ceph_vinop(dir), old_dentry, old_dentry, dentry, dentry); + doutc(cl, "%p %llx.%llx/'%pd' to '%pd'\n", dir, ceph_vinop(dir), + old_dentry, dentry); req = ceph_mdsc_create_request(mdsc, CEPH_MDS_OP_LINK, USE_AUTH_MDS); if (IS_ERR(req)) { d_drop(dentry); @@ -1200,13 +1234,15 @@ static void ceph_async_unlink_cb(struct ceph_mds_client *mdsc, { struct dentry *dentry = req->r_dentry; struct ceph_fs_client *fsc = ceph_sb_to_fs_client(dentry->d_sb); + struct ceph_client *cl = fsc->client; struct ceph_dentry_info *di = ceph_dentry(dentry); int result = req->r_err ? req->r_err : le32_to_cpu(req->r_reply_info.head->result);
if (!test_bit(CEPH_DENTRY_ASYNC_UNLINK_BIT, &di->flags)) - pr_warn("%s dentry %p:%pd async unlink bit is not set\n", - __func__, dentry, dentry); + pr_warn_client(cl, + "dentry %p:%pd async unlink bit is not set\n", + dentry, dentry);
spin_lock(&fsc->async_unlink_conflict_lock); hash_del_rcu(&di->hnode); @@ -1240,8 +1276,8 @@ static void ceph_async_unlink_cb(struct ceph_mds_client *mdsc, /* mark inode itself for an error (since metadata is bogus) */ mapping_set_error(req->r_old_inode->i_mapping, result);
- pr_warn("async unlink failure path=(%llx)%s result=%d!\n", - base, IS_ERR(path) ? "<<bad>>" : path, result); + pr_warn_client(cl, "failure path=(%llx)%s result=%d!\n", + base, IS_ERR(path) ? "<<bad>>" : path, result); ceph_mdsc_free_path(path, pathlen); } out: @@ -1291,6 +1327,7 @@ static int get_caps_for_async_unlink(struct inode *dir, struct dentry *dentry) static int ceph_unlink(struct inode *dir, struct dentry *dentry) { struct ceph_fs_client *fsc = ceph_sb_to_fs_client(dir->i_sb); + struct ceph_client *cl = fsc->client; struct ceph_mds_client *mdsc = fsc->mdsc; struct inode *inode = d_inode(dentry); struct ceph_mds_request *req; @@ -1300,11 +1337,12 @@ static int ceph_unlink(struct inode *dir, struct dentry *dentry)
if (ceph_snap(dir) == CEPH_SNAPDIR) { /* rmdir .snap/foo is RMSNAP */ - dout("rmsnap dir %p '%pd' dn %p\n", dir, dentry, dentry); + doutc(cl, "rmsnap %llx.%llx/'%pd' dn\n", ceph_vinop(dir), + dentry); op = CEPH_MDS_OP_RMSNAP; } else if (ceph_snap(dir) == CEPH_NOSNAP) { - dout("unlink/rmdir dir %p dn %p inode %p\n", - dir, dentry, inode); + doutc(cl, "unlink/rmdir %llx.%llx/'%pd' inode %llx.%llx\n", + ceph_vinop(dir), dentry, ceph_vinop(inode)); op = d_is_dir(dentry) ? CEPH_MDS_OP_RMDIR : CEPH_MDS_OP_UNLINK; } else @@ -1327,9 +1365,9 @@ static int ceph_unlink(struct inode *dir, struct dentry *dentry) (req->r_dir_caps = get_caps_for_async_unlink(dir, dentry))) { struct ceph_dentry_info *di = ceph_dentry(dentry);
- dout("async unlink on %llu/%.*s caps=%s", ceph_ino(dir), - dentry->d_name.len, dentry->d_name.name, - ceph_cap_string(req->r_dir_caps)); + doutc(cl, "async unlink on %llx.%llx/'%pd' caps=%s", + ceph_vinop(dir), dentry, + ceph_cap_string(req->r_dir_caps)); set_bit(CEPH_MDS_R_ASYNC, &req->r_req_flags); req->r_callback = ceph_async_unlink_cb; req->r_old_inode = d_inode(dentry); @@ -1384,6 +1422,7 @@ static int ceph_rename(struct mnt_idmap *idmap, struct inode *old_dir, struct dentry *new_dentry, unsigned int flags) { struct ceph_mds_client *mdsc = ceph_sb_to_mdsc(old_dir->i_sb); + struct ceph_client *cl = mdsc->fsc->client; struct ceph_mds_request *req; int op = CEPH_MDS_OP_RENAME; int err; @@ -1413,8 +1452,9 @@ static int ceph_rename(struct mnt_idmap *idmap, struct inode *old_dir, if (err) return err;
- dout("rename dir %p dentry %p to dir %p dentry %p\n", - old_dir, old_dentry, new_dir, new_dentry); + doutc(cl, "%llx.%llx/'%pd' to %llx.%llx/'%pd'\n", + ceph_vinop(old_dir), old_dentry, ceph_vinop(new_dir), + new_dentry); req = ceph_mdsc_create_request(mdsc, op, USE_AUTH_MDS); if (IS_ERR(req)) return PTR_ERR(req); @@ -1459,9 +1499,10 @@ static int ceph_rename(struct mnt_idmap *idmap, struct inode *old_dir, void __ceph_dentry_lease_touch(struct ceph_dentry_info *di) { struct dentry *dn = di->dentry; - struct ceph_mds_client *mdsc; + struct ceph_mds_client *mdsc = ceph_sb_to_fs_client(dn->d_sb)->mdsc; + struct ceph_client *cl = mdsc->fsc->client;
- dout("dentry_lease_touch %p %p '%pd'\n", di, dn, dn); + doutc(cl, "%p %p '%pd'\n", di, dn, dn);
di->flags |= CEPH_DENTRY_LEASE_LIST; if (di->flags & CEPH_DENTRY_SHRINK_LIST) { @@ -1469,7 +1510,6 @@ void __ceph_dentry_lease_touch(struct ceph_dentry_info *di) return; }
- mdsc = ceph_sb_to_fs_client(dn->d_sb)->mdsc; spin_lock(&mdsc->dentry_list_lock); list_move_tail(&di->lease_list, &mdsc->dentry_leases); spin_unlock(&mdsc->dentry_list_lock); @@ -1493,10 +1533,10 @@ static void __dentry_dir_lease_touch(struct ceph_mds_client* mdsc, void __ceph_dentry_dir_lease_touch(struct ceph_dentry_info *di) { struct dentry *dn = di->dentry; - struct ceph_mds_client *mdsc; + struct ceph_mds_client *mdsc = ceph_sb_to_fs_client(dn->d_sb)->mdsc; + struct ceph_client *cl = mdsc->fsc->client;
- dout("dentry_dir_lease_touch %p %p '%pd' (offset 0x%llx)\n", - di, dn, dn, di->offset); + doutc(cl, "%p %p '%pd' (offset 0x%llx)\n", di, dn, dn, di->offset);
if (!list_empty(&di->lease_list)) { if (di->flags & CEPH_DENTRY_LEASE_LIST) { @@ -1516,7 +1556,6 @@ void __ceph_dentry_dir_lease_touch(struct ceph_dentry_info *di) return; }
- mdsc = ceph_sb_to_fs_client(dn->d_sb)->mdsc; spin_lock(&mdsc->dentry_list_lock); __dentry_dir_lease_touch(mdsc, di), spin_unlock(&mdsc->dentry_list_lock); @@ -1757,6 +1796,8 @@ static int dentry_lease_is_valid(struct dentry *dentry, unsigned int flags) { struct ceph_dentry_info *di; struct ceph_mds_session *session = NULL; + struct ceph_mds_client *mdsc = ceph_sb_to_fs_client(dentry->d_sb)->mdsc; + struct ceph_client *cl = mdsc->fsc->client; u32 seq = 0; int valid = 0;
@@ -1789,7 +1830,7 @@ static int dentry_lease_is_valid(struct dentry *dentry, unsigned int flags) CEPH_MDS_LEASE_RENEW, seq); ceph_put_mds_session(session); } - dout("dentry_lease_is_valid - dentry %p = %d\n", dentry, valid); + doutc(cl, "dentry %p = %d\n", dentry, valid); return valid; }
@@ -1832,6 +1873,7 @@ static int dir_lease_is_valid(struct inode *dir, struct dentry *dentry, struct ceph_mds_client *mdsc) { struct ceph_inode_info *ci = ceph_inode(dir); + struct ceph_client *cl = mdsc->fsc->client; int valid; int shared_gen;
@@ -1853,8 +1895,9 @@ static int dir_lease_is_valid(struct inode *dir, struct dentry *dentry, valid = 0; spin_unlock(&dentry->d_lock); } - dout("dir_lease_is_valid dir %p v%u dentry %p = %d\n", - dir, (unsigned)atomic_read(&ci->i_shared_gen), dentry, valid); + doutc(cl, "dir %p %llx.%llx v%u dentry %p '%pd' = %d\n", dir, + ceph_vinop(dir), (unsigned)atomic_read(&ci->i_shared_gen), + dentry, dentry, valid); return valid; }
@@ -1863,10 +1906,11 @@ static int dir_lease_is_valid(struct inode *dir, struct dentry *dentry, */ static int ceph_d_revalidate(struct dentry *dentry, unsigned int flags) { + struct ceph_mds_client *mdsc = ceph_sb_to_fs_client(dentry->d_sb)->mdsc; + struct ceph_client *cl = mdsc->fsc->client; int valid = 0; struct dentry *parent; struct inode *dir, *inode; - struct ceph_mds_client *mdsc;
valid = fscrypt_d_revalidate(dentry, flags); if (valid <= 0) @@ -1884,16 +1928,16 @@ static int ceph_d_revalidate(struct dentry *dentry, unsigned int flags) inode = d_inode(dentry); }
- dout("d_revalidate %p '%pd' inode %p offset 0x%llx nokey %d\n", dentry, - dentry, inode, ceph_dentry(dentry)->offset, - !!(dentry->d_flags & DCACHE_NOKEY_NAME)); + doutc(cl, "%p '%pd' inode %p offset 0x%llx nokey %d\n", + dentry, dentry, inode, ceph_dentry(dentry)->offset, + !!(dentry->d_flags & DCACHE_NOKEY_NAME));
mdsc = ceph_sb_to_fs_client(dir->i_sb)->mdsc;
/* always trust cached snapped dentries, snapdir dentry */ if (ceph_snap(dir) != CEPH_NOSNAP) { - dout("d_revalidate %p '%pd' inode %p is SNAPPED\n", dentry, - dentry, inode); + doutc(cl, "%p '%pd' inode %p is SNAPPED\n", dentry, + dentry, inode); valid = 1; } else if (inode && ceph_snap(inode) == CEPH_SNAPDIR) { valid = 1; @@ -1948,14 +1992,14 @@ static int ceph_d_revalidate(struct dentry *dentry, unsigned int flags) break; } ceph_mdsc_put_request(req); - dout("d_revalidate %p lookup result=%d\n", - dentry, err); + doutc(cl, "%p '%pd', lookup result=%d\n", dentry, + dentry, err); } } else { percpu_counter_inc(&mdsc->metric.d_lease_hit); }
- dout("d_revalidate %p %s\n", dentry, valid ? "valid" : "invalid"); + doutc(cl, "%p '%pd' %s\n", dentry, dentry, valid ? "valid" : "invalid"); if (!valid) ceph_dir_clear_complete(dir);
@@ -1997,7 +2041,7 @@ static void ceph_d_release(struct dentry *dentry) struct ceph_dentry_info *di = ceph_dentry(dentry); struct ceph_fs_client *fsc = ceph_sb_to_fs_client(dentry->d_sb);
- dout("d_release %p\n", dentry); + doutc(fsc->client, "dentry %p '%pd'\n", dentry, dentry);
atomic64_dec(&fsc->mdsc->metric.total_dentries);
@@ -2018,10 +2062,12 @@ static void ceph_d_release(struct dentry *dentry) */ static void ceph_d_prune(struct dentry *dentry) { + struct ceph_mds_client *mdsc = ceph_sb_to_mdsc(dentry->d_sb); + struct ceph_client *cl = mdsc->fsc->client; struct ceph_inode_info *dir_ci; struct ceph_dentry_info *di;
- dout("ceph_d_prune %pd %p\n", dentry, dentry); + doutc(cl, "dentry %p '%pd'\n", dentry, dentry);
/* do we have a valid parent? */ if (IS_ROOT(dentry)) diff --git a/fs/ceph/export.c b/fs/ceph/export.c index 52c4daf2447d..726af69d4d62 100644 --- a/fs/ceph/export.c +++ b/fs/ceph/export.c @@ -36,6 +36,7 @@ struct ceph_nfs_snapfh { static int ceph_encode_snapfh(struct inode *inode, u32 *rawfh, int *max_len, struct inode *parent_inode) { + struct ceph_client *cl = ceph_inode_to_client(inode); static const int snap_handle_length = sizeof(struct ceph_nfs_snapfh) >> 2; struct ceph_nfs_snapfh *sfh = (void *)rawfh; @@ -79,13 +80,14 @@ static int ceph_encode_snapfh(struct inode *inode, u32 *rawfh, int *max_len, *max_len = snap_handle_length; ret = FILEID_BTRFS_WITH_PARENT; out: - dout("encode_snapfh %llx.%llx ret=%d\n", ceph_vinop(inode), ret); + doutc(cl, "%p %llx.%llx ret=%d\n", inode, ceph_vinop(inode), ret); return ret; }
static int ceph_encode_fh(struct inode *inode, u32 *rawfh, int *max_len, struct inode *parent_inode) { + struct ceph_client *cl = ceph_inode_to_client(inode); static const int handle_length = sizeof(struct ceph_nfs_fh) >> 2; static const int connected_handle_length = @@ -105,15 +107,15 @@ static int ceph_encode_fh(struct inode *inode, u32 *rawfh, int *max_len,
if (parent_inode) { struct ceph_nfs_confh *cfh = (void *)rawfh; - dout("encode_fh %llx with parent %llx\n", - ceph_ino(inode), ceph_ino(parent_inode)); + doutc(cl, "%p %llx.%llx with parent %p %llx.%llx\n", inode, + ceph_vinop(inode), parent_inode, ceph_vinop(parent_inode)); cfh->ino = ceph_ino(inode); cfh->parent_ino = ceph_ino(parent_inode); *max_len = connected_handle_length; type = FILEID_INO32_GEN_PARENT; } else { struct ceph_nfs_fh *fh = (void *)rawfh; - dout("encode_fh %llx\n", ceph_ino(inode)); + doutc(cl, "%p %llx.%llx\n", inode, ceph_vinop(inode)); fh->ino = ceph_ino(inode); *max_len = handle_length; type = FILEID_INO32_GEN; @@ -206,6 +208,7 @@ static struct dentry *__snapfh_to_dentry(struct super_block *sb, bool want_parent) { struct ceph_mds_client *mdsc = ceph_sb_to_fs_client(sb)->mdsc; + struct ceph_client *cl = mdsc->fsc->client; struct ceph_mds_request *req; struct inode *inode; struct ceph_vino vino; @@ -278,11 +281,10 @@ static struct dentry *__snapfh_to_dentry(struct super_block *sb, ceph_mdsc_put_request(req);
if (want_parent) { - dout("snapfh_to_parent %llx.%llx\n err=%d\n", - vino.ino, vino.snap, err); + doutc(cl, "%llx.%llx\n err=%d\n", vino.ino, vino.snap, err); } else { - dout("snapfh_to_dentry %llx.%llx parent %llx hash %x err=%d", - vino.ino, vino.snap, sfh->parent_ino, sfh->hash, err); + doutc(cl, "%llx.%llx parent %llx hash %x err=%d", vino.ino, + vino.snap, sfh->parent_ino, sfh->hash, err); } if (IS_ERR(inode)) return ERR_CAST(inode); @@ -297,6 +299,7 @@ static struct dentry *ceph_fh_to_dentry(struct super_block *sb, struct fid *fid, int fh_len, int fh_type) { + struct ceph_fs_client *fsc = ceph_sb_to_fs_client(sb); struct ceph_nfs_fh *fh = (void *)fid->raw;
if (fh_type == FILEID_BTRFS_WITH_PARENT) { @@ -310,7 +313,7 @@ static struct dentry *ceph_fh_to_dentry(struct super_block *sb, if (fh_len < sizeof(*fh) / 4) return NULL;
- dout("fh_to_dentry %llx\n", fh->ino); + doutc(fsc->client, "%llx\n", fh->ino); return __fh_to_dentry(sb, fh->ino); }
@@ -363,6 +366,7 @@ static struct dentry *__get_parent(struct super_block *sb, static struct dentry *ceph_get_parent(struct dentry *child) { struct inode *inode = d_inode(child); + struct ceph_client *cl = ceph_inode_to_client(inode); struct dentry *dn;
if (ceph_snap(inode) != CEPH_NOSNAP) { @@ -402,8 +406,8 @@ static struct dentry *ceph_get_parent(struct dentry *child) dn = __get_parent(child->d_sb, child, 0); } out: - dout("get_parent %p ino %llx.%llx err=%ld\n", - child, ceph_vinop(inode), (long)PTR_ERR_OR_ZERO(dn)); + doutc(cl, "child %p %p %llx.%llx err=%ld\n", child, inode, + ceph_vinop(inode), (long)PTR_ERR_OR_ZERO(dn)); return dn; }
@@ -414,6 +418,7 @@ static struct dentry *ceph_fh_to_parent(struct super_block *sb, struct fid *fid, int fh_len, int fh_type) { + struct ceph_fs_client *fsc = ceph_sb_to_fs_client(sb); struct ceph_nfs_confh *cfh = (void *)fid->raw; struct dentry *dentry;
@@ -427,7 +432,7 @@ static struct dentry *ceph_fh_to_parent(struct super_block *sb, if (fh_len < sizeof(*cfh) / 4) return NULL;
- dout("fh_to_parent %llx\n", cfh->parent_ino); + doutc(fsc->client, "%llx\n", cfh->parent_ino); dentry = __get_parent(sb, NULL, cfh->ino); if (unlikely(dentry == ERR_PTR(-ENOENT))) dentry = __fh_to_dentry(sb, cfh->parent_ino); @@ -526,8 +531,8 @@ static int __get_snap_name(struct dentry *parent, char *name, if (req) ceph_mdsc_put_request(req); kfree(last_name); - dout("get_snap_name %p ino %llx.%llx err=%d\n", - child, ceph_vinop(inode), err); + doutc(fsc->client, "child dentry %p %p %llx.%llx err=%d\n", child, + inode, ceph_vinop(inode), err); return err; }
@@ -588,9 +593,9 @@ static int ceph_get_name(struct dentry *parent, char *name, ceph_fname_free_buffer(dir, &oname); } out: - dout("get_name %p ino %llx.%llx err %d %s%s\n", - child, ceph_vinop(inode), err, - err ? "" : "name ", err ? "" : name); + doutc(mdsc->fsc->client, "child dentry %p %p %llx.%llx err %d %s%s\n", + child, inode, ceph_vinop(inode), err, err ? "" : "name ", + err ? "" : name); ceph_mdsc_put_request(req); return err; } diff --git a/fs/ceph/file.c b/fs/ceph/file.c index a03b11cf7887..59a0f149ec82 100644 --- a/fs/ceph/file.c +++ b/fs/ceph/file.c @@ -19,8 +19,9 @@ #include "io.h" #include "metric.h"
-static __le32 ceph_flags_sys2wire(u32 flags) +static __le32 ceph_flags_sys2wire(struct ceph_mds_client *mdsc, u32 flags) { + struct ceph_client *cl = mdsc->fsc->client; u32 wire_flags = 0;
switch (flags & O_ACCMODE) { @@ -48,7 +49,7 @@ static __le32 ceph_flags_sys2wire(u32 flags) #undef ceph_sys2wire
if (flags) - dout("unused open flags: %x\n", flags); + doutc(cl, "unused open flags: %x\n", flags);
return cpu_to_le32(wire_flags); } @@ -189,7 +190,7 @@ prepare_open_request(struct super_block *sb, int flags, int create_mode) if (IS_ERR(req)) goto out; req->r_fmode = ceph_flags_to_mode(flags); - req->r_args.open.flags = ceph_flags_sys2wire(flags); + req->r_args.open.flags = ceph_flags_sys2wire(mdsc, flags); req->r_args.open.mode = cpu_to_le32(create_mode); out: return req; @@ -201,11 +202,12 @@ static int ceph_init_file_info(struct inode *inode, struct file *file, struct ceph_inode_info *ci = ceph_inode(inode); struct ceph_mount_options *opt = ceph_inode_to_fs_client(&ci->netfs.inode)->mount_options; + struct ceph_client *cl = ceph_inode_to_client(inode); struct ceph_file_info *fi; int ret;
- dout("%s %p %p 0%o (%s)\n", __func__, inode, file, - inode->i_mode, isdir ? "dir" : "regular"); + doutc(cl, "%p %llx.%llx %p 0%o (%s)\n", inode, ceph_vinop(inode), + file, inode->i_mode, isdir ? "dir" : "regular"); BUG_ON(inode->i_fop->release != ceph_release);
if (isdir) { @@ -259,6 +261,7 @@ static int ceph_init_file_info(struct inode *inode, struct file *file, */ static int ceph_init_file(struct inode *inode, struct file *file, int fmode) { + struct ceph_client *cl = ceph_inode_to_client(inode); int ret = 0;
switch (inode->i_mode & S_IFMT) { @@ -271,13 +274,13 @@ static int ceph_init_file(struct inode *inode, struct file *file, int fmode) break;
case S_IFLNK: - dout("init_file %p %p 0%o (symlink)\n", inode, file, - inode->i_mode); + doutc(cl, "%p %llx.%llx %p 0%o (symlink)\n", inode, + ceph_vinop(inode), file, inode->i_mode); break;
default: - dout("init_file %p %p 0%o (special)\n", inode, file, - inode->i_mode); + doutc(cl, "%p %llx.%llx %p 0%o (special)\n", inode, + ceph_vinop(inode), file, inode->i_mode); /* * we need to drop the open ref now, since we don't * have .release set to ceph_release. @@ -296,6 +299,7 @@ static int ceph_init_file(struct inode *inode, struct file *file, int fmode) int ceph_renew_caps(struct inode *inode, int fmode) { struct ceph_mds_client *mdsc = ceph_sb_to_mdsc(inode->i_sb); + struct ceph_client *cl = mdsc->fsc->client; struct ceph_inode_info *ci = ceph_inode(inode); struct ceph_mds_request *req; int err, flags, wanted; @@ -307,8 +311,9 @@ int ceph_renew_caps(struct inode *inode, int fmode) (!(wanted & CEPH_CAP_ANY_WR) || ci->i_auth_cap)) { int issued = __ceph_caps_issued(ci, NULL); spin_unlock(&ci->i_ceph_lock); - dout("renew caps %p want %s issued %s updating mds_wanted\n", - inode, ceph_cap_string(wanted), ceph_cap_string(issued)); + doutc(cl, "%p %llx.%llx want %s issued %s updating mds_wanted\n", + inode, ceph_vinop(inode), ceph_cap_string(wanted), + ceph_cap_string(issued)); ceph_check_caps(ci, 0); return 0; } @@ -339,7 +344,8 @@ int ceph_renew_caps(struct inode *inode, int fmode) err = ceph_mdsc_do_request(mdsc, NULL, req); ceph_mdsc_put_request(req); out: - dout("renew caps %p open result=%d\n", inode, err); + doutc(cl, "%p %llx.%llx open result=%d\n", inode, ceph_vinop(inode), + err); return err < 0 ? err : 0; }
@@ -353,6 +359,7 @@ int ceph_open(struct inode *inode, struct file *file) { struct ceph_inode_info *ci = ceph_inode(inode); struct ceph_fs_client *fsc = ceph_sb_to_fs_client(inode->i_sb); + struct ceph_client *cl = fsc->client; struct ceph_mds_client *mdsc = fsc->mdsc; struct ceph_mds_request *req; struct ceph_file_info *fi = file->private_data; @@ -360,7 +367,7 @@ int ceph_open(struct inode *inode, struct file *file) int flags, fmode, wanted;
if (fi) { - dout("open file %p is already opened\n", file); + doutc(cl, "file %p is already opened\n", file); return 0; }
@@ -374,8 +381,8 @@ int ceph_open(struct inode *inode, struct file *file) return err; }
- dout("open inode %p ino %llx.%llx file %p flags %d (%d)\n", inode, - ceph_vinop(inode), file, flags, file->f_flags); + doutc(cl, "%p %llx.%llx file %p flags %d (%d)\n", inode, + ceph_vinop(inode), file, flags, file->f_flags); fmode = ceph_flags_to_mode(flags); wanted = ceph_caps_for_mode(fmode);
@@ -399,9 +406,9 @@ int ceph_open(struct inode *inode, struct file *file) int mds_wanted = __ceph_caps_mds_wanted(ci, true); int issued = __ceph_caps_issued(ci, NULL);
- dout("open %p fmode %d want %s issued %s using existing\n", - inode, fmode, ceph_cap_string(wanted), - ceph_cap_string(issued)); + doutc(cl, "open %p fmode %d want %s issued %s using existing\n", + inode, fmode, ceph_cap_string(wanted), + ceph_cap_string(issued)); __ceph_touch_fmode(ci, mdsc, fmode); spin_unlock(&ci->i_ceph_lock);
@@ -421,7 +428,7 @@ int ceph_open(struct inode *inode, struct file *file)
spin_unlock(&ci->i_ceph_lock);
- dout("open fmode %d wants %s\n", fmode, ceph_cap_string(wanted)); + doutc(cl, "open fmode %d wants %s\n", fmode, ceph_cap_string(wanted)); req = prepare_open_request(inode->i_sb, flags, 0); if (IS_ERR(req)) { err = PTR_ERR(req); @@ -435,7 +442,7 @@ int ceph_open(struct inode *inode, struct file *file) if (!err) err = ceph_init_file(inode, file, req->r_fmode); ceph_mdsc_put_request(req); - dout("open result=%d on %llx.%llx\n", err, ceph_vinop(inode)); + doutc(cl, "open result=%d on %llx.%llx\n", err, ceph_vinop(inode)); out: return err; } @@ -515,6 +522,7 @@ static int try_prep_async_create(struct inode *dir, struct dentry *dentry,
static void restore_deleg_ino(struct inode *dir, u64 ino) { + struct ceph_client *cl = ceph_inode_to_client(dir); struct ceph_inode_info *ci = ceph_inode(dir); struct ceph_mds_session *s = NULL;
@@ -525,7 +533,8 @@ static void restore_deleg_ino(struct inode *dir, u64 ino) if (s) { int err = ceph_restore_deleg_ino(s, ino); if (err) - pr_warn("ceph: unable to restore delegated ino 0x%llx to session: %d\n", + pr_warn_client(cl, + "unable to restore delegated ino 0x%llx to session: %d\n", ino, err); ceph_put_mds_session(s); } @@ -557,6 +566,7 @@ static void wake_async_create_waiters(struct inode *inode, static void ceph_async_create_cb(struct ceph_mds_client *mdsc, struct ceph_mds_request *req) { + struct ceph_client *cl = mdsc->fsc->client; struct dentry *dentry = req->r_dentry; struct inode *dinode = d_inode(dentry); struct inode *tinode = req->r_target_inode; @@ -577,7 +587,8 @@ static void ceph_async_create_cb(struct ceph_mds_client *mdsc, char *path = ceph_mdsc_build_path(mdsc, req->r_dentry, &pathlen, &base, 0);
- pr_warn("async create failure path=(%llx)%s result=%d!\n", + pr_warn_client(cl, + "async create failure path=(%llx)%s result=%d!\n", base, IS_ERR(path) ? "<<bad>>" : path, result); ceph_mdsc_free_path(path, pathlen);
@@ -596,14 +607,15 @@ static void ceph_async_create_cb(struct ceph_mds_client *mdsc, u64 ino = ceph_vino(tinode).ino;
if (req->r_deleg_ino != ino) - pr_warn("%s: inode number mismatch! err=%d deleg_ino=0x%llx target=0x%llx\n", - __func__, req->r_err, req->r_deleg_ino, ino); + pr_warn_client(cl, + "inode number mismatch! err=%d deleg_ino=0x%llx target=0x%llx\n", + req->r_err, req->r_deleg_ino, ino);
mapping_set_error(tinode->i_mapping, result); wake_async_create_waiters(tinode, req->r_session); } else if (!result) { - pr_warn("%s: no req->r_target_inode for 0x%llx\n", __func__, - req->r_deleg_ino); + pr_warn_client(cl, "no req->r_target_inode for 0x%llx\n", + req->r_deleg_ino); } out: ceph_mdsc_release_dir_caps(req); @@ -625,6 +637,7 @@ static int ceph_finish_async_create(struct inode *dir, struct inode *inode, struct timespec64 now; struct ceph_string *pool_ns; struct ceph_mds_client *mdsc = ceph_sb_to_mdsc(dir->i_sb); + struct ceph_client *cl = mdsc->fsc->client; struct ceph_vino vino = { .ino = req->r_deleg_ino, .snap = CEPH_NOSNAP };
@@ -683,7 +696,7 @@ static int ceph_finish_async_create(struct inode *dir, struct inode *inode, req->r_fmode, NULL); up_read(&mdsc->snap_rwsem); if (ret) { - dout("%s failed to fill inode: %d\n", __func__, ret); + doutc(cl, "failed to fill inode: %d\n", ret); ceph_dir_clear_complete(dir); if (!d_unhashed(dentry)) d_drop(dentry); @@ -691,8 +704,8 @@ static int ceph_finish_async_create(struct inode *dir, struct inode *inode, } else { struct dentry *dn;
- dout("%s d_adding new inode 0x%llx to 0x%llx/%s\n", __func__, - vino.ino, ceph_ino(dir), dentry->d_name.name); + doutc(cl, "d_adding new inode 0x%llx to 0x%llx/%s\n", + vino.ino, ceph_ino(dir), dentry->d_name.name); ceph_dir_clear_ordered(dir); ceph_init_inode_acls(inode, as_ctx); if (inode->i_state & I_NEW) { @@ -731,6 +744,7 @@ int ceph_atomic_open(struct inode *dir, struct dentry *dentry, struct file *file, unsigned flags, umode_t mode) { struct ceph_fs_client *fsc = ceph_sb_to_fs_client(dir->i_sb); + struct ceph_client *cl = fsc->client; struct ceph_mds_client *mdsc = fsc->mdsc; struct ceph_mds_request *req; struct inode *new_inode = NULL; @@ -740,9 +754,9 @@ int ceph_atomic_open(struct inode *dir, struct dentry *dentry, int mask; int err;
- dout("atomic_open %p dentry %p '%pd' %s flags %d mode 0%o\n", - dir, dentry, dentry, - d_unhashed(dentry) ? "unhashed" : "hashed", flags, mode); + doutc(cl, "%p %llx.%llx dentry %p '%pd' %s flags %d mode 0%o\n", + dir, ceph_vinop(dir), dentry, dentry, + d_unhashed(dentry) ? "unhashed" : "hashed", flags, mode);
if (dentry->d_name.len > NAME_MAX) return -ENAMETOOLONG; @@ -880,17 +894,18 @@ int ceph_atomic_open(struct inode *dir, struct dentry *dentry, goto out_req; if (dn || d_really_is_negative(dentry) || d_is_symlink(dentry)) { /* make vfs retry on splice, ENOENT, or symlink */ - dout("atomic_open finish_no_open on dn %p\n", dn); + doutc(cl, "finish_no_open on dn %p\n", dn); err = finish_no_open(file, dn); } else { if (IS_ENCRYPTED(dir) && !fscrypt_has_permitted_context(dir, d_inode(dentry))) { - pr_warn("Inconsistent encryption context (parent %llx:%llx child %llx:%llx)\n", + pr_warn_client(cl, + "Inconsistent encryption context (parent %llx:%llx child %llx:%llx)\n", ceph_vinop(dir), ceph_vinop(d_inode(dentry))); goto out_req; }
- dout("atomic_open finish_open on dn %p\n", dn); + doutc(cl, "finish_open on dn %p\n", dn); if (req->r_op == CEPH_MDS_OP_CREATE && req->r_reply_info.has_create_ino) { struct inode *newino = d_inode(dentry);
@@ -905,17 +920,19 @@ int ceph_atomic_open(struct inode *dir, struct dentry *dentry, iput(new_inode); out_ctx: ceph_release_acl_sec_ctx(&as_ctx); - dout("atomic_open result=%d\n", err); + doutc(cl, "result=%d\n", err); return err; }
int ceph_release(struct inode *inode, struct file *file) { + struct ceph_client *cl = ceph_inode_to_client(inode); struct ceph_inode_info *ci = ceph_inode(inode);
if (S_ISDIR(inode->i_mode)) { struct ceph_dir_file_info *dfi = file->private_data; - dout("release inode %p dir file %p\n", inode, file); + doutc(cl, "%p %llx.%llx dir file %p\n", inode, + ceph_vinop(inode), file); WARN_ON(!list_empty(&dfi->file_info.rw_contexts));
ceph_put_fmode(ci, dfi->file_info.fmode, 1); @@ -927,7 +944,8 @@ int ceph_release(struct inode *inode, struct file *file) kmem_cache_free(ceph_dir_file_cachep, dfi); } else { struct ceph_file_info *fi = file->private_data; - dout("release inode %p regular file %p\n", inode, file); + doutc(cl, "%p %llx.%llx regular file %p\n", inode, + ceph_vinop(inode), file); WARN_ON(!list_empty(&fi->rw_contexts));
ceph_fscache_unuse_cookie(inode, file->f_mode & FMODE_WRITE); @@ -963,6 +981,7 @@ ssize_t __ceph_sync_read(struct inode *inode, loff_t *ki_pos, { struct ceph_inode_info *ci = ceph_inode(inode); struct ceph_fs_client *fsc = ceph_inode_to_fs_client(inode); + struct ceph_client *cl = fsc->client; struct ceph_osd_client *osdc = &fsc->client->osdc; ssize_t ret; u64 off = *ki_pos; @@ -971,7 +990,8 @@ ssize_t __ceph_sync_read(struct inode *inode, loff_t *ki_pos, bool sparse = IS_ENCRYPTED(inode) || ceph_test_mount_opt(fsc, SPARSEREAD); u64 objver = 0;
- dout("sync_read on inode %p %llx~%llx\n", inode, *ki_pos, len); + doutc(cl, "on inode %p %llx.%llx %llx~%llx\n", inode, + ceph_vinop(inode), *ki_pos, len);
if (ceph_inode_is_shutdown(inode)) return -EIO; @@ -1006,8 +1026,8 @@ ssize_t __ceph_sync_read(struct inode *inode, loff_t *ki_pos, /* determine new offset/length if encrypted */ ceph_fscrypt_adjust_off_and_len(inode, &read_off, &read_len);
- dout("sync_read orig %llu~%llu reading %llu~%llu", - off, len, read_off, read_len); + doutc(cl, "orig %llu~%llu reading %llu~%llu", off, len, + read_off, read_len);
req = ceph_osdc_new_request(osdc, &ci->i_layout, ci->i_vino, read_off, &read_len, 0, 1, @@ -1061,8 +1081,8 @@ ssize_t __ceph_sync_read(struct inode *inode, loff_t *ki_pos, objver = req->r_version;
i_size = i_size_read(inode); - dout("sync_read %llu~%llu got %zd i_size %llu%s\n", - off, len, ret, i_size, (more ? " MORE" : "")); + doutc(cl, "%llu~%llu got %zd i_size %llu%s\n", off, len, + ret, i_size, (more ? " MORE" : ""));
/* Fix it to go to end of extent map */ if (sparse && ret >= 0) @@ -1108,8 +1128,8 @@ ssize_t __ceph_sync_read(struct inode *inode, loff_t *ki_pos, int zlen = min(len - ret, i_size - off - ret); int zoff = page_off + ret;
- dout("sync_read zero gap %llu~%llu\n", - off + ret, off + ret + zlen); + doutc(cl, "zero gap %llu~%llu\n", off + ret, + off + ret + zlen); ceph_zero_page_vector_range(zoff, zlen, pages); ret += zlen; } @@ -1154,7 +1174,7 @@ ssize_t __ceph_sync_read(struct inode *inode, loff_t *ki_pos, if (last_objver) *last_objver = objver; } - dout("sync_read result %zd retry_op %d\n", ret, *retry_op); + doutc(cl, "result %zd retry_op %d\n", ret, *retry_op); return ret; }
@@ -1163,9 +1183,11 @@ static ssize_t ceph_sync_read(struct kiocb *iocb, struct iov_iter *to, { struct file *file = iocb->ki_filp; struct inode *inode = file_inode(file); + struct ceph_client *cl = ceph_inode_to_client(inode);
- dout("sync_read on file %p %llx~%zx %s\n", file, iocb->ki_pos, - iov_iter_count(to), (file->f_flags & O_DIRECT) ? "O_DIRECT" : ""); + doutc(cl, "on file %p %llx~%zx %s\n", file, iocb->ki_pos, + iov_iter_count(to), + (file->f_flags & O_DIRECT) ? "O_DIRECT" : "");
return __ceph_sync_read(inode, &iocb->ki_pos, to, retry_op, NULL); } @@ -1193,6 +1215,7 @@ static void ceph_aio_retry_work(struct work_struct *work); static void ceph_aio_complete(struct inode *inode, struct ceph_aio_request *aio_req) { + struct ceph_client *cl = ceph_inode_to_client(inode); struct ceph_inode_info *ci = ceph_inode(inode); int ret;
@@ -1206,7 +1229,7 @@ static void ceph_aio_complete(struct inode *inode, if (!ret) ret = aio_req->total_len;
- dout("ceph_aio_complete %p rc %d\n", inode, ret); + doutc(cl, "%p %llx.%llx rc %d\n", inode, ceph_vinop(inode), ret);
if (ret >= 0 && aio_req->write) { int dirty; @@ -1245,11 +1268,13 @@ static void ceph_aio_complete_req(struct ceph_osd_request *req) struct ceph_client_metric *metric = &ceph_sb_to_mdsc(inode->i_sb)->metric; unsigned int len = osd_data->bvec_pos.iter.bi_size; bool sparse = (op->op == CEPH_OSD_OP_SPARSE_READ); + struct ceph_client *cl = ceph_inode_to_client(inode);
BUG_ON(osd_data->type != CEPH_OSD_DATA_TYPE_BVECS); BUG_ON(!osd_data->num_bvecs);
- dout("ceph_aio_complete_req %p rc %d bytes %u\n", inode, rc, len); + doutc(cl, "req %p inode %p %llx.%llx, rc %d bytes %u\n", req, + inode, ceph_vinop(inode), rc, len);
if (rc == -EOLDSNAPC) { struct ceph_aio_work *aio_work; @@ -1390,6 +1415,7 @@ ceph_direct_read_write(struct kiocb *iocb, struct iov_iter *iter, struct inode *inode = file_inode(file); struct ceph_inode_info *ci = ceph_inode(inode); struct ceph_fs_client *fsc = ceph_inode_to_fs_client(inode); + struct ceph_client *cl = fsc->client; struct ceph_client_metric *metric = &fsc->mdsc->metric; struct ceph_vino vino; struct ceph_osd_request *req; @@ -1408,9 +1434,9 @@ ceph_direct_read_write(struct kiocb *iocb, struct iov_iter *iter, if (write && ceph_snap(file_inode(file)) != CEPH_NOSNAP) return -EROFS;
- dout("sync_direct_%s on file %p %lld~%u snapc %p seq %lld\n", - (write ? "write" : "read"), file, pos, (unsigned)count, - snapc, snapc ? snapc->seq : 0); + doutc(cl, "sync_direct_%s on file %p %lld~%u snapc %p seq %lld\n", + (write ? "write" : "read"), file, pos, (unsigned)count, + snapc, snapc ? snapc->seq : 0);
if (write) { int ret2; @@ -1421,7 +1447,8 @@ ceph_direct_read_write(struct kiocb *iocb, struct iov_iter *iter, pos >> PAGE_SHIFT, (pos + count - 1) >> PAGE_SHIFT); if (ret2 < 0) - dout("invalidate_inode_pages2_range returned %d\n", ret2); + doutc(cl, "invalidate_inode_pages2_range returned %d\n", + ret2);
flags = /* CEPH_OSD_FLAG_ORDERSNAP | */ CEPH_OSD_FLAG_WRITE; } else { @@ -1617,6 +1644,7 @@ ceph_sync_write(struct kiocb *iocb, struct iov_iter *from, loff_t pos, struct inode *inode = file_inode(file); struct ceph_inode_info *ci = ceph_inode(inode); struct ceph_fs_client *fsc = ceph_inode_to_fs_client(inode); + struct ceph_client *cl = fsc->client; struct ceph_osd_client *osdc = &fsc->client->osdc; struct ceph_osd_request *req; struct page **pages; @@ -1631,8 +1659,8 @@ ceph_sync_write(struct kiocb *iocb, struct iov_iter *from, loff_t pos, if (ceph_snap(file_inode(file)) != CEPH_NOSNAP) return -EROFS;
- dout("sync_write on file %p %lld~%u snapc %p seq %lld\n", - file, pos, (unsigned)count, snapc, snapc->seq); + doutc(cl, "on file %p %lld~%u snapc %p seq %lld\n", file, pos, + (unsigned)count, snapc, snapc->seq);
ret = filemap_write_and_wait_range(inode->i_mapping, pos, pos + count - 1); @@ -1676,9 +1704,9 @@ ceph_sync_write(struct kiocb *iocb, struct iov_iter *from, loff_t pos, last = (pos + len) != (write_pos + write_len); rmw = first || last;
- dout("sync_write ino %llx %lld~%llu adjusted %lld~%llu -- %srmw\n", - ci->i_vino.ino, pos, len, write_pos, write_len, - rmw ? "" : "no "); + doutc(cl, "ino %llx %lld~%llu adjusted %lld~%llu -- %srmw\n", + ci->i_vino.ino, pos, len, write_pos, write_len, + rmw ? "" : "no ");
/* * The data is emplaced into the page as it would be if it were @@ -1887,7 +1915,7 @@ ceph_sync_write(struct kiocb *iocb, struct iov_iter *from, loff_t pos, left -= ret; } if (ret < 0) { - dout("sync_write write failed with %d\n", ret); + doutc(cl, "write failed with %d\n", ret); ceph_release_page_vector(pages, num_pages); break; } @@ -1897,7 +1925,7 @@ ceph_sync_write(struct kiocb *iocb, struct iov_iter *from, loff_t pos, write_pos, write_len, GFP_KERNEL); if (ret < 0) { - dout("encryption failed with %d\n", ret); + doutc(cl, "encryption failed with %d\n", ret); ceph_release_page_vector(pages, num_pages); break; } @@ -1916,7 +1944,7 @@ ceph_sync_write(struct kiocb *iocb, struct iov_iter *from, loff_t pos, break; }
- dout("sync_write write op %lld~%llu\n", write_pos, write_len); + doutc(cl, "write op %lld~%llu\n", write_pos, write_len); osd_req_op_extent_osd_data_pages(req, rmw ? 1 : 0, pages, write_len, offset_in_page(write_pos), false, true); @@ -1947,7 +1975,7 @@ ceph_sync_write(struct kiocb *iocb, struct iov_iter *from, loff_t pos, req->r_end_latency, len, ret); ceph_osdc_put_request(req); if (ret != 0) { - dout("sync_write osd write returned %d\n", ret); + doutc(cl, "osd write returned %d\n", ret); /* Version changed! Must re-do the rmw cycle */ if ((assert_ver && (ret == -ERANGE || ret == -EOVERFLOW)) || (!assert_ver && ret == -EEXIST)) { @@ -1977,13 +2005,13 @@ ceph_sync_write(struct kiocb *iocb, struct iov_iter *from, loff_t pos, pos >> PAGE_SHIFT, (pos + len - 1) >> PAGE_SHIFT); if (ret < 0) { - dout("invalidate_inode_pages2_range returned %d\n", - ret); + doutc(cl, "invalidate_inode_pages2_range returned %d\n", + ret); ret = 0; } pos += len; written += len; - dout("sync_write written %d\n", written); + doutc(cl, "written %d\n", written); if (pos > i_size_read(inode)) { check_caps = ceph_inode_set_size(inode, pos); if (check_caps) @@ -1997,7 +2025,7 @@ ceph_sync_write(struct kiocb *iocb, struct iov_iter *from, loff_t pos, ret = written; iocb->ki_pos = pos; } - dout("sync_write returning %d\n", ret); + doutc(cl, "returning %d\n", ret); return ret; }
@@ -2016,13 +2044,14 @@ static ssize_t ceph_read_iter(struct kiocb *iocb, struct iov_iter *to) struct inode *inode = file_inode(filp); struct ceph_inode_info *ci = ceph_inode(inode); bool direct_lock = iocb->ki_flags & IOCB_DIRECT; + struct ceph_client *cl = ceph_inode_to_client(inode); ssize_t ret; int want = 0, got = 0; int retry_op = 0, read = 0;
again: - dout("aio_read %p %llx.%llx %llu~%u trying to get caps on %p\n", - inode, ceph_vinop(inode), iocb->ki_pos, (unsigned)len, inode); + doutc(cl, "%llu~%u trying to get caps on %p %llx.%llx\n", + iocb->ki_pos, (unsigned)len, inode, ceph_vinop(inode));
if (ceph_inode_is_shutdown(inode)) return -ESTALE; @@ -2050,9 +2079,9 @@ static ssize_t ceph_read_iter(struct kiocb *iocb, struct iov_iter *to) (iocb->ki_flags & IOCB_DIRECT) || (fi->flags & CEPH_F_SYNC)) {
- dout("aio_sync_read %p %llx.%llx %llu~%u got cap refs on %s\n", - inode, ceph_vinop(inode), iocb->ki_pos, (unsigned)len, - ceph_cap_string(got)); + doutc(cl, "sync %p %llx.%llx %llu~%u got cap refs on %s\n", + inode, ceph_vinop(inode), iocb->ki_pos, (unsigned)len, + ceph_cap_string(got));
if (!ceph_has_inline_data(ci)) { if (!retry_op && @@ -2070,16 +2099,16 @@ static ssize_t ceph_read_iter(struct kiocb *iocb, struct iov_iter *to) } } else { CEPH_DEFINE_RW_CONTEXT(rw_ctx, got); - dout("aio_read %p %llx.%llx %llu~%u got cap refs on %s\n", - inode, ceph_vinop(inode), iocb->ki_pos, (unsigned)len, - ceph_cap_string(got)); + doutc(cl, "async %p %llx.%llx %llu~%u got cap refs on %s\n", + inode, ceph_vinop(inode), iocb->ki_pos, (unsigned)len, + ceph_cap_string(got)); ceph_add_rw_context(fi, &rw_ctx); ret = generic_file_read_iter(iocb, to); ceph_del_rw_context(fi, &rw_ctx); }
- dout("aio_read %p %llx.%llx dropping cap refs on %s = %d\n", - inode, ceph_vinop(inode), ceph_cap_string(got), (int)ret); + doutc(cl, "%p %llx.%llx dropping cap refs on %s = %d\n", + inode, ceph_vinop(inode), ceph_cap_string(got), (int)ret); ceph_put_cap_refs(ci, got);
if (direct_lock) @@ -2139,8 +2168,8 @@ static ssize_t ceph_read_iter(struct kiocb *iocb, struct iov_iter *to) /* hit EOF or hole? */ if (retry_op == CHECK_EOF && iocb->ki_pos < i_size && ret < len) { - dout("sync_read hit hole, ppos %lld < size %lld" - ", reading more\n", iocb->ki_pos, i_size); + doutc(cl, "hit hole, ppos %lld < size %lld, reading more\n", + iocb->ki_pos, i_size);
read += ret; len -= ret; @@ -2235,6 +2264,7 @@ static ssize_t ceph_write_iter(struct kiocb *iocb, struct iov_iter *from) struct inode *inode = file_inode(file); struct ceph_inode_info *ci = ceph_inode(inode); struct ceph_fs_client *fsc = ceph_inode_to_fs_client(inode); + struct ceph_client *cl = fsc->client; struct ceph_osd_client *osdc = &fsc->client->osdc; struct ceph_cap_flush *prealloc_cf; ssize_t count, written = 0; @@ -2302,8 +2332,9 @@ static ssize_t ceph_write_iter(struct kiocb *iocb, struct iov_iter *from) if (err) goto out;
- dout("aio_write %p %llx.%llx %llu~%zd getting caps. i_size %llu\n", - inode, ceph_vinop(inode), pos, count, i_size_read(inode)); + doutc(cl, "%p %llx.%llx %llu~%zd getting caps. i_size %llu\n", + inode, ceph_vinop(inode), pos, count, + i_size_read(inode)); if (!(fi->flags & CEPH_F_SYNC) && !direct_lock) want |= CEPH_CAP_FILE_BUFFER; if (fi->fmode & CEPH_FILE_MODE_LAZY) @@ -2319,8 +2350,8 @@ static ssize_t ceph_write_iter(struct kiocb *iocb, struct iov_iter *from)
inode_inc_iversion_raw(inode);
- dout("aio_write %p %llx.%llx %llu~%zd got cap refs on %s\n", - inode, ceph_vinop(inode), pos, count, ceph_cap_string(got)); + doutc(cl, "%p %llx.%llx %llu~%zd got cap refs on %s\n", + inode, ceph_vinop(inode), pos, count, ceph_cap_string(got));
if ((got & (CEPH_CAP_FILE_BUFFER|CEPH_CAP_FILE_LAZYIO)) == 0 || (iocb->ki_flags & IOCB_DIRECT) || (fi->flags & CEPH_F_SYNC) || @@ -2380,14 +2411,14 @@ static ssize_t ceph_write_iter(struct kiocb *iocb, struct iov_iter *from) ceph_check_caps(ci, CHECK_CAPS_FLUSH); }
- dout("aio_write %p %llx.%llx %llu~%u dropping cap refs on %s\n", - inode, ceph_vinop(inode), pos, (unsigned)count, - ceph_cap_string(got)); + doutc(cl, "%p %llx.%llx %llu~%u dropping cap refs on %s\n", + inode, ceph_vinop(inode), pos, (unsigned)count, + ceph_cap_string(got)); ceph_put_cap_refs(ci, got);
if (written == -EOLDSNAPC) { - dout("aio_write %p %llx.%llx %llu~%u" "got EOLDSNAPC, retrying\n", - inode, ceph_vinop(inode), pos, (unsigned)count); + doutc(cl, "%p %llx.%llx %llu~%u" "got EOLDSNAPC, retrying\n", + inode, ceph_vinop(inode), pos, (unsigned)count); goto retry_snap; }
@@ -2559,14 +2590,15 @@ static long ceph_fallocate(struct file *file, int mode, struct inode *inode = file_inode(file); struct ceph_inode_info *ci = ceph_inode(inode); struct ceph_cap_flush *prealloc_cf; + struct ceph_client *cl = ceph_inode_to_client(inode); int want, got = 0; int dirty; int ret = 0; loff_t endoff = 0; loff_t size;
- dout("%s %p %llx.%llx mode %x, offset %llu length %llu\n", __func__, - inode, ceph_vinop(inode), mode, offset, length); + doutc(cl, "%p %llx.%llx mode %x, offset %llu length %llu\n", + inode, ceph_vinop(inode), mode, offset, length);
if (mode != (FALLOC_FL_KEEP_SIZE | FALLOC_FL_PUNCH_HOLE)) return -EOPNOTSUPP; @@ -2695,6 +2727,7 @@ static void put_rd_wr_caps(struct ceph_inode_info *src_ci, int src_got, static int is_file_size_ok(struct inode *src_inode, struct inode *dst_inode, loff_t src_off, loff_t dst_off, size_t len) { + struct ceph_client *cl = ceph_inode_to_client(src_inode); loff_t size, endoff;
size = i_size_read(src_inode); @@ -2705,8 +2738,8 @@ static int is_file_size_ok(struct inode *src_inode, struct inode *dst_inode, * inode. */ if (src_off + len > size) { - dout("Copy beyond EOF (%llu + %zu > %llu)\n", - src_off, len, size); + doutc(cl, "Copy beyond EOF (%llu + %zu > %llu)\n", src_off, + len, size); return -EOPNOTSUPP; } size = i_size_read(dst_inode); @@ -2782,6 +2815,7 @@ static ssize_t ceph_do_objects_copy(struct ceph_inode_info *src_ci, u64 *src_off u64 src_objnum, src_objoff, dst_objnum, dst_objoff; u32 src_objlen, dst_objlen; u32 object_size = src_ci->i_layout.object_size; + struct ceph_client *cl = fsc->client; int ret;
src_oloc.pool = src_ci->i_layout.pool_id; @@ -2823,9 +2857,10 @@ static ssize_t ceph_do_objects_copy(struct ceph_inode_info *src_ci, u64 *src_off if (ret) { if (ret == -EOPNOTSUPP) { fsc->have_copy_from2 = false; - pr_notice("OSDs don't support copy-from2; disabling copy offload\n"); + pr_notice_client(cl, + "OSDs don't support copy-from2; disabling copy offload\n"); } - dout("ceph_osdc_copy_from returned %d\n", ret); + doutc(cl, "returned %d\n", ret); if (!bytes) bytes = ret; goto out; @@ -2852,6 +2887,7 @@ static ssize_t __ceph_copy_file_range(struct file *src_file, loff_t src_off, struct ceph_inode_info *dst_ci = ceph_inode(dst_inode); struct ceph_cap_flush *prealloc_cf; struct ceph_fs_client *src_fsc = ceph_inode_to_fs_client(src_inode); + struct ceph_client *cl = src_fsc->client; loff_t size; ssize_t ret = -EIO, bytes; u64 src_objnum, dst_objnum, src_objoff, dst_objoff; @@ -2894,7 +2930,7 @@ static ssize_t __ceph_copy_file_range(struct file *src_file, loff_t src_off, (src_ci->i_layout.stripe_count != 1) || (dst_ci->i_layout.stripe_count != 1) || (src_ci->i_layout.object_size != dst_ci->i_layout.object_size)) { - dout("Invalid src/dst files layout\n"); + doutc(cl, "Invalid src/dst files layout\n"); return -EOPNOTSUPP; }
@@ -2912,12 +2948,12 @@ static ssize_t __ceph_copy_file_range(struct file *src_file, loff_t src_off, /* Start by sync'ing the source and destination files */ ret = file_write_and_wait_range(src_file, src_off, (src_off + len)); if (ret < 0) { - dout("failed to write src file (%zd)\n", ret); + doutc(cl, "failed to write src file (%zd)\n", ret); goto out; } ret = file_write_and_wait_range(dst_file, dst_off, (dst_off + len)); if (ret < 0) { - dout("failed to write dst file (%zd)\n", ret); + doutc(cl, "failed to write dst file (%zd)\n", ret); goto out; }
@@ -2929,7 +2965,7 @@ static ssize_t __ceph_copy_file_range(struct file *src_file, loff_t src_off, err = get_rd_wr_caps(src_file, &src_got, dst_file, (dst_off + len), &dst_got); if (err < 0) { - dout("get_rd_wr_caps returned %d\n", err); + doutc(cl, "get_rd_wr_caps returned %d\n", err); ret = -EOPNOTSUPP; goto out; } @@ -2944,7 +2980,8 @@ static ssize_t __ceph_copy_file_range(struct file *src_file, loff_t src_off, dst_off >> PAGE_SHIFT, (dst_off + len) >> PAGE_SHIFT); if (ret < 0) { - dout("Failed to invalidate inode pages (%zd)\n", ret); + doutc(cl, "Failed to invalidate inode pages (%zd)\n", + ret); ret = 0; /* XXX */ } ceph_calc_file_object_mapping(&src_ci->i_layout, src_off, @@ -2965,7 +3002,7 @@ static ssize_t __ceph_copy_file_range(struct file *src_file, loff_t src_off, * starting at the src_off */ if (src_objoff) { - dout("Initial partial copy of %u bytes\n", src_objlen); + doutc(cl, "Initial partial copy of %u bytes\n", src_objlen);
/* * we need to temporarily drop all caps as we'll be calling @@ -2976,7 +3013,7 @@ static ssize_t __ceph_copy_file_range(struct file *src_file, loff_t src_off, &dst_off, src_objlen, flags); /* Abort on short copies or on error */ if (ret < (long)src_objlen) { - dout("Failed partial copy (%zd)\n", ret); + doutc(cl, "Failed partial copy (%zd)\n", ret); goto out; } len -= ret; @@ -2998,7 +3035,7 @@ static ssize_t __ceph_copy_file_range(struct file *src_file, loff_t src_off, ret = bytes; goto out_caps; } - dout("Copied %zu bytes out of %zu\n", bytes, len); + doutc(cl, "Copied %zu bytes out of %zu\n", bytes, len); len -= bytes; ret += bytes;
@@ -3026,13 +3063,13 @@ static ssize_t __ceph_copy_file_range(struct file *src_file, loff_t src_off, * there were errors in remote object copies (len >= object_size). */ if (len && (len < src_ci->i_layout.object_size)) { - dout("Final partial copy of %zu bytes\n", len); + doutc(cl, "Final partial copy of %zu bytes\n", len); bytes = do_splice_direct(src_file, &src_off, dst_file, &dst_off, len, flags); if (bytes > 0) ret += bytes; else - dout("Failed partial copy (%zd)\n", bytes); + doutc(cl, "Failed partial copy (%zd)\n", bytes); }
out: diff --git a/fs/ceph/inode.c b/fs/ceph/inode.c index db6977c15c28..808b5a69f028 100644 --- a/fs/ceph/inode.c +++ b/fs/ceph/inode.c @@ -129,6 +129,8 @@ void ceph_as_ctx_to_req(struct ceph_mds_request *req, struct inode *ceph_get_inode(struct super_block *sb, struct ceph_vino vino, struct inode *newino) { + struct ceph_mds_client *mdsc = ceph_sb_to_mdsc(sb); + struct ceph_client *cl = mdsc->fsc->client; struct inode *inode;
if (ceph_vino_is_reserved(vino)) @@ -145,12 +147,13 @@ struct inode *ceph_get_inode(struct super_block *sb, struct ceph_vino vino, }
if (!inode) { - dout("No inode found for %llx.%llx\n", vino.ino, vino.snap); + doutc(cl, "no inode found for %llx.%llx\n", vino.ino, vino.snap); return ERR_PTR(-ENOMEM); }
- dout("get_inode on %llu=%llx.%llx got %p new %d\n", ceph_present_inode(inode), - ceph_vinop(inode), inode, !!(inode->i_state & I_NEW)); + doutc(cl, "on %llx=%llx.%llx got %p new %d\n", + ceph_present_inode(inode), ceph_vinop(inode), inode, + !!(inode->i_state & I_NEW)); return inode; }
@@ -159,6 +162,7 @@ struct inode *ceph_get_inode(struct super_block *sb, struct ceph_vino vino, */ struct inode *ceph_get_snapdir(struct inode *parent) { + struct ceph_client *cl = ceph_inode_to_client(parent); struct ceph_vino vino = { .ino = ceph_ino(parent), .snap = CEPH_SNAPDIR, @@ -171,14 +175,14 @@ struct inode *ceph_get_snapdir(struct inode *parent) return inode;
if (!S_ISDIR(parent->i_mode)) { - pr_warn_once("bad snapdir parent type (mode=0%o)\n", - parent->i_mode); + pr_warn_once_client(cl, "bad snapdir parent type (mode=0%o)\n", + parent->i_mode); goto err; }
if (!(inode->i_state & I_NEW) && !S_ISDIR(inode->i_mode)) { - pr_warn_once("bad snapdir inode type (mode=0%o)\n", - inode->i_mode); + pr_warn_once_client(cl, "bad snapdir inode type (mode=0%o)\n", + inode->i_mode); goto err; }
@@ -203,7 +207,7 @@ struct inode *ceph_get_snapdir(struct inode *parent) inode->i_flags |= S_ENCRYPTED; ci->fscrypt_auth_len = pci->fscrypt_auth_len; } else { - dout("Failed to alloc snapdir fscrypt_auth\n"); + doutc(cl, "Failed to alloc snapdir fscrypt_auth\n"); ret = -ENOMEM; goto err; } @@ -249,6 +253,8 @@ const struct inode_operations ceph_file_iops = { static struct ceph_inode_frag *__get_or_create_frag(struct ceph_inode_info *ci, u32 f) { + struct inode *inode = &ci->netfs.inode; + struct ceph_client *cl = ceph_inode_to_client(inode); struct rb_node **p; struct rb_node *parent = NULL; struct ceph_inode_frag *frag; @@ -279,8 +285,7 @@ static struct ceph_inode_frag *__get_or_create_frag(struct ceph_inode_info *ci, rb_link_node(&frag->node, parent, p); rb_insert_color(&frag->node, &ci->i_fragtree);
- dout("get_or_create_frag added %llx.%llx frag %x\n", - ceph_vinop(&ci->netfs.inode), f); + doutc(cl, "added %p %llx.%llx frag %x\n", inode, ceph_vinop(inode), f); return frag; }
@@ -313,6 +318,7 @@ struct ceph_inode_frag *__ceph_find_frag(struct ceph_inode_info *ci, u32 f) static u32 __ceph_choose_frag(struct ceph_inode_info *ci, u32 v, struct ceph_inode_frag *pfrag, int *found) { + struct ceph_client *cl = ceph_inode_to_client(&ci->netfs.inode); u32 t = ceph_frag_make(0, 0); struct ceph_inode_frag *frag; unsigned nway, i; @@ -336,8 +342,8 @@ static u32 __ceph_choose_frag(struct ceph_inode_info *ci, u32 v,
/* choose child */ nway = 1 << frag->split_by; - dout("choose_frag(%x) %x splits by %d (%d ways)\n", v, t, - frag->split_by, nway); + doutc(cl, "frag(%x) %x splits by %d (%d ways)\n", v, t, + frag->split_by, nway); for (i = 0; i < nway; i++) { n = ceph_frag_make_child(t, frag->split_by, i); if (ceph_frag_contains_value(n, v)) { @@ -347,7 +353,7 @@ static u32 __ceph_choose_frag(struct ceph_inode_info *ci, u32 v, } BUG_ON(i == nway); } - dout("choose_frag(%x) = %x\n", v, t); + doutc(cl, "frag(%x) = %x\n", v, t);
return t; } @@ -371,6 +377,7 @@ static int ceph_fill_dirfrag(struct inode *inode, struct ceph_mds_reply_dirfrag *dirinfo) { struct ceph_inode_info *ci = ceph_inode(inode); + struct ceph_client *cl = ceph_inode_to_client(inode); struct ceph_inode_frag *frag; u32 id = le32_to_cpu(dirinfo->frag); int mds = le32_to_cpu(dirinfo->auth); @@ -395,14 +402,14 @@ static int ceph_fill_dirfrag(struct inode *inode, goto out; if (frag->split_by == 0) { /* tree leaf, remove */ - dout("fill_dirfrag removed %llx.%llx frag %x" - " (no ref)\n", ceph_vinop(inode), id); + doutc(cl, "removed %p %llx.%llx frag %x (no ref)\n", + inode, ceph_vinop(inode), id); rb_erase(&frag->node, &ci->i_fragtree); kfree(frag); } else { /* tree branch, keep and clear */ - dout("fill_dirfrag cleared %llx.%llx frag %x" - " referral\n", ceph_vinop(inode), id); + doutc(cl, "cleared %p %llx.%llx frag %x referral\n", + inode, ceph_vinop(inode), id); frag->mds = -1; frag->ndist = 0; } @@ -415,8 +422,9 @@ static int ceph_fill_dirfrag(struct inode *inode, if (IS_ERR(frag)) { /* this is not the end of the world; we can continue with bad/inaccurate delegation info */ - pr_err("fill_dirfrag ENOMEM on mds ref %llx.%llx fg %x\n", - ceph_vinop(inode), le32_to_cpu(dirinfo->frag)); + pr_err_client(cl, "ENOMEM on mds ref %p %llx.%llx fg %x\n", + inode, ceph_vinop(inode), + le32_to_cpu(dirinfo->frag)); err = -ENOMEM; goto out; } @@ -425,8 +433,8 @@ static int ceph_fill_dirfrag(struct inode *inode, frag->ndist = min_t(u32, ndist, CEPH_MAX_DIRFRAG_REP); for (i = 0; i < frag->ndist; i++) frag->dist[i] = le32_to_cpu(dirinfo->dist[i]); - dout("fill_dirfrag %llx.%llx frag %x ndist=%d\n", - ceph_vinop(inode), frag->frag, frag->ndist); + doutc(cl, "%p %llx.%llx frag %x ndist=%d\n", inode, + ceph_vinop(inode), frag->frag, frag->ndist);
out: mutex_unlock(&ci->i_fragtree_mutex); @@ -454,6 +462,7 @@ static int ceph_fill_fragtree(struct inode *inode, struct ceph_frag_tree_head *fragtree, struct ceph_mds_reply_dirfrag *dirinfo) { + struct ceph_client *cl = ceph_inode_to_client(inode); struct ceph_inode_info *ci = ceph_inode(inode); struct ceph_inode_frag *frag, *prev_frag = NULL; struct rb_node *rb_node; @@ -489,15 +498,15 @@ static int ceph_fill_fragtree(struct inode *inode, frag_tree_split_cmp, NULL); }
- dout("fill_fragtree %llx.%llx\n", ceph_vinop(inode)); + doutc(cl, "%p %llx.%llx\n", inode, ceph_vinop(inode)); rb_node = rb_first(&ci->i_fragtree); for (i = 0; i < nsplits; i++) { id = le32_to_cpu(fragtree->splits[i].frag); split_by = le32_to_cpu(fragtree->splits[i].by); if (split_by == 0 || ceph_frag_bits(id) + split_by > 24) { - pr_err("fill_fragtree %llx.%llx invalid split %d/%u, " - "frag %x split by %d\n", ceph_vinop(inode), - i, nsplits, id, split_by); + pr_err_client(cl, "%p %llx.%llx invalid split %d/%u, " + "frag %x split by %d\n", inode, + ceph_vinop(inode), i, nsplits, id, split_by); continue; } frag = NULL; @@ -529,7 +538,7 @@ static int ceph_fill_fragtree(struct inode *inode, if (frag->split_by == 0) ci->i_fragtree_nsplits++; frag->split_by = split_by; - dout(" frag %x split by %d\n", frag->frag, frag->split_by); + doutc(cl, " frag %x split by %d\n", frag->frag, frag->split_by); prev_frag = frag; } while (rb_node) { @@ -554,6 +563,7 @@ static int ceph_fill_fragtree(struct inode *inode, */ struct inode *ceph_alloc_inode(struct super_block *sb) { + struct ceph_fs_client *fsc = ceph_sb_to_fs_client(sb); struct ceph_inode_info *ci; int i;
@@ -561,7 +571,7 @@ struct inode *ceph_alloc_inode(struct super_block *sb) if (!ci) return NULL;
- dout("alloc_inode %p\n", &ci->netfs.inode); + doutc(fsc->client, "%p\n", &ci->netfs.inode);
/* Set parameters for the netfs library */ netfs_inode_init(&ci->netfs, &ceph_netfs_ops); @@ -675,10 +685,11 @@ void ceph_evict_inode(struct inode *inode) { struct ceph_inode_info *ci = ceph_inode(inode); struct ceph_mds_client *mdsc = ceph_sb_to_mdsc(inode->i_sb); + struct ceph_client *cl = ceph_inode_to_client(inode); struct ceph_inode_frag *frag; struct rb_node *n;
- dout("evict_inode %p ino %llx.%llx\n", inode, ceph_vinop(inode)); + doutc(cl, "%p ino %llx.%llx\n", inode, ceph_vinop(inode));
percpu_counter_dec(&mdsc->metric.total_inodes);
@@ -701,8 +712,8 @@ void ceph_evict_inode(struct inode *inode) */ if (ci->i_snap_realm) { if (ceph_snap(inode) == CEPH_NOSNAP) { - dout(" dropping residual ref to snap realm %p\n", - ci->i_snap_realm); + doutc(cl, " dropping residual ref to snap realm %p\n", + ci->i_snap_realm); ceph_change_snap_realm(inode, NULL); } else { ceph_put_snapid_map(mdsc, ci->i_snapid_map); @@ -743,15 +754,16 @@ static inline blkcnt_t calc_inode_blocks(u64 size) int ceph_fill_file_size(struct inode *inode, int issued, u32 truncate_seq, u64 truncate_size, u64 size) { + struct ceph_client *cl = ceph_inode_to_client(inode); struct ceph_inode_info *ci = ceph_inode(inode); int queue_trunc = 0; loff_t isize = i_size_read(inode);
if (ceph_seq_cmp(truncate_seq, ci->i_truncate_seq) > 0 || (truncate_seq == ci->i_truncate_seq && size > isize)) { - dout("size %lld -> %llu\n", isize, size); + doutc(cl, "size %lld -> %llu\n", isize, size); if (size > 0 && S_ISDIR(inode->i_mode)) { - pr_err("fill_file_size non-zero size for directory\n"); + pr_err_client(cl, "non-zero size for directory\n"); size = 0; } i_size_write(inode, size); @@ -764,8 +776,8 @@ int ceph_fill_file_size(struct inode *inode, int issued, ceph_fscache_update(inode); ci->i_reported_size = size; if (truncate_seq != ci->i_truncate_seq) { - dout("%s truncate_seq %u -> %u\n", __func__, - ci->i_truncate_seq, truncate_seq); + doutc(cl, "truncate_seq %u -> %u\n", + ci->i_truncate_seq, truncate_seq); ci->i_truncate_seq = truncate_seq;
/* the MDS should have revoked these caps */ @@ -794,14 +806,15 @@ int ceph_fill_file_size(struct inode *inode, int issued, * anyway. */ if (ceph_seq_cmp(truncate_seq, ci->i_truncate_seq) >= 0) { - dout("%s truncate_size %lld -> %llu, encrypted %d\n", __func__, - ci->i_truncate_size, truncate_size, !!IS_ENCRYPTED(inode)); + doutc(cl, "truncate_size %lld -> %llu, encrypted %d\n", + ci->i_truncate_size, truncate_size, + !!IS_ENCRYPTED(inode));
ci->i_truncate_size = truncate_size;
if (IS_ENCRYPTED(inode)) { - dout("%s truncate_pagecache_size %lld -> %llu\n", - __func__, ci->i_truncate_pagecache_size, size); + doutc(cl, "truncate_pagecache_size %lld -> %llu\n", + ci->i_truncate_pagecache_size, size); ci->i_truncate_pagecache_size = size; } else { ci->i_truncate_pagecache_size = truncate_size; @@ -814,6 +827,7 @@ void ceph_fill_file_time(struct inode *inode, int issued, u64 time_warp_seq, struct timespec64 *ctime, struct timespec64 *mtime, struct timespec64 *atime) { + struct ceph_client *cl = ceph_inode_to_client(inode); struct ceph_inode_info *ci = ceph_inode(inode); struct timespec64 ictime = inode_get_ctime(inode); int warn = 0; @@ -825,7 +839,7 @@ void ceph_fill_file_time(struct inode *inode, int issued, CEPH_CAP_XATTR_EXCL)) { if (ci->i_version == 0 || timespec64_compare(ctime, &ictime) > 0) { - dout("ctime %lld.%09ld -> %lld.%09ld inc w/ cap\n", + doutc(cl, "ctime %lld.%09ld -> %lld.%09ld inc w/ cap\n", ictime.tv_sec, ictime.tv_nsec, ctime->tv_sec, ctime->tv_nsec); inode_set_ctime_to_ts(inode, *ctime); @@ -833,11 +847,10 @@ void ceph_fill_file_time(struct inode *inode, int issued, if (ci->i_version == 0 || ceph_seq_cmp(time_warp_seq, ci->i_time_warp_seq) > 0) { /* the MDS did a utimes() */ - dout("mtime %lld.%09ld -> %lld.%09ld " - "tw %d -> %d\n", - inode->i_mtime.tv_sec, inode->i_mtime.tv_nsec, - mtime->tv_sec, mtime->tv_nsec, - ci->i_time_warp_seq, (int)time_warp_seq); + doutc(cl, "mtime %lld.%09ld -> %lld.%09ld tw %d -> %d\n", + inode->i_mtime.tv_sec, inode->i_mtime.tv_nsec, + mtime->tv_sec, mtime->tv_nsec, + ci->i_time_warp_seq, (int)time_warp_seq);
inode->i_mtime = *mtime; inode->i_atime = *atime; @@ -845,17 +858,17 @@ void ceph_fill_file_time(struct inode *inode, int issued, } else if (time_warp_seq == ci->i_time_warp_seq) { /* nobody did utimes(); take the max */ if (timespec64_compare(mtime, &inode->i_mtime) > 0) { - dout("mtime %lld.%09ld -> %lld.%09ld inc\n", - inode->i_mtime.tv_sec, - inode->i_mtime.tv_nsec, - mtime->tv_sec, mtime->tv_nsec); + doutc(cl, "mtime %lld.%09ld -> %lld.%09ld inc\n", + inode->i_mtime.tv_sec, + inode->i_mtime.tv_nsec, + mtime->tv_sec, mtime->tv_nsec); inode->i_mtime = *mtime; } if (timespec64_compare(atime, &inode->i_atime) > 0) { - dout("atime %lld.%09ld -> %lld.%09ld inc\n", - inode->i_atime.tv_sec, - inode->i_atime.tv_nsec, - atime->tv_sec, atime->tv_nsec); + doutc(cl, "atime %lld.%09ld -> %lld.%09ld inc\n", + inode->i_atime.tv_sec, + inode->i_atime.tv_nsec, + atime->tv_sec, atime->tv_nsec); inode->i_atime = *atime; } } else if (issued & CEPH_CAP_FILE_EXCL) { @@ -875,13 +888,16 @@ void ceph_fill_file_time(struct inode *inode, int issued, } } if (warn) /* time_warp_seq shouldn't go backwards */ - dout("%p mds time_warp_seq %llu < %u\n", - inode, time_warp_seq, ci->i_time_warp_seq); + doutc(cl, "%p mds time_warp_seq %llu < %u\n", inode, + time_warp_seq, ci->i_time_warp_seq); }
#if IS_ENABLED(CONFIG_FS_ENCRYPTION) -static int decode_encrypted_symlink(const char *encsym, int enclen, u8 **decsym) +static int decode_encrypted_symlink(struct ceph_mds_client *mdsc, + const char *encsym, + int enclen, u8 **decsym) { + struct ceph_client *cl = mdsc->fsc->client; int declen; u8 *sym;
@@ -891,8 +907,9 @@ static int decode_encrypted_symlink(const char *encsym, int enclen, u8 **decsym)
declen = ceph_base64_decode(encsym, enclen, sym); if (declen < 0) { - pr_err("%s: can't decode symlink (%d). Content: %.*s\n", - __func__, declen, enclen, encsym); + pr_err_client(cl, + "can't decode symlink (%d). Content: %.*s\n", + declen, enclen, encsym); kfree(sym); return -EIO; } @@ -901,7 +918,9 @@ static int decode_encrypted_symlink(const char *encsym, int enclen, u8 **decsym) return declen; } #else -static int decode_encrypted_symlink(const char *encsym, int symlen, u8 **decsym) +static int decode_encrypted_symlink(struct ceph_mds_client *mdsc, + const char *encsym, + int symlen, u8 **decsym) { return -EOPNOTSUPP; } @@ -918,6 +937,7 @@ int ceph_fill_inode(struct inode *inode, struct page *locked_page, struct ceph_cap_reservation *caps_reservation) { struct ceph_mds_client *mdsc = ceph_sb_to_mdsc(inode->i_sb); + struct ceph_client *cl = mdsc->fsc->client; struct ceph_mds_reply_inode *info = iinfo->in; struct ceph_inode_info *ci = ceph_inode(inode); int issued, new_issued, info_caps; @@ -936,25 +956,26 @@ int ceph_fill_inode(struct inode *inode, struct page *locked_page,
lockdep_assert_held(&mdsc->snap_rwsem);
- dout("%s %p ino %llx.%llx v %llu had %llu\n", __func__, - inode, ceph_vinop(inode), le64_to_cpu(info->version), - ci->i_version); + doutc(cl, "%p ino %llx.%llx v %llu had %llu\n", inode, ceph_vinop(inode), + le64_to_cpu(info->version), ci->i_version);
/* Once I_NEW is cleared, we can't change type or dev numbers */ if (inode->i_state & I_NEW) { inode->i_mode = mode; } else { if (inode_wrong_type(inode, mode)) { - pr_warn_once("inode type changed! (ino %llx.%llx is 0%o, mds says 0%o)\n", - ceph_vinop(inode), inode->i_mode, mode); + pr_warn_once_client(cl, + "inode type changed! (ino %llx.%llx is 0%o, mds says 0%o)\n", + ceph_vinop(inode), inode->i_mode, mode); return -ESTALE; }
if ((S_ISCHR(mode) || S_ISBLK(mode)) && inode->i_rdev != rdev) { - pr_warn_once("dev inode rdev changed! (ino %llx.%llx is %u:%u, mds says %u:%u)\n", - ceph_vinop(inode), MAJOR(inode->i_rdev), - MINOR(inode->i_rdev), MAJOR(rdev), - MINOR(rdev)); + pr_warn_once_client(cl, + "dev inode rdev changed! (ino %llx.%llx is %u:%u, mds says %u:%u)\n", + ceph_vinop(inode), MAJOR(inode->i_rdev), + MINOR(inode->i_rdev), MAJOR(rdev), + MINOR(rdev)); return -ESTALE; } } @@ -976,8 +997,8 @@ int ceph_fill_inode(struct inode *inode, struct page *locked_page, if (iinfo->xattr_len > 4) { xattr_blob = ceph_buffer_new(iinfo->xattr_len, GFP_NOFS); if (!xattr_blob) - pr_err("%s ENOMEM xattr blob %d bytes\n", __func__, - iinfo->xattr_len); + pr_err_client(cl, "ENOMEM xattr blob %d bytes\n", + iinfo->xattr_len); }
if (iinfo->pool_ns_len > 0) @@ -1031,9 +1052,10 @@ int ceph_fill_inode(struct inode *inode, struct page *locked_page, inode->i_mode = mode; inode->i_uid = make_kuid(&init_user_ns, le32_to_cpu(info->uid)); inode->i_gid = make_kgid(&init_user_ns, le32_to_cpu(info->gid)); - dout("%p mode 0%o uid.gid %d.%d\n", inode, inode->i_mode, - from_kuid(&init_user_ns, inode->i_uid), - from_kgid(&init_user_ns, inode->i_gid)); + doutc(cl, "%p %llx.%llx mode 0%o uid.gid %d.%d\n", inode, + ceph_vinop(inode), inode->i_mode, + from_kuid(&init_user_ns, inode->i_uid), + from_kgid(&init_user_ns, inode->i_gid)); ceph_decode_timespec64(&ci->i_btime, &iinfo->btime); ceph_decode_timespec64(&ci->i_snap_btime, &iinfo->snap_btime); } @@ -1089,7 +1111,8 @@ int ceph_fill_inode(struct inode *inode, struct page *locked_page, if (size == round_up(fsize, CEPH_FSCRYPT_BLOCK_SIZE)) { size = fsize; } else { - pr_warn("fscrypt size mismatch: size=%llu fscrypt_file=%llu, discarding fscrypt_file size.\n", + pr_warn_client(cl, + "fscrypt size mismatch: size=%llu fscrypt_file=%llu, discarding fscrypt_file size.\n", info->size, size); } } @@ -1101,8 +1124,8 @@ int ceph_fill_inode(struct inode *inode, struct page *locked_page, /* only update max_size on auth cap */ if ((info->cap.flags & CEPH_CAP_FLAG_AUTH) && ci->i_max_size != le64_to_cpu(info->max_size)) { - dout("max_size %lld -> %llu\n", ci->i_max_size, - le64_to_cpu(info->max_size)); + doutc(cl, "max_size %lld -> %llu\n", + ci->i_max_size, le64_to_cpu(info->max_size)); ci->i_max_size = le64_to_cpu(info->max_size); } } @@ -1165,15 +1188,17 @@ int ceph_fill_inode(struct inode *inode, struct page *locked_page,
if (IS_ENCRYPTED(inode)) { if (symlen != i_size_read(inode)) - pr_err("%s %llx.%llx BAD symlink size %lld\n", - __func__, ceph_vinop(inode), + pr_err_client(cl, + "%p %llx.%llx BAD symlink size %lld\n", + inode, ceph_vinop(inode), i_size_read(inode));
- err = decode_encrypted_symlink(iinfo->symlink, + err = decode_encrypted_symlink(mdsc, iinfo->symlink, symlen, (u8 **)&sym); if (err < 0) { - pr_err("%s decoding encrypted symlink failed: %d\n", - __func__, err); + pr_err_client(cl, + "decoding encrypted symlink failed: %d\n", + err); goto out; } symlen = err; @@ -1181,8 +1206,9 @@ int ceph_fill_inode(struct inode *inode, struct page *locked_page, inode->i_blocks = calc_inode_blocks(symlen); } else { if (symlen != i_size_read(inode)) { - pr_err("%s %llx.%llx BAD symlink size %lld\n", - __func__, ceph_vinop(inode), + pr_err_client(cl, + "%p %llx.%llx BAD symlink size %lld\n", + inode, ceph_vinop(inode), i_size_read(inode)); i_size_write(inode, symlen); inode->i_blocks = calc_inode_blocks(symlen); @@ -1217,8 +1243,8 @@ int ceph_fill_inode(struct inode *inode, struct page *locked_page, inode->i_fop = &ceph_dir_fops; break; default: - pr_err("%s %llx.%llx BAD mode 0%o\n", __func__, - ceph_vinop(inode), inode->i_mode); + pr_err_client(cl, "%p %llx.%llx BAD mode 0%o\n", inode, + ceph_vinop(inode), inode->i_mode); }
/* were we issued a capability? */ @@ -1239,7 +1265,8 @@ int ceph_fill_inode(struct inode *inode, struct page *locked_page, (info_caps & CEPH_CAP_FILE_SHARED) && (issued & CEPH_CAP_FILE_EXCL) == 0 && !__ceph_dir_is_complete(ci)) { - dout(" marking %p complete (empty)\n", inode); + doutc(cl, " marking %p complete (empty)\n", + inode); i_size_write(inode, 0); __ceph_dir_set_complete(ci, atomic64_read(&ci->i_release_count), @@ -1248,8 +1275,8 @@ int ceph_fill_inode(struct inode *inode, struct page *locked_page,
wake = true; } else { - dout(" %p got snap_caps %s\n", inode, - ceph_cap_string(info_caps)); + doutc(cl, " %p got snap_caps %s\n", inode, + ceph_cap_string(info_caps)); ci->i_snap_caps |= info_caps; } } @@ -1265,8 +1292,8 @@ int ceph_fill_inode(struct inode *inode, struct page *locked_page,
if (cap_fmode >= 0) { if (!info_caps) - pr_warn("mds issued no caps on %llx.%llx\n", - ceph_vinop(inode)); + pr_warn_client(cl, "mds issued no caps on %llx.%llx\n", + ceph_vinop(inode)); __ceph_touch_fmode(ci, mdsc, cap_fmode); }
@@ -1312,14 +1339,14 @@ static void __update_dentry_lease(struct inode *dir, struct dentry *dentry, unsigned long from_time, struct ceph_mds_session **old_lease_session) { + struct ceph_client *cl = ceph_inode_to_client(dir); struct ceph_dentry_info *di = ceph_dentry(dentry); unsigned mask = le16_to_cpu(lease->mask); long unsigned duration = le32_to_cpu(lease->duration_ms); long unsigned ttl = from_time + (duration * HZ) / 1000; long unsigned half_ttl = from_time + (duration * HZ / 2) / 1000;
- dout("update_dentry_lease %p duration %lu ms ttl %lu\n", - dentry, duration, ttl); + doutc(cl, "%p duration %lu ms ttl %lu\n", dentry, duration, ttl);
/* only track leases on regular dentries */ if (ceph_snap(dir) != CEPH_NOSNAP) @@ -1420,6 +1447,7 @@ static void update_dentry_lease_careful(struct dentry *dentry, */ static int splice_dentry(struct dentry **pdn, struct inode *in) { + struct ceph_client *cl = ceph_inode_to_client(in); struct dentry *dn = *pdn; struct dentry *realdn;
@@ -1451,23 +1479,21 @@ static int splice_dentry(struct dentry **pdn, struct inode *in) d_drop(dn); realdn = d_splice_alias(in, dn); if (IS_ERR(realdn)) { - pr_err("splice_dentry error %ld %p inode %p ino %llx.%llx\n", - PTR_ERR(realdn), dn, in, ceph_vinop(in)); + pr_err_client(cl, "error %ld %p inode %p ino %llx.%llx\n", + PTR_ERR(realdn), dn, in, ceph_vinop(in)); return PTR_ERR(realdn); }
if (realdn) { - dout("dn %p (%d) spliced with %p (%d) " - "inode %p ino %llx.%llx\n", - dn, d_count(dn), - realdn, d_count(realdn), - d_inode(realdn), ceph_vinop(d_inode(realdn))); + doutc(cl, "dn %p (%d) spliced with %p (%d) inode %p ino %llx.%llx\n", + dn, d_count(dn), realdn, d_count(realdn), + d_inode(realdn), ceph_vinop(d_inode(realdn))); dput(dn); *pdn = realdn; } else { BUG_ON(!ceph_dentry(dn)); - dout("dn %p attached to %p ino %llx.%llx\n", - dn, d_inode(dn), ceph_vinop(d_inode(dn))); + doutc(cl, "dn %p attached to %p ino %llx.%llx\n", dn, + d_inode(dn), ceph_vinop(d_inode(dn))); } return 0; } @@ -1490,13 +1516,14 @@ int ceph_fill_trace(struct super_block *sb, struct ceph_mds_request *req) struct inode *in = NULL; struct ceph_vino tvino, dvino; struct ceph_fs_client *fsc = ceph_sb_to_fs_client(sb); + struct ceph_client *cl = fsc->client; int err = 0;
- dout("fill_trace %p is_dentry %d is_target %d\n", req, - rinfo->head->is_dentry, rinfo->head->is_target); + doutc(cl, "%p is_dentry %d is_target %d\n", req, + rinfo->head->is_dentry, rinfo->head->is_target);
if (!rinfo->head->is_target && !rinfo->head->is_dentry) { - dout("fill_trace reply is empty!\n"); + doutc(cl, "reply is empty!\n"); if (rinfo->head->result == 0 && req->r_parent) ceph_invalidate_dir_request(req); return 0; @@ -1553,13 +1580,13 @@ int ceph_fill_trace(struct super_block *sb, struct ceph_mds_request *req) tvino.snap = le64_to_cpu(rinfo->targeti.in->snapid); retry_lookup: dn = d_lookup(parent, &dname); - dout("d_lookup on parent=%p name=%.*s got %p\n", - parent, dname.len, dname.name, dn); + doutc(cl, "d_lookup on parent=%p name=%.*s got %p\n", + parent, dname.len, dname.name, dn);
if (!dn) { dn = d_alloc(parent, &dname); - dout("d_alloc %p '%.*s' = %p\n", parent, - dname.len, dname.name, dn); + doutc(cl, "d_alloc %p '%.*s' = %p\n", parent, + dname.len, dname.name, dn); if (!dn) { dput(parent); ceph_fname_free_buffer(dir, &oname); @@ -1575,8 +1602,8 @@ int ceph_fill_trace(struct super_block *sb, struct ceph_mds_request *req) } else if (d_really_is_positive(dn) && (ceph_ino(d_inode(dn)) != tvino.ino || ceph_snap(d_inode(dn)) != tvino.snap)) { - dout(" dn %p points to wrong inode %p\n", - dn, d_inode(dn)); + doutc(cl, " dn %p points to wrong inode %p\n", + dn, d_inode(dn)); ceph_dir_clear_ordered(dir); d_delete(dn); dput(dn); @@ -1601,8 +1628,8 @@ int ceph_fill_trace(struct super_block *sb, struct ceph_mds_request *req) rinfo->head->result == 0) ? req->r_fmode : -1, &req->r_caps_reservation); if (err < 0) { - pr_err("ceph_fill_inode badness %p %llx.%llx\n", - in, ceph_vinop(in)); + pr_err_client(cl, "badness %p %llx.%llx\n", in, + ceph_vinop(in)); req->r_target_inode = NULL; if (in->i_state & I_NEW) discard_new_inode(in); @@ -1652,36 +1679,32 @@ int ceph_fill_trace(struct super_block *sb, struct ceph_mds_request *req) have_lease = have_dir_cap || le32_to_cpu(rinfo->dlease->duration_ms); if (!have_lease) - dout("fill_trace no dentry lease or dir cap\n"); + doutc(cl, "no dentry lease or dir cap\n");
/* rename? */ if (req->r_old_dentry && req->r_op == CEPH_MDS_OP_RENAME) { struct inode *olddir = req->r_old_dentry_dir; BUG_ON(!olddir);
- dout(" src %p '%pd' dst %p '%pd'\n", - req->r_old_dentry, - req->r_old_dentry, - dn, dn); - dout("fill_trace doing d_move %p -> %p\n", - req->r_old_dentry, dn); + doutc(cl, " src %p '%pd' dst %p '%pd'\n", + req->r_old_dentry, req->r_old_dentry, dn, dn); + doutc(cl, "doing d_move %p -> %p\n", req->r_old_dentry, dn);
/* d_move screws up sibling dentries' offsets */ ceph_dir_clear_ordered(dir); ceph_dir_clear_ordered(olddir);
d_move(req->r_old_dentry, dn); - dout(" src %p '%pd' dst %p '%pd'\n", - req->r_old_dentry, - req->r_old_dentry, - dn, dn); + doutc(cl, " src %p '%pd' dst %p '%pd'\n", + req->r_old_dentry, req->r_old_dentry, dn, dn);
/* ensure target dentry is invalidated, despite rehashing bug in vfs_rename_dir */ ceph_invalidate_dentry_lease(dn);
- dout("dn %p gets new offset %lld\n", req->r_old_dentry, - ceph_dentry(req->r_old_dentry)->offset); + doutc(cl, "dn %p gets new offset %lld\n", + req->r_old_dentry, + ceph_dentry(req->r_old_dentry)->offset);
/* swap r_dentry and r_old_dentry in case that * splice_dentry() gets called later. This is safe @@ -1693,9 +1716,9 @@ int ceph_fill_trace(struct super_block *sb, struct ceph_mds_request *req)
/* null dentry? */ if (!rinfo->head->is_target) { - dout("fill_trace null dentry\n"); + doutc(cl, "null dentry\n"); if (d_really_is_positive(dn)) { - dout("d_delete %p\n", dn); + doutc(cl, "d_delete %p\n", dn); ceph_dir_clear_ordered(dir); d_delete(dn); } else if (have_lease) { @@ -1719,9 +1742,9 @@ int ceph_fill_trace(struct super_block *sb, struct ceph_mds_request *req) goto done; dn = req->r_dentry; /* may have spliced */ } else if (d_really_is_positive(dn) && d_inode(dn) != in) { - dout(" %p links to %p %llx.%llx, not %llx.%llx\n", - dn, d_inode(dn), ceph_vinop(d_inode(dn)), - ceph_vinop(in)); + doutc(cl, " %p links to %p %llx.%llx, not %llx.%llx\n", + dn, d_inode(dn), ceph_vinop(d_inode(dn)), + ceph_vinop(in)); d_invalidate(dn); have_lease = false; } @@ -1731,7 +1754,7 @@ int ceph_fill_trace(struct super_block *sb, struct ceph_mds_request *req) rinfo->dlease, session, req->r_request_started); } - dout(" final dn %p\n", dn); + doutc(cl, " final dn %p\n", dn); } else if ((req->r_op == CEPH_MDS_OP_LOOKUPSNAP || req->r_op == CEPH_MDS_OP_MKSNAP) && test_bit(CEPH_MDS_R_PARENT_LOCKED, &req->r_req_flags) && @@ -1742,7 +1765,8 @@ int ceph_fill_trace(struct super_block *sb, struct ceph_mds_request *req) BUG_ON(!dir); BUG_ON(ceph_snap(dir) != CEPH_SNAPDIR); BUG_ON(!req->r_dentry); - dout(" linking snapped dir %p to dn %p\n", in, req->r_dentry); + doutc(cl, " linking snapped dir %p to dn %p\n", in, + req->r_dentry); ceph_dir_clear_ordered(dir); ihold(in); err = splice_dentry(&req->r_dentry, in); @@ -1764,7 +1788,7 @@ int ceph_fill_trace(struct super_block *sb, struct ceph_mds_request *req) &dvino, ptvino); } done: - dout("fill_trace done err=%d\n", err); + doutc(cl, "done err=%d\n", err); return err; }
@@ -1775,6 +1799,7 @@ static int readdir_prepopulate_inodes_only(struct ceph_mds_request *req, struct ceph_mds_session *session) { struct ceph_mds_reply_info_parsed *rinfo = &req->r_reply_info; + struct ceph_client *cl = session->s_mdsc->fsc->client; int i, err = 0;
for (i = 0; i < rinfo->dir_nr; i++) { @@ -1789,14 +1814,14 @@ static int readdir_prepopulate_inodes_only(struct ceph_mds_request *req, in = ceph_get_inode(req->r_dentry->d_sb, vino, NULL); if (IS_ERR(in)) { err = PTR_ERR(in); - dout("new_inode badness got %d\n", err); + doutc(cl, "badness got %d\n", err); continue; } rc = ceph_fill_inode(in, NULL, &rde->inode, NULL, session, -1, &req->r_caps_reservation); if (rc < 0) { - pr_err("ceph_fill_inode badness on %p got %d\n", - in, rc); + pr_err_client(cl, "inode badness on %p got %d\n", in, + rc); err = rc; if (in->i_state & I_NEW) { ihold(in); @@ -1825,6 +1850,7 @@ static int fill_readdir_cache(struct inode *dir, struct dentry *dn, struct ceph_readdir_cache_control *ctl, struct ceph_mds_request *req) { + struct ceph_client *cl = ceph_inode_to_client(dir); struct ceph_inode_info *ci = ceph_inode(dir); unsigned nsize = PAGE_SIZE / sizeof(struct dentry*); unsigned idx = ctl->index % nsize; @@ -1850,11 +1876,11 @@ static int fill_readdir_cache(struct inode *dir, struct dentry *dn,
if (req->r_dir_release_cnt == atomic64_read(&ci->i_release_count) && req->r_dir_ordered_cnt == atomic64_read(&ci->i_ordered_count)) { - dout("readdir cache dn %p idx %d\n", dn, ctl->index); + doutc(cl, "dn %p idx %d\n", dn, ctl->index); ctl->dentries[idx] = dn; ctl->index++; } else { - dout("disable readdir cache\n"); + doutc(cl, "disable readdir cache\n"); ctl->index = -1; } return 0; @@ -1867,6 +1893,7 @@ int ceph_readdir_prepopulate(struct ceph_mds_request *req, struct inode *inode = d_inode(parent); struct ceph_inode_info *ci = ceph_inode(inode); struct ceph_mds_reply_info_parsed *rinfo = &req->r_reply_info; + struct ceph_client *cl = session->s_mdsc->fsc->client; struct qstr dname; struct dentry *dn; struct inode *in; @@ -1894,19 +1921,18 @@ int ceph_readdir_prepopulate(struct ceph_mds_request *req,
if (rinfo->dir_dir && le32_to_cpu(rinfo->dir_dir->frag) != frag) { - dout("readdir_prepopulate got new frag %x -> %x\n", - frag, le32_to_cpu(rinfo->dir_dir->frag)); + doutc(cl, "got new frag %x -> %x\n", frag, + le32_to_cpu(rinfo->dir_dir->frag)); frag = le32_to_cpu(rinfo->dir_dir->frag); if (!rinfo->hash_order) req->r_readdir_offset = 2; }
if (le32_to_cpu(rinfo->head->op) == CEPH_MDS_OP_LSSNAP) { - dout("readdir_prepopulate %d items under SNAPDIR dn %p\n", - rinfo->dir_nr, parent); + doutc(cl, "%d items under SNAPDIR dn %p\n", + rinfo->dir_nr, parent); } else { - dout("readdir_prepopulate %d items under dn %p\n", - rinfo->dir_nr, parent); + doutc(cl, "%d items under dn %p\n", rinfo->dir_nr, parent); if (rinfo->dir_dir) ceph_fill_dirfrag(d_inode(parent), rinfo->dir_dir);
@@ -1950,15 +1976,15 @@ int ceph_readdir_prepopulate(struct ceph_mds_request *req,
retry_lookup: dn = d_lookup(parent, &dname); - dout("d_lookup on parent=%p name=%.*s got %p\n", - parent, dname.len, dname.name, dn); + doutc(cl, "d_lookup on parent=%p name=%.*s got %p\n", + parent, dname.len, dname.name, dn);
if (!dn) { dn = d_alloc(parent, &dname); - dout("d_alloc %p '%.*s' = %p\n", parent, - dname.len, dname.name, dn); + doutc(cl, "d_alloc %p '%.*s' = %p\n", parent, + dname.len, dname.name, dn); if (!dn) { - dout("d_alloc badness\n"); + doutc(cl, "d_alloc badness\n"); err = -ENOMEM; goto out; } @@ -1971,8 +1997,8 @@ int ceph_readdir_prepopulate(struct ceph_mds_request *req, (ceph_ino(d_inode(dn)) != tvino.ino || ceph_snap(d_inode(dn)) != tvino.snap)) { struct ceph_dentry_info *di = ceph_dentry(dn); - dout(" dn %p points to wrong inode %p\n", - dn, d_inode(dn)); + doutc(cl, " dn %p points to wrong inode %p\n", + dn, d_inode(dn));
spin_lock(&dn->d_lock); if (di->offset > 0 && @@ -1994,7 +2020,7 @@ int ceph_readdir_prepopulate(struct ceph_mds_request *req, } else { in = ceph_get_inode(parent->d_sb, tvino, NULL); if (IS_ERR(in)) { - dout("new_inode badness\n"); + doutc(cl, "new_inode badness\n"); d_drop(dn); dput(dn); err = PTR_ERR(in); @@ -2005,7 +2031,8 @@ int ceph_readdir_prepopulate(struct ceph_mds_request *req, ret = ceph_fill_inode(in, NULL, &rde->inode, NULL, session, -1, &req->r_caps_reservation); if (ret < 0) { - pr_err("ceph_fill_inode badness on %p\n", in); + pr_err_client(cl, "badness on %p %llx.%llx\n", in, + ceph_vinop(in)); if (d_really_is_negative(dn)) { if (in->i_state & I_NEW) { ihold(in); @@ -2022,8 +2049,8 @@ int ceph_readdir_prepopulate(struct ceph_mds_request *req,
if (d_really_is_negative(dn)) { if (ceph_security_xattr_deadlock(in)) { - dout(" skip splicing dn %p to inode %p" - " (security xattr deadlock)\n", dn, in); + doutc(cl, " skip splicing dn %p to inode %p" + " (security xattr deadlock)\n", dn, in); iput(in); skipped++; goto next_item; @@ -2055,17 +2082,18 @@ int ceph_readdir_prepopulate(struct ceph_mds_request *req, req->r_readdir_cache_idx = cache_ctl.index; } ceph_readdir_cache_release(&cache_ctl); - dout("readdir_prepopulate done\n"); + doutc(cl, "done\n"); return err; }
bool ceph_inode_set_size(struct inode *inode, loff_t size) { + struct ceph_client *cl = ceph_inode_to_client(inode); struct ceph_inode_info *ci = ceph_inode(inode); bool ret;
spin_lock(&ci->i_ceph_lock); - dout("set_size %p %llu -> %llu\n", inode, i_size_read(inode), size); + doutc(cl, "set_size %p %llu -> %llu\n", inode, i_size_read(inode), size); i_size_write(inode, size); ceph_fscache_update(inode); inode->i_blocks = calc_inode_blocks(size); @@ -2080,21 +2108,24 @@ bool ceph_inode_set_size(struct inode *inode, loff_t size) void ceph_queue_inode_work(struct inode *inode, int work_bit) { struct ceph_fs_client *fsc = ceph_inode_to_fs_client(inode); + struct ceph_client *cl = fsc->client; struct ceph_inode_info *ci = ceph_inode(inode); set_bit(work_bit, &ci->i_work_mask);
ihold(inode); if (queue_work(fsc->inode_wq, &ci->i_work)) { - dout("queue_inode_work %p, mask=%lx\n", inode, ci->i_work_mask); + doutc(cl, "%p %llx.%llx mask=%lx\n", inode, + ceph_vinop(inode), ci->i_work_mask); } else { - dout("queue_inode_work %p already queued, mask=%lx\n", - inode, ci->i_work_mask); + doutc(cl, "%p %llx.%llx already queued, mask=%lx\n", + inode, ceph_vinop(inode), ci->i_work_mask); iput(inode); } }
static void ceph_do_invalidate_pages(struct inode *inode) { + struct ceph_client *cl = ceph_inode_to_client(inode); struct ceph_inode_info *ci = ceph_inode(inode); u32 orig_gen; int check = 0; @@ -2104,8 +2135,9 @@ static void ceph_do_invalidate_pages(struct inode *inode) mutex_lock(&ci->i_truncate_mutex);
if (ceph_inode_is_shutdown(inode)) { - pr_warn_ratelimited("%s: inode %llx.%llx is shut down\n", - __func__, ceph_vinop(inode)); + pr_warn_ratelimited_client(cl, + "%p %llx.%llx is shut down\n", inode, + ceph_vinop(inode)); mapping_set_error(inode->i_mapping, -EIO); truncate_pagecache(inode, 0); mutex_unlock(&ci->i_truncate_mutex); @@ -2113,8 +2145,8 @@ static void ceph_do_invalidate_pages(struct inode *inode) }
spin_lock(&ci->i_ceph_lock); - dout("invalidate_pages %p gen %d revoking %d\n", inode, - ci->i_rdcache_gen, ci->i_rdcache_revoking); + doutc(cl, "%p %llx.%llx gen %d revoking %d\n", inode, + ceph_vinop(inode), ci->i_rdcache_gen, ci->i_rdcache_revoking); if (ci->i_rdcache_revoking != ci->i_rdcache_gen) { if (__ceph_caps_revoking_other(ci, NULL, CEPH_CAP_FILE_CACHE)) check = 1; @@ -2126,21 +2158,21 @@ static void ceph_do_invalidate_pages(struct inode *inode) spin_unlock(&ci->i_ceph_lock);
if (invalidate_inode_pages2(inode->i_mapping) < 0) { - pr_err("invalidate_inode_pages2 %llx.%llx failed\n", - ceph_vinop(inode)); + pr_err_client(cl, "invalidate_inode_pages2 %llx.%llx failed\n", + ceph_vinop(inode)); }
spin_lock(&ci->i_ceph_lock); if (orig_gen == ci->i_rdcache_gen && orig_gen == ci->i_rdcache_revoking) { - dout("invalidate_pages %p gen %d successful\n", inode, - ci->i_rdcache_gen); + doutc(cl, "%p %llx.%llx gen %d successful\n", inode, + ceph_vinop(inode), ci->i_rdcache_gen); ci->i_rdcache_revoking--; check = 1; } else { - dout("invalidate_pages %p gen %d raced, now %d revoking %d\n", - inode, orig_gen, ci->i_rdcache_gen, - ci->i_rdcache_revoking); + doutc(cl, "%p %llx.%llx gen %d raced, now %d revoking %d\n", + inode, ceph_vinop(inode), orig_gen, ci->i_rdcache_gen, + ci->i_rdcache_revoking); if (__ceph_caps_revoking_other(ci, NULL, CEPH_CAP_FILE_CACHE)) check = 1; } @@ -2157,6 +2189,7 @@ static void ceph_do_invalidate_pages(struct inode *inode) */ void __ceph_do_pending_vmtruncate(struct inode *inode) { + struct ceph_client *cl = ceph_inode_to_client(inode); struct ceph_inode_info *ci = ceph_inode(inode); u64 to; int wrbuffer_refs, finish = 0; @@ -2165,7 +2198,8 @@ void __ceph_do_pending_vmtruncate(struct inode *inode) retry: spin_lock(&ci->i_ceph_lock); if (ci->i_truncate_pending == 0) { - dout("%s %p none pending\n", __func__, inode); + doutc(cl, "%p %llx.%llx none pending\n", inode, + ceph_vinop(inode)); spin_unlock(&ci->i_ceph_lock); mutex_unlock(&ci->i_truncate_mutex); return; @@ -2177,7 +2211,8 @@ void __ceph_do_pending_vmtruncate(struct inode *inode) */ if (ci->i_wrbuffer_ref_head < ci->i_wrbuffer_ref) { spin_unlock(&ci->i_ceph_lock); - dout("%s %p flushing snaps first\n", __func__, inode); + doutc(cl, "%p %llx.%llx flushing snaps first\n", inode, + ceph_vinop(inode)); filemap_write_and_wait_range(&inode->i_data, 0, inode->i_sb->s_maxbytes); goto retry; @@ -2188,8 +2223,8 @@ void __ceph_do_pending_vmtruncate(struct inode *inode)
to = ci->i_truncate_pagecache_size; wrbuffer_refs = ci->i_wrbuffer_ref; - dout("%s %p (%d) to %lld\n", __func__, inode, - ci->i_truncate_pending, to); + doutc(cl, "%p %llx.%llx (%d) to %lld\n", inode, ceph_vinop(inode), + ci->i_truncate_pending, to); spin_unlock(&ci->i_ceph_lock);
ceph_fscache_resize(inode, to); @@ -2217,9 +2252,10 @@ static void ceph_inode_work(struct work_struct *work) struct ceph_inode_info *ci = container_of(work, struct ceph_inode_info, i_work); struct inode *inode = &ci->netfs.inode; + struct ceph_client *cl = ceph_inode_to_client(inode);
if (test_and_clear_bit(CEPH_I_WORK_WRITEBACK, &ci->i_work_mask)) { - dout("writeback %p\n", inode); + doutc(cl, "writeback %p %llx.%llx\n", inode, ceph_vinop(inode)); filemap_fdatawrite(&inode->i_data); } if (test_and_clear_bit(CEPH_I_WORK_INVALIDATE_PAGES, &ci->i_work_mask)) @@ -2291,6 +2327,7 @@ static int fill_fscrypt_truncate(struct inode *inode, struct ceph_mds_request *req, struct iattr *attr) { + struct ceph_client *cl = ceph_inode_to_client(inode); struct ceph_inode_info *ci = ceph_inode(inode); int boff = attr->ia_size % CEPH_FSCRYPT_BLOCK_SIZE; loff_t pos, orig_pos = round_down(attr->ia_size, @@ -2313,9 +2350,9 @@ static int fill_fscrypt_truncate(struct inode *inode,
issued = __ceph_caps_issued(ci, NULL);
- dout("%s size %lld -> %lld got cap refs on %s, issued %s\n", __func__, - i_size, attr->ia_size, ceph_cap_string(got), - ceph_cap_string(issued)); + doutc(cl, "size %lld -> %lld got cap refs on %s, issued %s\n", + i_size, attr->ia_size, ceph_cap_string(got), + ceph_cap_string(issued));
/* Try to writeback the dirty pagecaches */ if (issued & (CEPH_CAP_FILE_BUFFER)) { @@ -2370,8 +2407,7 @@ static int fill_fscrypt_truncate(struct inode *inode, * If the Rados object doesn't exist, it will be set to 0. */ if (!objver) { - dout("%s hit hole, ppos %lld < size %lld\n", __func__, - pos, i_size); + doutc(cl, "hit hole, ppos %lld < size %lld\n", pos, i_size);
header.data_len = cpu_to_le32(8 + 8 + 4); header.file_offset = 0; @@ -2380,8 +2416,8 @@ static int fill_fscrypt_truncate(struct inode *inode, header.data_len = cpu_to_le32(8 + 8 + 4 + CEPH_FSCRYPT_BLOCK_SIZE); header.file_offset = cpu_to_le64(orig_pos);
- dout("%s encrypt block boff/bsize %d/%lu\n", __func__, - boff, CEPH_FSCRYPT_BLOCK_SIZE); + doutc(cl, "encrypt block boff/bsize %d/%lu\n", boff, + CEPH_FSCRYPT_BLOCK_SIZE);
/* truncate and zero out the extra contents for the last block */ memset(iov.iov_base + boff, 0, PAGE_SIZE - boff); @@ -2409,8 +2445,8 @@ static int fill_fscrypt_truncate(struct inode *inode, } req->r_pagelist = pagelist; out: - dout("%s %p size dropping cap refs on %s\n", __func__, - inode, ceph_cap_string(got)); + doutc(cl, "%p %llx.%llx size dropping cap refs on %s\n", inode, + ceph_vinop(inode), ceph_cap_string(got)); ceph_put_cap_refs(ci, got); if (iov.iov_base) kunmap_local(iov.iov_base); @@ -2428,6 +2464,7 @@ int __ceph_setattr(struct inode *inode, struct iattr *attr, unsigned int ia_valid = attr->ia_valid; struct ceph_mds_request *req; struct ceph_mds_client *mdsc = ceph_sb_to_fs_client(inode->i_sb)->mdsc; + struct ceph_client *cl = ceph_inode_to_client(inode); struct ceph_cap_flush *prealloc_cf; loff_t isize = i_size_read(inode); int issued; @@ -2466,7 +2503,8 @@ int __ceph_setattr(struct inode *inode, struct iattr *attr, } }
- dout("setattr %p issued %s\n", inode, ceph_cap_string(issued)); + doutc(cl, "%p %llx.%llx issued %s\n", inode, ceph_vinop(inode), + ceph_cap_string(issued)); #if IS_ENABLED(CONFIG_FS_ENCRYPTION) if (cia && cia->fscrypt_auth) { u32 len = ceph_fscrypt_auth_len(cia->fscrypt_auth); @@ -2477,8 +2515,8 @@ int __ceph_setattr(struct inode *inode, struct iattr *attr, goto out; }
- dout("setattr %llx:%llx fscrypt_auth len %u to %u)\n", - ceph_vinop(inode), ci->fscrypt_auth_len, len); + doutc(cl, "%p %llx.%llx fscrypt_auth len %u to %u)\n", inode, + ceph_vinop(inode), ci->fscrypt_auth_len, len);
/* It should never be re-set once set */ WARN_ON_ONCE(ci->fscrypt_auth); @@ -2506,9 +2544,10 @@ int __ceph_setattr(struct inode *inode, struct iattr *attr, #endif /* CONFIG_FS_ENCRYPTION */
if (ia_valid & ATTR_UID) { - dout("setattr %p uid %d -> %d\n", inode, - from_kuid(&init_user_ns, inode->i_uid), - from_kuid(&init_user_ns, attr->ia_uid)); + doutc(cl, "%p %llx.%llx uid %d -> %d\n", inode, + ceph_vinop(inode), + from_kuid(&init_user_ns, inode->i_uid), + from_kuid(&init_user_ns, attr->ia_uid)); if (issued & CEPH_CAP_AUTH_EXCL) { inode->i_uid = attr->ia_uid; dirtied |= CEPH_CAP_AUTH_EXCL; @@ -2521,9 +2560,10 @@ int __ceph_setattr(struct inode *inode, struct iattr *attr, } } if (ia_valid & ATTR_GID) { - dout("setattr %p gid %d -> %d\n", inode, - from_kgid(&init_user_ns, inode->i_gid), - from_kgid(&init_user_ns, attr->ia_gid)); + doutc(cl, "%p %llx.%llx gid %d -> %d\n", inode, + ceph_vinop(inode), + from_kgid(&init_user_ns, inode->i_gid), + from_kgid(&init_user_ns, attr->ia_gid)); if (issued & CEPH_CAP_AUTH_EXCL) { inode->i_gid = attr->ia_gid; dirtied |= CEPH_CAP_AUTH_EXCL; @@ -2536,8 +2576,8 @@ int __ceph_setattr(struct inode *inode, struct iattr *attr, } } if (ia_valid & ATTR_MODE) { - dout("setattr %p mode 0%o -> 0%o\n", inode, inode->i_mode, - attr->ia_mode); + doutc(cl, "%p %llx.%llx mode 0%o -> 0%o\n", inode, + ceph_vinop(inode), inode->i_mode, attr->ia_mode); if (issued & CEPH_CAP_AUTH_EXCL) { inode->i_mode = attr->ia_mode; dirtied |= CEPH_CAP_AUTH_EXCL; @@ -2551,9 +2591,10 @@ int __ceph_setattr(struct inode *inode, struct iattr *attr, }
if (ia_valid & ATTR_ATIME) { - dout("setattr %p atime %lld.%ld -> %lld.%ld\n", inode, - inode->i_atime.tv_sec, inode->i_atime.tv_nsec, - attr->ia_atime.tv_sec, attr->ia_atime.tv_nsec); + doutc(cl, "%p %llx.%llx atime %lld.%ld -> %lld.%ld\n", + inode, ceph_vinop(inode), inode->i_atime.tv_sec, + inode->i_atime.tv_nsec, attr->ia_atime.tv_sec, + attr->ia_atime.tv_nsec); if (issued & CEPH_CAP_FILE_EXCL) { ci->i_time_warp_seq++; inode->i_atime = attr->ia_atime; @@ -2573,7 +2614,8 @@ int __ceph_setattr(struct inode *inode, struct iattr *attr, } } if (ia_valid & ATTR_SIZE) { - dout("setattr %p size %lld -> %lld\n", inode, isize, attr->ia_size); + doutc(cl, "%p %llx.%llx size %lld -> %lld\n", inode, + ceph_vinop(inode), isize, attr->ia_size); /* * Only when the new size is smaller and not aligned to * CEPH_FSCRYPT_BLOCK_SIZE will the RMW is needed. @@ -2624,9 +2666,10 @@ int __ceph_setattr(struct inode *inode, struct iattr *attr, } } if (ia_valid & ATTR_MTIME) { - dout("setattr %p mtime %lld.%ld -> %lld.%ld\n", inode, - inode->i_mtime.tv_sec, inode->i_mtime.tv_nsec, - attr->ia_mtime.tv_sec, attr->ia_mtime.tv_nsec); + doutc(cl, "%p %llx.%llx mtime %lld.%ld -> %lld.%ld\n", + inode, ceph_vinop(inode), inode->i_mtime.tv_sec, + inode->i_mtime.tv_nsec, attr->ia_mtime.tv_sec, + attr->ia_mtime.tv_nsec); if (issued & CEPH_CAP_FILE_EXCL) { ci->i_time_warp_seq++; inode->i_mtime = attr->ia_mtime; @@ -2650,11 +2693,12 @@ int __ceph_setattr(struct inode *inode, struct iattr *attr, if (ia_valid & ATTR_CTIME) { bool only = (ia_valid & (ATTR_SIZE|ATTR_MTIME|ATTR_ATIME| ATTR_MODE|ATTR_UID|ATTR_GID)) == 0; - dout("setattr %p ctime %lld.%ld -> %lld.%ld (%s)\n", inode, - inode_get_ctime(inode).tv_sec, - inode_get_ctime(inode).tv_nsec, - attr->ia_ctime.tv_sec, attr->ia_ctime.tv_nsec, - only ? "ctime only" : "ignored"); + doutc(cl, "%p %llx.%llx ctime %lld.%ld -> %lld.%ld (%s)\n", + inode, ceph_vinop(inode), inode_get_ctime(inode).tv_sec, + inode_get_ctime(inode).tv_nsec, + attr->ia_ctime.tv_sec, attr->ia_ctime.tv_nsec, + only ? "ctime only" : "ignored"); + if (only) { /* * if kernel wants to dirty ctime but nothing else, @@ -2672,7 +2716,8 @@ int __ceph_setattr(struct inode *inode, struct iattr *attr, } } if (ia_valid & ATTR_FILE) - dout("setattr %p ATTR_FILE ... hrm!\n", inode); + doutc(cl, "%p %llx.%llx ATTR_FILE ... hrm!\n", inode, + ceph_vinop(inode));
if (dirtied) { inode_dirty_flags = __ceph_mark_dirty_caps(ci, dirtied, @@ -2713,16 +2758,17 @@ int __ceph_setattr(struct inode *inode, struct iattr *attr, */ err = ceph_mdsc_do_request(mdsc, NULL, req); if (err == -EAGAIN && truncate_retry--) { - dout("setattr %p result=%d (%s locally, %d remote), retry it!\n", - inode, err, ceph_cap_string(dirtied), mask); + doutc(cl, "%p %llx.%llx result=%d (%s locally, %d remote), retry it!\n", + inode, ceph_vinop(inode), err, + ceph_cap_string(dirtied), mask); ceph_mdsc_put_request(req); ceph_free_cap_flush(prealloc_cf); goto retry; } } out: - dout("setattr %p result=%d (%s locally, %d remote)\n", inode, err, - ceph_cap_string(dirtied), mask); + doutc(cl, "%p %llx.%llx result=%d (%s locally, %d remote)\n", inode, + ceph_vinop(inode), err, ceph_cap_string(dirtied), mask);
ceph_mdsc_put_request(req); ceph_free_cap_flush(prealloc_cf); @@ -2811,18 +2857,20 @@ int __ceph_do_getattr(struct inode *inode, struct page *locked_page, int mask, bool force) { struct ceph_fs_client *fsc = ceph_sb_to_fs_client(inode->i_sb); + struct ceph_client *cl = fsc->client; struct ceph_mds_client *mdsc = fsc->mdsc; struct ceph_mds_request *req; int mode; int err;
if (ceph_snap(inode) == CEPH_SNAPDIR) { - dout("do_getattr inode %p SNAPDIR\n", inode); + doutc(cl, "inode %p %llx.%llx SNAPDIR\n", inode, + ceph_vinop(inode)); return 0; }
- dout("do_getattr inode %p mask %s mode 0%o\n", - inode, ceph_cap_string(mask), inode->i_mode); + doutc(cl, "inode %p %llx.%llx mask %s mode 0%o\n", inode, + ceph_vinop(inode), ceph_cap_string(mask), inode->i_mode); if (!force && ceph_caps_issued_mask_metric(ceph_inode(inode), mask, 1)) return 0;
@@ -2849,7 +2897,7 @@ int __ceph_do_getattr(struct inode *inode, struct page *locked_page, } } ceph_mdsc_put_request(req); - dout("do_getattr result=%d\n", err); + doutc(cl, "result=%d\n", err); return err; }
@@ -2857,6 +2905,7 @@ int ceph_do_getvxattr(struct inode *inode, const char *name, void *value, size_t size) { struct ceph_fs_client *fsc = ceph_sb_to_fs_client(inode->i_sb); + struct ceph_client *cl = fsc->client; struct ceph_mds_client *mdsc = fsc->mdsc; struct ceph_mds_request *req; int mode = USE_AUTH_MDS; @@ -2886,7 +2935,7 @@ int ceph_do_getvxattr(struct inode *inode, const char *name, void *value, xattr_value = req->r_reply_info.xattr_info.xattr_value; xattr_value_len = req->r_reply_info.xattr_info.xattr_value_len;
- dout("do_getvxattr xattr_value_len:%zu, size:%zu\n", xattr_value_len, size); + doutc(cl, "xattr_value_len:%zu, size:%zu\n", xattr_value_len, size);
err = (int)xattr_value_len; if (size == 0) @@ -2901,7 +2950,7 @@ int ceph_do_getvxattr(struct inode *inode, const char *name, void *value, put: ceph_mdsc_put_request(req); out: - dout("do_getvxattr result=%d\n", err); + doutc(cl, "result=%d\n", err); return err; }
diff --git a/fs/ceph/ioctl.c b/fs/ceph/ioctl.c index 3f617146e4ad..e861de3c79b9 100644 --- a/fs/ceph/ioctl.c +++ b/fs/ceph/ioctl.c @@ -245,6 +245,7 @@ static long ceph_ioctl_lazyio(struct file *file) struct inode *inode = file_inode(file); struct ceph_inode_info *ci = ceph_inode(inode); struct ceph_mds_client *mdsc = ceph_inode_to_fs_client(inode)->mdsc; + struct ceph_client *cl = mdsc->fsc->client;
if ((fi->fmode & CEPH_FILE_MODE_LAZY) == 0) { spin_lock(&ci->i_ceph_lock); @@ -252,11 +253,13 @@ static long ceph_ioctl_lazyio(struct file *file) ci->i_nr_by_mode[ffs(CEPH_FILE_MODE_LAZY)]++; __ceph_touch_fmode(ci, mdsc, fi->fmode); spin_unlock(&ci->i_ceph_lock); - dout("ioctl_layzio: file %p marked lazy\n", file); + doutc(cl, "file %p %p %llx.%llx marked lazy\n", file, inode, + ceph_vinop(inode));
ceph_check_caps(ci, 0); } else { - dout("ioctl_layzio: file %p already lazy\n", file); + doutc(cl, "file %p %p %llx.%llx already lazy\n", file, inode, + ceph_vinop(inode)); } return 0; } @@ -355,10 +358,12 @@ static const char *ceph_ioctl_cmd_name(const unsigned int cmd)
long ceph_ioctl(struct file *file, unsigned int cmd, unsigned long arg) { + struct inode *inode = file_inode(file); + struct ceph_fs_client *fsc = ceph_inode_to_fs_client(inode); int ret;
- dout("ioctl file %p cmd %s arg %lu\n", file, - ceph_ioctl_cmd_name(cmd), arg); + doutc(fsc->client, "file %p %p %llx.%llx cmd %s arg %lu\n", file, + inode, ceph_vinop(inode), ceph_ioctl_cmd_name(cmd), arg); switch (cmd) { case CEPH_IOC_GET_LAYOUT: return ceph_ioctl_get_layout(file, (void __user *)arg); diff --git a/fs/ceph/locks.c b/fs/ceph/locks.c index cb51c7e9c8e2..e07ad29ff8b9 100644 --- a/fs/ceph/locks.c +++ b/fs/ceph/locks.c @@ -77,6 +77,7 @@ static int ceph_lock_message(u8 lock_type, u16 operation, struct inode *inode, int cmd, u8 wait, struct file_lock *fl) { struct ceph_mds_client *mdsc = ceph_sb_to_mdsc(inode->i_sb); + struct ceph_client *cl = mdsc->fsc->client; struct ceph_mds_request *req; int err; u64 length = 0; @@ -111,10 +112,10 @@ static int ceph_lock_message(u8 lock_type, u16 operation, struct inode *inode,
owner = secure_addr(fl->fl_owner);
- dout("ceph_lock_message: rule: %d, op: %d, owner: %llx, pid: %llu, " - "start: %llu, length: %llu, wait: %d, type: %d\n", (int)lock_type, - (int)operation, owner, (u64)fl->fl_pid, fl->fl_start, length, - wait, fl->fl_type); + doutc(cl, "rule: %d, op: %d, owner: %llx, pid: %llu, " + "start: %llu, length: %llu, wait: %d, type: %d\n", + (int)lock_type, (int)operation, owner, (u64)fl->fl_pid, + fl->fl_start, length, wait, fl->fl_type);
req->r_args.filelock_change.rule = lock_type; req->r_args.filelock_change.type = cmd; @@ -147,16 +148,17 @@ static int ceph_lock_message(u8 lock_type, u16 operation, struct inode *inode,
} ceph_mdsc_put_request(req); - dout("ceph_lock_message: rule: %d, op: %d, pid: %llu, start: %llu, " - "length: %llu, wait: %d, type: %d, err code %d\n", (int)lock_type, - (int)operation, (u64)fl->fl_pid, fl->fl_start, - length, wait, fl->fl_type, err); + doutc(cl, "rule: %d, op: %d, pid: %llu, start: %llu, " + "length: %llu, wait: %d, type: %d, err code %d\n", + (int)lock_type, (int)operation, (u64)fl->fl_pid, + fl->fl_start, length, wait, fl->fl_type, err); return err; }
static int ceph_lock_wait_for_completion(struct ceph_mds_client *mdsc, struct ceph_mds_request *req) { + struct ceph_client *cl = mdsc->fsc->client; struct ceph_mds_request *intr_req; struct inode *inode = req->r_inode; int err, lock_type; @@ -174,8 +176,7 @@ static int ceph_lock_wait_for_completion(struct ceph_mds_client *mdsc, if (!err) return 0;
- dout("ceph_lock_wait_for_completion: request %llu was interrupted\n", - req->r_tid); + doutc(cl, "request %llu was interrupted\n", req->r_tid);
mutex_lock(&mdsc->mutex); if (test_bit(CEPH_MDS_R_GOT_RESULT, &req->r_req_flags)) { @@ -246,6 +247,7 @@ int ceph_lock(struct file *file, int cmd, struct file_lock *fl) { struct inode *inode = file_inode(file); struct ceph_inode_info *ci = ceph_inode(inode); + struct ceph_client *cl = ceph_inode_to_client(inode); int err = 0; u16 op = CEPH_MDS_OP_SETFILELOCK; u8 wait = 0; @@ -257,7 +259,7 @@ int ceph_lock(struct file *file, int cmd, struct file_lock *fl) if (ceph_inode_is_shutdown(inode)) return -ESTALE;
- dout("ceph_lock, fl_owner: %p\n", fl->fl_owner); + doutc(cl, "fl_owner: %p\n", fl->fl_owner);
/* set wait bit as appropriate, then make command as Ceph expects it*/ if (IS_GETLK(cmd)) @@ -292,7 +294,7 @@ int ceph_lock(struct file *file, int cmd, struct file_lock *fl) err = ceph_lock_message(CEPH_LOCK_FCNTL, op, inode, lock_cmd, wait, fl); if (!err) { if (op == CEPH_MDS_OP_SETFILELOCK && F_UNLCK != fl->fl_type) { - dout("mds locked, locking locally\n"); + doutc(cl, "locking locally\n"); err = posix_lock_file(file, fl, NULL); if (err) { /* undo! This should only happen if @@ -300,8 +302,8 @@ int ceph_lock(struct file *file, int cmd, struct file_lock *fl) * deadlock. */ ceph_lock_message(CEPH_LOCK_FCNTL, op, inode, CEPH_LOCK_UNLOCK, 0, fl); - dout("got %d on posix_lock_file, undid lock\n", - err); + doutc(cl, "got %d on posix_lock_file, undid lock\n", + err); } } } @@ -312,6 +314,7 @@ int ceph_flock(struct file *file, int cmd, struct file_lock *fl) { struct inode *inode = file_inode(file); struct ceph_inode_info *ci = ceph_inode(inode); + struct ceph_client *cl = ceph_inode_to_client(inode); int err = 0; u8 wait = 0; u8 lock_cmd; @@ -322,7 +325,7 @@ int ceph_flock(struct file *file, int cmd, struct file_lock *fl) if (ceph_inode_is_shutdown(inode)) return -ESTALE;
- dout("ceph_flock, fl_file: %p\n", fl->fl_file); + doutc(cl, "fl_file: %p\n", fl->fl_file);
spin_lock(&ci->i_ceph_lock); if (ci->i_ceph_flags & CEPH_I_ERROR_FILELOCK) { @@ -359,7 +362,8 @@ int ceph_flock(struct file *file, int cmd, struct file_lock *fl) ceph_lock_message(CEPH_LOCK_FLOCK, CEPH_MDS_OP_SETFILELOCK, inode, CEPH_LOCK_UNLOCK, 0, fl); - dout("got %d on locks_lock_file_wait, undid lock\n", err); + doutc(cl, "got %d on locks_lock_file_wait, undid lock\n", + err); } } return err; @@ -371,6 +375,7 @@ int ceph_flock(struct file *file, int cmd, struct file_lock *fl) */ void ceph_count_locks(struct inode *inode, int *fcntl_count, int *flock_count) { + struct ceph_client *cl = ceph_inode_to_client(inode); struct file_lock *lock; struct file_lock_context *ctx;
@@ -386,17 +391,20 @@ void ceph_count_locks(struct inode *inode, int *fcntl_count, int *flock_count) ++(*flock_count); spin_unlock(&ctx->flc_lock); } - dout("counted %d flock locks and %d fcntl locks\n", - *flock_count, *fcntl_count); + doutc(cl, "counted %d flock locks and %d fcntl locks\n", + *flock_count, *fcntl_count); }
/* * Given a pointer to a lock, convert it to a ceph filelock */ -static int lock_to_ceph_filelock(struct file_lock *lock, +static int lock_to_ceph_filelock(struct inode *inode, + struct file_lock *lock, struct ceph_filelock *cephlock) { + struct ceph_client *cl = ceph_inode_to_client(inode); int err = 0; + cephlock->start = cpu_to_le64(lock->fl_start); cephlock->length = cpu_to_le64(lock->fl_end - lock->fl_start + 1); cephlock->client = cpu_to_le64(0); @@ -414,7 +422,7 @@ static int lock_to_ceph_filelock(struct file_lock *lock, cephlock->type = CEPH_LOCK_UNLOCK; break; default: - dout("Have unknown lock type %d\n", lock->fl_type); + doutc(cl, "Have unknown lock type %d\n", lock->fl_type); err = -EINVAL; }
@@ -432,13 +440,14 @@ int ceph_encode_locks_to_buffer(struct inode *inode, { struct file_lock *lock; struct file_lock_context *ctx = locks_inode_context(inode); + struct ceph_client *cl = ceph_inode_to_client(inode); int err = 0; int seen_fcntl = 0; int seen_flock = 0; int l = 0;
- dout("encoding %d flock and %d fcntl locks\n", num_flock_locks, - num_fcntl_locks); + doutc(cl, "encoding %d flock and %d fcntl locks\n", num_flock_locks, + num_fcntl_locks);
if (!ctx) return 0; @@ -450,7 +459,7 @@ int ceph_encode_locks_to_buffer(struct inode *inode, err = -ENOSPC; goto fail; } - err = lock_to_ceph_filelock(lock, &flocks[l]); + err = lock_to_ceph_filelock(inode, lock, &flocks[l]); if (err) goto fail; ++l; @@ -461,7 +470,7 @@ int ceph_encode_locks_to_buffer(struct inode *inode, err = -ENOSPC; goto fail; } - err = lock_to_ceph_filelock(lock, &flocks[l]); + err = lock_to_ceph_filelock(inode, lock, &flocks[l]); if (err) goto fail; ++l; diff --git a/fs/ceph/mds_client.c b/fs/ceph/mds_client.c index 11289ce8a8cc..56609b80880c 100644 --- a/fs/ceph/mds_client.c +++ b/fs/ceph/mds_client.c @@ -411,6 +411,7 @@ static int parse_reply_info_readdir(void **p, void *end, u64 features) { struct ceph_mds_reply_info_parsed *info = &req->r_reply_info; + struct ceph_client *cl = req->r_mdsc->fsc->client; u32 num, i = 0; int err;
@@ -433,7 +434,7 @@ static int parse_reply_info_readdir(void **p, void *end, BUG_ON(!info->dir_entries); if ((unsigned long)(info->dir_entries + num) > (unsigned long)info->dir_entries + info->dir_buf_size) { - pr_err("dir contents are larger than expected\n"); + pr_err_client(cl, "dir contents are larger than expected\n"); WARN_ON(1); goto bad; } @@ -454,7 +455,7 @@ static int parse_reply_info_readdir(void **p, void *end, ceph_decode_need(p, end, _name_len, bad); _name = *p; *p += _name_len; - dout("parsed dir dname '%.*s'\n", _name_len, _name); + doutc(cl, "parsed dir dname '%.*s'\n", _name_len, _name);
if (info->hash_order) rde->raw_hash = ceph_str_hash(ci->i_dir_layout.dl_dir_hash, @@ -514,8 +515,8 @@ static int parse_reply_info_readdir(void **p, void *end, rde->is_nokey = false; err = ceph_fname_to_usr(&fname, &tname, &oname, &rde->is_nokey); if (err) { - pr_err("%s unable to decode %.*s, got %d\n", __func__, - _name_len, _name, err); + pr_err_client(cl, "unable to decode %.*s, got %d\n", + _name_len, _name, err); goto out_bad; } rde->name = oname.name; @@ -539,7 +540,7 @@ static int parse_reply_info_readdir(void **p, void *end, bad: err = -EIO; out_bad: - pr_err("problem parsing dir contents %d\n", err); + pr_err_client(cl, "problem parsing dir contents %d\n", err); return err; }
@@ -570,10 +571,11 @@ static int parse_reply_info_filelock(void **p, void *end, static int ceph_parse_deleg_inos(void **p, void *end, struct ceph_mds_session *s) { + struct ceph_client *cl = s->s_mdsc->fsc->client; u32 sets;
ceph_decode_32_safe(p, end, sets, bad); - dout("got %u sets of delegated inodes\n", sets); + doutc(cl, "got %u sets of delegated inodes\n", sets); while (sets--) { u64 start, len;
@@ -582,8 +584,9 @@ static int ceph_parse_deleg_inos(void **p, void *end,
/* Don't accept a delegation of system inodes */ if (start < CEPH_INO_SYSTEM_BASE) { - pr_warn_ratelimited("ceph: ignoring reserved inode range delegation (start=0x%llx len=0x%llx)\n", - start, len); + pr_warn_ratelimited_client(cl, + "ignoring reserved inode range delegation (start=0x%llx len=0x%llx)\n", + start, len); continue; } while (len--) { @@ -591,10 +594,10 @@ static int ceph_parse_deleg_inos(void **p, void *end, DELEGATED_INO_AVAILABLE, GFP_KERNEL); if (!err) { - dout("added delegated inode 0x%llx\n", - start - 1); + doutc(cl, "added delegated inode 0x%llx\n", start - 1); } else if (err == -EBUSY) { - pr_warn("MDS delegated inode 0x%llx more than once.\n", + pr_warn_client(cl, + "MDS delegated inode 0x%llx more than once.\n", start - 1); } else { return err; @@ -744,6 +747,7 @@ static int parse_reply_info(struct ceph_mds_session *s, struct ceph_msg *msg, struct ceph_mds_request *req, u64 features) { struct ceph_mds_reply_info_parsed *info = &req->r_reply_info; + struct ceph_client *cl = s->s_mdsc->fsc->client; void *p, *end; u32 len; int err; @@ -783,7 +787,7 @@ static int parse_reply_info(struct ceph_mds_session *s, struct ceph_msg *msg, bad: err = -EIO; out_bad: - pr_err("mds parse_reply err %d\n", err); + pr_err_client(cl, "mds parse_reply err %d\n", err); ceph_msg_dump(msg); return err; } @@ -831,6 +835,7 @@ static void destroy_reply_info(struct ceph_mds_reply_info_parsed *info) int ceph_wait_on_conflict_unlink(struct dentry *dentry) { struct ceph_fs_client *fsc = ceph_sb_to_fs_client(dentry->d_sb); + struct ceph_client *cl = fsc->client; struct dentry *pdentry = dentry->d_parent; struct dentry *udentry, *found = NULL; struct ceph_dentry_info *di; @@ -855,8 +860,8 @@ int ceph_wait_on_conflict_unlink(struct dentry *dentry) goto next;
if (!test_bit(CEPH_DENTRY_ASYNC_UNLINK_BIT, &di->flags)) - pr_warn("%s dentry %p:%pd async unlink bit is not set\n", - __func__, dentry, dentry); + pr_warn_client(cl, "dentry %p:%pd async unlink bit is not set\n", + dentry, dentry);
if (!d_same_name(udentry, pdentry, &dname)) goto next; @@ -872,8 +877,8 @@ int ceph_wait_on_conflict_unlink(struct dentry *dentry) if (likely(!found)) return 0;
- dout("%s dentry %p:%pd conflict with old %p:%pd\n", __func__, - dentry, dentry, found, found); + doutc(cl, "dentry %p:%pd conflict with old %p:%pd\n", dentry, dentry, + found, found);
err = wait_on_bit(&di->flags, CEPH_DENTRY_ASYNC_UNLINK_BIT, TASK_KILLABLE); @@ -957,6 +962,7 @@ static int __verify_registered_session(struct ceph_mds_client *mdsc, static struct ceph_mds_session *register_session(struct ceph_mds_client *mdsc, int mds) { + struct ceph_client *cl = mdsc->fsc->client; struct ceph_mds_session *s;
if (READ_ONCE(mdsc->fsc->mount_state) == CEPH_MOUNT_FENCE_IO) @@ -973,7 +979,7 @@ static struct ceph_mds_session *register_session(struct ceph_mds_client *mdsc, int newmax = 1 << get_count_order(mds + 1); struct ceph_mds_session **sa;
- dout("%s: realloc to %d\n", __func__, newmax); + doutc(cl, "realloc to %d\n", newmax); sa = kcalloc(newmax, sizeof(void *), GFP_NOFS); if (!sa) goto fail_realloc; @@ -986,7 +992,7 @@ static struct ceph_mds_session *register_session(struct ceph_mds_client *mdsc, mdsc->max_sessions = newmax; }
- dout("%s: mds%d\n", __func__, mds); + doutc(cl, "mds%d\n", mds); s->s_mdsc = mdsc; s->s_mds = mds; s->s_state = CEPH_MDS_SESSION_NEW; @@ -1029,7 +1035,7 @@ static struct ceph_mds_session *register_session(struct ceph_mds_client *mdsc, static void __unregister_session(struct ceph_mds_client *mdsc, struct ceph_mds_session *s) { - dout("__unregister_session mds%d %p\n", s->s_mds, s); + doutc(mdsc->fsc->client, "mds%d %p\n", s->s_mds, s); BUG_ON(mdsc->sessions[s->s_mds] != s); mdsc->sessions[s->s_mds] = NULL; ceph_con_close(&s->s_con); @@ -1155,6 +1161,7 @@ static void __register_request(struct ceph_mds_client *mdsc, struct ceph_mds_request *req, struct inode *dir) { + struct ceph_client *cl = mdsc->fsc->client; int ret = 0;
req->r_tid = ++mdsc->last_tid; @@ -1162,14 +1169,14 @@ static void __register_request(struct ceph_mds_client *mdsc, ret = ceph_reserve_caps(mdsc, &req->r_caps_reservation, req->r_num_caps); if (ret < 0) { - pr_err("__register_request %p " - "failed to reserve caps: %d\n", req, ret); + pr_err_client(cl, "%p failed to reserve caps: %d\n", + req, ret); /* set req->r_err to fail early from __do_request */ req->r_err = ret; return; } } - dout("__register_request %p tid %lld\n", req, req->r_tid); + doutc(cl, "%p tid %lld\n", req, req->r_tid); ceph_mdsc_get_request(req); insert_request(&mdsc->request_tree, req);
@@ -1192,7 +1199,7 @@ static void __register_request(struct ceph_mds_client *mdsc, static void __unregister_request(struct ceph_mds_client *mdsc, struct ceph_mds_request *req) { - dout("__unregister_request %p tid %lld\n", req, req->r_tid); + doutc(mdsc->fsc->client, "%p tid %lld\n", req, req->r_tid);
/* Never leave an unregistered request on an unsafe list! */ list_del_init(&req->r_unsafe_item); @@ -1278,6 +1285,7 @@ static int __choose_mds(struct ceph_mds_client *mdsc, int mds = -1; u32 hash = req->r_direct_hash; bool is_hash = test_bit(CEPH_MDS_R_DIRECT_IS_HASH, &req->r_req_flags); + struct ceph_client *cl = mdsc->fsc->client;
if (random) *random = false; @@ -1289,8 +1297,7 @@ static int __choose_mds(struct ceph_mds_client *mdsc, if (req->r_resend_mds >= 0 && (__have_session(mdsc, req->r_resend_mds) || ceph_mdsmap_get_state(mdsc->mdsmap, req->r_resend_mds) > 0)) { - dout("%s using resend_mds mds%d\n", __func__, - req->r_resend_mds); + doutc(cl, "using resend_mds mds%d\n", req->r_resend_mds); return req->r_resend_mds; }
@@ -1307,7 +1314,8 @@ static int __choose_mds(struct ceph_mds_client *mdsc, rcu_read_lock(); inode = get_nonsnap_parent(req->r_dentry); rcu_read_unlock(); - dout("%s using snapdir's parent %p\n", __func__, inode); + doutc(cl, "using snapdir's parent %p %llx.%llx\n", + inode, ceph_vinop(inode)); } } else if (req->r_dentry) { /* ignore race with rename; old or new d_parent is okay */ @@ -1327,7 +1335,8 @@ static int __choose_mds(struct ceph_mds_client *mdsc, /* direct snapped/virtual snapdir requests * based on parent dir inode */ inode = get_nonsnap_parent(parent); - dout("%s using nonsnap parent %p\n", __func__, inode); + doutc(cl, "using nonsnap parent %p %llx.%llx\n", + inode, ceph_vinop(inode)); } else { /* dentry target */ inode = d_inode(req->r_dentry); @@ -1343,10 +1352,11 @@ static int __choose_mds(struct ceph_mds_client *mdsc, rcu_read_unlock(); }
- dout("%s %p is_hash=%d (0x%x) mode %d\n", __func__, inode, (int)is_hash, - hash, mode); if (!inode) goto random; + + doutc(cl, "%p %llx.%llx is_hash=%d (0x%x) mode %d\n", inode, + ceph_vinop(inode), (int)is_hash, hash, mode); ci = ceph_inode(inode);
if (is_hash && S_ISDIR(inode->i_mode)) { @@ -1362,9 +1372,9 @@ static int __choose_mds(struct ceph_mds_client *mdsc, get_random_bytes(&r, 1); r %= frag.ndist; mds = frag.dist[r]; - dout("%s %p %llx.%llx frag %u mds%d (%d/%d)\n", - __func__, inode, ceph_vinop(inode), - frag.frag, mds, (int)r, frag.ndist); + doutc(cl, "%p %llx.%llx frag %u mds%d (%d/%d)\n", + inode, ceph_vinop(inode), frag.frag, + mds, (int)r, frag.ndist); if (ceph_mdsmap_get_state(mdsc->mdsmap, mds) >= CEPH_MDS_STATE_ACTIVE && !ceph_mdsmap_is_laggy(mdsc->mdsmap, mds)) @@ -1377,9 +1387,8 @@ static int __choose_mds(struct ceph_mds_client *mdsc, if (frag.mds >= 0) { /* choose auth mds */ mds = frag.mds; - dout("%s %p %llx.%llx frag %u mds%d (auth)\n", - __func__, inode, ceph_vinop(inode), - frag.frag, mds); + doutc(cl, "%p %llx.%llx frag %u mds%d (auth)\n", + inode, ceph_vinop(inode), frag.frag, mds); if (ceph_mdsmap_get_state(mdsc->mdsmap, mds) >= CEPH_MDS_STATE_ACTIVE) { if (!ceph_mdsmap_is_laggy(mdsc->mdsmap, @@ -1403,9 +1412,9 @@ static int __choose_mds(struct ceph_mds_client *mdsc, goto random; } mds = cap->session->s_mds; - dout("%s %p %llx.%llx mds%d (%scap %p)\n", __func__, - inode, ceph_vinop(inode), mds, - cap == ci->i_auth_cap ? "auth " : "", cap); + doutc(cl, "%p %llx.%llx mds%d (%scap %p)\n", inode, + ceph_vinop(inode), mds, + cap == ci->i_auth_cap ? "auth " : "", cap); spin_unlock(&ci->i_ceph_lock); out: iput(inode); @@ -1416,7 +1425,7 @@ static int __choose_mds(struct ceph_mds_client *mdsc, *random = true;
mds = ceph_mdsmap_get_random_mds(mdsc->mdsmap); - dout("%s chose random mds%d\n", __func__, mds); + doutc(cl, "chose random mds%d\n", mds); return mds; }
@@ -1529,6 +1538,7 @@ static struct ceph_msg *create_session_open_msg(struct ceph_mds_client *mdsc, u6 int metadata_key_count = 0; struct ceph_options *opt = mdsc->fsc->client->options; struct ceph_mount_options *fsopt = mdsc->fsc->mount_options; + struct ceph_client *cl = mdsc->fsc->client; size_t size, count; void *p, *end; int ret; @@ -1567,7 +1577,7 @@ static struct ceph_msg *create_session_open_msg(struct ceph_mds_client *mdsc, u6 msg = ceph_msg_new(CEPH_MSG_CLIENT_SESSION, sizeof(*h) + extra_bytes, GFP_NOFS, false); if (!msg) { - pr_err("ENOMEM creating session open msg\n"); + pr_err_client(cl, "ENOMEM creating session open msg\n"); return ERR_PTR(-ENOMEM); } p = msg->front.iov_base; @@ -1607,14 +1617,14 @@ static struct ceph_msg *create_session_open_msg(struct ceph_mds_client *mdsc, u6
ret = encode_supported_features(&p, end); if (ret) { - pr_err("encode_supported_features failed!\n"); + pr_err_client(cl, "encode_supported_features failed!\n"); ceph_msg_put(msg); return ERR_PTR(ret); }
ret = encode_metric_spec(&p, end); if (ret) { - pr_err("encode_metric_spec failed!\n"); + pr_err_client(cl, "encode_metric_spec failed!\n"); ceph_msg_put(msg); return ERR_PTR(ret); } @@ -1642,8 +1652,8 @@ static int __open_session(struct ceph_mds_client *mdsc,
/* wait for mds to go active? */ mstate = ceph_mdsmap_get_state(mdsc->mdsmap, mds); - dout("open_session to mds%d (%s)\n", mds, - ceph_mds_state_name(mstate)); + doutc(mdsc->fsc->client, "open_session to mds%d (%s)\n", mds, + ceph_mds_state_name(mstate)); session->s_state = CEPH_MDS_SESSION_OPENING; session->s_renew_requested = jiffies;
@@ -1686,8 +1696,9 @@ struct ceph_mds_session * ceph_mdsc_open_export_target_session(struct ceph_mds_client *mdsc, int target) { struct ceph_mds_session *session; + struct ceph_client *cl = mdsc->fsc->client;
- dout("open_export_target_session to mds%d\n", target); + doutc(cl, "to mds%d\n", target);
mutex_lock(&mdsc->mutex); session = __open_export_target_session(mdsc, target); @@ -1702,13 +1713,14 @@ static void __open_export_target_sessions(struct ceph_mds_client *mdsc, struct ceph_mds_info *mi; struct ceph_mds_session *ts; int i, mds = session->s_mds; + struct ceph_client *cl = mdsc->fsc->client;
if (mds >= mdsc->mdsmap->possible_max_rank) return;
mi = &mdsc->mdsmap->m_info[mds]; - dout("open_export_target_sessions for mds%d (%d targets)\n", - session->s_mds, mi->num_export_targets); + doutc(cl, "for mds%d (%d targets)\n", session->s_mds, + mi->num_export_targets);
for (i = 0; i < mi->num_export_targets; i++) { ts = __open_export_target_session(mdsc, mi->export_targets[i]); @@ -1731,11 +1743,13 @@ void ceph_mdsc_open_export_target_sessions(struct ceph_mds_client *mdsc, static void detach_cap_releases(struct ceph_mds_session *session, struct list_head *target) { + struct ceph_client *cl = session->s_mdsc->fsc->client; + lockdep_assert_held(&session->s_cap_lock);
list_splice_init(&session->s_cap_releases, target); session->s_num_cap_releases = 0; - dout("dispose_cap_releases mds%d\n", session->s_mds); + doutc(cl, "mds%d\n", session->s_mds); }
static void dispose_cap_releases(struct ceph_mds_client *mdsc, @@ -1753,16 +1767,17 @@ static void dispose_cap_releases(struct ceph_mds_client *mdsc, static void cleanup_session_requests(struct ceph_mds_client *mdsc, struct ceph_mds_session *session) { + struct ceph_client *cl = mdsc->fsc->client; struct ceph_mds_request *req; struct rb_node *p;
- dout("cleanup_session_requests mds%d\n", session->s_mds); + doutc(cl, "mds%d\n", session->s_mds); mutex_lock(&mdsc->mutex); while (!list_empty(&session->s_unsafe)) { req = list_first_entry(&session->s_unsafe, struct ceph_mds_request, r_unsafe_item); - pr_warn_ratelimited(" dropping unsafe request %llu\n", - req->r_tid); + pr_warn_ratelimited_client(cl, " dropping unsafe request %llu\n", + req->r_tid); if (req->r_target_inode) mapping_set_error(req->r_target_inode->i_mapping, -EIO); if (req->r_unsafe_dir) @@ -1791,13 +1806,14 @@ int ceph_iterate_session_caps(struct ceph_mds_session *session, int (*cb)(struct inode *, int mds, void *), void *arg) { + struct ceph_client *cl = session->s_mdsc->fsc->client; struct list_head *p; struct ceph_cap *cap; struct inode *inode, *last_inode = NULL; struct ceph_cap *old_cap = NULL; int ret;
- dout("iterate_session_caps %p mds%d\n", session, session->s_mds); + doutc(cl, "%p mds%d\n", session, session->s_mds); spin_lock(&session->s_cap_lock); p = session->s_caps.next; while (p != &session->s_caps) { @@ -1828,8 +1844,7 @@ int ceph_iterate_session_caps(struct ceph_mds_session *session, spin_lock(&session->s_cap_lock); p = p->next; if (!cap->ci) { - dout("iterate_session_caps finishing cap %p removal\n", - cap); + doutc(cl, "finishing cap %p removal\n", cap); BUG_ON(cap->session != session); cap->session = NULL; list_del_init(&cap->session_caps); @@ -1858,6 +1873,7 @@ int ceph_iterate_session_caps(struct ceph_mds_session *session, static int remove_session_caps_cb(struct inode *inode, int mds, void *arg) { struct ceph_inode_info *ci = ceph_inode(inode); + struct ceph_client *cl = ceph_inode_to_client(inode); bool invalidate = false; struct ceph_cap *cap; int iputs = 0; @@ -1865,8 +1881,8 @@ static int remove_session_caps_cb(struct inode *inode, int mds, void *arg) spin_lock(&ci->i_ceph_lock); cap = __get_cap_for_mds(ci, mds); if (cap) { - dout(" removing cap %p, ci is %p, inode is %p\n", - cap, ci, &ci->netfs.inode); + doutc(cl, " removing cap %p, ci is %p, inode is %p\n", + cap, ci, &ci->netfs.inode);
iputs = ceph_purge_inode_cap(inode, cap, &invalidate); } @@ -1890,7 +1906,7 @@ static void remove_session_caps(struct ceph_mds_session *session) struct super_block *sb = fsc->sb; LIST_HEAD(dispose);
- dout("remove_session_caps on %p\n", session); + doutc(fsc->client, "on %p\n", session); ceph_iterate_session_caps(session, remove_session_caps_cb, fsc);
wake_up_all(&fsc->mdsc->cap_flushing_wq); @@ -1971,7 +1987,9 @@ static int wake_up_session_cb(struct inode *inode, int mds, void *arg)
static void wake_up_session_caps(struct ceph_mds_session *session, int ev) { - dout("wake_up_session_caps %p mds%d\n", session, session->s_mds); + struct ceph_client *cl = session->s_mdsc->fsc->client; + + doutc(cl, "session %p mds%d\n", session, session->s_mds); ceph_iterate_session_caps(session, wake_up_session_cb, (void *)(unsigned long)ev); } @@ -1985,25 +2003,26 @@ static void wake_up_session_caps(struct ceph_mds_session *session, int ev) static int send_renew_caps(struct ceph_mds_client *mdsc, struct ceph_mds_session *session) { + struct ceph_client *cl = mdsc->fsc->client; struct ceph_msg *msg; int state;
if (time_after_eq(jiffies, session->s_cap_ttl) && time_after_eq(session->s_cap_ttl, session->s_renew_requested)) - pr_info("mds%d caps stale\n", session->s_mds); + pr_info_client(cl, "mds%d caps stale\n", session->s_mds); session->s_renew_requested = jiffies;
/* do not try to renew caps until a recovering mds has reconnected * with its clients. */ state = ceph_mdsmap_get_state(mdsc->mdsmap, session->s_mds); if (state < CEPH_MDS_STATE_RECONNECT) { - dout("send_renew_caps ignoring mds%d (%s)\n", - session->s_mds, ceph_mds_state_name(state)); + doutc(cl, "ignoring mds%d (%s)\n", session->s_mds, + ceph_mds_state_name(state)); return 0; }
- dout("send_renew_caps to mds%d (%s)\n", session->s_mds, - ceph_mds_state_name(state)); + doutc(cl, "to mds%d (%s)\n", session->s_mds, + ceph_mds_state_name(state)); msg = ceph_create_session_msg(CEPH_SESSION_REQUEST_RENEWCAPS, ++session->s_renew_seq); if (!msg) @@ -2015,10 +2034,11 @@ static int send_renew_caps(struct ceph_mds_client *mdsc, static int send_flushmsg_ack(struct ceph_mds_client *mdsc, struct ceph_mds_session *session, u64 seq) { + struct ceph_client *cl = mdsc->fsc->client; struct ceph_msg *msg;
- dout("send_flushmsg_ack to mds%d (%s)s seq %lld\n", - session->s_mds, ceph_session_state_name(session->s_state), seq); + doutc(cl, "to mds%d (%s)s seq %lld\n", session->s_mds, + ceph_session_state_name(session->s_state), seq); msg = ceph_create_session_msg(CEPH_SESSION_FLUSHMSG_ACK, seq); if (!msg) return -ENOMEM; @@ -2035,6 +2055,7 @@ static int send_flushmsg_ack(struct ceph_mds_client *mdsc, static void renewed_caps(struct ceph_mds_client *mdsc, struct ceph_mds_session *session, int is_renew) { + struct ceph_client *cl = mdsc->fsc->client; int was_stale; int wake = 0;
@@ -2046,15 +2067,17 @@ static void renewed_caps(struct ceph_mds_client *mdsc,
if (was_stale) { if (time_before(jiffies, session->s_cap_ttl)) { - pr_info("mds%d caps renewed\n", session->s_mds); + pr_info_client(cl, "mds%d caps renewed\n", + session->s_mds); wake = 1; } else { - pr_info("mds%d caps still stale\n", session->s_mds); + pr_info_client(cl, "mds%d caps still stale\n", + session->s_mds); } } - dout("renewed_caps mds%d ttl now %lu, was %s, now %s\n", - session->s_mds, session->s_cap_ttl, was_stale ? "stale" : "fresh", - time_before(jiffies, session->s_cap_ttl) ? "stale" : "fresh"); + doutc(cl, "mds%d ttl now %lu, was %s, now %s\n", session->s_mds, + session->s_cap_ttl, was_stale ? "stale" : "fresh", + time_before(jiffies, session->s_cap_ttl) ? "stale" : "fresh"); spin_unlock(&session->s_cap_lock);
if (wake) @@ -2066,11 +2089,11 @@ static void renewed_caps(struct ceph_mds_client *mdsc, */ static int request_close_session(struct ceph_mds_session *session) { + struct ceph_client *cl = session->s_mdsc->fsc->client; struct ceph_msg *msg;
- dout("request_close_session mds%d state %s seq %lld\n", - session->s_mds, ceph_session_state_name(session->s_state), - session->s_seq); + doutc(cl, "mds%d state %s seq %lld\n", session->s_mds, + ceph_session_state_name(session->s_state), session->s_seq); msg = ceph_create_session_msg(CEPH_SESSION_REQUEST_CLOSE, session->s_seq); if (!msg) @@ -2127,6 +2150,7 @@ static bool drop_negative_children(struct dentry *dentry) static int trim_caps_cb(struct inode *inode, int mds, void *arg) { struct ceph_mds_client *mdsc = ceph_sb_to_mdsc(inode->i_sb); + struct ceph_client *cl = mdsc->fsc->client; int *remaining = arg; struct ceph_inode_info *ci = ceph_inode(inode); int used, wanted, oissued, mine; @@ -2146,9 +2170,10 @@ static int trim_caps_cb(struct inode *inode, int mds, void *arg) wanted = __ceph_caps_file_wanted(ci); oissued = __ceph_caps_issued_other(ci, cap);
- dout("trim_caps_cb %p cap %p mine %s oissued %s used %s wanted %s\n", - inode, cap, ceph_cap_string(mine), ceph_cap_string(oissued), - ceph_cap_string(used), ceph_cap_string(wanted)); + doutc(cl, "%p %llx.%llx cap %p mine %s oissued %s used %s wanted %s\n", + inode, ceph_vinop(inode), cap, ceph_cap_string(mine), + ceph_cap_string(oissued), ceph_cap_string(used), + ceph_cap_string(wanted)); if (cap == ci->i_auth_cap) { if (ci->i_dirty_caps || ci->i_flushing_caps || !list_empty(&ci->i_cap_snaps)) @@ -2188,8 +2213,8 @@ static int trim_caps_cb(struct inode *inode, int mds, void *arg) count = atomic_read(&inode->i_count); if (count == 1) (*remaining)--; - dout("trim_caps_cb %p cap %p pruned, count now %d\n", - inode, cap, count); + doutc(cl, "%p %llx.%llx cap %p pruned, count now %d\n", + inode, ceph_vinop(inode), cap, count); } else { dput(dentry); } @@ -2208,17 +2233,18 @@ int ceph_trim_caps(struct ceph_mds_client *mdsc, struct ceph_mds_session *session, int max_caps) { + struct ceph_client *cl = mdsc->fsc->client; int trim_caps = session->s_nr_caps - max_caps;
- dout("trim_caps mds%d start: %d / %d, trim %d\n", - session->s_mds, session->s_nr_caps, max_caps, trim_caps); + doutc(cl, "mds%d start: %d / %d, trim %d\n", session->s_mds, + session->s_nr_caps, max_caps, trim_caps); if (trim_caps > 0) { int remaining = trim_caps;
ceph_iterate_session_caps(session, trim_caps_cb, &remaining); - dout("trim_caps mds%d done: %d / %d, trimmed %d\n", - session->s_mds, session->s_nr_caps, max_caps, - trim_caps - remaining); + doutc(cl, "mds%d done: %d / %d, trimmed %d\n", + session->s_mds, session->s_nr_caps, max_caps, + trim_caps - remaining); }
ceph_flush_cap_releases(mdsc, session); @@ -2228,6 +2254,7 @@ int ceph_trim_caps(struct ceph_mds_client *mdsc, static int check_caps_flush(struct ceph_mds_client *mdsc, u64 want_flush_tid) { + struct ceph_client *cl = mdsc->fsc->client; int ret = 1;
spin_lock(&mdsc->cap_dirty_lock); @@ -2236,8 +2263,8 @@ static int check_caps_flush(struct ceph_mds_client *mdsc, list_first_entry(&mdsc->cap_flush_list, struct ceph_cap_flush, g_list); if (cf->tid <= want_flush_tid) { - dout("check_caps_flush still flushing tid " - "%llu <= %llu\n", cf->tid, want_flush_tid); + doutc(cl, "still flushing tid %llu <= %llu\n", + cf->tid, want_flush_tid); ret = 0; } } @@ -2253,12 +2280,14 @@ static int check_caps_flush(struct ceph_mds_client *mdsc, static void wait_caps_flush(struct ceph_mds_client *mdsc, u64 want_flush_tid) { - dout("check_caps_flush want %llu\n", want_flush_tid); + struct ceph_client *cl = mdsc->fsc->client; + + doutc(cl, "want %llu\n", want_flush_tid);
wait_event(mdsc->cap_flushing_wq, check_caps_flush(mdsc, want_flush_tid));
- dout("check_caps_flush ok, flushed thru %llu\n", want_flush_tid); + doutc(cl, "ok, flushed thru %llu\n", want_flush_tid); }
/* @@ -2267,6 +2296,7 @@ static void wait_caps_flush(struct ceph_mds_client *mdsc, static void ceph_send_cap_releases(struct ceph_mds_client *mdsc, struct ceph_mds_session *session) { + struct ceph_client *cl = mdsc->fsc->client; struct ceph_msg *msg = NULL; struct ceph_mds_cap_release *head; struct ceph_mds_cap_item *item; @@ -2325,7 +2355,7 @@ static void ceph_send_cap_releases(struct ceph_mds_client *mdsc, msg->front.iov_len += sizeof(*cap_barrier);
msg->hdr.front_len = cpu_to_le32(msg->front.iov_len); - dout("send_cap_releases mds%d %p\n", session->s_mds, msg); + doutc(cl, "mds%d %p\n", session->s_mds, msg); ceph_con_send(&session->s_con, msg); msg = NULL; } @@ -2345,13 +2375,13 @@ static void ceph_send_cap_releases(struct ceph_mds_client *mdsc, msg->front.iov_len += sizeof(*cap_barrier);
msg->hdr.front_len = cpu_to_le32(msg->front.iov_len); - dout("send_cap_releases mds%d %p\n", session->s_mds, msg); + doutc(cl, "mds%d %p\n", session->s_mds, msg); ceph_con_send(&session->s_con, msg); } return; out_err: - pr_err("send_cap_releases mds%d, failed to allocate message\n", - session->s_mds); + pr_err_client(cl, "mds%d, failed to allocate message\n", + session->s_mds); spin_lock(&session->s_cap_lock); list_splice(&tmp_list, &session->s_cap_releases); session->s_num_cap_releases += num_cap_releases; @@ -2374,16 +2404,17 @@ static void ceph_cap_release_work(struct work_struct *work) void ceph_flush_cap_releases(struct ceph_mds_client *mdsc, struct ceph_mds_session *session) { + struct ceph_client *cl = mdsc->fsc->client; if (mdsc->stopping) return;
ceph_get_mds_session(session); if (queue_work(mdsc->fsc->cap_wq, &session->s_cap_release_work)) { - dout("cap release work queued\n"); + doutc(cl, "cap release work queued\n"); } else { ceph_put_mds_session(session); - dout("failed to queue cap release work\n"); + doutc(cl, "failed to queue cap release work\n"); } }
@@ -2411,13 +2442,14 @@ static void ceph_cap_reclaim_work(struct work_struct *work)
void ceph_queue_cap_reclaim_work(struct ceph_mds_client *mdsc) { + struct ceph_client *cl = mdsc->fsc->client; if (mdsc->stopping) return;
if (queue_work(mdsc->fsc->cap_wq, &mdsc->cap_reclaim_work)) { - dout("caps reclaim work queued\n"); + doutc(cl, "caps reclaim work queued\n"); } else { - dout("failed to queue caps release work\n"); + doutc(cl, "failed to queue caps release work\n"); } }
@@ -2612,6 +2644,7 @@ static u8 *get_fscrypt_altname(const struct ceph_mds_request *req, u32 *plen) char *ceph_mdsc_build_path(struct ceph_mds_client *mdsc, struct dentry *dentry, int *plen, u64 *pbase, int for_wire) { + struct ceph_client *cl = mdsc->fsc->client; struct dentry *cur; struct inode *inode; char *path; @@ -2637,8 +2670,7 @@ char *ceph_mdsc_build_path(struct ceph_mds_client *mdsc, struct dentry *dentry, spin_lock(&cur->d_lock); inode = d_inode(cur); if (inode && ceph_snap(inode) == CEPH_SNAPDIR) { - dout("build_path path+%d: %p SNAPDIR\n", - pos, cur); + doutc(cl, "path+%d: %p SNAPDIR\n", pos, cur); spin_unlock(&cur->d_lock); parent = dget_parent(cur); } else if (for_wire && inode && dentry != cur && @@ -2716,15 +2748,15 @@ char *ceph_mdsc_build_path(struct ceph_mds_client *mdsc, struct dentry *dentry, * A rename didn't occur, but somehow we didn't end up where * we thought we would. Throw a warning and try again. */ - pr_warn("build_path did not end path lookup where expected (pos = %d)\n", - pos); + pr_warn_client(cl, "did not end path lookup where expected (pos = %d)\n", + pos); goto retry; }
*pbase = base; *plen = PATH_MAX - 1 - pos; - dout("build_path on %p %d built %llx '%.*s'\n", - dentry, d_count(dentry), base, *plen, path + pos); + doutc(cl, "on %p %d built %llx '%.*s'\n", dentry, d_count(dentry), + base, *plen, path + pos); return path + pos; }
@@ -2787,22 +2819,22 @@ static int set_request_path_attr(struct ceph_mds_client *mdsc, struct inode *rin int *pathlen, u64 *ino, bool *freepath, bool parent_locked) { + struct ceph_client *cl = mdsc->fsc->client; int r = 0;
if (rinode) { r = build_inode_path(rinode, ppath, pathlen, ino, freepath); - dout(" inode %p %llx.%llx\n", rinode, ceph_ino(rinode), - ceph_snap(rinode)); + doutc(cl, " inode %p %llx.%llx\n", rinode, ceph_ino(rinode), + ceph_snap(rinode)); } else if (rdentry) { r = build_dentry_path(mdsc, rdentry, rdiri, ppath, pathlen, ino, freepath, parent_locked); - dout(" dentry %p %llx/%.*s\n", rdentry, *ino, *pathlen, - *ppath); + doutc(cl, " dentry %p %llx/%.*s\n", rdentry, *ino, *pathlen, *ppath); } else if (rpath || rino) { *ino = rino; *ppath = rpath; *pathlen = rpath ? strlen(rpath) : 0; - dout(" path %.*s\n", *pathlen, rpath); + doutc(cl, " path %.*s\n", *pathlen, rpath); }
return r; @@ -3103,6 +3135,7 @@ static int __prepare_send_request(struct ceph_mds_session *session, { int mds = session->s_mds; struct ceph_mds_client *mdsc = session->s_mdsc; + struct ceph_client *cl = mdsc->fsc->client; struct ceph_mds_request_head_legacy *lhead; struct ceph_mds_request_head *nhead; struct ceph_msg *msg; @@ -3121,8 +3154,8 @@ static int __prepare_send_request(struct ceph_mds_session *session, old_max_retry = 1 << (old_max_retry * BITS_PER_BYTE); if ((old_version && req->r_attempts >= old_max_retry) || ((uint32_t)req->r_attempts >= U32_MAX)) { - pr_warn_ratelimited("%s request tid %llu seq overflow\n", - __func__, req->r_tid); + pr_warn_ratelimited_client(cl, "request tid %llu seq overflow\n", + req->r_tid); return -EMULTIHOP; } } @@ -3137,8 +3170,8 @@ static int __prepare_send_request(struct ceph_mds_session *session, else req->r_sent_on_mseq = -1; } - dout("%s %p tid %lld %s (attempt %d)\n", __func__, req, - req->r_tid, ceph_mds_op_name(req->r_op), req->r_attempts); + doutc(cl, "%p tid %lld %s (attempt %d)\n", req, req->r_tid, + ceph_mds_op_name(req->r_op), req->r_attempts);
if (test_bit(CEPH_MDS_R_GOT_UNSAFE, &req->r_req_flags)) { void *p; @@ -3206,7 +3239,7 @@ static int __prepare_send_request(struct ceph_mds_session *session, nhead->ext_num_retry = cpu_to_le32(req->r_attempts - 1); }
- dout(" r_parent = %p\n", req->r_parent); + doutc(cl, " r_parent = %p\n", req->r_parent); return 0; }
@@ -3234,6 +3267,7 @@ static int __send_request(struct ceph_mds_session *session, static void __do_request(struct ceph_mds_client *mdsc, struct ceph_mds_request *req) { + struct ceph_client *cl = mdsc->fsc->client; struct ceph_mds_session *session = NULL; int mds = -1; int err = 0; @@ -3246,29 +3280,29 @@ static void __do_request(struct ceph_mds_client *mdsc, }
if (READ_ONCE(mdsc->fsc->mount_state) == CEPH_MOUNT_FENCE_IO) { - dout("do_request metadata corrupted\n"); + doutc(cl, "metadata corrupted\n"); err = -EIO; goto finish; } if (req->r_timeout && time_after_eq(jiffies, req->r_started + req->r_timeout)) { - dout("do_request timed out\n"); + doutc(cl, "timed out\n"); err = -ETIMEDOUT; goto finish; } if (READ_ONCE(mdsc->fsc->mount_state) == CEPH_MOUNT_SHUTDOWN) { - dout("do_request forced umount\n"); + doutc(cl, "forced umount\n"); err = -EIO; goto finish; } if (READ_ONCE(mdsc->fsc->mount_state) == CEPH_MOUNT_MOUNTING) { if (mdsc->mdsmap_err) { err = mdsc->mdsmap_err; - dout("do_request mdsmap err %d\n", err); + doutc(cl, "mdsmap err %d\n", err); goto finish; } if (mdsc->mdsmap->m_epoch == 0) { - dout("do_request no mdsmap, waiting for map\n"); + doutc(cl, "no mdsmap, waiting for map\n"); list_add(&req->r_wait, &mdsc->waiting_for_map); return; } @@ -3289,7 +3323,7 @@ static void __do_request(struct ceph_mds_client *mdsc, err = -EJUKEBOX; goto finish; } - dout("do_request no mds or not active, waiting for map\n"); + doutc(cl, "no mds or not active, waiting for map\n"); list_add(&req->r_wait, &mdsc->waiting_for_map); return; } @@ -3305,8 +3339,8 @@ static void __do_request(struct ceph_mds_client *mdsc, } req->r_session = ceph_get_mds_session(session);
- dout("do_request mds%d session %p state %s\n", mds, session, - ceph_session_state_name(session->s_state)); + doutc(cl, "mds%d session %p state %s\n", mds, session, + ceph_session_state_name(session->s_state));
/* * The old ceph will crash the MDSs when see unknown OPs @@ -3397,8 +3431,8 @@ static void __do_request(struct ceph_mds_client *mdsc, spin_lock(&ci->i_ceph_lock); cap = ci->i_auth_cap; if (ci->i_ceph_flags & CEPH_I_ASYNC_CREATE && mds != cap->mds) { - dout("do_request session changed for auth cap %d -> %d\n", - cap->session->s_mds, session->s_mds); + doutc(cl, "session changed for auth cap %d -> %d\n", + cap->session->s_mds, session->s_mds);
/* Remove the auth cap from old session */ spin_lock(&cap->session->s_cap_lock); @@ -3425,7 +3459,7 @@ static void __do_request(struct ceph_mds_client *mdsc, ceph_put_mds_session(session); finish: if (err) { - dout("__do_request early error %d\n", err); + doutc(cl, "early error %d\n", err); req->r_err = err; complete_request(mdsc, req); __unregister_request(mdsc, req); @@ -3439,6 +3473,7 @@ static void __do_request(struct ceph_mds_client *mdsc, static void __wake_requests(struct ceph_mds_client *mdsc, struct list_head *head) { + struct ceph_client *cl = mdsc->fsc->client; struct ceph_mds_request *req; LIST_HEAD(tmp_list);
@@ -3448,7 +3483,8 @@ static void __wake_requests(struct ceph_mds_client *mdsc, req = list_entry(tmp_list.next, struct ceph_mds_request, r_wait); list_del_init(&req->r_wait); - dout(" wake request %p tid %llu\n", req, req->r_tid); + doutc(cl, " wake request %p tid %llu\n", req, + req->r_tid); __do_request(mdsc, req); } } @@ -3459,10 +3495,11 @@ static void __wake_requests(struct ceph_mds_client *mdsc, */ static void kick_requests(struct ceph_mds_client *mdsc, int mds) { + struct ceph_client *cl = mdsc->fsc->client; struct ceph_mds_request *req; struct rb_node *p = rb_first(&mdsc->request_tree);
- dout("kick_requests mds%d\n", mds); + doutc(cl, "kick_requests mds%d\n", mds); while (p) { req = rb_entry(p, struct ceph_mds_request, r_node); p = rb_next(p); @@ -3472,7 +3509,7 @@ static void kick_requests(struct ceph_mds_client *mdsc, int mds) continue; /* only new requests */ if (req->r_session && req->r_session->s_mds == mds) { - dout(" kicking tid %llu\n", req->r_tid); + doutc(cl, " kicking tid %llu\n", req->r_tid); list_del_init(&req->r_wait); __do_request(mdsc, req); } @@ -3482,6 +3519,7 @@ static void kick_requests(struct ceph_mds_client *mdsc, int mds) int ceph_mdsc_submit_request(struct ceph_mds_client *mdsc, struct inode *dir, struct ceph_mds_request *req) { + struct ceph_client *cl = mdsc->fsc->client; int err = 0;
/* take CAP_PIN refs for r_inode, r_parent, r_old_dentry */ @@ -3503,8 +3541,7 @@ int ceph_mdsc_submit_request(struct ceph_mds_client *mdsc, struct inode *dir, if (req->r_inode) { err = ceph_wait_on_async_create(req->r_inode); if (err) { - dout("%s: wait for async create returned: %d\n", - __func__, err); + doutc(cl, "wait for async create returned: %d\n", err); return err; } } @@ -3512,13 +3549,12 @@ int ceph_mdsc_submit_request(struct ceph_mds_client *mdsc, struct inode *dir, if (!err && req->r_old_inode) { err = ceph_wait_on_async_create(req->r_old_inode); if (err) { - dout("%s: wait for async create returned: %d\n", - __func__, err); + doutc(cl, "wait for async create returned: %d\n", err); return err; } }
- dout("submit_request on %p for inode %p\n", req, dir); + doutc(cl, "submit_request on %p for inode %p\n", req, dir); mutex_lock(&mdsc->mutex); __register_request(mdsc, req, dir); __do_request(mdsc, req); @@ -3531,10 +3567,11 @@ int ceph_mdsc_wait_request(struct ceph_mds_client *mdsc, struct ceph_mds_request *req, ceph_mds_request_wait_callback_t wait_func) { + struct ceph_client *cl = mdsc->fsc->client; int err;
/* wait */ - dout("do_request waiting\n"); + doutc(cl, "do_request waiting\n"); if (wait_func) { err = wait_func(mdsc, req); } else { @@ -3548,14 +3585,14 @@ int ceph_mdsc_wait_request(struct ceph_mds_client *mdsc, else err = timeleft; /* killed */ } - dout("do_request waited, got %d\n", err); + doutc(cl, "do_request waited, got %d\n", err); mutex_lock(&mdsc->mutex);
/* only abort if we didn't race with a real reply */ if (test_bit(CEPH_MDS_R_GOT_RESULT, &req->r_req_flags)) { err = le32_to_cpu(req->r_reply_info.head->result); } else if (err < 0) { - dout("aborted request %lld with %d\n", req->r_tid, err); + doutc(cl, "aborted request %lld with %d\n", req->r_tid, err);
/* * ensure we aren't running concurrently with @@ -3586,15 +3623,16 @@ int ceph_mdsc_do_request(struct ceph_mds_client *mdsc, struct inode *dir, struct ceph_mds_request *req) { + struct ceph_client *cl = mdsc->fsc->client; int err;
- dout("do_request on %p\n", req); + doutc(cl, "do_request on %p\n", req);
/* issue */ err = ceph_mdsc_submit_request(mdsc, dir, req); if (!err) err = ceph_mdsc_wait_request(mdsc, req, NULL); - dout("do_request %p done, result %d\n", req, err); + doutc(cl, "do_request %p done, result %d\n", req, err); return err; }
@@ -3606,8 +3644,10 @@ void ceph_invalidate_dir_request(struct ceph_mds_request *req) { struct inode *dir = req->r_parent; struct inode *old_dir = req->r_old_dentry_dir; + struct ceph_client *cl = req->r_mdsc->fsc->client;
- dout("invalidate_dir_request %p %p (complete, lease(s))\n", dir, old_dir); + doutc(cl, "invalidate_dir_request %p %p (complete, lease(s))\n", + dir, old_dir);
ceph_dir_clear_complete(dir); if (old_dir) @@ -3628,6 +3668,7 @@ void ceph_invalidate_dir_request(struct ceph_mds_request *req) static void handle_reply(struct ceph_mds_session *session, struct ceph_msg *msg) { struct ceph_mds_client *mdsc = session->s_mdsc; + struct ceph_client *cl = mdsc->fsc->client; struct ceph_mds_request *req; struct ceph_mds_reply_head *head = msg->front.iov_base; struct ceph_mds_reply_info_parsed *rinfo; /* parsed reply info */ @@ -3638,7 +3679,7 @@ static void handle_reply(struct ceph_mds_session *session, struct ceph_msg *msg) bool close_sessions = false;
if (msg->front.iov_len < sizeof(*head)) { - pr_err("mdsc_handle_reply got corrupt (short) reply\n"); + pr_err_client(cl, "got corrupt (short) reply\n"); ceph_msg_dump(msg); return; } @@ -3648,17 +3689,17 @@ static void handle_reply(struct ceph_mds_session *session, struct ceph_msg *msg) mutex_lock(&mdsc->mutex); req = lookup_get_request(mdsc, tid); if (!req) { - dout("handle_reply on unknown tid %llu\n", tid); + doutc(cl, "on unknown tid %llu\n", tid); mutex_unlock(&mdsc->mutex); return; } - dout("handle_reply %p\n", req); + doutc(cl, "handle_reply %p\n", req);
/* correct session? */ if (req->r_session != session) { - pr_err("mdsc_handle_reply got %llu on session mds%d" - " not mds%d\n", tid, session->s_mds, - req->r_session ? req->r_session->s_mds : -1); + pr_err_client(cl, "got %llu on session mds%d not mds%d\n", + tid, session->s_mds, + req->r_session ? req->r_session->s_mds : -1); mutex_unlock(&mdsc->mutex); goto out; } @@ -3666,14 +3707,14 @@ static void handle_reply(struct ceph_mds_session *session, struct ceph_msg *msg) /* dup? */ if ((test_bit(CEPH_MDS_R_GOT_UNSAFE, &req->r_req_flags) && !head->safe) || (test_bit(CEPH_MDS_R_GOT_SAFE, &req->r_req_flags) && head->safe)) { - pr_warn("got a dup %s reply on %llu from mds%d\n", - head->safe ? "safe" : "unsafe", tid, mds); + pr_warn_client(cl, "got a dup %s reply on %llu from mds%d\n", + head->safe ? "safe" : "unsafe", tid, mds); mutex_unlock(&mdsc->mutex); goto out; } if (test_bit(CEPH_MDS_R_GOT_SAFE, &req->r_req_flags)) { - pr_warn("got unsafe after safe on %llu from mds%d\n", - tid, mds); + pr_warn_client(cl, "got unsafe after safe on %llu from mds%d\n", + tid, mds); mutex_unlock(&mdsc->mutex); goto out; } @@ -3696,7 +3737,7 @@ static void handle_reply(struct ceph_mds_session *session, struct ceph_msg *msg) * response. And even if it did, there is nothing * useful we could do with a revised return value. */ - dout("got safe reply %llu, mds%d\n", tid, mds); + doutc(cl, "got safe reply %llu, mds%d\n", tid, mds);
mutex_unlock(&mdsc->mutex); goto out; @@ -3706,7 +3747,7 @@ static void handle_reply(struct ceph_mds_session *session, struct ceph_msg *msg) list_add_tail(&req->r_unsafe_item, &req->r_session->s_unsafe); }
- dout("handle_reply tid %lld result %d\n", tid, result); + doutc(cl, "tid %lld result %d\n", tid, result); if (test_bit(CEPHFS_FEATURE_REPLY_ENCODING, &session->s_features)) err = parse_reply_info(session, msg, req, (u64)-1); else @@ -3746,7 +3787,8 @@ static void handle_reply(struct ceph_mds_session *session, struct ceph_msg *msg)
mutex_lock(&session->s_mutex); if (err < 0) { - pr_err("mdsc_handle_reply got corrupt reply mds%d(tid:%lld)\n", mds, tid); + pr_err_client(cl, "got corrupt reply mds%d(tid:%lld)\n", + mds, tid); ceph_msg_dump(msg); goto out_err; } @@ -3810,7 +3852,7 @@ static void handle_reply(struct ceph_mds_session *session, struct ceph_msg *msg) set_bit(CEPH_MDS_R_GOT_RESULT, &req->r_req_flags); } } else { - dout("reply arrived after request %lld was aborted\n", tid); + doutc(cl, "reply arrived after request %lld was aborted\n", tid); } mutex_unlock(&mdsc->mutex);
@@ -3839,6 +3881,7 @@ static void handle_forward(struct ceph_mds_client *mdsc, struct ceph_mds_session *session, struct ceph_msg *msg) { + struct ceph_client *cl = mdsc->fsc->client; struct ceph_mds_request *req; u64 tid = le64_to_cpu(msg->hdr.tid); u32 next_mds; @@ -3856,12 +3899,12 @@ static void handle_forward(struct ceph_mds_client *mdsc, req = lookup_get_request(mdsc, tid); if (!req) { mutex_unlock(&mdsc->mutex); - dout("forward tid %llu to mds%d - req dne\n", tid, next_mds); + doutc(cl, "forward tid %llu to mds%d - req dne\n", tid, next_mds); return; /* dup reply? */ }
if (test_bit(CEPH_MDS_R_ABORTED, &req->r_req_flags)) { - dout("forward tid %llu aborted, unregistering\n", tid); + doutc(cl, "forward tid %llu aborted, unregistering\n", tid); __unregister_request(mdsc, req); } else if (fwd_seq <= req->r_num_fwd || (uint32_t)fwd_seq >= U32_MAX) { /* @@ -3877,10 +3920,11 @@ static void handle_forward(struct ceph_mds_client *mdsc, set_bit(CEPH_MDS_R_ABORTED, &req->r_req_flags); mutex_unlock(&req->r_fill_mutex); aborted = true; - pr_warn_ratelimited("forward tid %llu seq overflow\n", tid); + pr_warn_ratelimited_client(cl, "forward tid %llu seq overflow\n", + tid); } else { /* resend. forward race not possible; mds would drop */ - dout("forward tid %llu to mds%d (we resend)\n", tid, next_mds); + doutc(cl, "forward tid %llu to mds%d (we resend)\n", tid, next_mds); BUG_ON(req->r_err); BUG_ON(test_bit(CEPH_MDS_R_GOT_RESULT, &req->r_req_flags)); req->r_attempts = 0; @@ -3898,7 +3942,7 @@ static void handle_forward(struct ceph_mds_client *mdsc, return;
bad: - pr_err("mdsc_handle_forward decode error err=%d\n", err); + pr_err_client(cl, "decode error err=%d\n", err); ceph_msg_dump(msg); }
@@ -3937,6 +3981,7 @@ static void handle_session(struct ceph_mds_session *session, struct ceph_msg *msg) { struct ceph_mds_client *mdsc = session->s_mdsc; + struct ceph_client *cl = mdsc->fsc->client; int mds = session->s_mds; int msg_version = le16_to_cpu(msg->hdr.version); void *p = msg->front.iov_base; @@ -3984,7 +4029,8 @@ static void handle_session(struct ceph_mds_session *session, /* version >= 5, flags */ ceph_decode_32_safe(&p, end, flags, bad); if (flags & CEPH_SESSION_BLOCKLISTED) { - pr_warn("mds%d session blocklisted\n", session->s_mds); + pr_warn_client(cl, "mds%d session blocklisted\n", + session->s_mds); blocklisted = true; } } @@ -4000,23 +4046,25 @@ static void handle_session(struct ceph_mds_session *session,
mutex_lock(&session->s_mutex);
- dout("handle_session mds%d %s %p state %s seq %llu\n", - mds, ceph_session_op_name(op), session, - ceph_session_state_name(session->s_state), seq); + doutc(cl, "mds%d %s %p state %s seq %llu\n", mds, + ceph_session_op_name(op), session, + ceph_session_state_name(session->s_state), seq);
if (session->s_state == CEPH_MDS_SESSION_HUNG) { session->s_state = CEPH_MDS_SESSION_OPEN; - pr_info("mds%d came back\n", session->s_mds); + pr_info_client(cl, "mds%d came back\n", session->s_mds); }
switch (op) { case CEPH_SESSION_OPEN: if (session->s_state == CEPH_MDS_SESSION_RECONNECTING) - pr_info("mds%d reconnect success\n", session->s_mds); + pr_info_client(cl, "mds%d reconnect success\n", + session->s_mds);
session->s_features = features; if (session->s_state == CEPH_MDS_SESSION_OPEN) { - pr_notice("mds%d is already opened\n", session->s_mds); + pr_notice_client(cl, "mds%d is already opened\n", + session->s_mds); } else { session->s_state = CEPH_MDS_SESSION_OPEN; renewed_caps(mdsc, session, 0); @@ -4045,7 +4093,8 @@ static void handle_session(struct ceph_mds_session *session,
case CEPH_SESSION_CLOSE: if (session->s_state == CEPH_MDS_SESSION_RECONNECTING) - pr_info("mds%d reconnect denied\n", session->s_mds); + pr_info_client(cl, "mds%d reconnect denied\n", + session->s_mds); session->s_state = CEPH_MDS_SESSION_CLOSED; cleanup_session_requests(mdsc, session); remove_session_caps(session); @@ -4054,8 +4103,8 @@ static void handle_session(struct ceph_mds_session *session, break;
case CEPH_SESSION_STALE: - pr_info("mds%d caps went stale, renewing\n", - session->s_mds); + pr_info_client(cl, "mds%d caps went stale, renewing\n", + session->s_mds); atomic_inc(&session->s_cap_gen); session->s_cap_ttl = jiffies - 1; send_renew_caps(mdsc, session); @@ -4076,7 +4125,7 @@ static void handle_session(struct ceph_mds_session *session, break;
case CEPH_SESSION_FORCE_RO: - dout("force_session_readonly %p\n", session); + doutc(cl, "force_session_readonly %p\n", session); spin_lock(&session->s_cap_lock); session->s_readonly = true; spin_unlock(&session->s_cap_lock); @@ -4085,7 +4134,8 @@ static void handle_session(struct ceph_mds_session *session,
case CEPH_SESSION_REJECT: WARN_ON(session->s_state != CEPH_MDS_SESSION_OPENING); - pr_info("mds%d rejected session\n", session->s_mds); + pr_info_client(cl, "mds%d rejected session\n", + session->s_mds); session->s_state = CEPH_MDS_SESSION_REJECTED; cleanup_session_requests(mdsc, session); remove_session_caps(session); @@ -4095,7 +4145,7 @@ static void handle_session(struct ceph_mds_session *session, break;
default: - pr_err("mdsc_handle_session bad op %d mds%d\n", op, mds); + pr_err_client(cl, "bad op %d mds%d\n", op, mds); WARN_ON(1); }
@@ -4112,30 +4162,32 @@ static void handle_session(struct ceph_mds_session *session, return;
bad: - pr_err("mdsc_handle_session corrupt message mds%d len %d\n", mds, - (int)msg->front.iov_len); + pr_err_client(cl, "corrupt message mds%d len %d\n", mds, + (int)msg->front.iov_len); ceph_msg_dump(msg); return; }
void ceph_mdsc_release_dir_caps(struct ceph_mds_request *req) { + struct ceph_client *cl = req->r_mdsc->fsc->client; int dcaps;
dcaps = xchg(&req->r_dir_caps, 0); if (dcaps) { - dout("releasing r_dir_caps=%s\n", ceph_cap_string(dcaps)); + doutc(cl, "releasing r_dir_caps=%s\n", ceph_cap_string(dcaps)); ceph_put_cap_refs(ceph_inode(req->r_parent), dcaps); } }
void ceph_mdsc_release_dir_caps_no_check(struct ceph_mds_request *req) { + struct ceph_client *cl = req->r_mdsc->fsc->client; int dcaps;
dcaps = xchg(&req->r_dir_caps, 0); if (dcaps) { - dout("releasing r_dir_caps=%s\n", ceph_cap_string(dcaps)); + doutc(cl, "releasing r_dir_caps=%s\n", ceph_cap_string(dcaps)); ceph_put_cap_refs_no_check_caps(ceph_inode(req->r_parent), dcaps); } @@ -4150,7 +4202,7 @@ static void replay_unsafe_requests(struct ceph_mds_client *mdsc, struct ceph_mds_request *req, *nreq; struct rb_node *p;
- dout("replay_unsafe_requests mds%d\n", session->s_mds); + doutc(mdsc->fsc->client, "mds%d\n", session->s_mds);
mutex_lock(&mdsc->mutex); list_for_each_entry_safe(req, nreq, &session->s_unsafe, r_unsafe_item) @@ -4295,6 +4347,7 @@ static struct dentry* d_find_primary(struct inode *inode) static int reconnect_caps_cb(struct inode *inode, int mds, void *arg) { struct ceph_mds_client *mdsc = ceph_sb_to_mdsc(inode->i_sb); + struct ceph_client *cl = ceph_inode_to_client(inode); union { struct ceph_mds_cap_reconnect v2; struct ceph_mds_cap_reconnect_v1 v1; @@ -4331,9 +4384,9 @@ static int reconnect_caps_cb(struct inode *inode, int mds, void *arg) err = 0; goto out_err; } - dout(" adding %p ino %llx.%llx cap %p %lld %s\n", - inode, ceph_vinop(inode), cap, cap->cap_id, - ceph_cap_string(cap->issued)); + doutc(cl, " adding %p ino %llx.%llx cap %p %lld %s\n", inode, + ceph_vinop(inode), cap, cap->cap_id, + ceph_cap_string(cap->issued));
cap->seq = 0; /* reset cap seq */ cap->issue_seq = 0; /* and issue_seq */ @@ -4483,6 +4536,7 @@ static int encode_snap_realms(struct ceph_mds_client *mdsc, { struct rb_node *p; struct ceph_pagelist *pagelist = recon_state->pagelist; + struct ceph_client *cl = mdsc->fsc->client; int err = 0;
if (recon_state->msg_version >= 4) { @@ -4521,8 +4575,8 @@ static int encode_snap_realms(struct ceph_mds_client *mdsc, ceph_pagelist_encode_32(pagelist, sizeof(sr_rec)); }
- dout(" adding snap realm %llx seq %lld parent %llx\n", - realm->ino, realm->seq, realm->parent_ino); + doutc(cl, " adding snap realm %llx seq %lld parent %llx\n", + realm->ino, realm->seq, realm->parent_ino); sr_rec.ino = cpu_to_le64(realm->ino); sr_rec.seq = cpu_to_le64(realm->seq); sr_rec.parent = cpu_to_le64(realm->parent_ino); @@ -4551,6 +4605,7 @@ static int encode_snap_realms(struct ceph_mds_client *mdsc, static void send_mds_reconnect(struct ceph_mds_client *mdsc, struct ceph_mds_session *session) { + struct ceph_client *cl = mdsc->fsc->client; struct ceph_msg *reply; int mds = session->s_mds; int err = -ENOMEM; @@ -4559,7 +4614,7 @@ static void send_mds_reconnect(struct ceph_mds_client *mdsc, }; LIST_HEAD(dispose);
- pr_info("mds%d reconnect start\n", mds); + pr_info_client(cl, "mds%d reconnect start\n", mds);
recon_state.pagelist = ceph_pagelist_alloc(GFP_NOFS); if (!recon_state.pagelist) @@ -4575,8 +4630,8 @@ static void send_mds_reconnect(struct ceph_mds_client *mdsc, session->s_state = CEPH_MDS_SESSION_RECONNECTING; session->s_seq = 0;
- dout("session %p state %s\n", session, - ceph_session_state_name(session->s_state)); + doutc(cl, "session %p state %s\n", session, + ceph_session_state_name(session->s_state));
atomic_inc(&session->s_cap_gen);
@@ -4710,7 +4765,8 @@ static void send_mds_reconnect(struct ceph_mds_client *mdsc, fail_nomsg: ceph_pagelist_release(recon_state.pagelist); fail_nopagelist: - pr_err("error %d preparing reconnect for mds%d\n", err, mds); + pr_err_client(cl, "error %d preparing reconnect for mds%d\n", + err, mds); return; }
@@ -4729,9 +4785,9 @@ static void check_new_map(struct ceph_mds_client *mdsc, int oldstate, newstate; struct ceph_mds_session *s; unsigned long targets[DIV_ROUND_UP(CEPH_MAX_MDS, sizeof(unsigned long))] = {0}; + struct ceph_client *cl = mdsc->fsc->client;
- dout("check_new_map new %u old %u\n", - newmap->m_epoch, oldmap->m_epoch); + doutc(cl, "new %u old %u\n", newmap->m_epoch, oldmap->m_epoch);
if (newmap->m_info) { for (i = 0; i < newmap->possible_max_rank; i++) { @@ -4747,12 +4803,12 @@ static void check_new_map(struct ceph_mds_client *mdsc, oldstate = ceph_mdsmap_get_state(oldmap, i); newstate = ceph_mdsmap_get_state(newmap, i);
- dout("check_new_map mds%d state %s%s -> %s%s (session %s)\n", - i, ceph_mds_state_name(oldstate), - ceph_mdsmap_is_laggy(oldmap, i) ? " (laggy)" : "", - ceph_mds_state_name(newstate), - ceph_mdsmap_is_laggy(newmap, i) ? " (laggy)" : "", - ceph_session_state_name(s->s_state)); + doutc(cl, "mds%d state %s%s -> %s%s (session %s)\n", + i, ceph_mds_state_name(oldstate), + ceph_mdsmap_is_laggy(oldmap, i) ? " (laggy)" : "", + ceph_mds_state_name(newstate), + ceph_mdsmap_is_laggy(newmap, i) ? " (laggy)" : "", + ceph_session_state_name(s->s_state));
if (i >= newmap->possible_max_rank) { /* force close session for stopped mds */ @@ -4805,7 +4861,8 @@ static void check_new_map(struct ceph_mds_client *mdsc, newstate >= CEPH_MDS_STATE_ACTIVE) { if (oldstate != CEPH_MDS_STATE_CREATING && oldstate != CEPH_MDS_STATE_STARTING) - pr_info("mds%d recovery completed\n", s->s_mds); + pr_info_client(cl, "mds%d recovery completed\n", + s->s_mds); kick_requests(mdsc, i); mutex_unlock(&mdsc->mutex); mutex_lock(&s->s_mutex); @@ -4849,12 +4906,13 @@ static void check_new_map(struct ceph_mds_client *mdsc, s = __open_export_target_session(mdsc, i); if (IS_ERR(s)) { err = PTR_ERR(s); - pr_err("failed to open export target session, err %d\n", - err); + pr_err_client(cl, + "failed to open export target session, err %d\n", + err); continue; } } - dout("send reconnect to export target mds.%d\n", i); + doutc(cl, "send reconnect to export target mds.%d\n", i); mutex_unlock(&mdsc->mutex); send_mds_reconnect(mdsc, s); ceph_put_mds_session(s); @@ -4870,8 +4928,7 @@ static void check_new_map(struct ceph_mds_client *mdsc, if (s->s_state == CEPH_MDS_SESSION_OPEN || s->s_state == CEPH_MDS_SESSION_HUNG || s->s_state == CEPH_MDS_SESSION_CLOSING) { - dout(" connecting to export targets of laggy mds%d\n", - i); + doutc(cl, " connecting to export targets of laggy mds%d\n", i); __open_export_target_sessions(mdsc, s); } } @@ -4898,6 +4955,7 @@ static void handle_lease(struct ceph_mds_client *mdsc, struct ceph_mds_session *session, struct ceph_msg *msg) { + struct ceph_client *cl = mdsc->fsc->client; struct super_block *sb = mdsc->fsc->sb; struct inode *inode; struct dentry *parent, *dentry; @@ -4909,7 +4967,7 @@ static void handle_lease(struct ceph_mds_client *mdsc, struct qstr dname; int release = 0;
- dout("handle_lease from mds%d\n", mds); + doutc(cl, "from mds%d\n", mds);
if (!ceph_inc_mds_stopping_blocker(mdsc, session)) return; @@ -4927,20 +4985,19 @@ static void handle_lease(struct ceph_mds_client *mdsc,
/* lookup inode */ inode = ceph_find_inode(sb, vino); - dout("handle_lease %s, ino %llx %p %.*s\n", - ceph_lease_op_name(h->action), vino.ino, inode, - dname.len, dname.name); + doutc(cl, "%s, ino %llx %p %.*s\n", ceph_lease_op_name(h->action), + vino.ino, inode, dname.len, dname.name);
mutex_lock(&session->s_mutex); if (!inode) { - dout("handle_lease no inode %llx\n", vino.ino); + doutc(cl, "no inode %llx\n", vino.ino); goto release; }
/* dentry */ parent = d_find_alias(inode); if (!parent) { - dout("no parent dentry on inode %p\n", inode); + doutc(cl, "no parent dentry on inode %p\n", inode); WARN_ON(1); goto release; /* hrm... */ } @@ -5000,7 +5057,7 @@ static void handle_lease(struct ceph_mds_client *mdsc, bad: ceph_dec_mds_stopping_blocker(mdsc);
- pr_err("corrupt lease message\n"); + pr_err_client(cl, "corrupt lease message\n"); ceph_msg_dump(msg); }
@@ -5008,13 +5065,14 @@ void ceph_mdsc_lease_send_msg(struct ceph_mds_session *session, struct dentry *dentry, char action, u32 seq) { + struct ceph_client *cl = session->s_mdsc->fsc->client; struct ceph_msg *msg; struct ceph_mds_lease *lease; struct inode *dir; int len = sizeof(*lease) + sizeof(u32) + NAME_MAX;
- dout("lease_send_msg identry %p %s to mds%d\n", - dentry, ceph_lease_op_name(action), session->s_mds); + doutc(cl, "identry %p %s to mds%d\n", dentry, ceph_lease_op_name(action), + session->s_mds);
msg = ceph_msg_new(CEPH_MSG_CLIENT_LEASE, len, GFP_NOFS, false); if (!msg) @@ -5047,6 +5105,7 @@ static void lock_unlock_session(struct ceph_mds_session *s)
static void maybe_recover_session(struct ceph_mds_client *mdsc) { + struct ceph_client *cl = mdsc->fsc->client; struct ceph_fs_client *fsc = mdsc->fsc;
if (!ceph_test_mount_opt(fsc, CLEANRECOVER)) @@ -5058,17 +5117,19 @@ static void maybe_recover_session(struct ceph_mds_client *mdsc) if (!READ_ONCE(fsc->blocklisted)) return;
- pr_info("auto reconnect after blocklisted\n"); + pr_info_client(cl, "auto reconnect after blocklisted\n"); ceph_force_reconnect(fsc->sb); }
bool check_session_state(struct ceph_mds_session *s) { + struct ceph_client *cl = s->s_mdsc->fsc->client; + switch (s->s_state) { case CEPH_MDS_SESSION_OPEN: if (s->s_ttl && time_after(jiffies, s->s_ttl)) { s->s_state = CEPH_MDS_SESSION_HUNG; - pr_info("mds%d hung\n", s->s_mds); + pr_info_client(cl, "mds%d hung\n", s->s_mds); } break; case CEPH_MDS_SESSION_CLOSING: @@ -5088,6 +5149,8 @@ bool check_session_state(struct ceph_mds_session *s) */ void inc_session_sequence(struct ceph_mds_session *s) { + struct ceph_client *cl = s->s_mdsc->fsc->client; + lockdep_assert_held(&s->s_mutex);
s->s_seq++; @@ -5095,11 +5158,11 @@ void inc_session_sequence(struct ceph_mds_session *s) if (s->s_state == CEPH_MDS_SESSION_CLOSING) { int ret;
- dout("resending session close request for mds%d\n", s->s_mds); + doutc(cl, "resending session close request for mds%d\n", s->s_mds); ret = request_close_session(s); if (ret < 0) - pr_err("unable to close session to mds%d: %d\n", - s->s_mds, ret); + pr_err_client(cl, "unable to close session to mds%d: %d\n", + s->s_mds, ret); } }
@@ -5128,7 +5191,7 @@ static void delayed_work(struct work_struct *work) int renew_caps; int i;
- dout("mdsc delayed_work\n"); + doutc(mdsc->fsc->client, "mdsc delayed_work\n");
if (mdsc->stopping >= CEPH_MDSC_STOPPING_FLUSHED) return; @@ -5257,6 +5320,7 @@ int ceph_mdsc_init(struct ceph_fs_client *fsc) */ static void wait_requests(struct ceph_mds_client *mdsc) { + struct ceph_client *cl = mdsc->fsc->client; struct ceph_options *opts = mdsc->fsc->client->options; struct ceph_mds_request *req;
@@ -5264,25 +5328,25 @@ static void wait_requests(struct ceph_mds_client *mdsc) if (__get_oldest_req(mdsc)) { mutex_unlock(&mdsc->mutex);
- dout("wait_requests waiting for requests\n"); + doutc(cl, "waiting for requests\n"); wait_for_completion_timeout(&mdsc->safe_umount_waiters, ceph_timeout_jiffies(opts->mount_timeout));
/* tear down remaining requests */ mutex_lock(&mdsc->mutex); while ((req = __get_oldest_req(mdsc))) { - dout("wait_requests timed out on tid %llu\n", - req->r_tid); + doutc(cl, "timed out on tid %llu\n", req->r_tid); list_del_init(&req->r_wait); __unregister_request(mdsc, req); } } mutex_unlock(&mdsc->mutex); - dout("wait_requests done\n"); + doutc(cl, "done\n"); }
void send_flush_mdlog(struct ceph_mds_session *s) { + struct ceph_client *cl = s->s_mdsc->fsc->client; struct ceph_msg *msg;
/* @@ -5292,13 +5356,13 @@ void send_flush_mdlog(struct ceph_mds_session *s) return;
mutex_lock(&s->s_mutex); - dout("request mdlog flush to mds%d (%s)s seq %lld\n", s->s_mds, - ceph_session_state_name(s->s_state), s->s_seq); + doutc(cl, "request mdlog flush to mds%d (%s)s seq %lld\n", + s->s_mds, ceph_session_state_name(s->s_state), s->s_seq); msg = ceph_create_session_msg(CEPH_SESSION_REQUEST_FLUSH_MDLOG, s->s_seq); if (!msg) { - pr_err("failed to request mdlog flush to mds%d (%s) seq %lld\n", - s->s_mds, ceph_session_state_name(s->s_state), s->s_seq); + pr_err_client(cl, "failed to request mdlog flush to mds%d (%s) seq %lld\n", + s->s_mds, ceph_session_state_name(s->s_state), s->s_seq); } else { ceph_con_send(&s->s_con, msg); } @@ -5311,7 +5375,7 @@ void send_flush_mdlog(struct ceph_mds_session *s) */ void ceph_mdsc_pre_umount(struct ceph_mds_client *mdsc) { - dout("pre_umount\n"); + doutc(mdsc->fsc->client, "begin\n"); mdsc->stopping = CEPH_MDSC_STOPPING_BEGIN;
ceph_mdsc_iterate_sessions(mdsc, send_flush_mdlog, true); @@ -5326,6 +5390,7 @@ void ceph_mdsc_pre_umount(struct ceph_mds_client *mdsc) ceph_msgr_flush();
ceph_cleanup_quotarealms_inodes(mdsc); + doutc(mdsc->fsc->client, "done\n"); }
/* @@ -5334,12 +5399,13 @@ void ceph_mdsc_pre_umount(struct ceph_mds_client *mdsc) static void flush_mdlog_and_wait_mdsc_unsafe_requests(struct ceph_mds_client *mdsc, u64 want_tid) { + struct ceph_client *cl = mdsc->fsc->client; struct ceph_mds_request *req = NULL, *nextreq; struct ceph_mds_session *last_session = NULL; struct rb_node *n;
mutex_lock(&mdsc->mutex); - dout("%s want %lld\n", __func__, want_tid); + doutc(cl, "want %lld\n", want_tid); restart: req = __get_oldest_req(mdsc); while (req && req->r_tid <= want_tid) { @@ -5373,8 +5439,8 @@ static void flush_mdlog_and_wait_mdsc_unsafe_requests(struct ceph_mds_client *md } else { ceph_put_mds_session(s); } - dout("%s wait on %llu (want %llu)\n", __func__, - req->r_tid, want_tid); + doutc(cl, "wait on %llu (want %llu)\n", + req->r_tid, want_tid); wait_for_completion(&req->r_safe_completion);
mutex_lock(&mdsc->mutex); @@ -5392,17 +5458,18 @@ static void flush_mdlog_and_wait_mdsc_unsafe_requests(struct ceph_mds_client *md } mutex_unlock(&mdsc->mutex); ceph_put_mds_session(last_session); - dout("%s done\n", __func__); + doutc(cl, "done\n"); }
void ceph_mdsc_sync(struct ceph_mds_client *mdsc) { + struct ceph_client *cl = mdsc->fsc->client; u64 want_tid, want_flush;
if (READ_ONCE(mdsc->fsc->mount_state) >= CEPH_MOUNT_SHUTDOWN) return;
- dout("sync\n"); + doutc(cl, "sync\n"); mutex_lock(&mdsc->mutex); want_tid = mdsc->last_tid; mutex_unlock(&mdsc->mutex); @@ -5418,8 +5485,7 @@ void ceph_mdsc_sync(struct ceph_mds_client *mdsc) } spin_unlock(&mdsc->cap_dirty_lock);
- dout("sync want tid %lld flush_seq %lld\n", - want_tid, want_flush); + doutc(cl, "sync want tid %lld flush_seq %lld\n", want_tid, want_flush);
flush_mdlog_and_wait_mdsc_unsafe_requests(mdsc, want_tid); wait_caps_flush(mdsc, want_flush); @@ -5441,11 +5507,12 @@ static bool done_closing_sessions(struct ceph_mds_client *mdsc, int skipped) void ceph_mdsc_close_sessions(struct ceph_mds_client *mdsc) { struct ceph_options *opts = mdsc->fsc->client->options; + struct ceph_client *cl = mdsc->fsc->client; struct ceph_mds_session *session; int i; int skipped = 0;
- dout("close_sessions\n"); + doutc(cl, "begin\n");
/* close sessions */ mutex_lock(&mdsc->mutex); @@ -5463,7 +5530,7 @@ void ceph_mdsc_close_sessions(struct ceph_mds_client *mdsc) } mutex_unlock(&mdsc->mutex);
- dout("waiting for sessions to close\n"); + doutc(cl, "waiting for sessions to close\n"); wait_event_timeout(mdsc->session_close_wq, done_closing_sessions(mdsc, skipped), ceph_timeout_jiffies(opts->mount_timeout)); @@ -5491,7 +5558,7 @@ void ceph_mdsc_close_sessions(struct ceph_mds_client *mdsc) cancel_work_sync(&mdsc->cap_reclaim_work); cancel_delayed_work_sync(&mdsc->delayed_work); /* cancel timer */
- dout("stopped\n"); + doutc(cl, "done\n"); }
void ceph_mdsc_force_umount(struct ceph_mds_client *mdsc) @@ -5499,7 +5566,7 @@ void ceph_mdsc_force_umount(struct ceph_mds_client *mdsc) struct ceph_mds_session *session; int mds;
- dout("force umount\n"); + doutc(mdsc->fsc->client, "force umount\n");
mutex_lock(&mdsc->mutex); for (mds = 0; mds < mdsc->max_sessions; mds++) { @@ -5530,7 +5597,7 @@ void ceph_mdsc_force_umount(struct ceph_mds_client *mdsc)
static void ceph_mdsc_stop(struct ceph_mds_client *mdsc) { - dout("stop\n"); + doutc(mdsc->fsc->client, "stop\n"); /* * Make sure the delayed work stopped before releasing * the resources. @@ -5551,7 +5618,7 @@ static void ceph_mdsc_stop(struct ceph_mds_client *mdsc) void ceph_mdsc_destroy(struct ceph_fs_client *fsc) { struct ceph_mds_client *mdsc = fsc->mdsc; - dout("mdsc_destroy %p\n", mdsc); + doutc(fsc->client, "%p\n", mdsc);
if (!mdsc) return; @@ -5565,12 +5632,13 @@ void ceph_mdsc_destroy(struct ceph_fs_client *fsc)
fsc->mdsc = NULL; kfree(mdsc); - dout("mdsc_destroy %p done\n", mdsc); + doutc(fsc->client, "%p done\n", mdsc); }
void ceph_mdsc_handle_fsmap(struct ceph_mds_client *mdsc, struct ceph_msg *msg) { struct ceph_fs_client *fsc = mdsc->fsc; + struct ceph_client *cl = fsc->client; const char *mds_namespace = fsc->mount_options->mds_namespace; void *p = msg->front.iov_base; void *end = p + msg->front.iov_len; @@ -5582,7 +5650,7 @@ void ceph_mdsc_handle_fsmap(struct ceph_mds_client *mdsc, struct ceph_msg *msg) ceph_decode_need(&p, end, sizeof(u32), bad); epoch = ceph_decode_32(&p);
- dout("handle_fsmap epoch %u\n", epoch); + doutc(cl, "epoch %u\n", epoch);
/* struct_v, struct_cv, map_len, epoch, legacy_client_fscid */ ceph_decode_skip_n(&p, end, 2 + sizeof(u32) * 3, bad); @@ -5627,7 +5695,8 @@ void ceph_mdsc_handle_fsmap(struct ceph_mds_client *mdsc, struct ceph_msg *msg) return;
bad: - pr_err("error decoding fsmap %d. Shutting down mount.\n", err); + pr_err_client(cl, "error decoding fsmap %d. Shutting down mount.\n", + err); ceph_umount_begin(mdsc->fsc->sb); ceph_msg_dump(msg); err_out: @@ -5642,6 +5711,7 @@ void ceph_mdsc_handle_fsmap(struct ceph_mds_client *mdsc, struct ceph_msg *msg) */ void ceph_mdsc_handle_mdsmap(struct ceph_mds_client *mdsc, struct ceph_msg *msg) { + struct ceph_client *cl = mdsc->fsc->client; u32 epoch; u32 maplen; void *p = msg->front.iov_base; @@ -5656,13 +5726,12 @@ void ceph_mdsc_handle_mdsmap(struct ceph_mds_client *mdsc, struct ceph_msg *msg) return; epoch = ceph_decode_32(&p); maplen = ceph_decode_32(&p); - dout("handle_map epoch %u len %d\n", epoch, (int)maplen); + doutc(cl, "epoch %u len %d\n", epoch, (int)maplen);
/* do we need it? */ mutex_lock(&mdsc->mutex); if (mdsc->mdsmap && epoch <= mdsc->mdsmap->m_epoch) { - dout("handle_map epoch %u <= our %u\n", - epoch, mdsc->mdsmap->m_epoch); + doutc(cl, "epoch %u <= our %u\n", epoch, mdsc->mdsmap->m_epoch); mutex_unlock(&mdsc->mutex); return; } @@ -5696,7 +5765,8 @@ void ceph_mdsc_handle_mdsmap(struct ceph_mds_client *mdsc, struct ceph_msg *msg) bad_unlock: mutex_unlock(&mdsc->mutex); bad: - pr_err("error decoding mdsmap %d. Shutting down mount.\n", err); + pr_err_client(cl, "error decoding mdsmap %d. Shutting down mount.\n", + err); ceph_umount_begin(mdsc->fsc->sb); ceph_msg_dump(msg); return; @@ -5727,7 +5797,8 @@ static void mds_peer_reset(struct ceph_connection *con) struct ceph_mds_session *s = con->private; struct ceph_mds_client *mdsc = s->s_mdsc;
- pr_warn("mds%d closed our session\n", s->s_mds); + pr_warn_client(mdsc->fsc->client, "mds%d closed our session\n", + s->s_mds); if (READ_ONCE(mdsc->fsc->mount_state) != CEPH_MOUNT_FENCE_IO) send_mds_reconnect(mdsc, s); } @@ -5736,6 +5807,7 @@ static void mds_dispatch(struct ceph_connection *con, struct ceph_msg *msg) { struct ceph_mds_session *s = con->private; struct ceph_mds_client *mdsc = s->s_mdsc; + struct ceph_client *cl = mdsc->fsc->client; int type = le16_to_cpu(msg->hdr.type);
mutex_lock(&mdsc->mutex); @@ -5775,8 +5847,8 @@ static void mds_dispatch(struct ceph_connection *con, struct ceph_msg *msg) break;
default: - pr_err("received unknown message type %d %s\n", type, - ceph_msg_type_name(type)); + pr_err_client(cl, "received unknown message type %d %s\n", + type, ceph_msg_type_name(type)); } out: ceph_msg_put(msg); diff --git a/fs/ceph/mdsmap.c b/fs/ceph/mdsmap.c index 66afb18df76b..a476d5e6d39f 100644 --- a/fs/ceph/mdsmap.c +++ b/fs/ceph/mdsmap.c @@ -11,6 +11,7 @@ #include <linux/ceph/messenger.h> #include <linux/ceph/decode.h>
+#include "mds_client.h" #include "super.h"
#define CEPH_MDS_IS_READY(i, ignore_laggy) \ @@ -117,6 +118,7 @@ static int __decode_and_drop_compat_set(void **p, void* end) struct ceph_mdsmap *ceph_mdsmap_decode(struct ceph_mds_client *mdsc, void **p, void *end, bool msgr2) { + struct ceph_client *cl = mdsc->fsc->client; struct ceph_mdsmap *m; const void *start = *p; int i, j, n; @@ -234,20 +236,18 @@ struct ceph_mdsmap *ceph_mdsmap_decode(struct ceph_mds_client *mdsc, void **p, *p = info_end; }
- dout("mdsmap_decode %d/%d %lld mds%d.%d %s %s%s\n", - i+1, n, global_id, mds, inc, - ceph_pr_addr(&addr), - ceph_mds_state_name(state), - laggy ? "(laggy)" : ""); + doutc(cl, "%d/%d %lld mds%d.%d %s %s%s\n", i+1, n, global_id, + mds, inc, ceph_pr_addr(&addr), + ceph_mds_state_name(state), laggy ? "(laggy)" : "");
if (mds < 0 || mds >= m->possible_max_rank) { - pr_warn("mdsmap_decode got incorrect mds(%d)\n", mds); + pr_warn_client(cl, "got incorrect mds(%d)\n", mds); continue; }
if (state <= 0) { - dout("mdsmap_decode got incorrect state(%s)\n", - ceph_mds_state_name(state)); + doutc(cl, "got incorrect state(%s)\n", + ceph_mds_state_name(state)); continue; }
@@ -387,16 +387,16 @@ struct ceph_mdsmap *ceph_mdsmap_decode(struct ceph_mds_client *mdsc, void **p, ceph_decode_64_safe(p, end, m->m_max_xattr_size, bad_ext); } bad_ext: - dout("mdsmap_decode m_enabled: %d, m_damaged: %d, m_num_laggy: %d\n", - !!m->m_enabled, !!m->m_damaged, m->m_num_laggy); + doutc(cl, "m_enabled: %d, m_damaged: %d, m_num_laggy: %d\n", + !!m->m_enabled, !!m->m_damaged, m->m_num_laggy); *p = end; - dout("mdsmap_decode success epoch %u\n", m->m_epoch); + doutc(cl, "success epoch %u\n", m->m_epoch); return m; nomem: err = -ENOMEM; goto out_err; corrupt: - pr_err("corrupt mdsmap\n"); + pr_err_client(cl, "corrupt mdsmap\n"); print_hex_dump(KERN_DEBUG, "mdsmap: ", DUMP_PREFIX_OFFSET, 16, 1, start, end - start, true); diff --git a/fs/ceph/metric.c b/fs/ceph/metric.c index 6d3584f16f9a..871c1090e520 100644 --- a/fs/ceph/metric.c +++ b/fs/ceph/metric.c @@ -31,6 +31,7 @@ static bool ceph_mdsc_send_metrics(struct ceph_mds_client *mdsc, struct ceph_client_metric *m = &mdsc->metric; u64 nr_caps = atomic64_read(&m->total_caps); u32 header_len = sizeof(struct ceph_metric_header); + struct ceph_client *cl = mdsc->fsc->client; struct ceph_msg *msg; s64 sum; s32 items = 0; @@ -51,8 +52,8 @@ static bool ceph_mdsc_send_metrics(struct ceph_mds_client *mdsc,
msg = ceph_msg_new(CEPH_MSG_CLIENT_METRICS, len, GFP_NOFS, true); if (!msg) { - pr_err("send metrics to mds%d, failed to allocate message\n", - s->s_mds); + pr_err_client(cl, "to mds%d, failed to allocate message\n", + s->s_mds); return false; }
diff --git a/fs/ceph/quota.c b/fs/ceph/quota.c index ca4932e6f71b..06ee397e0c3a 100644 --- a/fs/ceph/quota.c +++ b/fs/ceph/quota.c @@ -43,6 +43,7 @@ void ceph_handle_quota(struct ceph_mds_client *mdsc, { struct super_block *sb = mdsc->fsc->sb; struct ceph_mds_quota *h = msg->front.iov_base; + struct ceph_client *cl = mdsc->fsc->client; struct ceph_vino vino; struct inode *inode; struct ceph_inode_info *ci; @@ -51,8 +52,8 @@ void ceph_handle_quota(struct ceph_mds_client *mdsc, return;
if (msg->front.iov_len < sizeof(*h)) { - pr_err("%s corrupt message mds%d len %d\n", __func__, - session->s_mds, (int)msg->front.iov_len); + pr_err_client(cl, "corrupt message mds%d len %d\n", + session->s_mds, (int)msg->front.iov_len); ceph_msg_dump(msg); goto out; } @@ -62,7 +63,7 @@ void ceph_handle_quota(struct ceph_mds_client *mdsc, vino.snap = CEPH_NOSNAP; inode = ceph_find_inode(sb, vino); if (!inode) { - pr_warn("Failed to find inode %llu\n", vino.ino); + pr_warn_client(cl, "failed to find inode %llx\n", vino.ino); goto out; } ci = ceph_inode(inode); @@ -85,6 +86,7 @@ find_quotarealm_inode(struct ceph_mds_client *mdsc, u64 ino) { struct ceph_quotarealm_inode *qri = NULL; struct rb_node **node, *parent = NULL; + struct ceph_client *cl = mdsc->fsc->client;
mutex_lock(&mdsc->quotarealms_inodes_mutex); node = &(mdsc->quotarealms_inodes.rb_node); @@ -110,7 +112,7 @@ find_quotarealm_inode(struct ceph_mds_client *mdsc, u64 ino) rb_link_node(&qri->node, parent, node); rb_insert_color(&qri->node, &mdsc->quotarealms_inodes); } else - pr_warn("Failed to alloc quotarealms_inode\n"); + pr_warn_client(cl, "Failed to alloc quotarealms_inode\n"); } mutex_unlock(&mdsc->quotarealms_inodes_mutex);
@@ -129,6 +131,7 @@ static struct inode *lookup_quotarealm_inode(struct ceph_mds_client *mdsc, struct super_block *sb, struct ceph_snap_realm *realm) { + struct ceph_client *cl = mdsc->fsc->client; struct ceph_quotarealm_inode *qri; struct inode *in;
@@ -161,8 +164,8 @@ static struct inode *lookup_quotarealm_inode(struct ceph_mds_client *mdsc, }
if (IS_ERR(in)) { - dout("Can't lookup inode %llx (err: %ld)\n", - realm->ino, PTR_ERR(in)); + doutc(cl, "Can't lookup inode %llx (err: %ld)\n", realm->ino, + PTR_ERR(in)); qri->timeout = jiffies + msecs_to_jiffies(60 * 1000); /* XXX */ } else { qri->timeout = 0; @@ -212,6 +215,7 @@ static int get_quota_realm(struct ceph_mds_client *mdsc, struct inode *inode, enum quota_get_realm which_quota, struct ceph_snap_realm **realmp, bool retry) { + struct ceph_client *cl = mdsc->fsc->client; struct ceph_inode_info *ci = NULL; struct ceph_snap_realm *realm, *next; struct inode *in; @@ -227,8 +231,9 @@ static int get_quota_realm(struct ceph_mds_client *mdsc, struct inode *inode, if (realm) ceph_get_snap_realm(mdsc, realm); else - pr_err_ratelimited("get_quota_realm: ino (%llx.%llx) " - "null i_snap_realm\n", ceph_vinop(inode)); + pr_err_ratelimited_client(cl, + "%p %llx.%llx null i_snap_realm\n", + inode, ceph_vinop(inode)); while (realm) { bool has_inode;
@@ -322,6 +327,7 @@ static bool check_quota_exceeded(struct inode *inode, enum quota_check_op op, loff_t delta) { struct ceph_mds_client *mdsc = ceph_sb_to_mdsc(inode->i_sb); + struct ceph_client *cl = mdsc->fsc->client; struct ceph_inode_info *ci; struct ceph_snap_realm *realm, *next; struct inode *in; @@ -337,8 +343,9 @@ static bool check_quota_exceeded(struct inode *inode, enum quota_check_op op, if (realm) ceph_get_snap_realm(mdsc, realm); else - pr_err_ratelimited("check_quota_exceeded: ino (%llx.%llx) " - "null i_snap_realm\n", ceph_vinop(inode)); + pr_err_ratelimited_client(cl, + "%p %llx.%llx null i_snap_realm\n", + inode, ceph_vinop(inode)); while (realm) { bool has_inode;
@@ -388,7 +395,7 @@ static bool check_quota_exceeded(struct inode *inode, enum quota_check_op op, break; default: /* Shouldn't happen */ - pr_warn("Invalid quota check op (%d)\n", op); + pr_warn_client(cl, "Invalid quota check op (%d)\n", op); exceeded = true; /* Just break the loop */ } iput(in); diff --git a/fs/ceph/snap.c b/fs/ceph/snap.c index d0d3612f28f0..73c526954749 100644 --- a/fs/ceph/snap.c +++ b/fs/ceph/snap.c @@ -138,7 +138,7 @@ static struct ceph_snap_realm *ceph_create_snap_realm( __insert_snap_realm(&mdsc->snap_realms, realm); mdsc->num_snap_realms++;
- dout("%s %llx %p\n", __func__, realm->ino, realm); + doutc(mdsc->fsc->client, "%llx %p\n", realm->ino, realm); return realm; }
@@ -150,6 +150,7 @@ static struct ceph_snap_realm *ceph_create_snap_realm( static struct ceph_snap_realm *__lookup_snap_realm(struct ceph_mds_client *mdsc, u64 ino) { + struct ceph_client *cl = mdsc->fsc->client; struct rb_node *n = mdsc->snap_realms.rb_node; struct ceph_snap_realm *r;
@@ -162,7 +163,7 @@ static struct ceph_snap_realm *__lookup_snap_realm(struct ceph_mds_client *mdsc, else if (ino > r->ino) n = n->rb_right; else { - dout("%s %llx %p\n", __func__, r->ino, r); + doutc(cl, "%llx %p\n", r->ino, r); return r; } } @@ -188,9 +189,10 @@ static void __put_snap_realm(struct ceph_mds_client *mdsc, static void __destroy_snap_realm(struct ceph_mds_client *mdsc, struct ceph_snap_realm *realm) { + struct ceph_client *cl = mdsc->fsc->client; lockdep_assert_held_write(&mdsc->snap_rwsem);
- dout("%s %p %llx\n", __func__, realm, realm->ino); + doutc(cl, "%p %llx\n", realm, realm->ino);
rb_erase(&realm->node, &mdsc->snap_realms); mdsc->num_snap_realms--; @@ -290,6 +292,7 @@ static int adjust_snap_realm_parent(struct ceph_mds_client *mdsc, struct ceph_snap_realm *realm, u64 parentino) { + struct ceph_client *cl = mdsc->fsc->client; struct ceph_snap_realm *parent;
lockdep_assert_held_write(&mdsc->snap_rwsem); @@ -303,8 +306,8 @@ static int adjust_snap_realm_parent(struct ceph_mds_client *mdsc, if (IS_ERR(parent)) return PTR_ERR(parent); } - dout("%s %llx %p: %llx %p -> %llx %p\n", __func__, realm->ino, - realm, realm->parent_ino, realm->parent, parentino, parent); + doutc(cl, "%llx %p: %llx %p -> %llx %p\n", realm->ino, realm, + realm->parent_ino, realm->parent, parentino, parent); if (realm->parent) { list_del_init(&realm->child_item); ceph_put_snap_realm(mdsc, realm->parent); @@ -334,6 +337,7 @@ static int build_snap_context(struct ceph_mds_client *mdsc, struct list_head *realm_queue, struct list_head *dirty_realms) { + struct ceph_client *cl = mdsc->fsc->client; struct ceph_snap_realm *parent = realm->parent; struct ceph_snap_context *snapc; int err = 0; @@ -361,10 +365,10 @@ static int build_snap_context(struct ceph_mds_client *mdsc, realm->cached_context->seq == realm->seq && (!parent || realm->cached_context->seq >= parent->cached_context->seq)) { - dout("%s %llx %p: %p seq %lld (%u snaps) (unchanged)\n", - __func__, realm->ino, realm, realm->cached_context, - realm->cached_context->seq, - (unsigned int)realm->cached_context->num_snaps); + doutc(cl, "%llx %p: %p seq %lld (%u snaps) (unchanged)\n", + realm->ino, realm, realm->cached_context, + realm->cached_context->seq, + (unsigned int)realm->cached_context->num_snaps); return 0; }
@@ -401,8 +405,8 @@ static int build_snap_context(struct ceph_mds_client *mdsc,
sort(snapc->snaps, num, sizeof(u64), cmpu64_rev, NULL); snapc->num_snaps = num; - dout("%s %llx %p: %p seq %lld (%u snaps)\n", __func__, realm->ino, - realm, snapc, snapc->seq, (unsigned int) snapc->num_snaps); + doutc(cl, "%llx %p: %p seq %lld (%u snaps)\n", realm->ino, realm, + snapc, snapc->seq, (unsigned int) snapc->num_snaps);
ceph_put_snap_context(realm->cached_context); realm->cached_context = snapc; @@ -419,7 +423,7 @@ static int build_snap_context(struct ceph_mds_client *mdsc, ceph_put_snap_context(realm->cached_context); realm->cached_context = NULL; } - pr_err("%s %llx %p fail %d\n", __func__, realm->ino, realm, err); + pr_err_client(cl, "%llx %p fail %d\n", realm->ino, realm, err); return err; }
@@ -430,6 +434,7 @@ static void rebuild_snap_realms(struct ceph_mds_client *mdsc, struct ceph_snap_realm *realm, struct list_head *dirty_realms) { + struct ceph_client *cl = mdsc->fsc->client; LIST_HEAD(realm_queue); int last = 0; bool skip = false; @@ -455,8 +460,8 @@ static void rebuild_snap_realms(struct ceph_mds_client *mdsc,
last = build_snap_context(mdsc, _realm, &realm_queue, dirty_realms); - dout("%s %llx %p, %s\n", __func__, _realm->ino, _realm, - last > 0 ? "is deferred" : !last ? "succeeded" : "failed"); + doutc(cl, "%llx %p, %s\n", realm->ino, realm, + last > 0 ? "is deferred" : !last ? "succeeded" : "failed");
/* is any child in the list ? */ list_for_each_entry(child, &_realm->children, child_item) { @@ -526,6 +531,7 @@ static void ceph_queue_cap_snap(struct ceph_inode_info *ci, struct ceph_cap_snap **pcapsnap) { struct inode *inode = &ci->netfs.inode; + struct ceph_client *cl = ceph_inode_to_client(inode); struct ceph_snap_context *old_snapc, *new_snapc; struct ceph_cap_snap *capsnap = *pcapsnap; struct ceph_buffer *old_blob = NULL; @@ -551,14 +557,14 @@ static void ceph_queue_cap_snap(struct ceph_inode_info *ci, as no new writes are allowed to start when pending, so any writes in progress now were started before the previous cap_snap. lucky us. */ - dout("%s %p %llx.%llx already pending\n", - __func__, inode, ceph_vinop(inode)); + doutc(cl, "%p %llx.%llx already pending\n", inode, + ceph_vinop(inode)); goto update_snapc; } if (ci->i_wrbuffer_ref_head == 0 && !(dirty & (CEPH_CAP_ANY_EXCL|CEPH_CAP_FILE_WR))) { - dout("%s %p %llx.%llx nothing dirty|writing\n", - __func__, inode, ceph_vinop(inode)); + doutc(cl, "%p %llx.%llx nothing dirty|writing\n", inode, + ceph_vinop(inode)); goto update_snapc; }
@@ -578,15 +584,15 @@ static void ceph_queue_cap_snap(struct ceph_inode_info *ci, } else { if (!(used & CEPH_CAP_FILE_WR) && ci->i_wrbuffer_ref_head == 0) { - dout("%s %p %llx.%llx no new_snap|dirty_page|writing\n", - __func__, inode, ceph_vinop(inode)); + doutc(cl, "%p %llx.%llx no new_snap|dirty_page|writing\n", + inode, ceph_vinop(inode)); goto update_snapc; } }
- dout("%s %p %llx.%llx cap_snap %p queuing under %p %s %s\n", - __func__, inode, ceph_vinop(inode), capsnap, old_snapc, - ceph_cap_string(dirty), capsnap->need_flush ? "" : "no_flush"); + doutc(cl, "%p %llx.%llx cap_snap %p queuing under %p %s %s\n", + inode, ceph_vinop(inode), capsnap, old_snapc, + ceph_cap_string(dirty), capsnap->need_flush ? "" : "no_flush"); ihold(inode);
capsnap->follows = old_snapc->seq; @@ -618,9 +624,9 @@ static void ceph_queue_cap_snap(struct ceph_inode_info *ci, list_add_tail(&capsnap->ci_item, &ci->i_cap_snaps);
if (used & CEPH_CAP_FILE_WR) { - dout("%s %p %llx.%llx cap_snap %p snapc %p seq %llu used WR," - " now pending\n", __func__, inode, ceph_vinop(inode), - capsnap, old_snapc, old_snapc->seq); + doutc(cl, "%p %llx.%llx cap_snap %p snapc %p seq %llu used WR," + " now pending\n", inode, ceph_vinop(inode), capsnap, + old_snapc, old_snapc->seq); capsnap->writing = 1; } else { /* note mtime, size NOW. */ @@ -637,7 +643,7 @@ static void ceph_queue_cap_snap(struct ceph_inode_info *ci, ci->i_head_snapc = NULL; } else { ci->i_head_snapc = ceph_get_snap_context(new_snapc); - dout(" new snapc is %p\n", new_snapc); + doutc(cl, " new snapc is %p\n", new_snapc); } spin_unlock(&ci->i_ceph_lock);
@@ -658,6 +664,7 @@ int __ceph_finish_cap_snap(struct ceph_inode_info *ci, { struct inode *inode = &ci->netfs.inode; struct ceph_mds_client *mdsc = ceph_sb_to_mdsc(inode->i_sb); + struct ceph_client *cl = mdsc->fsc->client;
BUG_ON(capsnap->writing); capsnap->size = i_size_read(inode); @@ -670,11 +677,12 @@ int __ceph_finish_cap_snap(struct ceph_inode_info *ci, capsnap->truncate_size = ci->i_truncate_size; capsnap->truncate_seq = ci->i_truncate_seq; if (capsnap->dirty_pages) { - dout("%s %p %llx.%llx cap_snap %p snapc %p %llu %s s=%llu " - "still has %d dirty pages\n", __func__, inode, - ceph_vinop(inode), capsnap, capsnap->context, - capsnap->context->seq, ceph_cap_string(capsnap->dirty), - capsnap->size, capsnap->dirty_pages); + doutc(cl, "%p %llx.%llx cap_snap %p snapc %p %llu %s " + "s=%llu still has %d dirty pages\n", inode, + ceph_vinop(inode), capsnap, capsnap->context, + capsnap->context->seq, + ceph_cap_string(capsnap->dirty), + capsnap->size, capsnap->dirty_pages); return 0; }
@@ -683,20 +691,20 @@ int __ceph_finish_cap_snap(struct ceph_inode_info *ci, * And trigger to flush the buffer immediately. */ if (ci->i_wrbuffer_ref) { - dout("%s %p %llx.%llx cap_snap %p snapc %p %llu %s s=%llu " - "used WRBUFFER, delaying\n", __func__, inode, - ceph_vinop(inode), capsnap, capsnap->context, - capsnap->context->seq, ceph_cap_string(capsnap->dirty), - capsnap->size); + doutc(cl, "%p %llx.%llx cap_snap %p snapc %p %llu %s " + "s=%llu used WRBUFFER, delaying\n", inode, + ceph_vinop(inode), capsnap, capsnap->context, + capsnap->context->seq, ceph_cap_string(capsnap->dirty), + capsnap->size); ceph_queue_writeback(inode); return 0; }
ci->i_ceph_flags |= CEPH_I_FLUSH_SNAPS; - dout("%s %p %llx.%llx cap_snap %p snapc %p %llu %s s=%llu\n", - __func__, inode, ceph_vinop(inode), capsnap, capsnap->context, - capsnap->context->seq, ceph_cap_string(capsnap->dirty), - capsnap->size); + doutc(cl, "%p %llx.%llx cap_snap %p snapc %p %llu %s s=%llu\n", + inode, ceph_vinop(inode), capsnap, capsnap->context, + capsnap->context->seq, ceph_cap_string(capsnap->dirty), + capsnap->size);
spin_lock(&mdsc->snap_flush_lock); if (list_empty(&ci->i_snap_flush_item)) { @@ -714,11 +722,12 @@ int __ceph_finish_cap_snap(struct ceph_inode_info *ci, static void queue_realm_cap_snaps(struct ceph_mds_client *mdsc, struct ceph_snap_realm *realm) { + struct ceph_client *cl = mdsc->fsc->client; struct ceph_inode_info *ci; struct inode *lastinode = NULL; struct ceph_cap_snap *capsnap = NULL;
- dout("%s %p %llx inode\n", __func__, realm, realm->ino); + doutc(cl, "%p %llx inode\n", realm, realm->ino);
spin_lock(&realm->inodes_with_caps_lock); list_for_each_entry(ci, &realm->inodes_with_caps, i_snap_realm_item) { @@ -737,8 +746,9 @@ static void queue_realm_cap_snaps(struct ceph_mds_client *mdsc, if (!capsnap) { capsnap = kmem_cache_zalloc(ceph_cap_snap_cachep, GFP_NOFS); if (!capsnap) { - pr_err("ENOMEM allocating ceph_cap_snap on %p\n", - inode); + pr_err_client(cl, + "ENOMEM allocating ceph_cap_snap on %p\n", + inode); return; } } @@ -756,7 +766,7 @@ static void queue_realm_cap_snaps(struct ceph_mds_client *mdsc,
if (capsnap) kmem_cache_free(ceph_cap_snap_cachep, capsnap); - dout("%s %p %llx done\n", __func__, realm, realm->ino); + doutc(cl, "%p %llx done\n", realm, realm->ino); }
/* @@ -770,6 +780,7 @@ int ceph_update_snap_trace(struct ceph_mds_client *mdsc, void *p, void *e, bool deletion, struct ceph_snap_realm **realm_ret) { + struct ceph_client *cl = mdsc->fsc->client; struct ceph_mds_snap_realm *ri; /* encoded */ __le64 *snaps; /* encoded */ __le64 *prior_parent_snaps; /* encoded */ @@ -784,7 +795,7 @@ int ceph_update_snap_trace(struct ceph_mds_client *mdsc,
lockdep_assert_held_write(&mdsc->snap_rwsem);
- dout("%s deletion=%d\n", __func__, deletion); + doutc(cl, "deletion=%d\n", deletion); more: realm = NULL; rebuild_snapcs = 0; @@ -814,8 +825,8 @@ int ceph_update_snap_trace(struct ceph_mds_client *mdsc, rebuild_snapcs += err;
if (le64_to_cpu(ri->seq) > realm->seq) { - dout("%s updating %llx %p %lld -> %lld\n", __func__, - realm->ino, realm, realm->seq, le64_to_cpu(ri->seq)); + doutc(cl, "updating %llx %p %lld -> %lld\n", realm->ino, + realm, realm->seq, le64_to_cpu(ri->seq)); /* update realm parameters, snap lists */ realm->seq = le64_to_cpu(ri->seq); realm->created = le64_to_cpu(ri->created); @@ -838,16 +849,16 @@ int ceph_update_snap_trace(struct ceph_mds_client *mdsc,
rebuild_snapcs = 1; } else if (!realm->cached_context) { - dout("%s %llx %p seq %lld new\n", __func__, - realm->ino, realm, realm->seq); + doutc(cl, "%llx %p seq %lld new\n", realm->ino, realm, + realm->seq); rebuild_snapcs = 1; } else { - dout("%s %llx %p seq %lld unchanged\n", __func__, - realm->ino, realm, realm->seq); + doutc(cl, "%llx %p seq %lld unchanged\n", realm->ino, realm, + realm->seq); }
- dout("done with %llx %p, rebuild_snapcs=%d, %p %p\n", realm->ino, - realm, rebuild_snapcs, p, e); + doutc(cl, "done with %llx %p, rebuild_snapcs=%d, %p %p\n", realm->ino, + realm, rebuild_snapcs, p, e);
/* * this will always track the uppest parent realm from which @@ -895,7 +906,7 @@ int ceph_update_snap_trace(struct ceph_mds_client *mdsc, ceph_put_snap_realm(mdsc, realm); if (first_realm) ceph_put_snap_realm(mdsc, first_realm); - pr_err("%s error %d\n", __func__, err); + pr_err_client(cl, "error %d\n", err);
/* * When receiving a corrupted snap trace we don't know what @@ -909,11 +920,12 @@ int ceph_update_snap_trace(struct ceph_mds_client *mdsc, WRITE_ONCE(mdsc->fsc->mount_state, CEPH_MOUNT_FENCE_IO); ret = ceph_monc_blocklist_add(&client->monc, &client->msgr.inst.addr); if (ret) - pr_err("%s failed to blocklist %s: %d\n", __func__, - ceph_pr_addr(&client->msgr.inst.addr), ret); + pr_err_client(cl, "failed to blocklist %s: %d\n", + ceph_pr_addr(&client->msgr.inst.addr), ret);
- WARN(1, "%s: %s%sdo remount to continue%s", - __func__, ret ? "" : ceph_pr_addr(&client->msgr.inst.addr), + WARN(1, "[client.%lld] %s %s%sdo remount to continue%s", + client->monc.auth->global_id, __func__, + ret ? "" : ceph_pr_addr(&client->msgr.inst.addr), ret ? "" : " was blocklisted, ", err == -EIO ? " after corrupted snaptrace is fixed" : "");
@@ -929,11 +941,12 @@ int ceph_update_snap_trace(struct ceph_mds_client *mdsc, */ static void flush_snaps(struct ceph_mds_client *mdsc) { + struct ceph_client *cl = mdsc->fsc->client; struct ceph_inode_info *ci; struct inode *inode; struct ceph_mds_session *session = NULL;
- dout("%s\n", __func__); + doutc(cl, "begin\n"); spin_lock(&mdsc->snap_flush_lock); while (!list_empty(&mdsc->snap_flush_list)) { ci = list_first_entry(&mdsc->snap_flush_list, @@ -948,7 +961,7 @@ static void flush_snaps(struct ceph_mds_client *mdsc) spin_unlock(&mdsc->snap_flush_lock);
ceph_put_mds_session(session); - dout("%s done\n", __func__); + doutc(cl, "done\n"); }
/** @@ -1004,6 +1017,7 @@ void ceph_handle_snap(struct ceph_mds_client *mdsc, struct ceph_mds_session *session, struct ceph_msg *msg) { + struct ceph_client *cl = mdsc->fsc->client; struct super_block *sb = mdsc->fsc->sb; int mds = session->s_mds; u64 split; @@ -1034,8 +1048,8 @@ void ceph_handle_snap(struct ceph_mds_client *mdsc, trace_len = le32_to_cpu(h->trace_len); p += sizeof(*h);
- dout("%s from mds%d op %s split %llx tracelen %d\n", __func__, - mds, ceph_snap_op_name(op), split, trace_len); + doutc(cl, "from mds%d op %s split %llx tracelen %d\n", mds, + ceph_snap_op_name(op), split, trace_len);
down_write(&mdsc->snap_rwsem); locked_rwsem = 1; @@ -1066,7 +1080,7 @@ void ceph_handle_snap(struct ceph_mds_client *mdsc, goto out; }
- dout("splitting snap_realm %llx %p\n", realm->ino, realm); + doutc(cl, "splitting snap_realm %llx %p\n", realm->ino, realm); for (i = 0; i < num_split_inos; i++) { struct ceph_vino vino = { .ino = le64_to_cpu(split_inos[i]), @@ -1091,13 +1105,13 @@ void ceph_handle_snap(struct ceph_mds_client *mdsc, */ if (ci->i_snap_realm->created > le64_to_cpu(ri->created)) { - dout(" leaving %p %llx.%llx in newer realm %llx %p\n", - inode, ceph_vinop(inode), ci->i_snap_realm->ino, - ci->i_snap_realm); + doutc(cl, " leaving %p %llx.%llx in newer realm %llx %p\n", + inode, ceph_vinop(inode), ci->i_snap_realm->ino, + ci->i_snap_realm); goto skip_inode; } - dout(" will move %p %llx.%llx to split realm %llx %p\n", - inode, ceph_vinop(inode), realm->ino, realm); + doutc(cl, " will move %p %llx.%llx to split realm %llx %p\n", + inode, ceph_vinop(inode), realm->ino, realm);
ceph_get_snap_realm(mdsc, realm); ceph_change_snap_realm(inode, realm); @@ -1158,7 +1172,7 @@ void ceph_handle_snap(struct ceph_mds_client *mdsc, return;
bad: - pr_err("%s corrupt snap message from mds%d\n", __func__, mds); + pr_err_client(cl, "corrupt snap message from mds%d\n", mds); ceph_msg_dump(msg); out: if (locked_rwsem) @@ -1174,6 +1188,7 @@ void ceph_handle_snap(struct ceph_mds_client *mdsc, struct ceph_snapid_map* ceph_get_snapid_map(struct ceph_mds_client *mdsc, u64 snap) { + struct ceph_client *cl = mdsc->fsc->client; struct ceph_snapid_map *sm, *exist; struct rb_node **p, *parent; int ret; @@ -1196,8 +1211,8 @@ struct ceph_snapid_map* ceph_get_snapid_map(struct ceph_mds_client *mdsc, } spin_unlock(&mdsc->snapid_map_lock); if (exist) { - dout("%s found snapid map %llx -> %x\n", __func__, - exist->snap, exist->dev); + doutc(cl, "found snapid map %llx -> %x\n", exist->snap, + exist->dev); return exist; }
@@ -1241,13 +1256,12 @@ struct ceph_snapid_map* ceph_get_snapid_map(struct ceph_mds_client *mdsc, if (exist) { free_anon_bdev(sm->dev); kfree(sm); - dout("%s found snapid map %llx -> %x\n", __func__, - exist->snap, exist->dev); + doutc(cl, "found snapid map %llx -> %x\n", exist->snap, + exist->dev); return exist; }
- dout("%s create snapid map %llx -> %x\n", __func__, - sm->snap, sm->dev); + doutc(cl, "create snapid map %llx -> %x\n", sm->snap, sm->dev); return sm; }
@@ -1272,6 +1286,7 @@ void ceph_put_snapid_map(struct ceph_mds_client* mdsc,
void ceph_trim_snapid_map(struct ceph_mds_client *mdsc) { + struct ceph_client *cl = mdsc->fsc->client; struct ceph_snapid_map *sm; unsigned long now; LIST_HEAD(to_free); @@ -1293,7 +1308,7 @@ void ceph_trim_snapid_map(struct ceph_mds_client *mdsc) while (!list_empty(&to_free)) { sm = list_first_entry(&to_free, struct ceph_snapid_map, lru); list_del(&sm->lru); - dout("trim snapid map %llx -> %x\n", sm->snap, sm->dev); + doutc(cl, "trim snapid map %llx -> %x\n", sm->snap, sm->dev); free_anon_bdev(sm->dev); kfree(sm); } @@ -1301,6 +1316,7 @@ void ceph_trim_snapid_map(struct ceph_mds_client *mdsc)
void ceph_cleanup_snapid_map(struct ceph_mds_client *mdsc) { + struct ceph_client *cl = mdsc->fsc->client; struct ceph_snapid_map *sm; struct rb_node *p; LIST_HEAD(to_free); @@ -1319,8 +1335,8 @@ void ceph_cleanup_snapid_map(struct ceph_mds_client *mdsc) list_del(&sm->lru); free_anon_bdev(sm->dev); if (WARN_ON_ONCE(atomic_read(&sm->ref))) { - pr_err("snapid map %llx -> %x still in use\n", - sm->snap, sm->dev); + pr_err_client(cl, "snapid map %llx -> %x still in use\n", + sm->snap, sm->dev); } kfree(sm); } diff --git a/fs/ceph/super.c b/fs/ceph/super.c index 29026ba4f022..9139f102351b 100644 --- a/fs/ceph/super.c +++ b/fs/ceph/super.c @@ -46,9 +46,10 @@ static void ceph_put_super(struct super_block *s) { struct ceph_fs_client *fsc = ceph_sb_to_fs_client(s);
- dout("put_super\n"); + doutc(fsc->client, "begin\n"); ceph_fscrypt_free_dummy_policy(fsc); ceph_mdsc_close_sessions(fsc->mdsc); + doutc(fsc->client, "done\n"); }
static int ceph_statfs(struct dentry *dentry, struct kstatfs *buf) @@ -59,13 +60,13 @@ static int ceph_statfs(struct dentry *dentry, struct kstatfs *buf) int i, err; u64 data_pool;
+ doutc(fsc->client, "begin\n"); if (fsc->mdsc->mdsmap->m_num_data_pg_pools == 1) { data_pool = fsc->mdsc->mdsmap->m_data_pg_pools[0]; } else { data_pool = CEPH_NOPOOL; }
- dout("statfs\n"); err = ceph_monc_do_statfs(monc, data_pool, &st); if (err < 0) return err; @@ -113,24 +114,26 @@ static int ceph_statfs(struct dentry *dentry, struct kstatfs *buf) /* fold the fs_cluster_id into the upper bits */ buf->f_fsid.val[1] = monc->fs_cluster_id;
+ doutc(fsc->client, "done\n"); return 0; }
static int ceph_sync_fs(struct super_block *sb, int wait) { struct ceph_fs_client *fsc = ceph_sb_to_fs_client(sb); + struct ceph_client *cl = fsc->client;
if (!wait) { - dout("sync_fs (non-blocking)\n"); + doutc(cl, "(non-blocking)\n"); ceph_flush_dirty_caps(fsc->mdsc); - dout("sync_fs (non-blocking) done\n"); + doutc(cl, "(non-blocking) done\n"); return 0; }
- dout("sync_fs (blocking)\n"); + doutc(cl, "(blocking)\n"); ceph_osdc_sync(&fsc->client->osdc); ceph_mdsc_sync(fsc->mdsc); - dout("sync_fs (blocking) done\n"); + doutc(cl, "(blocking) done\n"); return 0; }
@@ -349,7 +352,7 @@ static int ceph_parse_source(struct fs_parameter *param, struct fs_context *fc) char *dev_name = param->string, *dev_name_end; int ret;
- dout("%s '%s'\n", __func__, dev_name); + dout("'%s'\n", dev_name); if (!dev_name || !*dev_name) return invalfc(fc, "Empty source");
@@ -421,7 +424,7 @@ static int ceph_parse_mount_param(struct fs_context *fc, return ret;
token = fs_parse(fc, ceph_mount_parameters, param, &result); - dout("%s fs_parse '%s' token %d\n", __func__, param->key, token); + dout("%s: fs_parse '%s' token %d\n",__func__, param->key, token); if (token < 0) return token;
@@ -891,7 +894,7 @@ static void flush_fs_workqueues(struct ceph_fs_client *fsc)
static void destroy_fs_client(struct ceph_fs_client *fsc) { - dout("destroy_fs_client %p\n", fsc); + doutc(fsc->client, "%p\n", fsc);
spin_lock(&ceph_fsc_lock); list_del(&fsc->metric_wakeup); @@ -906,7 +909,7 @@ static void destroy_fs_client(struct ceph_fs_client *fsc) ceph_destroy_client(fsc->client);
kfree(fsc); - dout("destroy_fs_client %p done\n", fsc); + dout("%s: %p done\n", __func__, fsc); }
/* @@ -1028,7 +1031,7 @@ void ceph_umount_begin(struct super_block *sb) { struct ceph_fs_client *fsc = ceph_sb_to_fs_client(sb);
- dout("ceph_umount_begin - starting forced umount\n"); + doutc(fsc->client, "starting forced umount\n"); if (!fsc) return; fsc->mount_state = CEPH_MOUNT_SHUTDOWN; @@ -1056,13 +1059,14 @@ static struct dentry *open_root_dentry(struct ceph_fs_client *fsc, const char *path, unsigned long started) { + struct ceph_client *cl = fsc->client; struct ceph_mds_client *mdsc = fsc->mdsc; struct ceph_mds_request *req = NULL; int err; struct dentry *root;
/* open dir */ - dout("open_root_inode opening '%s'\n", path); + doutc(cl, "opening '%s'\n", path); req = ceph_mdsc_create_request(mdsc, CEPH_MDS_OP_GETATTR, USE_ANY_MDS); if (IS_ERR(req)) return ERR_CAST(req); @@ -1082,13 +1086,13 @@ static struct dentry *open_root_dentry(struct ceph_fs_client *fsc, if (err == 0) { struct inode *inode = req->r_target_inode; req->r_target_inode = NULL; - dout("open_root_inode success\n"); + doutc(cl, "success\n"); root = d_make_root(inode); if (!root) { root = ERR_PTR(-ENOMEM); goto out; } - dout("open_root_inode success, root dentry is %p\n", root); + doutc(cl, "success, root dentry is %p\n", root); } else { root = ERR_PTR(err); } @@ -1147,11 +1151,12 @@ static int ceph_apply_test_dummy_encryption(struct super_block *sb, static struct dentry *ceph_real_mount(struct ceph_fs_client *fsc, struct fs_context *fc) { + struct ceph_client *cl = fsc->client; int err; unsigned long started = jiffies; /* note the start time */ struct dentry *root;
- dout("mount start %p\n", fsc); + doutc(cl, "mount start %p\n", fsc); mutex_lock(&fsc->client->mount_mutex);
if (!fsc->sb->s_root) { @@ -1174,7 +1179,7 @@ static struct dentry *ceph_real_mount(struct ceph_fs_client *fsc, if (err) goto out;
- dout("mount opening path '%s'\n", path); + doutc(cl, "mount opening path '%s'\n", path);
ceph_fs_debugfs_init(fsc);
@@ -1189,7 +1194,7 @@ static struct dentry *ceph_real_mount(struct ceph_fs_client *fsc, }
fsc->mount_state = CEPH_MOUNT_MOUNTED; - dout("mount success\n"); + doutc(cl, "mount success\n"); mutex_unlock(&fsc->client->mount_mutex); return root;
@@ -1202,9 +1207,10 @@ static struct dentry *ceph_real_mount(struct ceph_fs_client *fsc, static int ceph_set_super(struct super_block *s, struct fs_context *fc) { struct ceph_fs_client *fsc = s->s_fs_info; + struct ceph_client *cl = fsc->client; int ret;
- dout("set_super %p\n", s); + doutc(cl, "%p\n", s);
s->s_maxbytes = MAX_LFS_FILESIZE;
@@ -1238,30 +1244,31 @@ static int ceph_compare_super(struct super_block *sb, struct fs_context *fc) struct ceph_mount_options *fsopt = new->mount_options; struct ceph_options *opt = new->client->options; struct ceph_fs_client *fsc = ceph_sb_to_fs_client(sb); + struct ceph_client *cl = fsc->client;
- dout("ceph_compare_super %p\n", sb); + doutc(cl, "%p\n", sb);
if (compare_mount_options(fsopt, opt, fsc)) { - dout("monitor(s)/mount options don't match\n"); + doutc(cl, "monitor(s)/mount options don't match\n"); return 0; } if ((opt->flags & CEPH_OPT_FSID) && ceph_fsid_compare(&opt->fsid, &fsc->client->fsid)) { - dout("fsid doesn't match\n"); + doutc(cl, "fsid doesn't match\n"); return 0; } if (fc->sb_flags != (sb->s_flags & ~SB_BORN)) { - dout("flags differ\n"); + doutc(cl, "flags differ\n"); return 0; }
if (fsc->blocklisted && !ceph_test_mount_opt(fsc, CLEANRECOVER)) { - dout("client is blocklisted (and CLEANRECOVER is not set)\n"); + doutc(cl, "client is blocklisted (and CLEANRECOVER is not set)\n"); return 0; }
if (fsc->mount_state == CEPH_MOUNT_SHUTDOWN) { - dout("client has been forcibly unmounted\n"); + doutc(cl, "client has been forcibly unmounted\n"); return 0; }
@@ -1349,8 +1356,9 @@ static int ceph_get_tree(struct fs_context *fc) err = PTR_ERR(res); goto out_splat; } - dout("root %p inode %p ino %llx.%llx\n", res, - d_inode(res), ceph_vinop(d_inode(res))); + + doutc(fsc->client, "root %p inode %p ino %llx.%llx\n", res, + d_inode(res), ceph_vinop(d_inode(res))); fc->root = fsc->sb->s_root; return 0;
@@ -1408,7 +1416,8 @@ static int ceph_reconfigure_fc(struct fs_context *fc) kfree(fsc->mount_options->mon_addr); fsc->mount_options->mon_addr = fsopt->mon_addr; fsopt->mon_addr = NULL; - pr_notice("ceph: monitor addresses recorded, but not used for reconnection"); + pr_notice_client(fsc->client, + "monitor addresses recorded, but not used for reconnection"); }
sync_filesystem(sb); @@ -1528,10 +1537,11 @@ void ceph_dec_osd_stopping_blocker(struct ceph_mds_client *mdsc) static void ceph_kill_sb(struct super_block *s) { struct ceph_fs_client *fsc = ceph_sb_to_fs_client(s); + struct ceph_client *cl = fsc->client; struct ceph_mds_client *mdsc = fsc->mdsc; bool wait;
- dout("kill_sb %p\n", s); + doutc(cl, "%p\n", s);
ceph_mdsc_pre_umount(mdsc); flush_fs_workqueues(fsc); @@ -1562,9 +1572,9 @@ static void ceph_kill_sb(struct super_block *s) &mdsc->stopping_waiter, fsc->client->options->mount_timeout); if (!timeleft) /* timed out */ - pr_warn("umount timed out, %ld\n", timeleft); + pr_warn_client(cl, "umount timed out, %ld\n", timeleft); else if (timeleft < 0) /* killed */ - pr_warn("umount was killed, %ld\n", timeleft); + pr_warn_client(cl, "umount was killed, %ld\n", timeleft); }
mdsc->stopping = CEPH_MDSC_STOPPING_FLUSHED; diff --git a/fs/ceph/super.h b/fs/ceph/super.h index 5903e3fb6d75..e31ecbed609d 100644 --- a/fs/ceph/super.h +++ b/fs/ceph/super.h @@ -506,6 +506,12 @@ ceph_sb_to_mdsc(const struct super_block *sb) return (struct ceph_mds_client *)ceph_sb_to_fs_client(sb)->mdsc; }
+static inline struct ceph_client * +ceph_inode_to_client(const struct inode *inode) +{ + return (struct ceph_client *)ceph_inode_to_fs_client(inode)->client; +} + static inline struct ceph_vino ceph_vino(const struct inode *inode) { diff --git a/fs/ceph/xattr.c b/fs/ceph/xattr.c index 558f64554b59..04d031467193 100644 --- a/fs/ceph/xattr.c +++ b/fs/ceph/xattr.c @@ -58,6 +58,7 @@ static ssize_t ceph_vxattrcb_layout(struct ceph_inode_info *ci, char *val, size_t size) { struct ceph_fs_client *fsc = ceph_sb_to_fs_client(ci->netfs.inode.i_sb); + struct ceph_client *cl = fsc->client; struct ceph_osd_client *osdc = &fsc->client->osdc; struct ceph_string *pool_ns; s64 pool = ci->i_layout.pool_id; @@ -69,7 +70,7 @@ static ssize_t ceph_vxattrcb_layout(struct ceph_inode_info *ci, char *val,
pool_ns = ceph_try_get_string(ci->i_layout.pool_ns);
- dout("ceph_vxattrcb_layout %p\n", &ci->netfs.inode); + doutc(cl, "%p\n", &ci->netfs.inode); down_read(&osdc->lock); pool_name = ceph_pg_pool_name_by_id(osdc->osdmap, pool); if (pool_name) { @@ -570,6 +571,8 @@ static int __set_xattr(struct ceph_inode_info *ci, int flags, int update_xattr, struct ceph_inode_xattr **newxattr) { + struct inode *inode = &ci->netfs.inode; + struct ceph_client *cl = ceph_inode_to_client(inode); struct rb_node **p; struct rb_node *parent = NULL; struct ceph_inode_xattr *xattr = NULL; @@ -626,7 +629,7 @@ static int __set_xattr(struct ceph_inode_info *ci, xattr->should_free_name = update_xattr;
ci->i_xattrs.count++; - dout("%s count=%d\n", __func__, ci->i_xattrs.count); + doutc(cl, "count=%d\n", ci->i_xattrs.count); } else { kfree(*newxattr); *newxattr = NULL; @@ -654,13 +657,13 @@ static int __set_xattr(struct ceph_inode_info *ci, if (new) { rb_link_node(&xattr->node, parent, p); rb_insert_color(&xattr->node, &ci->i_xattrs.index); - dout("%s p=%p\n", __func__, p); + doutc(cl, "p=%p\n", p); }
- dout("%s added %llx.%llx xattr %p %.*s=%.*s%s\n", __func__, - ceph_vinop(&ci->netfs.inode), xattr, name_len, name, - min(val_len, MAX_XATTR_VAL_PRINT_LEN), val, - val_len > MAX_XATTR_VAL_PRINT_LEN ? "..." : ""); + doutc(cl, "added %p %llx.%llx xattr %p %.*s=%.*s%s\n", inode, + ceph_vinop(inode), xattr, name_len, name, min(val_len, + MAX_XATTR_VAL_PRINT_LEN), val, + val_len > MAX_XATTR_VAL_PRINT_LEN ? "..." : "");
return 0; } @@ -668,6 +671,7 @@ static int __set_xattr(struct ceph_inode_info *ci, static struct ceph_inode_xattr *__get_xattr(struct ceph_inode_info *ci, const char *name) { + struct ceph_client *cl = ceph_inode_to_client(&ci->netfs.inode); struct rb_node **p; struct rb_node *parent = NULL; struct ceph_inode_xattr *xattr = NULL; @@ -688,13 +692,13 @@ static struct ceph_inode_xattr *__get_xattr(struct ceph_inode_info *ci, else { int len = min(xattr->val_len, MAX_XATTR_VAL_PRINT_LEN);
- dout("%s %s: found %.*s%s\n", __func__, name, len, - xattr->val, xattr->val_len > len ? "..." : ""); + doutc(cl, "%s found %.*s%s\n", name, len, xattr->val, + xattr->val_len > len ? "..." : ""); return xattr; } }
- dout("%s %s: not found\n", __func__, name); + doutc(cl, "%s not found\n", name);
return NULL; } @@ -735,19 +739,20 @@ static int __remove_xattr(struct ceph_inode_info *ci, static char *__copy_xattr_names(struct ceph_inode_info *ci, char *dest) { + struct ceph_client *cl = ceph_inode_to_client(&ci->netfs.inode); struct rb_node *p; struct ceph_inode_xattr *xattr = NULL;
p = rb_first(&ci->i_xattrs.index); - dout("__copy_xattr_names count=%d\n", ci->i_xattrs.count); + doutc(cl, "count=%d\n", ci->i_xattrs.count);
while (p) { xattr = rb_entry(p, struct ceph_inode_xattr, node); memcpy(dest, xattr->name, xattr->name_len); dest[xattr->name_len] = '\0';
- dout("dest=%s %p (%s) (%d/%d)\n", dest, xattr, xattr->name, - xattr->name_len, ci->i_xattrs.names_size); + doutc(cl, "dest=%s %p (%s) (%d/%d)\n", dest, xattr, xattr->name, + xattr->name_len, ci->i_xattrs.names_size);
dest += xattr->name_len + 1; p = rb_next(p); @@ -758,19 +763,19 @@ static char *__copy_xattr_names(struct ceph_inode_info *ci,
void __ceph_destroy_xattrs(struct ceph_inode_info *ci) { + struct ceph_client *cl = ceph_inode_to_client(&ci->netfs.inode); struct rb_node *p, *tmp; struct ceph_inode_xattr *xattr = NULL;
p = rb_first(&ci->i_xattrs.index);
- dout("__ceph_destroy_xattrs p=%p\n", p); + doutc(cl, "p=%p\n", p);
while (p) { xattr = rb_entry(p, struct ceph_inode_xattr, node); tmp = p; p = rb_next(tmp); - dout("__ceph_destroy_xattrs next p=%p (%.*s)\n", p, - xattr->name_len, xattr->name); + doutc(cl, "next p=%p (%.*s)\n", p, xattr->name_len, xattr->name); rb_erase(tmp, &ci->i_xattrs.index);
__free_xattr(xattr); @@ -787,6 +792,7 @@ static int __build_xattrs(struct inode *inode) __releases(ci->i_ceph_lock) __acquires(ci->i_ceph_lock) { + struct ceph_client *cl = ceph_inode_to_client(inode); u32 namelen; u32 numattr = 0; void *p, *end; @@ -798,8 +804,8 @@ static int __build_xattrs(struct inode *inode) int err = 0; int i;
- dout("__build_xattrs() len=%d\n", - ci->i_xattrs.blob ? (int)ci->i_xattrs.blob->vec.iov_len : 0); + doutc(cl, "len=%d\n", + ci->i_xattrs.blob ? (int)ci->i_xattrs.blob->vec.iov_len : 0);
if (ci->i_xattrs.index_version >= ci->i_xattrs.version) return 0; /* already built */ @@ -874,6 +880,8 @@ static int __build_xattrs(struct inode *inode) static int __get_required_blob_size(struct ceph_inode_info *ci, int name_size, int val_size) { + struct ceph_client *cl = ceph_inode_to_client(&ci->netfs.inode); + /* * 4 bytes for the length, and additional 4 bytes per each xattr name, * 4 bytes per each value @@ -881,9 +889,8 @@ static int __get_required_blob_size(struct ceph_inode_info *ci, int name_size, int size = 4 + ci->i_xattrs.count*(4 + 4) + ci->i_xattrs.names_size + ci->i_xattrs.vals_size; - dout("__get_required_blob_size c=%d names.size=%d vals.size=%d\n", - ci->i_xattrs.count, ci->i_xattrs.names_size, - ci->i_xattrs.vals_size); + doutc(cl, "c=%d names.size=%d vals.size=%d\n", ci->i_xattrs.count, + ci->i_xattrs.names_size, ci->i_xattrs.vals_size);
if (name_size) size += 4 + 4 + name_size + val_size; @@ -899,12 +906,14 @@ static int __get_required_blob_size(struct ceph_inode_info *ci, int name_size, */ struct ceph_buffer *__ceph_build_xattrs_blob(struct ceph_inode_info *ci) { + struct inode *inode = &ci->netfs.inode; + struct ceph_client *cl = ceph_inode_to_client(inode); struct rb_node *p; struct ceph_inode_xattr *xattr = NULL; struct ceph_buffer *old_blob = NULL; void *dest;
- dout("__build_xattrs_blob %p\n", &ci->netfs.inode); + doutc(cl, "%p %llx.%llx\n", inode, ceph_vinop(inode)); if (ci->i_xattrs.dirty) { int need = __get_required_blob_size(ci, 0, 0);
@@ -962,6 +971,7 @@ static inline int __get_request_mask(struct inode *in) { ssize_t __ceph_getxattr(struct inode *inode, const char *name, void *value, size_t size) { + struct ceph_client *cl = ceph_inode_to_client(inode); struct ceph_inode_info *ci = ceph_inode(inode); struct ceph_inode_xattr *xattr; struct ceph_vxattr *vxattr; @@ -1000,8 +1010,9 @@ ssize_t __ceph_getxattr(struct inode *inode, const char *name, void *value, req_mask = __get_request_mask(inode);
spin_lock(&ci->i_ceph_lock); - dout("getxattr %p name '%s' ver=%lld index_ver=%lld\n", inode, name, - ci->i_xattrs.version, ci->i_xattrs.index_version); + doutc(cl, "%p %llx.%llx name '%s' ver=%lld index_ver=%lld\n", inode, + ceph_vinop(inode), name, ci->i_xattrs.version, + ci->i_xattrs.index_version);
if (ci->i_xattrs.version == 0 || !((req_mask & CEPH_CAP_XATTR_SHARED) || @@ -1010,8 +1021,9 @@ ssize_t __ceph_getxattr(struct inode *inode, const char *name, void *value,
/* security module gets xattr while filling trace */ if (current->journal_info) { - pr_warn_ratelimited("sync getxattr %p " - "during filling trace\n", inode); + pr_warn_ratelimited_client(cl, + "sync %p %llx.%llx during filling trace\n", + inode, ceph_vinop(inode)); return -EBUSY; }
@@ -1053,14 +1065,16 @@ ssize_t __ceph_getxattr(struct inode *inode, const char *name, void *value, ssize_t ceph_listxattr(struct dentry *dentry, char *names, size_t size) { struct inode *inode = d_inode(dentry); + struct ceph_client *cl = ceph_inode_to_client(inode); struct ceph_inode_info *ci = ceph_inode(inode); bool len_only = (size == 0); u32 namelen; int err;
spin_lock(&ci->i_ceph_lock); - dout("listxattr %p ver=%lld index_ver=%lld\n", inode, - ci->i_xattrs.version, ci->i_xattrs.index_version); + doutc(cl, "%p %llx.%llx ver=%lld index_ver=%lld\n", inode, + ceph_vinop(inode), ci->i_xattrs.version, + ci->i_xattrs.index_version);
if (ci->i_xattrs.version == 0 || !__ceph_caps_issued_mask_metric(ci, CEPH_CAP_XATTR_SHARED, 1)) { @@ -1095,6 +1109,7 @@ static int ceph_sync_setxattr(struct inode *inode, const char *name, const char *value, size_t size, int flags) { struct ceph_fs_client *fsc = ceph_sb_to_fs_client(inode->i_sb); + struct ceph_client *cl = ceph_inode_to_client(inode); struct ceph_inode_info *ci = ceph_inode(inode); struct ceph_mds_request *req; struct ceph_mds_client *mdsc = fsc->mdsc; @@ -1119,7 +1134,7 @@ static int ceph_sync_setxattr(struct inode *inode, const char *name, flags |= CEPH_XATTR_REMOVE; }
- dout("setxattr value size: %zu\n", size); + doutc(cl, "name %s value size %zu\n", name, size);
/* do request */ req = ceph_mdsc_create_request(mdsc, op, USE_AUTH_MDS); @@ -1148,10 +1163,10 @@ static int ceph_sync_setxattr(struct inode *inode, const char *name, req->r_num_caps = 1; req->r_inode_drop = CEPH_CAP_XATTR_SHARED;
- dout("xattr.ver (before): %lld\n", ci->i_xattrs.version); + doutc(cl, "xattr.ver (before): %lld\n", ci->i_xattrs.version); err = ceph_mdsc_do_request(mdsc, NULL, req); ceph_mdsc_put_request(req); - dout("xattr.ver (after): %lld\n", ci->i_xattrs.version); + doutc(cl, "xattr.ver (after): %lld\n", ci->i_xattrs.version);
out: if (pagelist) @@ -1162,6 +1177,7 @@ static int ceph_sync_setxattr(struct inode *inode, const char *name, int __ceph_setxattr(struct inode *inode, const char *name, const void *value, size_t size, int flags) { + struct ceph_client *cl = ceph_inode_to_client(inode); struct ceph_vxattr *vxattr; struct ceph_inode_info *ci = ceph_inode(inode); struct ceph_mds_client *mdsc = ceph_sb_to_fs_client(inode->i_sb)->mdsc; @@ -1220,9 +1236,9 @@ int __ceph_setxattr(struct inode *inode, const char *name, required_blob_size = __get_required_blob_size(ci, name_len, val_len); if ((ci->i_xattrs.version == 0) || !(issued & CEPH_CAP_XATTR_EXCL) || (required_blob_size > mdsc->mdsmap->m_max_xattr_size)) { - dout("%s do sync setxattr: version: %llu size: %d max: %llu\n", - __func__, ci->i_xattrs.version, required_blob_size, - mdsc->mdsmap->m_max_xattr_size); + doutc(cl, "sync version: %llu size: %d max: %llu\n", + ci->i_xattrs.version, required_blob_size, + mdsc->mdsmap->m_max_xattr_size); goto do_sync; }
@@ -1236,8 +1252,8 @@ int __ceph_setxattr(struct inode *inode, const char *name, } }
- dout("setxattr %p name '%s' issued %s\n", inode, name, - ceph_cap_string(issued)); + doutc(cl, "%p %llx.%llx name '%s' issued %s\n", inode, + ceph_vinop(inode), name, ceph_cap_string(issued)); __build_xattrs(inode);
if (!ci->i_xattrs.prealloc_blob || @@ -1246,7 +1262,8 @@ int __ceph_setxattr(struct inode *inode, const char *name,
spin_unlock(&ci->i_ceph_lock); ceph_buffer_put(old_blob); /* Shouldn't be required */ - dout(" pre-allocating new blob size=%d\n", required_blob_size); + doutc(cl, " pre-allocating new blob size=%d\n", + required_blob_size); blob = ceph_buffer_new(required_blob_size, GFP_NOFS); if (!blob) goto do_sync_unlocked; @@ -1285,8 +1302,9 @@ int __ceph_setxattr(struct inode *inode, const char *name,
/* security module set xattr while filling trace */ if (current->journal_info) { - pr_warn_ratelimited("sync setxattr %p " - "during filling trace\n", inode); + pr_warn_ratelimited_client(cl, + "sync %p %llx.%llx during filling trace\n", + inode, ceph_vinop(inode)); err = -EBUSY; } else { err = ceph_sync_setxattr(inode, name, value, size, flags);
On Mon, Jan 6, 2025 at 4:28 PM Greg Kroah-Hartman gregkh@linuxfoundation.org wrote:
6.6-stable review patch. If anyone has any objections, please let me know.
From: Xiubo Li xiubli@redhat.com
[ Upstream commit 38d46409c4639a1d659ebfa70e27a8bed6b8ee1d ]
Multiple CephFS mounts on a host is increasingly common so disambiguating messages like this is necessary and will make it easier to debug issues.
At the same this will improve the debug logs to make them easier to troubleshooting issues, such as print the ino# instead only printing the memory addresses of the corresponding inodes and print the dentry names instead of the corresponding memory addresses for the dentry,etc.
Link: https://tracker.ceph.com/issues/61590 Signed-off-by: Xiubo Li xiubli@redhat.com Reviewed-by: Patrick Donnelly pdonnell@redhat.com Reviewed-by: Milind Changire mchangir@redhat.com Signed-off-by: Ilya Dryomov idryomov@gmail.com Stable-dep-of: 550f7ca98ee0 ("ceph: give up on paths longer than PATH_MAX") Signed-off-by: Sasha Levin sashal@kernel.org
fs/ceph/acl.c | 6 +- fs/ceph/addr.c | 279 +++++++++-------- fs/ceph/caps.c | 710 +++++++++++++++++++++++++------------------ fs/ceph/crypto.c | 39 ++- fs/ceph/debugfs.c | 6 +- fs/ceph/dir.c | 218 +++++++------ fs/ceph/export.c | 39 +-- fs/ceph/file.c | 245 ++++++++------- fs/ceph/inode.c | 485 ++++++++++++++++------------- fs/ceph/ioctl.c | 13 +- fs/ceph/locks.c | 57 ++-- fs/ceph/mds_client.c | 558 +++++++++++++++++++--------------- fs/ceph/mdsmap.c | 24 +- fs/ceph/metric.c | 5 +- fs/ceph/quota.c | 29 +- fs/ceph/snap.c | 174 ++++++----- fs/ceph/super.c | 70 +++-- fs/ceph/super.h | 6 + fs/ceph/xattr.c | 96 +++--- 19 files changed, 1747 insertions(+), 1312 deletions(-)
Hi Greg,
This is a huge patch, albeit mostly mechanical. Commit 550f7ca98ee0 ("ceph: give up on paths longer than PATH_MAX") for which this patch is a dependency just removes the affected log message, so it could be backported with a trivial conflict resolution instead of taking in 5c5f0d2b5f92 ("libceph: add doutc and *_client debug macros support") and 38d46409c463 ("ceph: print cluster fsid and client global_id in all debug logs") to arrange for a "clean" backport.
Were these cherry picks done in an automated fashion by a tool that tries to identify and pull prerequisite patches based on "git blame" output? The result appears to go against the rules laid out in Documentation/process/stable-kernel-rules.rst (particularly the limit on the number of lines), so I wanted to clarify the expected workflow of the stable team in this area. Are "clean" backports considered to justify additional prerequisite patches of this size even when the conflict resolution is "take ours" or otherwise trivial?
Thanks,
Ilya
On Tue, Jan 07, 2025 at 01:21:12PM +0100, Ilya Dryomov wrote:
On Mon, Jan 6, 2025 at 4:28 PM Greg Kroah-Hartman gregkh@linuxfoundation.org wrote:
6.6-stable review patch. If anyone has any objections, please let me know.
From: Xiubo Li xiubli@redhat.com
[ Upstream commit 38d46409c4639a1d659ebfa70e27a8bed6b8ee1d ]
Multiple CephFS mounts on a host is increasingly common so disambiguating messages like this is necessary and will make it easier to debug issues.
At the same this will improve the debug logs to make them easier to troubleshooting issues, such as print the ino# instead only printing the memory addresses of the corresponding inodes and print the dentry names instead of the corresponding memory addresses for the dentry,etc.
Link: https://tracker.ceph.com/issues/61590 Signed-off-by: Xiubo Li xiubli@redhat.com Reviewed-by: Patrick Donnelly pdonnell@redhat.com Reviewed-by: Milind Changire mchangir@redhat.com Signed-off-by: Ilya Dryomov idryomov@gmail.com Stable-dep-of: 550f7ca98ee0 ("ceph: give up on paths longer than PATH_MAX") Signed-off-by: Sasha Levin sashal@kernel.org
fs/ceph/acl.c | 6 +- fs/ceph/addr.c | 279 +++++++++-------- fs/ceph/caps.c | 710 +++++++++++++++++++++++++------------------ fs/ceph/crypto.c | 39 ++- fs/ceph/debugfs.c | 6 +- fs/ceph/dir.c | 218 +++++++------ fs/ceph/export.c | 39 +-- fs/ceph/file.c | 245 ++++++++------- fs/ceph/inode.c | 485 ++++++++++++++++------------- fs/ceph/ioctl.c | 13 +- fs/ceph/locks.c | 57 ++-- fs/ceph/mds_client.c | 558 +++++++++++++++++++--------------- fs/ceph/mdsmap.c | 24 +- fs/ceph/metric.c | 5 +- fs/ceph/quota.c | 29 +- fs/ceph/snap.c | 174 ++++++----- fs/ceph/super.c | 70 +++-- fs/ceph/super.h | 6 + fs/ceph/xattr.c | 96 +++--- 19 files changed, 1747 insertions(+), 1312 deletions(-)
Hi Greg,
This is a huge patch, albeit mostly mechanical. Commit 550f7ca98ee0 ("ceph: give up on paths longer than PATH_MAX") for which this patch is a dependency just removes the affected log message, so it could be backported with a trivial conflict resolution instead of taking in 5c5f0d2b5f92 ("libceph: add doutc and *_client debug macros support") and 38d46409c463 ("ceph: print cluster fsid and client global_id in all debug logs") to arrange for a "clean" backport.
Great, can you send such a backport?
Were these cherry picks done in an automated fashion by a tool that tries to identify and pull prerequisite patches based on "git blame" output?
Yes.
The result appears to go against the rules laid out in Documentation/process/stable-kernel-rules.rst (particularly the limit on the number of lines), so I wanted to clarify the expected workflow of the stable team in this area. Are "clean" backports considered to justify additional prerequisite patches of this size even when the conflict resolution is "take ours" or otherwise trivial?
Yes. Keeping the tree in sync is almost always preferred over "one-off" changes that have to be hand-provided, when the maintainer is not involved to ensure that we don't break anything. But if you want to provide a working patch that you think should do the same thing without these dep-of: patches, that would be great!
thanks,
greg k-h
On Tue, Jan 7, 2025 at 2:05 PM Greg Kroah-Hartman gregkh@linuxfoundation.org wrote:
On Tue, Jan 07, 2025 at 01:21:12PM +0100, Ilya Dryomov wrote:
On Mon, Jan 6, 2025 at 4:28 PM Greg Kroah-Hartman gregkh@linuxfoundation.org wrote:
6.6-stable review patch. If anyone has any objections, please let me know.
From: Xiubo Li xiubli@redhat.com
[ Upstream commit 38d46409c4639a1d659ebfa70e27a8bed6b8ee1d ]
Multiple CephFS mounts on a host is increasingly common so disambiguating messages like this is necessary and will make it easier to debug issues.
At the same this will improve the debug logs to make them easier to troubleshooting issues, such as print the ino# instead only printing the memory addresses of the corresponding inodes and print the dentry names instead of the corresponding memory addresses for the dentry,etc.
Link: https://tracker.ceph.com/issues/61590 Signed-off-by: Xiubo Li xiubli@redhat.com Reviewed-by: Patrick Donnelly pdonnell@redhat.com Reviewed-by: Milind Changire mchangir@redhat.com Signed-off-by: Ilya Dryomov idryomov@gmail.com Stable-dep-of: 550f7ca98ee0 ("ceph: give up on paths longer than PATH_MAX") Signed-off-by: Sasha Levin sashal@kernel.org
fs/ceph/acl.c | 6 +- fs/ceph/addr.c | 279 +++++++++-------- fs/ceph/caps.c | 710 +++++++++++++++++++++++++------------------ fs/ceph/crypto.c | 39 ++- fs/ceph/debugfs.c | 6 +- fs/ceph/dir.c | 218 +++++++------ fs/ceph/export.c | 39 +-- fs/ceph/file.c | 245 ++++++++------- fs/ceph/inode.c | 485 ++++++++++++++++------------- fs/ceph/ioctl.c | 13 +- fs/ceph/locks.c | 57 ++-- fs/ceph/mds_client.c | 558 +++++++++++++++++++--------------- fs/ceph/mdsmap.c | 24 +- fs/ceph/metric.c | 5 +- fs/ceph/quota.c | 29 +- fs/ceph/snap.c | 174 ++++++----- fs/ceph/super.c | 70 +++-- fs/ceph/super.h | 6 + fs/ceph/xattr.c | 96 +++--- 19 files changed, 1747 insertions(+), 1312 deletions(-)
Hi Greg,
This is a huge patch, albeit mostly mechanical. Commit 550f7ca98ee0 ("ceph: give up on paths longer than PATH_MAX") for which this patch is a dependency just removes the affected log message, so it could be backported with a trivial conflict resolution instead of taking in 5c5f0d2b5f92 ("libceph: add doutc and *_client debug macros support") and 38d46409c463 ("ceph: print cluster fsid and client global_id in all debug logs") to arrange for a "clean" backport.
Great, can you send such a backport?
Sure, you should have one for 5.10-6.1 and a separate one for 6.6 in your inbox now.
Were these cherry picks done in an automated fashion by a tool that tries to identify and pull prerequisite patches based on "git blame" output?
Yes.
The result appears to go against the rules laid out in Documentation/process/stable-kernel-rules.rst (particularly the limit on the number of lines), so I wanted to clarify the expected workflow of the stable team in this area. Are "clean" backports considered to justify additional prerequisite patches of this size even when the conflict resolution is "take ours" or otherwise trivial?
Yes. Keeping the tree in sync is almost always preferred over "one-off" changes that have to be hand-provided, when the maintainer is not involved to ensure that we don't break anything. But if you want to
In this case this lead to pulling in
1 file changed, 38 insertions(+)
and
19 files changed, 1747 insertions(+), 1312 deletions(-)
patches as dependencies to satisfy the backport of
1 file changed, 4 insertions(+), 5 deletions(-)
patch. Is this OK given that stable-kernel-rules.rst still has "It cannot be bigger than 100 lines, with context." as one of the rules for what kind of patches are accepted?
Thanks,
Ilya
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Max Kellermann max.kellermann@ionos.com
[ Upstream commit 550f7ca98ee028a606aa75705a7e77b1bd11720f ]
If the full path to be built by ceph_mdsc_build_path() happens to be longer than PATH_MAX, then this function will enter an endless (retry) loop, effectively blocking the whole task. Most of the machine becomes unusable, making this a very simple and effective DoS vulnerability.
I cannot imagine why this retry was ever implemented, but it seems rather useless and harmful to me. Let's remove it and fail with ENAMETOOLONG instead.
Cc: stable@vger.kernel.org Reported-by: Dario Weißer dario@cure53.de Signed-off-by: Max Kellermann max.kellermann@ionos.com Reviewed-by: Alex Markuze amarkuze@redhat.com Signed-off-by: Ilya Dryomov idryomov@gmail.com Signed-off-by: Sasha Levin sashal@kernel.org --- fs/ceph/mds_client.c | 9 ++++----- 1 file changed, 4 insertions(+), 5 deletions(-)
diff --git a/fs/ceph/mds_client.c b/fs/ceph/mds_client.c index 56609b80880c..5ee38dd2c9b9 100644 --- a/fs/ceph/mds_client.c +++ b/fs/ceph/mds_client.c @@ -2745,12 +2745,11 @@ char *ceph_mdsc_build_path(struct ceph_mds_client *mdsc, struct dentry *dentry,
if (pos < 0) { /* - * A rename didn't occur, but somehow we didn't end up where - * we thought we would. Throw a warning and try again. + * The path is longer than PATH_MAX and this function + * cannot ever succeed. Creating paths that long is + * possible with Ceph, but Linux cannot use them. */ - pr_warn_client(cl, "did not end path lookup where expected (pos = %d)\n", - pos); - goto retry; + return ERR_PTR(-ENAMETOOLONG); }
*pbase = base;
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jeremy Kerr jk@codeconstruct.com.au
[ Upstream commit ce1219c3f76bb131d095e90521506d3c6ccfa086 ]
Currently, we don't use the return value from sock_queue_rcv_skb, which means we may leak skbs if a message is not successfully queued to a socket.
Instead, ensure that we're freeing the skb where the sock hasn't otherwise taken ownership of the skb by adding checks on the sock_queue_rcv_skb() to invoke a kfree on failure.
In doing so, rather than using the 'rc' value to trigger the kfree_skb(), use the skb pointer itself, which is more explicit.
Also, add a kunit test for the sock delivery failure cases.
Fixes: 4a992bbd3650 ("mctp: Implement message fragmentation & reassembly") Cc: stable@vger.kernel.org Signed-off-by: Jeremy Kerr jk@codeconstruct.com.au Link: https://patch.msgid.link/20241218-mctp-next-v2-1-1c1729645eaa@codeconstruct.... Signed-off-by: Paolo Abeni pabeni@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- net/mctp/route.c | 36 ++++++++++++++++++++++++++---------- 1 file changed, 26 insertions(+), 10 deletions(-)
diff --git a/net/mctp/route.c b/net/mctp/route.c index c6a815df9d35..d3c1f54386ef 100644 --- a/net/mctp/route.c +++ b/net/mctp/route.c @@ -334,8 +334,13 @@ static int mctp_route_input(struct mctp_route *route, struct sk_buff *skb) msk = NULL; rc = -EINVAL;
- /* we may be receiving a locally-routed packet; drop source sk - * accounting + /* We may be receiving a locally-routed packet; drop source sk + * accounting. + * + * From here, we will either queue the skb - either to a frag_queue, or + * to a receiving socket. When that succeeds, we clear the skb pointer; + * a non-NULL skb on exit will be otherwise unowned, and hence + * kfree_skb()-ed. */ skb_orphan(skb);
@@ -389,7 +394,9 @@ static int mctp_route_input(struct mctp_route *route, struct sk_buff *skb) * pending key. */ if (flags & MCTP_HDR_FLAG_EOM) { - sock_queue_rcv_skb(&msk->sk, skb); + rc = sock_queue_rcv_skb(&msk->sk, skb); + if (!rc) + skb = NULL; if (key) { /* we've hit a pending reassembly; not much we * can do but drop it @@ -398,7 +405,6 @@ static int mctp_route_input(struct mctp_route *route, struct sk_buff *skb) MCTP_TRACE_KEY_REPLIED); key = NULL; } - rc = 0; goto out_unlock; }
@@ -425,8 +431,10 @@ static int mctp_route_input(struct mctp_route *route, struct sk_buff *skb) * this function. */ rc = mctp_key_add(key, msk); - if (!rc) + if (!rc) { trace_mctp_key_acquire(key); + skb = NULL; + }
/* we don't need to release key->lock on exit, so * clean up here and suppress the unlock via @@ -444,6 +452,8 @@ static int mctp_route_input(struct mctp_route *route, struct sk_buff *skb) key = NULL; } else { rc = mctp_frag_queue(key, skb); + if (!rc) + skb = NULL; } }
@@ -458,12 +468,19 @@ static int mctp_route_input(struct mctp_route *route, struct sk_buff *skb) else rc = mctp_frag_queue(key, skb);
+ if (rc) + goto out_unlock; + + /* we've queued; the queue owns the skb now */ + skb = NULL; + /* end of message? deliver to socket, and we're done with * the reassembly/response key */ - if (!rc && flags & MCTP_HDR_FLAG_EOM) { - sock_queue_rcv_skb(key->sk, key->reasm_head); - key->reasm_head = NULL; + if (flags & MCTP_HDR_FLAG_EOM) { + rc = sock_queue_rcv_skb(key->sk, key->reasm_head); + if (!rc) + key->reasm_head = NULL; __mctp_key_done_in(key, net, f, MCTP_TRACE_KEY_REPLIED); key = NULL; } @@ -482,8 +499,7 @@ static int mctp_route_input(struct mctp_route *route, struct sk_buff *skb) if (any_key) mctp_key_unref(any_key); out: - if (rc) - kfree_skb(skb); + kfree_skb(skb); return rc; }
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Matthew Wilcox (Oracle) willy@infradead.org
[ Upstream commit d0ed46b60396cfa7e0056f55e1ce0b43c7db57b6 ]
To make seq_buf more lightweight as a string buf, move the readpos member from seq_buf to its container, trace_seq. That puts the responsibility of maintaining the readpos entirely in the tracing code. If some future users want to package up the readpos with a seq_buf, we can define a new struct then.
Link: https://lore.kernel.org/linux-trace-kernel/20231020033545.2587554-2-willy@in...
Cc: Kees Cook keescook@chromium.org Cc: Justin Stitt justinstitt@google.com Cc: Kent Overstreet kent.overstreet@linux.dev Cc: Petr Mladek pmladek@suse.com Cc: Andy Shevchenko andriy.shevchenko@linux.intel.com Cc: Rasmus Villemoes linux@rasmusvillemoes.dk Cc: Sergey Senozhatsky senozhatsky@chromium.org Signed-off-by: Matthew Wilcox (Oracle) willy@infradead.org Reviewed-by: Christoph Hellwig hch@lst.de Acked-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Steven Rostedt (Google) rostedt@goodmis.org Stable-dep-of: afd2627f727b ("tracing: Check "%s" dereference via the field and not the TP_printk format") Signed-off-by: Sasha Levin sashal@kernel.org --- include/linux/seq_buf.h | 5 +---- include/linux/trace_seq.h | 2 ++ kernel/trace/trace.c | 10 +++++----- kernel/trace/trace_seq.c | 6 +++++- lib/seq_buf.c | 22 ++++++++++------------ 5 files changed, 23 insertions(+), 22 deletions(-)
diff --git a/include/linux/seq_buf.h b/include/linux/seq_buf.h index 515d7fcb9634..a0fb013cebdf 100644 --- a/include/linux/seq_buf.h +++ b/include/linux/seq_buf.h @@ -14,19 +14,16 @@ * @buffer: pointer to the buffer * @size: size of the buffer * @len: the amount of data inside the buffer - * @readpos: The next position to read in the buffer. */ struct seq_buf { char *buffer; size_t size; size_t len; - loff_t readpos; };
static inline void seq_buf_clear(struct seq_buf *s) { s->len = 0; - s->readpos = 0; }
static inline void @@ -143,7 +140,7 @@ extern __printf(2, 0) int seq_buf_vprintf(struct seq_buf *s, const char *fmt, va_list args); extern int seq_buf_print_seq(struct seq_file *m, struct seq_buf *s); extern int seq_buf_to_user(struct seq_buf *s, char __user *ubuf, - int cnt); + size_t start, int cnt); extern int seq_buf_puts(struct seq_buf *s, const char *str); extern int seq_buf_putc(struct seq_buf *s, unsigned char c); extern int seq_buf_putmem(struct seq_buf *s, const void *mem, unsigned int len); diff --git a/include/linux/trace_seq.h b/include/linux/trace_seq.h index 6be92bf559fe..3691e0e76a1a 100644 --- a/include/linux/trace_seq.h +++ b/include/linux/trace_seq.h @@ -14,6 +14,7 @@ struct trace_seq { char buffer[PAGE_SIZE]; struct seq_buf seq; + size_t readpos; int full; };
@@ -22,6 +23,7 @@ trace_seq_init(struct trace_seq *s) { seq_buf_init(&s->seq, s->buffer, PAGE_SIZE); s->full = 0; + s->readpos = 0; }
/** diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c index 220903117c51..83f6ef4d7419 100644 --- a/kernel/trace/trace.c +++ b/kernel/trace/trace.c @@ -1731,15 +1731,15 @@ static ssize_t trace_seq_to_buffer(struct trace_seq *s, void *buf, size_t cnt) { int len;
- if (trace_seq_used(s) <= s->seq.readpos) + if (trace_seq_used(s) <= s->readpos) return -EBUSY;
- len = trace_seq_used(s) - s->seq.readpos; + len = trace_seq_used(s) - s->readpos; if (cnt > len) cnt = len; - memcpy(buf, s->buffer + s->seq.readpos, cnt); + memcpy(buf, s->buffer + s->readpos, cnt);
- s->seq.readpos += cnt; + s->readpos += cnt; return cnt; }
@@ -7011,7 +7011,7 @@ tracing_read_pipe(struct file *filp, char __user *ubuf,
/* Now copy what we have to the user */ sret = trace_seq_to_user(&iter->seq, ubuf, cnt); - if (iter->seq.seq.readpos >= trace_seq_used(&iter->seq)) + if (iter->seq.readpos >= trace_seq_used(&iter->seq)) trace_seq_init(&iter->seq);
/* diff --git a/kernel/trace/trace_seq.c b/kernel/trace/trace_seq.c index bac06ee3b98b..7be97229ddf8 100644 --- a/kernel/trace/trace_seq.c +++ b/kernel/trace/trace_seq.c @@ -370,8 +370,12 @@ EXPORT_SYMBOL_GPL(trace_seq_path); */ int trace_seq_to_user(struct trace_seq *s, char __user *ubuf, int cnt) { + int ret; __trace_seq_init(s); - return seq_buf_to_user(&s->seq, ubuf, cnt); + ret = seq_buf_to_user(&s->seq, ubuf, s->readpos, cnt); + if (ret > 0) + s->readpos += ret; + return ret; } EXPORT_SYMBOL_GPL(trace_seq_to_user);
diff --git a/lib/seq_buf.c b/lib/seq_buf.c index 45c450f423fa..b7477aefff53 100644 --- a/lib/seq_buf.c +++ b/lib/seq_buf.c @@ -324,23 +324,24 @@ int seq_buf_path(struct seq_buf *s, const struct path *path, const char *esc) * seq_buf_to_user - copy the sequence buffer to user space * @s: seq_buf descriptor * @ubuf: The userspace memory location to copy to + * @start: The first byte in the buffer to copy * @cnt: The amount to copy * * Copies the sequence buffer into the userspace memory pointed to - * by @ubuf. It starts from the last read position (@s->readpos) - * and writes up to @cnt characters or till it reaches the end of - * the content in the buffer (@s->len), which ever comes first. + * by @ubuf. It starts from @start and writes up to @cnt characters + * or until it reaches the end of the content in the buffer (@s->len), + * whichever comes first. * * On success, it returns a positive number of the number of bytes * it copied. * * On failure it returns -EBUSY if all of the content in the * sequence has been already read, which includes nothing in the - * sequence (@s->len == @s->readpos). + * sequence (@s->len == @start). * * Returns -EFAULT if the copy to userspace fails. */ -int seq_buf_to_user(struct seq_buf *s, char __user *ubuf, int cnt) +int seq_buf_to_user(struct seq_buf *s, char __user *ubuf, size_t start, int cnt) { int len; int ret; @@ -350,20 +351,17 @@ int seq_buf_to_user(struct seq_buf *s, char __user *ubuf, int cnt)
len = seq_buf_used(s);
- if (len <= s->readpos) + if (len <= start) return -EBUSY;
- len -= s->readpos; + len -= start; if (cnt > len) cnt = len; - ret = copy_to_user(ubuf, s->buffer + s->readpos, cnt); + ret = copy_to_user(ubuf, s->buffer + start, cnt); if (ret == cnt) return -EFAULT;
- cnt -= ret; - - s->readpos += cnt; - return cnt; + return cnt - ret; }
/**
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Matthew Wilcox (Oracle) willy@infradead.org
[ Upstream commit 0f7f544af60a6082cfaa3ed4c8f4ca1a858807ee ]
While powerpc doesn't use the seq_buf readpos, it did explicitly initialise it for no good reason.
Link: https://lore.kernel.org/linux-trace-kernel/20231024145600.739451-1-willy@inf...
Cc: Christoph Hellwig hch@lst.de Cc: Justin Stitt justinstitt@google.com Cc: Kent Overstreet kent.overstreet@linux.dev Cc: Petr Mladek pmladek@suse.com Cc: Andy Shevchenko andriy.shevchenko@linux.intel.com Cc: Rasmus Villemoes linux@rasmusvillemoes.dk Cc: Sergey Senozhatsky senozhatsky@chromium.org Cc: Michael Ellerman mpe@ellerman.id.au Reviewed-by: Kees Cook keescook@chromium.org Fixes: d0ed46b60396 ("tracing: Move readpos from seq_buf to trace_seq") Signed-off-by: Matthew Wilcox (Oracle) willy@infradead.org Signed-off-by: Steven Rostedt (Google) rostedt@goodmis.org Signed-off-by: Sasha Levin sashal@kernel.org --- arch/powerpc/kernel/setup-common.c | 1 - 1 file changed, 1 deletion(-)
diff --git a/arch/powerpc/kernel/setup-common.c b/arch/powerpc/kernel/setup-common.c index d43db8150767..dddf4f31c219 100644 --- a/arch/powerpc/kernel/setup-common.c +++ b/arch/powerpc/kernel/setup-common.c @@ -601,7 +601,6 @@ struct seq_buf ppc_hw_desc __initdata = { .buffer = ppc_hw_desc_buf, .size = sizeof(ppc_hw_desc_buf), .len = 0, - .readpos = 0, };
static __init void probe_machine(void)
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Kees Cook keescook@chromium.org
[ Upstream commit dcc4e5728eeaeda84878ca0018758cff1abfca21 ]
Solve two ergonomic issues with struct seq_buf;
1) Too much boilerplate is required to initialize:
struct seq_buf s; char buf[32];
seq_buf_init(s, buf, sizeof(buf));
Instead, we can build this directly on the stack. Provide DECLARE_SEQ_BUF() macro to do this:
DECLARE_SEQ_BUF(s, 32);
2) %NUL termination is fragile and requires 2 steps to get a valid C String (and is a layering violation exposing the "internals" of seq_buf):
seq_buf_terminate(s); do_something(s->buffer);
Instead, we can just return s->buffer directly after terminating it in the refactored seq_buf_terminate(), now known as seq_buf_str():
do_something(seq_buf_str(s));
Link: https://lore.kernel.org/linux-trace-kernel/20231027155634.make.260-kees@kern... Link: https://lore.kernel.org/linux-trace-kernel/20231026194033.it.702-kees@kernel...
Cc: Yosry Ahmed yosryahmed@google.com Cc: "Matthew Wilcox (Oracle)" willy@infradead.org Cc: Christoph Hellwig hch@lst.de Cc: Justin Stitt justinstitt@google.com Cc: Kent Overstreet kent.overstreet@linux.dev Cc: Petr Mladek pmladek@suse.com Cc: Andy Shevchenko andriy.shevchenko@linux.intel.com Cc: Rasmus Villemoes linux@rasmusvillemoes.dk Cc: Sergey Senozhatsky senozhatsky@chromium.org Cc: Masami Hiramatsu mhiramat@kernel.org Cc: Greg Kroah-Hartman gregkh@linuxfoundation.org Cc: Arnd Bergmann arnd@arndb.de Cc: Jonathan Corbet corbet@lwn.net Cc: Yun Zhou yun.zhou@windriver.com Cc: Jacob Keller jacob.e.keller@intel.com Cc: Zhen Lei thunder.leizhen@huawei.com Signed-off-by: Kees Cook keescook@chromium.org Signed-off-by: Steven Rostedt (Google) rostedt@goodmis.org Stable-dep-of: afd2627f727b ("tracing: Check "%s" dereference via the field and not the TP_printk format") Signed-off-by: Sasha Levin sashal@kernel.org --- include/linux/seq_buf.h | 21 +++++++++++++++++---- kernel/trace/trace.c | 11 +---------- lib/seq_buf.c | 4 +--- 3 files changed, 19 insertions(+), 17 deletions(-)
diff --git a/include/linux/seq_buf.h b/include/linux/seq_buf.h index a0fb013cebdf..d9db59f420a4 100644 --- a/include/linux/seq_buf.h +++ b/include/linux/seq_buf.h @@ -21,9 +21,18 @@ struct seq_buf { size_t len; };
+#define DECLARE_SEQ_BUF(NAME, SIZE) \ + char __ ## NAME ## _buffer[SIZE] = ""; \ + struct seq_buf NAME = { \ + .buffer = &__ ## NAME ## _buffer, \ + .size = SIZE, \ + } + static inline void seq_buf_clear(struct seq_buf *s) { s->len = 0; + if (s->size) + s->buffer[0] = '\0'; }
static inline void @@ -69,8 +78,8 @@ static inline unsigned int seq_buf_used(struct seq_buf *s) }
/** - * seq_buf_terminate - Make sure buffer is nul terminated - * @s: the seq_buf descriptor to terminate. + * seq_buf_str - get %NUL-terminated C string from seq_buf + * @s: the seq_buf handle * * This makes sure that the buffer in @s is nul terminated and * safe to read as a string. @@ -81,16 +90,20 @@ static inline unsigned int seq_buf_used(struct seq_buf *s) * * After this function is called, s->buffer is safe to use * in string operations. + * + * Returns @s->buf after making sure it is terminated. */ -static inline void seq_buf_terminate(struct seq_buf *s) +static inline const char *seq_buf_str(struct seq_buf *s) { if (WARN_ON(s->size == 0)) - return; + return "";
if (seq_buf_buffer_left(s)) s->buffer[s->len] = 0; else s->buffer[s->size - 1] = 0; + + return s->buffer; }
/** diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c index 83f6ef4d7419..d9406a1f8795 100644 --- a/kernel/trace/trace.c +++ b/kernel/trace/trace.c @@ -3810,15 +3810,6 @@ static bool trace_safe_str(struct trace_iterator *iter, const char *str, return false; }
-static const char *show_buffer(struct trace_seq *s) -{ - struct seq_buf *seq = &s->seq; - - seq_buf_terminate(seq); - - return seq->buffer; -} - static DEFINE_STATIC_KEY_FALSE(trace_no_verify);
static int test_can_verify_check(const char *fmt, ...) @@ -3958,7 +3949,7 @@ void trace_check_vprintf(struct trace_iterator *iter, const char *fmt, */ if (WARN_ONCE(!trace_safe_str(iter, str, star, len), "fmt: '%s' current_buffer: '%s'", - fmt, show_buffer(&iter->seq))) { + fmt, seq_buf_str(&iter->seq.seq))) { int ret;
/* Try to safely read the string */ diff --git a/lib/seq_buf.c b/lib/seq_buf.c index b7477aefff53..23518f77ea9c 100644 --- a/lib/seq_buf.c +++ b/lib/seq_buf.c @@ -109,9 +109,7 @@ void seq_buf_do_printk(struct seq_buf *s, const char *lvl) if (s->size == 0 || s->len == 0) return;
- seq_buf_terminate(s); - - start = s->buffer; + start = seq_buf_str(s); while ((lf = strchr(start, '\n'))) { int len = lf - start + 1;
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Steven Rostedt (Google) rostedt@goodmis.org
[ Upstream commit 07714b4bb3f9800261c8b4b2f47e9010ed60979d ]
Use the saved text_delta and data_delta of a persistent memory mapped ring buffer that was saved from a previous boot, and use the delta in the trace event print output so that strings and functions show up normally.
That is, for an event like trace_kmalloc() that prints the callsite via "%pS", if it used the address saved in the ring buffer it will not match the function that was saved in the previous boot if the kernel remaps itself between boots.
For RCU events that point to saved static strings where only the address of the string is saved in the ring buffer, it too will be adjusted to point to where the string is on the current boot.
Link: https://lkml.kernel.org/r/20240612232026.821020753@goodmis.org
Cc: Masami Hiramatsu mhiramat@kernel.org Cc: Mark Rutland mark.rutland@arm.com Cc: Mathieu Desnoyers mathieu.desnoyers@efficios.com Cc: Andrew Morton akpm@linux-foundation.org Cc: Vincent Donnefort vdonnefort@google.com Cc: Joel Fernandes joel@joelfernandes.org Cc: Daniel Bristot de Oliveira bristot@redhat.com Cc: Ingo Molnar mingo@kernel.org Cc: Peter Zijlstra peterz@infradead.org Cc: Thomas Gleixner tglx@linutronix.de Cc: Vineeth Pillai vineeth@bitbyteword.org Cc: Youssef Esmat youssefesmat@google.com Cc: Beau Belgrave beaub@linux.microsoft.com Cc: Alexander Graf graf@amazon.com Cc: Baoquan He bhe@redhat.com Cc: Borislav Petkov bp@alien8.de Cc: "Paul E. McKenney" paulmck@kernel.org Cc: David Howells dhowells@redhat.com Cc: Mike Rapoport rppt@kernel.org Cc: Dave Hansen dave.hansen@linux.intel.com Cc: Tony Luck tony.luck@intel.com Cc: Guenter Roeck linux@roeck-us.net Cc: Ross Zwisler zwisler@google.com Cc: Kees Cook keescook@chromium.org Signed-off-by: Steven Rostedt (Google) rostedt@goodmis.org Stable-dep-of: afd2627f727b ("tracing: Check "%s" dereference via the field and not the TP_printk format") Signed-off-by: Sasha Levin sashal@kernel.org --- kernel/trace/trace.c | 42 +++++++++++++++++++++++++++++++++++++++--- 1 file changed, 39 insertions(+), 3 deletions(-)
diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c index d9406a1f8795..2a45efc4e417 100644 --- a/kernel/trace/trace.c +++ b/kernel/trace/trace.c @@ -3858,8 +3858,11 @@ static void test_can_verify(void) void trace_check_vprintf(struct trace_iterator *iter, const char *fmt, va_list ap) { + long text_delta = iter->tr->text_delta; + long data_delta = iter->tr->data_delta; const char *p = fmt; const char *str; + bool good; int i, j;
if (WARN_ON_ONCE(!fmt)) @@ -3878,7 +3881,10 @@ void trace_check_vprintf(struct trace_iterator *iter, const char *fmt,
j = 0;
- /* We only care about %s and variants */ + /* + * We only care about %s and variants + * as well as %p[sS] if delta is non-zero + */ for (i = 0; p[i]; i++) { if (i + 1 >= iter->fmt_size) { /* @@ -3907,6 +3913,11 @@ void trace_check_vprintf(struct trace_iterator *iter, const char *fmt, } if (p[i+j] == 's') break; + + if (text_delta && p[i+1] == 'p' && + ((p[i+2] == 's' || p[i+2] == 'S'))) + break; + star = false; } j = 0; @@ -3920,6 +3931,24 @@ void trace_check_vprintf(struct trace_iterator *iter, const char *fmt, iter->fmt[i] = '\0'; trace_seq_vprintf(&iter->seq, iter->fmt, ap);
+ /* Add delta to %pS pointers */ + if (p[i+1] == 'p') { + unsigned long addr; + char fmt[4]; + + fmt[0] = '%'; + fmt[1] = 'p'; + fmt[2] = p[i+2]; /* Either %ps or %pS */ + fmt[3] = '\0'; + + addr = va_arg(ap, unsigned long); + addr += text_delta; + trace_seq_printf(&iter->seq, fmt, (void *)addr); + + p += i + 3; + continue; + } + /* * If iter->seq is full, the above call no longer guarantees * that ap is in sync with fmt processing, and further calls @@ -3938,6 +3967,14 @@ void trace_check_vprintf(struct trace_iterator *iter, const char *fmt, /* The ap now points to the string data of the %s */ str = va_arg(ap, const char *);
+ good = trace_safe_str(iter, str, star, len); + + /* Could be from the last boot */ + if (data_delta && !good) { + str += data_delta; + good = trace_safe_str(iter, str, star, len); + } + /* * If you hit this warning, it is likely that the * trace event in question used %s on a string that @@ -3947,8 +3984,7 @@ void trace_check_vprintf(struct trace_iterator *iter, const char *fmt, * instead. See samples/trace_events/trace-events-sample.h * for reference. */ - if (WARN_ONCE(!trace_safe_str(iter, str, star, len), - "fmt: '%s' current_buffer: '%s'", + if (WARN_ONCE(!good, "fmt: '%s' current_buffer: '%s'", fmt, seq_buf_str(&iter->seq.seq))) { int ret;
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Steven Rostedt rostedt@goodmis.org
[ Upstream commit 50a3242d84ee1625b0bfef29b95f935958dccfbe ]
When the tp_printk kernel command line is used, the trace events go directly to printk(). It is still checked via the trace_check_vprintf() function to make sure the pointers of the trace event are legit.
The addition of reading buffers from previous boots required adding a delta between the addresses of the previous boot and the current boot so that the pointers in the old buffer can still be used. But this required adding a trace_array pointer to acquire the delta offsets.
The tp_printk code does not provide a trace_array (tr) pointer, so when the offsets were examined, a NULL pointer dereference happened and the kernel crashed.
If the trace_array does not exist, just default the delta offsets to zero, as that also means the trace event is not being read from a previous boot.
Link: https://lore.kernel.org/all/Zv3z5UsG_jsO9_Tb@aschofie-mobl2.lan/
Cc: Masami Hiramatsu mhiramat@kernel.org Cc: Mathieu Desnoyers mathieu.desnoyers@efficios.com Link: https://lore.kernel.org/20241003104925.4e1b1fd9@gandalf.local.home Fixes: 07714b4bb3f98 ("tracing: Handle old buffer mappings for event strings and functions") Reported-by: Alison Schofield alison.schofield@intel.com Tested-by: Alison Schofield alison.schofield@intel.com Signed-off-by: Steven Rostedt (Google) rostedt@goodmis.org Stable-dep-of: afd2627f727b ("tracing: Check "%s" dereference via the field and not the TP_printk format") Signed-off-by: Sasha Levin sashal@kernel.org --- kernel/trace/trace.c | 15 +++++++++++++-- 1 file changed, 13 insertions(+), 2 deletions(-)
diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c index 2a45efc4e417..addc1b326c79 100644 --- a/kernel/trace/trace.c +++ b/kernel/trace/trace.c @@ -3858,8 +3858,8 @@ static void test_can_verify(void) void trace_check_vprintf(struct trace_iterator *iter, const char *fmt, va_list ap) { - long text_delta = iter->tr->text_delta; - long data_delta = iter->tr->data_delta; + long text_delta = 0; + long data_delta = 0; const char *p = fmt; const char *str; bool good; @@ -3871,6 +3871,17 @@ void trace_check_vprintf(struct trace_iterator *iter, const char *fmt, if (static_branch_unlikely(&trace_no_verify)) goto print;
+ /* + * When the kernel is booted with the tp_printk command line + * parameter, trace events go directly through to printk(). + * It also is checked by this function, but it does not + * have an associated trace_array (tr) for it. + */ + if (iter->tr) { + text_delta = iter->tr->text_delta; + data_delta = iter->tr->data_delta; + } + /* Don't bother checking when doing a ftrace_dump() */ if (iter->fmt == static_fmt_buf) goto print;
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Steven Rostedt rostedt@goodmis.org
[ Upstream commit afd2627f727b89496d79a6b934a025fc916d4ded ]
The TP_printk() portion of a trace event is executed at the time a event is read from the trace. This can happen seconds, minutes, hours, days, months, years possibly later since the event was recorded. If the print format contains a dereference to a string via "%s", and that string was allocated, there's a chance that string could be freed before it is read by the trace file.
To protect against such bugs, there are two functions that verify the event. The first one is test_event_printk(), which is called when the event is created. It reads the TP_printk() format as well as its arguments to make sure nothing may be dereferencing a pointer that was not copied into the ring buffer along with the event. If it is, it will trigger a WARN_ON().
For strings that use "%s", it is not so easy. The string may not reside in the ring buffer but may still be valid. Strings that are static and part of the kernel proper which will not be freed for the life of the running system, are safe to dereference. But to know if it is a pointer to a static string or to something on the heap can not be determined until the event is triggered.
This brings us to the second function that tests for the bad dereferencing of strings, trace_check_vprintf(). It would walk through the printf format looking for "%s", and when it finds it, it would validate that the pointer is safe to read. If not, it would produces a WARN_ON() as well and write into the ring buffer "[UNSAFE-MEMORY]".
The problem with this is how it used va_list to have vsnprintf() handle all the cases that it didn't need to check. Instead of re-implementing vsnprintf(), it would make a copy of the format up to the %s part, and call vsnprintf() with the current va_list ap variable, where the ap would then be ready to point at the string in question.
For architectures that passed va_list by reference this was possible. For architectures that passed it by copy it was not. A test_can_verify() function was used to differentiate between the two, and if it wasn't possible, it would disable it.
Even for architectures where this was feasible, it was a stretch to rely on such a method that is undocumented, and could cause issues later on with new optimizations of the compiler.
Instead, the first function test_event_printk() was updated to look at "%s" as well. If the "%s" argument is a pointer outside the event in the ring buffer, it would find the field type of the event that is the problem and mark the structure with a new flag called "needs_test". The event itself will be marked by TRACE_EVENT_FL_TEST_STR to let it be known that this event has a field that needs to be verified before the event can be printed using the printf format.
When the event fields are created from the field type structure, the fields would copy the field type's "needs_test" value.
Finally, before being printed, a new function ignore_event() is called which will check if the event has the TEST_STR flag set (if not, it returns false). If the flag is set, it then iterates through the events fields looking for the ones that have the "needs_test" flag set.
Then it uses the offset field from the field structure to find the pointer in the ring buffer event. It runs the tests to make sure that pointer is safe to print and if not, it triggers the WARN_ON() and also adds to the trace output that the event in question has an unsafe memory access.
The ignore_event() makes the trace_check_vprintf() obsolete so it is removed.
Link: https://lore.kernel.org/all/CAHk-=wh3uOnqnZPpR0PeLZZtyWbZLboZ7cHLCKRWsocvs9Y...
Cc: stable@vger.kernel.org Cc: Masami Hiramatsu mhiramat@kernel.org Cc: Mark Rutland mark.rutland@arm.com Cc: Mathieu Desnoyers mathieu.desnoyers@efficios.com Cc: Andrew Morton akpm@linux-foundation.org Cc: Al Viro viro@ZenIV.linux.org.uk Cc: Linus Torvalds torvalds@linux-foundation.org Link: https://lore.kernel.org/20241217024720.848621576@goodmis.org Fixes: 5013f454a352c ("tracing: Add check of trace event print fmts for dereferencing pointers") Signed-off-by: Steven Rostedt (Google) rostedt@goodmis.org Signed-off-by: Sasha Levin sashal@kernel.org --- include/linux/trace_events.h | 6 +- kernel/trace/trace.c | 255 ++++++++--------------------------- kernel/trace/trace.h | 6 +- kernel/trace/trace_events.c | 32 +++-- kernel/trace/trace_output.c | 6 +- 5 files changed, 88 insertions(+), 217 deletions(-)
diff --git a/include/linux/trace_events.h b/include/linux/trace_events.h index 9df2524fff33..aa1bc4172662 100644 --- a/include/linux/trace_events.h +++ b/include/linux/trace_events.h @@ -279,7 +279,8 @@ struct trace_event_fields { const char *name; const int size; const int align; - const int is_signed; + const unsigned int is_signed:1; + unsigned int needs_test:1; const int filter_type; const int len; }; @@ -331,6 +332,7 @@ enum { TRACE_EVENT_FL_EPROBE_BIT, TRACE_EVENT_FL_FPROBE_BIT, TRACE_EVENT_FL_CUSTOM_BIT, + TRACE_EVENT_FL_TEST_STR_BIT, };
/* @@ -348,6 +350,7 @@ enum { * CUSTOM - Event is a custom event (to be attached to an exsiting tracepoint) * This is set when the custom event has not been attached * to a tracepoint yet, then it is cleared when it is. + * TEST_STR - The event has a "%s" that points to a string outside the event */ enum { TRACE_EVENT_FL_FILTERED = (1 << TRACE_EVENT_FL_FILTERED_BIT), @@ -361,6 +364,7 @@ enum { TRACE_EVENT_FL_EPROBE = (1 << TRACE_EVENT_FL_EPROBE_BIT), TRACE_EVENT_FL_FPROBE = (1 << TRACE_EVENT_FL_FPROBE_BIT), TRACE_EVENT_FL_CUSTOM = (1 << TRACE_EVENT_FL_CUSTOM_BIT), + TRACE_EVENT_FL_TEST_STR = (1 << TRACE_EVENT_FL_TEST_STR_BIT), };
#define TRACE_EVENT_FL_UKPROBE (TRACE_EVENT_FL_KPROBE | TRACE_EVENT_FL_UPROBE) diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c index addc1b326c79..9d9af60b238e 100644 --- a/kernel/trace/trace.c +++ b/kernel/trace/trace.c @@ -3760,17 +3760,12 @@ char *trace_iter_expand_format(struct trace_iterator *iter) }
/* Returns true if the string is safe to dereference from an event */ -static bool trace_safe_str(struct trace_iterator *iter, const char *str, - bool star, int len) +static bool trace_safe_str(struct trace_iterator *iter, const char *str) { unsigned long addr = (unsigned long)str; struct trace_event *trace_event; struct trace_event_call *event;
- /* Ignore strings with no length */ - if (star && !len) - return true; - /* OK if part of the event data */ if ((addr >= (unsigned long)iter->ent) && (addr < (unsigned long)iter->ent + iter->ent_size)) @@ -3810,181 +3805,69 @@ static bool trace_safe_str(struct trace_iterator *iter, const char *str, return false; }
-static DEFINE_STATIC_KEY_FALSE(trace_no_verify); - -static int test_can_verify_check(const char *fmt, ...) -{ - char buf[16]; - va_list ap; - int ret; - - /* - * The verifier is dependent on vsnprintf() modifies the va_list - * passed to it, where it is sent as a reference. Some architectures - * (like x86_32) passes it by value, which means that vsnprintf() - * does not modify the va_list passed to it, and the verifier - * would then need to be able to understand all the values that - * vsnprintf can use. If it is passed by value, then the verifier - * is disabled. - */ - va_start(ap, fmt); - vsnprintf(buf, 16, "%d", ap); - ret = va_arg(ap, int); - va_end(ap); - - return ret; -} - -static void test_can_verify(void) -{ - if (!test_can_verify_check("%d %d", 0, 1)) { - pr_info("trace event string verifier disabled\n"); - static_branch_inc(&trace_no_verify); - } -} - /** - * trace_check_vprintf - Check dereferenced strings while writing to the seq buffer + * ignore_event - Check dereferenced fields while writing to the seq buffer * @iter: The iterator that holds the seq buffer and the event being printed - * @fmt: The format used to print the event - * @ap: The va_list holding the data to print from @fmt. * - * This writes the data into the @iter->seq buffer using the data from - * @fmt and @ap. If the format has a %s, then the source of the string - * is examined to make sure it is safe to print, otherwise it will - * warn and print "[UNSAFE MEMORY]" in place of the dereferenced string - * pointer. + * At boot up, test_event_printk() will flag any event that dereferences + * a string with "%s" that does exist in the ring buffer. It may still + * be valid, as the string may point to a static string in the kernel + * rodata that never gets freed. But if the string pointer is pointing + * to something that was allocated, there's a chance that it can be freed + * by the time the user reads the trace. This would cause a bad memory + * access by the kernel and possibly crash the system. + * + * This function will check if the event has any fields flagged as needing + * to be checked at runtime and perform those checks. + * + * If it is found that a field is unsafe, it will write into the @iter->seq + * a message stating what was found to be unsafe. + * + * @return: true if the event is unsafe and should be ignored, + * false otherwise. */ -void trace_check_vprintf(struct trace_iterator *iter, const char *fmt, - va_list ap) +bool ignore_event(struct trace_iterator *iter) { - long text_delta = 0; - long data_delta = 0; - const char *p = fmt; - const char *str; - bool good; - int i, j; + struct ftrace_event_field *field; + struct trace_event *trace_event; + struct trace_event_call *event; + struct list_head *head; + struct trace_seq *seq; + const void *ptr;
- if (WARN_ON_ONCE(!fmt)) - return; + trace_event = ftrace_find_event(iter->ent->type);
- if (static_branch_unlikely(&trace_no_verify)) - goto print; + seq = &iter->seq;
- /* - * When the kernel is booted with the tp_printk command line - * parameter, trace events go directly through to printk(). - * It also is checked by this function, but it does not - * have an associated trace_array (tr) for it. - */ - if (iter->tr) { - text_delta = iter->tr->text_delta; - data_delta = iter->tr->data_delta; + if (!trace_event) { + trace_seq_printf(seq, "EVENT ID %d NOT FOUND?\n", iter->ent->type); + return true; }
- /* Don't bother checking when doing a ftrace_dump() */ - if (iter->fmt == static_fmt_buf) - goto print; - - while (*p) { - bool star = false; - int len = 0; - - j = 0; - - /* - * We only care about %s and variants - * as well as %p[sS] if delta is non-zero - */ - for (i = 0; p[i]; i++) { - if (i + 1 >= iter->fmt_size) { - /* - * If we can't expand the copy buffer, - * just print it. - */ - if (!trace_iter_expand_format(iter)) - goto print; - } - - if (p[i] == '\' && p[i+1]) { - i++; - continue; - } - if (p[i] == '%') { - /* Need to test cases like %08.*s */ - for (j = 1; p[i+j]; j++) { - if (isdigit(p[i+j]) || - p[i+j] == '.') - continue; - if (p[i+j] == '*') { - star = true; - continue; - } - break; - } - if (p[i+j] == 's') - break; - - if (text_delta && p[i+1] == 'p' && - ((p[i+2] == 's' || p[i+2] == 'S'))) - break; - - star = false; - } - j = 0; - } - /* If no %s found then just print normally */ - if (!p[i]) - break; - - /* Copy up to the %s, and print that */ - strncpy(iter->fmt, p, i); - iter->fmt[i] = '\0'; - trace_seq_vprintf(&iter->seq, iter->fmt, ap); + event = container_of(trace_event, struct trace_event_call, event); + if (!(event->flags & TRACE_EVENT_FL_TEST_STR)) + return false;
- /* Add delta to %pS pointers */ - if (p[i+1] == 'p') { - unsigned long addr; - char fmt[4]; + head = trace_get_fields(event); + if (!head) { + trace_seq_printf(seq, "FIELDS FOR EVENT '%s' NOT FOUND?\n", + trace_event_name(event)); + return true; + }
- fmt[0] = '%'; - fmt[1] = 'p'; - fmt[2] = p[i+2]; /* Either %ps or %pS */ - fmt[3] = '\0'; + /* Offsets are from the iter->ent that points to the raw event */ + ptr = iter->ent;
- addr = va_arg(ap, unsigned long); - addr += text_delta; - trace_seq_printf(&iter->seq, fmt, (void *)addr); + list_for_each_entry(field, head, link) { + const char *str; + bool good;
- p += i + 3; + if (!field->needs_test) continue; - }
- /* - * If iter->seq is full, the above call no longer guarantees - * that ap is in sync with fmt processing, and further calls - * to va_arg() can return wrong positional arguments. - * - * Ensure that ap is no longer used in this case. - */ - if (iter->seq.full) { - p = ""; - break; - } - - if (star) - len = va_arg(ap, int); - - /* The ap now points to the string data of the %s */ - str = va_arg(ap, const char *); + str = *(const char **)(ptr + field->offset);
- good = trace_safe_str(iter, str, star, len); - - /* Could be from the last boot */ - if (data_delta && !good) { - str += data_delta; - good = trace_safe_str(iter, str, star, len); - } + good = trace_safe_str(iter, str);
/* * If you hit this warning, it is likely that the @@ -3995,44 +3878,14 @@ void trace_check_vprintf(struct trace_iterator *iter, const char *fmt, * instead. See samples/trace_events/trace-events-sample.h * for reference. */ - if (WARN_ONCE(!good, "fmt: '%s' current_buffer: '%s'", - fmt, seq_buf_str(&iter->seq.seq))) { - int ret; - - /* Try to safely read the string */ - if (star) { - if (len + 1 > iter->fmt_size) - len = iter->fmt_size - 1; - if (len < 0) - len = 0; - ret = copy_from_kernel_nofault(iter->fmt, str, len); - iter->fmt[len] = 0; - star = false; - } else { - ret = strncpy_from_kernel_nofault(iter->fmt, str, - iter->fmt_size); - } - if (ret < 0) - trace_seq_printf(&iter->seq, "(0x%px)", str); - else - trace_seq_printf(&iter->seq, "(0x%px:%s)", - str, iter->fmt); - str = "[UNSAFE-MEMORY]"; - strcpy(iter->fmt, "%s"); - } else { - strncpy(iter->fmt, p + i, j + 1); - iter->fmt[j+1] = '\0'; + if (WARN_ONCE(!good, "event '%s' has unsafe pointer field '%s'", + trace_event_name(event), field->name)) { + trace_seq_printf(seq, "EVENT %s: HAS UNSAFE POINTER FIELD '%s'\n", + trace_event_name(event), field->name); + return true; } - if (star) - trace_seq_printf(&iter->seq, iter->fmt, len, str); - else - trace_seq_printf(&iter->seq, iter->fmt, str); - - p += i + j + 1; } - print: - if (*p) - trace_seq_vprintf(&iter->seq, p, ap); + return false; }
const char *trace_event_format(struct trace_iterator *iter, const char *fmt) @@ -10577,8 +10430,6 @@ __init static int tracer_alloc_buffers(void)
register_snapshot_cmd();
- test_can_verify(); - return 0;
out_free_pipe_cpumask: diff --git a/kernel/trace/trace.h b/kernel/trace/trace.h index 3db42bae73f8..e45756f1ac2b 100644 --- a/kernel/trace/trace.h +++ b/kernel/trace/trace.h @@ -644,9 +644,8 @@ void trace_buffer_unlock_commit_nostack(struct trace_buffer *buffer,
bool trace_is_tracepoint_string(const char *str); const char *trace_event_format(struct trace_iterator *iter, const char *fmt); -void trace_check_vprintf(struct trace_iterator *iter, const char *fmt, - va_list ap) __printf(2, 0); char *trace_iter_expand_format(struct trace_iterator *iter); +bool ignore_event(struct trace_iterator *iter);
int trace_empty(struct trace_iterator *iter);
@@ -1323,7 +1322,8 @@ struct ftrace_event_field { int filter_type; int offset; int size; - int is_signed; + unsigned int is_signed:1; + unsigned int needs_test:1; int len; };
diff --git a/kernel/trace/trace_events.c b/kernel/trace/trace_events.c index 2ee59b217d7d..9d22745cdea5 100644 --- a/kernel/trace/trace_events.c +++ b/kernel/trace/trace_events.c @@ -82,7 +82,7 @@ static int system_refcount_dec(struct event_subsystem *system) }
static struct ftrace_event_field * -__find_event_field(struct list_head *head, char *name) +__find_event_field(struct list_head *head, const char *name) { struct ftrace_event_field *field;
@@ -114,7 +114,8 @@ trace_find_event_field(struct trace_event_call *call, char *name)
static int __trace_define_field(struct list_head *head, const char *type, const char *name, int offset, int size, - int is_signed, int filter_type, int len) + int is_signed, int filter_type, int len, + int need_test) { struct ftrace_event_field *field;
@@ -133,6 +134,7 @@ static int __trace_define_field(struct list_head *head, const char *type, field->offset = offset; field->size = size; field->is_signed = is_signed; + field->needs_test = need_test; field->len = len;
list_add(&field->link, head); @@ -151,13 +153,13 @@ int trace_define_field(struct trace_event_call *call, const char *type,
head = trace_get_fields(call); return __trace_define_field(head, type, name, offset, size, - is_signed, filter_type, 0); + is_signed, filter_type, 0, 0); } EXPORT_SYMBOL_GPL(trace_define_field);
static int trace_define_field_ext(struct trace_event_call *call, const char *type, const char *name, int offset, int size, int is_signed, - int filter_type, int len) + int filter_type, int len, int need_test) { struct list_head *head;
@@ -166,13 +168,13 @@ static int trace_define_field_ext(struct trace_event_call *call, const char *typ
head = trace_get_fields(call); return __trace_define_field(head, type, name, offset, size, - is_signed, filter_type, len); + is_signed, filter_type, len, need_test); }
#define __generic_field(type, item, filter_type) \ ret = __trace_define_field(&ftrace_generic_fields, #type, \ #item, 0, 0, is_signed_type(type), \ - filter_type, 0); \ + filter_type, 0, 0); \ if (ret) \ return ret;
@@ -181,7 +183,8 @@ static int trace_define_field_ext(struct trace_event_call *call, const char *typ "common_" #item, \ offsetof(typeof(ent), item), \ sizeof(ent.item), \ - is_signed_type(type), FILTER_OTHER, 0); \ + is_signed_type(type), FILTER_OTHER, \ + 0, 0); \ if (ret) \ return ret;
@@ -332,6 +335,7 @@ static bool process_pointer(const char *fmt, int len, struct trace_event_call *c /* Return true if the string is safe */ static bool process_string(const char *fmt, int len, struct trace_event_call *call) { + struct trace_event_fields *field; const char *r, *e, *s;
e = fmt + len; @@ -384,8 +388,16 @@ static bool process_string(const char *fmt, int len, struct trace_event_call *ca if (process_pointer(fmt, len, call)) return true;
- /* Make sure the field is found, and consider it OK for now if it is */ - return find_event_field(fmt, call) != NULL; + /* Make sure the field is found */ + field = find_event_field(fmt, call); + if (!field) + return false; + + /* Test this field's string before printing the event */ + call->flags |= TRACE_EVENT_FL_TEST_STR; + field->needs_test = 1; + + return true; }
/* @@ -2564,7 +2576,7 @@ event_define_fields(struct trace_event_call *call) ret = trace_define_field_ext(call, field->type, field->name, offset, field->size, field->is_signed, field->filter_type, - field->len); + field->len, field->needs_test); if (WARN_ON_ONCE(ret)) { pr_err("error code is %d\n", ret); break; diff --git a/kernel/trace/trace_output.c b/kernel/trace/trace_output.c index db575094c498..2b948d35fb59 100644 --- a/kernel/trace/trace_output.c +++ b/kernel/trace/trace_output.c @@ -317,10 +317,14 @@ EXPORT_SYMBOL(trace_raw_output_prep);
void trace_event_printf(struct trace_iterator *iter, const char *fmt, ...) { + struct trace_seq *s = &iter->seq; va_list ap;
+ if (ignore_event(iter)) + return; + va_start(ap, fmt); - trace_check_vprintf(iter, trace_event_format(iter, fmt), ap); + trace_seq_vprintf(s, trace_event_format(iter, fmt), ap); va_end(ap); } EXPORT_SYMBOL(trace_event_printf);
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Selvin Xavier selvin.xavier@broadcom.com
[ Upstream commit 8d310ba845827a38fcd463d86bfe3b730ce7ab8f ]
FW reports the HW capability to use PSN table or MSN table and driver/library need to select it based on this capability. Use the new capability instead of the older capability check for HW retransmission while handling the MSN/PSN table. FW report zero (PSN table) for older adapters to maintain backward compatibility.
Also, Updated the FW interface structures to handle the new fields.
Signed-off-by: Selvin Xavier selvin.xavier@broadcom.com Link: https://lore.kernel.org/r/1716876697-25970-2-git-send-email-selvin.xavier@br... Signed-off-by: Leon Romanovsky leon@kernel.org Stable-dep-of: eb867d797d29 ("RDMA/bnxt_re: Remove always true dattr validity check") Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/infiniband/hw/bnxt_re/qplib_fp.c | 12 ++++----- drivers/infiniband/hw/bnxt_re/qplib_fp.h | 2 +- drivers/infiniband/hw/bnxt_re/qplib_res.h | 6 +++++ drivers/infiniband/hw/bnxt_re/qplib_sp.c | 1 + drivers/infiniband/hw/bnxt_re/qplib_sp.h | 1 + drivers/infiniband/hw/bnxt_re/roce_hsi.h | 30 ++++++++++++++++++++++- 6 files changed, 44 insertions(+), 8 deletions(-)
diff --git a/drivers/infiniband/hw/bnxt_re/qplib_fp.c b/drivers/infiniband/hw/bnxt_re/qplib_fp.c index b624c255eee6..3e07500dcbcf 100644 --- a/drivers/infiniband/hw/bnxt_re/qplib_fp.c +++ b/drivers/infiniband/hw/bnxt_re/qplib_fp.c @@ -981,7 +981,7 @@ int bnxt_qplib_create_qp(struct bnxt_qplib_res *res, struct bnxt_qplib_qp *qp) u16 nsge;
if (res->dattr) - qp->dev_cap_flags = res->dattr->dev_cap_flags; + qp->is_host_msn_tbl = _is_host_msn_table(res->dattr->dev_cap_flags2);
sq->dbinfo.flags = 0; bnxt_qplib_rcfw_cmd_prep((struct cmdq_base *)&req, @@ -999,7 +999,7 @@ int bnxt_qplib_create_qp(struct bnxt_qplib_res *res, struct bnxt_qplib_qp *qp) sizeof(struct sq_psn_search_ext) : sizeof(struct sq_psn_search);
- if (BNXT_RE_HW_RETX(qp->dev_cap_flags)) { + if (qp->is_host_msn_tbl) { psn_sz = sizeof(struct sq_msn_search); qp->msn = 0; } @@ -1013,7 +1013,7 @@ int bnxt_qplib_create_qp(struct bnxt_qplib_res *res, struct bnxt_qplib_qp *qp) hwq_attr.aux_depth = psn_sz ? bnxt_qplib_set_sq_size(sq, qp->wqe_mode) : 0; /* Update msn tbl size */ - if (BNXT_RE_HW_RETX(qp->dev_cap_flags) && psn_sz) { + if (qp->is_host_msn_tbl && psn_sz) { hwq_attr.aux_depth = roundup_pow_of_two(bnxt_qplib_set_sq_size(sq, qp->wqe_mode)); qp->msn_tbl_sz = hwq_attr.aux_depth; qp->msn = 0; @@ -1638,7 +1638,7 @@ static void bnxt_qplib_fill_psn_search(struct bnxt_qplib_qp *qp, if (!swq->psn_search) return; /* Handle MSN differently on cap flags */ - if (BNXT_RE_HW_RETX(qp->dev_cap_flags)) { + if (qp->is_host_msn_tbl) { bnxt_qplib_fill_msn_search(qp, wqe, swq); return; } @@ -1820,7 +1820,7 @@ int bnxt_qplib_post_send(struct bnxt_qplib_qp *qp, }
swq = bnxt_qplib_get_swqe(sq, &wqe_idx); - bnxt_qplib_pull_psn_buff(qp, sq, swq, BNXT_RE_HW_RETX(qp->dev_cap_flags)); + bnxt_qplib_pull_psn_buff(qp, sq, swq, qp->is_host_msn_tbl);
idx = 0; swq->slot_idx = hwq->prod; @@ -2010,7 +2010,7 @@ int bnxt_qplib_post_send(struct bnxt_qplib_qp *qp, rc = -EINVAL; goto done; } - if (!BNXT_RE_HW_RETX(qp->dev_cap_flags) || msn_update) { + if (!qp->is_host_msn_tbl || msn_update) { swq->next_psn = sq->psn & BTH_PSN_MASK; bnxt_qplib_fill_psn_search(qp, wqe, swq); } diff --git a/drivers/infiniband/hw/bnxt_re/qplib_fp.h b/drivers/infiniband/hw/bnxt_re/qplib_fp.h index 5d4c49089a20..3a15ca7feb2b 100644 --- a/drivers/infiniband/hw/bnxt_re/qplib_fp.h +++ b/drivers/infiniband/hw/bnxt_re/qplib_fp.h @@ -340,7 +340,7 @@ struct bnxt_qplib_qp { struct list_head rq_flush; u32 msn; u32 msn_tbl_sz; - u16 dev_cap_flags; + bool is_host_msn_tbl; };
#define BNXT_QPLIB_MAX_CQE_ENTRY_SIZE sizeof(struct cq_base) diff --git a/drivers/infiniband/hw/bnxt_re/qplib_res.h b/drivers/infiniband/hw/bnxt_re/qplib_res.h index f9e7aa3757cf..c2152122a432 100644 --- a/drivers/infiniband/hw/bnxt_re/qplib_res.h +++ b/drivers/infiniband/hw/bnxt_re/qplib_res.h @@ -523,6 +523,12 @@ static inline bool _is_hw_retx_supported(u16 dev_cap_flags)
#define BNXT_RE_HW_RETX(a) _is_hw_retx_supported((a))
+static inline bool _is_host_msn_table(u16 dev_cap_ext_flags2) +{ + return (dev_cap_ext_flags2 & CREQ_QUERY_FUNC_RESP_SB_REQ_RETRANSMISSION_SUPPORT_MASK) == + CREQ_QUERY_FUNC_RESP_SB_REQ_RETRANSMISSION_SUPPORT_HOST_MSN_TABLE; +} + static inline u8 bnxt_qplib_dbr_pacing_en(struct bnxt_qplib_chip_ctx *cctx) { return cctx->modes.dbr_pacing; diff --git a/drivers/infiniband/hw/bnxt_re/qplib_sp.c b/drivers/infiniband/hw/bnxt_re/qplib_sp.c index 0b98577cd708..420f8613bcd5 100644 --- a/drivers/infiniband/hw/bnxt_re/qplib_sp.c +++ b/drivers/infiniband/hw/bnxt_re/qplib_sp.c @@ -165,6 +165,7 @@ int bnxt_qplib_get_dev_attr(struct bnxt_qplib_rcfw *rcfw, attr->max_sgid = le32_to_cpu(sb->max_gid); attr->max_sgid = min_t(u32, BNXT_QPLIB_NUM_GIDS_SUPPORTED, 2 * attr->max_sgid); attr->dev_cap_flags = le16_to_cpu(sb->dev_cap_flags); + attr->dev_cap_flags2 = le16_to_cpu(sb->dev_cap_ext_flags_2);
bnxt_qplib_query_version(rcfw, attr->fw_ver);
diff --git a/drivers/infiniband/hw/bnxt_re/qplib_sp.h b/drivers/infiniband/hw/bnxt_re/qplib_sp.h index 755765e68eaa..2f16f3db093e 100644 --- a/drivers/infiniband/hw/bnxt_re/qplib_sp.h +++ b/drivers/infiniband/hw/bnxt_re/qplib_sp.h @@ -73,6 +73,7 @@ struct bnxt_qplib_dev_attr { u8 tqm_alloc_reqs[MAX_TQM_ALLOC_REQ]; bool is_atomic; u16 dev_cap_flags; + u16 dev_cap_flags2; u32 max_dpi; };
diff --git a/drivers/infiniband/hw/bnxt_re/roce_hsi.h b/drivers/infiniband/hw/bnxt_re/roce_hsi.h index 2909608f4b5d..cb4e7e19fbaf 100644 --- a/drivers/infiniband/hw/bnxt_re/roce_hsi.h +++ b/drivers/infiniband/hw/bnxt_re/roce_hsi.h @@ -2157,8 +2157,36 @@ struct creq_query_func_resp_sb { __le32 tqm_alloc_reqs[12]; __le32 max_dpi; u8 max_sge_var_wqe; - u8 reserved_8; + u8 dev_cap_ext_flags; + #define CREQ_QUERY_FUNC_RESP_SB_ATOMIC_OPS_NOT_SUPPORTED 0x1UL + #define CREQ_QUERY_FUNC_RESP_SB_DRV_VERSION_RGTR_SUPPORTED 0x2UL + #define CREQ_QUERY_FUNC_RESP_SB_CREATE_QP_BATCH_SUPPORTED 0x4UL + #define CREQ_QUERY_FUNC_RESP_SB_DESTROY_QP_BATCH_SUPPORTED 0x8UL + #define CREQ_QUERY_FUNC_RESP_SB_ROCE_STATS_EXT_CTX_SUPPORTED 0x10UL + #define CREQ_QUERY_FUNC_RESP_SB_CREATE_SRQ_SGE_SUPPORTED 0x20UL + #define CREQ_QUERY_FUNC_RESP_SB_FIXED_SIZE_WQE_DISABLED 0x40UL + #define CREQ_QUERY_FUNC_RESP_SB_DCN_SUPPORTED 0x80UL __le16 max_inline_data_var_wqe; + __le32 start_qid; + u8 max_msn_table_size; + u8 reserved8_1; + __le16 dev_cap_ext_flags_2; + #define CREQ_QUERY_FUNC_RESP_SB_OPTIMIZE_MODIFY_QP_SUPPORTED 0x1UL + #define CREQ_QUERY_FUNC_RESP_SB_CHANGE_UDP_SRC_PORT_WQE_SUPPORTED 0x2UL + #define CREQ_QUERY_FUNC_RESP_SB_CQ_COALESCING_SUPPORTED 0x4UL + #define CREQ_QUERY_FUNC_RESP_SB_MEMORY_REGION_RO_SUPPORTED 0x8UL + #define CREQ_QUERY_FUNC_RESP_SB_REQ_RETRANSMISSION_SUPPORT_MASK 0x30UL + #define CREQ_QUERY_FUNC_RESP_SB_REQ_RETRANSMISSION_SUPPORT_SFT 4 + #define CREQ_QUERY_FUNC_RESP_SB_REQ_RETRANSMISSION_SUPPORT_HOST_PSN_TABLE (0x0UL << 4) + #define CREQ_QUERY_FUNC_RESP_SB_REQ_RETRANSMISSION_SUPPORT_HOST_MSN_TABLE (0x1UL << 4) + #define CREQ_QUERY_FUNC_RESP_SB_REQ_RETRANSMISSION_SUPPORT_IQM_MSN_TABLE (0x2UL << 4) + #define CREQ_QUERY_FUNC_RESP_SB_REQ_RETRANSMISSION_SUPPORT_LAST \ + CREQ_QUERY_FUNC_RESP_SB_REQ_RETRANSMISSION_SUPPORT_IQM_MSN_TABLE + __le16 max_xp_qp_size; + __le16 create_qp_batch_size; + __le16 destroy_qp_batch_size; + __le16 reserved16; + __le64 reserved64; };
/* cmdq_set_func_resources (size:448b/56B) */
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Leon Romanovsky leonro@nvidia.com
[ Upstream commit eb867d797d294a00a092b5027d08439da68940b2 ]
res->dattr is always valid at this point as it was initialized during device addition in bnxt_re_add_device().
This change is fixing the following smatch error: drivers/infiniband/hw/bnxt_re/qplib_fp.c:1090 bnxt_qplib_create_qp() error: we previously assumed 'res->dattr' could be null (see line 985)
Fixes: 07f830ae4913 ("RDMA/bnxt_re: Adds MSN table capability for Gen P7 adapters") Reported-by: kernel test robot lkp@intel.com Reported-by: Dan Carpenter dan.carpenter@linaro.org Closes: https://lore.kernel.org/r/202411222329.YTrwonWi-lkp@intel.com/ Link: https://patch.msgid.link/be0d8836b64cba3e479fbcbca717acad04aae02e.1732626579... Acked-by: Selvin Xavier selvin.xavier@broadcom.com Signed-off-by: Leon Romanovsky leonro@nvidia.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/infiniband/hw/bnxt_re/qplib_fp.c | 4 +--- 1 file changed, 1 insertion(+), 3 deletions(-)
diff --git a/drivers/infiniband/hw/bnxt_re/qplib_fp.c b/drivers/infiniband/hw/bnxt_re/qplib_fp.c index 3e07500dcbcf..4098e01666d1 100644 --- a/drivers/infiniband/hw/bnxt_re/qplib_fp.c +++ b/drivers/infiniband/hw/bnxt_re/qplib_fp.c @@ -980,9 +980,7 @@ int bnxt_qplib_create_qp(struct bnxt_qplib_res *res, struct bnxt_qplib_qp *qp) u32 tbl_indx; u16 nsge;
- if (res->dattr) - qp->is_host_msn_tbl = _is_host_msn_table(res->dattr->dev_cap_flags2); - + qp->is_host_msn_tbl = _is_host_msn_table(res->dattr->dev_cap_flags2); sq->dbinfo.flags = 0; bnxt_qplib_rcfw_cmd_prep((struct cmdq_base *)&req, CMDQ_BASE_OPCODE_CREATE_QP,
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Patrisious Haddad phaddad@nvidia.com
[ Upstream commit e05feab22fd7dabcd6d272c4e2401ec1acdfdb9b ]
Different core device types such as PFs and VFs shouldn't be affiliated together since they have different capabilities, fix that by enforcing type check before doing the affiliation.
Fixes: 32f69e4be269 ("{net, IB}/mlx5: Manage port association for multiport RoCE") Reviewed-by: Mark Bloch mbloch@nvidia.com Signed-off-by: Patrisious Haddad phaddad@nvidia.com Link: https://patch.msgid.link/88699500f690dff1c1852c1ddb71f8a1cc8b956e.1733233480... Reviewed-by: Mateusz Polchlopek mateusz.polchlopek@intel.com Signed-off-by: Leon Romanovsky leon@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/infiniband/hw/mlx5/main.c | 6 ++++-- include/linux/mlx5/driver.h | 6 ++++++ 2 files changed, 10 insertions(+), 2 deletions(-)
diff --git a/drivers/infiniband/hw/mlx5/main.c b/drivers/infiniband/hw/mlx5/main.c index c510484e024b..ada7dbf8eb1c 100644 --- a/drivers/infiniband/hw/mlx5/main.c +++ b/drivers/infiniband/hw/mlx5/main.c @@ -3372,7 +3372,8 @@ static int mlx5_ib_init_multiport_master(struct mlx5_ib_dev *dev) list_for_each_entry(mpi, &mlx5_ib_unaffiliated_port_list, list) { if (dev->sys_image_guid == mpi->sys_image_guid && - (mlx5_core_native_port_num(mpi->mdev) - 1) == i) { + (mlx5_core_native_port_num(mpi->mdev) - 1) == i && + mlx5_core_same_coredev_type(dev->mdev, mpi->mdev)) { bound = mlx5_ib_bind_slave_port(dev, mpi); }
@@ -4406,7 +4407,8 @@ static int mlx5r_mp_probe(struct auxiliary_device *adev,
mutex_lock(&mlx5_ib_multiport_mutex); list_for_each_entry(dev, &mlx5_ib_dev_list, ib_dev_list) { - if (dev->sys_image_guid == mpi->sys_image_guid) + if (dev->sys_image_guid == mpi->sys_image_guid && + mlx5_core_same_coredev_type(dev->mdev, mpi->mdev)) bound = mlx5_ib_bind_slave_port(dev, mpi);
if (bound) { diff --git a/include/linux/mlx5/driver.h b/include/linux/mlx5/driver.h index ffb98bc43b2d..38a8ff9c685c 100644 --- a/include/linux/mlx5/driver.h +++ b/include/linux/mlx5/driver.h @@ -1225,6 +1225,12 @@ static inline bool mlx5_core_is_vf(const struct mlx5_core_dev *dev) return dev->coredev_type == MLX5_COREDEV_VF; }
+static inline bool mlx5_core_same_coredev_type(const struct mlx5_core_dev *dev1, + const struct mlx5_core_dev *dev2) +{ + return dev1->coredev_type == dev2->coredev_type; +} + static inline bool mlx5_core_is_ecpf(const struct mlx5_core_dev *dev) { return dev->caps.embedded_cpu;
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Selvin Xavier selvin.xavier@broadcom.com
[ Upstream commit 5effcacc8a8f3eb2a9f069d7e81a9ac793598dfb ]
Software Queues to hold the WRs needs to be created for only kernel queues. Avoid allocating the unnecessary memory for user Queues.
Fixes: 1ac5a4047975 ("RDMA/bnxt_re: Add bnxt_re RoCE driver") Fixes: 159fb4ceacd7 ("RDMA/bnxt_re: introduce a function to allocate swq") Signed-off-by: Selvin Xavier selvin.xavier@broadcom.com Link: https://patch.msgid.link/20241204075416.478431-3-kalesh-anakkur.purayil@broa... Signed-off-by: Leon Romanovsky leon@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/infiniband/hw/bnxt_re/qplib_fp.c | 42 +++++++++++++----------- 1 file changed, 23 insertions(+), 19 deletions(-)
diff --git a/drivers/infiniband/hw/bnxt_re/qplib_fp.c b/drivers/infiniband/hw/bnxt_re/qplib_fp.c index 4098e01666d1..d38e7880cebb 100644 --- a/drivers/infiniband/hw/bnxt_re/qplib_fp.c +++ b/drivers/infiniband/hw/bnxt_re/qplib_fp.c @@ -639,13 +639,6 @@ int bnxt_qplib_create_srq(struct bnxt_qplib_res *res, rc = bnxt_qplib_alloc_init_hwq(&srq->hwq, &hwq_attr); if (rc) return rc; - - srq->swq = kcalloc(srq->hwq.max_elements, sizeof(*srq->swq), - GFP_KERNEL); - if (!srq->swq) { - rc = -ENOMEM; - goto fail; - } srq->dbinfo.flags = 0; bnxt_qplib_rcfw_cmd_prep((struct cmdq_base *)&req, CMDQ_BASE_OPCODE_CREATE_SRQ, @@ -674,9 +667,17 @@ int bnxt_qplib_create_srq(struct bnxt_qplib_res *res, spin_lock_init(&srq->lock); srq->start_idx = 0; srq->last_idx = srq->hwq.max_elements - 1; - for (idx = 0; idx < srq->hwq.max_elements; idx++) - srq->swq[idx].next_idx = idx + 1; - srq->swq[srq->last_idx].next_idx = -1; + if (!srq->hwq.is_user) { + srq->swq = kcalloc(srq->hwq.max_elements, sizeof(*srq->swq), + GFP_KERNEL); + if (!srq->swq) { + rc = -ENOMEM; + goto fail; + } + for (idx = 0; idx < srq->hwq.max_elements; idx++) + srq->swq[idx].next_idx = idx + 1; + srq->swq[srq->last_idx].next_idx = -1; + }
srq->id = le32_to_cpu(resp.xid); srq->dbinfo.hwq = &srq->hwq; @@ -1022,13 +1023,14 @@ int bnxt_qplib_create_qp(struct bnxt_qplib_res *res, struct bnxt_qplib_qp *qp) if (rc) return rc;
- rc = bnxt_qplib_alloc_init_swq(sq); - if (rc) - goto fail_sq; - - if (psn_sz) - bnxt_qplib_init_psn_ptr(qp, psn_sz); + if (!sq->hwq.is_user) { + rc = bnxt_qplib_alloc_init_swq(sq); + if (rc) + goto fail_sq;
+ if (psn_sz) + bnxt_qplib_init_psn_ptr(qp, psn_sz); + } req.sq_size = cpu_to_le32(bnxt_qplib_set_sq_size(sq, qp->wqe_mode)); pbl = &sq->hwq.pbl[PBL_LVL_0]; req.sq_pbl = cpu_to_le64(pbl->pg_map_arr[0]); @@ -1054,9 +1056,11 @@ int bnxt_qplib_create_qp(struct bnxt_qplib_res *res, struct bnxt_qplib_qp *qp) rc = bnxt_qplib_alloc_init_hwq(&rq->hwq, &hwq_attr); if (rc) goto sq_swq; - rc = bnxt_qplib_alloc_init_swq(rq); - if (rc) - goto fail_rq; + if (!rq->hwq.is_user) { + rc = bnxt_qplib_alloc_init_swq(rq); + if (rc) + goto fail_rq; + }
req.rq_size = cpu_to_le32(rq->max_wqe); pbl = &rq->hwq.pbl[PBL_LVL_0];
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Kashyap Desai kashyap.desai@broadcom.com
[ Upstream commit 064c22408a73b9e945139b64614c534cbbefb591 ]
The workaround to modify the UD QP from RTS to RTS is required only for older adapters. Issuing this for latest adapters can caus some unexpected behavior. Fix it
Fixes: 1801d87b3598 ("RDMA/bnxt_re: Support new 5760X P7 devices") Signed-off-by: Kashyap Desai kashyap.desai@broadcom.com Signed-off-by: Selvin Xavier selvin.xavier@broadcom.com Link: https://patch.msgid.link/20241204075416.478431-4-kalesh-anakkur.purayil@broa... Signed-off-by: Leon Romanovsky leon@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/infiniband/hw/bnxt_re/ib_verbs.c | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/drivers/infiniband/hw/bnxt_re/ib_verbs.c b/drivers/infiniband/hw/bnxt_re/ib_verbs.c index df5897260601..fb6f15bb9d4f 100644 --- a/drivers/infiniband/hw/bnxt_re/ib_verbs.c +++ b/drivers/infiniband/hw/bnxt_re/ib_verbs.c @@ -2710,7 +2710,8 @@ static int bnxt_re_post_send_shadow_qp(struct bnxt_re_dev *rdev, wr = wr->next; } bnxt_qplib_post_send_db(&qp->qplib_qp); - bnxt_ud_qp_hw_stall_workaround(qp); + if (!bnxt_qplib_is_chip_gen_p5_p7(qp->rdev->chip_ctx)) + bnxt_ud_qp_hw_stall_workaround(qp); spin_unlock_irqrestore(&qp->sq_lock, flags); return rc; } @@ -2822,7 +2823,8 @@ int bnxt_re_post_send(struct ib_qp *ib_qp, const struct ib_send_wr *wr, wr = wr->next; } bnxt_qplib_post_send_db(&qp->qplib_qp); - bnxt_ud_qp_hw_stall_workaround(qp); + if (!bnxt_qplib_is_chip_gen_p5_p7(qp->rdev->chip_ctx)) + bnxt_ud_qp_hw_stall_workaround(qp); spin_unlock_irqrestore(&qp->sq_lock, flags);
return rc;
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Robert Beckett bob.beckett@collabora.com
[ Upstream commit ebefac5647968679f6ef5803e5d35a71997d20fa ]
We initially introduced a quick fix limiting the queue depth to 1 as experimentation showed that it fixed data corruption on 64GB steamdecks.
Further experimentation revealed corruption only happens when the last PRP data element aligns to the end of the page boundary. The device appears to treat this as a PRP chain to a new list instead of the data element that it actually is. This implementation is in violation of the spec. Encountering this errata with the Linux driver requires the host request a 128k transfer and coincidently be handed the last small pool dma buffer within a page.
The QD1 quirk effectly works around this because the last data PRP always was at a 248 byte offset from the page start, so it never appeared at the end of the page, but comes at the expense of throttling IO and wasting the remainder of the PRP page beyond 256 bytes. Also to note, the MDTS on these devices is small enough that the "large" prp pool can hold enough PRP elements to never reach the end, so that pool is not a problem either.
Introduce a new quirk to ensure the small pool is always aligned such that the last PRP element can't appear a the end of the page. This comes at the expense of wasting 256 bytes per small pool page allocated.
Link: https://lore.kernel.org/linux-nvme/20241113043151.GA20077@lst.de/T/#u Fixes: 83bdfcbdbe5d ("nvme-pci: qdepth 1 quirk") Cc: Paweł Anikiel panikiel@google.com Signed-off-by: Robert Beckett bob.beckett@collabora.com Signed-off-by: Keith Busch kbusch@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/nvme/host/nvme.h | 5 +++++ drivers/nvme/host/pci.c | 9 +++++++-- 2 files changed, 12 insertions(+), 2 deletions(-)
diff --git a/drivers/nvme/host/nvme.h b/drivers/nvme/host/nvme.h index bddc068d58c7..e867ac859a87 100644 --- a/drivers/nvme/host/nvme.h +++ b/drivers/nvme/host/nvme.h @@ -172,6 +172,11 @@ enum nvme_quirks { * MSI (but not MSI-X) interrupts are broken and never fire. */ NVME_QUIRK_BROKEN_MSI = (1 << 21), + + /* + * Align dma pool segment size to 512 bytes + */ + NVME_QUIRK_DMAPOOL_ALIGN_512 = (1 << 22), };
/* diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c index d525fa1229d7..52c8fd3d5c47 100644 --- a/drivers/nvme/host/pci.c +++ b/drivers/nvme/host/pci.c @@ -2653,15 +2653,20 @@ static int nvme_disable_prepare_reset(struct nvme_dev *dev, bool shutdown)
static int nvme_setup_prp_pools(struct nvme_dev *dev) { + size_t small_align = 256; + dev->prp_page_pool = dma_pool_create("prp list page", dev->dev, NVME_CTRL_PAGE_SIZE, NVME_CTRL_PAGE_SIZE, 0); if (!dev->prp_page_pool) return -ENOMEM;
+ if (dev->ctrl.quirks & NVME_QUIRK_DMAPOOL_ALIGN_512) + small_align = 512; + /* Optimisation for I/Os between 4k and 128k */ dev->prp_small_pool = dma_pool_create("prp list 256", dev->dev, - 256, 256, 0); + 256, small_align, 0); if (!dev->prp_small_pool) { dma_pool_destroy(dev->prp_page_pool); return -ENOMEM; @@ -3403,7 +3408,7 @@ static const struct pci_device_id nvme_id_table[] = { { PCI_VDEVICE(REDHAT, 0x0010), /* Qemu emulated controller */ .driver_data = NVME_QUIRK_BOGUS_NID, }, { PCI_DEVICE(0x1217, 0x8760), /* O2 Micro 64GB Steam Deck */ - .driver_data = NVME_QUIRK_QDEPTH_ONE }, + .driver_data = NVME_QUIRK_DMAPOOL_ALIGN_512, }, { PCI_DEVICE(0x126f, 0x2262), /* Silicon Motion generic */ .driver_data = NVME_QUIRK_NO_DEEPEST_PS | NVME_QUIRK_BOGUS_NID, },
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Kalesh AP kalesh-anakkur.purayil@broadcom.com
[ Upstream commit 38651476e46e088598354510502c383e932e2297 ]
The check for 9060 condition should only be made for legacy chips.
Fixes: 9152e0b722b2 ("RDMA/bnxt_re: HW workarounds for handling specific conditions") Reviewed-by: Kashyap Desai kashyap.desai@broadcom.com Signed-off-by: Kalesh AP kalesh-anakkur.purayil@broadcom.com Signed-off-by: Selvin Xavier selvin.xavier@broadcom.com Link: https://patch.msgid.link/20241211083931.968831-2-kalesh-anakkur.purayil@broa... Signed-off-by: Leon Romanovsky leon@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/infiniband/hw/bnxt_re/qplib_fp.c | 10 ++++++---- 1 file changed, 6 insertions(+), 4 deletions(-)
diff --git a/drivers/infiniband/hw/bnxt_re/qplib_fp.c b/drivers/infiniband/hw/bnxt_re/qplib_fp.c index d38e7880cebb..8997f359b58b 100644 --- a/drivers/infiniband/hw/bnxt_re/qplib_fp.c +++ b/drivers/infiniband/hw/bnxt_re/qplib_fp.c @@ -2536,10 +2536,12 @@ static int bnxt_qplib_cq_process_req(struct bnxt_qplib_cq *cq, bnxt_qplib_add_flush_qp(qp); } else { /* Before we complete, do WA 9060 */ - if (do_wa9060(qp, cq, cq_cons, sq->swq_last, - cqe_sq_cons)) { - *lib_qp = qp; - goto out; + if (!bnxt_qplib_is_chip_gen_p5_p7(qp->cctx)) { + if (do_wa9060(qp, cq, cq_cons, sq->swq_last, + cqe_sq_cons)) { + *lib_qp = qp; + goto out; + } } if (swq->flags & SQ_SEND_FLAGS_SIGNAL_COMP) { cqe->status = CQ_REQ_STATUS_OK;
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Saravanan Vajravel saravanan.vajravel@broadcom.com
[ Upstream commit 798653a0ee30d3cd495099282751c0f248614ae7 ]
When RDMA app configures path MTU, add a check in modify_qp verb to make sure that it doesn't go beyond interface MTU. If this check fails, driver will fail the modify_qp verb.
Fixes: 1ac5a4047975 ("RDMA/bnxt_re: Add bnxt_re RoCE driver") Reviewed-by: Kalesh AP kalesh-anakkur.purayil@broadcom.com Signed-off-by: Saravanan Vajravel saravanan.vajravel@broadcom.com Signed-off-by: Selvin Xavier selvin.xavier@broadcom.com Link: https://patch.msgid.link/20241211083931.968831-3-kalesh-anakkur.purayil@broa... Signed-off-by: Leon Romanovsky leon@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/infiniband/hw/bnxt_re/ib_verbs.c | 26 +++++++++++++----------- 1 file changed, 14 insertions(+), 12 deletions(-)
diff --git a/drivers/infiniband/hw/bnxt_re/ib_verbs.c b/drivers/infiniband/hw/bnxt_re/ib_verbs.c index fb6f15bb9d4f..9bf00fb666d7 100644 --- a/drivers/infiniband/hw/bnxt_re/ib_verbs.c +++ b/drivers/infiniband/hw/bnxt_re/ib_verbs.c @@ -2055,18 +2055,20 @@ int bnxt_re_modify_qp(struct ib_qp *ib_qp, struct ib_qp_attr *qp_attr, } }
- if (qp_attr_mask & IB_QP_PATH_MTU) { - qp->qplib_qp.modify_flags |= - CMDQ_MODIFY_QP_MODIFY_MASK_PATH_MTU; - qp->qplib_qp.path_mtu = __from_ib_mtu(qp_attr->path_mtu); - qp->qplib_qp.mtu = ib_mtu_enum_to_int(qp_attr->path_mtu); - } else if (qp_attr->qp_state == IB_QPS_RTR) { - qp->qplib_qp.modify_flags |= - CMDQ_MODIFY_QP_MODIFY_MASK_PATH_MTU; - qp->qplib_qp.path_mtu = - __from_ib_mtu(iboe_get_mtu(rdev->netdev->mtu)); - qp->qplib_qp.mtu = - ib_mtu_enum_to_int(iboe_get_mtu(rdev->netdev->mtu)); + if (qp_attr->qp_state == IB_QPS_RTR) { + enum ib_mtu qpmtu; + + qpmtu = iboe_get_mtu(rdev->netdev->mtu); + if (qp_attr_mask & IB_QP_PATH_MTU) { + if (ib_mtu_enum_to_int(qp_attr->path_mtu) > + ib_mtu_enum_to_int(qpmtu)) + return -EINVAL; + qpmtu = qp_attr->path_mtu; + } + + qp->qplib_qp.modify_flags |= CMDQ_MODIFY_QP_MODIFY_MASK_PATH_MTU; + qp->qplib_qp.path_mtu = __from_ib_mtu(qpmtu); + qp->qplib_qp.mtu = ib_mtu_enum_to_int(qpmtu); }
if (qp_attr_mask & IB_QP_TIMEOUT) {
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Kalesh AP kalesh-anakkur.purayil@broadcom.com
[ Upstream commit 7179fe0074a3c962e43a9e51169304c4911989ed ]
Driver currently populates subsystem_device id in the "hw_ver" field of ib_attr structure in query_device.
Updated to populate PCI revision ID.
Fixes: 1ac5a4047975 ("RDMA/bnxt_re: Add bnxt_re RoCE driver") Reviewed-by: Preethi G preethi.gurusiddalingeswaraswamy@broadcom.com Signed-off-by: Kalesh AP kalesh-anakkur.purayil@broadcom.com Signed-off-by: Selvin Xavier selvin.xavier@broadcom.com Link: https://patch.msgid.link/20241211083931.968831-6-kalesh-anakkur.purayil@broa... Signed-off-by: Leon Romanovsky leon@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/infiniband/hw/bnxt_re/ib_verbs.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/infiniband/hw/bnxt_re/ib_verbs.c b/drivers/infiniband/hw/bnxt_re/ib_verbs.c index 9bf00fb666d7..9e8f86f48801 100644 --- a/drivers/infiniband/hw/bnxt_re/ib_verbs.c +++ b/drivers/infiniband/hw/bnxt_re/ib_verbs.c @@ -147,7 +147,7 @@ int bnxt_re_query_device(struct ib_device *ibdev,
ib_attr->vendor_id = rdev->en_dev->pdev->vendor; ib_attr->vendor_part_id = rdev->en_dev->pdev->device; - ib_attr->hw_ver = rdev->en_dev->pdev->subsystem_device; + ib_attr->hw_ver = rdev->en_dev->pdev->revision; ib_attr->max_qp = dev_attr->max_qp; ib_attr->max_qp_wr = dev_attr->max_qp_wqes; ib_attr->device_cap_flags =
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Selvin Xavier selvin.xavier@broadcom.com
[ Upstream commit 40be32303ec829ea12f9883e499bfd3fe9e52baf ]
While creating qps, driver adds one extra entry to the sq size passed by the ULPs in order to avoid queue full condition. When ULPs creates QPs with max_qp_wr reported, driver creates QP with 1 more than the max_wqes supported by HW. Create QP fails in this case. To avoid this error, reduce 1 entry in max_qp_wqes and report it to the stack.
Fixes: 1ac5a4047975 ("RDMA/bnxt_re: Add bnxt_re RoCE driver") Reviewed-by: Kalesh AP kalesh-anakkur.purayil@broadcom.com Signed-off-by: Selvin Xavier selvin.xavier@broadcom.com Link: https://patch.msgid.link/20241217102649.1377704-2-kalesh-anakkur.purayil@bro... Signed-off-by: Leon Romanovsky leon@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/infiniband/hw/bnxt_re/qplib_sp.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/infiniband/hw/bnxt_re/qplib_sp.c b/drivers/infiniband/hw/bnxt_re/qplib_sp.c index 420f8613bcd5..0f6bae009af1 100644 --- a/drivers/infiniband/hw/bnxt_re/qplib_sp.c +++ b/drivers/infiniband/hw/bnxt_re/qplib_sp.c @@ -127,7 +127,7 @@ int bnxt_qplib_get_dev_attr(struct bnxt_qplib_rcfw *rcfw, attr->max_qp_init_rd_atom = sb->max_qp_init_rd_atom > BNXT_QPLIB_MAX_OUT_RD_ATOM ? BNXT_QPLIB_MAX_OUT_RD_ATOM : sb->max_qp_init_rd_atom; - attr->max_qp_wqes = le16_to_cpu(sb->max_qp_wr); + attr->max_qp_wqes = le16_to_cpu(sb->max_qp_wr) - 1; /* * 128 WQEs needs to be reserved for the HW (8916). Prevent * reporting the max number
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Selvin Xavier selvin.xavier@broadcom.com
[ Upstream commit de1d364c3815f9360a0945097ca2731950e914fa ]
Variable size WQE means that each send Work Queue Entry to HW can use different WQE sizes as opposed to the static WQE size on the current devices. Set variable WQE mode for Gen P7 devices. Depth of the Queue will be a multiple of slot which is 16 bytes. The number of slots should be a multiple of 256 as per the HW requirement.
Initialize the Software shadow queue to hold requests equal to the number of slots. Also, do not expose the variable size WQE capability until the last patch in the series.
Link: https://patch.msgid.link/r/1724042847-1481-2-git-send-email-selvin.xavier@br... Signed-off-by: Hongguang Gao hongguang.gao@broadcom.com Signed-off-by: Selvin Xavier selvin.xavier@broadcom.com Signed-off-by: Jason Gunthorpe jgg@nvidia.com Stable-dep-of: d5a38bf2f359 ("RDMA/bnxt_re: Disable use of reserved wqes") Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/infiniband/hw/bnxt_re/ib_verbs.c | 8 +++++--- drivers/infiniband/hw/bnxt_re/main.c | 21 +++++++++++---------- drivers/infiniband/hw/bnxt_re/qplib_fp.c | 18 +++++++++--------- drivers/infiniband/hw/bnxt_re/qplib_fp.h | 14 +++++++++++--- drivers/infiniband/hw/bnxt_re/qplib_sp.c | 7 +++++-- drivers/infiniband/hw/bnxt_re/qplib_sp.h | 6 ++++++ 6 files changed, 47 insertions(+), 27 deletions(-)
diff --git a/drivers/infiniband/hw/bnxt_re/ib_verbs.c b/drivers/infiniband/hw/bnxt_re/ib_verbs.c index 9e8f86f48801..540998ddbb44 100644 --- a/drivers/infiniband/hw/bnxt_re/ib_verbs.c +++ b/drivers/infiniband/hw/bnxt_re/ib_verbs.c @@ -1154,6 +1154,7 @@ static struct bnxt_re_qp *bnxt_re_create_shadow_qp /* Shadow QP SQ depth should be same as QP1 RQ depth */ qp->qplib_qp.sq.wqe_size = bnxt_re_get_wqe_size(0, 6); qp->qplib_qp.sq.max_wqe = qp1_qp->rq.max_wqe; + qp->qplib_qp.sq.max_sw_wqe = qp1_qp->rq.max_wqe; qp->qplib_qp.sq.max_sge = 2; /* Q full delta can be 1 since it is internal QP */ qp->qplib_qp.sq.q_full_delta = 1; @@ -1165,6 +1166,7 @@ static struct bnxt_re_qp *bnxt_re_create_shadow_qp
qp->qplib_qp.rq.wqe_size = bnxt_re_get_rwqe_size(6); qp->qplib_qp.rq.max_wqe = qp1_qp->rq.max_wqe; + qp->qplib_qp.rq.max_sw_wqe = qp1_qp->rq.max_wqe; qp->qplib_qp.rq.max_sge = qp1_qp->rq.max_sge; /* Q full delta can be 1 since it is internal QP */ qp->qplib_qp.rq.q_full_delta = 1; @@ -1226,6 +1228,7 @@ static int bnxt_re_init_rq_attr(struct bnxt_re_qp *qp, */ entries = bnxt_re_init_depth(init_attr->cap.max_recv_wr + 1, uctx); rq->max_wqe = min_t(u32, entries, dev_attr->max_qp_wqes + 1); + rq->max_sw_wqe = rq->max_wqe; rq->q_full_delta = 0; rq->sg_info.pgsize = PAGE_SIZE; rq->sg_info.pgshft = PAGE_SHIFT; @@ -1285,6 +1288,7 @@ static int bnxt_re_init_sq_attr(struct bnxt_re_qp *qp, 0 : BNXT_QPLIB_RESERVED_QP_WRS; entries = bnxt_re_init_depth(entries + diff + 1, uctx); sq->max_wqe = min_t(u32, entries, dev_attr->max_qp_wqes + diff + 1); + sq->max_sw_wqe = bnxt_qplib_get_depth(sq, qplqp->wqe_mode, true); sq->q_full_delta = diff + 1; /* * Reserving one slot for Phantom WQE. Application can @@ -2155,6 +2159,7 @@ int bnxt_re_modify_qp(struct ib_qp *ib_qp, struct ib_qp_attr *qp_attr, entries = bnxt_re_init_depth(qp_attr->cap.max_recv_wr, uctx); qp->qplib_qp.rq.max_wqe = min_t(u32, entries, dev_attr->max_qp_wqes + 1); + qp->qplib_qp.rq.max_sw_wqe = qp->qplib_qp.rq.max_wqe; qp->qplib_qp.rq.q_full_delta = qp->qplib_qp.rq.max_wqe - qp_attr->cap.max_recv_wr; qp->qplib_qp.rq.max_sge = qp_attr->cap.max_recv_sge; @@ -4171,9 +4176,6 @@ int bnxt_re_alloc_ucontext(struct ib_ucontext *ctx, struct ib_udata *udata) resp.cqe_sz = sizeof(struct cq_base); resp.max_cqd = dev_attr->max_cq_wqes;
- resp.comp_mask |= BNXT_RE_UCNTX_CMASK_HAVE_MODE; - resp.mode = rdev->chip_ctx->modes.wqe_mode; - if (rdev->chip_ctx->modes.db_push) resp.comp_mask |= BNXT_RE_UCNTX_CMASK_WC_DPI_ENABLED;
diff --git a/drivers/infiniband/hw/bnxt_re/main.c b/drivers/infiniband/hw/bnxt_re/main.c index 0373d0e9db63..c7e51cc2ea26 100644 --- a/drivers/infiniband/hw/bnxt_re/main.c +++ b/drivers/infiniband/hw/bnxt_re/main.c @@ -128,13 +128,13 @@ static void bnxt_re_set_db_offset(struct bnxt_re_dev *rdev) } }
-static void bnxt_re_set_drv_mode(struct bnxt_re_dev *rdev, u8 mode) +static void bnxt_re_set_drv_mode(struct bnxt_re_dev *rdev) { struct bnxt_qplib_chip_ctx *cctx;
cctx = rdev->chip_ctx; - cctx->modes.wqe_mode = bnxt_qplib_is_chip_gen_p5_p7(rdev->chip_ctx) ? - mode : BNXT_QPLIB_WQE_MODE_STATIC; + cctx->modes.wqe_mode = bnxt_qplib_is_chip_gen_p7(rdev->chip_ctx) ? + BNXT_QPLIB_WQE_MODE_VARIABLE : BNXT_QPLIB_WQE_MODE_STATIC; if (bnxt_re_hwrm_qcaps(rdev)) dev_err(rdev_to_dev(rdev), "Failed to query hwrm qcaps\n"); @@ -155,7 +155,7 @@ static void bnxt_re_destroy_chip_ctx(struct bnxt_re_dev *rdev) kfree(chip_ctx); }
-static int bnxt_re_setup_chip_ctx(struct bnxt_re_dev *rdev, u8 wqe_mode) +static int bnxt_re_setup_chip_ctx(struct bnxt_re_dev *rdev) { struct bnxt_qplib_chip_ctx *chip_ctx; struct bnxt_en_dev *en_dev; @@ -177,7 +177,7 @@ static int bnxt_re_setup_chip_ctx(struct bnxt_re_dev *rdev, u8 wqe_mode) rdev->qplib_res.dattr = &rdev->dev_attr; rdev->qplib_res.is_vf = BNXT_EN_VF(en_dev);
- bnxt_re_set_drv_mode(rdev, wqe_mode); + bnxt_re_set_drv_mode(rdev);
bnxt_re_set_db_offset(rdev); rc = bnxt_qplib_map_db_bar(&rdev->qplib_res); @@ -1440,7 +1440,7 @@ static void bnxt_re_worker(struct work_struct *work) schedule_delayed_work(&rdev->worker, msecs_to_jiffies(30000)); }
-static int bnxt_re_dev_init(struct bnxt_re_dev *rdev, u8 wqe_mode) +static int bnxt_re_dev_init(struct bnxt_re_dev *rdev) { struct bnxt_re_ring_attr rattr = {}; struct bnxt_qplib_creq_ctx *creq; @@ -1458,7 +1458,7 @@ static int bnxt_re_dev_init(struct bnxt_re_dev *rdev, u8 wqe_mode) } set_bit(BNXT_RE_FLAG_NETDEV_REGISTERED, &rdev->flags);
- rc = bnxt_re_setup_chip_ctx(rdev, wqe_mode); + rc = bnxt_re_setup_chip_ctx(rdev); if (rc) { bnxt_unregister_dev(rdev->en_dev); clear_bit(BNXT_RE_FLAG_NETDEV_REGISTERED, &rdev->flags); @@ -1609,7 +1609,7 @@ static int bnxt_re_dev_init(struct bnxt_re_dev *rdev, u8 wqe_mode) return rc; }
-static int bnxt_re_add_device(struct auxiliary_device *adev, u8 wqe_mode) +static int bnxt_re_add_device(struct auxiliary_device *adev) { struct bnxt_aux_priv *aux_priv = container_of(adev, struct bnxt_aux_priv, aux_dev); @@ -1626,7 +1626,7 @@ static int bnxt_re_add_device(struct auxiliary_device *adev, u8 wqe_mode) goto exit; }
- rc = bnxt_re_dev_init(rdev, wqe_mode); + rc = bnxt_re_dev_init(rdev); if (rc) goto re_dev_dealloc;
@@ -1756,7 +1756,8 @@ static int bnxt_re_probe(struct auxiliary_device *adev, int rc;
mutex_lock(&bnxt_re_mutex); - rc = bnxt_re_add_device(adev, BNXT_QPLIB_WQE_MODE_STATIC); + + rc = bnxt_re_add_device(adev); if (rc) { mutex_unlock(&bnxt_re_mutex); return rc; diff --git a/drivers/infiniband/hw/bnxt_re/qplib_fp.c b/drivers/infiniband/hw/bnxt_re/qplib_fp.c index 8997f359b58b..2f85245d1285 100644 --- a/drivers/infiniband/hw/bnxt_re/qplib_fp.c +++ b/drivers/infiniband/hw/bnxt_re/qplib_fp.c @@ -807,13 +807,13 @@ static int bnxt_qplib_alloc_init_swq(struct bnxt_qplib_q *que) { int indx;
- que->swq = kcalloc(que->max_wqe, sizeof(*que->swq), GFP_KERNEL); + que->swq = kcalloc(que->max_sw_wqe, sizeof(*que->swq), GFP_KERNEL); if (!que->swq) return -ENOMEM;
que->swq_start = 0; - que->swq_last = que->max_wqe - 1; - for (indx = 0; indx < que->max_wqe; indx++) + que->swq_last = que->max_sw_wqe - 1; + for (indx = 0; indx < que->max_sw_wqe; indx++) que->swq[indx].next_idx = indx + 1; que->swq[que->swq_last].next_idx = 0; /* Make it circular */ que->swq_last = 0; @@ -849,7 +849,7 @@ int bnxt_qplib_create_qp1(struct bnxt_qplib_res *res, struct bnxt_qplib_qp *qp) hwq_attr.res = res; hwq_attr.sginfo = &sq->sg_info; hwq_attr.stride = sizeof(struct sq_sge); - hwq_attr.depth = bnxt_qplib_get_depth(sq); + hwq_attr.depth = bnxt_qplib_get_depth(sq, qp->wqe_mode, false); hwq_attr.type = HWQ_TYPE_QUEUE; rc = bnxt_qplib_alloc_init_hwq(&sq->hwq, &hwq_attr); if (rc) @@ -877,7 +877,7 @@ int bnxt_qplib_create_qp1(struct bnxt_qplib_res *res, struct bnxt_qplib_qp *qp) hwq_attr.res = res; hwq_attr.sginfo = &rq->sg_info; hwq_attr.stride = sizeof(struct sq_sge); - hwq_attr.depth = bnxt_qplib_get_depth(rq); + hwq_attr.depth = bnxt_qplib_get_depth(rq, qp->wqe_mode, false); hwq_attr.type = HWQ_TYPE_QUEUE; rc = bnxt_qplib_alloc_init_hwq(&rq->hwq, &hwq_attr); if (rc) @@ -1007,7 +1007,7 @@ int bnxt_qplib_create_qp(struct bnxt_qplib_res *res, struct bnxt_qplib_qp *qp) hwq_attr.res = res; hwq_attr.sginfo = &sq->sg_info; hwq_attr.stride = sizeof(struct sq_sge); - hwq_attr.depth = bnxt_qplib_get_depth(sq); + hwq_attr.depth = bnxt_qplib_get_depth(sq, qp->wqe_mode, true); hwq_attr.aux_stride = psn_sz; hwq_attr.aux_depth = psn_sz ? bnxt_qplib_set_sq_size(sq, qp->wqe_mode) : 0; @@ -1049,7 +1049,7 @@ int bnxt_qplib_create_qp(struct bnxt_qplib_res *res, struct bnxt_qplib_qp *qp) hwq_attr.res = res; hwq_attr.sginfo = &rq->sg_info; hwq_attr.stride = sizeof(struct sq_sge); - hwq_attr.depth = bnxt_qplib_get_depth(rq); + hwq_attr.depth = bnxt_qplib_get_depth(rq, qp->wqe_mode, false); hwq_attr.aux_stride = 0; hwq_attr.aux_depth = 0; hwq_attr.type = HWQ_TYPE_QUEUE; @@ -2493,7 +2493,7 @@ static int bnxt_qplib_cq_process_req(struct bnxt_qplib_cq *cq, } sq = &qp->sq;
- cqe_sq_cons = le16_to_cpu(hwcqe->sq_cons_idx) % sq->max_wqe; + cqe_sq_cons = le16_to_cpu(hwcqe->sq_cons_idx) % sq->max_sw_wqe; if (qp->sq.flushed) { dev_dbg(&cq->hwq.pdev->dev, "%s: QP in Flush QP = %p\n", __func__, qp); @@ -2885,7 +2885,7 @@ static int bnxt_qplib_cq_process_terminal(struct bnxt_qplib_cq *cq, cqe_cons = le16_to_cpu(hwcqe->sq_cons_idx); if (cqe_cons == 0xFFFF) goto do_rq; - cqe_cons %= sq->max_wqe; + cqe_cons %= sq->max_sw_wqe;
if (qp->sq.flushed) { dev_dbg(&cq->hwq.pdev->dev, diff --git a/drivers/infiniband/hw/bnxt_re/qplib_fp.h b/drivers/infiniband/hw/bnxt_re/qplib_fp.h index 3a15ca7feb2b..b64746d484d6 100644 --- a/drivers/infiniband/hw/bnxt_re/qplib_fp.h +++ b/drivers/infiniband/hw/bnxt_re/qplib_fp.h @@ -251,6 +251,7 @@ struct bnxt_qplib_q { struct bnxt_qplib_db_info dbinfo; struct bnxt_qplib_sg_info sg_info; u32 max_wqe; + u32 max_sw_wqe; u16 wqe_size; u16 q_full_delta; u16 max_sge; @@ -585,15 +586,22 @@ static inline void bnxt_qplib_swq_mod_start(struct bnxt_qplib_q *que, u32 idx) que->swq_start = que->swq[idx].next_idx; }
-static inline u32 bnxt_qplib_get_depth(struct bnxt_qplib_q *que) +static inline u32 bnxt_qplib_get_depth(struct bnxt_qplib_q *que, u8 wqe_mode, bool is_sq) { - return (que->wqe_size * que->max_wqe) / sizeof(struct sq_sge); + u32 slots; + + /* Queue depth is the number of slots. */ + slots = (que->wqe_size * que->max_wqe) / sizeof(struct sq_sge); + /* For variable WQE mode, need to align the slots to 256 */ + if (wqe_mode == BNXT_QPLIB_WQE_MODE_VARIABLE && is_sq) + slots = ALIGN(slots, BNXT_VAR_MAX_SLOT_ALIGN); + return slots; }
static inline u32 bnxt_qplib_set_sq_size(struct bnxt_qplib_q *que, u8 wqe_mode) { return (wqe_mode == BNXT_QPLIB_WQE_MODE_STATIC) ? - que->max_wqe : bnxt_qplib_get_depth(que); + que->max_wqe : bnxt_qplib_get_depth(que, wqe_mode, true); }
static inline u32 bnxt_qplib_set_sq_max_slot(u8 wqe_mode) diff --git a/drivers/infiniband/hw/bnxt_re/qplib_sp.c b/drivers/infiniband/hw/bnxt_re/qplib_sp.c index 0f6bae009af1..a46df2a5ab33 100644 --- a/drivers/infiniband/hw/bnxt_re/qplib_sp.c +++ b/drivers/infiniband/hw/bnxt_re/qplib_sp.c @@ -95,11 +95,13 @@ int bnxt_qplib_get_dev_attr(struct bnxt_qplib_rcfw *rcfw, struct bnxt_qplib_cmdqmsg msg = {}; struct creq_query_func_resp_sb *sb; struct bnxt_qplib_rcfw_sbuf sbuf; + struct bnxt_qplib_chip_ctx *cctx; struct cmdq_query_func req = {}; u8 *tqm_alloc; int i, rc; u32 temp;
+ cctx = rcfw->res->cctx; bnxt_qplib_rcfw_cmd_prep((struct cmdq_base *)&req, CMDQ_BASE_OPCODE_QUERY_FUNC, sizeof(req)); @@ -133,8 +135,9 @@ int bnxt_qplib_get_dev_attr(struct bnxt_qplib_rcfw *rcfw, * reporting the max number */ attr->max_qp_wqes -= BNXT_QPLIB_RESERVED_QP_WRS + 1; - attr->max_qp_sges = bnxt_qplib_is_chip_gen_p5_p7(rcfw->res->cctx) ? - 6 : sb->max_sge; + + attr->max_qp_sges = cctx->modes.wqe_mode == BNXT_QPLIB_WQE_MODE_VARIABLE ? + min_t(u32, sb->max_sge_var_wqe, BNXT_VAR_MAX_SGE) : 6; attr->max_cq = le32_to_cpu(sb->max_cq); attr->max_cq_wqes = le32_to_cpu(sb->max_cqe); if (!bnxt_qplib_is_chip_gen_p7(rcfw->res->cctx)) diff --git a/drivers/infiniband/hw/bnxt_re/qplib_sp.h b/drivers/infiniband/hw/bnxt_re/qplib_sp.h index 2f16f3db093e..b91e6a85e75d 100644 --- a/drivers/infiniband/hw/bnxt_re/qplib_sp.h +++ b/drivers/infiniband/hw/bnxt_re/qplib_sp.h @@ -40,6 +40,7 @@ #ifndef __BNXT_QPLIB_SP_H__ #define __BNXT_QPLIB_SP_H__
+#include <rdma/bnxt_re-abi.h> #define BNXT_QPLIB_RESERVED_QP_WRS 128
struct bnxt_qplib_dev_attr { @@ -352,4 +353,9 @@ int bnxt_qplib_qext_stat(struct bnxt_qplib_rcfw *rcfw, u32 fid, int bnxt_qplib_modify_cc(struct bnxt_qplib_res *res, struct bnxt_qplib_cc_param *cc_param);
+#define BNXT_VAR_MAX_WQE 4352 +#define BNXT_VAR_MAX_SLOT_ALIGN 256 +#define BNXT_VAR_MAX_SGE 13 +#define BNXT_RE_MAX_RQ_WQES 65536 + #endif /* __BNXT_QPLIB_SP_H__*/
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Kalesh AP kalesh-anakkur.purayil@broadcom.com
[ Upstream commit d5a38bf2f35979537c526acbc56bc435ed40685f ]
Disabling the reserved wqes logic for Gen P5/P7 devices because this workaround is required only for legacy devices.
Fixes: ecb53febfcad ("RDMA/bnxt_en: Enable RDMA driver support for 57500 chip") Signed-off-by: Kalesh AP kalesh-anakkur.purayil@broadcom.com Signed-off-by: Selvin Xavier selvin.xavier@broadcom.com Link: https://patch.msgid.link/20241217102649.1377704-3-kalesh-anakkur.purayil@bro... Signed-off-by: Leon Romanovsky leon@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/infiniband/hw/bnxt_re/qplib_sp.c | 12 +++++++----- 1 file changed, 7 insertions(+), 5 deletions(-)
diff --git a/drivers/infiniband/hw/bnxt_re/qplib_sp.c b/drivers/infiniband/hw/bnxt_re/qplib_sp.c index a46df2a5ab33..577a6eaca4ce 100644 --- a/drivers/infiniband/hw/bnxt_re/qplib_sp.c +++ b/drivers/infiniband/hw/bnxt_re/qplib_sp.c @@ -130,11 +130,13 @@ int bnxt_qplib_get_dev_attr(struct bnxt_qplib_rcfw *rcfw, sb->max_qp_init_rd_atom > BNXT_QPLIB_MAX_OUT_RD_ATOM ? BNXT_QPLIB_MAX_OUT_RD_ATOM : sb->max_qp_init_rd_atom; attr->max_qp_wqes = le16_to_cpu(sb->max_qp_wr) - 1; - /* - * 128 WQEs needs to be reserved for the HW (8916). Prevent - * reporting the max number - */ - attr->max_qp_wqes -= BNXT_QPLIB_RESERVED_QP_WRS + 1; + if (!bnxt_qplib_is_chip_gen_p5_p7(rcfw->res->cctx)) { + /* + * 128 WQEs needs to be reserved for the HW (8916). Prevent + * reporting the max number on legacy devices + */ + attr->max_qp_wqes -= BNXT_QPLIB_RESERVED_QP_WRS + 1; + }
attr->max_qp_sges = cctx->modes.wqe_mode == BNXT_QPLIB_WQE_MODE_VARIABLE ? min_t(u32, sb->max_sge_var_wqe, BNXT_VAR_MAX_SGE) : 6;
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Damodharam Ammepalli damodharam.ammepalli@broadcom.com
[ Upstream commit d13be54dc18baee7a3e44349b80755a8c8205d3f ]
For the fixed WQE case, HW supports 0xFFFF WQEs. For variable Size WQEs, HW treats this number as the 16 bytes slots. The maximum supported WQEs needs to be adjusted based on the number of slots. Set a maximum WQE limit for variable WQE scenario.
Fixes: de1d364c3815 ("RDMA/bnxt_re: Add support for Variable WQE in Genp7 adapters") Reviewed-by: Kalesh AP kalesh-anakkur.purayil@broadcom.com Signed-off-by: Damodharam Ammepalli damodharam.ammepalli@broadcom.com Signed-off-by: Selvin Xavier selvin.xavier@broadcom.com Link: https://patch.msgid.link/20241217102649.1377704-4-kalesh-anakkur.purayil@bro... Signed-off-by: Leon Romanovsky leon@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/infiniband/hw/bnxt_re/qplib_sp.c | 4 ++++ 1 file changed, 4 insertions(+)
diff --git a/drivers/infiniband/hw/bnxt_re/qplib_sp.c b/drivers/infiniband/hw/bnxt_re/qplib_sp.c index 577a6eaca4ce..74c3f6b26c4d 100644 --- a/drivers/infiniband/hw/bnxt_re/qplib_sp.c +++ b/drivers/infiniband/hw/bnxt_re/qplib_sp.c @@ -138,6 +138,10 @@ int bnxt_qplib_get_dev_attr(struct bnxt_qplib_rcfw *rcfw, attr->max_qp_wqes -= BNXT_QPLIB_RESERVED_QP_WRS + 1; }
+ /* Adjust for max_qp_wqes for variable wqe */ + if (cctx->modes.wqe_mode == BNXT_QPLIB_WQE_MODE_VARIABLE) + attr->max_qp_wqes = BNXT_VAR_MAX_WQE - 1; + attr->max_qp_sges = cctx->modes.wqe_mode == BNXT_QPLIB_WQE_MODE_VARIABLE ? min_t(u32, sb->max_sge_var_wqe, BNXT_VAR_MAX_SGE) : 6; attr->max_cq = le32_to_cpu(sb->max_cq);
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Damodharam Ammepalli damodharam.ammepalli@broadcom.com
[ Upstream commit bb839f3ace0fee532a0487b692cc4d868fccb7cf ]
For variable size wqe mode, the MSN table size should be half the size of the SQ depth. Fixing this to avoid wrap around problems in the retransmission path.
Fixes: de1d364c3815 ("RDMA/bnxt_re: Add support for Variable WQE in Genp7 adapters") Reviewed-by: Kashyap Desai kashyap.desai@broadcom.com Reviewed-by: Kalesh AP kalesh-anakkur.purayil@broadcom.com Signed-off-by: Damodharam Ammepalli damodharam.ammepalli@broadcom.com Signed-off-by: Selvin Xavier selvin.xavier@broadcom.com Link: https://patch.msgid.link/20241217102649.1377704-5-kalesh-anakkur.purayil@bro... Signed-off-by: Leon Romanovsky leon@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/infiniband/hw/bnxt_re/qplib_fp.c | 7 ++++++- 1 file changed, 6 insertions(+), 1 deletion(-)
diff --git a/drivers/infiniband/hw/bnxt_re/qplib_fp.c b/drivers/infiniband/hw/bnxt_re/qplib_fp.c index 2f85245d1285..1355061d698d 100644 --- a/drivers/infiniband/hw/bnxt_re/qplib_fp.c +++ b/drivers/infiniband/hw/bnxt_re/qplib_fp.c @@ -1013,7 +1013,12 @@ int bnxt_qplib_create_qp(struct bnxt_qplib_res *res, struct bnxt_qplib_qp *qp) : 0; /* Update msn tbl size */ if (qp->is_host_msn_tbl && psn_sz) { - hwq_attr.aux_depth = roundup_pow_of_two(bnxt_qplib_set_sq_size(sq, qp->wqe_mode)); + if (qp->wqe_mode == BNXT_QPLIB_WQE_MODE_STATIC) + hwq_attr.aux_depth = + roundup_pow_of_two(bnxt_qplib_set_sq_size(sq, qp->wqe_mode)); + else + hwq_attr.aux_depth = + roundup_pow_of_two(bnxt_qplib_set_sq_size(sq, qp->wqe_mode)) / 2; qp->msn_tbl_sz = hwq_attr.aux_depth; qp->msn = 0; }
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Selvin Xavier selvin.xavier@broadcom.com
[ Upstream commit 9272cba0ded71b5a2084da3004ec7806b8cb7fd2 ]
QP table handling is synchronized with destroy QP and Async event from the HW. The same needs to be synchronized during create_qp also. Use the same lock in create_qp also.
Fixes: 76d3ddff7153 ("RDMA/bnxt_re: synchronize the qp-handle table array") Fixes: f218d67ef004 ("RDMA/bnxt_re: Allow posting when QPs are in error") Fixes: 84cf229f4001 ("RDMA/bnxt_re: Fix the qp table indexing") Signed-off-by: Selvin Xavier selvin.xavier@broadcom.com Link: https://patch.msgid.link/20241217102649.1377704-6-kalesh-anakkur.purayil@bro... Signed-off-by: Leon Romanovsky leon@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/infiniband/hw/bnxt_re/qplib_fp.c | 2 ++ 1 file changed, 2 insertions(+)
diff --git a/drivers/infiniband/hw/bnxt_re/qplib_fp.c b/drivers/infiniband/hw/bnxt_re/qplib_fp.c index 1355061d698d..871a49315c88 100644 --- a/drivers/infiniband/hw/bnxt_re/qplib_fp.c +++ b/drivers/infiniband/hw/bnxt_re/qplib_fp.c @@ -1161,9 +1161,11 @@ int bnxt_qplib_create_qp(struct bnxt_qplib_res *res, struct bnxt_qplib_qp *qp) rq->dbinfo.db = qp->dpi->dbr; rq->dbinfo.max_slot = bnxt_qplib_set_rq_max_slot(rq->wqe_size); } + spin_lock_bh(&rcfw->tbl_lock); tbl_indx = map_qp_id_to_tbl_indx(qp->id, rcfw); rcfw->qp_tbl[tbl_indx].qp_id = qp->id; rcfw->qp_tbl[tbl_indx].qp_handle = (void *)qp; + spin_unlock_bh(&rcfw->tbl_lock);
return 0; fail:
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Stefan Ekenberg stefan.ekenberg@axis.com
[ Upstream commit 902806baf3c1e8383c1fe3ff0b6042b8cb5c2707 ]
AUDIO_UPDATE bit (Bit 5 of MAIN register 0x4A) needs to be set to 1 while updating Audio InfoFrame information and then set to 0 when done. Otherwise partially updated Audio InfoFrames could be sent out. Two cases where this rule were not followed are fixed: - In adv7511_hdmi_hw_params() make sure AUDIO_UPDATE bit is updated before/after setting ADV7511_REG_AUDIO_INFOFRAME. - In audio_startup() use the correct register for clearing AUDIO_UPDATE bit.
The problem with corrupted audio infoframes were discovered by letting a HDMI logic analyser check the output of ADV7535.
Note that this patchs replaces writing REG_GC(1) with REG_INFOFRAME_UPDATE. Bit 5 of REG_GC(1) is positioned within field GC_PP[3:0] and that field doesn't control audio infoframe and is read- only. My conclusion therefore was that the author if this code meant to clear bit 5 of REG_INFOFRAME_UPDATE from the very beginning.
Tested-by: Biju Das biju.das.jz@bp.renesas.com Fixes: 53c515befe28 ("drm/bridge: adv7511: Add Audio support") Signed-off-by: Stefan Ekenberg stefan.ekenberg@axis.com Reviewed-by: Dmitry Baryshkov dmitry.baryshkov@linaro.org Link: https://patchwork.freedesktop.org/patch/msgid/20241119-adv7511-audio-info-fr... Signed-off-by: Dmitry Baryshkov dmitry.baryshkov@linaro.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/bridge/adv7511/adv7511_audio.c | 14 ++++++++++++-- 1 file changed, 12 insertions(+), 2 deletions(-)
diff --git a/drivers/gpu/drm/bridge/adv7511/adv7511_audio.c b/drivers/gpu/drm/bridge/adv7511/adv7511_audio.c index 61f4a38e7d2b..8f786592143b 100644 --- a/drivers/gpu/drm/bridge/adv7511/adv7511_audio.c +++ b/drivers/gpu/drm/bridge/adv7511/adv7511_audio.c @@ -153,7 +153,16 @@ static int adv7511_hdmi_hw_params(struct device *dev, void *data, ADV7511_AUDIO_CFG3_LEN_MASK, len); regmap_update_bits(adv7511->regmap, ADV7511_REG_I2C_FREQ_ID_CFG, ADV7511_I2C_FREQ_ID_CFG_RATE_MASK, rate << 4); - regmap_write(adv7511->regmap, 0x73, 0x1); + + /* send current Audio infoframe values while updating */ + regmap_update_bits(adv7511->regmap, ADV7511_REG_INFOFRAME_UPDATE, + BIT(5), BIT(5)); + + regmap_write(adv7511->regmap, ADV7511_REG_AUDIO_INFOFRAME(0), 0x1); + + /* use Audio infoframe updated info */ + regmap_update_bits(adv7511->regmap, ADV7511_REG_INFOFRAME_UPDATE, + BIT(5), 0);
return 0; } @@ -184,8 +193,9 @@ static int audio_startup(struct device *dev, void *data) regmap_update_bits(adv7511->regmap, ADV7511_REG_GC(0), BIT(7) | BIT(6), BIT(7)); /* use Audio infoframe updated info */ - regmap_update_bits(adv7511->regmap, ADV7511_REG_GC(1), + regmap_update_bits(adv7511->regmap, ADV7511_REG_INFOFRAME_UPDATE, BIT(5), 0); + /* enable SPDIF receiver */ if (adv7511->audio_source == ADV7511_AUDIO_SOURCE_SPDIF) regmap_update_bits(adv7511->regmap, ADV7511_REG_AUDIO_CONFIG,
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Tristram Ha tristram.ha@microchip.com
[ Upstream commit 262bfba8ab820641c8cfbbf03b86d6c00242c078 ]
The aging count is not a simple 11-bit value but comprises a 3-bit multiplier and an 8-bit second count. The code tries to use the original multiplier which is 4 as the second count is still 300 seconds by default.
Fixes: 2c119d9982b1 ("net: dsa: microchip: add the support for set_ageing_time") Signed-off-by: Tristram Ha tristram.ha@microchip.com Reviewed-by: Andrew Lunn andrew@lunn.ch Link: https://patch.msgid.link/20241218020224.70590-2-Tristram.Ha@microchip.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/dsa/microchip/ksz9477.c | 47 +++++++++++++++++++------ drivers/net/dsa/microchip/ksz9477_reg.h | 4 +-- 2 files changed, 37 insertions(+), 14 deletions(-)
diff --git a/drivers/net/dsa/microchip/ksz9477.c b/drivers/net/dsa/microchip/ksz9477.c index a7e8fcdf2576..59134d117846 100644 --- a/drivers/net/dsa/microchip/ksz9477.c +++ b/drivers/net/dsa/microchip/ksz9477.c @@ -2,7 +2,7 @@ /* * Microchip KSZ9477 switch driver main logic * - * Copyright (C) 2017-2019 Microchip Technology Inc. + * Copyright (C) 2017-2024 Microchip Technology Inc. */
#include <linux/kernel.h> @@ -916,26 +916,51 @@ void ksz9477_get_caps(struct ksz_device *dev, int port, int ksz9477_set_ageing_time(struct ksz_device *dev, unsigned int msecs) { u32 secs = msecs / 1000; - u8 value; - u8 data; + u8 data, mult, value; + u32 max_val; int ret;
- value = FIELD_GET(SW_AGE_PERIOD_7_0_M, secs); +#define MAX_TIMER_VAL ((1 << 8) - 1)
- ret = ksz_write8(dev, REG_SW_LUE_CTRL_3, value); - if (ret < 0) - return ret; + /* The aging timer comprises a 3-bit multiplier and an 8-bit second + * value. Either of them cannot be zero. The maximum timer is then + * 7 * 255 = 1785 seconds. + */ + if (!secs) + secs = 1;
- data = FIELD_GET(SW_AGE_PERIOD_10_8_M, secs); + /* Return error if too large. */ + else if (secs > 7 * MAX_TIMER_VAL) + return -EINVAL;
ret = ksz_read8(dev, REG_SW_LUE_CTRL_0, &value); if (ret < 0) return ret;
- value &= ~SW_AGE_CNT_M; - value |= FIELD_PREP(SW_AGE_CNT_M, data); + /* Check whether there is need to update the multiplier. */ + mult = FIELD_GET(SW_AGE_CNT_M, value); + max_val = MAX_TIMER_VAL; + if (mult > 0) { + /* Try to use the same multiplier already in the register as + * the hardware default uses multiplier 4 and 75 seconds for + * 300 seconds. + */ + max_val = DIV_ROUND_UP(secs, mult); + if (max_val > MAX_TIMER_VAL || max_val * mult != secs) + max_val = MAX_TIMER_VAL; + } + + data = DIV_ROUND_UP(secs, max_val); + if (mult != data) { + value &= ~SW_AGE_CNT_M; + value |= FIELD_PREP(SW_AGE_CNT_M, data); + ret = ksz_write8(dev, REG_SW_LUE_CTRL_0, value); + if (ret < 0) + return ret; + }
- return ksz_write8(dev, REG_SW_LUE_CTRL_0, value); + value = DIV_ROUND_UP(secs, data); + return ksz_write8(dev, REG_SW_LUE_CTRL_3, value); }
void ksz9477_port_queue_split(struct ksz_device *dev, int port) diff --git a/drivers/net/dsa/microchip/ksz9477_reg.h b/drivers/net/dsa/microchip/ksz9477_reg.h index a2ef4b18349c..d0886ed984c5 100644 --- a/drivers/net/dsa/microchip/ksz9477_reg.h +++ b/drivers/net/dsa/microchip/ksz9477_reg.h @@ -2,7 +2,7 @@ /* * Microchip KSZ9477 register definitions * - * Copyright (C) 2017-2018 Microchip Technology Inc. + * Copyright (C) 2017-2024 Microchip Technology Inc. */
#ifndef __KSZ9477_REGS_H @@ -190,8 +190,6 @@ #define SW_VLAN_ENABLE BIT(7) #define SW_DROP_INVALID_VID BIT(6) #define SW_AGE_CNT_M GENMASK(5, 3) -#define SW_AGE_CNT_S 3 -#define SW_AGE_PERIOD_10_8_M GENMASK(10, 8) #define SW_RESV_MCAST_ENABLE BIT(2) #define SW_HASH_OPTION_M 0x03 #define SW_HASH_OPTION_CRC 1
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Tristram Ha tristram.ha@microchip.com
[ Upstream commit bb9869043438af5b94230f94fb4c39206525d758 ]
The aging count is not a simple 20-bit value but comprises a 3-bit multiplier and a 20-bit second time. The code tries to use the original multiplier which is 4 as the second count is still 300 seconds by default.
As the 20-bit number is now too large for practical use there is an option to interpret it as microseconds instead of seconds.
Fixes: 2c119d9982b1 ("net: dsa: microchip: add the support for set_ageing_time") Signed-off-by: Tristram Ha tristram.ha@microchip.com Reviewed-by: Andrew Lunn andrew@lunn.ch Link: https://patch.msgid.link/20241218020224.70590-3-Tristram.Ha@microchip.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/dsa/microchip/lan937x_main.c | 62 ++++++++++++++++++++++-- drivers/net/dsa/microchip/lan937x_reg.h | 9 ++-- 2 files changed, 65 insertions(+), 6 deletions(-)
diff --git a/drivers/net/dsa/microchip/lan937x_main.c b/drivers/net/dsa/microchip/lan937x_main.c index b479a628b1ae..dde37e61faa3 100644 --- a/drivers/net/dsa/microchip/lan937x_main.c +++ b/drivers/net/dsa/microchip/lan937x_main.c @@ -1,6 +1,6 @@ // SPDX-License-Identifier: GPL-2.0 /* Microchip LAN937X switch driver main logic - * Copyright (C) 2019-2022 Microchip Technology Inc. + * Copyright (C) 2019-2024 Microchip Technology Inc. */ #include <linux/kernel.h> #include <linux/module.h> @@ -257,10 +257,66 @@ int lan937x_change_mtu(struct ksz_device *dev, int port, int new_mtu)
int lan937x_set_ageing_time(struct ksz_device *dev, unsigned int msecs) { - u32 secs = msecs / 1000; - u32 value; + u8 data, mult, value8; + bool in_msec = false; + u32 max_val, value; + u32 secs = msecs; int ret;
+#define MAX_TIMER_VAL ((1 << 20) - 1) + + /* The aging timer comprises a 3-bit multiplier and a 20-bit second + * value. Either of them cannot be zero. The maximum timer is then + * 7 * 1048575 = 7340025 seconds. As this value is too large for + * practical use it can be interpreted as microseconds, making the + * maximum timer 7340 seconds with finer control. This allows for + * maximum 122 minutes compared to 29 minutes in KSZ9477 switch. + */ + if (msecs % 1000) + in_msec = true; + else + secs /= 1000; + if (!secs) + secs = 1; + + /* Return error if too large. */ + else if (secs > 7 * MAX_TIMER_VAL) + return -EINVAL; + + /* Configure how to interpret the number value. */ + ret = ksz_rmw8(dev, REG_SW_LUE_CTRL_2, SW_AGE_CNT_IN_MICROSEC, + in_msec ? SW_AGE_CNT_IN_MICROSEC : 0); + if (ret < 0) + return ret; + + ret = ksz_read8(dev, REG_SW_LUE_CTRL_0, &value8); + if (ret < 0) + return ret; + + /* Check whether there is need to update the multiplier. */ + mult = FIELD_GET(SW_AGE_CNT_M, value8); + max_val = MAX_TIMER_VAL; + if (mult > 0) { + /* Try to use the same multiplier already in the register as + * the hardware default uses multiplier 4 and 75 seconds for + * 300 seconds. + */ + max_val = DIV_ROUND_UP(secs, mult); + if (max_val > MAX_TIMER_VAL || max_val * mult != secs) + max_val = MAX_TIMER_VAL; + } + + data = DIV_ROUND_UP(secs, max_val); + if (mult != data) { + value8 &= ~SW_AGE_CNT_M; + value8 |= FIELD_PREP(SW_AGE_CNT_M, data); + ret = ksz_write8(dev, REG_SW_LUE_CTRL_0, value8); + if (ret < 0) + return ret; + } + + secs = DIV_ROUND_UP(secs, data); + value = FIELD_GET(SW_AGE_PERIOD_7_0_M, secs);
ret = ksz_write8(dev, REG_SW_AGE_PERIOD__1, value); diff --git a/drivers/net/dsa/microchip/lan937x_reg.h b/drivers/net/dsa/microchip/lan937x_reg.h index 45b606b6429f..b3e536e7c686 100644 --- a/drivers/net/dsa/microchip/lan937x_reg.h +++ b/drivers/net/dsa/microchip/lan937x_reg.h @@ -1,6 +1,6 @@ /* SPDX-License-Identifier: GPL-2.0 */ /* Microchip LAN937X switch register definitions - * Copyright (C) 2019-2021 Microchip Technology Inc. + * Copyright (C) 2019-2024 Microchip Technology Inc. */ #ifndef __LAN937X_REG_H #define __LAN937X_REG_H @@ -48,8 +48,7 @@
#define SW_VLAN_ENABLE BIT(7) #define SW_DROP_INVALID_VID BIT(6) -#define SW_AGE_CNT_M 0x7 -#define SW_AGE_CNT_S 3 +#define SW_AGE_CNT_M GENMASK(5, 3) #define SW_RESV_MCAST_ENABLE BIT(2)
#define REG_SW_LUE_CTRL_1 0x0311 @@ -62,6 +61,10 @@ #define SW_FAST_AGING BIT(1) #define SW_LINK_AUTO_AGING BIT(0)
+#define REG_SW_LUE_CTRL_2 0x0312 + +#define SW_AGE_CNT_IN_MICROSEC BIT(7) + #define REG_SW_AGE_PERIOD__1 0x0313 #define SW_AGE_PERIOD_7_0_M GENMASK(7, 0)
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Chengchang Tang tangchengchang@huawei.com
[ Upstream commit a4ca341080758d847db155b97887bff6f84016a4 ]
hns_roce_mtr_find() is a collection of multiple functions, and the return value is also difficult to understand, which is not conducive to modification and maintenance.
Separate the function of obtaining MTR root BA from this function. And some adjustments has been made to improve readability.
Signed-off-by: Chengchang Tang tangchengchang@huawei.com Signed-off-by: Junxian Huang huangjunxian6@hisilicon.com Link: https://lore.kernel.org/r/20240113085935.2838701-2-huangjunxian6@hisilicon.c... Signed-off-by: Leon Romanovsky leon@kernel.org Stable-dep-of: 8673a6c2d9e4 ("RDMA/hns: Fix mapping error of zero-hop WQE buffer") Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/infiniband/hw/hns/hns_roce_cq.c | 11 +-- drivers/infiniband/hw/hns/hns_roce_device.h | 7 +- drivers/infiniband/hw/hns/hns_roce_hw_v2.c | 102 ++++++++++---------- drivers/infiniband/hw/hns/hns_roce_mr.c | 86 +++++++++++------ 4 files changed, 121 insertions(+), 85 deletions(-)
diff --git a/drivers/infiniband/hw/hns/hns_roce_cq.c b/drivers/infiniband/hw/hns/hns_roce_cq.c index 9b91731a6207..5e0d78f4e545 100644 --- a/drivers/infiniband/hw/hns/hns_roce_cq.c +++ b/drivers/infiniband/hw/hns/hns_roce_cq.c @@ -133,14 +133,12 @@ static int alloc_cqc(struct hns_roce_dev *hr_dev, struct hns_roce_cq *hr_cq) struct hns_roce_cq_table *cq_table = &hr_dev->cq_table; struct ib_device *ibdev = &hr_dev->ib_dev; u64 mtts[MTT_MIN_COUNT] = {}; - dma_addr_t dma_handle; int ret;
- ret = hns_roce_mtr_find(hr_dev, &hr_cq->mtr, 0, mtts, ARRAY_SIZE(mtts), - &dma_handle); - if (!ret) { + ret = hns_roce_mtr_find(hr_dev, &hr_cq->mtr, 0, mtts, ARRAY_SIZE(mtts)); + if (ret) { ibdev_err(ibdev, "failed to find CQ mtr, ret = %d.\n", ret); - return -EINVAL; + return ret; }
/* Get CQC memory HEM(Hardware Entry Memory) table */ @@ -157,7 +155,8 @@ static int alloc_cqc(struct hns_roce_dev *hr_dev, struct hns_roce_cq *hr_cq) goto err_put; }
- ret = hns_roce_create_cqc(hr_dev, hr_cq, mtts, dma_handle); + ret = hns_roce_create_cqc(hr_dev, hr_cq, mtts, + hns_roce_get_mtr_ba(&hr_cq->mtr)); if (ret) goto err_xa;
diff --git a/drivers/infiniband/hw/hns/hns_roce_device.h b/drivers/infiniband/hw/hns/hns_roce_device.h index 21ef00fdb656..a835368548e5 100644 --- a/drivers/infiniband/hw/hns/hns_roce_device.h +++ b/drivers/infiniband/hw/hns/hns_roce_device.h @@ -1129,8 +1129,13 @@ void hns_roce_cmd_use_polling(struct hns_roce_dev *hr_dev);
/* hns roce hw need current block and next block addr from mtt */ #define MTT_MIN_COUNT 2 +static inline dma_addr_t hns_roce_get_mtr_ba(struct hns_roce_mtr *mtr) +{ + return mtr->hem_cfg.root_ba; +} + int hns_roce_mtr_find(struct hns_roce_dev *hr_dev, struct hns_roce_mtr *mtr, - u32 offset, u64 *mtt_buf, int mtt_max, u64 *base_addr); + u32 offset, u64 *mtt_buf, int mtt_max); int hns_roce_mtr_create(struct hns_roce_dev *hr_dev, struct hns_roce_mtr *mtr, struct hns_roce_buf_attr *buf_attr, unsigned int page_shift, struct ib_udata *udata, diff --git a/drivers/infiniband/hw/hns/hns_roce_hw_v2.c b/drivers/infiniband/hw/hns/hns_roce_hw_v2.c index 2824d390ec31..a396ba85cdce 100644 --- a/drivers/infiniband/hw/hns/hns_roce_hw_v2.c +++ b/drivers/infiniband/hw/hns/hns_roce_hw_v2.c @@ -3181,21 +3181,22 @@ static int set_mtpt_pbl(struct hns_roce_dev *hr_dev, u64 pages[HNS_ROCE_V2_MAX_INNER_MTPT_NUM] = { 0 }; struct ib_device *ibdev = &hr_dev->ib_dev; dma_addr_t pbl_ba; - int i, count; + int ret; + int i;
- count = hns_roce_mtr_find(hr_dev, &mr->pbl_mtr, 0, pages, - min_t(int, ARRAY_SIZE(pages), mr->npages), - &pbl_ba); - if (count < 1) { - ibdev_err(ibdev, "failed to find PBL mtr, count = %d.\n", - count); - return -ENOBUFS; + ret = hns_roce_mtr_find(hr_dev, &mr->pbl_mtr, 0, pages, + min_t(int, ARRAY_SIZE(pages), mr->npages)); + if (ret) { + ibdev_err(ibdev, "failed to find PBL mtr, ret = %d.\n", ret); + return ret; }
/* Aligned to the hardware address access unit */ - for (i = 0; i < count; i++) + for (i = 0; i < ARRAY_SIZE(pages); i++) pages[i] >>= 6;
+ pbl_ba = hns_roce_get_mtr_ba(&mr->pbl_mtr); + mpt_entry->pbl_size = cpu_to_le32(mr->npages); mpt_entry->pbl_ba_l = cpu_to_le32(pbl_ba >> 3); hr_reg_write(mpt_entry, MPT_PBL_BA_H, upper_32_bits(pbl_ba >> 3)); @@ -3294,18 +3295,12 @@ static int hns_roce_v2_rereg_write_mtpt(struct hns_roce_dev *hr_dev, static int hns_roce_v2_frmr_write_mtpt(struct hns_roce_dev *hr_dev, void *mb_buf, struct hns_roce_mr *mr) { - struct ib_device *ibdev = &hr_dev->ib_dev; + dma_addr_t pbl_ba = hns_roce_get_mtr_ba(&mr->pbl_mtr); struct hns_roce_v2_mpt_entry *mpt_entry; - dma_addr_t pbl_ba = 0;
mpt_entry = mb_buf; memset(mpt_entry, 0, sizeof(*mpt_entry));
- if (hns_roce_mtr_find(hr_dev, &mr->pbl_mtr, 0, NULL, 0, &pbl_ba) < 0) { - ibdev_err(ibdev, "failed to find frmr mtr.\n"); - return -ENOBUFS; - } - hr_reg_write(mpt_entry, MPT_ST, V2_MPT_ST_FREE); hr_reg_write(mpt_entry, MPT_PD, mr->pd);
@@ -4333,17 +4328,20 @@ static int config_qp_rq_buf(struct hns_roce_dev *hr_dev, { u64 mtts[MTT_MIN_COUNT] = { 0 }; u64 wqe_sge_ba; - int count; + int ret;
/* Search qp buf's mtts */ - count = hns_roce_mtr_find(hr_dev, &hr_qp->mtr, hr_qp->rq.offset, mtts, - MTT_MIN_COUNT, &wqe_sge_ba); - if (hr_qp->rq.wqe_cnt && count < 1) { + ret = hns_roce_mtr_find(hr_dev, &hr_qp->mtr, hr_qp->rq.offset, mtts, + MTT_MIN_COUNT); + if (hr_qp->rq.wqe_cnt && ret) { ibdev_err(&hr_dev->ib_dev, - "failed to find RQ WQE, QPN = 0x%lx.\n", hr_qp->qpn); - return -EINVAL; + "failed to find QP(0x%lx) RQ WQE buf, ret = %d.\n", + hr_qp->qpn, ret); + return ret; }
+ wqe_sge_ba = hns_roce_get_mtr_ba(&hr_qp->mtr); + context->wqe_sge_ba = cpu_to_le32(wqe_sge_ba >> 3); qpc_mask->wqe_sge_ba = 0;
@@ -4407,23 +4405,23 @@ static int config_qp_sq_buf(struct hns_roce_dev *hr_dev, struct ib_device *ibdev = &hr_dev->ib_dev; u64 sge_cur_blk = 0; u64 sq_cur_blk = 0; - int count; + int ret;
/* search qp buf's mtts */ - count = hns_roce_mtr_find(hr_dev, &hr_qp->mtr, 0, &sq_cur_blk, 1, NULL); - if (count < 1) { - ibdev_err(ibdev, "failed to find QP(0x%lx) SQ buf.\n", - hr_qp->qpn); - return -EINVAL; + ret = hns_roce_mtr_find(hr_dev, &hr_qp->mtr, hr_qp->sq.offset, + &sq_cur_blk, 1); + if (ret) { + ibdev_err(ibdev, "failed to find QP(0x%lx) SQ WQE buf, ret = %d.\n", + hr_qp->qpn, ret); + return ret; } if (hr_qp->sge.sge_cnt > 0) { - count = hns_roce_mtr_find(hr_dev, &hr_qp->mtr, - hr_qp->sge.offset, - &sge_cur_blk, 1, NULL); - if (count < 1) { - ibdev_err(ibdev, "failed to find QP(0x%lx) SGE buf.\n", - hr_qp->qpn); - return -EINVAL; + ret = hns_roce_mtr_find(hr_dev, &hr_qp->mtr, + hr_qp->sge.offset, &sge_cur_blk, 1); + if (ret) { + ibdev_err(ibdev, "failed to find QP(0x%lx) SGE buf, ret = %d.\n", + hr_qp->qpn, ret); + return ret; } }
@@ -5550,18 +5548,20 @@ static int hns_roce_v2_write_srqc_index_queue(struct hns_roce_srq *srq, struct ib_device *ibdev = srq->ibsrq.device; struct hns_roce_dev *hr_dev = to_hr_dev(ibdev); u64 mtts_idx[MTT_MIN_COUNT] = {}; - dma_addr_t dma_handle_idx = 0; + dma_addr_t dma_handle_idx; int ret;
/* Get physical address of idx que buf */ ret = hns_roce_mtr_find(hr_dev, &idx_que->mtr, 0, mtts_idx, - ARRAY_SIZE(mtts_idx), &dma_handle_idx); - if (ret < 1) { + ARRAY_SIZE(mtts_idx)); + if (ret) { ibdev_err(ibdev, "failed to find mtr for SRQ idx, ret = %d.\n", ret); - return -ENOBUFS; + return ret; }
+ dma_handle_idx = hns_roce_get_mtr_ba(&idx_que->mtr); + hr_reg_write(ctx, SRQC_IDX_HOP_NUM, to_hr_hem_hopnum(hr_dev->caps.idx_hop_num, srq->wqe_cnt));
@@ -5593,20 +5593,22 @@ static int hns_roce_v2_write_srqc(struct hns_roce_srq *srq, void *mb_buf) struct hns_roce_dev *hr_dev = to_hr_dev(ibdev); struct hns_roce_srq_context *ctx = mb_buf; u64 mtts_wqe[MTT_MIN_COUNT] = {}; - dma_addr_t dma_handle_wqe = 0; + dma_addr_t dma_handle_wqe; int ret;
memset(ctx, 0, sizeof(*ctx));
/* Get the physical address of srq buf */ ret = hns_roce_mtr_find(hr_dev, &srq->buf_mtr, 0, mtts_wqe, - ARRAY_SIZE(mtts_wqe), &dma_handle_wqe); - if (ret < 1) { + ARRAY_SIZE(mtts_wqe)); + if (ret) { ibdev_err(ibdev, "failed to find mtr for SRQ WQE, ret = %d.\n", ret); - return -ENOBUFS; + return ret; }
+ dma_handle_wqe = hns_roce_get_mtr_ba(&srq->buf_mtr); + hr_reg_write(ctx, SRQC_SRQ_ST, 1); hr_reg_write_bool(ctx, SRQC_SRQ_TYPE, srq->ibsrq.srq_type == IB_SRQT_XRC); @@ -6327,7 +6329,7 @@ static int config_eqc(struct hns_roce_dev *hr_dev, struct hns_roce_eq *eq, u64 eqe_ba[MTT_MIN_COUNT] = { 0 }; struct hns_roce_eq_context *eqc; u64 bt_ba = 0; - int count; + int ret;
eqc = mb_buf; memset(eqc, 0, sizeof(struct hns_roce_eq_context)); @@ -6335,13 +6337,15 @@ static int config_eqc(struct hns_roce_dev *hr_dev, struct hns_roce_eq *eq, init_eq_config(hr_dev, eq);
/* if not multi-hop, eqe buffer only use one trunk */ - count = hns_roce_mtr_find(hr_dev, &eq->mtr, 0, eqe_ba, MTT_MIN_COUNT, - &bt_ba); - if (count < 1) { - dev_err(hr_dev->dev, "failed to find EQE mtr\n"); - return -ENOBUFS; + ret = hns_roce_mtr_find(hr_dev, &eq->mtr, 0, eqe_ba, + ARRAY_SIZE(eqe_ba)); + if (ret) { + dev_err(hr_dev->dev, "failed to find EQE mtr, ret = %d\n", ret); + return ret; }
+ bt_ba = hns_roce_get_mtr_ba(&eq->mtr); + hr_reg_write(eqc, EQC_EQ_ST, HNS_ROCE_V2_EQ_STATE_VALID); hr_reg_write(eqc, EQC_EQE_HOP_NUM, eq->hop_num); hr_reg_write(eqc, EQC_OVER_IGNORE, eq->over_ignore); diff --git a/drivers/infiniband/hw/hns/hns_roce_mr.c b/drivers/infiniband/hw/hns/hns_roce_mr.c index 7f29a55d378f..7b59b95f87c2 100644 --- a/drivers/infiniband/hw/hns/hns_roce_mr.c +++ b/drivers/infiniband/hw/hns/hns_roce_mr.c @@ -802,47 +802,53 @@ int hns_roce_mtr_map(struct hns_roce_dev *hr_dev, struct hns_roce_mtr *mtr, return ret; }
-int hns_roce_mtr_find(struct hns_roce_dev *hr_dev, struct hns_roce_mtr *mtr, - u32 offset, u64 *mtt_buf, int mtt_max, u64 *base_addr) +static int hns_roce_get_direct_addr_mtt(struct hns_roce_hem_cfg *cfg, + u32 start_index, u64 *mtt_buf, + int mtt_cnt) { - struct hns_roce_hem_cfg *cfg = &mtr->hem_cfg; - int mtt_count, left; - u32 start_index; + int mtt_count; int total = 0; - __le64 *mtts; u32 npage; u64 addr;
- if (!mtt_buf || mtt_max < 1) - goto done; - - /* no mtt memory in direct mode, so just return the buffer address */ - if (cfg->is_direct) { - start_index = offset >> HNS_HW_PAGE_SHIFT; - for (mtt_count = 0; mtt_count < cfg->region_count && - total < mtt_max; mtt_count++) { - npage = cfg->region[mtt_count].offset; - if (npage < start_index) - continue; + if (mtt_cnt > cfg->region_count) + return -EINVAL;
- addr = cfg->root_ba + (npage << HNS_HW_PAGE_SHIFT); - mtt_buf[total] = addr; + for (mtt_count = 0; mtt_count < cfg->region_count && total < mtt_cnt; + mtt_count++) { + npage = cfg->region[mtt_count].offset; + if (npage < start_index) + continue;
- total++; - } + addr = cfg->root_ba + (npage << HNS_HW_PAGE_SHIFT); + mtt_buf[total] = addr;
- goto done; + total++; }
- start_index = offset >> cfg->buf_pg_shift; - left = mtt_max; + if (!total) + return -ENOENT; + + return 0; +} + +static int hns_roce_get_mhop_mtt(struct hns_roce_dev *hr_dev, + struct hns_roce_mtr *mtr, u32 start_index, + u64 *mtt_buf, int mtt_cnt) +{ + int left = mtt_cnt; + int total = 0; + int mtt_count; + __le64 *mtts; + u32 npage; + while (left > 0) { mtt_count = 0; mtts = hns_roce_hem_list_find_mtt(hr_dev, &mtr->hem_list, start_index + total, &mtt_count); if (!mtts || !mtt_count) - goto done; + break;
npage = min(mtt_count, left); left -= npage; @@ -850,11 +856,33 @@ int hns_roce_mtr_find(struct hns_roce_dev *hr_dev, struct hns_roce_mtr *mtr, mtt_buf[total++] = le64_to_cpu(mtts[mtt_count]); }
-done: - if (base_addr) - *base_addr = cfg->root_ba; + if (!total) + return -ENOENT; + + return 0; +} + +int hns_roce_mtr_find(struct hns_roce_dev *hr_dev, struct hns_roce_mtr *mtr, + u32 offset, u64 *mtt_buf, int mtt_max) +{ + struct hns_roce_hem_cfg *cfg = &mtr->hem_cfg; + u32 start_index; + int ret; + + if (!mtt_buf || mtt_max < 1) + return -EINVAL;
- return total; + /* no mtt memory in direct mode, so just return the buffer address */ + if (cfg->is_direct) { + start_index = offset >> HNS_HW_PAGE_SHIFT; + ret = hns_roce_get_direct_addr_mtt(cfg, start_index, + mtt_buf, mtt_max); + } else { + start_index = offset >> cfg->buf_pg_shift; + ret = hns_roce_get_mhop_mtt(hr_dev, mtr, start_index, + mtt_buf, mtt_max); + } + return ret; }
static int mtr_init_buf_cfg(struct hns_roce_dev *hr_dev,
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Chengchang Tang tangchengchang@huawei.com
[ Upstream commit f4caa864af84f801a5821ea2ba6c1cc46f8252c1 ]
Remove unused parameters and variables.
Signed-off-by: Chengchang Tang tangchengchang@huawei.com Signed-off-by: Junxian Huang huangjunxian6@hisilicon.com Link: https://lore.kernel.org/r/20240412091616.370789-3-huangjunxian6@hisilicon.co... Signed-off-by: Leon Romanovsky leon@kernel.org Stable-dep-of: 8673a6c2d9e4 ("RDMA/hns: Fix mapping error of zero-hop WQE buffer") Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/infiniband/hw/hns/hns_roce_alloc.c | 3 +-- drivers/infiniband/hw/hns/hns_roce_device.h | 5 ++--- drivers/infiniband/hw/hns/hns_roce_hem.c | 13 +++++-------- drivers/infiniband/hw/hns/hns_roce_hw_v2.c | 20 +++++++------------- drivers/infiniband/hw/hns/hns_roce_mr.c | 4 ++-- drivers/infiniband/hw/hns/hns_roce_qp.c | 4 +--- drivers/infiniband/hw/hns/hns_roce_srq.c | 4 ++-- 7 files changed, 20 insertions(+), 33 deletions(-)
diff --git a/drivers/infiniband/hw/hns/hns_roce_alloc.c b/drivers/infiniband/hw/hns/hns_roce_alloc.c index 11a78ceae568..950c133d4220 100644 --- a/drivers/infiniband/hw/hns/hns_roce_alloc.c +++ b/drivers/infiniband/hw/hns/hns_roce_alloc.c @@ -153,8 +153,7 @@ int hns_roce_get_kmem_bufs(struct hns_roce_dev *hr_dev, dma_addr_t *bufs, return total; }
-int hns_roce_get_umem_bufs(struct hns_roce_dev *hr_dev, dma_addr_t *bufs, - int buf_cnt, struct ib_umem *umem, +int hns_roce_get_umem_bufs(dma_addr_t *bufs, int buf_cnt, struct ib_umem *umem, unsigned int page_shift) { struct ib_block_iter biter; diff --git a/drivers/infiniband/hw/hns/hns_roce_device.h b/drivers/infiniband/hw/hns/hns_roce_device.h index a835368548e5..03b6546f63cd 100644 --- a/drivers/infiniband/hw/hns/hns_roce_device.h +++ b/drivers/infiniband/hw/hns/hns_roce_device.h @@ -892,8 +892,7 @@ struct hns_roce_hw { int (*rereg_write_mtpt)(struct hns_roce_dev *hr_dev, struct hns_roce_mr *mr, int flags, void *mb_buf); - int (*frmr_write_mtpt)(struct hns_roce_dev *hr_dev, void *mb_buf, - struct hns_roce_mr *mr); + int (*frmr_write_mtpt)(void *mb_buf, struct hns_roce_mr *mr); int (*mw_write_mtpt)(void *mb_buf, struct hns_roce_mw *mw); void (*write_cqc)(struct hns_roce_dev *hr_dev, struct hns_roce_cq *hr_cq, void *mb_buf, u64 *mtts, @@ -1193,7 +1192,7 @@ struct hns_roce_buf *hns_roce_buf_alloc(struct hns_roce_dev *hr_dev, u32 size, int hns_roce_get_kmem_bufs(struct hns_roce_dev *hr_dev, dma_addr_t *bufs, int buf_cnt, struct hns_roce_buf *buf, unsigned int page_shift); -int hns_roce_get_umem_bufs(struct hns_roce_dev *hr_dev, dma_addr_t *bufs, +int hns_roce_get_umem_bufs(dma_addr_t *bufs, int buf_cnt, struct ib_umem *umem, unsigned int page_shift);
diff --git a/drivers/infiniband/hw/hns/hns_roce_hem.c b/drivers/infiniband/hw/hns/hns_roce_hem.c index 0ab514c49d5e..d3fabf64e390 100644 --- a/drivers/infiniband/hw/hns/hns_roce_hem.c +++ b/drivers/infiniband/hw/hns/hns_roce_hem.c @@ -1043,15 +1043,13 @@ static void hem_list_free_all(struct hns_roce_dev *hr_dev, } }
-static void hem_list_link_bt(struct hns_roce_dev *hr_dev, void *base_addr, - u64 table_addr) +static void hem_list_link_bt(void *base_addr, u64 table_addr) { *(u64 *)(base_addr) = table_addr; }
/* assign L0 table address to hem from root bt */ -static void hem_list_assign_bt(struct hns_roce_dev *hr_dev, - struct hns_roce_hem_item *hem, void *cpu_addr, +static void hem_list_assign_bt(struct hns_roce_hem_item *hem, void *cpu_addr, u64 phy_addr) { hem->addr = cpu_addr; @@ -1222,8 +1220,7 @@ static int hem_list_alloc_mid_bt(struct hns_roce_dev *hr_dev, if (level > 1) { pre = hem_ptrs[level - 1]; step = (cur->start - pre->start) / step * BA_BYTE_LEN; - hem_list_link_bt(hr_dev, pre->addr + step, - cur->dma_addr); + hem_list_link_bt(pre->addr + step, cur->dma_addr); } }
@@ -1281,7 +1278,7 @@ static int alloc_fake_root_bt(struct hns_roce_dev *hr_dev, void *cpu_base, if (!hem) return -ENOMEM;
- hem_list_assign_bt(hr_dev, hem, cpu_base, phy_base); + hem_list_assign_bt(hem, cpu_base, phy_base); list_add(&hem->list, branch_head); list_add(&hem->sibling, leaf_head);
@@ -1304,7 +1301,7 @@ static int setup_middle_bt(struct hns_roce_dev *hr_dev, void *cpu_base, /* if exist mid bt, link L1 to L0 */ list_for_each_entry_safe(hem, temp_hem, branch_head, list) { offset = (hem->start - r->offset) / step * BA_BYTE_LEN; - hem_list_link_bt(hr_dev, cpu_base + offset, hem->dma_addr); + hem_list_link_bt(cpu_base + offset, hem->dma_addr); total++; }
diff --git a/drivers/infiniband/hw/hns/hns_roce_hw_v2.c b/drivers/infiniband/hw/hns/hns_roce_hw_v2.c index a396ba85cdce..aed9c403f3be 100644 --- a/drivers/infiniband/hw/hns/hns_roce_hw_v2.c +++ b/drivers/infiniband/hw/hns/hns_roce_hw_v2.c @@ -3292,8 +3292,7 @@ static int hns_roce_v2_rereg_write_mtpt(struct hns_roce_dev *hr_dev, return ret; }
-static int hns_roce_v2_frmr_write_mtpt(struct hns_roce_dev *hr_dev, - void *mb_buf, struct hns_roce_mr *mr) +static int hns_roce_v2_frmr_write_mtpt(void *mb_buf, struct hns_roce_mr *mr) { dma_addr_t pbl_ba = hns_roce_get_mtr_ba(&mr->pbl_mtr); struct hns_roce_v2_mpt_entry *mpt_entry; @@ -4208,8 +4207,7 @@ static void set_access_flags(struct hns_roce_qp *hr_qp, }
static void set_qpc_wqe_cnt(struct hns_roce_qp *hr_qp, - struct hns_roce_v2_qp_context *context, - struct hns_roce_v2_qp_context *qpc_mask) + struct hns_roce_v2_qp_context *context) { hr_reg_write(context, QPC_SGE_SHIFT, to_hr_hem_entries_shift(hr_qp->sge.sge_cnt, @@ -4231,7 +4229,6 @@ static inline int get_pdn(struct ib_pd *ib_pd) }
static void modify_qp_reset_to_init(struct ib_qp *ibqp, - const struct ib_qp_attr *attr, struct hns_roce_v2_qp_context *context, struct hns_roce_v2_qp_context *qpc_mask) { @@ -4250,7 +4247,7 @@ static void modify_qp_reset_to_init(struct ib_qp *ibqp,
hr_reg_write(context, QPC_RQWS, ilog2(hr_qp->rq.max_gs));
- set_qpc_wqe_cnt(hr_qp, context, qpc_mask); + set_qpc_wqe_cnt(hr_qp, context);
/* No VLAN need to set 0xFFF */ hr_reg_write(context, QPC_VLAN_ID, 0xfff); @@ -4291,7 +4288,6 @@ static void modify_qp_reset_to_init(struct ib_qp *ibqp, }
static void modify_qp_init_to_init(struct ib_qp *ibqp, - const struct ib_qp_attr *attr, struct hns_roce_v2_qp_context *context, struct hns_roce_v2_qp_context *qpc_mask) { @@ -4612,8 +4608,7 @@ static int modify_qp_init_to_rtr(struct ib_qp *ibqp, return 0; }
-static int modify_qp_rtr_to_rts(struct ib_qp *ibqp, - const struct ib_qp_attr *attr, int attr_mask, +static int modify_qp_rtr_to_rts(struct ib_qp *ibqp, int attr_mask, struct hns_roce_v2_qp_context *context, struct hns_roce_v2_qp_context *qpc_mask) { @@ -4982,15 +4977,14 @@ static int hns_roce_v2_set_abs_fields(struct ib_qp *ibqp,
if (cur_state == IB_QPS_RESET && new_state == IB_QPS_INIT) { memset(qpc_mask, 0, hr_dev->caps.qpc_sz); - modify_qp_reset_to_init(ibqp, attr, context, qpc_mask); + modify_qp_reset_to_init(ibqp, context, qpc_mask); } else if (cur_state == IB_QPS_INIT && new_state == IB_QPS_INIT) { - modify_qp_init_to_init(ibqp, attr, context, qpc_mask); + modify_qp_init_to_init(ibqp, context, qpc_mask); } else if (cur_state == IB_QPS_INIT && new_state == IB_QPS_RTR) { ret = modify_qp_init_to_rtr(ibqp, attr, attr_mask, context, qpc_mask, udata); } else if (cur_state == IB_QPS_RTR && new_state == IB_QPS_RTS) { - ret = modify_qp_rtr_to_rts(ibqp, attr, attr_mask, context, - qpc_mask); + ret = modify_qp_rtr_to_rts(ibqp, attr_mask, context, qpc_mask); }
return ret; diff --git a/drivers/infiniband/hw/hns/hns_roce_mr.c b/drivers/infiniband/hw/hns/hns_roce_mr.c index 7b59b95f87c2..acdcd5c9f42f 100644 --- a/drivers/infiniband/hw/hns/hns_roce_mr.c +++ b/drivers/infiniband/hw/hns/hns_roce_mr.c @@ -154,7 +154,7 @@ static int hns_roce_mr_enable(struct hns_roce_dev *hr_dev, if (mr->type != MR_TYPE_FRMR) ret = hr_dev->hw->write_mtpt(hr_dev, mailbox->buf, mr); else - ret = hr_dev->hw->frmr_write_mtpt(hr_dev, mailbox->buf, mr); + ret = hr_dev->hw->frmr_write_mtpt(mailbox->buf, mr); if (ret) { dev_err(dev, "failed to write mtpt, ret = %d.\n", ret); goto err_page; @@ -714,7 +714,7 @@ static int mtr_map_bufs(struct hns_roce_dev *hr_dev, struct hns_roce_mtr *mtr, return -ENOMEM;
if (mtr->umem) - npage = hns_roce_get_umem_bufs(hr_dev, pages, page_count, + npage = hns_roce_get_umem_bufs(pages, page_count, mtr->umem, page_shift); else npage = hns_roce_get_kmem_bufs(hr_dev, pages, page_count, diff --git a/drivers/infiniband/hw/hns/hns_roce_qp.c b/drivers/infiniband/hw/hns/hns_roce_qp.c index 88a4777d29f8..97d79c8d5cd0 100644 --- a/drivers/infiniband/hw/hns/hns_roce_qp.c +++ b/drivers/infiniband/hw/hns/hns_roce_qp.c @@ -1075,7 +1075,6 @@ static int set_qp_param(struct hns_roce_dev *hr_dev, struct hns_roce_qp *hr_qp, }
static int hns_roce_create_qp_common(struct hns_roce_dev *hr_dev, - struct ib_pd *ib_pd, struct ib_qp_init_attr *init_attr, struct ib_udata *udata, struct hns_roce_qp *hr_qp) @@ -1229,7 +1228,6 @@ int hns_roce_create_qp(struct ib_qp *qp, struct ib_qp_init_attr *init_attr, struct ib_device *ibdev = qp->device; struct hns_roce_dev *hr_dev = to_hr_dev(ibdev); struct hns_roce_qp *hr_qp = to_hr_qp(qp); - struct ib_pd *pd = qp->pd; int ret;
ret = check_qp_type(hr_dev, init_attr->qp_type, !!udata); @@ -1244,7 +1242,7 @@ int hns_roce_create_qp(struct ib_qp *qp, struct ib_qp_init_attr *init_attr, hr_qp->phy_port = hr_dev->iboe.phy_port[hr_qp->port]; }
- ret = hns_roce_create_qp_common(hr_dev, pd, init_attr, udata, hr_qp); + ret = hns_roce_create_qp_common(hr_dev, init_attr, udata, hr_qp); if (ret) ibdev_err(ibdev, "create QP type 0x%x failed(%d)\n", init_attr->qp_type, ret); diff --git a/drivers/infiniband/hw/hns/hns_roce_srq.c b/drivers/infiniband/hw/hns/hns_roce_srq.c index 652508b660a0..80fcb1b0e8fd 100644 --- a/drivers/infiniband/hw/hns/hns_roce_srq.c +++ b/drivers/infiniband/hw/hns/hns_roce_srq.c @@ -249,7 +249,7 @@ static void free_srq_wqe_buf(struct hns_roce_dev *hr_dev, hns_roce_mtr_destroy(hr_dev, &srq->buf_mtr); }
-static int alloc_srq_wrid(struct hns_roce_dev *hr_dev, struct hns_roce_srq *srq) +static int alloc_srq_wrid(struct hns_roce_srq *srq) { srq->wrid = kvmalloc_array(srq->wqe_cnt, sizeof(u64), GFP_KERNEL); if (!srq->wrid) @@ -365,7 +365,7 @@ static int alloc_srq_buf(struct hns_roce_dev *hr_dev, struct hns_roce_srq *srq, goto err_idx;
if (!udata) { - ret = alloc_srq_wrid(hr_dev, srq); + ret = alloc_srq_wrid(srq); if (ret) goto err_wqe_buf; }
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: wenglianfa wenglianfa@huawei.com
[ Upstream commit 8673a6c2d9e483dfeeef83a1f06f59e05636f4d1 ]
Due to HW limitation, the three region of WQE buffer must be mapped and set to HW in a fixed order: SQ buffer, SGE buffer, and RQ buffer.
Currently when one region is zero-hop while the other two are not, the zero-hop region will not be mapped. This violate the limitation above and leads to address error.
Fixes: 38389eaa4db1 ("RDMA/hns: Add mtr support for mixed multihop addressing") Signed-off-by: wenglianfa wenglianfa@huawei.com Signed-off-by: Junxian Huang huangjunxian6@hisilicon.com Link: https://patch.msgid.link/20241220055249.146943-2-huangjunxian6@hisilicon.com Signed-off-by: Leon Romanovsky leon@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/infiniband/hw/hns/hns_roce_hem.c | 43 ++++++++++++++++-------- drivers/infiniband/hw/hns/hns_roce_mr.c | 5 --- 2 files changed, 29 insertions(+), 19 deletions(-)
diff --git a/drivers/infiniband/hw/hns/hns_roce_hem.c b/drivers/infiniband/hw/hns/hns_roce_hem.c index d3fabf64e390..51ab6041ca91 100644 --- a/drivers/infiniband/hw/hns/hns_roce_hem.c +++ b/drivers/infiniband/hw/hns/hns_roce_hem.c @@ -986,6 +986,7 @@ struct hns_roce_hem_item { size_t count; /* max ba numbers */ int start; /* start buf offset in this hem */ int end; /* end buf offset in this hem */ + bool exist_bt; };
/* All HEM items are linked in a tree structure */ @@ -1014,6 +1015,7 @@ hem_list_alloc_item(struct hns_roce_dev *hr_dev, int start, int end, int count, } }
+ hem->exist_bt = exist_bt; hem->count = count; hem->start = start; hem->end = end; @@ -1024,22 +1026,22 @@ hem_list_alloc_item(struct hns_roce_dev *hr_dev, int start, int end, int count, }
static void hem_list_free_item(struct hns_roce_dev *hr_dev, - struct hns_roce_hem_item *hem, bool exist_bt) + struct hns_roce_hem_item *hem) { - if (exist_bt) + if (hem->exist_bt) dma_free_coherent(hr_dev->dev, hem->count * BA_BYTE_LEN, hem->addr, hem->dma_addr); kfree(hem); }
static void hem_list_free_all(struct hns_roce_dev *hr_dev, - struct list_head *head, bool exist_bt) + struct list_head *head) { struct hns_roce_hem_item *hem, *temp_hem;
list_for_each_entry_safe(hem, temp_hem, head, list) { list_del(&hem->list); - hem_list_free_item(hr_dev, hem, exist_bt); + hem_list_free_item(hr_dev, hem); } }
@@ -1139,6 +1141,10 @@ int hns_roce_hem_list_calc_root_ba(const struct hns_roce_buf_region *regions,
for (i = 0; i < region_cnt; i++) { r = (struct hns_roce_buf_region *)®ions[i]; + /* when r->hopnum = 0, the region should not occupy root_ba. */ + if (!r->hopnum) + continue; + if (r->hopnum > 1) { step = hem_list_calc_ba_range(r->hopnum, 1, unit); if (step > 0) @@ -1232,7 +1238,7 @@ static int hem_list_alloc_mid_bt(struct hns_roce_dev *hr_dev,
err_exit: for (level = 1; level < hopnum; level++) - hem_list_free_all(hr_dev, &temp_list[level], true); + hem_list_free_all(hr_dev, &temp_list[level]);
return ret; } @@ -1273,16 +1279,26 @@ static int alloc_fake_root_bt(struct hns_roce_dev *hr_dev, void *cpu_base, { struct hns_roce_hem_item *hem;
+ /* This is on the has_mtt branch, if r->hopnum + * is 0, there is no root_ba to reuse for the + * region's fake hem, so a dma_alloc request is + * necessary here. + */ hem = hem_list_alloc_item(hr_dev, r->offset, r->offset + r->count - 1, - r->count, false); + r->count, !r->hopnum); if (!hem) return -ENOMEM;
- hem_list_assign_bt(hem, cpu_base, phy_base); + /* The root_ba can be reused only when r->hopnum > 0. */ + if (r->hopnum) + hem_list_assign_bt(hem, cpu_base, phy_base); list_add(&hem->list, branch_head); list_add(&hem->sibling, leaf_head);
- return r->count; + /* If r->hopnum == 0, 0 is returned, + * so that the root_bt entry is not occupied. + */ + return r->hopnum ? r->count : 0; }
static int setup_middle_bt(struct hns_roce_dev *hr_dev, void *cpu_base, @@ -1326,7 +1342,7 @@ setup_root_hem(struct hns_roce_dev *hr_dev, struct hns_roce_hem_list *hem_list, return -ENOMEM;
total = 0; - for (i = 0; i < region_cnt && total < max_ba_num; i++) { + for (i = 0; i < region_cnt && total <= max_ba_num; i++) { r = ®ions[i]; if (!r->count) continue; @@ -1392,9 +1408,9 @@ static int hem_list_alloc_root_bt(struct hns_roce_dev *hr_dev, region_cnt); if (ret) { for (i = 0; i < region_cnt; i++) - hem_list_free_all(hr_dev, &head.branch[i], false); + hem_list_free_all(hr_dev, &head.branch[i]);
- hem_list_free_all(hr_dev, &head.root, true); + hem_list_free_all(hr_dev, &head.root); }
return ret; @@ -1457,10 +1473,9 @@ void hns_roce_hem_list_release(struct hns_roce_dev *hr_dev,
for (i = 0; i < HNS_ROCE_MAX_BT_REGION; i++) for (j = 0; j < HNS_ROCE_MAX_BT_LEVEL; j++) - hem_list_free_all(hr_dev, &hem_list->mid_bt[i][j], - j != 0); + hem_list_free_all(hr_dev, &hem_list->mid_bt[i][j]);
- hem_list_free_all(hr_dev, &hem_list->root_bt, true); + hem_list_free_all(hr_dev, &hem_list->root_bt); INIT_LIST_HEAD(&hem_list->btm_bt); hem_list->root_ba = 0; } diff --git a/drivers/infiniband/hw/hns/hns_roce_mr.c b/drivers/infiniband/hw/hns/hns_roce_mr.c index acdcd5c9f42f..408ef2a96149 100644 --- a/drivers/infiniband/hw/hns/hns_roce_mr.c +++ b/drivers/infiniband/hw/hns/hns_roce_mr.c @@ -767,11 +767,6 @@ int hns_roce_mtr_map(struct hns_roce_dev *hr_dev, struct hns_roce_mtr *mtr, for (i = 0, mapped_cnt = 0; i < mtr->hem_cfg.region_count && mapped_cnt < page_cnt; i++) { r = &mtr->hem_cfg.region[i]; - /* if hopnum is 0, no need to map pages in this region */ - if (!r->hopnum) { - mapped_cnt += r->count; - continue; - }
if (r->offset + r->count > page_cnt) { ret = -EINVAL;
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Chengchang Tang tangchengchang@huawei.com
[ Upstream commit fa5c4ba8cdbfd2c2d6422e001311c8213283ebbf ]
WARN_ON() is called in the IO path. And it could lead to a warning storm. Use WARN_ON_ONCE() instead of WARN_ON().
Fixes: 12542f1de179 ("RDMA/hns: Refactor process about opcode in post_send()") Signed-off-by: Chengchang Tang tangchengchang@huawei.com Signed-off-by: Junxian Huang huangjunxian6@hisilicon.com Link: https://patch.msgid.link/20241220055249.146943-4-huangjunxian6@hisilicon.com Signed-off-by: Leon Romanovsky leon@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/infiniband/hw/hns/hns_roce_hw_v2.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/infiniband/hw/hns/hns_roce_hw_v2.c b/drivers/infiniband/hw/hns/hns_roce_hw_v2.c index aed9c403f3be..c9c9be122471 100644 --- a/drivers/infiniband/hw/hns/hns_roce_hw_v2.c +++ b/drivers/infiniband/hw/hns/hns_roce_hw_v2.c @@ -471,7 +471,7 @@ static inline int set_ud_wqe(struct hns_roce_qp *qp, valid_num_sge = calc_wr_sge_num(wr, &msg_len);
ret = set_ud_opcode(ud_sq_wqe, wr); - if (WARN_ON(ret)) + if (WARN_ON_ONCE(ret)) return ret;
ud_sq_wqe->msg_len = cpu_to_le32(msg_len); @@ -575,7 +575,7 @@ static inline int set_rc_wqe(struct hns_roce_qp *qp, rc_sq_wqe->msg_len = cpu_to_le32(msg_len);
ret = set_rc_opcode(hr_dev, rc_sq_wqe, wr); - if (WARN_ON(ret)) + if (WARN_ON_ONCE(ret)) return ret;
hr_reg_write(rc_sq_wqe, RC_SEND_WQE_SO,
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Chengchang Tang tangchengchang@huawei.com
[ Upstream commit e3debdd48423d3d75b9d366399228d7225d902cd ]
Flush CQE handler has not been called if QP state gets into errored mode in DWQE path. So, the new added outstanding WQEs will never be flushed.
It leads to a hung task timeout when using NFS over RDMA: __switch_to+0x7c/0xd0 __schedule+0x350/0x750 schedule+0x50/0xf0 schedule_timeout+0x2c8/0x340 wait_for_common+0xf4/0x2b0 wait_for_completion+0x20/0x40 __ib_drain_sq+0x140/0x1d0 [ib_core] ib_drain_sq+0x98/0xb0 [ib_core] rpcrdma_xprt_disconnect+0x68/0x270 [rpcrdma] xprt_rdma_close+0x20/0x60 [rpcrdma] xprt_autoclose+0x64/0x1cc [sunrpc] process_one_work+0x1d8/0x4e0 worker_thread+0x154/0x420 kthread+0x108/0x150 ret_from_fork+0x10/0x18
Fixes: 01584a5edcc4 ("RDMA/hns: Add support of direct wqe") Signed-off-by: Chengchang Tang tangchengchang@huawei.com Signed-off-by: Junxian Huang huangjunxian6@hisilicon.com Link: https://patch.msgid.link/20241220055249.146943-5-huangjunxian6@hisilicon.com Signed-off-by: Leon Romanovsky leon@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/infiniband/hw/hns/hns_roce_hw_v2.c | 4 ++++ 1 file changed, 4 insertions(+)
diff --git a/drivers/infiniband/hw/hns/hns_roce_hw_v2.c b/drivers/infiniband/hw/hns/hns_roce_hw_v2.c index c9c9be122471..aded0a7f4283 100644 --- a/drivers/infiniband/hw/hns/hns_roce_hw_v2.c +++ b/drivers/infiniband/hw/hns/hns_roce_hw_v2.c @@ -673,6 +673,10 @@ static void write_dwqe(struct hns_roce_dev *hr_dev, struct hns_roce_qp *qp, #define HNS_ROCE_SL_SHIFT 2 struct hns_roce_v2_rc_send_wqe *rc_sq_wqe = wqe;
+ if (unlikely(qp->state == IB_QPS_ERR)) { + flush_cqe(hr_dev, qp); + return; + } /* All kinds of DirectWQE have the same header field layout */ hr_reg_enable(rc_sq_wqe, RC_SEND_WQE_FLAG); hr_reg_write(rc_sq_wqe, RC_SEND_WQE_DB_SL_L, qp->sl);
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Andrew Halaney ahalaney@redhat.com
[ Upstream commit f3c2caacee824ce4a331cdafb0b8dc8e987f105e ]
Currently a MDIO bus is created if the devicetree description is either:
1. Not fixed-link 2. fixed-link but contains a MDIO bus as well
The "1" case above isn't always accurate. If there's a phy-handle, it could be referencing a phy on another MDIO controller's bus[1]. In this case, where the MDIO bus is not described at all, currently stmmac will make a MDIO bus and scan its address space to discover phys (of which there are none). This process takes time scanning a bus that is known to be empty, delaying time to complete probe.
There are also a lot of upstream devicetrees[2] that expect a MDIO bus to be created, scanned for phys, and the first one found connected to the MAC. This case can be inferred from the platform description by not having a phy-handle && not being fixed-link. This hits case "1" in the current driver's logic, and must be handled in any logic change here since it is a valid legacy dt-binding.
Let's improve the logic to create a MDIO bus if either:
- Devicetree contains a MDIO bus - !fixed-link && !phy-handle (legacy handling)
This way the case where no MDIO bus should be made is handled, as well as retaining backwards compatibility with the valid cases.
Below devicetree snippets can be found that explain some of the cases above more concretely.
Here's[0] a devicetree example where the MAC is both fixed-link and driving a switch on MDIO (case "2" above). This needs a MDIO bus to be created:
&fec1 { phy-mode = "rmii";
fixed-link { speed = <100>; full-duplex; };
mdio1: mdio { switch0: switch0@0 { compatible = "marvell,mv88e6190"; pinctrl-0 = <&pinctrl_gpio_switch0>; }; }; };
Here's[1] an example where there is no MDIO bus or fixed-link for the ethernet1 MAC, so no MDIO bus should be created since ethernet0 is the MDIO master for ethernet1's phy:
ðernet0 { phy-mode = "sgmii"; phy-handle = <&sgmii_phy0>;
mdio { compatible = "snps,dwmac-mdio"; sgmii_phy0: phy@8 { compatible = "ethernet-phy-id0141.0dd4"; reg = <0x8>; device_type = "ethernet-phy"; };
sgmii_phy1: phy@a { compatible = "ethernet-phy-id0141.0dd4"; reg = <0xa>; device_type = "ethernet-phy"; }; }; };
ðernet1 { phy-mode = "sgmii"; phy-handle = <&sgmii_phy1>; };
Finally there's descriptions like this[2] which don't describe the MDIO bus but expect it to be created and the whole address space scanned for a phy since there's no phy-handle or fixed-link described:
&gmac { phy-supply = <&vcc_lan>; phy-mode = "rmii"; snps,reset-gpio = <&gpio3 RK_PB4 GPIO_ACTIVE_HIGH>; snps,reset-active-low; snps,reset-delays-us = <0 10000 1000000>; };
[0] https://elixir.bootlin.com/linux/v6.5-rc5/source/arch/arm/boot/dts/nxp/vf/vf... [1] https://elixir.bootlin.com/linux/v6.6-rc5/source/arch/arm64/boot/dts/qcom/sa... [2] https://elixir.bootlin.com/linux/v6.6-rc5/source/arch/arm64/boot/dts/rockchi...
Reviewed-by: Serge Semin fancer.lancer@gmail.com Co-developed-by: Bartosz Golaszewski bartosz.golaszewski@linaro.org Signed-off-by: Bartosz Golaszewski bartosz.golaszewski@linaro.org Signed-off-by: Andrew Halaney ahalaney@redhat.com Signed-off-by: David S. Miller davem@davemloft.net Stable-dep-of: 2b6ffcd7873b ("net: stmmac: restructure the error path of stmmac_probe_config_dt()") Signed-off-by: Sasha Levin sashal@kernel.org --- .../ethernet/stmicro/stmmac/stmmac_platform.c | 91 +++++++++++-------- 1 file changed, 54 insertions(+), 37 deletions(-)
diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_platform.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_platform.c index b4fdd40be63c..d73b2c17cc6c 100644 --- a/drivers/net/ethernet/stmicro/stmmac/stmmac_platform.c +++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_platform.c @@ -296,62 +296,80 @@ static int stmmac_mtl_setup(struct platform_device *pdev, }
/** - * stmmac_dt_phy - parse device-tree driver parameters to allocate PHY resources - * @plat: driver data platform structure - * @np: device tree node - * @dev: device pointer - * Description: - * The mdio bus will be allocated in case of a phy transceiver is on board; - * it will be NULL if the fixed-link is configured. - * If there is the "snps,dwmac-mdio" sub-node the mdio will be allocated - * in any case (for DSA, mdio must be registered even if fixed-link). - * The table below sums the supported configurations: - * ------------------------------- - * snps,phy-addr | Y - * ------------------------------- - * phy-handle | Y - * ------------------------------- - * fixed-link | N - * ------------------------------- - * snps,dwmac-mdio | - * even if | Y - * fixed-link | - * ------------------------------- + * stmmac_of_get_mdio() - Gets the MDIO bus from the devicetree. + * @np: devicetree node * - * It returns 0 in case of success otherwise -ENODEV. + * The MDIO bus will be searched for in the following ways: + * 1. The compatible is "snps,dwc-qos-ethernet-4.10" && a "mdio" named + * child node exists + * 2. A child node with the "snps,dwmac-mdio" compatible is present + * + * Return: The MDIO node if present otherwise NULL */ -static int stmmac_dt_phy(struct plat_stmmacenet_data *plat, - struct device_node *np, struct device *dev) +static struct device_node *stmmac_of_get_mdio(struct device_node *np) { - bool mdio = !of_phy_is_fixed_link(np); static const struct of_device_id need_mdio_ids[] = { { .compatible = "snps,dwc-qos-ethernet-4.10" }, {}, }; + struct device_node *mdio_node = NULL;
if (of_match_node(need_mdio_ids, np)) { - plat->mdio_node = of_get_child_by_name(np, "mdio"); + mdio_node = of_get_child_by_name(np, "mdio"); } else { /** * If snps,dwmac-mdio is passed from DT, always register * the MDIO */ - for_each_child_of_node(np, plat->mdio_node) { - if (of_device_is_compatible(plat->mdio_node, + for_each_child_of_node(np, mdio_node) { + if (of_device_is_compatible(mdio_node, "snps,dwmac-mdio")) break; } }
- if (plat->mdio_node) { + return mdio_node; +} + +/** + * stmmac_mdio_setup() - Populate platform related MDIO structures. + * @plat: driver data platform structure + * @np: devicetree node + * @dev: device pointer + * + * This searches for MDIO information from the devicetree. + * If an MDIO node is found, it's assigned to plat->mdio_node and + * plat->mdio_bus_data is allocated. + * If no connection can be determined, just plat->mdio_bus_data is allocated + * to indicate a bus should be created and scanned for a phy. + * If it's determined there's no MDIO bus needed, both are left NULL. + * + * This expects that plat->phy_node has already been searched for. + * + * Return: 0 on success, errno otherwise. + */ +static int stmmac_mdio_setup(struct plat_stmmacenet_data *plat, + struct device_node *np, struct device *dev) +{ + bool legacy_mdio; + + plat->mdio_node = stmmac_of_get_mdio(np); + if (plat->mdio_node) dev_dbg(dev, "Found MDIO subnode\n"); - mdio = true; - }
- if (mdio) { - plat->mdio_bus_data = - devm_kzalloc(dev, sizeof(struct stmmac_mdio_bus_data), - GFP_KERNEL); + /* Legacy devicetrees allowed for no MDIO bus description and expect + * the bus to be scanned for devices. If there's no phy or fixed-link + * described assume this is the case since there must be something + * connected to the MAC. + */ + legacy_mdio = !of_phy_is_fixed_link(np) && !plat->phy_node; + if (legacy_mdio) + dev_info(dev, "Deprecated MDIO bus assumption used\n"); + + if (plat->mdio_node || legacy_mdio) { + plat->mdio_bus_data = devm_kzalloc(dev, + sizeof(*plat->mdio_bus_data), + GFP_KERNEL); if (!plat->mdio_bus_data) return -ENOMEM;
@@ -455,8 +473,7 @@ stmmac_probe_config_dt(struct platform_device *pdev, u8 *mac) if (of_property_read_u32(np, "snps,phy-addr", &plat->phy_addr) == 0) dev_warn(&pdev->dev, "snps,phy-addr property is deprecated\n");
- /* To Configure PHY by using all device-tree supported properties */ - rc = stmmac_dt_phy(plat, np, &pdev->dev); + rc = stmmac_mdio_setup(plat, np, &pdev->dev); if (rc) return ERR_PTR(rc);
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Joe Hattori joe@pf.is.s.u-tokyo.ac.jp
[ Upstream commit 2b6ffcd7873b7e8a62c3e15a6f305bfc747c466b ]
Current implementation of stmmac_probe_config_dt() does not release the OF node reference obtained by of_parse_phandle() in some error paths. The problem is that some error paths call stmmac_remove_config_dt() to clean up but others use and unwind ladder. These two types of error handling have not kept in sync and have been a recurring source of bugs. Re-write the error handling in stmmac_probe_config_dt() to use an unwind ladder. Consequently, stmmac_remove_config_dt() is not needed anymore, thus remove it.
This bug was found by an experimental verification tool that I am developing.
Fixes: 4838a5405028 ("net: stmmac: Fix wrapper drivers not detecting PHY") Signed-off-by: Joe Hattori joe@pf.is.s.u-tokyo.ac.jp Link: https://patch.msgid.link/20241219024119.2017012-1-joe@pf.is.s.u-tokyo.ac.jp Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- .../ethernet/stmicro/stmmac/stmmac_platform.c | 27 ++++++++++++------- 1 file changed, 17 insertions(+), 10 deletions(-)
diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_platform.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_platform.c index d73b2c17cc6c..4d570efd9d4b 100644 --- a/drivers/net/ethernet/stmicro/stmmac/stmmac_platform.c +++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_platform.c @@ -474,8 +474,10 @@ stmmac_probe_config_dt(struct platform_device *pdev, u8 *mac) dev_warn(&pdev->dev, "snps,phy-addr property is deprecated\n");
rc = stmmac_mdio_setup(plat, np, &pdev->dev); - if (rc) - return ERR_PTR(rc); + if (rc) { + ret = ERR_PTR(rc); + goto error_put_phy; + }
of_property_read_u32(np, "tx-fifo-depth", &plat->tx_fifo_size);
@@ -564,8 +566,8 @@ stmmac_probe_config_dt(struct platform_device *pdev, u8 *mac) dma_cfg = devm_kzalloc(&pdev->dev, sizeof(*dma_cfg), GFP_KERNEL); if (!dma_cfg) { - stmmac_remove_config_dt(pdev, plat); - return ERR_PTR(-ENOMEM); + ret = ERR_PTR(-ENOMEM); + goto error_put_mdio; } plat->dma_cfg = dma_cfg;
@@ -593,8 +595,8 @@ stmmac_probe_config_dt(struct platform_device *pdev, u8 *mac)
rc = stmmac_mtl_setup(pdev, plat); if (rc) { - stmmac_remove_config_dt(pdev, plat); - return ERR_PTR(rc); + ret = ERR_PTR(rc); + goto error_put_mdio; }
/* clock setup */ @@ -646,6 +648,10 @@ stmmac_probe_config_dt(struct platform_device *pdev, u8 *mac) clk_disable_unprepare(plat->pclk); error_pclk_get: clk_disable_unprepare(plat->stmmac_clk); +error_put_mdio: + of_node_put(plat->mdio_node); +error_put_phy: + of_node_put(plat->phy_node);
return ret; } @@ -654,16 +660,17 @@ static void devm_stmmac_remove_config_dt(void *data) { struct plat_stmmacenet_data *plat = data;
- /* Platform data argument is unused */ - stmmac_remove_config_dt(NULL, plat); + clk_disable_unprepare(plat->stmmac_clk); + clk_disable_unprepare(plat->pclk); + of_node_put(plat->mdio_node); + of_node_put(plat->phy_node); }
/** * devm_stmmac_probe_config_dt * @pdev: platform_device structure * @mac: MAC address to use - * Description: Devres variant of stmmac_probe_config_dt(). Does not require - * the user to call stmmac_remove_config_dt() at driver detach. + * Description: Devres variant of stmmac_probe_config_dt(). */ struct plat_stmmacenet_data * devm_stmmac_probe_config_dt(struct platform_device *pdev, u8 *mac)
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Wang Liang wangliang74@huawei.com
[ Upstream commit 4f4aa4aa28142d53f8b06585c478476cfe325cfc ]
If inet_csk_reqsk_queue_hash_add() return false, tcp_conn_request() will return without free the dst memory, which allocated in af_ops->route_req.
Here is the kmemleak stack:
unreferenced object 0xffff8881198631c0 (size 240): comm "softirq", pid 0, jiffies 4299266571 (age 1802.392s) hex dump (first 32 bytes): 00 10 9b 03 81 88 ff ff 80 98 da bc ff ff ff ff ................ 81 55 18 bb ff ff ff ff 00 00 00 00 00 00 00 00 .U.............. backtrace: [<ffffffffb93e8d4c>] kmem_cache_alloc+0x60c/0xa80 [<ffffffffba11b4c5>] dst_alloc+0x55/0x250 [<ffffffffba227bf6>] rt_dst_alloc+0x46/0x1d0 [<ffffffffba23050a>] __mkroute_output+0x29a/0xa50 [<ffffffffba23456b>] ip_route_output_key_hash+0x10b/0x240 [<ffffffffba2346bd>] ip_route_output_flow+0x1d/0x90 [<ffffffffba254855>] inet_csk_route_req+0x2c5/0x500 [<ffffffffba26b331>] tcp_conn_request+0x691/0x12c0 [<ffffffffba27bd08>] tcp_rcv_state_process+0x3c8/0x11b0 [<ffffffffba2965c6>] tcp_v4_do_rcv+0x156/0x3b0 [<ffffffffba299c98>] tcp_v4_rcv+0x1cf8/0x1d80 [<ffffffffba239656>] ip_protocol_deliver_rcu+0xf6/0x360 [<ffffffffba2399a6>] ip_local_deliver_finish+0xe6/0x1e0 [<ffffffffba239b8e>] ip_local_deliver+0xee/0x360 [<ffffffffba239ead>] ip_rcv+0xad/0x2f0 [<ffffffffba110943>] __netif_receive_skb_one_core+0x123/0x140
Call dst_release() to free the dst memory when inet_csk_reqsk_queue_hash_add() return false in tcp_conn_request().
Fixes: ff46e3b44219 ("Fix race for duplicate reqsk on identical SYN") Signed-off-by: Wang Liang wangliang74@huawei.com Link: https://patch.msgid.link/20241219072859.3783576-1-wangliang74@huawei.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- net/ipv4/tcp_input.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c index fb053942dba2..f6a213bae5cc 100644 --- a/net/ipv4/tcp_input.c +++ b/net/ipv4/tcp_input.c @@ -7192,6 +7192,7 @@ int tcp_conn_request(struct request_sock_ops *rsk_ops, if (unlikely(!inet_csk_reqsk_queue_hash_add(sk, req, req->timeout))) { reqsk_free(req); + dst_release(dst); return 0; }
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Eric Dumazet edumazet@google.com
[ Upstream commit f694eee9e1c00d6ca06c5e59c04e3b6ff7d64aa9 ]
t->parms.link is read locklessly, annotate these reads and opposite writes accordingly.
Signed-off-by: Eric Dumazet edumazet@google.com Signed-off-by: David S. Miller davem@davemloft.net Stable-dep-of: b5a7b661a073 ("net: Fix netns for ip_tunnel_init_flow()") Signed-off-by: Sasha Levin sashal@kernel.org --- net/ipv4/ip_tunnel.c | 27 +++++++++++++-------------- 1 file changed, 13 insertions(+), 14 deletions(-)
diff --git a/net/ipv4/ip_tunnel.c b/net/ipv4/ip_tunnel.c index 72b2d68ef4da..0f5cfe3caa2e 100644 --- a/net/ipv4/ip_tunnel.c +++ b/net/ipv4/ip_tunnel.c @@ -102,10 +102,9 @@ struct ip_tunnel *ip_tunnel_lookup(struct ip_tunnel_net *itn, if (!ip_tunnel_key_match(&t->parms, flags, key)) continue;
- if (t->parms.link == link) + if (READ_ONCE(t->parms.link) == link) return t; - else - cand = t; + cand = t; }
hlist_for_each_entry_rcu(t, head, hash_node) { @@ -117,9 +116,9 @@ struct ip_tunnel *ip_tunnel_lookup(struct ip_tunnel_net *itn, if (!ip_tunnel_key_match(&t->parms, flags, key)) continue;
- if (t->parms.link == link) + if (READ_ONCE(t->parms.link) == link) return t; - else if (!cand) + if (!cand) cand = t; }
@@ -137,9 +136,9 @@ struct ip_tunnel *ip_tunnel_lookup(struct ip_tunnel_net *itn, if (!ip_tunnel_key_match(&t->parms, flags, key)) continue;
- if (t->parms.link == link) + if (READ_ONCE(t->parms.link) == link) return t; - else if (!cand) + if (!cand) cand = t; }
@@ -150,9 +149,9 @@ struct ip_tunnel *ip_tunnel_lookup(struct ip_tunnel_net *itn, !(t->dev->flags & IFF_UP)) continue;
- if (t->parms.link == link) + if (READ_ONCE(t->parms.link) == link) return t; - else if (!cand) + if (!cand) cand = t; }
@@ -221,7 +220,7 @@ static struct ip_tunnel *ip_tunnel_find(struct ip_tunnel_net *itn, hlist_for_each_entry_rcu(t, head, hash_node) { if (local == t->parms.iph.saddr && remote == t->parms.iph.daddr && - link == t->parms.link && + link == READ_ONCE(t->parms.link) && type == t->dev->type && ip_tunnel_key_match(&t->parms, flags, key)) break; @@ -774,7 +773,7 @@ void ip_tunnel_xmit(struct sk_buff *skb, struct net_device *dev,
ip_tunnel_init_flow(&fl4, protocol, dst, tnl_params->saddr, tunnel->parms.o_key, RT_TOS(tos), - dev_net(dev), tunnel->parms.link, + dev_net(dev), READ_ONCE(tunnel->parms.link), tunnel->fwmark, skb_get_hash(skb), 0);
if (ip_tunnel_encap(skb, &tunnel->encap, &protocol, &fl4) < 0) @@ -894,7 +893,7 @@ static void ip_tunnel_update(struct ip_tunnel_net *itn, if (t->parms.link != p->link || t->fwmark != fwmark) { int mtu;
- t->parms.link = p->link; + WRITE_ONCE(t->parms.link, p->link); t->fwmark = fwmark; mtu = ip_tunnel_bind_dev(dev); if (set_mtu) @@ -1084,9 +1083,9 @@ EXPORT_SYMBOL(ip_tunnel_get_link_net);
int ip_tunnel_get_iflink(const struct net_device *dev) { - struct ip_tunnel *tunnel = netdev_priv(dev); + const struct ip_tunnel *tunnel = netdev_priv(dev);
- return tunnel->parms.link; + return READ_ONCE(tunnel->parms.link); } EXPORT_SYMBOL(ip_tunnel_get_iflink);
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Ido Schimmel idosch@nvidia.com
[ Upstream commit e7191e517a03d025405c7df730b400ad4118474e ]
Unmask the upper DSCP bits when initializing an IPv4 flow key via ip_tunnel_init_flow() before passing it to ip_route_output_key() so that in the future we could perform the FIB lookup according to the full DSCP value.
Signed-off-by: Ido Schimmel idosch@nvidia.com Reviewed-by: Guillaume Nault gnault@redhat.com Signed-off-by: David S. Miller davem@davemloft.net Stable-dep-of: b5a7b661a073 ("net: Fix netns for ip_tunnel_init_flow()") Signed-off-by: Sasha Levin sashal@kernel.org --- net/ipv4/ip_tunnel.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/net/ipv4/ip_tunnel.c b/net/ipv4/ip_tunnel.c index 0f5cfe3caa2e..571cf7c2fa28 100644 --- a/net/ipv4/ip_tunnel.c +++ b/net/ipv4/ip_tunnel.c @@ -293,7 +293,7 @@ static int ip_tunnel_bind_dev(struct net_device *dev)
ip_tunnel_init_flow(&fl4, iph->protocol, iph->daddr, iph->saddr, tunnel->parms.o_key, - RT_TOS(iph->tos), dev_net(dev), + iph->tos & INET_DSCP_MASK, dev_net(dev), tunnel->parms.link, tunnel->fwmark, 0, 0); rt = ip_route_output_key(tunnel->net, &fl4);
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Ido Schimmel idosch@nvidia.com
[ Upstream commit c34cfe72bb260fc49660d9e6a9ba95ba01669ae2 ]
Unmask the upper DSCP bits when initializing an IPv4 flow key via ip_tunnel_init_flow() before passing it to ip_route_output_key() so that in the future we could perform the FIB lookup according to the full DSCP value.
Note that the 'tos' variable includes the full DS field. Either the one specified via the tunnel key or the one inherited from the inner packet.
Signed-off-by: Ido Schimmel idosch@nvidia.com Reviewed-by: Guillaume Nault gnault@redhat.com Signed-off-by: David S. Miller davem@davemloft.net Stable-dep-of: b5a7b661a073 ("net: Fix netns for ip_tunnel_init_flow()") Signed-off-by: Sasha Levin sashal@kernel.org --- net/ipv4/ip_tunnel.c | 7 ++++--- 1 file changed, 4 insertions(+), 3 deletions(-)
diff --git a/net/ipv4/ip_tunnel.c b/net/ipv4/ip_tunnel.c index 571cf7c2fa28..b5437755365c 100644 --- a/net/ipv4/ip_tunnel.c +++ b/net/ipv4/ip_tunnel.c @@ -43,6 +43,7 @@ #include <net/rtnetlink.h> #include <net/udp.h> #include <net/dst_metadata.h> +#include <net/inet_dscp.h>
#if IS_ENABLED(CONFIG_IPV6) #include <net/ipv6.h> @@ -609,9 +610,9 @@ void ip_md_tunnel_xmit(struct sk_buff *skb, struct net_device *dev, tos = ipv6_get_dsfield((const struct ipv6hdr *)inner_iph); } ip_tunnel_init_flow(&fl4, proto, key->u.ipv4.dst, key->u.ipv4.src, - tunnel_id_to_key32(key->tun_id), RT_TOS(tos), - dev_net(dev), 0, skb->mark, skb_get_hash(skb), - key->flow_flags); + tunnel_id_to_key32(key->tun_id), + tos & INET_DSCP_MASK, dev_net(dev), 0, skb->mark, + skb_get_hash(skb), key->flow_flags);
if (!tunnel_hlen) tunnel_hlen = ip_encap_hlen(&tun_info->encap);
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Ido Schimmel idosch@nvidia.com
[ Upstream commit c2b639f9f3b7a058ca9c7349b096f355773f2cd8 ]
Unmask the upper DSCP bits when initializing an IPv4 flow key via ip_tunnel_init_flow() before passing it to ip_route_output_key() so that in the future we could perform the FIB lookup according to the full DSCP value.
Note that the 'tos' variable includes the full DS field. Either the one specified as part of the tunnel parameters or the one inherited from the inner packet.
Signed-off-by: Ido Schimmel idosch@nvidia.com Reviewed-by: Guillaume Nault gnault@redhat.com Signed-off-by: David S. Miller davem@davemloft.net Stable-dep-of: b5a7b661a073 ("net: Fix netns for ip_tunnel_init_flow()") Signed-off-by: Sasha Levin sashal@kernel.org --- net/ipv4/ip_tunnel.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/net/ipv4/ip_tunnel.c b/net/ipv4/ip_tunnel.c index b5437755365c..fd8923561b18 100644 --- a/net/ipv4/ip_tunnel.c +++ b/net/ipv4/ip_tunnel.c @@ -773,7 +773,7 @@ void ip_tunnel_xmit(struct sk_buff *skb, struct net_device *dev, }
ip_tunnel_init_flow(&fl4, protocol, dst, tnl_params->saddr, - tunnel->parms.o_key, RT_TOS(tos), + tunnel->parms.o_key, tos & INET_DSCP_MASK, dev_net(dev), READ_ONCE(tunnel->parms.link), tunnel->fwmark, skb_get_hash(skb), 0);
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Xiao Liang shaw.leon@gmail.com
[ Upstream commit b5a7b661a073727219fedc35f5619f62418ffe72 ]
The device denoted by tunnel->parms.link resides in the underlay net namespace. Therefore pass tunnel->net to ip_tunnel_init_flow().
Fixes: db53cd3d88dc ("net: Handle l3mdev in ip_tunnel_init_flow") Signed-off-by: Xiao Liang shaw.leon@gmail.com Reviewed-by: Ido Schimmel idosch@nvidia.com Link: https://patch.msgid.link/20241219130336.103839-1-shaw.leon@gmail.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/mellanox/mlxsw/spectrum_span.c | 3 +-- net/ipv4/ip_tunnel.c | 6 +++--- 2 files changed, 4 insertions(+), 5 deletions(-)
diff --git a/drivers/net/ethernet/mellanox/mlxsw/spectrum_span.c b/drivers/net/ethernet/mellanox/mlxsw/spectrum_span.c index dcd198104141..fa3fef2b74db 100644 --- a/drivers/net/ethernet/mellanox/mlxsw/spectrum_span.c +++ b/drivers/net/ethernet/mellanox/mlxsw/spectrum_span.c @@ -423,8 +423,7 @@ mlxsw_sp_span_gretap4_route(const struct net_device *to_dev,
parms = mlxsw_sp_ipip_netdev_parms4(to_dev); ip_tunnel_init_flow(&fl4, parms.iph.protocol, *daddrp, *saddrp, - 0, 0, dev_net(to_dev), parms.link, tun->fwmark, 0, - 0); + 0, 0, tun->net, parms.link, tun->fwmark, 0, 0);
rt = ip_route_output_key(tun->net, &fl4); if (IS_ERR(rt)) diff --git a/net/ipv4/ip_tunnel.c b/net/ipv4/ip_tunnel.c index fd8923561b18..dd1803bf9c5c 100644 --- a/net/ipv4/ip_tunnel.c +++ b/net/ipv4/ip_tunnel.c @@ -294,7 +294,7 @@ static int ip_tunnel_bind_dev(struct net_device *dev)
ip_tunnel_init_flow(&fl4, iph->protocol, iph->daddr, iph->saddr, tunnel->parms.o_key, - iph->tos & INET_DSCP_MASK, dev_net(dev), + iph->tos & INET_DSCP_MASK, tunnel->net, tunnel->parms.link, tunnel->fwmark, 0, 0); rt = ip_route_output_key(tunnel->net, &fl4);
@@ -611,7 +611,7 @@ void ip_md_tunnel_xmit(struct sk_buff *skb, struct net_device *dev, } ip_tunnel_init_flow(&fl4, proto, key->u.ipv4.dst, key->u.ipv4.src, tunnel_id_to_key32(key->tun_id), - tos & INET_DSCP_MASK, dev_net(dev), 0, skb->mark, + tos & INET_DSCP_MASK, tunnel->net, 0, skb->mark, skb_get_hash(skb), key->flow_flags);
if (!tunnel_hlen) @@ -774,7 +774,7 @@ void ip_tunnel_xmit(struct sk_buff *skb, struct net_device *dev,
ip_tunnel_init_flow(&fl4, protocol, dst, tnl_params->saddr, tunnel->parms.o_key, tos & INET_DSCP_MASK, - dev_net(dev), READ_ONCE(tunnel->parms.link), + tunnel->net, READ_ONCE(tunnel->parms.link), tunnel->fwmark, skb_get_hash(skb), 0);
if (ip_tunnel_encap(skb, &tunnel->encap, &protocol, &fl4) < 0)
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Ilya Shchipletsov rabbelkin@mail.ru
[ Upstream commit a4fd163aed2edd967a244499754dec991d8b4c7d ]
Syzkaller reports an uninit value read from ax25cmp when sending raw message through ieee802154 implementation.
===================================================== BUG: KMSAN: uninit-value in ax25cmp+0x3a5/0x460 net/ax25/ax25_addr.c:119 ax25cmp+0x3a5/0x460 net/ax25/ax25_addr.c:119 nr_dev_get+0x20e/0x450 net/netrom/nr_route.c:601 nr_route_frame+0x1a2/0xfc0 net/netrom/nr_route.c:774 nr_xmit+0x5a/0x1c0 net/netrom/nr_dev.c:144 __netdev_start_xmit include/linux/netdevice.h:4940 [inline] netdev_start_xmit include/linux/netdevice.h:4954 [inline] xmit_one net/core/dev.c:3548 [inline] dev_hard_start_xmit+0x247/0xa10 net/core/dev.c:3564 __dev_queue_xmit+0x33b8/0x5130 net/core/dev.c:4349 dev_queue_xmit include/linux/netdevice.h:3134 [inline] raw_sendmsg+0x654/0xc10 net/ieee802154/socket.c:299 ieee802154_sock_sendmsg+0x91/0xc0 net/ieee802154/socket.c:96 sock_sendmsg_nosec net/socket.c:730 [inline] __sock_sendmsg net/socket.c:745 [inline] ____sys_sendmsg+0x9c2/0xd60 net/socket.c:2584 ___sys_sendmsg+0x28d/0x3c0 net/socket.c:2638 __sys_sendmsg net/socket.c:2667 [inline] __do_sys_sendmsg net/socket.c:2676 [inline] __se_sys_sendmsg net/socket.c:2674 [inline] __x64_sys_sendmsg+0x307/0x490 net/socket.c:2674 do_syscall_x64 arch/x86/entry/common.c:52 [inline] do_syscall_64+0x44/0x110 arch/x86/entry/common.c:83 entry_SYSCALL_64_after_hwframe+0x63/0x6b
Uninit was created at: slab_post_alloc_hook+0x129/0xa70 mm/slab.h:768 slab_alloc_node mm/slub.c:3478 [inline] kmem_cache_alloc_node+0x5e9/0xb10 mm/slub.c:3523 kmalloc_reserve+0x13d/0x4a0 net/core/skbuff.c:560 __alloc_skb+0x318/0x740 net/core/skbuff.c:651 alloc_skb include/linux/skbuff.h:1286 [inline] alloc_skb_with_frags+0xc8/0xbd0 net/core/skbuff.c:6334 sock_alloc_send_pskb+0xa80/0xbf0 net/core/sock.c:2780 sock_alloc_send_skb include/net/sock.h:1884 [inline] raw_sendmsg+0x36d/0xc10 net/ieee802154/socket.c:282 ieee802154_sock_sendmsg+0x91/0xc0 net/ieee802154/socket.c:96 sock_sendmsg_nosec net/socket.c:730 [inline] __sock_sendmsg net/socket.c:745 [inline] ____sys_sendmsg+0x9c2/0xd60 net/socket.c:2584 ___sys_sendmsg+0x28d/0x3c0 net/socket.c:2638 __sys_sendmsg net/socket.c:2667 [inline] __do_sys_sendmsg net/socket.c:2676 [inline] __se_sys_sendmsg net/socket.c:2674 [inline] __x64_sys_sendmsg+0x307/0x490 net/socket.c:2674 do_syscall_x64 arch/x86/entry/common.c:52 [inline] do_syscall_64+0x44/0x110 arch/x86/entry/common.c:83 entry_SYSCALL_64_after_hwframe+0x63/0x6b
CPU: 0 PID: 5037 Comm: syz-executor166 Not tainted 6.7.0-rc7-syzkaller-00003-gfbafc3e621c3 #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 11/17/2023 =====================================================
This issue occurs because the skb buffer is too small, and it's actual allocation is aligned. This hides an actual issue, which is that nr_route_frame does not validate the buffer size before using it.
Fix this issue by checking skb->len before accessing any fields in skb->data.
Found by Linux Verification Center (linuxtesting.org) with Syzkaller.
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Co-developed-by: Nikita Marushkin hfggklm@gmail.com Signed-off-by: Nikita Marushkin hfggklm@gmail.com Signed-off-by: Ilya Shchipletsov rabbelkin@mail.ru Link: https://patch.msgid.link/20241219082308.3942-1-rabbelkin@mail.ru Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- net/netrom/nr_route.c | 6 ++++++ 1 file changed, 6 insertions(+)
diff --git a/net/netrom/nr_route.c b/net/netrom/nr_route.c index bd2b17b219ae..0b270893ee14 100644 --- a/net/netrom/nr_route.c +++ b/net/netrom/nr_route.c @@ -754,6 +754,12 @@ int nr_route_frame(struct sk_buff *skb, ax25_cb *ax25) int ret; struct sk_buff *skbn;
+ /* + * Reject malformed packets early. Check that it contains at least 2 + * addresses and 1 byte more for Time-To-Live + */ + if (skb->len < 2 * sizeof(ax25_address) + 1) + return 0;
nr_src = (ax25_address *)(skb->data + 0); nr_dest = (ax25_address *)(skb->data + 7);
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Shahar Shitrit shshitrit@nvidia.com
[ Upstream commit 050a4c011b0dfeb91664a5d7bd3647ff38db08ce ]
When creating a software steering completion queue (CQ), an arbitrary MSIX vector n is selected. This results in the CQ sharing the same Ethernet traffic channel n associated with the chosen vector. However, the value of n is often unpredictable, which can introduce complications for interrupt monitoring and verification tools.
Moreover, SW steering uses polling rather than event-driven interrupts. Therefore, there is no need to select any MSIX vector other than the existing vector 0 for CQ creation.
In light of these factors, and to enhance predictability, we modify the code to consistently select MSIX vector 0 for CQ creation.
Fixes: 297cccebdc5a ("net/mlx5: DR, Expose an internal API to issue RDMA operations") Signed-off-by: Shahar Shitrit shshitrit@nvidia.com Reviewed-by: Yevgeny Kliteynik kliteyn@nvidia.com Signed-off-by: Tariq Toukan tariqt@nvidia.com Link: https://patch.msgid.link/20241220081505.1286093-2-tariqt@nvidia.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/mellanox/mlx5/core/steering/dr_send.c | 4 +--- 1 file changed, 1 insertion(+), 3 deletions(-)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_send.c b/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_send.c index 6fa06ba2d346..f57c84e5128b 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_send.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_send.c @@ -1067,7 +1067,6 @@ static struct mlx5dr_cq *dr_create_cq(struct mlx5_core_dev *mdev, int inlen, err, eqn; void *cqc, *in; __be64 *pas; - int vector; u32 i;
cq = kzalloc(sizeof(*cq), GFP_KERNEL); @@ -1096,8 +1095,7 @@ static struct mlx5dr_cq *dr_create_cq(struct mlx5_core_dev *mdev, if (!in) goto err_cqwq;
- vector = raw_smp_processor_id() % mlx5_comp_vectors_max(mdev); - err = mlx5_comp_eqn_get(mdev, vector, &eqn); + err = mlx5_comp_eqn_get(mdev, 0, &eqn); if (err) { kvfree(in); goto err_cqwq;
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Dragos Tatulea dtatulea@nvidia.com
[ Upstream commit 8c6254479b3d5bd788d2b5fefaa48fb194331ed0 ]
In MACsec, it is possible to create multiple active TX SAs on a SC, but only one such SA can be used at a time for transmission. This SA is selected through the encoding_sa link parameter.
When there are 2 or more active TX SAs configured (encoding_sa=0): ip macsec add macsec0 tx sa 0 pn 1 on key 00 <KEY1> ip macsec add macsec0 tx sa 1 pn 1 on key 00 <KEY2>
... the traffic should be still sent via TX SA 0 as the encoding_sa was not changed. However, the driver ignores the encoding_sa and overrides it to SA 1 by installing the flow steering id of the newly created TX SA into the SCI -> flow steering id hash map. The future packet tx descriptors will point to the incorrect flow steering rule (SA 1).
This patch fixes the issue by avoiding the creation of the flow steering rule for an active TX SA that is not the encoding_sa. The driver side tx_sa object and the FW side macsec object are still created. When the encoding_sa link parameter is changed to another active TX SA, only the new flow steering rule will be created in the mlx5e_macsec_upd_txsa() handler.
Fixes: 8ff0ac5be144 ("net/mlx5: Add MACsec offload Tx command support") Signed-off-by: Dragos Tatulea dtatulea@nvidia.com Reviewed-by: Cosmin Ratiu cratiu@nvidia.com Reviewed-by: Lior Nahmanson liorna@nvidia.com Signed-off-by: Tariq Toukan tariqt@nvidia.com Link: https://patch.msgid.link/20241220081505.1286093-3-tariqt@nvidia.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/mellanox/mlx5/core/en_accel/macsec.c | 4 ++++ 1 file changed, 4 insertions(+)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_accel/macsec.c b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/macsec.c index cc9bcc420032..6ab02f3fc291 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en_accel/macsec.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/macsec.c @@ -339,9 +339,13 @@ static int mlx5e_macsec_init_sa_fs(struct macsec_context *ctx, { struct mlx5e_priv *priv = macsec_netdev_priv(ctx->netdev); struct mlx5_macsec_fs *macsec_fs = priv->mdev->macsec_fs; + const struct macsec_tx_sc *tx_sc = &ctx->secy->tx_sc; struct mlx5_macsec_rule_attrs rule_attrs; union mlx5_macsec_rule *macsec_rule;
+ if (is_tx && tx_sc->encoding_sa != sa->assoc_num) + return 0; + rule_attrs.macsec_obj_id = sa->macsec_obj_id; rule_attrs.sci = sa->sci; rule_attrs.assoc_num = sa->assoc_num;
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jianbo Liu jianbol@nvidia.com
[ Upstream commit 5a03b368562a7ff5f5f1f63b5adf8309cbdbd5be ]
During driver unload, unregister_netdev is called after unloading vport rep. So, the mlx5e_rep_priv is already freed while trying to get rpriv->netdev, or walk rpriv->tc_ht, which results in use-after-free. So add the checking to make sure access the data of vport rep which is still loaded.
Fixes: d1569537a837 ("net/mlx5e: Modify and restore TC rules for IPSec TX rules") Signed-off-by: Jianbo Liu jianbol@nvidia.com Reviewed-by: Saeed Mahameed saeedm@nvidia.com Signed-off-by: Tariq Toukan tariqt@nvidia.com Link: https://patch.msgid.link/20241220081505.1286093-4-tariqt@nvidia.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/mellanox/mlx5/core/esw/ipsec_fs.c | 6 +++--- drivers/net/ethernet/mellanox/mlx5/core/eswitch.h | 3 +++ drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c | 3 --- 3 files changed, 6 insertions(+), 6 deletions(-)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/esw/ipsec_fs.c b/drivers/net/ethernet/mellanox/mlx5/core/esw/ipsec_fs.c index 13b5916b64e2..eed8fcde2613 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/esw/ipsec_fs.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/esw/ipsec_fs.c @@ -150,11 +150,11 @@ void mlx5_esw_ipsec_restore_dest_uplink(struct mlx5_core_dev *mdev) unsigned long i; int err;
- xa_for_each(&esw->offloads.vport_reps, i, rep) { - rpriv = rep->rep_data[REP_ETH].priv; - if (!rpriv || !rpriv->netdev) + mlx5_esw_for_each_rep(esw, i, rep) { + if (atomic_read(&rep->rep_data[REP_ETH].state) != REP_LOADED) continue;
+ rpriv = rep->rep_data[REP_ETH].priv; rhashtable_walk_enter(&rpriv->tc_ht, &iter); rhashtable_walk_start(&iter); while ((flow = rhashtable_walk_next(&iter)) != NULL) { diff --git a/drivers/net/ethernet/mellanox/mlx5/core/eswitch.h b/drivers/net/ethernet/mellanox/mlx5/core/eswitch.h index 9b771b572593..3e58e731b569 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/eswitch.h +++ b/drivers/net/ethernet/mellanox/mlx5/core/eswitch.h @@ -713,6 +713,9 @@ void mlx5e_tc_clean_fdb_peer_flows(struct mlx5_eswitch *esw); MLX5_CAP_GEN_2((esw->dev), ec_vf_vport_base) +\ (last) - 1)
+#define mlx5_esw_for_each_rep(esw, i, rep) \ + xa_for_each(&((esw)->offloads.vport_reps), i, rep) + struct mlx5_eswitch *__must_check mlx5_devlink_eswitch_get(struct devlink *devlink);
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c b/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c index 58529d1a98b3..7eba3a5bb97c 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c @@ -52,9 +52,6 @@ #include "lag/lag.h" #include "en/tc/post_meter.h"
-#define mlx5_esw_for_each_rep(esw, i, rep) \ - xa_for_each(&((esw)->offloads.vport_reps), i, rep) - /* There are two match-all miss flows, one for unicast dst mac and * one for multicast. */
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Rodrigo Vivi rodrigo.vivi@intel.com
[ Upstream commit 20e7c5313ffbf11c34a46395345677adbe890bee ]
sub-pipe PG is not present on DG1. Setting these bits can disable other power gates and cause GPU hangs on video playbacks.
VLK: 16314, 4304
Closes: https://gitlab.freedesktop.org/drm/i915/kernel/-/issues/13381 Fixes: 85a12d7eb8fe ("drm/i915/tgl: Fix Media power gate sequence.") Cc: Vinay Belgaumkar vinay.belgaumkar@intel.com Cc: Himal Prasad Ghimiray himal.prasad.ghimiray@intel.com Reviewed-by: Vinay Belgaumkar vinay.belgaumkar@intel.com Reviewed-by: Himal Prasad Ghimiray himal.prasad.ghimiray@intel.com Link: https://patchwork.freedesktop.org/patch/msgid/20241219210019.70532-1-rodrigo... Signed-off-by: Rodrigo Vivi rodrigo.vivi@intel.com (cherry picked from commit de7061947b4ed4be857d452c60d5fb795831d79e) Signed-off-by: Tvrtko Ursulin tursulin@ursulin.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/i915/gt/intel_rc6.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/i915/gt/intel_rc6.c b/drivers/gpu/drm/i915/gt/intel_rc6.c index 9e113e947326..6e8c182b2559 100644 --- a/drivers/gpu/drm/i915/gt/intel_rc6.c +++ b/drivers/gpu/drm/i915/gt/intel_rc6.c @@ -133,7 +133,7 @@ static void gen11_rc6_enable(struct intel_rc6 *rc6) GEN9_MEDIA_PG_ENABLE | GEN11_MEDIA_SAMPLER_PG_ENABLE;
- if (GRAPHICS_VER(gt->i915) >= 12) { + if (GRAPHICS_VER(gt->i915) >= 12 && !IS_DG1(gt->i915)) { for (i = 0; i < I915_MAX_VCS; i++) if (HAS_ENGINE(gt, _VCS(i))) pg_enable |= (VDN_HCP_POWERGATE_ENABLE(i) |
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Pablo Neira Ayuso pablo@netfilter.org
[ Upstream commit 542ed8145e6f9392e3d0a86a0e9027d2ffd183e4 ]
Access to genmask field in struct nft_set_ext results in unaligned atomic read:
[ 72.130109] Unable to handle kernel paging request at virtual address ffff0000c2bb708c [ 72.131036] Mem abort info: [ 72.131213] ESR = 0x0000000096000021 [ 72.131446] EC = 0x25: DABT (current EL), IL = 32 bits [ 72.132209] SET = 0, FnV = 0 [ 72.133216] EA = 0, S1PTW = 0 [ 72.134080] FSC = 0x21: alignment fault [ 72.135593] Data abort info: [ 72.137194] ISV = 0, ISS = 0x00000021, ISS2 = 0x00000000 [ 72.142351] CM = 0, WnR = 0, TnD = 0, TagAccess = 0 [ 72.145989] GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0 [ 72.150115] swapper pgtable: 4k pages, 48-bit VAs, pgdp=0000000237d27000 [ 72.154893] [ffff0000c2bb708c] pgd=0000000000000000, p4d=180000023ffff403, pud=180000023f84b403, pmd=180000023f835403, +pte=0068000102bb7707 [ 72.163021] Internal error: Oops: 0000000096000021 [#1] SMP [...] [ 72.170041] CPU: 7 UID: 0 PID: 54 Comm: kworker/7:0 Tainted: G E 6.13.0-rc3+ #2 [ 72.170509] Tainted: [E]=UNSIGNED_MODULE [ 72.170720] Hardware name: QEMU QEMU Virtual Machine, BIOS edk2-stable202302-for-qemu 03/01/2023 [ 72.171192] Workqueue: events_power_efficient nft_rhash_gc [nf_tables] [ 72.171552] pstate: 21400005 (nzCv daif +PAN -UAO -TCO +DIT -SSBS BTYPE=--) [ 72.171915] pc : nft_rhash_gc+0x200/0x2d8 [nf_tables] [ 72.172166] lr : nft_rhash_gc+0x128/0x2d8 [nf_tables] [ 72.172546] sp : ffff800081f2bce0 [ 72.172724] x29: ffff800081f2bd40 x28: ffff0000c2bb708c x27: 0000000000000038 [ 72.173078] x26: ffff0000c6780ef0 x25: ffff0000c643df00 x24: ffff0000c6778f78 [ 72.173431] x23: 000000000000001a x22: ffff0000c4b1f000 x21: ffff0000c6780f78 [ 72.173782] x20: ffff0000c2bb70dc x19: ffff0000c2bb7080 x18: 0000000000000000 [ 72.174135] x17: ffff0000c0a4e1c0 x16: 0000000000003000 x15: 0000ac26d173b978 [ 72.174485] x14: ffffffffffffffff x13: 0000000000000030 x12: ffff0000c6780ef0 [ 72.174841] x11: 0000000000000000 x10: ffff800081f2bcf8 x9 : ffff0000c3000000 [ 72.175193] x8 : 00000000000004be x7 : 0000000000000000 x6 : 0000000000000000 [ 72.175544] x5 : 0000000000000040 x4 : ffff0000c3000010 x3 : 0000000000000000 [ 72.175871] x2 : 0000000000003a98 x1 : ffff0000c2bb708c x0 : 0000000000000004 [ 72.176207] Call trace: [ 72.176316] nft_rhash_gc+0x200/0x2d8 [nf_tables] (P) [ 72.176653] process_one_work+0x178/0x3d0 [ 72.176831] worker_thread+0x200/0x3f0 [ 72.176995] kthread+0xe8/0xf8 [ 72.177130] ret_from_fork+0x10/0x20 [ 72.177289] Code: 54fff984 d503201f d2800080 91003261 (f820303f) [ 72.177557] ---[ end trace 0000000000000000 ]---
Align struct nft_set_ext to word size to address this and documentation it.
pahole reports that this increases the size of elements for rhash and pipapo in 8 bytes on x86_64.
Fixes: 7ffc7481153b ("netfilter: nft_set_hash: skip duplicated elements pending gc run") Signed-off-by: Pablo Neira Ayuso pablo@netfilter.org Signed-off-by: Sasha Levin sashal@kernel.org --- include/net/netfilter/nf_tables.h | 7 +++++-- 1 file changed, 5 insertions(+), 2 deletions(-)
diff --git a/include/net/netfilter/nf_tables.h b/include/net/netfilter/nf_tables.h index b5f9ee5810a3..8321915dddb2 100644 --- a/include/net/netfilter/nf_tables.h +++ b/include/net/netfilter/nf_tables.h @@ -721,15 +721,18 @@ struct nft_set_ext_tmpl { /** * struct nft_set_ext - set extensions * - * @genmask: generation mask + * @genmask: generation mask, but also flags (see NFT_SET_ELEM_DEAD_BIT) * @offset: offsets of individual extension types * @data: beginning of extension data + * + * This structure must be aligned to word size, otherwise atomic bitops + * on genmask field can cause alignment failure on some archs. */ struct nft_set_ext { u8 genmask; u8 offset[NFT_SET_EXT_NUM]; char data[]; -}; +} __aligned(BITS_PER_LONG / 8);
static inline void nft_set_ext_prepare(struct nft_set_ext_tmpl *tmpl) {
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Antonio Pastor antonio.pastor@gmail.com
[ Upstream commit a024e377efed31ecfb39210bed562932321345b3 ]
802.2+LLC+SNAP frames received by napi_complete_done with GRO and DSA have skb->transport_header set two bytes short, or pointing 2 bytes before network_header & skb->data. As snap_rcv expects transport_header to point to SNAP header (OID:PID) after LLC processing advances offset over LLC header (llc_rcv & llc_fixup_skb), code doesn't find a match and packet is dropped.
Between napi_complete_done and snap_rcv, transport_header is not used until __netif_receive_skb_core, where originally it was being reset. Commit fda55eca5a33 ("net: introduce skb_transport_header_was_set()") only does so if not set, on the assumption the value was set correctly by GRO (and also on assumption that "network stacks usually reset the transport header anyway"). Afterwards it is moved forward by llc_fixup_skb.
Locally generated traffic shows up at __netif_receive_skb_core with no transport_header set and is processed without issue. On a setup with GRO but no DSA, transport_header and network_header are both set to point to skb->data which is also correct.
As issue is LLC specific, to avoid impacting non-LLC traffic, and to follow up on original assumption made on previous code change, llc_fixup_skb to reset the offset after skb pull. llc_fixup_skb assumes the LLC header is at skb->data, and by definition SNAP header immediately follows.
Fixes: fda55eca5a33 ("net: introduce skb_transport_header_was_set()") Signed-off-by: Antonio Pastor antonio.pastor@gmail.com Reviewed-by: Eric Dumazet edumazet@google.com Link: https://patch.msgid.link/20241225010723.2830290-1-antonio.pastor@gmail.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- net/llc/llc_input.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/net/llc/llc_input.c b/net/llc/llc_input.c index 51bccfb00a9c..61b0159b2fbe 100644 --- a/net/llc/llc_input.c +++ b/net/llc/llc_input.c @@ -124,8 +124,8 @@ static inline int llc_fixup_skb(struct sk_buff *skb) if (unlikely(!pskb_may_pull(skb, llc_len))) return 0;
- skb->transport_header += llc_len; skb_pull(skb, llc_len); + skb_reset_transport_header(skb); if (skb->protocol == htons(ETH_P_802_2)) { __be16 pdulen; s32 data_size;
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Tanya Agarwal tanyaagarwal25699@gmail.com
[ Upstream commit b06a6187ef983f501e93faa56209169752d3bde3 ]
Initialize meter_urb array before use in mixer_us16x08.c.
CID 1410197: (#1 of 1): Uninitialized scalar variable (UNINIT) uninit_use_in_call: Using uninitialized value *meter_urb when calling get_meter_levels_from_urb.
Coverity Link: https://scan7.scan.coverity.com/#/project-view/52849/11354?selectedIssue=141...
Fixes: d2bb390a2081 ("ALSA: usb-audio: Tascam US-16x08 DSP mixer quirk") Signed-off-by: Tanya Agarwal tanyaagarwal25699@gmail.com Link: https://patch.msgid.link/20241229060240.1642-1-tanyaagarwal25699@gmail.com Signed-off-by: Takashi Iwai tiwai@suse.de Signed-off-by: Sasha Levin sashal@kernel.org --- sound/usb/mixer_us16x08.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/sound/usb/mixer_us16x08.c b/sound/usb/mixer_us16x08.c index 6eb7d93b358d..20ac32635f1f 100644 --- a/sound/usb/mixer_us16x08.c +++ b/sound/usb/mixer_us16x08.c @@ -687,7 +687,7 @@ static int snd_us16x08_meter_get(struct snd_kcontrol *kcontrol, struct usb_mixer_elem_info *elem = kcontrol->private_data; struct snd_usb_audio *chip = elem->head.mixer->chip; struct snd_us16x08_meter_store *store = elem->private_data; - u8 meter_urb[64]; + u8 meter_urb[64] = {0};
switch (kcontrol->private_value) { case 0: {
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Vitalii Mordan mordan@ispras.ru
[ Upstream commit b255ef45fcc2141c1bf98456796abb956d843a27 ]
Check the return value of clk_prepare_enable to ensure that priv->clk has been successfully enabled.
If priv->clk was not enabled during bcm_sysport_probe, bcm_sysport_resume, or bcm_sysport_open, it must not be disabled in any subsequent execution paths.
Fixes: 31bc72d97656 ("net: systemport: fetch and use clock resources") Signed-off-by: Vitalii Mordan mordan@ispras.ru Reviewed-by: Florian Fainelli florian.fainelli@broadcom.com Link: https://patch.msgid.link/20241227123007.2333397-1-mordan@ispras.ru Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/broadcom/bcmsysport.c | 21 ++++++++++++++++++--- 1 file changed, 18 insertions(+), 3 deletions(-)
diff --git a/drivers/net/ethernet/broadcom/bcmsysport.c b/drivers/net/ethernet/broadcom/bcmsysport.c index 49e890a7e04a..23cc2d85994e 100644 --- a/drivers/net/ethernet/broadcom/bcmsysport.c +++ b/drivers/net/ethernet/broadcom/bcmsysport.c @@ -1967,7 +1967,11 @@ static int bcm_sysport_open(struct net_device *dev) unsigned int i; int ret;
- clk_prepare_enable(priv->clk); + ret = clk_prepare_enable(priv->clk); + if (ret) { + netdev_err(dev, "could not enable priv clock\n"); + return ret; + }
/* Reset UniMAC */ umac_reset(priv); @@ -2625,7 +2629,11 @@ static int bcm_sysport_probe(struct platform_device *pdev) goto err_deregister_notifier; }
- clk_prepare_enable(priv->clk); + ret = clk_prepare_enable(priv->clk); + if (ret) { + dev_err(&pdev->dev, "could not enable priv clock\n"); + goto err_deregister_netdev; + }
priv->rev = topctrl_readl(priv, REV_CNTL) & REV_MASK; dev_info(&pdev->dev, @@ -2639,6 +2647,8 @@ static int bcm_sysport_probe(struct platform_device *pdev)
return 0;
+err_deregister_netdev: + unregister_netdev(dev); err_deregister_notifier: unregister_netdevice_notifier(&priv->netdev_notifier); err_deregister_fixed_link: @@ -2810,7 +2820,12 @@ static int __maybe_unused bcm_sysport_resume(struct device *d) if (!netif_running(dev)) return 0;
- clk_prepare_enable(priv->clk); + ret = clk_prepare_enable(priv->clk); + if (ret) { + netdev_err(dev, "could not enable priv clock\n"); + return ret; + } + if (priv->wolopts) clk_disable_unprepare(priv->wol_clk);
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Joe Hattori joe@pf.is.s.u-tokyo.ac.jp
[ Upstream commit ad5c318086e2e23b577eca33559c5ebf89bc7eb9 ]
Current implementation of mv643xx_eth_shared_of_add_port() calls of_parse_phandle(), but does not release the refcount on error. Call of_node_put() in the error path and in mv643xx_eth_shared_of_remove().
This bug was found by an experimental verification tool that I am developing.
Fixes: 76723bca2802 ("net: mv643xx_eth: add DT parsing support") Signed-off-by: Joe Hattori joe@pf.is.s.u-tokyo.ac.jp Link: https://patch.msgid.link/20241221081448.3313163-1-joe@pf.is.s.u-tokyo.ac.jp Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/marvell/mv643xx_eth.c | 14 ++++++++++++-- 1 file changed, 12 insertions(+), 2 deletions(-)
diff --git a/drivers/net/ethernet/marvell/mv643xx_eth.c b/drivers/net/ethernet/marvell/mv643xx_eth.c index 3b129a1c3381..07e5051171a4 100644 --- a/drivers/net/ethernet/marvell/mv643xx_eth.c +++ b/drivers/net/ethernet/marvell/mv643xx_eth.c @@ -2708,9 +2708,15 @@ static struct platform_device *port_platdev[3];
static void mv643xx_eth_shared_of_remove(void) { + struct mv643xx_eth_platform_data *pd; int n;
for (n = 0; n < 3; n++) { + if (!port_platdev[n]) + continue; + pd = dev_get_platdata(&port_platdev[n]->dev); + if (pd) + of_node_put(pd->phy_node); platform_device_del(port_platdev[n]); port_platdev[n] = NULL; } @@ -2773,8 +2779,10 @@ static int mv643xx_eth_shared_of_add_port(struct platform_device *pdev, }
ppdev = platform_device_alloc(MV643XX_ETH_NAME, dev_num); - if (!ppdev) - return -ENOMEM; + if (!ppdev) { + ret = -ENOMEM; + goto put_err; + } ppdev->dev.coherent_dma_mask = DMA_BIT_MASK(32); ppdev->dev.of_node = pnp;
@@ -2796,6 +2804,8 @@ static int mv643xx_eth_shared_of_add_port(struct platform_device *pdev,
port_err: platform_device_put(ppdev); +put_err: + of_node_put(ppd.phy_node); return ret; }
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jinjian Song jinjian.song@fibocom.com
[ Upstream commit 4f619d518db9cd1a933c3a095a5f95d0c1584ae8 ]
When driver processes the internal state change command, it use an asynchronous thread to process the command operation. If the main thread detects that the task has timed out, the asynchronous thread will panic when executing the completion notification because the main thread completion object has been released.
BUG: unable to handle page fault for address: fffffffffffffff8 PGD 1f283a067 P4D 1f283a067 PUD 1f283c067 PMD 0 Oops: 0000 [#1] PREEMPT SMP NOPTI RIP: 0010:complete_all+0x3e/0xa0 [...] Call Trace: <TASK> ? __die_body+0x68/0xb0 ? page_fault_oops+0x379/0x3e0 ? exc_page_fault+0x69/0xa0 ? asm_exc_page_fault+0x22/0x30 ? complete_all+0x3e/0xa0 fsm_main_thread+0xa3/0x9c0 [mtk_t7xx (HASH:1400 5)] ? __pfx_autoremove_wake_function+0x10/0x10 kthread+0xd8/0x110 ? __pfx_fsm_main_thread+0x10/0x10 [mtk_t7xx (HASH:1400 5)] ? __pfx_kthread+0x10/0x10 ret_from_fork+0x38/0x50 ? __pfx_kthread+0x10/0x10 ret_from_fork_asm+0x1b/0x30 </TASK> [...] CR2: fffffffffffffff8 ---[ end trace 0000000000000000 ]---
Use the reference counter to ensure safe release as Sergey suggests: https://lore.kernel.org/all/da90f64c-260a-4329-87bf-1f9ff20a5951@gmail.com/
Fixes: 13e920d93e37 ("net: wwan: t7xx: Add core components") Signed-off-by: Jinjian Song jinjian.song@fibocom.com Acked-by: Sergey Ryazanov ryazanov.s.a@gmail.com Link: https://patch.msgid.link/20241224041552.8711-1-jinjian.song@fibocom.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/wwan/t7xx/t7xx_state_monitor.c | 26 ++++++++++++++-------- drivers/net/wwan/t7xx/t7xx_state_monitor.h | 5 +++-- 2 files changed, 20 insertions(+), 11 deletions(-)
diff --git a/drivers/net/wwan/t7xx/t7xx_state_monitor.c b/drivers/net/wwan/t7xx/t7xx_state_monitor.c index 80edb8e75a6a..64868df3640d 100644 --- a/drivers/net/wwan/t7xx/t7xx_state_monitor.c +++ b/drivers/net/wwan/t7xx/t7xx_state_monitor.c @@ -97,14 +97,21 @@ void t7xx_fsm_broadcast_state(struct t7xx_fsm_ctl *ctl, enum md_state state) fsm_state_notify(ctl->md, state); }
+static void fsm_release_command(struct kref *ref) +{ + struct t7xx_fsm_command *cmd = container_of(ref, typeof(*cmd), refcnt); + + kfree(cmd); +} + static void fsm_finish_command(struct t7xx_fsm_ctl *ctl, struct t7xx_fsm_command *cmd, int result) { if (cmd->flag & FSM_CMD_FLAG_WAIT_FOR_COMPLETION) { - *cmd->ret = result; - complete_all(cmd->done); + cmd->result = result; + complete_all(&cmd->done); }
- kfree(cmd); + kref_put(&cmd->refcnt, fsm_release_command); }
static void fsm_del_kf_event(struct t7xx_fsm_event *event) @@ -396,7 +403,6 @@ static int fsm_main_thread(void *data)
int t7xx_fsm_append_cmd(struct t7xx_fsm_ctl *ctl, enum t7xx_fsm_cmd_state cmd_id, unsigned int flag) { - DECLARE_COMPLETION_ONSTACK(done); struct t7xx_fsm_command *cmd; unsigned long flags; int ret; @@ -408,11 +414,13 @@ int t7xx_fsm_append_cmd(struct t7xx_fsm_ctl *ctl, enum t7xx_fsm_cmd_state cmd_id INIT_LIST_HEAD(&cmd->entry); cmd->cmd_id = cmd_id; cmd->flag = flag; + kref_init(&cmd->refcnt); if (flag & FSM_CMD_FLAG_WAIT_FOR_COMPLETION) { - cmd->done = &done; - cmd->ret = &ret; + init_completion(&cmd->done); + kref_get(&cmd->refcnt); }
+ kref_get(&cmd->refcnt); spin_lock_irqsave(&ctl->command_lock, flags); list_add_tail(&cmd->entry, &ctl->command_queue); spin_unlock_irqrestore(&ctl->command_lock, flags); @@ -422,11 +430,11 @@ int t7xx_fsm_append_cmd(struct t7xx_fsm_ctl *ctl, enum t7xx_fsm_cmd_state cmd_id if (flag & FSM_CMD_FLAG_WAIT_FOR_COMPLETION) { unsigned long wait_ret;
- wait_ret = wait_for_completion_timeout(&done, + wait_ret = wait_for_completion_timeout(&cmd->done, msecs_to_jiffies(FSM_CMD_TIMEOUT_MS)); - if (!wait_ret) - return -ETIMEDOUT;
+ ret = wait_ret ? cmd->result : -ETIMEDOUT; + kref_put(&cmd->refcnt, fsm_release_command); return ret; }
diff --git a/drivers/net/wwan/t7xx/t7xx_state_monitor.h b/drivers/net/wwan/t7xx/t7xx_state_monitor.h index b6e76f3903c8..74f96fd2605e 100644 --- a/drivers/net/wwan/t7xx/t7xx_state_monitor.h +++ b/drivers/net/wwan/t7xx/t7xx_state_monitor.h @@ -109,8 +109,9 @@ struct t7xx_fsm_command { struct list_head entry; enum t7xx_fsm_cmd_state cmd_id; unsigned int flag; - struct completion *done; - int *ret; + struct completion done; + int result; + struct kref refcnt; };
struct t7xx_fsm_notifier {
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Li Zhijian lizhijian@fujitsu.com
[ Upstream commit fb514b31395946022f13a08e06a435f53cf9e8b3 ]
Move the declaration of the 'ib_sge list' variable outside the 'always_invalidate' block to ensure it remains accessible for use throughout the function.
Previously, 'ib_sge list' was declared within the 'always_invalidate' block, limiting its accessibility, then caused a 'BUG: kernel NULL pointer dereference'[1]. ? __die_body.cold+0x19/0x27 ? page_fault_oops+0x15a/0x2d0 ? search_module_extables+0x19/0x60 ? search_bpf_extables+0x5f/0x80 ? exc_page_fault+0x7e/0x180 ? asm_exc_page_fault+0x26/0x30 ? memcpy_orig+0xd5/0x140 rxe_mr_copy+0x1c3/0x200 [rdma_rxe] ? rxe_pool_get_index+0x4b/0x80 [rdma_rxe] copy_data+0xa5/0x230 [rdma_rxe] rxe_requester+0xd9b/0xf70 [rdma_rxe] ? finish_task_switch.isra.0+0x99/0x2e0 rxe_sender+0x13/0x40 [rdma_rxe] do_task+0x68/0x1e0 [rdma_rxe] process_one_work+0x177/0x330 worker_thread+0x252/0x390 ? __pfx_worker_thread+0x10/0x10
This change ensures the variable is available for subsequent operations that require it.
[1] https://lore.kernel.org/linux-rdma/6a1f3e8f-deb0-49f9-bc69-a9b03ecfcda7@fuji...
Fixes: 9cb837480424 ("RDMA/rtrs: server: main functionality") Signed-off-by: Li Zhijian lizhijian@fujitsu.com Link: https://patch.msgid.link/20241231013416.1290920-1-lizhijian@fujitsu.com Signed-off-by: Leon Romanovsky leon@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/infiniband/ulp/rtrs/rtrs-srv.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/infiniband/ulp/rtrs/rtrs-srv.c b/drivers/infiniband/ulp/rtrs/rtrs-srv.c index 758a3d9c2844..84d1654148d7 100644 --- a/drivers/infiniband/ulp/rtrs/rtrs-srv.c +++ b/drivers/infiniband/ulp/rtrs/rtrs-srv.c @@ -346,6 +346,7 @@ static int send_io_resp_imm(struct rtrs_srv_con *con, struct rtrs_srv_op *id, struct rtrs_srv_mr *srv_mr; bool need_inval = false; enum ib_send_flags flags; + struct ib_sge list; u32 imm; int err;
@@ -398,7 +399,6 @@ static int send_io_resp_imm(struct rtrs_srv_con *con, struct rtrs_srv_op *id, imm = rtrs_to_io_rsp_imm(id->msg_id, errno, need_inval); imm_wr.wr.next = NULL; if (always_invalidate) { - struct ib_sge list; struct rtrs_msg_rkey_rsp *msg;
srv_mr = &srv_path->mrs[id->msg_id];
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Liang Jie liangjie@lixiang.com
[ Upstream commit a8620de72e5676993ec3a3b975f7c10908f5f60f ]
In efx_tc_ct_zone_ht_params, the key_len was previously set to offsetof(struct efx_tc_ct_zone, linkage). This calculation is incorrect because it includes any padding between the zone field and the linkage field due to structure alignment, which can vary between systems.
This patch updates key_len to use sizeof_field(struct efx_tc_ct_zone, zone) , ensuring that the hash table correctly uses the zone as the key. This fix prevents potential hash lookup errors and improves connection tracking reliability.
Fixes: c3bb5c6acd4e ("sfc: functions to register for conntrack zone offload") Signed-off-by: Liang Jie liangjie@lixiang.com Acked-by: Edward Cree ecree.xilinx@gmail.com Link: https://patch.msgid.link/20241230093709.3226854-1-buaajxlj@163.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/sfc/tc_conntrack.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/sfc/tc_conntrack.c b/drivers/net/ethernet/sfc/tc_conntrack.c index 44bb57670340..109d2aa34ae3 100644 --- a/drivers/net/ethernet/sfc/tc_conntrack.c +++ b/drivers/net/ethernet/sfc/tc_conntrack.c @@ -16,7 +16,7 @@ static int efx_tc_flow_block(enum tc_setup_type type, void *type_data, void *cb_priv);
static const struct rhashtable_params efx_tc_ct_zone_ht_params = { - .key_len = offsetof(struct efx_tc_ct_zone, linkage), + .key_len = sizeof_field(struct efx_tc_ct_zone, zone), .key_offset = 0, .head_offset = offsetof(struct efx_tc_ct_zone, linkage), };
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Willem de Bruijn willemb@google.com
[ Upstream commit 68e068cabd2c6c533ef934c2e5151609cf6ecc6d ]
The blamed commit disabled hardware offoad of IPv6 packets with extension headers on devices that advertise NETIF_F_IPV6_CSUM, based on the definition of that feature in skbuff.h:
* * - %NETIF_F_IPV6_CSUM * - Driver (device) is only able to checksum plain * TCP or UDP packets over IPv6. These are specifically * unencapsulated packets of the form IPv6|TCP or * IPv6|UDP where the Next Header field in the IPv6 * header is either TCP or UDP. IPv6 extension headers * are not supported with this feature. This feature * cannot be set in features for a device with * NETIF_F_HW_CSUM also set. This feature is being * DEPRECATED (see below).
The change causes skb_warn_bad_offload to fire for BIG TCP packets.
[ 496.310233] WARNING: CPU: 13 PID: 23472 at net/core/dev.c:3129 skb_warn_bad_offload+0xc4/0xe0
[ 496.310297] ? skb_warn_bad_offload+0xc4/0xe0 [ 496.310300] skb_checksum_help+0x129/0x1f0 [ 496.310303] skb_csum_hwoffload_help+0x150/0x1b0 [ 496.310306] validate_xmit_skb+0x159/0x270 [ 496.310309] validate_xmit_skb_list+0x41/0x70 [ 496.310312] sch_direct_xmit+0x5c/0x250 [ 496.310317] __qdisc_run+0x388/0x620
BIG TCP introduced an IPV6_TLV_JUMBO IPv6 extension header to communicate packet length, as this is an IPv6 jumbogram. But, the feature is only enabled on devices that support BIG TCP TSO. The header is only present for PF_PACKET taps like tcpdump, and not transmitted by physical devices.
For this specific case of extension headers that are not transmitted, return to the situation before the blamed commit and support hardware offload.
ipv6_has_hopopt_jumbo() tests not only whether this header is present, but also that it is the only extension header before a terminal (L4) header.
Fixes: 04c20a9356f2 ("net: skip offload for NETIF_F_IPV6_CSUM if ipv6 header contains extension") Reported-by: syzbot syzkaller@googlegroups.com Reported-by: Eric Dumazet edumazet@google.com Closes: https://lore.kernel.org/netdev/CANn89iK1hdC3Nt8KPhOtTF8vCPc1AHDCtse_BTNki1pW... Signed-off-by: Willem de Bruijn willemb@google.com Reviewed-by: Eric Dumazet edumazet@google.com Link: https://patch.msgid.link/20250101164909.1331680-1-willemdebruijn.kernel@gmai... Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- net/core/dev.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/net/core/dev.c b/net/core/dev.c index 4beb9acf2c18..69da7b009f8b 100644 --- a/net/core/dev.c +++ b/net/core/dev.c @@ -3628,8 +3628,10 @@ int skb_csum_hwoffload_help(struct sk_buff *skb,
if (features & (NETIF_F_IP_CSUM | NETIF_F_IPV6_CSUM)) { if (vlan_get_protocol(skb) == htons(ETH_P_IPV6) && - skb_network_header_len(skb) != sizeof(struct ipv6hdr)) + skb_network_header_len(skb) != sizeof(struct ipv6hdr) && + !ipv6_has_hopopt_jumbo(skb)) goto sw_checksum; + switch (skb->csum_offset) { case offsetof(struct tcphdr, check): case offsetof(struct udphdr, check):
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Eric Dumazet edumazet@google.com
[ Upstream commit 5b0af621c3f6ef9261cf6067812f2fd9943acb4b ]
After blamed commit, crypto sockets could accidentally be destroyed from RCU call back, as spotted by zyzbot [1].
Trying to acquire a mutex in RCU callback is not allowed.
Restrict SO_REUSEPORT socket option to inet sockets.
v1 of this patch supported TCP, UDP and SCTP sockets, but fcnal-test.sh test needed RAW and ICMP support.
[1] BUG: sleeping function called from invalid context at kernel/locking/mutex.c:562 in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 24, name: ksoftirqd/1 preempt_count: 100, expected: 0 RCU nest depth: 0, expected: 0 1 lock held by ksoftirqd/1/24: #0: ffffffff8e937ba0 (rcu_callback){....}-{0:0}, at: rcu_lock_acquire include/linux/rcupdate.h:337 [inline] #0: ffffffff8e937ba0 (rcu_callback){....}-{0:0}, at: rcu_do_batch kernel/rcu/tree.c:2561 [inline] #0: ffffffff8e937ba0 (rcu_callback){....}-{0:0}, at: rcu_core+0xa37/0x17a0 kernel/rcu/tree.c:2823 Preemption disabled at: [<ffffffff8161c8c8>] softirq_handle_begin kernel/softirq.c:402 [inline] [<ffffffff8161c8c8>] handle_softirqs+0x128/0x9b0 kernel/softirq.c:537 CPU: 1 UID: 0 PID: 24 Comm: ksoftirqd/1 Not tainted 6.13.0-rc3-syzkaller-00174-ga024e377efed #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 09/13/2024 Call Trace: <TASK> __dump_stack lib/dump_stack.c:94 [inline] dump_stack_lvl+0x241/0x360 lib/dump_stack.c:120 __might_resched+0x5d4/0x780 kernel/sched/core.c:8758 __mutex_lock_common kernel/locking/mutex.c:562 [inline] __mutex_lock+0x131/0xee0 kernel/locking/mutex.c:735 crypto_put_default_null_skcipher+0x18/0x70 crypto/crypto_null.c:179 aead_release+0x3d/0x50 crypto/algif_aead.c:489 alg_do_release crypto/af_alg.c:118 [inline] alg_sock_destruct+0x86/0xc0 crypto/af_alg.c:502 __sk_destruct+0x58/0x5f0 net/core/sock.c:2260 rcu_do_batch kernel/rcu/tree.c:2567 [inline] rcu_core+0xaaa/0x17a0 kernel/rcu/tree.c:2823 handle_softirqs+0x2d4/0x9b0 kernel/softirq.c:561 run_ksoftirqd+0xca/0x130 kernel/softirq.c:950 smpboot_thread_fn+0x544/0xa30 kernel/smpboot.c:164 kthread+0x2f0/0x390 kernel/kthread.c:389 ret_from_fork+0x4b/0x80 arch/x86/kernel/process.c:147 ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244 </TASK>
Fixes: 8c7138b33e5c ("net: Unpublish sk from sk_reuseport_cb before call_rcu") Reported-by: syzbot+b3e02953598f447d4d2a@syzkaller.appspotmail.com Closes: https://lore.kernel.org/netdev/6772f2f4.050a0220.2f3838.04cb.GAE@google.com/... Signed-off-by: Eric Dumazet edumazet@google.com Cc: Martin KaFai Lau kafai@fb.com Reviewed-by: Kuniyuki Iwashima kuniyu@amazon.com Link: https://patch.msgid.link/20241231160527.3994168-1-edumazet@google.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- net/core/sock.c | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/net/core/sock.c b/net/core/sock.c index bc2a4e38dcea..84ba3f67bca9 100644 --- a/net/core/sock.c +++ b/net/core/sock.c @@ -1133,7 +1133,10 @@ int sk_setsockopt(struct sock *sk, int level, int optname, sk->sk_reuse = (valbool ? SK_CAN_REUSE : SK_NO_REUSE); break; case SO_REUSEPORT: - sk->sk_reuseport = valbool; + if (valbool && !sk_is_inet(sk)) + ret = -EOPNOTSUPP; + else + sk->sk_reuseport = valbool; break; case SO_TYPE: case SO_PROTOCOL:
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Maciej S. Szmigiero mail@maciej.szmigiero.name
[ Upstream commit a7af435df0e04cfb4a4004136d597c42639a2ae7 ]
ipc_mmio_init() used the post-decrement operator in its loop continuing condition of "retries" counter being "> 0", which meant that when this condition caused loop exit "retries" counter reached -1.
But the later valid exec stage failure check only tests for "retries" counter being exactly zero, so it didn't trigger in this case (but would wrongly trigger if the code reaches a valid exec stage in the very last loop iteration).
Fix this by using the pre-decrement operator instead, so the loop counter is exactly zero on valid exec stage failure.
Fixes: dc0514f5d828 ("net: iosm: mmio scratchpad") Signed-off-by: Maciej S. Szmigiero mail@maciej.szmigiero.name Link: https://patch.msgid.link/8b19125a825f9dcdd81c667c1e5c48ba28d505a6.1735490770... Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/wwan/iosm/iosm_ipc_mmio.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/net/wwan/iosm/iosm_ipc_mmio.c b/drivers/net/wwan/iosm/iosm_ipc_mmio.c index 63eb08c43c05..6764c13530b9 100644 --- a/drivers/net/wwan/iosm/iosm_ipc_mmio.c +++ b/drivers/net/wwan/iosm/iosm_ipc_mmio.c @@ -104,7 +104,7 @@ struct iosm_mmio *ipc_mmio_init(void __iomem *mmio, struct device *dev) break;
msleep(20); - } while (retries-- > 0); + } while (--retries > 0);
if (!retries) { dev_err(ipc_mmio->dev, "invalid exec stage %X", stage);
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Eric Dumazet edumazet@google.com
[ Upstream commit 77ee7a6d16b6ec07b5c3ae2b6b60a24c1afbed09 ]
Blamed commit forgot MSG_PEEK case, allowing a crash [1] as found by syzbot.
Rework vlan_get_tci() to not touch skb at all, so that it can be used from many cpus on the same skb.
Add a const qualifier to skb argument.
[1] skbuff: skb_under_panic: text:ffffffff8a8da482 len:32 put:14 head:ffff88807a1d5800 data:ffff88807a1d5810 tail:0x14 end:0x140 dev:<NULL> ------------[ cut here ]------------ kernel BUG at net/core/skbuff.c:206 ! Oops: invalid opcode: 0000 [#1] PREEMPT SMP KASAN PTI CPU: 0 UID: 0 PID: 5880 Comm: syz-executor172 Not tainted 6.13.0-rc3-syzkaller-00762-g9268abe611b0 #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 09/13/2024 RIP: 0010:skb_panic net/core/skbuff.c:206 [inline] RIP: 0010:skb_under_panic+0x14b/0x150 net/core/skbuff.c:216 Code: 0b 8d 48 c7 c6 9e 6c 26 8e 48 8b 54 24 08 8b 0c 24 44 8b 44 24 04 4d 89 e9 50 41 54 41 57 41 56 e8 3a 5a 79 f7 48 83 c4 20 90 <0f> 0b 0f 1f 00 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 f3 RSP: 0018:ffffc90003baf5b8 EFLAGS: 00010286 RAX: 0000000000000087 RBX: dffffc0000000000 RCX: 8565c1eec37aa000 RDX: 0000000000000000 RSI: 0000000080000000 RDI: 0000000000000000 RBP: ffff88802616fb50 R08: ffffffff817f0a4c R09: 1ffff92000775e50 R10: dffffc0000000000 R11: fffff52000775e51 R12: 0000000000000140 R13: ffff88807a1d5800 R14: ffff88807a1d5810 R15: 0000000000000014 FS: 00007fa03261f6c0(0000) GS:ffff8880b8600000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007ffd65753000 CR3: 0000000031720000 CR4: 00000000003526f0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: <TASK> skb_push+0xe5/0x100 net/core/skbuff.c:2636 vlan_get_tci+0x272/0x550 net/packet/af_packet.c:565 packet_recvmsg+0x13c9/0x1ef0 net/packet/af_packet.c:3616 sock_recvmsg_nosec net/socket.c:1044 [inline] sock_recvmsg+0x22f/0x280 net/socket.c:1066 ____sys_recvmsg+0x1c6/0x480 net/socket.c:2814 ___sys_recvmsg net/socket.c:2856 [inline] do_recvmmsg+0x426/0xab0 net/socket.c:2951 __sys_recvmmsg net/socket.c:3025 [inline] __do_sys_recvmmsg net/socket.c:3048 [inline] __se_sys_recvmmsg net/socket.c:3041 [inline] __x64_sys_recvmmsg+0x199/0x250 net/socket.c:3041 do_syscall_x64 arch/x86/entry/common.c:52 [inline] do_syscall_64+0xf3/0x230 arch/x86/entry/common.c:83
Fixes: 79eecf631c14 ("af_packet: Handle outgoing VLAN packets without hardware offloading") Reported-by: syzbot+8400677f3fd43f37d3bc@syzkaller.appspotmail.com Closes: https://lore.kernel.org/netdev/6772c485.050a0220.2f3838.04c6.GAE@google.com/... Signed-off-by: Eric Dumazet edumazet@google.com Cc: Chengen Du chengen.du@canonical.com Reviewed-by: Willem de Bruijn willemb@google.com Link: https://patch.msgid.link/20241230161004.2681892-1-edumazet@google.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- net/packet/af_packet.c | 12 +++--------- 1 file changed, 3 insertions(+), 9 deletions(-)
diff --git a/net/packet/af_packet.c b/net/packet/af_packet.c index 56e3ae3b6be9..96eca4a290ad 100644 --- a/net/packet/af_packet.c +++ b/net/packet/af_packet.c @@ -538,10 +538,8 @@ static void *packet_current_frame(struct packet_sock *po, return packet_lookup_frame(po, rb, rb->head, status); }
-static u16 vlan_get_tci(struct sk_buff *skb, struct net_device *dev) +static u16 vlan_get_tci(const struct sk_buff *skb, struct net_device *dev) { - u8 *skb_orig_data = skb->data; - int skb_orig_len = skb->len; struct vlan_hdr vhdr, *vh; unsigned int header_len;
@@ -562,12 +560,8 @@ static u16 vlan_get_tci(struct sk_buff *skb, struct net_device *dev) else return 0;
- skb_push(skb, skb->data - skb_mac_header(skb)); - vh = skb_header_pointer(skb, header_len, sizeof(vhdr), &vhdr); - if (skb_orig_data != skb->data) { - skb->data = skb_orig_data; - skb->len = skb_orig_len; - } + vh = skb_header_pointer(skb, skb_mac_offset(skb) + header_len, + sizeof(vhdr), &vhdr); if (unlikely(!vh)) return 0;
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Eric Dumazet edumazet@google.com
[ Upstream commit f91a5b8089389eb408501af2762f168c3aaa7b79 ]
Blamed commit forgot MSG_PEEK case, allowing a crash [1] as found by syzbot.
Rework vlan_get_protocol_dgram() to not touch skb at all, so that it can be used from many cpus on the same skb.
Add a const qualifier to skb argument.
[1] skbuff: skb_under_panic: text:ffffffff8a8ccd05 len:29 put:14 head:ffff88807fc8e400 data:ffff88807fc8e3f4 tail:0x11 end:0x140 dev:<NULL> ------------[ cut here ]------------ kernel BUG at net/core/skbuff.c:206 ! Oops: invalid opcode: 0000 [#1] PREEMPT SMP KASAN PTI CPU: 1 UID: 0 PID: 5892 Comm: syz-executor883 Not tainted 6.13.0-rc4-syzkaller-00054-gd6ef8b40d075 #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 09/13/2024 RIP: 0010:skb_panic net/core/skbuff.c:206 [inline] RIP: 0010:skb_under_panic+0x14b/0x150 net/core/skbuff.c:216 Code: 0b 8d 48 c7 c6 86 d5 25 8e 48 8b 54 24 08 8b 0c 24 44 8b 44 24 04 4d 89 e9 50 41 54 41 57 41 56 e8 5a 69 79 f7 48 83 c4 20 90 <0f> 0b 0f 1f 00 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 f3 RSP: 0018:ffffc900038d7638 EFLAGS: 00010282 RAX: 0000000000000087 RBX: dffffc0000000000 RCX: 609ffd18ea660600 RDX: 0000000000000000 RSI: 0000000080000000 RDI: 0000000000000000 RBP: ffff88802483c8d0 R08: ffffffff817f0a8c R09: 1ffff9200071ae60 R10: dffffc0000000000 R11: fffff5200071ae61 R12: 0000000000000140 R13: ffff88807fc8e400 R14: ffff88807fc8e3f4 R15: 0000000000000011 FS: 00007fbac5e006c0(0000) GS:ffff8880b8700000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007fbac5e00d58 CR3: 000000001238e000 CR4: 00000000003526f0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: <TASK> skb_push+0xe5/0x100 net/core/skbuff.c:2636 vlan_get_protocol_dgram+0x165/0x290 net/packet/af_packet.c:585 packet_recvmsg+0x948/0x1ef0 net/packet/af_packet.c:3552 sock_recvmsg_nosec net/socket.c:1033 [inline] sock_recvmsg+0x22f/0x280 net/socket.c:1055 ____sys_recvmsg+0x1c6/0x480 net/socket.c:2803 ___sys_recvmsg net/socket.c:2845 [inline] do_recvmmsg+0x426/0xab0 net/socket.c:2940 __sys_recvmmsg net/socket.c:3014 [inline] __do_sys_recvmmsg net/socket.c:3037 [inline] __se_sys_recvmmsg net/socket.c:3030 [inline] __x64_sys_recvmmsg+0x199/0x250 net/socket.c:3030 do_syscall_x64 arch/x86/entry/common.c:52 [inline] do_syscall_64+0xf3/0x230 arch/x86/entry/common.c:83 entry_SYSCALL_64_after_hwframe+0x77/0x7f
Fixes: 79eecf631c14 ("af_packet: Handle outgoing VLAN packets without hardware offloading") Reported-by: syzbot+74f70bb1cb968bf09e4f@syzkaller.appspotmail.com Closes: https://lore.kernel.org/netdev/6772c485.050a0220.2f3838.04c5.GAE@google.com/... Signed-off-by: Eric Dumazet edumazet@google.com Cc: Chengen Du chengen.du@canonical.com Reviewed-by: Willem de Bruijn willemb@google.com Link: https://patch.msgid.link/20241230161004.2681892-2-edumazet@google.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- include/linux/if_vlan.h | 16 +++++++++++++--- net/packet/af_packet.c | 16 ++++------------ 2 files changed, 17 insertions(+), 15 deletions(-)
diff --git a/include/linux/if_vlan.h b/include/linux/if_vlan.h index 3028af87716e..430749a0f362 100644 --- a/include/linux/if_vlan.h +++ b/include/linux/if_vlan.h @@ -585,13 +585,16 @@ static inline int vlan_get_tag(const struct sk_buff *skb, u16 *vlan_tci) * vlan_get_protocol - get protocol EtherType. * @skb: skbuff to query * @type: first vlan protocol + * @mac_offset: MAC offset * @depth: buffer to store length of eth and vlan tags in bytes * * Returns the EtherType of the packet, regardless of whether it is * vlan encapsulated (normal or hardware accelerated) or not. */ -static inline __be16 __vlan_get_protocol(const struct sk_buff *skb, __be16 type, - int *depth) +static inline __be16 __vlan_get_protocol_offset(const struct sk_buff *skb, + __be16 type, + int mac_offset, + int *depth) { unsigned int vlan_depth = skb->mac_len, parse_depth = VLAN_MAX_DEPTH;
@@ -610,7 +613,8 @@ static inline __be16 __vlan_get_protocol(const struct sk_buff *skb, __be16 type, do { struct vlan_hdr vhdr, *vh;
- vh = skb_header_pointer(skb, vlan_depth, sizeof(vhdr), &vhdr); + vh = skb_header_pointer(skb, mac_offset + vlan_depth, + sizeof(vhdr), &vhdr); if (unlikely(!vh || !--parse_depth)) return 0;
@@ -625,6 +629,12 @@ static inline __be16 __vlan_get_protocol(const struct sk_buff *skb, __be16 type, return type; }
+static inline __be16 __vlan_get_protocol(const struct sk_buff *skb, __be16 type, + int *depth) +{ + return __vlan_get_protocol_offset(skb, type, 0, depth); +} + /** * vlan_get_protocol - get protocol EtherType. * @skb: skbuff to query diff --git a/net/packet/af_packet.c b/net/packet/af_packet.c index 96eca4a290ad..4abf7e9ac4f2 100644 --- a/net/packet/af_packet.c +++ b/net/packet/af_packet.c @@ -568,21 +568,13 @@ static u16 vlan_get_tci(const struct sk_buff *skb, struct net_device *dev) return ntohs(vh->h_vlan_TCI); }
-static __be16 vlan_get_protocol_dgram(struct sk_buff *skb) +static __be16 vlan_get_protocol_dgram(const struct sk_buff *skb) { __be16 proto = skb->protocol;
- if (unlikely(eth_type_vlan(proto))) { - u8 *skb_orig_data = skb->data; - int skb_orig_len = skb->len; - - skb_push(skb, skb->data - skb_mac_header(skb)); - proto = __vlan_get_protocol(skb, proto, NULL); - if (skb_orig_data != skb->data) { - skb->data = skb_orig_data; - skb->len = skb_orig_len; - } - } + if (unlikely(eth_type_vlan(proto))) + proto = __vlan_get_protocol_offset(skb, proto, + skb_mac_offset(skb), NULL);
return proto; }
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Eric Dumazet edumazet@google.com
[ Upstream commit 260466b576bca0081a7d4acecc8e93687aa22d0e ]
syzbot found a race in ila_add_mapping() [1]
commit 031ae72825ce ("ila: call nf_unregister_net_hooks() sooner") attempted to fix a similar issue.
Looking at the syzbot repro, we have concurrent ILA_CMD_ADD commands.
Add a mutex to make sure at most one thread is calling nf_register_net_hooks().
[1] BUG: KASAN: slab-use-after-free in rht_key_hashfn include/linux/rhashtable.h:159 [inline] BUG: KASAN: slab-use-after-free in __rhashtable_lookup.constprop.0+0x426/0x550 include/linux/rhashtable.h:604 Read of size 4 at addr ffff888028f40008 by task dhcpcd/5501
CPU: 1 UID: 0 PID: 5501 Comm: dhcpcd Not tainted 6.13.0-rc4-syzkaller-00054-gd6ef8b40d075 #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 09/13/2024 Call Trace: <IRQ> __dump_stack lib/dump_stack.c:94 [inline] dump_stack_lvl+0x116/0x1f0 lib/dump_stack.c:120 print_address_description mm/kasan/report.c:378 [inline] print_report+0xc3/0x620 mm/kasan/report.c:489 kasan_report+0xd9/0x110 mm/kasan/report.c:602 rht_key_hashfn include/linux/rhashtable.h:159 [inline] __rhashtable_lookup.constprop.0+0x426/0x550 include/linux/rhashtable.h:604 rhashtable_lookup include/linux/rhashtable.h:646 [inline] rhashtable_lookup_fast include/linux/rhashtable.h:672 [inline] ila_lookup_wildcards net/ipv6/ila/ila_xlat.c:127 [inline] ila_xlat_addr net/ipv6/ila/ila_xlat.c:652 [inline] ila_nf_input+0x1ee/0x620 net/ipv6/ila/ila_xlat.c:185 nf_hook_entry_hookfn include/linux/netfilter.h:154 [inline] nf_hook_slow+0xbb/0x200 net/netfilter/core.c:626 nf_hook.constprop.0+0x42e/0x750 include/linux/netfilter.h:269 NF_HOOK include/linux/netfilter.h:312 [inline] ipv6_rcv+0xa4/0x680 net/ipv6/ip6_input.c:309 __netif_receive_skb_one_core+0x12e/0x1e0 net/core/dev.c:5672 __netif_receive_skb+0x1d/0x160 net/core/dev.c:5785 process_backlog+0x443/0x15f0 net/core/dev.c:6117 __napi_poll.constprop.0+0xb7/0x550 net/core/dev.c:6883 napi_poll net/core/dev.c:6952 [inline] net_rx_action+0xa94/0x1010 net/core/dev.c:7074 handle_softirqs+0x213/0x8f0 kernel/softirq.c:561 __do_softirq kernel/softirq.c:595 [inline] invoke_softirq kernel/softirq.c:435 [inline] __irq_exit_rcu+0x109/0x170 kernel/softirq.c:662 irq_exit_rcu+0x9/0x30 kernel/softirq.c:678 instr_sysvec_apic_timer_interrupt arch/x86/kernel/apic/apic.c:1049 [inline] sysvec_apic_timer_interrupt+0xa4/0xc0 arch/x86/kernel/apic/apic.c:1049
Fixes: 7f00feaf1076 ("ila: Add generic ILA translation facility") Reported-by: syzbot+47e761d22ecf745f72b9@syzkaller.appspotmail.com Closes: https://lore.kernel.org/netdev/6772c9ae.050a0220.2f3838.04c7.GAE@google.com/... Signed-off-by: Eric Dumazet edumazet@google.com Cc: Florian Westphal fw@strlen.de Cc: Tom Herbert tom@herbertland.com Link: https://patch.msgid.link/20241230162849.2795486-1-edumazet@google.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- net/ipv6/ila/ila_xlat.c | 16 +++++++++++----- 1 file changed, 11 insertions(+), 5 deletions(-)
diff --git a/net/ipv6/ila/ila_xlat.c b/net/ipv6/ila/ila_xlat.c index 534a4498e280..fff09f5a796a 100644 --- a/net/ipv6/ila/ila_xlat.c +++ b/net/ipv6/ila/ila_xlat.c @@ -200,6 +200,8 @@ static const struct nf_hook_ops ila_nf_hook_ops[] = { }, };
+static DEFINE_MUTEX(ila_mutex); + static int ila_add_mapping(struct net *net, struct ila_xlat_params *xp) { struct ila_net *ilan = net_generic(net, ila_net_id); @@ -207,16 +209,20 @@ static int ila_add_mapping(struct net *net, struct ila_xlat_params *xp) spinlock_t *lock = ila_get_lock(ilan, xp->ip.locator_match); int err = 0, order;
- if (!ilan->xlat.hooks_registered) { + if (!READ_ONCE(ilan->xlat.hooks_registered)) { /* We defer registering net hooks in the namespace until the * first mapping is added. */ - err = nf_register_net_hooks(net, ila_nf_hook_ops, - ARRAY_SIZE(ila_nf_hook_ops)); + mutex_lock(&ila_mutex); + if (!ilan->xlat.hooks_registered) { + err = nf_register_net_hooks(net, ila_nf_hook_ops, + ARRAY_SIZE(ila_nf_hook_ops)); + if (!err) + WRITE_ONCE(ilan->xlat.hooks_registered, true); + } + mutex_unlock(&ila_mutex); if (err) return err; - - ilan->xlat.hooks_registered = true; }
ila = kzalloc(sizeof(*ila), GFP_KERNEL);
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Meghana Malladi m-malladi@ti.com
[ Upstream commit 9b115361248dc6cce182a2dc030c1c70b0a9639e ]
When ICSSG interfaces are brought down and brought up again, the pru cores are shut down and booted again, flushing out all the memories and start again in a clean state. Hence it is expected that the IEP_CMP_CFG register needs to be flushed during iep_init() to ensure that the existing residual configuration doesn't cause any unusual behavior. If the register is not cleared, existing IEP_CMP_CFG set for CMP1 will result in SYNC0_OUT signal based on the SYNC_OUT register values.
After bringing the interface up, calling PPS enable doesn't work as the driver believes PPS is already enabled, (iep->pps_enabled is not cleared during interface bring down) and driver will just return true even though there is no signal. Fix this by disabling pps and perout.
Fixes: c1e0230eeaab ("net: ti: icss-iep: Add IEP driver") Signed-off-by: Meghana Malladi m-malladi@ti.com Reviewed-by: Roger Quadros rogerq@kernel.org Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/ti/icssg/icss_iep.c | 8 ++++++++ 1 file changed, 8 insertions(+)
diff --git a/drivers/net/ethernet/ti/icssg/icss_iep.c b/drivers/net/ethernet/ti/icssg/icss_iep.c index 3025e9c18970..f06cdec14ed7 100644 --- a/drivers/net/ethernet/ti/icssg/icss_iep.c +++ b/drivers/net/ethernet/ti/icssg/icss_iep.c @@ -290,6 +290,9 @@ static void icss_iep_enable_shadow_mode(struct icss_iep *iep) for (cmp = IEP_MIN_CMP; cmp < IEP_MAX_CMP; cmp++) { regmap_update_bits(iep->map, ICSS_IEP_CMP_STAT_REG, IEP_CMP_STATUS(cmp), IEP_CMP_STATUS(cmp)); + + regmap_update_bits(iep->map, ICSS_IEP_CMP_CFG_REG, + IEP_CMP_CFG_CMP_EN(cmp), 0); }
/* enable reset counter on CMP0 event */ @@ -808,6 +811,11 @@ int icss_iep_exit(struct icss_iep *iep) } icss_iep_disable(iep);
+ if (iep->pps_enabled) + icss_iep_pps_enable(iep, false); + else if (iep->perout_enabled) + icss_iep_perout_enable(iep, NULL, false); + return 0; } EXPORT_SYMBOL_GPL(icss_iep_exit);
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Issam Hamdi ih@simonwunderlich.de
[ Upstream commit 49dba1ded8dd5a6a12748631403240b2ab245c34 ]
On 32-bit systems, the size of an unsigned long is 4 bytes, while a u64 is 8 bytes. Therefore, when using or_each_set_bit(bit, &bits, sizeof(changed) * BITS_PER_BYTE), the code is incorrectly searching for a bit in a 32-bit variable that is expected to be 64 bits in size, leading to incorrect bit finding.
Solution: Ensure that the size of the bits variable is correctly adjusted for each architecture.
Call Trace: ? show_regs+0x54/0x58 ? __warn+0x6b/0xd4 ? ieee80211_link_info_change_notify+0xcc/0xd4 [mac80211] ? report_bug+0x113/0x150 ? exc_overflow+0x30/0x30 ? handle_bug+0x27/0x44 ? exc_invalid_op+0x18/0x50 ? handle_exception+0xf6/0xf6 ? exc_overflow+0x30/0x30 ? ieee80211_link_info_change_notify+0xcc/0xd4 [mac80211] ? exc_overflow+0x30/0x30 ? ieee80211_link_info_change_notify+0xcc/0xd4 [mac80211] ? ieee80211_mesh_work+0xff/0x260 [mac80211] ? cfg80211_wiphy_work+0x72/0x98 [cfg80211] ? process_one_work+0xf1/0x1fc ? worker_thread+0x2c0/0x3b4 ? kthread+0xc7/0xf0 ? mod_delayed_work_on+0x4c/0x4c ? kthread_complete_and_exit+0x14/0x14 ? ret_from_fork+0x24/0x38 ? kthread_complete_and_exit+0x14/0x14 ? ret_from_fork_asm+0xf/0x14 ? entry_INT80_32+0xf0/0xf0
Signed-off-by: Issam Hamdi ih@simonwunderlich.de Link: https://patch.msgid.link/20241125162920.2711462-1-ih@simonwunderlich.de [restore no-op path for no changes] Signed-off-by: Johannes Berg johannes.berg@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- net/mac80211/mesh.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/net/mac80211/mesh.c b/net/mac80211/mesh.c index 25223184d6e5..a5e7edd2f2d1 100644 --- a/net/mac80211/mesh.c +++ b/net/mac80211/mesh.c @@ -1173,14 +1173,14 @@ void ieee80211_mbss_info_change_notify(struct ieee80211_sub_if_data *sdata, u64 changed) { struct ieee80211_if_mesh *ifmsh = &sdata->u.mesh; - unsigned long bits = changed; + unsigned long bits[] = { BITMAP_FROM_U64(changed) }; u32 bit;
- if (!bits) + if (!changed) return;
/* if we race with running work, worst case this work becomes a noop */ - for_each_set_bit(bit, &bits, sizeof(changed) * BITS_PER_BYTE) + for_each_set_bit(bit, bits, sizeof(changed) * BITS_PER_BYTE) set_bit(bit, ifmsh->mbss_changed); set_bit(MESH_WORK_MBSS_CHANGED, &ifmsh->wrkq_flags); wiphy_work_queue(sdata->local->hw.wiphy, &sdata->work);
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Emmanuel Grumbach emmanuel.grumbach@intel.com
[ Upstream commit 220bf000530f9b1114fa2a1022a871c7ce8a0b38 ]
In case we fail to resume, we'll WARN with "Hardware became unavailable during restart." and we'll wait until user space does something. It'll typically bring the interface down and up to recover. This won't work though because the queues are still stopped on IEEE80211_QUEUE_STOP_REASON_SUSPEND reason. Make sure we clear that reason so that we give a chance to the recovery to succeed.
Signed-off-by: Emmanuel Grumbach emmanuel.grumbach@intel.com Closes: https://bugzilla.kernel.org/show_bug.cgi?id=219447 Signed-off-by: Miri Korenblit miriam.rachel.korenblit@intel.com Link: https://patch.msgid.link/20241119173108.cd628f560f97.I76a15fdb92de450e532994... Signed-off-by: Johannes Berg johannes.berg@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- net/mac80211/util.c | 3 +++ 1 file changed, 3 insertions(+)
diff --git a/net/mac80211/util.c b/net/mac80211/util.c index cc3c46a82077..154b41af4157 100644 --- a/net/mac80211/util.c +++ b/net/mac80211/util.c @@ -2586,6 +2586,9 @@ int ieee80211_reconfig(struct ieee80211_local *local) WARN(1, "Hardware became unavailable upon resume. This could be a software issue prior to suspend or a hardware issue.\n"); else WARN(1, "Hardware became unavailable during restart.\n"); + ieee80211_wake_queues_by_reason(hw, IEEE80211_MAX_QUEUE_MAP, + IEEE80211_QUEUE_STOP_REASON_SUSPEND, + false); ieee80211_handle_reconfig_failure(local); return res; }
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Prike Liang Prike.Liang@amd.com
[ Upstream commit 5c3de6b02d38eb9386edf50490e050bb44398e40 ]
The SVM DMA device map direction should be set the same as the DMA unmap setting, otherwise the DMA core will report the following warning.
Before finialize this solution, there're some discussion on the DMA mapping type(stream-based or coherent) in this KFD migration case, followed by https://lore.kernel.org/all/04d4ab32 -45a1-4b88-86ee-fb0f35a0ca40@amd.com/T/.
As there's no dma_sync_single_for_*() in the DMA buffer accessed that because this migration operation should be sync properly and automatically. Give that there's might not be a performance problem in various cache sync policy of DMA sync. Therefore, in order to simplify the DMA direction setting alignment, let's set the DMA map direction as BIDIRECTIONAL.
[ 150.834218] WARNING: CPU: 8 PID: 1812 at kernel/dma/debug.c:1028 check_unmap+0x1cc/0x930 [ 150.834225] Modules linked in: amdgpu(OE) amdxcp drm_exec(OE) gpu_sched drm_buddy(OE) drm_ttm_helper(OE) ttm(OE) drm_suballoc_helper(OE) drm_display_helper(OE) drm_kms_helper(OE) i2c_algo_bit rpcsec_gss_krb5 auth_rpcgss nfsv4 nfs lockd grace netfs xt_conntrack xt_MASQUERADE nf_conntrack_netlink xfrm_user xfrm_algo iptable_nat xt_addrtype iptable_filter br_netfilter nvme_fabrics overlay nfnetlink_cttimeout nfnetlink openvswitch nsh nf_conncount nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 libcrc32c bridge stp llc sch_fq_codel intel_rapl_msr amd_atl intel_rapl_common snd_hda_codec_realtek snd_hda_codec_generic snd_hda_scodec_component snd_hda_codec_hdmi snd_hda_intel snd_intel_dspcfg edac_mce_amd snd_pci_acp6x snd_hda_codec snd_acp_config snd_hda_core snd_hwdep snd_soc_acpi kvm_amd sunrpc snd_pcm kvm binfmt_misc snd_seq_midi crct10dif_pclmul snd_seq_midi_event ghash_clmulni_intel sha512_ssse3 snd_rawmidi nls_iso8859_1 sha256_ssse3 sha1_ssse3 snd_seq aesni_intel snd_seq_device crypto_simd snd_timer cryptd input_leds [ 150.834310] wmi_bmof serio_raw k10temp rapl snd sp5100_tco ipmi_devintf soundcore ccp ipmi_msghandler cm32181 industrialio mac_hid msr parport_pc ppdev lp parport efi_pstore drm(OE) ip_tables x_tables pci_stub crc32_pclmul nvme ahci libahci i2c_piix4 r8169 nvme_core i2c_designware_pci realtek i2c_ccgx_ucsi video wmi hid_generic cdc_ether usbnet usbhid hid r8152 mii [ 150.834354] CPU: 8 PID: 1812 Comm: rocrtst64 Tainted: G OE 6.10.0-custom #492 [ 150.834358] Hardware name: AMD Majolica-RN/Majolica-RN, BIOS RMJ1009A 06/13/2021 [ 150.834360] RIP: 0010:check_unmap+0x1cc/0x930 [ 150.834363] Code: c0 4c 89 4d c8 e8 34 bf 86 00 4c 8b 4d c8 4c 8b 45 c0 48 8b 4d b8 48 89 c6 41 57 4c 89 ea 48 c7 c7 80 49 b4 84 e8 b4 81 f3 ff <0f> 0b 48 c7 c7 04 83 ac 84 e8 76 ba fc ff 41 8b 76 4c 49 8d 7e 50 [ 150.834365] RSP: 0018:ffffaac5023739e0 EFLAGS: 00010086 [ 150.834368] RAX: 0000000000000000 RBX: ffffffff8566a2e0 RCX: 0000000000000027 [ 150.834370] RDX: ffff8f6a8f621688 RSI: 0000000000000001 RDI: ffff8f6a8f621680 [ 150.834372] RBP: ffffaac502373a30 R08: 00000000000000c9 R09: ffffaac502373850 [ 150.834373] R10: ffffaac502373848 R11: ffffffff84f46328 R12: ffffaac502373a40 [ 150.834375] R13: ffff8f6741045330 R14: ffff8f6741a77700 R15: ffffffff84ac831b [ 150.834377] FS: 00007faf0fc94c00(0000) GS:ffff8f6a8f600000(0000) knlGS:0000000000000000 [ 150.834379] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 150.834381] CR2: 00007faf0b600020 CR3: 000000010a52e000 CR4: 0000000000350ef0 [ 150.834383] Call Trace: [ 150.834385] <TASK> [ 150.834387] ? show_regs+0x6d/0x80 [ 150.834393] ? __warn+0x8c/0x140 [ 150.834397] ? check_unmap+0x1cc/0x930 [ 150.834400] ? report_bug+0x193/0x1a0 [ 150.834406] ? handle_bug+0x46/0x80 [ 150.834410] ? exc_invalid_op+0x1d/0x80 [ 150.834413] ? asm_exc_invalid_op+0x1f/0x30 [ 150.834420] ? check_unmap+0x1cc/0x930 [ 150.834425] debug_dma_unmap_page+0x86/0x90 [ 150.834431] ? srso_return_thunk+0x5/0x5f [ 150.834435] ? rmap_walk+0x28/0x50 [ 150.834438] ? srso_return_thunk+0x5/0x5f [ 150.834441] ? remove_migration_ptes+0x79/0x80 [ 150.834445] ? srso_return_thunk+0x5/0x5f [ 150.834448] dma_unmap_page_attrs+0xfa/0x1d0 [ 150.834453] svm_range_dma_unmap_dev+0x8a/0xf0 [amdgpu] [ 150.834710] svm_migrate_ram_to_vram+0x361/0x740 [amdgpu] [ 150.834914] svm_migrate_to_vram+0xa8/0xe0 [amdgpu] [ 150.835111] svm_range_set_attr+0xff2/0x1450 [amdgpu] [ 150.835311] svm_ioctl+0x4a/0x50 [amdgpu] [ 150.835510] kfd_ioctl_svm+0x54/0x90 [amdgpu] [ 150.835701] kfd_ioctl+0x3c2/0x530 [amdgpu] [ 150.835888] ? __pfx_kfd_ioctl_svm+0x10/0x10 [amdgpu] [ 150.836075] ? srso_return_thunk+0x5/0x5f [ 150.836080] ? tomoyo_file_ioctl+0x20/0x30 [ 150.836086] __x64_sys_ioctl+0x9c/0xd0 [ 150.836091] x64_sys_call+0x1219/0x20d0 [ 150.836095] do_syscall_64+0x51/0x120 [ 150.836098] entry_SYSCALL_64_after_hwframe+0x76/0x7e [ 150.836102] RIP: 0033:0x7faf0f11a94f [ 150.836105] Code: 00 48 89 44 24 18 31 c0 48 8d 44 24 60 c7 04 24 10 00 00 00 48 89 44 24 08 48 8d 44 24 20 48 89 44 24 10 b8 10 00 00 00 0f 05 <41> 89 c0 3d 00 f0 ff ff 77 1f 48 8b 44 24 18 64 48 2b 04 25 28 00 [ 150.836107] RSP: 002b:00007ffeced26bc0 EFLAGS: 00000246 ORIG_RAX: 0000000000000010 [ 150.836110] RAX: ffffffffffffffda RBX: 000055c683528fb0 RCX: 00007faf0f11a94f [ 150.836112] RDX: 00007ffeced26c60 RSI: 00000000c0484b20 RDI: 0000000000000003 [ 150.836114] RBP: 00007ffeced26c50 R08: 0000000000000000 R09: 0000000000000001 [ 150.836115] R10: 0000000000000032 R11: 0000000000000246 R12: 000055c683528bd0 [ 150.836117] R13: 0000000000000000 R14: 0000000000000021 R15: 0000000000000000 [ 150.836122] </TASK> [ 150.836124] ---[ end trace 0000000000000000 ]---
Signed-off-by: Prike Liang Prike.Liang@amd.com Reviewed-by: Felix Kuehling felix.kuehling@amd.com Signed-off-by: Alex Deucher alexander.deucher@amd.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/amd/amdkfd/kfd_migrate.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_migrate.c b/drivers/gpu/drm/amd/amdkfd/kfd_migrate.c index 3263b5fa182d..f99e3b812ee4 100644 --- a/drivers/gpu/drm/amd/amdkfd/kfd_migrate.c +++ b/drivers/gpu/drm/amd/amdkfd/kfd_migrate.c @@ -319,7 +319,7 @@ svm_migrate_copy_to_vram(struct kfd_node *node, struct svm_range *prange, spage = migrate_pfn_to_page(migrate->src[i]); if (spage && !is_zone_device_page(spage)) { src[i] = dma_map_page(dev, spage, 0, PAGE_SIZE, - DMA_TO_DEVICE); + DMA_BIDIRECTIONAL); r = dma_mapping_error(dev, src[i]); if (r) { dev_err(dev, "%s: fail %d dma_map_page\n", @@ -634,7 +634,7 @@ svm_migrate_copy_to_ram(struct amdgpu_device *adev, struct svm_range *prange, goto out_oom; }
- dst[i] = dma_map_page(dev, dpage, 0, PAGE_SIZE, DMA_FROM_DEVICE); + dst[i] = dma_map_page(dev, dpage, 0, PAGE_SIZE, DMA_BIDIRECTIONAL); r = dma_mapping_error(dev, dst[i]); if (r) { dev_err(adev->dev, "%s: fail %d dma_map_page\n", __func__, r);
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Filipe Manana fdmanana@suse.com
[ Upstream commit f10bef73fb355e3fc85e63a50386798be68ff486 ]
During the unmount path, at close_ctree(), we first stop the cleaner kthread, using kthread_stop() which frees the associated task_struct, and then stop and destroy all the work queues. However after we stopped the cleaner we may still have a worker from the delalloc_workers queue running inode.c:submit_compressed_extents(), which calls btrfs_add_delayed_iput(), which in turn tries to wake up the cleaner kthread - which was already destroyed before, resulting in a use-after-free on the task_struct.
Syzbot reported this with the following stack traces:
BUG: KASAN: slab-use-after-free in __lock_acquire+0x78/0x2100 kernel/locking/lockdep.c:5089 Read of size 8 at addr ffff8880259d2818 by task kworker/u8:3/52
CPU: 1 UID: 0 PID: 52 Comm: kworker/u8:3 Not tainted 6.13.0-rc1-syzkaller-00002-gcdd30ebb1b9f #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 09/13/2024 Workqueue: btrfs-delalloc btrfs_work_helper Call Trace: <TASK> __dump_stack lib/dump_stack.c:94 [inline] dump_stack_lvl+0x241/0x360 lib/dump_stack.c:120 print_address_description mm/kasan/report.c:378 [inline] print_report+0x169/0x550 mm/kasan/report.c:489 kasan_report+0x143/0x180 mm/kasan/report.c:602 __lock_acquire+0x78/0x2100 kernel/locking/lockdep.c:5089 lock_acquire+0x1ed/0x550 kernel/locking/lockdep.c:5849 __raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:110 [inline] _raw_spin_lock_irqsave+0xd5/0x120 kernel/locking/spinlock.c:162 class_raw_spinlock_irqsave_constructor include/linux/spinlock.h:551 [inline] try_to_wake_up+0xc2/0x1470 kernel/sched/core.c:4205 submit_compressed_extents+0xdf/0x16e0 fs/btrfs/inode.c:1615 run_ordered_work fs/btrfs/async-thread.c:288 [inline] btrfs_work_helper+0x96f/0xc40 fs/btrfs/async-thread.c:324 process_one_work kernel/workqueue.c:3229 [inline] process_scheduled_works+0xa66/0x1840 kernel/workqueue.c:3310 worker_thread+0x870/0xd30 kernel/workqueue.c:3391 kthread+0x2f0/0x390 kernel/kthread.c:389 ret_from_fork+0x4b/0x80 arch/x86/kernel/process.c:147 ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244 </TASK>
Allocated by task 2: kasan_save_stack mm/kasan/common.c:47 [inline] kasan_save_track+0x3f/0x80 mm/kasan/common.c:68 unpoison_slab_object mm/kasan/common.c:319 [inline] __kasan_slab_alloc+0x66/0x80 mm/kasan/common.c:345 kasan_slab_alloc include/linux/kasan.h:250 [inline] slab_post_alloc_hook mm/slub.c:4104 [inline] slab_alloc_node mm/slub.c:4153 [inline] kmem_cache_alloc_node_noprof+0x1d9/0x380 mm/slub.c:4205 alloc_task_struct_node kernel/fork.c:180 [inline] dup_task_struct+0x57/0x8c0 kernel/fork.c:1113 copy_process+0x5d1/0x3d50 kernel/fork.c:2225 kernel_clone+0x223/0x870 kernel/fork.c:2807 kernel_thread+0x1bc/0x240 kernel/fork.c:2869 create_kthread kernel/kthread.c:412 [inline] kthreadd+0x60d/0x810 kernel/kthread.c:767 ret_from_fork+0x4b/0x80 arch/x86/kernel/process.c:147 ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244
Freed by task 24: kasan_save_stack mm/kasan/common.c:47 [inline] kasan_save_track+0x3f/0x80 mm/kasan/common.c:68 kasan_save_free_info+0x40/0x50 mm/kasan/generic.c:582 poison_slab_object mm/kasan/common.c:247 [inline] __kasan_slab_free+0x59/0x70 mm/kasan/common.c:264 kasan_slab_free include/linux/kasan.h:233 [inline] slab_free_hook mm/slub.c:2338 [inline] slab_free mm/slub.c:4598 [inline] kmem_cache_free+0x195/0x410 mm/slub.c:4700 put_task_struct include/linux/sched/task.h:144 [inline] delayed_put_task_struct+0x125/0x300 kernel/exit.c:227 rcu_do_batch kernel/rcu/tree.c:2567 [inline] rcu_core+0xaaa/0x17a0 kernel/rcu/tree.c:2823 handle_softirqs+0x2d4/0x9b0 kernel/softirq.c:554 run_ksoftirqd+0xca/0x130 kernel/softirq.c:943 smpboot_thread_fn+0x544/0xa30 kernel/smpboot.c:164 kthread+0x2f0/0x390 kernel/kthread.c:389 ret_from_fork+0x4b/0x80 arch/x86/kernel/process.c:147 ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244
Last potentially related work creation: kasan_save_stack+0x3f/0x60 mm/kasan/common.c:47 __kasan_record_aux_stack+0xac/0xc0 mm/kasan/generic.c:544 __call_rcu_common kernel/rcu/tree.c:3086 [inline] call_rcu+0x167/0xa70 kernel/rcu/tree.c:3190 context_switch kernel/sched/core.c:5372 [inline] __schedule+0x1803/0x4be0 kernel/sched/core.c:6756 __schedule_loop kernel/sched/core.c:6833 [inline] schedule+0x14b/0x320 kernel/sched/core.c:6848 schedule_timeout+0xb0/0x290 kernel/time/sleep_timeout.c:75 do_wait_for_common kernel/sched/completion.c:95 [inline] __wait_for_common kernel/sched/completion.c:116 [inline] wait_for_common kernel/sched/completion.c:127 [inline] wait_for_completion+0x355/0x620 kernel/sched/completion.c:148 kthread_stop+0x19e/0x640 kernel/kthread.c:712 close_ctree+0x524/0xd60 fs/btrfs/disk-io.c:4328 generic_shutdown_super+0x139/0x2d0 fs/super.c:642 kill_anon_super+0x3b/0x70 fs/super.c:1237 btrfs_kill_super+0x41/0x50 fs/btrfs/super.c:2112 deactivate_locked_super+0xc4/0x130 fs/super.c:473 cleanup_mnt+0x41f/0x4b0 fs/namespace.c:1373 task_work_run+0x24f/0x310 kernel/task_work.c:239 ptrace_notify+0x2d2/0x380 kernel/signal.c:2503 ptrace_report_syscall include/linux/ptrace.h:415 [inline] ptrace_report_syscall_exit include/linux/ptrace.h:477 [inline] syscall_exit_work+0xc7/0x1d0 kernel/entry/common.c:173 syscall_exit_to_user_mode_prepare kernel/entry/common.c:200 [inline] __syscall_exit_to_user_mode_work kernel/entry/common.c:205 [inline] syscall_exit_to_user_mode+0x24a/0x340 kernel/entry/common.c:218 do_syscall_64+0x100/0x230 arch/x86/entry/common.c:89 entry_SYSCALL_64_after_hwframe+0x77/0x7f
The buggy address belongs to the object at ffff8880259d1e00 which belongs to the cache task_struct of size 7424 The buggy address is located 2584 bytes inside of freed 7424-byte region [ffff8880259d1e00, ffff8880259d3b00)
The buggy address belongs to the physical page: page: refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x259d0 head: order:3 mapcount:0 entire_mapcount:0 nr_pages_mapped:0 pincount:0 memcg:ffff88802f4b56c1 flags: 0xfff00000000040(head|node=0|zone=1|lastcpupid=0x7ff) page_type: f5(slab) raw: 00fff00000000040 ffff88801bafe500 dead000000000100 dead000000000122 raw: 0000000000000000 0000000000040004 00000001f5000000 ffff88802f4b56c1 head: 00fff00000000040 ffff88801bafe500 dead000000000100 dead000000000122 head: 0000000000000000 0000000000040004 00000001f5000000 ffff88802f4b56c1 head: 00fff00000000003 ffffea0000967401 ffffffffffffffff 0000000000000000 head: 0000000000000008 0000000000000000 00000000ffffffff 0000000000000000 page dumped because: kasan: bad access detected page_owner tracks the page as allocated page last allocated via order 3, migratetype Unmovable, gfp_mask 0xd20c0(__GFP_IO|__GFP_FS|__GFP_NOWARN|__GFP_NORETRY|__GFP_COMP|__GFP_NOMEMALLOC), pid 12, tgid 12 (kworker/u8:1), ts 7328037942, free_ts 0 set_page_owner include/linux/page_owner.h:32 [inline] post_alloc_hook+0x1f3/0x230 mm/page_alloc.c:1556 prep_new_page mm/page_alloc.c:1564 [inline] get_page_from_freelist+0x3651/0x37a0 mm/page_alloc.c:3474 __alloc_pages_noprof+0x292/0x710 mm/page_alloc.c:4751 alloc_pages_mpol_noprof+0x3e8/0x680 mm/mempolicy.c:2265 alloc_slab_page+0x6a/0x140 mm/slub.c:2408 allocate_slab+0x5a/0x2f0 mm/slub.c:2574 new_slab mm/slub.c:2627 [inline] ___slab_alloc+0xcd1/0x14b0 mm/slub.c:3815 __slab_alloc+0x58/0xa0 mm/slub.c:3905 __slab_alloc_node mm/slub.c:3980 [inline] slab_alloc_node mm/slub.c:4141 [inline] kmem_cache_alloc_node_noprof+0x269/0x380 mm/slub.c:4205 alloc_task_struct_node kernel/fork.c:180 [inline] dup_task_struct+0x57/0x8c0 kernel/fork.c:1113 copy_process+0x5d1/0x3d50 kernel/fork.c:2225 kernel_clone+0x223/0x870 kernel/fork.c:2807 user_mode_thread+0x132/0x1a0 kernel/fork.c:2885 call_usermodehelper_exec_work+0x5c/0x230 kernel/umh.c:171 process_one_work kernel/workqueue.c:3229 [inline] process_scheduled_works+0xa66/0x1840 kernel/workqueue.c:3310 worker_thread+0x870/0xd30 kernel/workqueue.c:3391 page_owner free stack trace missing
Memory state around the buggy address: ffff8880259d2700: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb ffff8880259d2780: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
ffff8880259d2800: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
^ ffff8880259d2880: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb ffff8880259d2900: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb ==================================================================
Fix this by flushing the delalloc workers queue before stopping the cleaner kthread.
Reported-by: syzbot+b7cf50a0c173770dcb14@syzkaller.appspotmail.com Link: https://lore.kernel.org/linux-btrfs/674ed7e8.050a0220.48a03.0031.GAE@google.... Reviewed-by: Qu Wenruo wqu@suse.com Signed-off-by: Filipe Manana fdmanana@suse.com Reviewed-by: David Sterba dsterba@suse.com Signed-off-by: David Sterba dsterba@suse.com Signed-off-by: Sasha Levin sashal@kernel.org --- fs/btrfs/disk-io.c | 9 +++++++++ 1 file changed, 9 insertions(+)
diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c index 8ec411eb9c9b..967c6b5dd0a4 100644 --- a/fs/btrfs/disk-io.c +++ b/fs/btrfs/disk-io.c @@ -4323,6 +4323,15 @@ void __cold close_ctree(struct btrfs_fs_info *fs_info) * already the cleaner, but below we run all pending delayed iputs. */ btrfs_flush_workqueue(fs_info->fixup_workers); + /* + * Similar case here, we have to wait for delalloc workers before we + * proceed below and stop the cleaner kthread, otherwise we trigger a + * use-after-tree on the cleaner kthread task_struct when a delalloc + * worker running submit_compressed_extents() adds a delayed iput, which + * does a wake up on the cleaner kthread, which was already freed below + * when we call kthread_stop(). + */ + btrfs_flush_workqueue(fs_info->delalloc_workers);
/* * After we parked the cleaner kthread, ordered extents may have
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Takashi Iwai tiwai@suse.de
[ Upstream commit 7c005292e20ac53dfa601bf2a7375fd4815511ad ]
CA0132 used the PCI SSID lookup helper that doesn't support the model string matching or quirk aliasing.
Replace it with the standard HD-audio quirk helpers for supporting those, and add the definition of the model strings for supported quirks, too. There should be no visible change to the outside for the working system, but the driver will parse the model option and apply the quirk based on it from now on.
Link: https://patch.msgid.link/20241207133754.3658-1-tiwai@suse.de Signed-off-by: Takashi Iwai tiwai@suse.de Signed-off-by: Sasha Levin sashal@kernel.org --- sound/pci/hda/patch_ca0132.c | 37 ++++++++++++++++++++---------------- 1 file changed, 21 insertions(+), 16 deletions(-)
diff --git a/sound/pci/hda/patch_ca0132.c b/sound/pci/hda/patch_ca0132.c index 748a3c40966e..27e48fdbbf3a 100644 --- a/sound/pci/hda/patch_ca0132.c +++ b/sound/pci/hda/patch_ca0132.c @@ -1134,7 +1134,6 @@ struct ca0132_spec {
struct hda_codec *codec; struct delayed_work unsol_hp_work; - int quirk;
#ifdef ENABLE_TUNING_CONTROLS long cur_ctl_vals[TUNING_CTLS_COUNT]; @@ -1166,7 +1165,6 @@ struct ca0132_spec { * CA0132 quirks table */ enum { - QUIRK_NONE, QUIRK_ALIENWARE, QUIRK_ALIENWARE_M17XR4, QUIRK_SBZ, @@ -1176,10 +1174,11 @@ enum { QUIRK_R3D, QUIRK_AE5, QUIRK_AE7, + QUIRK_NONE = HDA_FIXUP_ID_NOT_SET, };
#ifdef CONFIG_PCI -#define ca0132_quirk(spec) ((spec)->quirk) +#define ca0132_quirk(spec) ((spec)->codec->fixup_id) #define ca0132_use_pci_mmio(spec) ((spec)->use_pci_mmio) #define ca0132_use_alt_functions(spec) ((spec)->use_alt_functions) #define ca0132_use_alt_controls(spec) ((spec)->use_alt_controls) @@ -1293,7 +1292,7 @@ static const struct hda_pintbl ae7_pincfgs[] = { {} };
-static const struct snd_pci_quirk ca0132_quirks[] = { +static const struct hda_quirk ca0132_quirks[] = { SND_PCI_QUIRK(0x1028, 0x057b, "Alienware M17x R4", QUIRK_ALIENWARE_M17XR4), SND_PCI_QUIRK(0x1028, 0x0685, "Alienware 15 2015", QUIRK_ALIENWARE), SND_PCI_QUIRK(0x1028, 0x0688, "Alienware 17 2015", QUIRK_ALIENWARE), @@ -1316,6 +1315,19 @@ static const struct snd_pci_quirk ca0132_quirks[] = { {} };
+static const struct hda_model_fixup ca0132_quirk_models[] = { + { .id = QUIRK_ALIENWARE, .name = "alienware" }, + { .id = QUIRK_ALIENWARE_M17XR4, .name = "alienware-m17xr4" }, + { .id = QUIRK_SBZ, .name = "sbz" }, + { .id = QUIRK_ZXR, .name = "zxr" }, + { .id = QUIRK_ZXR_DBPRO, .name = "zxr-dbpro" }, + { .id = QUIRK_R3DI, .name = "r3di" }, + { .id = QUIRK_R3D, .name = "r3d" }, + { .id = QUIRK_AE5, .name = "ae5" }, + { .id = QUIRK_AE7, .name = "ae7" }, + {} +}; + /* Output selection quirk info structures. */ #define MAX_QUIRK_MMIO_GPIO_SET_VALS 3 #define MAX_QUIRK_SCP_SET_VALS 2 @@ -9962,17 +9974,15 @@ static int ca0132_prepare_verbs(struct hda_codec *codec) */ static void sbz_detect_quirk(struct hda_codec *codec) { - struct ca0132_spec *spec = codec->spec; - switch (codec->core.subsystem_id) { case 0x11020033: - spec->quirk = QUIRK_ZXR; + codec->fixup_id = QUIRK_ZXR; break; case 0x1102003f: - spec->quirk = QUIRK_ZXR_DBPRO; + codec->fixup_id = QUIRK_ZXR_DBPRO; break; default: - spec->quirk = QUIRK_SBZ; + codec->fixup_id = QUIRK_SBZ; break; } } @@ -9981,7 +9991,6 @@ static int patch_ca0132(struct hda_codec *codec) { struct ca0132_spec *spec; int err; - const struct snd_pci_quirk *quirk;
codec_dbg(codec, "patch_ca0132\n");
@@ -9992,11 +10001,7 @@ static int patch_ca0132(struct hda_codec *codec) spec->codec = codec;
/* Detect codec quirk */ - quirk = snd_pci_quirk_lookup(codec->bus->pci, ca0132_quirks); - if (quirk) - spec->quirk = quirk->value; - else - spec->quirk = QUIRK_NONE; + snd_hda_pick_fixup(codec, ca0132_quirk_models, ca0132_quirks, NULL); if (ca0132_quirk(spec) == QUIRK_SBZ) sbz_detect_quirk(codec);
@@ -10073,7 +10078,7 @@ static int patch_ca0132(struct hda_codec *codec) spec->mem_base = pci_iomap(codec->bus->pci, 2, 0xC20); if (spec->mem_base == NULL) { codec_warn(codec, "pci_iomap failed! Setting quirk to QUIRK_NONE."); - spec->quirk = QUIRK_NONE; + codec->fixup_id = QUIRK_NONE; } } #endif
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Vasiliy Kovalev kovalev@altlinux.org
[ Upstream commit 50db91fccea0da5c669bc68e2429e8de303758d3 ]
Introduces the alc2xx-fixup-headset-mic model to simplify enabling headset microphones on ALC2XX codecs.
Many recent configurations, as well as older systems that lacked this fix for a long time, leave headset microphones inactive by default. This addition provides a flexible workaround using the existing ALC2XX_FIXUP_HEADSET_MIC quirk.
Signed-off-by: Vasiliy Kovalev kovalev@altlinux.org Link: https://patch.msgid.link/20241207201836.6879-1-kovalev@altlinux.org Signed-off-by: Takashi Iwai tiwai@suse.de Signed-off-by: Sasha Levin sashal@kernel.org --- sound/pci/hda/patch_realtek.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/sound/pci/hda/patch_realtek.c b/sound/pci/hda/patch_realtek.c index 29d7eb8c6bec..031cfc4744c0 100644 --- a/sound/pci/hda/patch_realtek.c +++ b/sound/pci/hda/patch_realtek.c @@ -10631,6 +10631,7 @@ static const struct hda_model_fixup alc269_fixup_models[] = { {.id = ALC255_FIXUP_ACER_HEADPHONE_AND_MIC, .name = "alc255-acer-headphone-and-mic"}, {.id = ALC285_FIXUP_HP_GPIO_AMP_INIT, .name = "alc285-hp-amp-init"}, {.id = ALC236_FIXUP_LENOVO_INV_DMIC, .name = "alc236-fixup-lenovo-inv-mic"}, + {.id = ALC2XX_FIXUP_HEADSET_MIC, .name = "alc2xx-fixup-headset-mic"}, {} }; #define ALC225_STANDARD_PINS \
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Adrian Ratiu adrian.ratiu@collabora.com
[ Upstream commit c84bd6c810d1880194fea2229c7086e4b73fddc1 ]
This is a UAC 2 DAC capable of raw DSD on intf 2 alt 4:
Bus 007 Device 004: ID 262a:9302 SAVITECH Corp. TC44C Device Descriptor: bLength 18 bDescriptorType 1 bcdUSB 2.00 bDeviceClass 239 Miscellaneous Device bDeviceSubClass 2 [unknown] bDeviceProtocol 1 Interface Association bMaxPacketSize0 64 idVendor 0x262a SAVITECH Corp. idProduct 0x9302 TC44C bcdDevice 0.01 iManufacturer 1 DDHIFI iProduct 2 TC44C iSerial 6 5000000001 ....... Interface Descriptor: bLength 9 bDescriptorType 4 bInterfaceNumber 2 bAlternateSetting 4 bNumEndpoints 2 bInterfaceClass 1 Audio bInterfaceSubClass 2 Streaming bInterfaceProtocol 32 iInterface 0 AudioStreaming Interface Descriptor: bLength 16 bDescriptorType 36 bDescriptorSubtype 1 (AS_GENERAL) bTerminalLink 3 bmControls 0x00 bFormatType 1 bmFormats 0x80000000 bNrChannels 2 bmChannelConfig 0x00000000 iChannelNames 0 .......
Signed-off-by: Adrian Ratiu adrian.ratiu@collabora.com Link: https://patch.msgid.link/20241209090529.16134-1-adrian.ratiu@collabora.com Signed-off-by: Takashi Iwai tiwai@suse.de Signed-off-by: Sasha Levin sashal@kernel.org --- sound/usb/quirks.c | 2 ++ 1 file changed, 2 insertions(+)
diff --git a/sound/usb/quirks.c b/sound/usb/quirks.c index 8eed8d9742fd..ec81b47c41c9 100644 --- a/sound/usb/quirks.c +++ b/sound/usb/quirks.c @@ -2225,6 +2225,8 @@ static const struct usb_audio_quirk_flags_table quirk_flags_table[] = { QUIRK_FLAG_DSD_RAW), DEVICE_FLG(0x2522, 0x0007, /* LH Labs Geek Out HD Audio 1V5 */ QUIRK_FLAG_SET_IFACE_FIRST), + DEVICE_FLG(0x262a, 0x9302, /* ddHiFi TC44C */ + QUIRK_FLAG_DSD_RAW), DEVICE_FLG(0x2708, 0x0002, /* Audient iD14 */ QUIRK_FLAG_IGNORE_CTL_ERROR), DEVICE_FLG(0x2912, 0x30c8, /* Audioengine D1 */
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Adrian Ratiu adrian.ratiu@collabora.com
[ Upstream commit b50a3e98442b8d72f061617c7f7a71f7dba19484 ]
UAC 2 & 3 DAC's set bit 31 of the format to signal support for a RAW_DATA type, typically used for DSD playback.
This is correctly tested by (format & UAC*_FORMAT_TYPE_I_RAW_DATA), fp->dsd_raw = true; and call snd_usb_interface_dsd_format_quirks(), however a confusing and unnecessary message gets printed because the bit is not properly tested in the last "unsupported" if test: if (format & ~0x3F) { ... }
For example the output:
usb 7-1: new high-speed USB device number 5 using xhci_hcd usb 7-1: New USB device found, idVendor=262a, idProduct=9302, bcdDevice=0.01 usb 7-1: New USB device strings: Mfr=1, Product=2, SerialNumber=6 usb 7-1: Product: TC44C usb 7-1: Manufacturer: TC44C usb 7-1: SerialNumber: 5000000001 hid-generic 0003:262A:9302.001E: No inputs registered, leaving hid-generic 0003:262A:9302.001E: hidraw6: USB HID v1.00 Device [DDHIFI TC44C] on usb-0000:08:00.3-1/input0 usb 7-1: 2:4 : unsupported format bits 0x100000000
This last "unsupported format" is actually wrong: we know the format is a RAW_DATA which we assume is DSD, so there is no need to print the confusing message.
This we unset bit 31 of the format after recognizing it, to avoid the message.
Suggested-by: Takashi Iwai tiwai@suse.com Signed-off-by: Adrian Ratiu adrian.ratiu@collabora.com Link: https://patch.msgid.link/20241209090529.16134-2-adrian.ratiu@collabora.com Signed-off-by: Takashi Iwai tiwai@suse.de Signed-off-by: Sasha Levin sashal@kernel.org --- sound/usb/format.c | 7 ++++++- 1 file changed, 6 insertions(+), 1 deletion(-)
diff --git a/sound/usb/format.c b/sound/usb/format.c index 3b45d0ee7693..3b3a5ea6fcbf 100644 --- a/sound/usb/format.c +++ b/sound/usb/format.c @@ -60,6 +60,8 @@ static u64 parse_audio_format_i_type(struct snd_usb_audio *chip, pcm_formats |= SNDRV_PCM_FMTBIT_SPECIAL; /* flag potentially raw DSD capable altsettings */ fp->dsd_raw = true; + /* clear special format bit to avoid "unsupported format" msg below */ + format &= ~UAC2_FORMAT_TYPE_I_RAW_DATA; }
format <<= 1; @@ -71,8 +73,11 @@ static u64 parse_audio_format_i_type(struct snd_usb_audio *chip, sample_width = as->bBitResolution; sample_bytes = as->bSubslotSize;
- if (format & UAC3_FORMAT_TYPE_I_RAW_DATA) + if (format & UAC3_FORMAT_TYPE_I_RAW_DATA) { pcm_formats |= SNDRV_PCM_FMTBIT_SPECIAL; + /* clear special format bit to avoid "unsupported format" msg below */ + format &= ~UAC3_FORMAT_TYPE_I_RAW_DATA; + }
format <<= 1; break;
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Anton Protopopov aspsk@isovalent.com
[ Upstream commit c4441ca86afe4814039ee1b32c39d833c1a16bbc ]
The bpf_remove_insns() function returns WARN_ON_ONCE(error), where error is a result of bpf_adj_branches(), and thus should be always 0 However, if for any reason it is not 0, then it will be converted to boolean by WARN_ON_ONCE and returned to user space as 1, not an actual error value. Fix this by returning the original err after the WARN check.
Signed-off-by: Anton Protopopov aspsk@isovalent.com Acked-by: Jiri Olsa jolsa@kernel.org Acked-by: Andrii Nakryiko andrii@kernel.org Link: https://lore.kernel.org/r/20241210114245.836164-1-aspsk@isovalent.com Signed-off-by: Alexei Starovoitov ast@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- kernel/bpf/core.c | 6 +++++- 1 file changed, 5 insertions(+), 1 deletion(-)
diff --git a/kernel/bpf/core.c b/kernel/bpf/core.c index 58ee17f429a3..02f327f05fd6 100644 --- a/kernel/bpf/core.c +++ b/kernel/bpf/core.c @@ -529,6 +529,8 @@ struct bpf_prog *bpf_patch_insn_single(struct bpf_prog *prog, u32 off,
int bpf_remove_insns(struct bpf_prog *prog, u32 off, u32 cnt) { + int err; + /* Branch offsets can't overflow when program is shrinking, no need * to call bpf_adj_branches(..., true) here */ @@ -536,7 +538,9 @@ int bpf_remove_insns(struct bpf_prog *prog, u32 off, u32 cnt) sizeof(struct bpf_insn) * (prog->len - off - cnt)); prog->len -= cnt;
- return WARN_ON_ONCE(bpf_adj_branches(prog, off, off + cnt, off, false)); + err = bpf_adj_branches(prog, off, off + cnt, off, false); + WARN_ON_ONCE(err); + return err; }
static void bpf_prog_kallsyms_del_subprogs(struct bpf_prog *fp)
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Hobin Woo hobin.woo@samsung.com
[ Upstream commit 2b904d61a97e8ba79e3bc216ba290fd7e1d85028 ]
Some file systems do not ensure that the single call of iterate_dir reaches the end of the directory. For example, FUSE fetches entries from a daemon using 4KB buffer and stops fetching if entries exceed the buffer. And then an actor of caller, KSMBD, is used to fill the entries from the buffer. Thus, pattern searching on FUSE, files located after the 4KB could not be found and STATUS_NO_SUCH_FILE was returned.
Signed-off-by: Hobin Woo hobin.woo@samsung.com Reviewed-by: Sungjong Seo sj1557.seo@samsung.com Reviewed-by: Namjae Jeon linkinjeon@kernel.org Tested-by: Yoonho Shin yoonho.shin@samsung.com Acked-by: Namjae Jeon linkinjeon@kernel.org Signed-off-by: Steve French stfrench@microsoft.com Signed-off-by: Sasha Levin sashal@kernel.org --- fs/smb/server/smb2pdu.c | 12 +++++++++++- fs/smb/server/vfs.h | 1 + 2 files changed, 12 insertions(+), 1 deletion(-)
diff --git a/fs/smb/server/smb2pdu.c b/fs/smb/server/smb2pdu.c index cd530b9a00ca..7216e2cc498b 100644 --- a/fs/smb/server/smb2pdu.c +++ b/fs/smb/server/smb2pdu.c @@ -4225,6 +4225,7 @@ static bool __query_dir(struct dir_context *ctx, const char *name, int namlen, /* dot and dotdot entries are already reserved */ if (!strcmp(".", name) || !strcmp("..", name)) return true; + d_info->num_scan++; if (ksmbd_share_veto_filename(priv->work->tcon->share_conf, name)) return true; if (!match_pattern(name, namlen, priv->search_pattern)) @@ -4385,8 +4386,17 @@ int smb2_query_dir(struct ksmbd_work *work) query_dir_private.info_level = req->FileInformationClass; dir_fp->readdir_data.private = &query_dir_private; set_ctx_actor(&dir_fp->readdir_data.ctx, __query_dir); - +again: + d_info.num_scan = 0; rc = iterate_dir(dir_fp->filp, &dir_fp->readdir_data.ctx); + /* + * num_entry can be 0 if the directory iteration stops before reaching + * the end of the directory and no file is matched with the search + * pattern. + */ + if (rc >= 0 && !d_info.num_entry && d_info.num_scan && + d_info.out_buf_len > 0) + goto again; /* * req->OutputBufferLength is too small to contain even one entry. * In this case, it immediately returns OutputBufferLength 0 to client. diff --git a/fs/smb/server/vfs.h b/fs/smb/server/vfs.h index cb76f4b5bafe..06903024a2d8 100644 --- a/fs/smb/server/vfs.h +++ b/fs/smb/server/vfs.h @@ -43,6 +43,7 @@ struct ksmbd_dir_info { char *rptr; int name_len; int out_buf_len; + int num_scan; int num_entry; int data_count; int last_entry_offset;
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Namjae Jeon linkinjeon@kernel.org
[ Upstream commit 21e46a79bbe6c4e1aa73b3ed998130f2ff07b128 ]
David reported that the new warning from setattr_copy_mgtime is coming like the following.
[ 113.215316] ------------[ cut here ]------------ [ 113.215974] WARNING: CPU: 1 PID: 31 at fs/attr.c:300 setattr_copy+0x1ee/0x200 [ 113.219192] CPU: 1 UID: 0 PID: 31 Comm: kworker/1:1 Not tainted 6.13.0-rc1+ #234 [ 113.220127] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.2-3-gd478f380-rebuilt.opensuse.org 04/01/2014 [ 113.221530] Workqueue: ksmbd-io handle_ksmbd_work [ksmbd] [ 113.222220] RIP: 0010:setattr_copy+0x1ee/0x200 [ 113.222833] Code: 24 28 49 8b 44 24 30 48 89 53 58 89 43 6c 5b 41 5c 41 5d 41 5e 41 5f 5d c3 cc cc cc cc 48 89 df e8 77 d6 ff ff e9 cd fe ff ff <0f> 0b e9 be fe ff ff 66 0 [ 113.225110] RSP: 0018:ffffaf218010fb68 EFLAGS: 00010202 [ 113.225765] RAX: 0000000000000120 RBX: ffffa446815f8568 RCX: 0000000000000003 [ 113.226667] RDX: ffffaf218010fd38 RSI: ffffa446815f8568 RDI: ffffffff94eb03a0 [ 113.227531] RBP: ffffaf218010fb90 R08: 0000001a251e217d R09: 00000000675259fa [ 113.228426] R10: 0000000002ba8a6d R11: ffffa4468196c7a8 R12: ffffaf218010fd38 [ 113.229304] R13: 0000000000000120 R14: ffffffff94eb03a0 R15: 0000000000000000 [ 113.230210] FS: 0000000000000000(0000) GS:ffffa44739d00000(0000) knlGS:0000000000000000 [ 113.231215] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 113.232055] CR2: 00007efe0053d27e CR3: 000000000331a000 CR4: 00000000000006b0 [ 113.232926] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 113.233812] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 113.234797] Call Trace: [ 113.235116] <TASK> [ 113.235393] ? __warn+0x73/0xd0 [ 113.235802] ? setattr_copy+0x1ee/0x200 [ 113.236299] ? report_bug+0xf3/0x1e0 [ 113.236757] ? handle_bug+0x4d/0x90 [ 113.237202] ? exc_invalid_op+0x13/0x60 [ 113.237689] ? asm_exc_invalid_op+0x16/0x20 [ 113.238185] ? setattr_copy+0x1ee/0x200 [ 113.238692] btrfs_setattr+0x80/0x820 [btrfs] [ 113.239285] ? get_stack_info_noinstr+0x12/0xf0 [ 113.239857] ? __module_address+0x22/0xa0 [ 113.240368] ? handle_ksmbd_work+0x6e/0x460 [ksmbd] [ 113.240993] ? __module_text_address+0x9/0x50 [ 113.241545] ? __module_address+0x22/0xa0 [ 113.242033] ? unwind_next_frame+0x10e/0x920 [ 113.242600] ? __pfx_stack_trace_consume_entry+0x10/0x10 [ 113.243268] notify_change+0x2c2/0x4e0 [ 113.243746] ? stack_depot_save_flags+0x27/0x730 [ 113.244339] ? set_file_basic_info+0x130/0x2b0 [ksmbd] [ 113.244993] set_file_basic_info+0x130/0x2b0 [ksmbd] [ 113.245613] ? process_scheduled_works+0xbe/0x310 [ 113.246181] ? worker_thread+0x100/0x240 [ 113.246696] ? kthread+0xc8/0x100 [ 113.247126] ? ret_from_fork+0x2b/0x40 [ 113.247606] ? ret_from_fork_asm+0x1a/0x30 [ 113.248132] smb2_set_info+0x63f/0xa70 [ksmbd]
ksmbd is trying to set the atime and mtime via notify_change without also setting the ctime. so This patch add ATTR_CTIME flags when setting mtime to avoid a warning.
Reported-by: David Disseldorp ddiss@suse.de Signed-off-by: Namjae Jeon linkinjeon@kernel.org Signed-off-by: Steve French stfrench@microsoft.com Signed-off-by: Sasha Levin sashal@kernel.org --- fs/smb/server/smb2pdu.c | 10 +++------- 1 file changed, 3 insertions(+), 7 deletions(-)
diff --git a/fs/smb/server/smb2pdu.c b/fs/smb/server/smb2pdu.c index 7216e2cc498b..2884ebdc0eda 100644 --- a/fs/smb/server/smb2pdu.c +++ b/fs/smb/server/smb2pdu.c @@ -6017,15 +6017,13 @@ static int set_file_basic_info(struct ksmbd_file *fp, attrs.ia_valid |= (ATTR_ATIME | ATTR_ATIME_SET); }
- attrs.ia_valid |= ATTR_CTIME; if (file_info->ChangeTime) - attrs.ia_ctime = ksmbd_NTtimeToUnix(file_info->ChangeTime); - else - attrs.ia_ctime = inode_get_ctime(inode); + inode_set_ctime_to_ts(inode, + ksmbd_NTtimeToUnix(file_info->ChangeTime));
if (file_info->LastWriteTime) { attrs.ia_mtime = ksmbd_NTtimeToUnix(file_info->LastWriteTime); - attrs.ia_valid |= (ATTR_MTIME | ATTR_MTIME_SET); + attrs.ia_valid |= (ATTR_MTIME | ATTR_MTIME_SET | ATTR_CTIME); }
if (file_info->Attributes) { @@ -6067,8 +6065,6 @@ static int set_file_basic_info(struct ksmbd_file *fp, return -EACCES;
inode_lock(inode); - inode_set_ctime_to_ts(inode, attrs.ia_ctime); - attrs.ia_valid &= ~ATTR_CTIME; rc = notify_change(idmap, dentry, &attrs, NULL); inode_unlock(inode); }
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Enzo Matsumiya ematsumiya@suse.de
[ Upstream commit 633609c48a358134d3f8ef8241dff24841577f58 ]
Fix potential problem in rmmod
Signed-off-by: Enzo Matsumiya ematsumiya@suse.de Signed-off-by: Steve French stfrench@microsoft.com Signed-off-by: Sasha Levin sashal@kernel.org --- fs/smb/client/cifsfs.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/fs/smb/client/cifsfs.c b/fs/smb/client/cifsfs.c index 6ed0f2548232..bbb0ef18d7b8 100644 --- a/fs/smb/client/cifsfs.c +++ b/fs/smb/client/cifsfs.c @@ -2015,6 +2015,7 @@ exit_cifs(void) destroy_workqueue(decrypt_wq); destroy_workqueue(fileinfo_put_wq); destroy_workqueue(serverclose_wq); + destroy_workqueue(cfid_put_wq); destroy_workqueue(cifsiod_wq); cifs_proc_clean(); }
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Daniele Palmas dnlplm@gmail.com
[ Upstream commit 3b58b53a26598209a7ad8259a5114ce71f7c3d64 ]
Add the following Telit FE910C04 compositions:
0x10c0: rmnet + tty (AT/NMEA) + tty (AT) + tty (diag) T: Bus=02 Lev=01 Prnt=03 Port=06 Cnt=01 Dev#= 13 Spd=480 MxCh= 0 D: Ver= 2.00 Cls=00(>ifc ) Sub=00 Prot=00 MxPS=64 #Cfgs= 1 P: Vendor=1bc7 ProdID=10c0 Rev=05.15 S: Manufacturer=Telit Cinterion S: Product=FE910 S: SerialNumber=f71b8b32 C: #Ifs= 4 Cfg#= 1 Atr=e0 MxPwr=500mA I: If#= 0 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=50 Driver=qmi_wwan E: Ad=01(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=81(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=82(I) Atr=03(Int.) MxPS= 8 Ivl=32ms I: If#= 1 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=60 Driver=option E: Ad=02(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=83(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=84(I) Atr=03(Int.) MxPS= 10 Ivl=32ms I: If#= 2 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=40 Driver=option E: Ad=03(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=85(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=86(I) Atr=03(Int.) MxPS= 10 Ivl=32ms I: If#= 3 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=ff Prot=30 Driver=option E: Ad=04(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=87(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
0x10c4: rmnet + tty (AT) + tty (AT) + tty (diag) T: Bus=02 Lev=01 Prnt=03 Port=06 Cnt=01 Dev#= 14 Spd=480 MxCh= 0 D: Ver= 2.00 Cls=00(>ifc ) Sub=00 Prot=00 MxPS=64 #Cfgs= 1 P: Vendor=1bc7 ProdID=10c4 Rev=05.15 S: Manufacturer=Telit Cinterion S: Product=FE910 S: SerialNumber=f71b8b32 C: #Ifs= 4 Cfg#= 1 Atr=e0 MxPwr=500mA I: If#= 0 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=50 Driver=qmi_wwan E: Ad=01(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=81(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=82(I) Atr=03(Int.) MxPS= 8 Ivl=32ms I: If#= 1 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=40 Driver=option E: Ad=02(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=83(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=84(I) Atr=03(Int.) MxPS= 10 Ivl=32ms I: If#= 2 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=40 Driver=option E: Ad=03(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=85(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=86(I) Atr=03(Int.) MxPS= 10 Ivl=32ms I: If#= 3 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=ff Prot=30 Driver=option E: Ad=04(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=87(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
0x10c8: rmnet + tty (AT) + tty (diag) + DPL (data packet logging) + adb T: Bus=02 Lev=01 Prnt=03 Port=06 Cnt=01 Dev#= 17 Spd=480 MxCh= 0 D: Ver= 2.00 Cls=00(>ifc ) Sub=00 Prot=00 MxPS=64 #Cfgs= 1 P: Vendor=1bc7 ProdID=10c8 Rev=05.15 S: Manufacturer=Telit Cinterion S: Product=FE910 S: SerialNumber=f71b8b32 C: #Ifs= 5 Cfg#= 1 Atr=e0 MxPwr=500mA I: If#= 0 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=50 Driver=qmi_wwan E: Ad=01(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=81(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=82(I) Atr=03(Int.) MxPS= 8 Ivl=32ms I: If#= 1 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=40 Driver=option E: Ad=02(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=83(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=84(I) Atr=03(Int.) MxPS= 10 Ivl=32ms I: If#= 2 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=ff Prot=30 Driver=option E: Ad=03(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=85(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms I: If#= 3 Alt= 0 #EPs= 1 Cls=ff(vend.) Sub=ff Prot=80 Driver=(none) E: Ad=86(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms I: If#= 4 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=42 Prot=01 Driver=(none) E: Ad=04(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=87(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
Signed-off-by: Daniele Palmas dnlplm@gmail.com Link: https://patch.msgid.link/20241209151821.3688829-1-dnlplm@gmail.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/usb/qmi_wwan.c | 3 +++ 1 file changed, 3 insertions(+)
diff --git a/drivers/net/usb/qmi_wwan.c b/drivers/net/usb/qmi_wwan.c index 89775b6d0699..8e30df676ede 100644 --- a/drivers/net/usb/qmi_wwan.c +++ b/drivers/net/usb/qmi_wwan.c @@ -1373,6 +1373,9 @@ static const struct usb_device_id products[] = { {QMI_QUIRK_SET_DTR(0x1bc7, 0x10a0, 0)}, /* Telit FN920C04 */ {QMI_QUIRK_SET_DTR(0x1bc7, 0x10a4, 0)}, /* Telit FN920C04 */ {QMI_QUIRK_SET_DTR(0x1bc7, 0x10a9, 0)}, /* Telit FN920C04 */ + {QMI_QUIRK_SET_DTR(0x1bc7, 0x10c0, 0)}, /* Telit FE910C04 */ + {QMI_QUIRK_SET_DTR(0x1bc7, 0x10c4, 0)}, /* Telit FE910C04 */ + {QMI_QUIRK_SET_DTR(0x1bc7, 0x10c8, 0)}, /* Telit FE910C04 */ {QMI_FIXED_INTF(0x1bc7, 0x1100, 3)}, /* Telit ME910 */ {QMI_FIXED_INTF(0x1bc7, 0x1101, 3)}, /* Telit ME910 dual modem */ {QMI_FIXED_INTF(0x1bc7, 0x1200, 5)}, /* Telit LE920 */
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Luiz Augusto von Dentz luiz.von.dentz@intel.com
[ Upstream commit 4d94f05558271654670d18c26c912da0c1c15549 ]
This reworks hci_cb_list to not use mutex hci_cb_list_lock to avoid bugs like the bellow:
BUG: sleeping function called from invalid context at kernel/locking/mutex.c:585 in_atomic(): 0, irqs_disabled(): 0, non_block: 0, pid: 5070, name: kworker/u9:2 preempt_count: 0, expected: 0 RCU nest depth: 1, expected: 0 4 locks held by kworker/u9:2/5070: #0: ffff888015be3948 ((wq_completion)hci0#2){+.+.}-{0:0}, at: process_one_work kernel/workqueue.c:3229 [inline] #0: ffff888015be3948 ((wq_completion)hci0#2){+.+.}-{0:0}, at: process_scheduled_works+0x8e0/0x1770 kernel/workqueue.c:3335 #1: ffffc90003b6fd00 ((work_completion)(&hdev->rx_work)){+.+.}-{0:0}, at: process_one_work kernel/workqueue.c:3230 [inline] #1: ffffc90003b6fd00 ((work_completion)(&hdev->rx_work)){+.+.}-{0:0}, at: process_scheduled_works+0x91b/0x1770 kernel/workqueue.c:3335 #2: ffff8880665d0078 (&hdev->lock){+.+.}-{3:3}, at: hci_le_create_big_complete_evt+0xcf/0xae0 net/bluetooth/hci_event.c:6914 #3: ffffffff8e132020 (rcu_read_lock){....}-{1:2}, at: rcu_lock_acquire include/linux/rcupdate.h:298 [inline] #3: ffffffff8e132020 (rcu_read_lock){....}-{1:2}, at: rcu_read_lock include/linux/rcupdate.h:750 [inline] #3: ffffffff8e132020 (rcu_read_lock){....}-{1:2}, at: hci_le_create_big_complete_evt+0xdb/0xae0 net/bluetooth/hci_event.c:6915 CPU: 0 PID: 5070 Comm: kworker/u9:2 Not tainted 6.8.0-syzkaller-08073-g480e035fc4c7 #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 03/27/2024 Workqueue: hci0 hci_rx_work Call Trace: <TASK> __dump_stack lib/dump_stack.c:88 [inline] dump_stack_lvl+0x241/0x360 lib/dump_stack.c:114 __might_resched+0x5d4/0x780 kernel/sched/core.c:10187 __mutex_lock_common kernel/locking/mutex.c:585 [inline] __mutex_lock+0xc1/0xd70 kernel/locking/mutex.c:752 hci_connect_cfm include/net/bluetooth/hci_core.h:2004 [inline] hci_le_create_big_complete_evt+0x3d9/0xae0 net/bluetooth/hci_event.c:6939 hci_event_func net/bluetooth/hci_event.c:7514 [inline] hci_event_packet+0xa53/0x1540 net/bluetooth/hci_event.c:7569 hci_rx_work+0x3e8/0xca0 net/bluetooth/hci_core.c:4171 process_one_work kernel/workqueue.c:3254 [inline] process_scheduled_works+0xa00/0x1770 kernel/workqueue.c:3335 worker_thread+0x86d/0xd70 kernel/workqueue.c:3416 kthread+0x2f0/0x390 kernel/kthread.c:388 ret_from_fork+0x4b/0x80 arch/x86/kernel/process.c:147 ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:243 </TASK>
Reported-by: syzbot+2fb0835e0c9cefc34614@syzkaller.appspotmail.com Tested-by: syzbot+2fb0835e0c9cefc34614@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=2fb0835e0c9cefc34614 Signed-off-by: Luiz Augusto von Dentz luiz.von.dentz@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- include/net/bluetooth/hci_core.h | 108 ++++++++++++++++++++----------- net/bluetooth/hci_core.c | 10 +-- net/bluetooth/iso.c | 6 ++ net/bluetooth/l2cap_core.c | 12 ++-- net/bluetooth/rfcomm/core.c | 6 ++ net/bluetooth/sco.c | 12 ++-- 6 files changed, 97 insertions(+), 57 deletions(-)
diff --git a/include/net/bluetooth/hci_core.h b/include/net/bluetooth/hci_core.h index e9214ccfde2d..4fcee6b734b7 100644 --- a/include/net/bluetooth/hci_core.h +++ b/include/net/bluetooth/hci_core.h @@ -800,7 +800,6 @@ struct hci_conn_params { extern struct list_head hci_dev_list; extern struct list_head hci_cb_list; extern rwlock_t hci_dev_list_lock; -extern struct mutex hci_cb_list_lock;
#define hci_dev_set_flag(hdev, nr) set_bit((nr), (hdev)->dev_flags) #define hci_dev_clear_flag(hdev, nr) clear_bit((nr), (hdev)->dev_flags) @@ -1949,24 +1948,47 @@ struct hci_cb {
char *name;
+ bool (*match) (struct hci_conn *conn); void (*connect_cfm) (struct hci_conn *conn, __u8 status); void (*disconn_cfm) (struct hci_conn *conn, __u8 status); void (*security_cfm) (struct hci_conn *conn, __u8 status, - __u8 encrypt); + __u8 encrypt); void (*key_change_cfm) (struct hci_conn *conn, __u8 status); void (*role_switch_cfm) (struct hci_conn *conn, __u8 status, __u8 role); };
+static inline void hci_cb_lookup(struct hci_conn *conn, struct list_head *list) +{ + struct hci_cb *cb, *cpy; + + rcu_read_lock(); + list_for_each_entry_rcu(cb, &hci_cb_list, list) { + if (cb->match && cb->match(conn)) { + cpy = kmalloc(sizeof(*cpy), GFP_ATOMIC); + if (!cpy) + break; + + *cpy = *cb; + INIT_LIST_HEAD(&cpy->list); + list_add_rcu(&cpy->list, list); + } + } + rcu_read_unlock(); +} + static inline void hci_connect_cfm(struct hci_conn *conn, __u8 status) { - struct hci_cb *cb; + struct list_head list; + struct hci_cb *cb, *tmp; + + INIT_LIST_HEAD(&list); + hci_cb_lookup(conn, &list);
- mutex_lock(&hci_cb_list_lock); - list_for_each_entry(cb, &hci_cb_list, list) { + list_for_each_entry_safe(cb, tmp, &list, list) { if (cb->connect_cfm) cb->connect_cfm(conn, status); + kfree(cb); } - mutex_unlock(&hci_cb_list_lock);
if (conn->connect_cfm_cb) conn->connect_cfm_cb(conn, status); @@ -1974,43 +1996,55 @@ static inline void hci_connect_cfm(struct hci_conn *conn, __u8 status)
static inline void hci_disconn_cfm(struct hci_conn *conn, __u8 reason) { - struct hci_cb *cb; + struct list_head list; + struct hci_cb *cb, *tmp; + + INIT_LIST_HEAD(&list); + hci_cb_lookup(conn, &list);
- mutex_lock(&hci_cb_list_lock); - list_for_each_entry(cb, &hci_cb_list, list) { + list_for_each_entry_safe(cb, tmp, &list, list) { if (cb->disconn_cfm) cb->disconn_cfm(conn, reason); + kfree(cb); } - mutex_unlock(&hci_cb_list_lock);
if (conn->disconn_cfm_cb) conn->disconn_cfm_cb(conn, reason); }
-static inline void hci_auth_cfm(struct hci_conn *conn, __u8 status) +static inline void hci_security_cfm(struct hci_conn *conn, __u8 status, + __u8 encrypt) { - struct hci_cb *cb; - __u8 encrypt; - - if (test_bit(HCI_CONN_ENCRYPT_PEND, &conn->flags)) - return; + struct list_head list; + struct hci_cb *cb, *tmp;
- encrypt = test_bit(HCI_CONN_ENCRYPT, &conn->flags) ? 0x01 : 0x00; + INIT_LIST_HEAD(&list); + hci_cb_lookup(conn, &list);
- mutex_lock(&hci_cb_list_lock); - list_for_each_entry(cb, &hci_cb_list, list) { + list_for_each_entry_safe(cb, tmp, &list, list) { if (cb->security_cfm) cb->security_cfm(conn, status, encrypt); + kfree(cb); } - mutex_unlock(&hci_cb_list_lock);
if (conn->security_cfm_cb) conn->security_cfm_cb(conn, status); }
+static inline void hci_auth_cfm(struct hci_conn *conn, __u8 status) +{ + __u8 encrypt; + + if (test_bit(HCI_CONN_ENCRYPT_PEND, &conn->flags)) + return; + + encrypt = test_bit(HCI_CONN_ENCRYPT, &conn->flags) ? 0x01 : 0x00; + + hci_security_cfm(conn, status, encrypt); +} + static inline void hci_encrypt_cfm(struct hci_conn *conn, __u8 status) { - struct hci_cb *cb; __u8 encrypt;
if (conn->state == BT_CONFIG) { @@ -2037,40 +2071,38 @@ static inline void hci_encrypt_cfm(struct hci_conn *conn, __u8 status) conn->sec_level = conn->pending_sec_level; }
- mutex_lock(&hci_cb_list_lock); - list_for_each_entry(cb, &hci_cb_list, list) { - if (cb->security_cfm) - cb->security_cfm(conn, status, encrypt); - } - mutex_unlock(&hci_cb_list_lock); - - if (conn->security_cfm_cb) - conn->security_cfm_cb(conn, status); + hci_security_cfm(conn, status, encrypt); }
static inline void hci_key_change_cfm(struct hci_conn *conn, __u8 status) { - struct hci_cb *cb; + struct list_head list; + struct hci_cb *cb, *tmp; + + INIT_LIST_HEAD(&list); + hci_cb_lookup(conn, &list);
- mutex_lock(&hci_cb_list_lock); - list_for_each_entry(cb, &hci_cb_list, list) { + list_for_each_entry_safe(cb, tmp, &list, list) { if (cb->key_change_cfm) cb->key_change_cfm(conn, status); + kfree(cb); } - mutex_unlock(&hci_cb_list_lock); }
static inline void hci_role_switch_cfm(struct hci_conn *conn, __u8 status, __u8 role) { - struct hci_cb *cb; + struct list_head list; + struct hci_cb *cb, *tmp; + + INIT_LIST_HEAD(&list); + hci_cb_lookup(conn, &list);
- mutex_lock(&hci_cb_list_lock); - list_for_each_entry(cb, &hci_cb_list, list) { + list_for_each_entry_safe(cb, tmp, &list, list) { if (cb->role_switch_cfm) cb->role_switch_cfm(conn, status, role); + kfree(cb); } - mutex_unlock(&hci_cb_list_lock); }
static inline bool hci_bdaddr_is_rpa(bdaddr_t *bdaddr, u8 addr_type) diff --git a/net/bluetooth/hci_core.c b/net/bluetooth/hci_core.c index 30519d47e8a6..f29fd3264401 100644 --- a/net/bluetooth/hci_core.c +++ b/net/bluetooth/hci_core.c @@ -58,7 +58,6 @@ DEFINE_RWLOCK(hci_dev_list_lock);
/* HCI callback list */ LIST_HEAD(hci_cb_list); -DEFINE_MUTEX(hci_cb_list_lock);
/* HCI ID Numbering */ static DEFINE_IDA(hci_index_ida); @@ -2957,9 +2956,7 @@ int hci_register_cb(struct hci_cb *cb) { BT_DBG("%p name %s", cb, cb->name);
- mutex_lock(&hci_cb_list_lock); - list_add_tail(&cb->list, &hci_cb_list); - mutex_unlock(&hci_cb_list_lock); + list_add_tail_rcu(&cb->list, &hci_cb_list);
return 0; } @@ -2969,9 +2966,8 @@ int hci_unregister_cb(struct hci_cb *cb) { BT_DBG("%p name %s", cb, cb->name);
- mutex_lock(&hci_cb_list_lock); - list_del(&cb->list); - mutex_unlock(&hci_cb_list_lock); + list_del_rcu(&cb->list); + synchronize_rcu();
return 0; } diff --git a/net/bluetooth/iso.c b/net/bluetooth/iso.c index b94d202bf374..f165cafa3aa9 100644 --- a/net/bluetooth/iso.c +++ b/net/bluetooth/iso.c @@ -1929,6 +1929,11 @@ int iso_connect_ind(struct hci_dev *hdev, bdaddr_t *bdaddr, __u8 *flags) return lm; }
+static bool iso_match(struct hci_conn *hcon) +{ + return hcon->type == ISO_LINK || hcon->type == LE_LINK; +} + static void iso_connect_cfm(struct hci_conn *hcon, __u8 status) { if (hcon->type != ISO_LINK) { @@ -2110,6 +2115,7 @@ void iso_recv(struct hci_conn *hcon, struct sk_buff *skb, u16 flags)
static struct hci_cb iso_cb = { .name = "ISO", + .match = iso_match, .connect_cfm = iso_connect_cfm, .disconn_cfm = iso_disconn_cfm, }; diff --git a/net/bluetooth/l2cap_core.c b/net/bluetooth/l2cap_core.c index 93651c421767..acb148759bd0 100644 --- a/net/bluetooth/l2cap_core.c +++ b/net/bluetooth/l2cap_core.c @@ -7223,6 +7223,11 @@ static struct l2cap_chan *l2cap_global_fixed_chan(struct l2cap_chan *c, return NULL; }
+static bool l2cap_match(struct hci_conn *hcon) +{ + return hcon->type == ACL_LINK || hcon->type == LE_LINK; +} + static void l2cap_connect_cfm(struct hci_conn *hcon, u8 status) { struct hci_dev *hdev = hcon->hdev; @@ -7230,9 +7235,6 @@ static void l2cap_connect_cfm(struct hci_conn *hcon, u8 status) struct l2cap_chan *pchan; u8 dst_type;
- if (hcon->type != ACL_LINK && hcon->type != LE_LINK) - return; - BT_DBG("hcon %p bdaddr %pMR status %d", hcon, &hcon->dst, status);
if (status) { @@ -7297,9 +7299,6 @@ int l2cap_disconn_ind(struct hci_conn *hcon)
static void l2cap_disconn_cfm(struct hci_conn *hcon, u8 reason) { - if (hcon->type != ACL_LINK && hcon->type != LE_LINK) - return; - BT_DBG("hcon %p reason %d", hcon, reason);
l2cap_conn_del(hcon, bt_to_errno(reason)); @@ -7578,6 +7577,7 @@ void l2cap_recv_acldata(struct hci_conn *hcon, struct sk_buff *skb, u16 flags)
static struct hci_cb l2cap_cb = { .name = "L2CAP", + .match = l2cap_match, .connect_cfm = l2cap_connect_cfm, .disconn_cfm = l2cap_disconn_cfm, .security_cfm = l2cap_security_cfm, diff --git a/net/bluetooth/rfcomm/core.c b/net/bluetooth/rfcomm/core.c index 1d34d8497033..9d46afb24caf 100644 --- a/net/bluetooth/rfcomm/core.c +++ b/net/bluetooth/rfcomm/core.c @@ -2134,6 +2134,11 @@ static int rfcomm_run(void *unused) return 0; }
+static bool rfcomm_match(struct hci_conn *hcon) +{ + return hcon->type == ACL_LINK; +} + static void rfcomm_security_cfm(struct hci_conn *conn, u8 status, u8 encrypt) { struct rfcomm_session *s; @@ -2180,6 +2185,7 @@ static void rfcomm_security_cfm(struct hci_conn *conn, u8 status, u8 encrypt)
static struct hci_cb rfcomm_cb = { .name = "RFCOMM", + .match = rfcomm_match, .security_cfm = rfcomm_security_cfm };
diff --git a/net/bluetooth/sco.c b/net/bluetooth/sco.c index 64d4d57c7033..c4c36ff25fb2 100644 --- a/net/bluetooth/sco.c +++ b/net/bluetooth/sco.c @@ -1353,11 +1353,13 @@ int sco_connect_ind(struct hci_dev *hdev, bdaddr_t *bdaddr, __u8 *flags) return lm; }
-static void sco_connect_cfm(struct hci_conn *hcon, __u8 status) +static bool sco_match(struct hci_conn *hcon) { - if (hcon->type != SCO_LINK && hcon->type != ESCO_LINK) - return; + return hcon->type == SCO_LINK || hcon->type == ESCO_LINK; +}
+static void sco_connect_cfm(struct hci_conn *hcon, __u8 status) +{ BT_DBG("hcon %p bdaddr %pMR status %u", hcon, &hcon->dst, status);
if (!status) { @@ -1372,9 +1374,6 @@ static void sco_connect_cfm(struct hci_conn *hcon, __u8 status)
static void sco_disconn_cfm(struct hci_conn *hcon, __u8 reason) { - if (hcon->type != SCO_LINK && hcon->type != ESCO_LINK) - return; - BT_DBG("hcon %p reason %d", hcon, reason);
sco_conn_del(hcon, bt_to_errno(reason)); @@ -1400,6 +1399,7 @@ void sco_recv_scodata(struct hci_conn *hcon, struct sk_buff *skb)
static struct hci_cb sco_cb = { .name = "SCO", + .match = sco_match, .connect_cfm = sco_connect_cfm, .disconn_cfm = sco_disconn_cfm, };
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Uros Bizjak ubizjak@gmail.com
[ Upstream commit a1855f1b7c33642c9f7a01991fb763342a312e9b ]
percpu_base is used in various percpu functions that expect variable in __percpu address space. Correct the declaration of percpu_base to
void __iomem * __percpu *percpu_base;
to declare the variable as __percpu pointer.
The patch fixes several sparse warnings:
irq-gic.c:1172:44: warning: incorrect type in assignment (different address spaces) irq-gic.c:1172:44: expected void [noderef] __percpu *[noderef] __iomem *percpu_base irq-gic.c:1172:44: got void [noderef] __iomem *[noderef] __percpu * ... irq-gic.c:1231:43: warning: incorrect type in argument 1 (different address spaces) irq-gic.c:1231:43: expected void [noderef] __percpu *__pdata irq-gic.c:1231:43: got void [noderef] __percpu *[noderef] __iomem *percpu_base
There were no changes in the resulting object files.
Signed-off-by: Uros Bizjak ubizjak@gmail.com Signed-off-by: Thomas Gleixner tglx@linutronix.de Acked-by: Marc Zyngier maz@kernel.org Link: https://lore.kernel.org/all/20241213145809.2918-2-ubizjak@gmail.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/irqchip/irq-gic.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/irqchip/irq-gic.c b/drivers/irqchip/irq-gic.c index 412196a7dad5..2c6c50348afd 100644 --- a/drivers/irqchip/irq-gic.c +++ b/drivers/irqchip/irq-gic.c @@ -64,7 +64,7 @@ static void gic_check_cpu_features(void)
union gic_base { void __iomem *common_base; - void __percpu * __iomem *percpu_base; + void __iomem * __percpu *percpu_base; };
struct gic_chip_data {
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Leon Romanovsky leonro@nvidia.com
[ Upstream commit 824927e88456331c7a999fdf5d9d27923b619590 ]
ARC GCC compiler is packaged starting from Fedora 39i and the GCC variant of cross compile tools has arc-linux-gnu- prefix and not arc-linux-. This is causing that CROSS_COMPILE variable is left unset.
This change allows builds without need to supply CROSS_COMPILE argument if distro package is used.
Before this change: $ make -j 128 ARCH=arc W=1 drivers/infiniband/hw/mlx4/ gcc: warning: ‘-mcpu=’ is deprecated; use ‘-mtune=’ or ‘-march=’ instead gcc: error: unrecognized command-line option ‘-mmedium-calls’ gcc: error: unrecognized command-line option ‘-mlock’ gcc: error: unrecognized command-line option ‘-munaligned-access’
[1] https://packages.fedoraproject.org/pkgs/cross-gcc/gcc-arc-linux-gnu/index.ht... Signed-off-by: Leon Romanovsky leonro@nvidia.com Signed-off-by: Vineet Gupta vgupta@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- arch/arc/Makefile | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/arc/Makefile b/arch/arc/Makefile index 2390dd042e36..fb98478ed1ab 100644 --- a/arch/arc/Makefile +++ b/arch/arc/Makefile @@ -6,7 +6,7 @@ KBUILD_DEFCONFIG := haps_hs_smp_defconfig
ifeq ($(CROSS_COMPILE),) -CROSS_COMPILE := $(call cc-cross-prefix, arc-linux- arceb-linux-) +CROSS_COMPILE := $(call cc-cross-prefix, arc-linux- arceb-linux- arc-linux-gnu-) endif
cflags-y += -fno-common -pipe -fno-builtin -mmedium-calls -D__linux__
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Nathan Lynch nathanl@linux.ibm.com
[ Upstream commit 7a8e9cdf9405819105ae7405cd91e482bf574b01 ]
Using the address operator on the array doesn't work:
./include/linux/seq_buf.h:27:27: error: initialization of ‘char *’ from incompatible pointer type ‘char (*)[128]’ [-Werror=incompatible-pointer-types] 27 | .buffer = &__ ## NAME ## _buffer, \ | ^
Apart from fixing that, we can improve DECLARE_SEQ_BUF() by using a compound literal to define the buffer array without attaching a name to it. This makes the macro a single statement, allowing constructs such as:
static DECLARE_SEQ_BUF(my_seq_buf, MYSB_SIZE);
to work as intended.
Link: https://lkml.kernel.org/r/20240116-declare-seq-buf-fix-v1-1-915db4692f32@lin...
Cc: stable@vger.kernel.org Acked-by: Kees Cook keescook@chromium.org Fixes: dcc4e5728eea ("seq_buf: Introduce DECLARE_SEQ_BUF and seq_buf_str()") Signed-off-by: Nathan Lynch nathanl@linux.ibm.com Signed-off-by: Steven Rostedt (Google) rostedt@goodmis.org Signed-off-by: Sasha Levin sashal@kernel.org --- include/linux/seq_buf.h | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-)
diff --git a/include/linux/seq_buf.h b/include/linux/seq_buf.h index d9db59f420a4..468d8c5eef4a 100644 --- a/include/linux/seq_buf.h +++ b/include/linux/seq_buf.h @@ -22,9 +22,8 @@ struct seq_buf { };
#define DECLARE_SEQ_BUF(NAME, SIZE) \ - char __ ## NAME ## _buffer[SIZE] = ""; \ struct seq_buf NAME = { \ - .buffer = &__ ## NAME ## _buffer, \ + .buffer = (char[SIZE]) { 0 }, \ .size = SIZE, \ }
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Selvin Xavier selvin.xavier@broadcom.com
[ Upstream commit 227f51743b61fe3f6fc481f0fb8086bf8c49b8c9 ]
When variable size WQE is supported, max_qp_sges reported is more than 6. For devices that supports variable size WQE, the Send WQE size calculation is wrong when an an older library that doesn't support variable size WQE is used.
Set the WQE size to 128 when static WQE is supported.
Fixes: de1d364c3815 ("RDMA/bnxt_re: Add support for Variable WQE in Genp7 adapters") Signed-off-by: Selvin Xavier selvin.xavier@broadcom.com Link: https://patch.msgid.link/1725444253-13221-3-git-send-email-selvin.xavier@bro... Signed-off-by: Leon Romanovsky leon@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/infiniband/hw/bnxt_re/ib_verbs.c | 21 ++++++++++----------- drivers/infiniband/hw/bnxt_re/qplib_sp.h | 2 ++ 2 files changed, 12 insertions(+), 11 deletions(-)
diff --git a/drivers/infiniband/hw/bnxt_re/ib_verbs.c b/drivers/infiniband/hw/bnxt_re/ib_verbs.c index 540998ddbb44..13c65ec58256 100644 --- a/drivers/infiniband/hw/bnxt_re/ib_verbs.c +++ b/drivers/infiniband/hw/bnxt_re/ib_verbs.c @@ -992,23 +992,22 @@ static int bnxt_re_setup_swqe_size(struct bnxt_re_qp *qp, align = sizeof(struct sq_send_hdr); ilsize = ALIGN(init_attr->cap.max_inline_data, align);
- sq->wqe_size = bnxt_re_get_wqe_size(ilsize, sq->max_sge); - if (sq->wqe_size > bnxt_re_get_swqe_size(dev_attr->max_qp_sges)) - return -EINVAL; - /* For gen p4 and gen p5 backward compatibility mode - * wqe size is fixed to 128 bytes + /* For gen p4 and gen p5 fixed wqe compatibility mode + * wqe size is fixed to 128 bytes - ie 6 SGEs */ - if (sq->wqe_size < bnxt_re_get_swqe_size(dev_attr->max_qp_sges) && - qplqp->wqe_mode == BNXT_QPLIB_WQE_MODE_STATIC) - sq->wqe_size = bnxt_re_get_swqe_size(dev_attr->max_qp_sges); + if (qplqp->wqe_mode == BNXT_QPLIB_WQE_MODE_STATIC) { + sq->wqe_size = bnxt_re_get_swqe_size(BNXT_STATIC_MAX_SGE); + sq->max_sge = BNXT_STATIC_MAX_SGE; + } else { + sq->wqe_size = bnxt_re_get_wqe_size(ilsize, sq->max_sge); + if (sq->wqe_size > bnxt_re_get_swqe_size(dev_attr->max_qp_sges)) + return -EINVAL; + }
if (init_attr->cap.max_inline_data) { qplqp->max_inline_data = sq->wqe_size - sizeof(struct sq_send_hdr); init_attr->cap.max_inline_data = qplqp->max_inline_data; - if (qplqp->wqe_mode == BNXT_QPLIB_WQE_MODE_STATIC) - sq->max_sge = qplqp->max_inline_data / - sizeof(struct sq_sge); }
return 0; diff --git a/drivers/infiniband/hw/bnxt_re/qplib_sp.h b/drivers/infiniband/hw/bnxt_re/qplib_sp.h index b91e6a85e75d..aeacd0a9a92c 100644 --- a/drivers/infiniband/hw/bnxt_re/qplib_sp.h +++ b/drivers/infiniband/hw/bnxt_re/qplib_sp.h @@ -358,4 +358,6 @@ int bnxt_qplib_modify_cc(struct bnxt_qplib_res *res, #define BNXT_VAR_MAX_SGE 13 #define BNXT_RE_MAX_RQ_WQES 65536
+#define BNXT_STATIC_MAX_SGE 6 + #endif /* __BNXT_QPLIB_SP_H__*/
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Masahiro Yamada masahiroy@kernel.org
[ Upstream commit 77dc55a978e69625f9718460012e5ef0172dc4de ]
When building a 64-bit kernel on a 32-bit build host, incorrect input MODULE_ALIAS() entries may be generated.
For example, when compiling a 64-bit kernel with CONFIG_INPUT_MOUSEDEV=m on a 64-bit build machine, you will get the correct output:
$ grep MODULE_ALIAS drivers/input/mousedev.mod.c MODULE_ALIAS("input:b*v*p*e*-e*1,*2,*k*110,*r*0,*1,*a*m*l*s*f*w*"); MODULE_ALIAS("input:b*v*p*e*-e*1,*2,*k*r*8,*a*m*l*s*f*w*"); MODULE_ALIAS("input:b*v*p*e*-e*1,*3,*k*14A,*r*a*0,*1,*m*l*s*f*w*"); MODULE_ALIAS("input:b*v*p*e*-e*1,*3,*k*145,*r*a*0,*1,*18,*1C,*m*l*s*f*w*"); MODULE_ALIAS("input:b*v*p*e*-e*1,*3,*k*110,*r*a*0,*1,*m*l*s*f*w*");
However, building the same kernel on a 32-bit machine results in incorrect output:
$ grep MODULE_ALIAS drivers/input/mousedev.mod.c MODULE_ALIAS("input:b*v*p*e*-e*1,*2,*k*110,*130,*r*0,*1,*a*m*l*s*f*w*"); MODULE_ALIAS("input:b*v*p*e*-e*1,*2,*k*r*8,*a*m*l*s*f*w*"); MODULE_ALIAS("input:b*v*p*e*-e*1,*3,*k*14A,*16A,*r*a*0,*1,*20,*21,*m*l*s*f*w*"); MODULE_ALIAS("input:b*v*p*e*-e*1,*3,*k*145,*165,*r*a*0,*1,*18,*1C,*20,*21,*38,*3C,*m*l*s*f*w*"); MODULE_ALIAS("input:b*v*p*e*-e*1,*3,*k*110,*130,*r*a*0,*1,*20,*21,*m*l*s*f*w*");
A similar issue occurs with CONFIG_INPUT_JOYDEV=m. On a 64-bit build machine, the output is:
$ grep MODULE_ALIAS drivers/input/joydev.mod.c MODULE_ALIAS("input:b*v*p*e*-e*3,*k*r*a*0,*m*l*s*f*w*"); MODULE_ALIAS("input:b*v*p*e*-e*3,*k*r*a*2,*m*l*s*f*w*"); MODULE_ALIAS("input:b*v*p*e*-e*3,*k*r*a*8,*m*l*s*f*w*"); MODULE_ALIAS("input:b*v*p*e*-e*3,*k*r*a*6,*m*l*s*f*w*"); MODULE_ALIAS("input:b*v*p*e*-e*1,*k*120,*r*a*m*l*s*f*w*"); MODULE_ALIAS("input:b*v*p*e*-e*1,*k*130,*r*a*m*l*s*f*w*"); MODULE_ALIAS("input:b*v*p*e*-e*1,*k*2C0,*r*a*m*l*s*f*w*");
However, on a 32-bit machine, the output is incorrect:
$ grep MODULE_ALIAS drivers/input/joydev.mod.c MODULE_ALIAS("input:b*v*p*e*-e*3,*k*r*a*0,*20,*m*l*s*f*w*"); MODULE_ALIAS("input:b*v*p*e*-e*3,*k*r*a*2,*22,*m*l*s*f*w*"); MODULE_ALIAS("input:b*v*p*e*-e*3,*k*r*a*8,*28,*m*l*s*f*w*"); MODULE_ALIAS("input:b*v*p*e*-e*3,*k*r*a*6,*26,*m*l*s*f*w*"); MODULE_ALIAS("input:b*v*p*e*-e*1,*k*11F,*13F,*r*a*m*l*s*f*w*"); MODULE_ALIAS("input:b*v*p*e*-e*1,*k*11F,*13F,*r*a*m*l*s*f*w*"); MODULE_ALIAS("input:b*v*p*e*-e*1,*k*2C0,*2E0,*r*a*m*l*s*f*w*");
When building a 64-bit kernel, BITS_PER_LONG is defined as 64. However, on a 32-bit build machine, the constant 1L is a signed 32-bit value. Left-shifting it beyond 32 bits causes wraparound, and shifting by 31 or 63 bits makes it a negative value.
The fix in commit e0e92632715f ("[PATCH] PATCH: 1 line 2.6.18 bugfix: modpost-64bit-fix.patch") is incorrect; it only addresses cases where a 64-bit kernel is built on a 64-bit build machine, overlooking cases on a 32-bit build machine.
Using 1ULL ensures a 64-bit width on both 32-bit and 64-bit machines, avoiding the wraparound issue.
Fixes: e0e92632715f ("[PATCH] PATCH: 1 line 2.6.18 bugfix: modpost-64bit-fix.patch") Signed-off-by: Masahiro Yamada masahiroy@kernel.org Stable-dep-of: bf36b4bf1b9a ("modpost: fix the missed iteration for the max bit in do_input()") Signed-off-by: Sasha Levin sashal@kernel.org --- scripts/mod/file2alias.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/scripts/mod/file2alias.c b/scripts/mod/file2alias.c index efbb4836ec66..96f37fe1b992 100644 --- a/scripts/mod/file2alias.c +++ b/scripts/mod/file2alias.c @@ -743,7 +743,7 @@ static void do_input(char *alias, for (i = min / BITS_PER_LONG; i < max / BITS_PER_LONG + 1; i++) arr[i] = TO_NATIVE(arr[i]); for (i = min; i < max; i++) - if (arr[i / BITS_PER_LONG] & (1L << (i%BITS_PER_LONG))) + if (arr[i / BITS_PER_LONG] & (1ULL << (i%BITS_PER_LONG))) sprintf(alias + strlen(alias), "%X,*", i); }
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Masahiro Yamada masahiroy@kernel.org
[ Upstream commit bf36b4bf1b9a7a0015610e2f038ee84ddb085de2 ]
This loop should iterate over the range from 'min' to 'max' inclusively. The last interation is missed.
Fixes: 1d8f430c15b3 ("[PATCH] Input: add modalias support") Signed-off-by: Masahiro Yamada masahiroy@kernel.org Tested-by: John Paul Adrian Glaubitz glaubitz@physik.fu-berlin.de Signed-off-by: Sasha Levin sashal@kernel.org --- scripts/mod/file2alias.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/scripts/mod/file2alias.c b/scripts/mod/file2alias.c index 96f37fe1b992..ea498eff1f2a 100644 --- a/scripts/mod/file2alias.c +++ b/scripts/mod/file2alias.c @@ -742,7 +742,7 @@ static void do_input(char *alias,
for (i = min / BITS_PER_LONG; i < max / BITS_PER_LONG + 1; i++) arr[i] = TO_NATIVE(arr[i]); - for (i = min; i < max; i++) + for (i = min; i <= max; i++) if (arr[i / BITS_PER_LONG] & (1ULL << (i%BITS_PER_LONG))) sprintf(alias + strlen(alias), "%X,*", i); }
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Shung-Hsi Yu shung-hsi.yu@suse.com
Revert commit ecc2aeeaa08a355d84d3ca9c3d2512399a194f29 which is commit 41f6f64e6999a837048b1bd13a2f8742964eca6b upstream.
Levi reported that commit ecc2aeeaa08a ("bpf: support non-r10 register spill/fill to/from stack in precision tracking") cause eBPF program that previously loads successfully in stable 6.6 now fails to load, when the same program also loads successfully in v6.13-rc5.
Revert ecc2aeeaa08a until the problem has been probably figured out and resolved.
Fixes: ecc2aeeaa08a ("bpf: support non-r10 register spill/fill to/from stack in precision tracking") Reported-by: Levi Zim rsworktech@outlook.com Link: https://lore.kernel.org/stable/MEYP282MB2312C3C8801476C4F262D6E1C6162@MEYP28... Signed-off-by: Shung-Hsi Yu shung-hsi.yu@suse.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- include/linux/bpf_verifier.h | 31 - kernel/bpf/verifier.c | 175 ++++------ tools/testing/selftests/bpf/progs/verifier_subprog_precision.c | 23 - tools/testing/selftests/bpf/verifier/precise.c | 38 -- 4 files changed, 98 insertions(+), 169 deletions(-)
--- a/include/linux/bpf_verifier.h +++ b/include/linux/bpf_verifier.h @@ -319,34 +319,12 @@ struct bpf_func_state { struct bpf_stack_state *stack; };
-#define MAX_CALL_FRAMES 8 - -/* instruction history flags, used in bpf_jmp_history_entry.flags field */ -enum { - /* instruction references stack slot through PTR_TO_STACK register; - * we also store stack's frame number in lower 3 bits (MAX_CALL_FRAMES is 8) - * and accessed stack slot's index in next 6 bits (MAX_BPF_STACK is 512, - * 8 bytes per slot, so slot index (spi) is [0, 63]) - */ - INSN_F_FRAMENO_MASK = 0x7, /* 3 bits */ - - INSN_F_SPI_MASK = 0x3f, /* 6 bits */ - INSN_F_SPI_SHIFT = 3, /* shifted 3 bits to the left */ - - INSN_F_STACK_ACCESS = BIT(9), /* we need 10 bits total */ -}; - -static_assert(INSN_F_FRAMENO_MASK + 1 >= MAX_CALL_FRAMES); -static_assert(INSN_F_SPI_MASK + 1 >= MAX_BPF_STACK / 8); - -struct bpf_jmp_history_entry { +struct bpf_idx_pair { + u32 prev_idx; u32 idx; - /* insn idx can't be bigger than 1 million */ - u32 prev_idx : 22; - /* special flags, e.g., whether insn is doing register stack spill/load */ - u32 flags : 10; };
+#define MAX_CALL_FRAMES 8 /* Maximum number of register states that can exist at once */ #define BPF_ID_MAP_SIZE ((MAX_BPF_REG + MAX_BPF_STACK / BPF_REG_SIZE) * MAX_CALL_FRAMES) struct bpf_verifier_state { @@ -429,7 +407,7 @@ struct bpf_verifier_state { * For most states jmp_history_cnt is [0-3]. * For loops can go up to ~40. */ - struct bpf_jmp_history_entry *jmp_history; + struct bpf_idx_pair *jmp_history; u32 jmp_history_cnt; u32 dfs_depth; u32 callback_unroll_depth; @@ -662,7 +640,6 @@ struct bpf_verifier_env { int cur_stack; } cfg; struct backtrack_state bt; - struct bpf_jmp_history_entry *cur_hist_ent; u32 pass_cnt; /* number of times do_check() was called */ u32 subprog_cnt; /* number of instructions analyzed by the verifier */ --- a/kernel/bpf/verifier.c +++ b/kernel/bpf/verifier.c @@ -1762,8 +1762,8 @@ static int copy_verifier_state(struct bp int i, err;
dst_state->jmp_history = copy_array(dst_state->jmp_history, src->jmp_history, - src->jmp_history_cnt, sizeof(*dst_state->jmp_history), - GFP_USER); + src->jmp_history_cnt, sizeof(struct bpf_idx_pair), + GFP_USER); if (!dst_state->jmp_history) return -ENOMEM; dst_state->jmp_history_cnt = src->jmp_history_cnt; @@ -3397,21 +3397,6 @@ static int check_reg_arg(struct bpf_veri return __check_reg_arg(env, state->regs, regno, t); }
-static int insn_stack_access_flags(int frameno, int spi) -{ - return INSN_F_STACK_ACCESS | (spi << INSN_F_SPI_SHIFT) | frameno; -} - -static int insn_stack_access_spi(int insn_flags) -{ - return (insn_flags >> INSN_F_SPI_SHIFT) & INSN_F_SPI_MASK; -} - -static int insn_stack_access_frameno(int insn_flags) -{ - return insn_flags & INSN_F_FRAMENO_MASK; -} - static void mark_jmp_point(struct bpf_verifier_env *env, int idx) { env->insn_aux_data[idx].jmp_point = true; @@ -3423,51 +3408,28 @@ static bool is_jmp_point(struct bpf_veri }
/* for any branch, call, exit record the history of jmps in the given state */ -static int push_jmp_history(struct bpf_verifier_env *env, struct bpf_verifier_state *cur, - int insn_flags) +static int push_jmp_history(struct bpf_verifier_env *env, + struct bpf_verifier_state *cur) { u32 cnt = cur->jmp_history_cnt; - struct bpf_jmp_history_entry *p; + struct bpf_idx_pair *p; size_t alloc_size;
- /* combine instruction flags if we already recorded this instruction */ - if (env->cur_hist_ent) { - /* atomic instructions push insn_flags twice, for READ and - * WRITE sides, but they should agree on stack slot - */ - WARN_ONCE((env->cur_hist_ent->flags & insn_flags) && - (env->cur_hist_ent->flags & insn_flags) != insn_flags, - "verifier insn history bug: insn_idx %d cur flags %x new flags %x\n", - env->insn_idx, env->cur_hist_ent->flags, insn_flags); - env->cur_hist_ent->flags |= insn_flags; + if (!is_jmp_point(env, env->insn_idx)) return 0; - }
cnt++; alloc_size = kmalloc_size_roundup(size_mul(cnt, sizeof(*p))); p = krealloc(cur->jmp_history, alloc_size, GFP_USER); if (!p) return -ENOMEM; + p[cnt - 1].idx = env->insn_idx; + p[cnt - 1].prev_idx = env->prev_insn_idx; cur->jmp_history = p; - - p = &cur->jmp_history[cnt - 1]; - p->idx = env->insn_idx; - p->prev_idx = env->prev_insn_idx; - p->flags = insn_flags; cur->jmp_history_cnt = cnt; - env->cur_hist_ent = p; - return 0; }
-static struct bpf_jmp_history_entry *get_jmp_hist_entry(struct bpf_verifier_state *st, - u32 hist_end, int insn_idx) -{ - if (hist_end > 0 && st->jmp_history[hist_end - 1].idx == insn_idx) - return &st->jmp_history[hist_end - 1]; - return NULL; -} - /* Backtrack one insn at a time. If idx is not at the top of recorded * history then previous instruction came from straight line execution. * Return -ENOENT if we exhausted all instructions within given state. @@ -3629,14 +3591,9 @@ static inline bool bt_is_reg_set(struct return bt->reg_masks[bt->frame] & (1 << reg); }
-static inline bool bt_is_frame_slot_set(struct backtrack_state *bt, u32 frame, u32 slot) -{ - return bt->stack_masks[frame] & (1ull << slot); -} - static inline bool bt_is_slot_set(struct backtrack_state *bt, u32 slot) { - return bt_is_frame_slot_set(bt, bt->frame, slot); + return bt->stack_masks[bt->frame] & (1ull << slot); }
/* format registers bitmask, e.g., "r0,r2,r4" for 0x15 mask */ @@ -3690,7 +3647,7 @@ static bool calls_callback(struct bpf_ve * - *was* processed previously during backtracking. */ static int backtrack_insn(struct bpf_verifier_env *env, int idx, int subseq_idx, - struct bpf_jmp_history_entry *hist, struct backtrack_state *bt) + struct backtrack_state *bt) { const struct bpf_insn_cbs cbs = { .cb_call = disasm_kfunc_name, @@ -3703,7 +3660,7 @@ static int backtrack_insn(struct bpf_ver u8 mode = BPF_MODE(insn->code); u32 dreg = insn->dst_reg; u32 sreg = insn->src_reg; - u32 spi, i, fr; + u32 spi, i;
if (insn->code == 0) return 0; @@ -3766,15 +3723,20 @@ static int backtrack_insn(struct bpf_ver * by 'precise' mark in corresponding register of this state. * No further tracking necessary. */ - if (!hist || !(hist->flags & INSN_F_STACK_ACCESS)) + if (insn->src_reg != BPF_REG_FP) return 0; + /* dreg = *(u64 *)[fp - off] was a fill from the stack. * that [fp - off] slot contains scalar that needs to be * tracked with precision */ - spi = insn_stack_access_spi(hist->flags); - fr = insn_stack_access_frameno(hist->flags); - bt_set_frame_slot(bt, fr, spi); + spi = (-insn->off - 1) / BPF_REG_SIZE; + if (spi >= 64) { + verbose(env, "BUG spi %d\n", spi); + WARN_ONCE(1, "verifier backtracking bug"); + return -EFAULT; + } + bt_set_slot(bt, spi); } else if (class == BPF_STX || class == BPF_ST) { if (bt_is_reg_set(bt, dreg)) /* stx & st shouldn't be using _scalar_ dst_reg @@ -3783,13 +3745,17 @@ static int backtrack_insn(struct bpf_ver */ return -ENOTSUPP; /* scalars can only be spilled into stack */ - if (!hist || !(hist->flags & INSN_F_STACK_ACCESS)) + if (insn->dst_reg != BPF_REG_FP) return 0; - spi = insn_stack_access_spi(hist->flags); - fr = insn_stack_access_frameno(hist->flags); - if (!bt_is_frame_slot_set(bt, fr, spi)) + spi = (-insn->off - 1) / BPF_REG_SIZE; + if (spi >= 64) { + verbose(env, "BUG spi %d\n", spi); + WARN_ONCE(1, "verifier backtracking bug"); + return -EFAULT; + } + if (!bt_is_slot_set(bt, spi)) return 0; - bt_clear_frame_slot(bt, fr, spi); + bt_clear_slot(bt, spi); if (class == BPF_STX) bt_set_reg(bt, sreg); } else if (class == BPF_JMP || class == BPF_JMP32) { @@ -3833,14 +3799,10 @@ static int backtrack_insn(struct bpf_ver WARN_ONCE(1, "verifier backtracking bug"); return -EFAULT; } - /* we are now tracking register spills correctly, - * so any instance of leftover slots is a bug - */ - if (bt_stack_mask(bt) != 0) { - verbose(env, "BUG stack slots %llx\n", bt_stack_mask(bt)); - WARN_ONCE(1, "verifier backtracking bug (subprog leftover stack slots)"); - return -EFAULT; - } + /* we don't track register spills perfectly, + * so fallback to force-precise instead of failing */ + if (bt_stack_mask(bt) != 0) + return -ENOTSUPP; /* propagate r1-r5 to the caller */ for (i = BPF_REG_1; i <= BPF_REG_5; i++) { if (bt_is_reg_set(bt, i)) { @@ -3865,11 +3827,8 @@ static int backtrack_insn(struct bpf_ver WARN_ONCE(1, "verifier backtracking bug"); return -EFAULT; } - if (bt_stack_mask(bt) != 0) { - verbose(env, "BUG stack slots %llx\n", bt_stack_mask(bt)); - WARN_ONCE(1, "verifier backtracking bug (callback leftover stack slots)"); - return -EFAULT; - } + if (bt_stack_mask(bt) != 0) + return -ENOTSUPP; /* clear r1-r5 in callback subprog's mask */ for (i = BPF_REG_1; i <= BPF_REG_5; i++) bt_clear_reg(bt, i); @@ -4306,7 +4265,6 @@ static int __mark_chain_precision(struct for (;;) { DECLARE_BITMAP(mask, 64); u32 history = st->jmp_history_cnt; - struct bpf_jmp_history_entry *hist;
if (env->log.level & BPF_LOG_LEVEL2) { verbose(env, "mark_precise: frame%d: last_idx %d first_idx %d subseq_idx %d \n", @@ -4370,8 +4328,7 @@ static int __mark_chain_precision(struct err = 0; skip_first = false; } else { - hist = get_jmp_hist_entry(st, history, i); - err = backtrack_insn(env, i, subseq_idx, hist, bt); + err = backtrack_insn(env, i, subseq_idx, bt); } if (err == -ENOTSUPP) { mark_all_scalars_precise(env, env->cur_state); @@ -4424,10 +4381,22 @@ static int __mark_chain_precision(struct bitmap_from_u64(mask, bt_frame_stack_mask(bt, fr)); for_each_set_bit(i, mask, 64) { if (i >= func->allocated_stack / BPF_REG_SIZE) { - verbose(env, "BUG backtracking (stack slot %d, total slots %d)\n", - i, func->allocated_stack / BPF_REG_SIZE); - WARN_ONCE(1, "verifier backtracking bug (stack slot out of bounds)"); - return -EFAULT; + /* the sequence of instructions: + * 2: (bf) r3 = r10 + * 3: (7b) *(u64 *)(r3 -8) = r0 + * 4: (79) r4 = *(u64 *)(r10 -8) + * doesn't contain jmps. It's backtracked + * as a single block. + * During backtracking insn 3 is not recognized as + * stack access, so at the end of backtracking + * stack slot fp-8 is still marked in stack_mask. + * However the parent state may not have accessed + * fp-8 and it's "unallocated" stack space. + * In such case fallback to conservative. + */ + mark_all_scalars_precise(env, env->cur_state); + bt_reset(bt); + return 0; }
if (!is_spilled_scalar_reg(&func->stack[i])) { @@ -4592,7 +4561,7 @@ static int check_stack_write_fixed_off(s int i, slot = -off - 1, spi = slot / BPF_REG_SIZE, err; struct bpf_insn *insn = &env->prog->insnsi[insn_idx]; struct bpf_reg_state *reg = NULL; - int insn_flags = insn_stack_access_flags(state->frameno, spi); + u32 dst_reg = insn->dst_reg;
/* caller checked that off % size == 0 and -MAX_BPF_STACK <= off < 0, * so it's aligned access and [off, off + size) are within stack limits @@ -4631,6 +4600,17 @@ static int check_stack_write_fixed_off(s mark_stack_slot_scratched(env, spi); if (reg && !(off % BPF_REG_SIZE) && register_is_bounded(reg) && !register_is_null(reg) && env->bpf_capable) { + if (dst_reg != BPF_REG_FP) { + /* The backtracking logic can only recognize explicit + * stack slot address like [fp - 8]. Other spill of + * scalar via different register has to be conservative. + * Backtrack from here and mark all registers as precise + * that contributed into 'reg' being a constant. + */ + err = mark_chain_precision(env, value_regno); + if (err) + return err; + } save_register_state(state, spi, reg, size); /* Break the relation on a narrowing spill. */ if (fls64(reg->umax_value) > BITS_PER_BYTE * size) @@ -4642,7 +4622,6 @@ static int check_stack_write_fixed_off(s __mark_reg_known(&fake_reg, insn->imm); fake_reg.type = SCALAR_VALUE; save_register_state(state, spi, &fake_reg, size); - insn_flags = 0; /* not a register spill */ } else if (reg && is_spillable_regtype(reg->type)) { /* register containing pointer is being spilled into stack */ if (size != BPF_REG_SIZE) { @@ -4688,12 +4667,9 @@ static int check_stack_write_fixed_off(s
/* Mark slots affected by this stack write. */ for (i = 0; i < size; i++) - state->stack[spi].slot_type[(slot - i) % BPF_REG_SIZE] = type; - insn_flags = 0; /* not a register spill */ + state->stack[spi].slot_type[(slot - i) % BPF_REG_SIZE] = + type; } - - if (insn_flags) - return push_jmp_history(env, env->cur_state, insn_flags); return 0; }
@@ -4882,7 +4858,6 @@ static int check_stack_read_fixed_off(st int i, slot = -off - 1, spi = slot / BPF_REG_SIZE; struct bpf_reg_state *reg; u8 *stype, type; - int insn_flags = insn_stack_access_flags(reg_state->frameno, spi);
stype = reg_state->stack[spi].slot_type; reg = ®_state->stack[spi].spilled_ptr; @@ -4928,10 +4903,12 @@ static int check_stack_read_fixed_off(st return -EACCES; } mark_reg_unknown(env, state->regs, dst_regno); - insn_flags = 0; /* not restoring original register state */ } state->regs[dst_regno].live |= REG_LIVE_WRITTEN; - } else if (dst_regno >= 0) { + return 0; + } + + if (dst_regno >= 0) { /* restore register state from stack */ copy_register_state(&state->regs[dst_regno], reg); /* mark reg as written since spilled pointer state likely @@ -4967,10 +4944,7 @@ static int check_stack_read_fixed_off(st mark_reg_read(env, reg, reg->parent, REG_LIVE_READ64); if (dst_regno >= 0) mark_reg_stack_read(env, reg_state, off, off + size, dst_regno); - insn_flags = 0; /* we are not restoring spilled register */ } - if (insn_flags) - return push_jmp_history(env, env->cur_state, insn_flags); return 0; }
@@ -7054,6 +7028,7 @@ static int check_atomic(struct bpf_verif BPF_SIZE(insn->code), BPF_WRITE, -1, true, false); if (err) return err; + return 0; }
@@ -16802,8 +16777,7 @@ hit: * the precision needs to be propagated back in * the current state. */ - if (is_jmp_point(env, env->insn_idx)) - err = err ? : push_jmp_history(env, cur, 0); + err = err ? : push_jmp_history(env, cur); err = err ? : propagate_precision(env, &sl->state); if (err) return err; @@ -17027,9 +17001,6 @@ static int do_check(struct bpf_verifier_ u8 class; int err;
- /* reset current history entry on each new instruction */ - env->cur_hist_ent = NULL; - env->prev_insn_idx = prev_insn_idx; if (env->insn_idx >= insn_cnt) { verbose(env, "invalid insn idx %d insn_cnt %d\n", @@ -17069,7 +17040,7 @@ static int do_check(struct bpf_verifier_ }
if (is_jmp_point(env, env->insn_idx)) { - err = push_jmp_history(env, state, 0); + err = push_jmp_history(env, state); if (err) return err; } --- a/tools/testing/selftests/bpf/progs/verifier_subprog_precision.c +++ b/tools/testing/selftests/bpf/progs/verifier_subprog_precision.c @@ -541,24 +541,11 @@ static __u64 subprog_spill_reg_precise(v
SEC("?raw_tp") __success __log_level(2) -__msg("10: (0f) r1 += r7") -__msg("mark_precise: frame0: last_idx 10 first_idx 7 subseq_idx -1") -__msg("mark_precise: frame0: regs=r7 stack= before 9: (bf) r1 = r8") -__msg("mark_precise: frame0: regs=r7 stack= before 8: (27) r7 *= 4") -__msg("mark_precise: frame0: regs=r7 stack= before 7: (79) r7 = *(u64 *)(r10 -8)") -__msg("mark_precise: frame0: parent state regs= stack=-8: R0_w=2 R6_w=1 R8_rw=map_value(map=.data.vals,ks=4,vs=16) R10=fp0 fp-8_rw=P1") -__msg("mark_precise: frame0: last_idx 18 first_idx 0 subseq_idx 7") -__msg("mark_precise: frame0: regs= stack=-8 before 18: (95) exit") -__msg("mark_precise: frame1: regs= stack= before 17: (0f) r0 += r2") -__msg("mark_precise: frame1: regs= stack= before 16: (79) r2 = *(u64 *)(r1 +0)") -__msg("mark_precise: frame1: regs= stack= before 15: (79) r0 = *(u64 *)(r10 -16)") -__msg("mark_precise: frame1: regs= stack= before 14: (7b) *(u64 *)(r10 -16) = r2") -__msg("mark_precise: frame1: regs= stack= before 13: (7b) *(u64 *)(r1 +0) = r2") -__msg("mark_precise: frame1: regs=r2 stack= before 6: (85) call pc+6") -__msg("mark_precise: frame0: regs=r2 stack= before 5: (bf) r2 = r6") -__msg("mark_precise: frame0: regs=r6 stack= before 4: (07) r1 += -8") -__msg("mark_precise: frame0: regs=r6 stack= before 3: (bf) r1 = r10") -__msg("mark_precise: frame0: regs=r6 stack= before 2: (b7) r6 = 1") +/* precision backtracking can't currently handle stack access not through r10, + * so we won't be able to mark stack slot fp-8 as precise, and so will + * fallback to forcing all as precise + */ +__msg("mark_precise: frame0: falling back to forcing all scalars precise") __naked int subprog_spill_into_parent_stack_slot_precise(void) { asm volatile ( --- a/tools/testing/selftests/bpf/verifier/precise.c +++ b/tools/testing/selftests/bpf/verifier/precise.c @@ -140,11 +140,10 @@ .result = REJECT, }, { - "precise: ST zero to stack insn is supported", + "precise: ST insn causing spi > allocated_stack", .insns = { BPF_MOV64_REG(BPF_REG_3, BPF_REG_10), BPF_JMP_IMM(BPF_JNE, BPF_REG_3, 123, 0), - /* not a register spill, so we stop precision propagation for R4 here */ BPF_ST_MEM(BPF_DW, BPF_REG_3, -8, 0), BPF_LDX_MEM(BPF_DW, BPF_REG_4, BPF_REG_10, -8), BPF_MOV64_IMM(BPF_REG_0, -1), @@ -158,11 +157,11 @@ mark_precise: frame0: last_idx 4 first_idx 2\ mark_precise: frame0: regs=r4 stack= before 4\ mark_precise: frame0: regs=r4 stack= before 3\ + mark_precise: frame0: regs= stack=-8 before 2\ + mark_precise: frame0: falling back to forcing all scalars precise\ + force_precise: frame0: forcing r0 to be precise\ mark_precise: frame0: last_idx 5 first_idx 5\ - mark_precise: frame0: parent state regs=r0 stack=:\ - mark_precise: frame0: last_idx 4 first_idx 2\ - mark_precise: frame0: regs=r0 stack= before 4\ - 5: R0=-1 R4=0", + mark_precise: frame0: parent state regs= stack=:", .result = VERBOSE_ACCEPT, .retval = -1, }, @@ -170,8 +169,6 @@ "precise: STX insn causing spi > allocated_stack", .insns = { BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0, BPF_FUNC_get_prandom_u32), - /* make later reg spill more interesting by having somewhat known scalar */ - BPF_ALU64_IMM(BPF_AND, BPF_REG_0, 0xff), BPF_MOV64_REG(BPF_REG_3, BPF_REG_10), BPF_JMP_IMM(BPF_JNE, BPF_REG_3, 123, 0), BPF_STX_MEM(BPF_DW, BPF_REG_3, BPF_REG_0, -8), @@ -182,21 +179,18 @@ }, .prog_type = BPF_PROG_TYPE_XDP, .flags = BPF_F_TEST_STATE_FREQ, - .errstr = "mark_precise: frame0: last_idx 7 first_idx 7\ + .errstr = "mark_precise: frame0: last_idx 6 first_idx 6\ mark_precise: frame0: parent state regs=r4 stack=:\ - mark_precise: frame0: last_idx 6 first_idx 4\ - mark_precise: frame0: regs=r4 stack= before 6: (b7) r0 = -1\ - mark_precise: frame0: regs=r4 stack= before 5: (79) r4 = *(u64 *)(r10 -8)\ - mark_precise: frame0: regs= stack=-8 before 4: (7b) *(u64 *)(r3 -8) = r0\ - mark_precise: frame0: parent state regs=r0 stack=:\ - mark_precise: frame0: last_idx 3 first_idx 3\ - mark_precise: frame0: regs=r0 stack= before 3: (55) if r3 != 0x7b goto pc+0\ - mark_precise: frame0: regs=r0 stack= before 2: (bf) r3 = r10\ - mark_precise: frame0: regs=r0 stack= before 1: (57) r0 &= 255\ - mark_precise: frame0: parent state regs=r0 stack=:\ - mark_precise: frame0: last_idx 0 first_idx 0\ - mark_precise: frame0: regs=r0 stack= before 0: (85) call bpf_get_prandom_u32#7\ - mark_precise: frame0: last_idx 7 first_idx 7\ + mark_precise: frame0: last_idx 5 first_idx 3\ + mark_precise: frame0: regs=r4 stack= before 5\ + mark_precise: frame0: regs=r4 stack= before 4\ + mark_precise: frame0: regs= stack=-8 before 3\ + mark_precise: frame0: falling back to forcing all scalars precise\ + force_precise: frame0: forcing r0 to be precise\ + force_precise: frame0: forcing r0 to be precise\ + force_precise: frame0: forcing r0 to be precise\ + force_precise: frame0: forcing r0 to be precise\ + mark_precise: frame0: last_idx 6 first_idx 6\ mark_precise: frame0: parent state regs= stack=:", .result = VERBOSE_ACCEPT, .retval = -1,
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Takashi Iwai tiwai@suse.de
commit 8765429279e7d3d68d39ace5f84af2815174bb1e upstream.
When the kernel is built without UMP support but a user-space app requires the midi_version > 0, the kernel should return an error. Otherwise user-space assumes as if it were possible to deal, eventually hitting serious errors later.
Fixes: 46397622a3fa ("ALSA: seq: Add UMP support") Cc: stable@vger.kernel.org Link: https://patch.msgid.link/20241231145358.21946-1-tiwai@suse.de Signed-off-by: Takashi Iwai tiwai@suse.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- sound/core/seq/seq_clientmgr.c | 14 ++++++++++---- 1 file changed, 10 insertions(+), 4 deletions(-)
--- a/sound/core/seq/seq_clientmgr.c +++ b/sound/core/seq/seq_clientmgr.c @@ -1280,10 +1280,16 @@ static int snd_seq_ioctl_set_client_info if (client->type != client_info->type) return -EINVAL;
- /* check validity of midi_version field */ - if (client->user_pversion >= SNDRV_PROTOCOL_VERSION(1, 0, 3) && - client_info->midi_version > SNDRV_SEQ_CLIENT_UMP_MIDI_2_0) - return -EINVAL; + if (client->user_pversion >= SNDRV_PROTOCOL_VERSION(1, 0, 3)) { + /* check validity of midi_version field */ + if (client_info->midi_version > SNDRV_SEQ_CLIENT_UMP_MIDI_2_0) + return -EINVAL; + + /* check if UMP is supported in kernel */ + if (!IS_ENABLED(CONFIG_SND_SEQ_UMP) && + client_info->midi_version > 0) + return -EINVAL; + }
/* fill the info fields */ if (client_info->name[0])
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Daniel Schaefer dhs@frame.work
commit 7b509910b3ad6d7aacead24c8744de10daf8715d upstream.
Similar to commit eb91c456f371 ("ALSA: hda/realtek: Add Framework Laptop 13 (Intel Core Ultra) to quirks") and previous quirks for Framework systems with Realtek codecs.
000C is a new platform that will also have an ALC285 codec and needs the same quirk.
Cc: Jaroslav Kysela perex@perex.cz Cc: Takashi Iwai tiwai@suse.com Cc: linux@frame.work Cc: Dustin L. Howett dustin@howett.net Signed-off-by: Daniel Schaefer dhs@frame.work Cc: stable@vger.kernel.org Link: https://patch.msgid.link/20241231045958.14545-1-dhs@frame.work Signed-off-by: Takashi Iwai tiwai@suse.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- sound/pci/hda/patch_realtek.c | 1 + 1 file changed, 1 insertion(+)
--- a/sound/pci/hda/patch_realtek.c +++ b/sound/pci/hda/patch_realtek.c @@ -10443,6 +10443,7 @@ static const struct hda_quirk alc269_fix SND_PCI_QUIRK(0xf111, 0x0001, "Framework Laptop", ALC295_FIXUP_FRAMEWORK_LAPTOP_MIC_NO_PRESENCE), SND_PCI_QUIRK(0xf111, 0x0006, "Framework Laptop", ALC295_FIXUP_FRAMEWORK_LAPTOP_MIC_NO_PRESENCE), SND_PCI_QUIRK(0xf111, 0x0009, "Framework Laptop", ALC295_FIXUP_FRAMEWORK_LAPTOP_MIC_NO_PRESENCE), + SND_PCI_QUIRK(0xf111, 0x000c, "Framework Laptop", ALC295_FIXUP_FRAMEWORK_LAPTOP_MIC_NO_PRESENCE),
#if 0 /* Below is a quirk table taken from the old code.
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Takashi Iwai tiwai@suse.de
commit 0179488ca992d79908b8e26b9213f1554fc5bacc upstream.
OSS sequencer handles the SysEx messages split in 6 bytes packets, and ALSA sequencer OSS layer tries to combine those. It stores the data in the internal buffer and this access is racy as of now, which may lead to the out-of-bounds access.
As a temporary band-aid fix, introduce a mutex for serializing the process of the SysEx message packets.
Reported-by: Kun Hu huk23@m.fudan.edu.cn Closes: https://lore.kernel.org/2B7E93E4-B13A-4AE4-8E87-306A8EE9BBB7@m.fudan.edu.cn Cc: stable@vger.kernel.org Link: https://patch.msgid.link/20241230110543.32454-1-tiwai@suse.de Signed-off-by: Takashi Iwai tiwai@suse.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- sound/core/seq/oss/seq_oss_synth.c | 2 ++ 1 file changed, 2 insertions(+)
--- a/sound/core/seq/oss/seq_oss_synth.c +++ b/sound/core/seq/oss/seq_oss_synth.c @@ -66,6 +66,7 @@ static struct seq_oss_synth midi_synth_d };
static DEFINE_SPINLOCK(register_lock); +static DEFINE_MUTEX(sysex_mutex);
/* * prototypes @@ -497,6 +498,7 @@ snd_seq_oss_synth_sysex(struct seq_oss_d if (!info) return -ENXIO;
+ guard(mutex)(&sysex_mutex); sysex = info->sysex; if (sysex == NULL) { sysex = kzalloc(sizeof(*sysex), GFP_KERNEL);
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Dennis Lam dennis.lamerice@gmail.com
commit 5f3fd772d152229d94602bca243fbb658068a597 upstream.
When mounting ocfs2 and then remounting it as read-only, a slab-use-after-free occurs after the user uses a syscall to quota_getnextquota. Specifically, sb_dqinfo(sb, type)->dqi_priv is the dangling pointer.
During the remounting process, the pointer dqi_priv is freed but is never set as null leaving it to be accessed. Additionally, the read-only option for remounting sets the DQUOT_SUSPENDED flag instead of setting the DQUOT_USAGE_ENABLED flags. Moreover, later in the process of getting the next quota, the function ocfs2_get_next_id is called and only checks the quota usage flags and not the quota suspended flags.
To fix this, I set dqi_priv to null when it is freed after remounting with read-only and put a check for DQUOT_SUSPENDED in ocfs2_get_next_id.
[akpm@linux-foundation.org: coding-style cleanups] Link: https://lkml.kernel.org/r/20241218023924.22821-2-dennis.lamerice@gmail.com Fixes: 8f9e8f5fcc05 ("ocfs2: Fix Q_GETNEXTQUOTA for filesystem without quotas") Signed-off-by: Dennis Lam dennis.lamerice@gmail.com Reported-by: syzbot+d173bf8a5a7faeede34c@syzkaller.appspotmail.com Tested-by: syzbot+d173bf8a5a7faeede34c@syzkaller.appspotmail.com Closes: https://lore.kernel.org/all/6731d26f.050a0220.1fb99c.014b.GAE@google.com/T/ Reviewed-by: Joseph Qi joseph.qi@linux.alibaba.com Cc: Mark Fasheh mark@fasheh.com Cc: Joel Becker jlbec@evilplan.org Cc: Junxiao Bi junxiao.bi@oracle.com Cc: Changwei Ge gechangwei@live.cn Cc: Jun Piao piaojun@huawei.com Cc: stable@vger.kernel.org Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/ocfs2/quota_global.c | 2 +- fs/ocfs2/quota_local.c | 1 + 2 files changed, 2 insertions(+), 1 deletion(-)
--- a/fs/ocfs2/quota_global.c +++ b/fs/ocfs2/quota_global.c @@ -881,7 +881,7 @@ static int ocfs2_get_next_id(struct supe int status = 0;
trace_ocfs2_get_next_id(from_kqid(&init_user_ns, *qid), type); - if (!sb_has_quota_loaded(sb, type)) { + if (!sb_has_quota_active(sb, type)) { status = -ESRCH; goto out; } --- a/fs/ocfs2/quota_local.c +++ b/fs/ocfs2/quota_local.c @@ -864,6 +864,7 @@ out: brelse(oinfo->dqi_libh); brelse(oinfo->dqi_lqi_bh); kfree(oinfo); + info->dqi_priv = NULL; return status; }
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Arnd Bergmann arnd@arndb.de
commit cb0ca08b326aa03f87fe94bb91872ce8d2ef1ed8 upstream.
If gcc decides not to inline in_softirq_really(), objtool warns about a function call with UACCESS enabled:
kernel/kcov.o: warning: objtool: __sanitizer_cov_trace_pc+0x1e: call to in_softirq_really() with UACCESS enabled kernel/kcov.o: warning: objtool: check_kcov_mode+0x11: call to in_softirq_really() with UACCESS enabled
Mark this as __always_inline to avoid the problem.
Link: https://lkml.kernel.org/r/20241217071814.2261620-1-arnd@kernel.org Fixes: 7d4df2dad312 ("kcov: properly check for softirq context") Signed-off-by: Arnd Bergmann arnd@arndb.de Reviewed-by: Marco Elver elver@google.com Cc: Aleksandr Nogikh nogikh@google.com Cc: Andrey Konovalov andreyknvl@gmail.com Cc: Dmitry Vyukov dvyukov@google.com Cc: Josh Poimboeuf jpoimboe@kernel.org Cc: Peter Zijlstra peterz@infradead.org Cc: stable@vger.kernel.org Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- kernel/kcov.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/kernel/kcov.c +++ b/kernel/kcov.c @@ -165,7 +165,7 @@ static void kcov_remote_area_put(struct * Unlike in_serving_softirq(), this function returns false when called during * a hardirq or an NMI that happened in the softirq context. */ -static inline bool in_softirq_really(void) +static __always_inline bool in_softirq_really(void) { return in_serving_softirq() && !in_hardirq() && !in_nmi(); }
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Kuan-Wei Chiu visitorckw@gmail.com
commit 0210d251162f4033350a94a43f95b1c39ec84a90 upstream.
The orc_sort_cmp() function, used with qsort(), previously violated the symmetry and transitivity rules required by the C standard. Specifically, when both entries are ORC_TYPE_UNDEFINED, it could result in both a < b and b < a, which breaks the required symmetry and transitivity. This can lead to undefined behavior and incorrect sorting results, potentially causing memory corruption in glibc implementations [1].
Symmetry: If x < y, then y > x. Transitivity: If x < y and y < z, then x < z.
Fix the comparison logic to return 0 when both entries are ORC_TYPE_UNDEFINED, ensuring compliance with qsort() requirements.
Link: https://www.qualys.com/2024/01/30/qsort.txt [1] Link: https://lkml.kernel.org/r/20241226140332.2670689-1-visitorckw@gmail.com Fixes: 57fa18994285 ("scripts/sorttable: Implement build-time ORC unwind table sorting") Fixes: fb799447ae29 ("x86,objtool: Split UNWIND_HINT_EMPTY in two") Signed-off-by: Kuan-Wei Chiu visitorckw@gmail.com Cc: Ching-Chun (Jim) Huang jserv@ccns.ncku.edu.tw Cc: chuang@cs.nycu.edu.tw Cc: Ingo Molnar mingo@kernel.org Cc: Josh Poimboeuf jpoimboe@kernel.org Cc: Peter Zijlstra peterz@infradead.org Cc: Shile Zhang shile.zhang@linux.alibaba.com Cc: Steven Rostedt rostedt@goodmis.org Cc: stable@vger.kernel.org Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- scripts/sorttable.h | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-)
--- a/scripts/sorttable.h +++ b/scripts/sorttable.h @@ -110,7 +110,7 @@ static inline unsigned long orc_ip(const
static int orc_sort_cmp(const void *_a, const void *_b) { - struct orc_entry *orc_a; + struct orc_entry *orc_a, *orc_b; const int *a = g_orc_ip_table + *(int *)_a; const int *b = g_orc_ip_table + *(int *)_b; unsigned long a_val = orc_ip(a); @@ -128,6 +128,9 @@ static int orc_sort_cmp(const void *_a, * whitelisted .o files which didn't get objtool generation. */ orc_a = g_orc_table + (a - g_orc_ip_table); + orc_b = g_orc_table + (b - g_orc_ip_table); + if (orc_a->type == ORC_TYPE_UNDEFINED && orc_b->type == ORC_TYPE_UNDEFINED) + return 0; return orc_a->type == ORC_TYPE_UNDEFINED ? -1 : 1; }
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Dan Carpenter dan.carpenter@linaro.org
commit d0257e089d1bbd35c69b6c97ff73e3690ab149a9 upstream.
In the expression "cmd.wqe_size * cmd.wr_count", both variables are u32 values that come from the user so the multiplication can lead to integer wrapping. Then we pass the result to uverbs_request_next_ptr() which also could potentially wrap. The "cmd.sge_count * sizeof(struct ib_uverbs_sge)" multiplication can also overflow on 32bit systems although it's fine on 64bit systems.
This patch does two things. First, I've re-arranged the condition in uverbs_request_next_ptr() so that the use controlled variable "len" is on one side of the comparison by itself without any math. Then I've modified all the callers to use size_mul() for the multiplications.
Fixes: 67cdb40ca444 ("[IB] uverbs: Implement more commands") Cc: stable@vger.kernel.org Signed-off-by: Dan Carpenter dan.carpenter@linaro.org Link: https://patch.msgid.link/b8765ab3-c2da-4611-aae0-ddd6ba173d23@stanley.mounta... Signed-off-by: Leon Romanovsky leon@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/infiniband/core/uverbs_cmd.c | 16 +++++++++------- 1 file changed, 9 insertions(+), 7 deletions(-)
--- a/drivers/infiniband/core/uverbs_cmd.c +++ b/drivers/infiniband/core/uverbs_cmd.c @@ -161,7 +161,7 @@ static const void __user *uverbs_request { const void __user *res = iter->cur;
- if (iter->cur + len > iter->end) + if (len > iter->end - iter->cur) return (void __force __user *)ERR_PTR(-ENOSPC); iter->cur += len; return res; @@ -2009,11 +2009,13 @@ static int ib_uverbs_post_send(struct uv ret = uverbs_request_start(attrs, &iter, &cmd, sizeof(cmd)); if (ret) return ret; - wqes = uverbs_request_next_ptr(&iter, cmd.wqe_size * cmd.wr_count); + wqes = uverbs_request_next_ptr(&iter, size_mul(cmd.wqe_size, + cmd.wr_count)); if (IS_ERR(wqes)) return PTR_ERR(wqes); - sgls = uverbs_request_next_ptr( - &iter, cmd.sge_count * sizeof(struct ib_uverbs_sge)); + sgls = uverbs_request_next_ptr(&iter, + size_mul(cmd.sge_count, + sizeof(struct ib_uverbs_sge))); if (IS_ERR(sgls)) return PTR_ERR(sgls); ret = uverbs_request_finish(&iter); @@ -2199,11 +2201,11 @@ ib_uverbs_unmarshall_recv(struct uverbs_ if (wqe_size < sizeof(struct ib_uverbs_recv_wr)) return ERR_PTR(-EINVAL);
- wqes = uverbs_request_next_ptr(iter, wqe_size * wr_count); + wqes = uverbs_request_next_ptr(iter, size_mul(wqe_size, wr_count)); if (IS_ERR(wqes)) return ERR_CAST(wqes); - sgls = uverbs_request_next_ptr( - iter, sge_count * sizeof(struct ib_uverbs_sge)); + sgls = uverbs_request_next_ptr(iter, size_mul(sge_count, + sizeof(struct ib_uverbs_sge))); if (IS_ERR(sgls)) return ERR_CAST(sgls); ret = uverbs_request_finish(iter);
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Evgenii Shatokhin e.shatokhin@yadro.com
commit a37eecb705f33726f1fb7cd2a67e514a15dfe693 upstream.
If a device uses MCP23xxx IO expander to receive IRQs, the following bug can happen:
BUG: sleeping function called from invalid context at kernel/locking/mutex.c:283 in_atomic(): 1, irqs_disabled(): 1, non_block: 0, ... preempt_count: 1, expected: 0 ... Call Trace: ... __might_resched+0x104/0x10e __might_sleep+0x3e/0x62 mutex_lock+0x20/0x4c regmap_lock_mutex+0x10/0x18 regmap_update_bits_base+0x2c/0x66 mcp23s08_irq_set_type+0x1ae/0x1d6 __irq_set_trigger+0x56/0x172 __setup_irq+0x1e6/0x646 request_threaded_irq+0xb6/0x160 ...
We observed the problem while experimenting with a touchscreen driver which used MCP23017 IO expander (I2C).
The regmap in the pinctrl-mcp23s08 driver uses a mutex for protection from concurrent accesses, which is the default for regmaps without .fast_io, .disable_locking, etc.
mcp23s08_irq_set_type() calls regmap_update_bits_base(), and the latter locks the mutex.
However, __setup_irq() locks desc->lock spinlock before calling these functions. As a result, the system tries to lock the mutex whole holding the spinlock.
It seems, the internal regmap locks are not needed in this driver at all. mcp->lock seems to protect the regmap from concurrent accesses already, except, probably, in mcp_pinconf_get/set.
mcp23s08_irq_set_type() and mcp23s08_irq_mask/unmask() are called under chip_bus_lock(), which calls mcp23s08_irq_bus_lock(). The latter takes mcp->lock and enables regmap caching, so that the potentially slow I2C accesses are deferred until chip_bus_unlock().
The accesses to the regmap from mcp23s08_probe_one() do not need additional locking.
In all remaining places where the regmap is accessed, except mcp_pinconf_get/set(), the driver already takes mcp->lock.
This patch adds locking in mcp_pinconf_get/set() and disables internal locking in the regmap config. Among other things, it fixes the sleeping in atomic context described above.
Fixes: 8f38910ba4f6 ("pinctrl: mcp23s08: switch to regmap caching") Cc: stable@vger.kernel.org Signed-off-by: Evgenii Shatokhin e.shatokhin@yadro.com Link: https://lore.kernel.org/20241209074659.1442898-1-e.shatokhin@yadro.com Signed-off-by: Linus Walleij linus.walleij@linaro.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/pinctrl/pinctrl-mcp23s08.c | 6 ++++++ 1 file changed, 6 insertions(+)
--- a/drivers/pinctrl/pinctrl-mcp23s08.c +++ b/drivers/pinctrl/pinctrl-mcp23s08.c @@ -86,6 +86,7 @@ const struct regmap_config mcp23x08_regm .num_reg_defaults = ARRAY_SIZE(mcp23x08_defaults), .cache_type = REGCACHE_FLAT, .max_register = MCP_OLAT, + .disable_locking = true, /* mcp->lock protects the regmap */ }; EXPORT_SYMBOL_GPL(mcp23x08_regmap);
@@ -132,6 +133,7 @@ const struct regmap_config mcp23x17_regm .num_reg_defaults = ARRAY_SIZE(mcp23x17_defaults), .cache_type = REGCACHE_FLAT, .val_format_endian = REGMAP_ENDIAN_LITTLE, + .disable_locking = true, /* mcp->lock protects the regmap */ }; EXPORT_SYMBOL_GPL(mcp23x17_regmap);
@@ -228,7 +230,9 @@ static int mcp_pinconf_get(struct pinctr
switch (param) { case PIN_CONFIG_BIAS_PULL_UP: + mutex_lock(&mcp->lock); ret = mcp_read(mcp, MCP_GPPU, &data); + mutex_unlock(&mcp->lock); if (ret < 0) return ret; status = (data & BIT(pin)) ? 1 : 0; @@ -257,7 +261,9 @@ static int mcp_pinconf_set(struct pinctr
switch (param) { case PIN_CONFIG_BIAS_PULL_UP: + mutex_lock(&mcp->lock); ret = mcp_set_bit(mcp, MCP_GPPU, pin, arg); + mutex_unlock(&mcp->lock); break; default: dev_dbg(mcp->dev, "Invalid config param %04x\n", param);
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Pascal Hambourg pascal@plouf.fr.eu.org
commit 03c8d0af2e409e15c16130b185e12b5efba0a6b9 upstream.
A Marvell 88E8075 ethernet controller has this device ID instead of 11ab:4370 and works fine with the sky2 driver.
Signed-off-by: Pascal Hambourg pascal@plouf.fr.eu.org Cc: stable@vger.kernel.org Link: https://patch.msgid.link/10165a62-99fb-4be6-8c64-84afd6234085@plouf.fr.eu.or... Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/ethernet/marvell/sky2.c | 1 + 1 file changed, 1 insertion(+)
--- a/drivers/net/ethernet/marvell/sky2.c +++ b/drivers/net/ethernet/marvell/sky2.c @@ -129,6 +129,7 @@ static const struct pci_device_id sky2_i { PCI_DEVICE(PCI_VENDOR_ID_MARVELL, 0x436C) }, /* 88E8072 */ { PCI_DEVICE(PCI_VENDOR_ID_MARVELL, 0x436D) }, /* 88E8055 */ { PCI_DEVICE(PCI_VENDOR_ID_MARVELL, 0x4370) }, /* 88E8075 */ + { PCI_DEVICE(PCI_VENDOR_ID_MARVELL, 0x4373) }, /* 88E8075 */ { PCI_DEVICE(PCI_VENDOR_ID_MARVELL, 0x4380) }, /* 88E8057 */ { PCI_DEVICE(PCI_VENDOR_ID_MARVELL, 0x4381) }, /* 88E8059 */ { PCI_DEVICE(PCI_VENDOR_ID_MARVELL, 0x4382) }, /* 88E8079 */
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Nikolay Kuratov kniv@yandex-team.ru
commit 4e86729d1ff329815a6e8a920cb554a1d4cb5b8d upstream.
While by default max_autoclose equals to INT_MAX / HZ, one may set net.sctp.max_autoclose to UINT_MAX. There is code in sctp_association_init() that can consequently trigger overflow.
Cc: stable@vger.kernel.org Fixes: 9f70f46bd4c7 ("sctp: properly latch and use autoclose value from sock to association") Signed-off-by: Nikolay Kuratov kniv@yandex-team.ru Acked-by: Xin Long lucien.xin@gmail.com Link: https://patch.msgid.link/20241219162114.2863827-1-kniv@yandex-team.ru Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/sctp/associola.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
--- a/net/sctp/associola.c +++ b/net/sctp/associola.c @@ -137,7 +137,8 @@ static struct sctp_association *sctp_ass = 5 * asoc->rto_max;
asoc->timeouts[SCTP_EVENT_TIMEOUT_SACK] = asoc->sackdelay; - asoc->timeouts[SCTP_EVENT_TIMEOUT_AUTOCLOSE] = sp->autoclose * HZ; + asoc->timeouts[SCTP_EVENT_TIMEOUT_AUTOCLOSE] = + (unsigned long)sp->autoclose * HZ;
/* Initializes the timers */ for (i = SCTP_EVENT_TIMEOUT_NONE; i < SCTP_NUM_TIMEOUT_TYPES; ++i)
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Biju Das biju.das.jz@bp.renesas.com
commit 79d67c499c3f886202a40c5cb27e747e4fa4d738 upstream.
As per [1] and [2], ADV7535/7533 supports only 2-, 3-, or 4-lane. Drop unsupported 1-lane.
[1] https://www.analog.com/media/en/technical-documentation/data-sheets/ADV7535.... [2] https://www.analog.com/media/en/technical-documentation/data-sheets/ADV7533....
Fixes: 1e4d58cd7f88 ("drm/bridge: adv7533: Create a MIPI DSI device") Reported-by: Hien Huynh hien.huynh.px@renesas.com Cc: stable@vger.kernel.org Reviewed-by: Laurent Pinchart laurent.pinchart+renesas@ideasonboard.com Reviewed-by: Adam Ford aford173@gmail.com Signed-off-by: Biju Das biju.das.jz@bp.renesas.com Link: https://patchwork.freedesktop.org/patch/msgid/20241119192040.152657-4-biju.d... Signed-off-by: Dmitry Baryshkov dmitry.baryshkov@linaro.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/gpu/drm/bridge/adv7511/adv7533.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/gpu/drm/bridge/adv7511/adv7533.c +++ b/drivers/gpu/drm/bridge/adv7511/adv7533.c @@ -175,7 +175,7 @@ int adv7533_parse_dt(struct device_node
of_property_read_u32(np, "adi,dsi-lanes", &num_lanes);
- if (num_lanes < 1 || num_lanes > 4) + if (num_lanes < 2 || num_lanes > 4) return -EINVAL;
adv->num_dsi_lanes = num_lanes;
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Biju Das biju.das.jz@bp.renesas.com
commit ee8f9ed57a397605434caeef351bafa3ec4dfdd4 upstream.
As per [1] and [2], ADV7535/7533 supports only 2-, 3-, or 4-lane. Drop unsupported 1-lane from bindings.
[1] https://www.analog.com/media/en/technical-documentation/data-sheets/ADV7535.... [2] https://www.analog.com/media/en/technical-documentation/data-sheets/ADV7533....
Fixes: 1e4d58cd7f88 ("drm/bridge: adv7533: Create a MIPI DSI device") Cc: stable@vger.kernel.org Acked-by: Krzysztof Kozlowski krzysztof.kozlowski@linaro.org Reviewed-by: Geert Uytterhoeven geert+renesas@glider.be Reviewed-by: Laurent Pinchart laurent.pinchart+renesas@ideasonboard.com Signed-off-by: Biju Das biju.das.jz@bp.renesas.com Link: https://patchwork.freedesktop.org/patch/msgid/20241119192040.152657-3-biju.d... Signed-off-by: Dmitry Baryshkov dmitry.baryshkov@linaro.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- Documentation/devicetree/bindings/display/bridge/adi,adv7533.yaml | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/Documentation/devicetree/bindings/display/bridge/adi,adv7533.yaml +++ b/Documentation/devicetree/bindings/display/bridge/adi,adv7533.yaml @@ -87,7 +87,7 @@ properties: adi,dsi-lanes: description: Number of DSI data lanes connected to the DSI host. $ref: /schemas/types.yaml#/definitions/uint32 - enum: [ 1, 2, 3, 4 ] + enum: [ 2, 3, 4 ]
ports: description:
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Biju Das biju.das.jz@bp.renesas.com
commit 81adbd3ff21c1182e06aa02c6be0bfd9ea02d8e8 upstream.
The host_node pointer was assigned and freed in adv7533_parse_dt(), and later, adv7533_attach_dsi() uses the same. Fix this use-after-free issue by dropping of_node_put() in adv7533_parse_dt() and calling of_node_put() in error path of probe() and also in the remove().
Fixes: 1e4d58cd7f88 ("drm/bridge: adv7533: Create a MIPI DSI device") Cc: stable@vger.kernel.org Reviewed-by: Laurent Pinchart laurent.pinchart+renesas@ideasonboard.com Signed-off-by: Biju Das biju.das.jz@bp.renesas.com Link: https://patchwork.freedesktop.org/patch/msgid/20241119192040.152657-2-biju.d... Signed-off-by: Dmitry Baryshkov dmitry.baryshkov@linaro.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/gpu/drm/bridge/adv7511/adv7511_drv.c | 10 ++++++++-- drivers/gpu/drm/bridge/adv7511/adv7533.c | 2 -- 2 files changed, 8 insertions(+), 4 deletions(-)
--- a/drivers/gpu/drm/bridge/adv7511/adv7511_drv.c +++ b/drivers/gpu/drm/bridge/adv7511/adv7511_drv.c @@ -1225,8 +1225,10 @@ static int adv7511_probe(struct i2c_clie return ret;
ret = adv7511_init_regulators(adv7511); - if (ret) - return dev_err_probe(dev, ret, "failed to init regulators\n"); + if (ret) { + dev_err_probe(dev, ret, "failed to init regulators\n"); + goto err_of_node_put; + }
/* * The power down GPIO is optional. If present, toggle it from active to @@ -1346,6 +1348,8 @@ err_i2c_unregister_edid: i2c_unregister_device(adv7511->i2c_edid); uninit_regulators: adv7511_uninit_regulators(adv7511); +err_of_node_put: + of_node_put(adv7511->host_node);
return ret; } @@ -1354,6 +1358,8 @@ static void adv7511_remove(struct i2c_cl { struct adv7511 *adv7511 = i2c_get_clientdata(i2c);
+ of_node_put(adv7511->host_node); + adv7511_uninit_regulators(adv7511);
drm_bridge_remove(&adv7511->bridge); --- a/drivers/gpu/drm/bridge/adv7511/adv7533.c +++ b/drivers/gpu/drm/bridge/adv7511/adv7533.c @@ -184,8 +184,6 @@ int adv7533_parse_dt(struct device_node if (!adv->host_node) return -ENODEV;
- of_node_put(adv->host_node); - adv->use_timing_gen = !of_property_read_bool(np, "adi,disable-timing-generator");
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: David Hildenbrand david@redhat.com
commit 3754137d263f52f4b507cf9ae913f8f0497d1b0e upstream.
Entries (including flags) are u64, even on 32bit. So right now we are cutting of the flags on 32bit. This way, for example the cow selftest complains about:
# ./cow ... Bail Out! read and ioctl return unmatched results for populated: 0 1
Link: https://lkml.kernel.org/r/20241217195000.1734039-1-david@redhat.com Fixes: 2c1f057e5be6 ("fs/proc/task_mmu: properly detect PM_MMAP_EXCLUSIVE per page of PMD-mapped THPs") Signed-off-by: David Hildenbrand david@redhat.com Cc: Oscar Salvador osalvador@suse.de Cc: stable@vger.kernel.org Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/proc/task_mmu.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/fs/proc/task_mmu.c +++ b/fs/proc/task_mmu.c @@ -1516,7 +1516,7 @@ static int pagemap_pmd_range(pmd_t *pmdp flags |= PM_FILE;
for (; addr != end; addr += PAGE_SIZE, idx++) { - unsigned long cur_flags = flags; + u64 cur_flags = flags; pagemap_entry_t pme;
if (page && (flags & PM_PRESENT) &&
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Joshua Washington joshwash@google.com
commit 40338d7987d810fcaa95c500b1068a52b08eec9b upstream.
This patch predicates the enabling and disabling of XSK pools on the existence of queues. As it stands, if the interface is down, disabling or enabling XSK pools would result in a crash, as the RX queue pointer would be NULL. XSK pool registration will occur as part of the next interface up.
Similarly, xsk_wakeup needs be guarded against queues disappearing while the function is executing, so a check against the GVE_PRIV_FLAGS_NAPI_ENABLED flag is added to synchronize with the disabling of the bit and the synchronize_net() in gve_turndown.
Fixes: fd8e40321a12 ("gve: Add AF_XDP zero-copy support for GQI-QPL format") Cc: stable@vger.kernel.org Signed-off-by: Joshua Washington joshwash@google.com Signed-off-by: Praveen Kaligineedi pkaligineedi@google.com Reviewed-by: Praveen Kaligineedi pkaligineedi@google.com Reviewed-by: Shailend Chand shailend@google.com Reviewed-by: Willem de Bruijn willemb@google.com Reviewed-by: Larysa Zaremba larysa.zaremba@intel.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/ethernet/google/gve/gve_main.c | 22 ++++++++++------------ 1 file changed, 10 insertions(+), 12 deletions(-)
--- a/drivers/net/ethernet/google/gve/gve_main.c +++ b/drivers/net/ethernet/google/gve/gve_main.c @@ -1528,8 +1528,8 @@ static int gve_xsk_pool_enable(struct ne if (err) return err;
- /* If XDP prog is not installed, return */ - if (!priv->xdp_prog) + /* If XDP prog is not installed or interface is down, return. */ + if (!priv->xdp_prog || !netif_running(dev)) return 0;
rx = &priv->rx[qid]; @@ -1574,21 +1574,16 @@ static int gve_xsk_pool_disable(struct n if (qid >= priv->rx_cfg.num_queues) return -EINVAL;
- /* If XDP prog is not installed, unmap DMA and return */ - if (!priv->xdp_prog) + /* If XDP prog is not installed or interface is down, unmap DMA and + * return. + */ + if (!priv->xdp_prog || !netif_running(dev)) goto done;
- tx_qid = gve_xdp_tx_queue_id(priv, qid); - if (!netif_running(dev)) { - priv->rx[qid].xsk_pool = NULL; - xdp_rxq_info_unreg(&priv->rx[qid].xsk_rxq); - priv->tx[tx_qid].xsk_pool = NULL; - goto done; - } - napi_rx = &priv->ntfy_blocks[priv->rx[qid].ntfy_id].napi; napi_disable(napi_rx); /* make sure current rx poll is done */
+ tx_qid = gve_xdp_tx_queue_id(priv, qid); napi_tx = &priv->ntfy_blocks[priv->tx[tx_qid].ntfy_id].napi; napi_disable(napi_tx); /* make sure current tx poll is done */
@@ -1616,6 +1611,9 @@ static int gve_xsk_wakeup(struct net_dev struct gve_priv *priv = netdev_priv(dev); int tx_queue_id = gve_xdp_tx_queue_id(priv, queue_id);
+ if (!gve_get_napi_enabled(priv)) + return -ENETDOWN; + if (queue_id >= priv->rx_cfg.num_queues || !priv->xdp_prog) return -EINVAL;
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Joshua Washington joshwash@google.com
commit ff7c2dea9dd1a436fc79d6273adffdcc4a7ffea3 upstream.
In GVE, dedicated XDP queues only exist when an XDP program is installed and the interface is up. As such, the NDO XDP XMIT callback should return early if either of these conditions are false.
In the case of no loaded XDP program, priv->num_xdp_queues=0 which can cause a divide-by-zero error, and in the case of interface down, num_xdp_queues remains untouched to persist XDP queue count for the next interface up, but the TX pointer itself would be NULL.
The XDP xmit callback also needs to synchronize with a device transitioning from open to close. This synchronization will happen via the GVE_PRIV_FLAGS_NAPI_ENABLED bit along with a synchronize_net() call, which waits for any RCU critical sections at call-time to complete.
Fixes: 39a7f4aa3e4a ("gve: Add XDP REDIRECT support for GQI-QPL format") Cc: stable@vger.kernel.org Signed-off-by: Joshua Washington joshwash@google.com Signed-off-by: Praveen Kaligineedi pkaligineedi@google.com Reviewed-by: Praveen Kaligineedi pkaligineedi@google.com Reviewed-by: Shailend Chand shailend@google.com Reviewed-by: Willem de Bruijn willemb@google.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/ethernet/google/gve/gve_main.c | 3 +++ drivers/net/ethernet/google/gve/gve_tx.c | 5 ++++- 2 files changed, 7 insertions(+), 1 deletion(-)
--- a/drivers/net/ethernet/google/gve/gve_main.c +++ b/drivers/net/ethernet/google/gve/gve_main.c @@ -1755,6 +1755,9 @@ static void gve_turndown(struct gve_priv
gve_clear_napi_enabled(priv); gve_clear_report_stats(priv); + + /* Make sure that all traffic is finished processing. */ + synchronize_net(); }
static void gve_turnup(struct gve_priv *priv) --- a/drivers/net/ethernet/google/gve/gve_tx.c +++ b/drivers/net/ethernet/google/gve/gve_tx.c @@ -777,9 +777,12 @@ int gve_xdp_xmit(struct net_device *dev, struct gve_tx_ring *tx; int i, err = 0, qid;
- if (unlikely(flags & ~XDP_XMIT_FLAGS_MASK)) + if (unlikely(flags & ~XDP_XMIT_FLAGS_MASK) || !priv->xdp_prog) return -EINVAL;
+ if (!gve_get_napi_enabled(priv)) + return -ENETDOWN; + qid = gve_xdp_tx_queue_id(priv, smp_processor_id() % priv->num_xdp_queues);
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Yafang Shao laoar.shao@gmail.com
commit 158cdce87c8c172787063998ad5dd3e2f658b963 upstream.
When testing large folio support with XFS on our servers, we observed that only a few large folios are mapped when reading large files via mmap. After a thorough analysis, I identified it was caused by the `/sys/block/*/queue/read_ahead_kb` setting. On our test servers, this parameter is set to 128KB. After I tune it to 2MB, the large folio can work as expected. However, I believe the large folio behavior should not be dependent on the value of read_ahead_kb. It would be more robust if the kernel can automatically adopt to it.
With /sys/block/*/queue/read_ahead_kb set to 128KB and performing a sequential read on a 1GB file using MADV_HUGEPAGE, the differences in /proc/meminfo are as follows:
- before this patch FileHugePages: 18432 kB FilePmdMapped: 4096 kB
- after this patch FileHugePages: 1067008 kB FilePmdMapped: 1048576 kB
This shows that after applying the patch, the entire 1GB file is mapped to huge pages. The stable list is CCed, as without this patch, large folios don't function optimally in the readahead path.
It's worth noting that if read_ahead_kb is set to a larger value that isn't aligned with huge page sizes (e.g., 4MB + 128KB), it may still fail to map to hugepages.
Link: https://lkml.kernel.org/r/20241108141710.9721-1-laoar.shao@gmail.com Link: https://lkml.kernel.org/r/20241206083025.3478-1-laoar.shao@gmail.com Fixes: 4687fdbb805a ("mm/filemap: Support VM_HUGEPAGE for file mappings") Signed-off-by: Yafang Shao laoar.shao@gmail.com Tested-by: kernel test robot oliver.sang@intel.com Cc: Matthew Wilcox willy@infradead.org Cc: David Hildenbrand david@redhat.com Cc: stable@vger.kernel.org Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- mm/readahead.c | 6 +++++- 1 file changed, 5 insertions(+), 1 deletion(-)
--- a/mm/readahead.c +++ b/mm/readahead.c @@ -580,7 +580,11 @@ static void ondemand_readahead(struct re 1UL << order); if (index == expected || index == (ra->start + ra->size)) { ra->start += ra->size; - ra->size = get_next_ra_size(ra, max_pages); + /* + * In the case of MADV_HUGEPAGE, the actual size might exceed + * the readahead window. + */ + ra->size = max(ra->size, get_next_ra_size(ra, max_pages)); ra->async_size = ra->size; goto readit; }
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Alessandro Carminati acarmina@redhat.com
commit cddc76b165161a02ff14c4d84d0f5266d9d32b9e upstream.
Address a bug in the kernel that triggers a "sleeping function called from invalid context" warning when /sys/kernel/debug/kmemleak is printed under specific conditions: - CONFIG_PREEMPT_RT=y - Set SELinux as the LSM for the system - Set kptr_restrict to 1 - kmemleak buffer contains at least one item
BUG: sleeping function called from invalid context at kernel/locking/spinlock_rt.c:48 in_atomic(): 1, irqs_disabled(): 1, non_block: 0, pid: 136, name: cat preempt_count: 1, expected: 0 RCU nest depth: 2, expected: 2 6 locks held by cat/136: #0: ffff32e64bcbf950 (&p->lock){+.+.}-{3:3}, at: seq_read_iter+0xb8/0xe30 #1: ffffafe6aaa9dea0 (scan_mutex){+.+.}-{3:3}, at: kmemleak_seq_start+0x34/0x128 #3: ffff32e6546b1cd0 (&object->lock){....}-{2:2}, at: kmemleak_seq_show+0x3c/0x1e0 #4: ffffafe6aa8d8560 (rcu_read_lock){....}-{1:2}, at: has_ns_capability_noaudit+0x8/0x1b0 #5: ffffafe6aabbc0f8 (notif_lock){+.+.}-{2:2}, at: avc_compute_av+0xc4/0x3d0 irq event stamp: 136660 hardirqs last enabled at (136659): [<ffffafe6a80fd7a0>] _raw_spin_unlock_irqrestore+0xa8/0xd8 hardirqs last disabled at (136660): [<ffffafe6a80fd85c>] _raw_spin_lock_irqsave+0x8c/0xb0 softirqs last enabled at (0): [<ffffafe6a5d50b28>] copy_process+0x11d8/0x3df8 softirqs last disabled at (0): [<0000000000000000>] 0x0 Preemption disabled at: [<ffffafe6a6598a4c>] kmemleak_seq_show+0x3c/0x1e0 CPU: 1 UID: 0 PID: 136 Comm: cat Tainted: G E 6.11.0-rt7+ #34 Tainted: [E]=UNSIGNED_MODULE Hardware name: linux,dummy-virt (DT) Call trace: dump_backtrace+0xa0/0x128 show_stack+0x1c/0x30 dump_stack_lvl+0xe8/0x198 dump_stack+0x18/0x20 rt_spin_lock+0x8c/0x1a8 avc_perm_nonode+0xa0/0x150 cred_has_capability.isra.0+0x118/0x218 selinux_capable+0x50/0x80 security_capable+0x7c/0xd0 has_ns_capability_noaudit+0x94/0x1b0 has_capability_noaudit+0x20/0x30 restricted_pointer+0x21c/0x4b0 pointer+0x298/0x760 vsnprintf+0x330/0xf70 seq_printf+0x178/0x218 print_unreferenced+0x1a4/0x2d0 kmemleak_seq_show+0xd0/0x1e0 seq_read_iter+0x354/0xe30 seq_read+0x250/0x378 full_proxy_read+0xd8/0x148 vfs_read+0x190/0x918 ksys_read+0xf0/0x1e0 __arm64_sys_read+0x70/0xa8 invoke_syscall.constprop.0+0xd4/0x1d8 el0_svc+0x50/0x158 el0t_64_sync+0x17c/0x180
%pS and %pK, in the same back trace line, are redundant, and %pS can void %pK service in certain contexts.
%pS alone already provides the necessary information, and if it cannot resolve the symbol, it falls back to printing the raw address voiding the original intent behind the %pK.
Additionally, %pK requires a privilege check CAP_SYSLOG enforced through the LSM, which can trigger a "sleeping function called from invalid context" warning under RT_PREEMPT kernels when the check occurs in an atomic context. This issue may also affect other LSMs.
This change avoids the unnecessary privilege check and resolves the sleeping function warning without any loss of information.
Link: https://lkml.kernel.org/r/20241217142032.55793-1-acarmina@redhat.com Fixes: 3a6f33d86baa ("mm/kmemleak: use %pK to display kernel pointers in backtrace") Signed-off-by: Alessandro Carminati acarmina@redhat.com Acked-by: Sebastian Andrzej Siewior bigeasy@linutronix.de Acked-by: Catalin Marinas catalin.marinas@arm.com Cc: Clément Léger clement.leger@bootlin.com Cc: Alessandro Carminati acarmina@redhat.com Cc: Eric Chanudet echanude@redhat.com Cc: Gabriele Paoloni gpaoloni@redhat.com Cc: Juri Lelli juri.lelli@redhat.com Cc: Steven Rostedt rostedt@goodmis.org Cc: Thomas Weißschuh thomas.weissschuh@linutronix.de Cc: stable@vger.kernel.org Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- mm/kmemleak.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/mm/kmemleak.c +++ b/mm/kmemleak.c @@ -368,7 +368,7 @@ static void print_unreferenced(struct se
for (i = 0; i < nr_entries; i++) { void *ptr = (void *)entries[i]; - warn_or_seq_printf(seq, " [<%pK>] %pS\n", ptr, ptr); + warn_or_seq_printf(seq, " %pS\n", ptr); } }
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Seiji Nishikawa snishika@redhat.com
commit 6aaced5abd32e2a57cd94fd64f824514d0361da8 upstream.
The task sometimes continues looping in throttle_direct_reclaim() because allow_direct_reclaim(pgdat) keeps returning false.
#0 [ffff80002cb6f8d0] __switch_to at ffff8000080095ac #1 [ffff80002cb6f900] __schedule at ffff800008abbd1c #2 [ffff80002cb6f990] schedule at ffff800008abc50c #3 [ffff80002cb6f9b0] throttle_direct_reclaim at ffff800008273550 #4 [ffff80002cb6fa20] try_to_free_pages at ffff800008277b68 #5 [ffff80002cb6fae0] __alloc_pages_nodemask at ffff8000082c4660 #6 [ffff80002cb6fc50] alloc_pages_vma at ffff8000082e4a98 #7 [ffff80002cb6fca0] do_anonymous_page at ffff80000829f5a8 #8 [ffff80002cb6fce0] __handle_mm_fault at ffff8000082a5974 #9 [ffff80002cb6fd90] handle_mm_fault at ffff8000082a5bd4
At this point, the pgdat contains the following two zones:
NODE: 4 ZONE: 0 ADDR: ffff00817fffe540 NAME: "DMA32" SIZE: 20480 MIN/LOW/HIGH: 11/28/45 VM_STAT: NR_FREE_PAGES: 359 NR_ZONE_INACTIVE_ANON: 18813 NR_ZONE_ACTIVE_ANON: 0 NR_ZONE_INACTIVE_FILE: 50 NR_ZONE_ACTIVE_FILE: 0 NR_ZONE_UNEVICTABLE: 0 NR_ZONE_WRITE_PENDING: 0 NR_MLOCK: 0 NR_BOUNCE: 0 NR_ZSPAGES: 0 NR_FREE_CMA_PAGES: 0
NODE: 4 ZONE: 1 ADDR: ffff00817fffec00 NAME: "Normal" SIZE: 8454144 PRESENT: 98304 MIN/LOW/HIGH: 68/166/264 VM_STAT: NR_FREE_PAGES: 146 NR_ZONE_INACTIVE_ANON: 94668 NR_ZONE_ACTIVE_ANON: 3 NR_ZONE_INACTIVE_FILE: 735 NR_ZONE_ACTIVE_FILE: 78 NR_ZONE_UNEVICTABLE: 0 NR_ZONE_WRITE_PENDING: 0 NR_MLOCK: 0 NR_BOUNCE: 0 NR_ZSPAGES: 0 NR_FREE_CMA_PAGES: 0
In allow_direct_reclaim(), while processing ZONE_DMA32, the sum of inactive/active file-backed pages calculated in zone_reclaimable_pages() based on the result of zone_page_state_snapshot() is zero.
Additionally, since this system lacks swap, the calculation of inactive/ active anonymous pages is skipped.
crash> p nr_swap_pages nr_swap_pages = $1937 = { counter = 0 }
As a result, ZONE_DMA32 is deemed unreclaimable and skipped, moving on to the processing of the next zone, ZONE_NORMAL, despite ZONE_DMA32 having free pages significantly exceeding the high watermark.
The problem is that the pgdat->kswapd_failures hasn't been incremented.
crash> px ((struct pglist_data *) 0xffff00817fffe540)->kswapd_failures $1935 = 0x0
This is because the node deemed balanced. The node balancing logic in balance_pgdat() evaluates all zones collectively. If one or more zones (e.g., ZONE_DMA32) have enough free pages to meet their watermarks, the entire node is deemed balanced. This causes balance_pgdat() to exit early before incrementing the kswapd_failures, as it considers the overall memory state acceptable, even though some zones (like ZONE_NORMAL) remain under significant pressure.
The patch ensures that zone_reclaimable_pages() includes free pages (NR_FREE_PAGES) in its calculation when no other reclaimable pages are available (e.g., file-backed or anonymous pages). This change prevents zones like ZONE_DMA32, which have sufficient free pages, from being mistakenly deemed unreclaimable. By doing so, the patch ensures proper node balancing, avoids masking pressure on other zones like ZONE_NORMAL, and prevents infinite loops in throttle_direct_reclaim() caused by allow_direct_reclaim(pgdat) repeatedly returning false.
The kernel hangs due to a task stuck in throttle_direct_reclaim(), caused by a node being incorrectly deemed balanced despite pressure in certain zones, such as ZONE_NORMAL. This issue arises from zone_reclaimable_pages() returning 0 for zones without reclaimable file- backed or anonymous pages, causing zones like ZONE_DMA32 with sufficient free pages to be skipped.
The lack of swap or reclaimable pages results in ZONE_DMA32 being ignored during reclaim, masking pressure in other zones. Consequently, pgdat->kswapd_failures remains 0 in balance_pgdat(), preventing fallback mechanisms in allow_direct_reclaim() from being triggered, leading to an infinite loop in throttle_direct_reclaim().
This patch modifies zone_reclaimable_pages() to account for free pages (NR_FREE_PAGES) when no other reclaimable pages exist. This ensures zones with sufficient free pages are not skipped, enabling proper balancing and reclaim behavior.
[akpm@linux-foundation.org: coding-style cleanups] Link: https://lkml.kernel.org/r/20241130164346.436469-1-snishika@redhat.com Link: https://lkml.kernel.org/r/20241130161236.433747-2-snishika@redhat.com Fixes: 5a1c84b404a7 ("mm: remove reclaim and compaction retry approximations") Signed-off-by: Seiji Nishikawa snishika@redhat.com Cc: Mel Gorman mgorman@techsingularity.net Cc: stable@vger.kernel.org Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- mm/vmscan.c | 9 ++++++++- 1 file changed, 8 insertions(+), 1 deletion(-)
--- a/mm/vmscan.c +++ b/mm/vmscan.c @@ -641,7 +641,14 @@ unsigned long zone_reclaimable_pages(str if (can_reclaim_anon_pages(NULL, zone_to_nid(zone), NULL)) nr += zone_page_state_snapshot(zone, NR_ZONE_INACTIVE_ANON) + zone_page_state_snapshot(zone, NR_ZONE_ACTIVE_ANON); - + /* + * If there are no reclaimable file-backed or anonymous pages, + * ensure zones with sufficient free pages are not skipped. + * This prevents zones like DMA32 from being ignored in reclaim + * scenarios where they can still help alleviate memory pressure. + */ + if (nr == 0) + nr = zone_page_state_snapshot(zone, NR_FREE_PAGES); return nr; }
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Paolo Abeni pabeni@redhat.com
commit cbb26f7d8451fe56ccac802c6db48d16240feebd upstream.
Syzbot reported the following splat:
Oops: general protection fault, probably for non-canonical address 0xdffffc0000000001: 0000 [#1] PREEMPT SMP KASAN PTI KASAN: null-ptr-deref in range [0x0000000000000008-0x000000000000000f] CPU: 1 UID: 0 PID: 5836 Comm: sshd Not tainted 6.13.0-rc3-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 11/25/2024 RIP: 0010:_compound_head include/linux/page-flags.h:242 [inline] RIP: 0010:put_page+0x23/0x260 include/linux/mm.h:1552 Code: 90 90 90 90 90 90 90 55 41 57 41 56 53 49 89 fe 48 bd 00 00 00 00 00 fc ff df e8 f8 5e 12 f8 49 8d 5e 08 48 89 d8 48 c1 e8 03 <80> 3c 28 00 74 08 48 89 df e8 8f c7 78 f8 48 8b 1b 48 89 de 48 83 RSP: 0000:ffffc90003916c90 EFLAGS: 00010202 RAX: 0000000000000001 RBX: 0000000000000008 RCX: ffff888030458000 RDX: 0000000000000100 RSI: 0000000000000000 RDI: 0000000000000000 RBP: dffffc0000000000 R08: ffffffff898ca81d R09: 1ffff110054414ac R10: dffffc0000000000 R11: ffffed10054414ad R12: 0000000000000007 R13: ffff88802a20a542 R14: 0000000000000000 R15: 0000000000000000 FS: 00007f34f496e800(0000) GS:ffff8880b8700000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007f9d6ec9ec28 CR3: 000000004d260000 CR4: 00000000003526f0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: <TASK> skb_page_unref include/linux/skbuff_ref.h:43 [inline] __skb_frag_unref include/linux/skbuff_ref.h:56 [inline] skb_release_data+0x483/0x8a0 net/core/skbuff.c:1119 skb_release_all net/core/skbuff.c:1190 [inline] __kfree_skb+0x55/0x70 net/core/skbuff.c:1204 tcp_clean_rtx_queue net/ipv4/tcp_input.c:3436 [inline] tcp_ack+0x2442/0x6bc0 net/ipv4/tcp_input.c:4032 tcp_rcv_state_process+0x8eb/0x44e0 net/ipv4/tcp_input.c:6805 tcp_v4_do_rcv+0x77d/0xc70 net/ipv4/tcp_ipv4.c:1939 tcp_v4_rcv+0x2dc0/0x37f0 net/ipv4/tcp_ipv4.c:2351 ip_protocol_deliver_rcu+0x22e/0x440 net/ipv4/ip_input.c:205 ip_local_deliver_finish+0x341/0x5f0 net/ipv4/ip_input.c:233 NF_HOOK+0x3a4/0x450 include/linux/netfilter.h:314 NF_HOOK+0x3a4/0x450 include/linux/netfilter.h:314 __netif_receive_skb_one_core net/core/dev.c:5672 [inline] __netif_receive_skb+0x2bf/0x650 net/core/dev.c:5785 process_backlog+0x662/0x15b0 net/core/dev.c:6117 __napi_poll+0xcb/0x490 net/core/dev.c:6883 napi_poll net/core/dev.c:6952 [inline] net_rx_action+0x89b/0x1240 net/core/dev.c:7074 handle_softirqs+0x2d4/0x9b0 kernel/softirq.c:561 __do_softirq kernel/softirq.c:595 [inline] invoke_softirq kernel/softirq.c:435 [inline] __irq_exit_rcu+0xf7/0x220 kernel/softirq.c:662 irq_exit_rcu+0x9/0x30 kernel/softirq.c:678 instr_sysvec_apic_timer_interrupt arch/x86/kernel/apic/apic.c:1049 [inline] sysvec_apic_timer_interrupt+0x57/0xc0 arch/x86/kernel/apic/apic.c:1049 asm_sysvec_apic_timer_interrupt+0x1a/0x20 arch/x86/include/asm/idtentry.h:702 RIP: 0033:0x7f34f4519ad5 Code: 85 d2 74 0d 0f 10 02 48 8d 54 24 20 0f 11 44 24 20 64 8b 04 25 18 00 00 00 85 c0 75 27 41 b8 08 00 00 00 b8 0f 01 00 00 0f 05 <48> 3d 00 f0 ff ff 76 75 48 8b 15 24 73 0d 00 f7 d8 64 89 02 48 83 RSP: 002b:00007ffec5b32ce0 EFLAGS: 00000246 RAX: 0000000000000001 RBX: 00000000000668a0 RCX: 00007f34f4519ad5 RDX: 00007ffec5b32d00 RSI: 0000000000000004 RDI: 0000564f4bc6cae0 RBP: 0000564f4bc6b5a0 R08: 0000000000000008 R09: 0000000000000000 R10: 00007ffec5b32de8 R11: 0000000000000246 R12: 0000564f48ea8aa4 R13: 0000000000000001 R14: 0000564f48ea93e8 R15: 00007ffec5b32d68 </TASK>
Eric noted a probable shinfo->nr_frags corruption, which indeed occurs.
The root cause is a buggy MPTCP option len computation in some circumstances: the ADD_ADDR option should be mutually exclusive with DSS since the blamed commit.
Still, mptcp_established_options_add_addr() tries to set the relevant info in mptcp_out_options, if the remaining space is large enough even when DSS is present.
Since the ADD_ADDR infos and the DSS share the same union fields, adding first corrupts the latter. In the worst-case scenario, such corruption increases the DSS binary layout, exceeding the computed length and possibly overwriting the skb shared info.
Address the issue by enforcing mutual exclusion in mptcp_established_options_add_addr(), too.
Cc: stable@vger.kernel.org Reported-by: syzbot+38a095a81f30d82884c1@syzkaller.appspotmail.com Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/538 Fixes: 1bff1e43a30e ("mptcp: optimize out option generation") Signed-off-by: Paolo Abeni pabeni@redhat.com Reviewed-by: Matthieu Baerts (NGI0) matttbe@kernel.org Reviewed-by: Eric Dumazet edumazet@google.com Link: https://patch.msgid.link/025d9df8cde3c9a557befc47e9bc08fbbe3476e5.1734771049... Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/mptcp/options.c | 7 +++++++ 1 file changed, 7 insertions(+)
--- a/net/mptcp/options.c +++ b/net/mptcp/options.c @@ -667,8 +667,15 @@ static bool mptcp_established_options_ad &echo, &drop_other_suboptions)) return false;
+ /* + * Later on, mptcp_write_options() will enforce mutually exclusion with + * DSS, bail out if such option is set and we can't drop it. + */ if (drop_other_suboptions) remaining += opt_size; + else if (opts->suboptions & OPTION_MPTCP_DSS) + return false; + len = mptcp_add_addr_len(opts->addr.family, echo, !!opts->addr.port); if (remaining < len) return false;
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Paolo Abeni pabeni@redhat.com
commit 449e6912a2522af672e99992e1201a454910864e upstream.
If the recvmsg() blocks after receiving some data - i.e. due to SO_RCVLOWAT - the MPTCP code will attempt multiple times to adjust the receive buffer size, wrongly accounting every time the cumulative of received data - instead of accounting only for the delta.
Address the issue moving mptcp_rcv_space_adjust just after the data reception and passing it only the just received bytes.
This also removes an unneeded difference between the TCP and MPTCP RX code path implementation.
Fixes: 581302298524 ("mptcp: error out earlier on disconnect") Cc: stable@vger.kernel.org Signed-off-by: Paolo Abeni pabeni@redhat.com Reviewed-by: Mat Martineau martineau@kernel.org Signed-off-by: Matthieu Baerts (NGI0) matttbe@kernel.org Link: https://patch.msgid.link/20241230-net-mptcp-rbuf-fixes-v1-1-8608af434ceb@ker... Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/mptcp/protocol.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-)
--- a/net/mptcp/protocol.c +++ b/net/mptcp/protocol.c @@ -1902,6 +1902,8 @@ do_error: goto out; }
+static void mptcp_rcv_space_adjust(struct mptcp_sock *msk, int copied); + static int __mptcp_recvmsg_mskq(struct mptcp_sock *msk, struct msghdr *msg, size_t len, int flags, @@ -1955,6 +1957,7 @@ static int __mptcp_recvmsg_mskq(struct m break; }
+ mptcp_rcv_space_adjust(msk, copied); return copied; }
@@ -2231,7 +2234,6 @@ static int mptcp_recvmsg(struct sock *sk }
pr_debug("block timeout %ld\n", timeo); - mptcp_rcv_space_adjust(msk, copied); err = sk_wait_data(sk, &timeo, NULL); if (err < 0) { err = copied ? : err; @@ -2239,8 +2241,6 @@ static int mptcp_recvmsg(struct sock *sk } }
- mptcp_rcv_space_adjust(msk, copied); - out_err: if (cmsg_flags && copied >= 0) { if (cmsg_flags & MPTCP_CMSG_TS)
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Paolo Abeni pabeni@redhat.com
commit 551844f26da2a9f76c0a698baaffa631d1178645 upstream.
Under some corner cases the MPTCP protocol can end-up invoking mptcp_cleanup_rbuf() when no data has been copied, but such helper assumes the opposite condition.
Explicitly drop such assumption and performs the costly call only when strictly needed - before releasing the msk socket lock.
Fixes: fd8976790a6c ("mptcp: be careful on MPTCP-level ack.") Cc: stable@vger.kernel.org Signed-off-by: Paolo Abeni pabeni@redhat.com Reviewed-by: Mat Martineau martineau@kernel.org Signed-off-by: Matthieu Baerts (NGI0) matttbe@kernel.org Link: https://patch.msgid.link/20241230-net-mptcp-rbuf-fixes-v1-2-8608af434ceb@ker... Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/mptcp/protocol.c | 18 +++++++++--------- 1 file changed, 9 insertions(+), 9 deletions(-)
--- a/net/mptcp/protocol.c +++ b/net/mptcp/protocol.c @@ -528,13 +528,13 @@ static void mptcp_send_ack(struct mptcp_ mptcp_subflow_send_ack(mptcp_subflow_tcp_sock(subflow)); }
-static void mptcp_subflow_cleanup_rbuf(struct sock *ssk) +static void mptcp_subflow_cleanup_rbuf(struct sock *ssk, int copied) { bool slow;
slow = lock_sock_fast(ssk); if (tcp_can_send_ack(ssk)) - tcp_cleanup_rbuf(ssk, 1); + tcp_cleanup_rbuf(ssk, copied); unlock_sock_fast(ssk, slow); }
@@ -551,7 +551,7 @@ static bool mptcp_subflow_could_cleanup( (ICSK_ACK_PUSHED2 | ICSK_ACK_PUSHED))); }
-static void mptcp_cleanup_rbuf(struct mptcp_sock *msk) +static void mptcp_cleanup_rbuf(struct mptcp_sock *msk, int copied) { int old_space = READ_ONCE(msk->old_wspace); struct mptcp_subflow_context *subflow; @@ -559,14 +559,14 @@ static void mptcp_cleanup_rbuf(struct mp int space = __mptcp_space(sk); bool cleanup, rx_empty;
- cleanup = (space > 0) && (space >= (old_space << 1)); - rx_empty = !__mptcp_rmem(sk); + cleanup = (space > 0) && (space >= (old_space << 1)) && copied; + rx_empty = !__mptcp_rmem(sk) && copied;
mptcp_for_each_subflow(msk, subflow) { struct sock *ssk = mptcp_subflow_tcp_sock(subflow);
if (cleanup || mptcp_subflow_could_cleanup(ssk, rx_empty)) - mptcp_subflow_cleanup_rbuf(ssk); + mptcp_subflow_cleanup_rbuf(ssk, copied); } }
@@ -2183,9 +2183,6 @@ static int mptcp_recvmsg(struct sock *sk
copied += bytes_read;
- /* be sure to advertise window change */ - mptcp_cleanup_rbuf(msk); - if (skb_queue_empty(&msk->receive_queue) && __mptcp_move_skbs(msk)) continue;
@@ -2234,6 +2231,7 @@ static int mptcp_recvmsg(struct sock *sk }
pr_debug("block timeout %ld\n", timeo); + mptcp_cleanup_rbuf(msk, copied); err = sk_wait_data(sk, &timeo, NULL); if (err < 0) { err = copied ? : err; @@ -2241,6 +2239,8 @@ static int mptcp_recvmsg(struct sock *sk } }
+ mptcp_cleanup_rbuf(msk, copied); + out_err: if (cmsg_flags && copied >= 0) { if (cmsg_flags & MPTCP_CMSG_TS)
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Kashyap Desai kashyap.desai@broadcom.com
commit 79d330fbdffd8cee06d8bdf38d82cb62d8363a27 upstream.
Gen P7 supports up to 13 SGEs for now. WQE software structure can hold only 6 now. Since the max send sge is reported as 13, the stack can give requests up to 13 SGEs. This is causing traffic failures and system crashes.
Use the define for max SGE supported for variable size. This will work for both static and variable WQEs.
Fixes: 227f51743b61 ("RDMA/bnxt_re: Fix the max WQE size for static WQE support") Signed-off-by: Kashyap Desai kashyap.desai@broadcom.com Signed-off-by: Selvin Xavier selvin.xavier@broadcom.com Link: https://patch.msgid.link/20241204075416.478431-2-kalesh-anakkur.purayil@broa... Signed-off-by: Leon Romanovsky leon@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/infiniband/hw/bnxt_re/qplib_fp.h | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-)
--- a/drivers/infiniband/hw/bnxt_re/qplib_fp.h +++ b/drivers/infiniband/hw/bnxt_re/qplib_fp.h @@ -113,7 +113,6 @@ struct bnxt_qplib_sge { u32 size; };
-#define BNXT_QPLIB_QP_MAX_SGL 6 struct bnxt_qplib_swq { u64 wr_id; int next_idx; @@ -153,7 +152,7 @@ struct bnxt_qplib_swqe { #define BNXT_QPLIB_SWQE_FLAGS_UC_FENCE BIT(2) #define BNXT_QPLIB_SWQE_FLAGS_SOLICIT_EVENT BIT(3) #define BNXT_QPLIB_SWQE_FLAGS_INLINE BIT(4) - struct bnxt_qplib_sge sg_list[BNXT_QPLIB_QP_MAX_SGL]; + struct bnxt_qplib_sge sg_list[BNXT_VAR_MAX_SGE]; int num_sge; /* Max inline data is 96 bytes */ u32 inline_len;
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Yihang Li liyihang9@huawei.com
commit 3f030550476566b12091687c70071d05ad433e0d upstream.
In commit 63f0733d07ce ("scsi: hisi_sas: Allocate DFX memory during dump trigger"), the memory allocation time of the DFX is changed from device initialization to dump occurs, so .debugfs_itct is not a valid address and do not need to check.
The parameter hisi_sas_debugfs_enable is enough to check whether automatic debugfs dump is triggered, so remove redunant checks.
Fixes: 63f0733d07ce ("scsi: hisi_sas: Allocate DFX memory during dump trigger") Signed-off-by: Yihang Li liyihang9@huawei.com Signed-off-by: Xiang Chen chenxiang66@hisilicon.com Link: https://lore.kernel.org/r/1705904747-62186-3-git-send-email-chenxiang66@hisi... Signed-off-by: Martin K. Petersen martin.petersen@oracle.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/scsi/hisi_sas/hisi_sas_main.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
--- a/drivers/scsi/hisi_sas/hisi_sas_main.c +++ b/drivers/scsi/hisi_sas/hisi_sas_main.c @@ -1579,7 +1579,7 @@ static int hisi_sas_controller_prereset( return -EPERM; }
- if (hisi_sas_debugfs_enable && hisi_hba->debugfs_itct[0].itct) + if (hisi_sas_debugfs_enable) hisi_hba->hw->debugfs_snapshot_regs(hisi_hba);
return 0; @@ -1967,7 +1967,7 @@ static bool hisi_sas_internal_abort_time struct hisi_hba *hisi_hba = dev_to_hisi_hba(device); struct hisi_sas_internal_abort_data *timeout = data;
- if (hisi_sas_debugfs_enable && hisi_hba->debugfs_itct[0].itct) { + if (hisi_sas_debugfs_enable) { /* * If timeout occurs in device gone scenario, to avoid * circular dependency like:
On Mon, 6 Jan 2025 at 20:53, Greg Kroah-Hartman gregkh@linuxfoundation.org wrote:
This is the start of the stable review cycle for the 6.6.70 release. There are 222 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Wed, 08 Jan 2025 15:11:04 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v6.x/stable-review/patch-6.6.70-rc1.... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-6.6.y and the diffstat can be found below.
thanks,
greg k-h
The following build errors were noticed while building the allmodconfig builds on arm64 on the stable-rc linux-6.6.y branch.
This is first seen on 5652330123c6a64b444f3012d9c9013742a872e7. GOOD: v6.6.69 BAD: 5652330123c6a64b444f3012d9c9013742a872e7
Reported-by: Linux Kernel Functional Testing lkft@linaro.org
Build error: ============ drivers/i2c/busses/i2c-xgene-slimpro.c:95: error: "PCC_SIGNATURE" redefined [-Werror] 95 | #define PCC_SIGNATURE 0x50424300 | In file included from drivers/i2c/busses/i2c-xgene-slimpro.c:12: include/acpi/pcc.h:23: note: this is the location of the previous definition 23 | #define PCC_SIGNATURE 0x50434300 | cc1: all warnings being treated as errors make[6]: *** [scripts/Makefile.build:243: drivers/i2c/busses/i2c-xgene
Links: ------- - https://qa-reports.linaro.org/lkft/linux-stable-rc-linux-6.6.y/build/v6.6.69... - https://qa-reports.linaro.org/lkft/linux-stable-rc-linux-6.6.y/build/v6.6.69... - https://qa-reports.linaro.org/lkft/linux-stable-rc-linux-6.6.y/build/v6.6.69...
metadata: ---- git sha: 5652330123c6a64b444f3012d9c9013742a872e7 git repo: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git kernel config: https://storage.tuxsuite.com/public/linaro/lkft/builds/2rGGBoqD6FxSfg6IKJLK4... build url: https://storage.tuxsuite.com/public/linaro/lkft/builds/2rGGBoqD6FxSfg6IKJLK4... toolchain: clang, gcc-13, gcc-8 arch: arm64 config: allmodconfig
-- Linaro LKFT https://lkft.linaro.org
On Mon, 06 Jan 2025 16:13:24 +0100 Greg Kroah-Hartman gregkh@linuxfoundation.org wrote:
This is the start of the stable review cycle for the 6.6.70 release. There are 222 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Wed, 08 Jan 2025 15:11:04 +0000. Anything received after that time might be too late.
I got (in both the two runs I did):
[ 0.022989] ------------[ cut here ]------------ [ 0.023006] Usage of MAX_NUMNODES is deprecated. Use NUMA_NO_NODE instead [ 0.023362] WARNING: CPU: 0 PID: 0 at mm/memblock.c:1325 memblock_set_node+0xcf/0xe0 [ 0.023841] Modules linked in: [ 0.023988] CPU: 0 PID: 0 Comm: swapper Not tainted 6.6.70-rc1-g5652330123c6 #1 [ 0.024062] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.15.0-1 04/01/2014 [ 0.024156] RIP: 0010:memblock_set_node+0xcf/0xe0 [ 0.024215] Code: 2c 07 48 83 c0 18 48 ff c9 75 f0 48 89 df e8 78 01 00 00 31 c0 eb a8 c6 05 a7 24 a6 ff 01 48 c7 c7 1e 37 7b a7 e8 71 aa 07 fe <0f> 0b e9 67 ff ff ff e8 75 f3 e3 fe 0f 1f 44 00 00 55 41 57 41 56 [ 0.024361] RSP: 0000:ffffffffa7a03eb0 EFLAGS: 00000046 ORIG_RAX: 0000000000000000 [ 0.024386] RAX: 0000000000000000 RBX: ffffffffa82a2240 RCX: ffffffffa7a52068 [ 0.024397] RDX: ffffffffa7a03daf RSI: 0000000000000082 RDI: ffffffffa7a52060 [ 0.024407] RBP: 0000000000000000 R08: ffffffffa7a52240 R09: 4f4e5f414d554e20 [ 0.024417] R10: 2045444f4e5f4f4e R11: 0a64616574736e69 R12: 0000000000000040 [ 0.024428] R13: 0000000000000000 R14: ffffffffffffffff R15: 0000000000000000 [ 0.024465] FS: 0000000000000000(0000) GS:ffffffffa81a4000(0000) knlGS:0000000000000000 [ 0.024490] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 0.024502] CR2: ffff8c88c6001000 CR3: 0000000005628000 CR4: 00000000000000b0 [ 0.024585] Call Trace: [ 0.024778] <TASK> [ 0.024953] ? __warn+0xc3/0x1c0 [ 0.025012] ? memblock_set_node+0xcf/0xe0 [ 0.025024] ? report_bug+0x144/0x1e0 [ 0.025042] ? early_fixup_exception+0x46/0x90 [ 0.025051] ? early_idt_handler_common+0x2f/0x40 [ 0.025065] ? memblock_set_node+0xcf/0xe0 [ 0.025074] ? acpi_numa_processor_affinity_init+0x80/0x80 [ 0.025084] ? numa_init+0x5e/0x190 [ 0.025091] ? x86_numa_init+0x15/0x40 [ 0.025099] ? setup_arch+0x4a3/0x5a0 [ 0.025105] ? start_kernel+0x5a/0x3a0 [ 0.025115] ? x86_64_start_reservations+0x20/0x20 [ 0.025122] ? x86_64_start_kernel+0xa7/0xb0 [ 0.025128] ? secondary_startup_64_no_verify+0x179/0x17b [ 0.025156] </TASK> [ 0.025196] ---[ end trace 0000000000000000 ]---
I hope the helps!
Cheers, Miguel
On 1/6/25 07:13, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 6.6.70 release. There are 222 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Wed, 08 Jan 2025 15:11:04 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v6.x/stable-review/patch-6.6.70-rc1.... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-6.6.y and the diffstat can be found below.
thanks,
greg k-h
On ARCH_BRCMSTB using 32-bit and 64-bit ARM kernels, build tested on BMIPS_GENERIC:
Tested-by: Florian Fainelli florian.fainelli@broadcom.com
Am 06.01.2025 um 16:13 schrieb Greg Kroah-Hartman:
This is the start of the stable review cycle for the 6.6.70 release. There are 222 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
6.6.70-rc1 builds, boots and seems to work on my 2-socket Ivy Bridge Xeon E5-2697 v2 server, but I see a scary looking dmesg warning, right after booting. As it is currently running, I will try building 6.12.9-rc with it, and see how that goes. The dmesg warnings I see look similar to that already reported by Miguel Ojeda.
[ 0.012910] ACPI: Reserving ERST table memory at [mem 0x7cca41c8-0x7cca43f7] [ 0.012911] ACPI: Reserving HEST table memory at [mem 0x7cca43f8-0x7cca449f] [ 0.012912] ACPI: Reserving BERT table memory at [mem 0x7cca44a0-0x7cca44cf] [ 0.012938] ------------[ cut here ]------------ [ 0.012939] Usage of MAX_NUMNODES is deprecated. Use NUMA_NO_NODE instead [ 0.012948] WARNING: CPU: 0 PID: 0 at mm/memblock.c:1324 memblock_set_node+0x10c/0x120 [ 0.012955] Modules linked in: [ 0.012956] CPU: 0 PID: 0 Comm: swapper Not tainted 6.6.70-rc1+ #1 [ 0.012959] Hardware name: ASUSTeK COMPUTER INC. Z9PE-D16 Series/Z9PE-D16 Series, BIOS 5601 06/11/2015 [ 0.012960] RIP: 0010:memblock_set_node+0x10c/0x120 [ 0.012962] Code: 0f 87 fe c0 ca 00 41 83 e4 01 74 0b 41 bc ff ff ff ff e9 50 ff ff ff 48 c7 c7 d0 28 fa 98 c6 05 58 0a 13 02 01 e8 14 86 ce ff <0f> 0b eb de e8 fb e3 d0 00 66 66 2e 0f 1f 84 00 00 00 00 00 90 90 [ 0.012965] RSP: 0000:ffffffff99803df8 EFLAGS: 00010046 ORIG_RAX: 0000000000000000 [ 0.012967] RAX: 0000000000000000 RBX: ffffffff999fc990 RCX: 0000000000000000 [ 0.012969] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000 [ 0.012970] RBP: ffffffff99803e28 R08: 0000000000000000 R09: 0000000000000000 [ 0.012971] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000 [ 0.012972] R13: 0000000000000000 R14: ffffffffffffffff R15: 000000000008a000 [ 0.012973] FS: 0000000000000000(0000) GS:ffffffff99bb0000(0000) knlGS:0000000000000000 [ 0.012975] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 0.012976] CR2: ffff9e62eac01000 CR3: 0000000ae9e36000 CR4: 00000000000200f0 [ 0.012978] Call Trace: [ 0.012979] <TASK> [ 0.012981] ? show_regs+0x6d/0x80 [ 0.012987] ? __warn+0x89/0x160 [ 0.012990] ? memblock_set_node+0x10c/0x120 [ 0.012993] ? report_bug+0x17e/0x1b0 [ 0.012996] ? fixup_exception+0x27/0x370 [ 0.013000] ? early_fixup_exception+0x9b/0xf0 [ 0.013004] ? do_early_exception+0x25/0x80 [ 0.013008] ? early_idt_handler_common+0x2f/0x3a [ 0.013013] ? memblock_set_node+0x10c/0x120 [ 0.013015] ? memblock_set_node+0x10c/0x120 [ 0.013017] ? __pfx_x86_acpi_numa_init+0x10/0x10 [ 0.013021] numa_init+0x82/0x570 [ 0.013025] x86_numa_init+0x1f/0x60 [ 0.013028] initmem_init+0xe/0x20 [ 0.013030] setup_arch+0x98b/0xea0 [ 0.013034] start_kernel+0x6c/0xb00 [ 0.013039] ? load_ucode_intel_bsp+0x3d/0x80 [ 0.013044] x86_64_start_reservations+0x18/0x30 [ 0.013046] x86_64_start_kernel+0xbf/0x110 [ 0.013049] secondary_startup_64_no_verify+0x18f/0x19b [ 0.013056] </TASK> [ 0.013057] ---[ end trace 0000000000000000 ]--- [ 0.013063] SRAT: PXM 0 -> APIC 0x00 -> Node 0 [ 0.013065] SRAT: PXM 0 -> APIC 0x01 -> Node 0 [ 0.013066] SRAT: PXM 0 -> APIC 0x02 -> Node 0 [ 0.013066] SRAT: PXM 0 -> APIC 0x03 -> Node 0
Beste Grüße, Peter Schneider
Am 07.01.2025 um 01:36 schrieb Peter Schneider:
Am 06.01.2025 um 16:13 schrieb Greg Kroah-Hartman:
This is the start of the stable review cycle for the 6.6.70 release. There are 222 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
6.6.70-rc1 builds, boots and seems to work on my 2-socket Ivy Bridge Xeon E5-2697 v2 server, but I see a scary looking dmesg warning, right after booting. As it is currently running, I will try building 6.12.9-rc with it, and see how that goes. The dmesg warnings I see look similar to that already reported by Miguel Ojeda.
[...]
So running 6.6.70-rc1, despite the above mentioned initial warning message, I could build 6.12.9-rc1 just fine with it. It took the usual ~20 minutes, and no more dmesgs warning have been added.
Beste Grüße, Peter Schneider
Hello,
On Mon, 6 Jan 2025 16:13:24 +0100 Greg Kroah-Hartman gregkh@linuxfoundation.org wrote:
This is the start of the stable review cycle for the 6.6.70 release. There are 222 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Wed, 08 Jan 2025 15:11:04 +0000. Anything received after that time might be too late.
This rc kernel passes DAMON functionality test[1] on my test machine. Attaching the test results summary below. Please note that I retrieved the kernel from linux-stable-rc tree[2].
Tested-by: SeongJae Park sj@kernel.org
[1] https://github.com/damonitor/damon-tests/tree/next/corr [2] 5652330123c6 ("Linux 6.6.70-rc1")
Thanks, SJ
[...]
---
ok 1 selftests: damon: debugfs_attrs.sh # SKIP ok 2 selftests: damon: debugfs_schemes.sh # SKIP ok 3 selftests: damon: debugfs_target_ids.sh # SKIP ok 4 selftests: damon: debugfs_empty_targets.sh # SKIP ok 5 selftests: damon: debugfs_huge_count_read_write.sh # SKIP ok 6 selftests: damon: debugfs_duplicate_context_creation.sh # SKIP ok 7 selftests: damon: debugfs_rm_non_contexts.sh # SKIP ok 8 selftests: damon: sysfs.sh ok 9 selftests: damon: sysfs_update_removed_scheme_dir.sh ok 10 selftests: damon: reclaim.sh ok 11 selftests: damon: lru_sort.sh ok 1 selftests: damon-tests: kunit.sh ok 2 selftests: damon-tests: huge_count_read_write.sh # SKIP ok 3 selftests: damon-tests: buffer_overflow.sh ok 4 selftests: damon-tests: rm_contexts.sh # SKIP ok 5 selftests: damon-tests: record_null_deref.sh ok 6 selftests: damon-tests: dbgfs_target_ids_read_before_terminate_race.sh ok 7 selftests: damon-tests: dbgfs_target_ids_pid_leak.sh ok 8 selftests: damon-tests: damo_tests.sh ok 9 selftests: damon-tests: masim-record.sh ok 10 selftests: damon-tests: build_i386.sh ok 11 selftests: damon-tests: build_arm64.sh # SKIP ok 12 selftests: damon-tests: build_m68k.sh # SKIP ok 13 selftests: damon-tests: build_i386_idle_flag.sh ok 14 selftests: damon-tests: build_i386_highpte.sh ok 15 selftests: damon-tests: build_nomemcg.sh
PASS
On 1/6/25 07:13, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 6.6.70 release. There are 222 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Wed, 08 Jan 2025 15:11:04 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v6.x/stable-review/patch-6.6.70-rc1.... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-6.6.y and the diffstat can be found below.
thanks,
greg k-h
Built and booted successfully on RISC-V RV64 (HiFive Unmatched).
Tested-by: Ron Economos re@w6rz.net
On Mon, 6 Jan 2025 at 20:53, Greg Kroah-Hartman gregkh@linuxfoundation.org wrote:
This is the start of the stable review cycle for the 6.6.70 release. There are 222 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Wed, 08 Jan 2025 15:11:04 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v6.x/stable-review/patch-6.6.70-rc1.... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-6.6.y and the diffstat can be found below.
thanks,
greg k-h
As others have reported, boot warnings on x86 have been noticed
memblock: make memblock_set_node() also warn about use of MAX_NUMNODES [ Upstream commit e0eec24e2e199873f43df99ec39773ad3af2bff7 ]
diff --git a/mm/memblock.c b/mm/memblock.c index 87a2b4340ce4ea..ba64b47b7c3b24 100644 --- a/mm/memblock.c +++ b/mm/memblock.c @@ -1321,6 +1321,10 @@ int __init_memblock memblock_set_node(phys_addr_t base, phys_addr_t size, int start_rgn, end_rgn; int i, ret;
+ if (WARN_ONCE(nid == MAX_NUMNODES, + "Usage of MAX_NUMNODES is deprecated. Use NUMA_NO_NODE instead\n")) + nid = NUMA_NO_NODE; + ret = memblock_isolate_range(type, base, size, &start_rgn, &end_rgn); if (ret) return ret;
On Tue, Jan 07, 2025 at 03:54:43PM +0530, Naresh Kamboju wrote:
On Mon, 6 Jan 2025 at 20:53, Greg Kroah-Hartman gregkh@linuxfoundation.org wrote:
This is the start of the stable review cycle for the 6.6.70 release. There are 222 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Wed, 08 Jan 2025 15:11:04 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v6.x/stable-review/patch-6.6.70-rc1.... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-6.6.y and the diffstat can be found below.
thanks,
greg k-h
As others have reported, boot warnings on x86 have been noticed
memblock: make memblock_set_node() also warn about use of MAX_NUMNODES [ Upstream commit e0eec24e2e199873f43df99ec39773ad3af2bff7 ]
There's 8043832e2a12 ("memblock: use numa_valid_node() helper to check for invalid node ID") that fixes it
diff --git a/mm/memblock.c b/mm/memblock.c index 87a2b4340ce4ea..ba64b47b7c3b24 100644 --- a/mm/memblock.c +++ b/mm/memblock.c @@ -1321,6 +1321,10 @@ int __init_memblock memblock_set_node(phys_addr_t base, phys_addr_t size, int start_rgn, end_rgn; int i, ret;
- if (WARN_ONCE(nid == MAX_NUMNODES,
"Usage of MAX_NUMNODES is deprecated. Use NUMA_NO_NODE instead\n"))
- nid = NUMA_NO_NODE;
- ret = memblock_isolate_range(type, base, size, &start_rgn, &end_rgn); if (ret) return ret;
--
Reported-by: Linux Kernel Functional Testing lkft@linaro.org
------------[ cut here ]------------ [ 0.042522] Usage of MAX_NUMNODES is deprecated. Use NUMA_NO_NODE instead [ 0.043058] WARNING: CPU: 0 PID: 0 at mm/memblock.c:1324 memblock_set_node+0xf0/0x100 [ 0.043730] Modules linked in: [ 0.043957] CPU: 0 PID: 0 Comm: swapper Not tainted 6.6.70-rc1 #1 [ 0.044026] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2 04/01/2014 [ 0.044174] RIP: 0010:memblock_set_node+0xf0/0x100 [ 0.044293] Code: 3d 81 da ef ff 00 74 0b 41 bc ff ff ff ff e9 6c ff ff ff 48 c7 c7 30 40 e3 8e 48 89 75 d0 c6 05 62 da ef ff 01 e8 20 cb 04 fe <0f> 0b 48 8b 75 d0 eb d6 e8 43 db 0e ff 0f 1f 00 90 90 90 90 90 90 [ 0.044494] RSP: 0000:ffffffff8f003e08 EFLAGS: 00010082 ORIG_RAX: 0000000000000000 [ 0.044537] RAX: 0000000000000000 RBX: ffffffff8f46da30 RCX: 0000000000000000 [ 0.044553] RDX: ffffffff8f156f08 RSI: 0000000000000082 RDI: 0000000000000001 [ 0.044588] RBP: ffffffff8f003e38 R08: 0000000000000000 R09: 4f4e5f414d554e20 [ 0.044606] R10: 2045444f4e5f4f4e R11: 0a64616574736e69 R12: 0000000000000040 [ 0.044621] R13: 0000000000000000 R14: 000000013ee00000 R15: 0000000000014750 [ 0.044673] FS: 0000000000000000(0000) GS:ffffffff8f347000(0000) knlGS:0000000000000000 [ 0.044707] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 0.044723] CR2: ffff9ab33ffff000 CR3: 0000000140c44000 CR4: 00000000000000b0 [ 0.044816] Call Trace: [ 0.045237] <TASK> [ 0.045521] ? show_regs+0x69/0x80 [ 0.045593] ? __warn+0x8d/0x150 [ 0.045611] ? memblock_set_node+0xf0/0x100 [ 0.045629] ? report_bug+0x171/0x1a0 [ 0.045648] ? fixup_exception+0x2b/0x310 [ 0.045669] ? early_fixup_exception+0xb3/0xd0 [ 0.045687] ? do_early_exception+0x1f/0x60 [ 0.045716] ? early_idt_handler_common+0x2f/0x40 [ 0.045742] ? memblock_set_node+0xf0/0x100 [ 0.045760] ? memblock_set_node+0xf0/0x100 [ 0.045792] ? __pfx_x86_acpi_numa_init+0x10/0x10 [ 0.045817] numa_init+0x8b/0x600 [ 0.045932] x86_numa_init+0x23/0x50 [ 0.045953] initmem_init+0x12/0x20 [ 0.045969] setup_arch+0x88b/0xce0 [ 0.045988] start_kernel+0x76/0x6d0 [ 0.046008] x86_64_start_reservations+0x1c/0x30 [ 0.046022] x86_64_start_kernel+0xca/0xe0 [ 0.046037] secondary_startup_64_no_verify+0x178/0x17b [ 0.046111] </TASK> [ 0.046180] ---[ end trace 0000000000000000 ]---
Links:
- https://qa-reports.linaro.org/lkft/linux-stable-rc-linux-6.6.y/build/v6.6.69...
- https://qa-reports.linaro.org/lkft/linux-stable-rc-linux-6.6.y/build/v6.6.69...
## Build
- kernel: 6.6.70-rc1
- git: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git
- git commit: 5652330123c6a64b444f3012d9c9013742a872e7
- git describe: v6.6.69-223-g5652330123c6
- test details:
https://qa-reports.linaro.org/lkft/linux-stable-rc-linux-6.6.y/build/v6.6.69...
## Test Regressions (compared to v6.6.68-87-g159cc5fd9b13)
- arm64, build
- clang-19-allmodconfig
- clang-19-allyesconfig
- gcc-13-allmodconfig
- gcc-13-allyesconfig
## Metric Regressions (compared to v6.6.68-87-g159cc5fd9b13)
## Test Fixes (compared to v6.6.68-87-g159cc5fd9b13)
## Metric Fixes (compared to v6.6.68-87-g159cc5fd9b13)
## Test result summary total: 150166, pass: 122611, fail: 4998, skip: 22391, xfail: 166
## Build Summary
- arc: 6 total, 5 passed, 1 failed
- arm: 132 total, 132 passed, 0 failed
- arm64: 44 total, 38 passed, 6 failed
- i386: 31 total, 28 passed, 3 failed
- mips: 30 total, 25 passed, 5 failed
- parisc: 5 total, 5 passed, 0 failed
- powerpc: 36 total, 32 passed, 4 failed
- riscv: 23 total, 22 passed, 1 failed
- s390: 18 total, 14 passed, 4 failed
- sh: 12 total, 10 passed, 2 failed
- sparc: 9 total, 8 passed, 1 failed
- x86_64: 36 total, 35 passed, 1 failed
## Test suites summary
- boot
- commands
- kselftest-arm64
- kselftest-breakpoints
- kselftest-capabilities
- kselftest-cgroup
- kselftest-clone3
- kselftest-core
- kselftest-cpu-hotplug
- kselftest-cpufreq
- kselftest-efivarfs
- kselftest-exec
- kselftest-filesystems
- kselftest-filesystems-binderfs
- kselftest-filesystems-epoll
- kselftest-firmware
- kselftest-fpu
- kselftest-ftrace
- kselftest-futex
- kselftest-gpio
- kselftest-intel_pstate
- kselftest-ipc
- kselftest-kcmp
- kselftest-kvm
- kselftest-livepatch
- kselftest-membarrier
- kselftest-memfd
- kselftest-mincore
- kselftest-mqueue
- kselftest-net
- kselftest-net-mptcp
- kselftest-openat2
- kselftest-ptrace
- kselftest-rseq
- kselftest-rtc
- kselftest-seccomp
- kselftest-sigaltstack
- kselftest-size
- kselftest-tc-testing
- kselftest-timers
- kselftest-tmpfs
- kselftest-tpm2
- kselftest-user_events
- kselftest-vDSO
- kselftest-x86
- kunit
- kvm-unit-tests
- libgpiod
- libhugetlbfs
- log-parser-boot
- log-parser-build-clang
- log-parser-build-gcc
- log-parser-test
- ltp-capability
- ltp-commands
- ltp-containers
- ltp-controllers
- ltp-cpuhotplug
- ltp-crypto
- ltp-cve
- ltp-dio
- ltp-fcntl-locktests
- ltp-filecaps
- ltp-fs
- ltp-fs_bind
- ltp-fs_perms_simple
- ltp-hugetlb
- ltp-ipc
- ltp-math
- ltp-mm
- ltp-nptl
- ltp-pty
- ltp-sched
- ltp-smoke
- ltp-syscalls
- ltp-tracing
- perf
- rcutorture
-- Linaro LKFT https://lkft.linaro.org
On Mon, Jan 06, 2025 at 04:13:24PM +0100, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 6.6.70 release. There are 222 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Tested-by: Mark Brown broonie@kernel.org
On Mon, 06 Jan 2025 16:13:24 +0100, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 6.6.70 release. There are 222 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Wed, 08 Jan 2025 15:11:04 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v6.x/stable-review/patch-6.6.70-rc1.... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-6.6.y and the diffstat can be found below.
thanks,
greg k-h
All tests passing for Tegra ...
Test results for stable-v6.6: 10 builds: 10 pass, 0 fail 26 boots: 26 pass, 0 fail 116 tests: 116 pass, 0 fail
Linux version: 6.6.70-rc1-g5652330123c6 Boards tested: tegra124-jetson-tk1, tegra186-p2771-0000, tegra194-p2972-0000, tegra194-p3509-0000+p3668-0000, tegra20-ventana, tegra210-p2371-2180, tegra210-p3450-0000, tegra30-cardhu-a04
Tested-by: Jon Hunter jonathanh@nvidia.com
Jon
The kernel, bpf tool, amd kselftest tool builds fine for v6.6.70-rc1 on x86 and arm64 Azure VM.
Tested-by: Hardik Garg hargar@linux.microsoft.com
Thanks, Hardik
On 1/6/25 08:13, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 6.6.70 release. There are 222 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Wed, 08 Jan 2025 15:11:04 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v6.x/stable-review/patch-6.6.70-rc1.... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-6.6.y and the diffstat can be found below.
thanks,
greg k-h
Compiled and booted on my test system. No dmesg regressions.
Tested-by: Shuah Khan skhan@linuxfoundation.org
thanks, -- Shuah
linux-stable-mirror@lists.linaro.org