This is the start of the stable review cycle for the 5.4.163 release. There are 92 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Wed, 01 Dec 2021 18:16:51 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.4.163-rc1... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.4.y and the diffstat can be found below.
thanks,
greg k-h
------------- Pseudo-Shortlog of commits:
Greg Kroah-Hartman gregkh@linuxfoundation.org Linux 5.4.163-rc1
Juergen Gross jgross@suse.com tty: hvc: replace BUG_ON() with negative return value
Juergen Gross jgross@suse.com xen/netfront: don't trust the backend response data blindly
Juergen Gross jgross@suse.com xen/netfront: disentangle tx_skb_freelist
Juergen Gross jgross@suse.com xen/netfront: don't read data from request on the ring page
Juergen Gross jgross@suse.com xen/netfront: read response from backend only once
Juergen Gross jgross@suse.com xen/blkfront: don't trust the backend response data blindly
Juergen Gross jgross@suse.com xen/blkfront: don't take local copy of a request from the ring page
Juergen Gross jgross@suse.com xen/blkfront: read response from backend only once
Juergen Gross jgross@suse.com xen: sync include/xen/interface/io/ring.h with Xen's newest version
Miklos Szeredi mszeredi@redhat.com fuse: release pipe buf after last use
Lin Ma linma@zju.edu.cn NFC: add NCI_UNREG flag to eliminate the race
Alexander Mikhalitsyn alexander.mikhalitsyn@virtuozzo.com shm: extend forced shm destroy to support objects from several IPC nses
David Hildenbrand david@redhat.com s390/mm: validate VMA in PGSTE manipulation functions
Steven Rostedt (VMware) rostedt@goodmis.org tracing: Check pid filtering when creating events
Stefano Garzarella sgarzare@redhat.com vhost/vsock: fix incorrect used length reported to the guest
Steve French stfrench@microsoft.com smb3: do not error on fsync when readonly
Weichao Guo guoweichao@oppo.com f2fs: set SBI_NEED_FSCK flag when inconsistent node block found
Vladimir Oltean vladimir.oltean@nxp.com net: mscc: ocelot: correctly report the timestamping RX filters in ethtool
Vladimir Oltean vladimir.oltean@nxp.com net: mscc: ocelot: don't downgrade timestamping RX filters in SIOCSHWTSTAMP
Guangbin Huang huangguangbin2@huawei.com net: hns3: fix VF RSS failed problem after PF enable multi-TCs
Tony Lu tonylu@linux.alibaba.com net/smc: Don't call clcsock shutdown twice when smc shutdown
Ziyang Xuan william.xuanziyang@huawei.com net: vlan: fix underflow for the real_dev refcnt
Huang Pei huangpei@loongson.cn MIPS: use 3-level pgtable for 64KB page size on MIPS_VA_BITS_48
Jesse Brandeburg jesse.brandeburg@intel.com igb: fix netpoll exit with traffic
Maurizio Lombardi mlombard@redhat.com nvmet: use IOCB_NOWAIT only if the filesystem supports it
Eric Dumazet edumazet@google.com tcp_cubic: fix spurious Hystart ACK train detections for not-cwnd-limited flows
Thomas Zeitlhofer thomas.zeitlhofer+lkml@ze-it.at PM: hibernate: use correct mode for swsusp_close()
Kumar Thangavel kumarthangavel.hcl@gmail.com net/ncsi : Add payload to be 32-bit aligned to fix dropped packets
Varun Prakash varun@chelsio.com nvmet-tcp: fix incomplete data digest send
Tony Lu tonylu@linux.alibaba.com net/smc: Ensure the active closing peer first closes clcsock
Mike Christie michael.christie@oracle.com scsi: core: sysfs: Fix setting device state to SDEV_RUNNING
Nikolay Aleksandrov nikolay@nvidia.com net: nexthop: release IPv6 per-cpu dsts when replacing a nexthop group
Nikolay Aleksandrov nikolay@nvidia.com net: ipv6: add fib6_nh_release_dsts stub
Diana Wang na.wang@corigine.com nfp: checking parameter process for rx-usecs/tx-usecs is invalid
Eric Dumazet edumazet@google.com ipv6: fix typos in __ip6_finish_output()
Nitesh B Venkatesh nitesh.b.venkatesh@intel.com iavf: Prevent changing static ITR values if adaptive moderation is on
Dan Carpenter dan.carpenter@oracle.com drm/vc4: fix error code in vc4_create_object()
Sreekanth Reddy sreekanth.reddy@broadcom.com scsi: mpt3sas: Fix kernel panic during drive powercycle test
Takashi Iwai tiwai@suse.de ARM: socfpga: Fix crash with CONFIG_FORTIRY_SOURCE
Trond Myklebust trond.myklebust@hammerspace.com NFSv42: Don't fail clone() unless the OP_CLONE operation failed
Peng Fan peng.fan@nxp.com firmware: arm_scmi: pm: Propagate return value to caller
Alexander Aring aahringo@redhat.com net: ieee802154: handle iftypes as u32
Takashi Iwai tiwai@suse.de ASoC: topology: Add missing rwsem around snd_ctl_remove() calls
Srinivas Kandagatla srinivas.kandagatla@linaro.org ASoC: qdsp6: q6routing: Conditionally reset FrontEnd Mixer
Florian Fainelli f.fainelli@gmail.com ARM: dts: BCM5301X: Add interrupt properties to GPIO node
Florian Fainelli f.fainelli@gmail.com ARM: dts: BCM5301X: Fix I2C controller interrupt
yangxingwu xingwu.yang@gmail.com netfilter: ipvs: Fix reuse connection if RS weight is 0
David Hildenbrand david@redhat.com proc/vmcore: fix clearing user buffer by properly using clear_user()
Marek Behún kabel@kernel.org arm64: dts: marvell: armada-37xx: Set pcie_reset_pin to gpio function
Marek Behún kabel@kernel.org pinctrl: armada-37xx: Correct PWM pins definitions
Pali Rohár pali@kernel.org PCI: aardvark: Fix support for PCI_BRIDGE_CTL_BUS_RESET on emulated bridge
Pali Rohár pali@kernel.org PCI: aardvark: Set PCI Bridge Class Code to PCI Bridge
Pali Rohár pali@kernel.org PCI: aardvark: Fix support for bus mastering and PCI_COMMAND on emulated bridge
Pali Rohár pali@kernel.org PCI: aardvark: Fix link training
Pali Rohár pali@kernel.org PCI: aardvark: Simplify initialization of rootcap on virtual bridge
Pali Rohár pali@kernel.org PCI: aardvark: Implement re-issuing config requests on CRS response
Pali Rohár pali@kernel.org PCI: aardvark: Fix PCIe Max Payload Size setting
Pali Rohár pali@kernel.org PCI: aardvark: Configure PCIe resources from 'ranges' DT property
Russell King rmk+kernel@armlinux.org.uk PCI: pci-bridge-emul: Fix array overruns, improve safety
Pali Rohár pali@kernel.org PCI: aardvark: Update comment about disabling link training
Pali Rohár pali@kernel.org PCI: aardvark: Move PCIe reset card code to advk_pcie_train_link()
Pali Rohár pali@kernel.org PCI: aardvark: Fix compilation on s390
Pali Rohár pali@kernel.org PCI: aardvark: Don't touch PCIe registers if no card connected
Pali Rohár pali@kernel.org PCI: aardvark: Replace custom macros by standard linux/pci_regs.h macros
Pali Rohár pali@kernel.org PCI: aardvark: Issue PERST via GPIO
Marek Behún marek.behun@nic.cz PCI: aardvark: Improve link training
Pali Rohár pali@kernel.org PCI: aardvark: Train link immediately after enabling training
Grzegorz Jaszczyk jaz@semihalf.com PCI: aardvark: Fix big endian support
Remi Pommarel repk@triplefau.lt PCI: aardvark: Wait for endpoint to be ready before training link
Marek Behún kabel@kernel.org PCI: aardvark: Deduplicate code in advk_pcie_rd_conf()
Dylan Hung dylan_hung@aspeedtech.com mdio: aspeed: Fix "Link is Down" issue
Adrian Hunter adrian.hunter@intel.com mmc: sdhci: Fix ADMA for PAGE_SIZE >= 64KiB
Steven Rostedt (VMware) rostedt@goodmis.org tracing: Fix pid filtering when triggers are attached
Jiri Olsa jolsa@redhat.com tracing/uprobe: Fix uprobe_perf_open probes iteration
Nicholas Piggin npiggin@gmail.com KVM: PPC: Book3S HV: Prevent POWER7/8 TLB flush flushing SLB
Stefano Stabellini stefano.stabellini@xilinx.com xen: detect uninitialized xenbus in xenbus_init
Stefano Stabellini stefano.stabellini@xilinx.com xen: don't continue xenstore initialization in case of errors
Dan Carpenter dan.carpenter@oracle.com staging: rtl8192e: Fix use after free in _rtl92e_pci_disconnect()
Noralf Trønnes noralf@tronnes.org staging/fbtft: Fix backlight
Jason Gerecke killertofu@gmail.com HID: wacom: Use "Confidence" flag to prevent reporting invalid contacts
Helge Deller deller@gmx.de Revert "parisc: Fix backtrace to always include init funtion names"
Hans Verkuil hverkuil-cisco@xs4all.nl media: cec: copy sequence field for the reply
Takashi Iwai tiwai@suse.de ALSA: ctxfi: Fix out-of-range access
Todd Kjos tkjos@google.com binder: fix test regression due to sender_euid change
Mathias Nyman mathias.nyman@linux.intel.com usb: hub: Fix locking issues with address0_mutex
Mathias Nyman mathias.nyman@linux.intel.com usb: hub: Fix usb enumeration issue due to address0 race
Ondrej Jirman megous@megous.com usb: typec: fusb302: Fix masking of comparator and bc_lvl interrupts
Nikolay Aleksandrov nikolay@nvidia.com net: nexthop: fix null pointer dereference when IPv6 is not enabled
Nathan Chancellor nathan@kernel.org usb: dwc2: hcd_queue: Fix use of floating point literal
Minas Harutyunyan Minas.Harutyunyan@synopsys.com usb: dwc2: gadget: Fix ISOC flow for elapsed frames
Mingjie Zhang superzmj@fibocom.com USB: serial: option: add Fibocom FM101-GL variants
Daniele Palmas dnlplm@gmail.com USB: serial: option: add Telit LE910S1 0x9200 composition
-------------
Diffstat:
.../pinctrl/marvell,armada-37xx-pinctrl.txt | 8 +- Documentation/networking/ipvs-sysctl.txt | 3 +- Makefile | 4 +- arch/arm/boot/dts/bcm5301x.dtsi | 4 +- arch/arm/mach-socfpga/core.h | 2 +- arch/arm/mach-socfpga/platsmp.c | 8 +- arch/arm64/boot/dts/marvell/armada-3720-db.dts | 3 + .../boot/dts/marvell/armada-3720-espressobin.dts | 1 + .../boot/dts/marvell/armada-3720-turris-mox.dts | 4 - arch/arm64/boot/dts/marvell/armada-37xx.dtsi | 2 +- arch/mips/Kconfig | 2 +- arch/parisc/kernel/vmlinux.lds.S | 3 +- arch/powerpc/kvm/book3s_hv_builtin.c | 5 +- arch/s390/mm/pgtable.c | 13 + drivers/android/binder.c | 2 +- drivers/block/xen-blkfront.c | 126 +++-- drivers/firmware/arm_scmi/scmi_pm_domain.c | 4 +- drivers/gpu/drm/vc4/vc4_bo.c | 2 +- drivers/hid/wacom_wac.c | 8 +- drivers/hid/wacom_wac.h | 1 + drivers/media/cec/cec-adap.c | 1 + drivers/mmc/host/sdhci.c | 21 +- drivers/mmc/host/sdhci.h | 4 +- .../ethernet/hisilicon/hns3/hns3vf/hclgevf_main.c | 4 +- drivers/net/ethernet/intel/iavf/iavf_ethtool.c | 30 +- drivers/net/ethernet/intel/igb/igb_main.c | 2 +- drivers/net/ethernet/mscc/ocelot.c | 11 +- drivers/net/ethernet/netronome/nfp/nfp_net.h | 3 - .../net/ethernet/netronome/nfp/nfp_net_ethtool.c | 2 +- drivers/net/phy/mdio-aspeed.c | 7 + drivers/net/xen-netfront.c | 255 +++++---- drivers/nvme/target/io-cmd-file.c | 4 +- drivers/nvme/target/tcp.c | 7 +- drivers/pci/controller/pci-aardvark.c | 576 +++++++++++++++++---- drivers/pci/pci-bridge-emul.c | 11 +- drivers/pinctrl/mvebu/pinctrl-armada-37xx.c | 17 +- drivers/scsi/mpt3sas/mpt3sas_scsih.c | 2 +- drivers/scsi/scsi_sysfs.c | 2 +- drivers/staging/fbtft/fb_ssd1351.c | 4 - drivers/staging/fbtft/fbtft-core.c | 9 +- drivers/staging/rtl8192e/rtl8192e/rtl_core.c | 3 +- drivers/tty/hvc/hvc_xen.c | 17 +- drivers/usb/core/hub.c | 23 +- drivers/usb/dwc2/gadget.c | 17 +- drivers/usb/dwc2/hcd_queue.c | 2 +- drivers/usb/serial/option.c | 5 + drivers/usb/typec/tcpm/fusb302.c | 6 +- drivers/vhost/vsock.c | 2 +- drivers/xen/xenbus/xenbus_probe.c | 27 +- fs/cifs/file.c | 35 +- fs/f2fs/node.c | 1 + fs/fuse/dev.c | 10 +- fs/nfs/nfs42xdr.c | 3 +- fs/proc/vmcore.c | 16 +- include/linux/ipc_namespace.h | 15 + include/linux/sched/task.h | 2 +- include/net/ip6_fib.h | 1 + include/net/ipv6_stubs.h | 1 + include/net/nfc/nci_core.h | 1 + include/net/nl802154.h | 7 +- include/xen/interface/io/ring.h | 293 ++++++----- ipc/shm.c | 189 +++++-- kernel/power/hibernate.c | 6 +- kernel/trace/trace.h | 24 +- kernel/trace/trace_events.c | 7 + kernel/trace/trace_uprobe.c | 1 + net/8021q/vlan.c | 3 - net/8021q/vlan_dev.c | 3 + net/ipv4/nexthop.c | 35 +- net/ipv4/tcp_cubic.c | 5 +- net/ipv6/af_inet6.c | 1 + net/ipv6/ip6_output.c | 2 +- net/ipv6/route.c | 19 + net/ncsi/ncsi-cmd.c | 24 +- net/netfilter/ipvs/ip_vs_core.c | 8 +- net/nfc/nci/core.c | 19 +- net/smc/af_smc.c | 8 +- net/smc/smc_close.c | 6 + sound/pci/ctxfi/ctamixer.c | 14 +- sound/pci/ctxfi/ctdaio.c | 16 +- sound/pci/ctxfi/ctresource.c | 7 +- sound/pci/ctxfi/ctresource.h | 4 +- sound/pci/ctxfi/ctsrc.c | 7 +- sound/soc/qcom/qdsp6/q6routing.c | 6 +- sound/soc/soc-topology.c | 3 + 85 files changed, 1464 insertions(+), 617 deletions(-)
From: Daniele Palmas dnlplm@gmail.com
commit e353f3e88720300c3d72f49a4bea54f42db1fa5e upstream.
Add the following Telit LE910S1 composition:
0x9200: tty
Signed-off-by: Daniele Palmas dnlplm@gmail.com Link: https://lore.kernel.org/r/20211119140319.10448-1-dnlplm@gmail.com Cc: stable@vger.kernel.org Signed-off-by: Johan Hovold johan@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/usb/serial/option.c | 2 ++ 1 file changed, 2 insertions(+)
--- a/drivers/usb/serial/option.c +++ b/drivers/usb/serial/option.c @@ -1267,6 +1267,8 @@ static const struct usb_device_id option .driver_info = NCTRL(2) }, { USB_DEVICE(TELIT_VENDOR_ID, 0x9010), /* Telit SBL FN980 flashing device */ .driver_info = NCTRL(0) | ZLP }, + { USB_DEVICE(TELIT_VENDOR_ID, 0x9200), /* Telit LE910S1 flashing device */ + .driver_info = NCTRL(0) | ZLP }, { USB_DEVICE_AND_INTERFACE_INFO(ZTE_VENDOR_ID, ZTE_PRODUCT_MF622, 0xff, 0xff, 0xff) }, /* ZTE WCDMA products */ { USB_DEVICE_AND_INTERFACE_INFO(ZTE_VENDOR_ID, 0x0002, 0xff, 0xff, 0xff), .driver_info = RSVD(1) },
From: Mingjie Zhang superzmj@fibocom.com
commit 88459e3e42760abb2299bbf6cb1026491170e02a upstream.
Update the USB serial option driver support for the Fibocom FM101-GL Cat.6 LTE modules as there are actually several different variants. - VID:PID 2cb7:01a2, FM101-GL are laptop M.2 cards (with MBIM interfaces for /Linux/Chrome OS) - VID:PID 2cb7:01a4, FM101-GL for laptop debug M.2 cards(with adb interface for /Linux/Chrome OS)
0x01a2: mbim, tty, tty, diag, gnss 0x01a4: mbim, diag, tty, adb, gnss, gnss
Here are the outputs of lsusb -v and usb-devices:
T: Bus=02 Lev=01 Prnt=01 Port=03 Cnt=01 Dev#= 86 Spd=5000 MxCh= 0 D: Ver= 3.20 Cls=00(>ifc ) Sub=00 Prot=00 MxPS= 9 #Cfgs= 1 P: Vendor=2cb7 ProdID=01a2 Rev= 5.04 S: Manufacturer=Fibocom Wireless Inc. S: Product=Fibocom FM101-GL Module S: SerialNumber=673326ce C:* #Ifs= 6 Cfg#= 1 Atr=a0 MxPwr=896mA A: FirstIf#= 0 IfCount= 2 Cls=02(comm.) Sub=0e Prot=00 I:* If#= 0 Alt= 0 #EPs= 1 Cls=02(comm.) Sub=0e Prot=00 Driver=cdc_mbim I: If#= 1 Alt= 0 #EPs= 0 Cls=0a(data ) Sub=00 Prot=02 Driver=cdc_mbim I:* If#= 1 Alt= 1 #EPs= 2 Cls=0a(data ) Sub=00 Prot=02 Driver=cdc_mbim I:* If#= 2 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=40 Driver=(none) I:* If#= 3 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=40 Driver=(none) I:* If#= 4 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=ff Prot=30 Driver=(none) I:* If#= 5 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=00 Prot=40 Driver=(none)
Bus 002 Device 084: ID 2cb7:01a2 Fibocom Wireless Inc. Fibocom FM101-GL Module Device Descriptor: bLength 18 bDescriptorType 1 bcdUSB 3.20 bDeviceClass 0 bDeviceSubClass 0 bDeviceProtocol 0 bMaxPacketSize0 9 idVendor 0x2cb7 idProduct 0x01a2 bcdDevice 5.04 iManufacturer 1 Fibocom Wireless Inc. iProduct 2 Fibocom FM101-GL Module iSerial 3 673326ce bNumConfigurations 1 Configuration Descriptor: bLength 9 bDescriptorType 2 wTotalLength 0x015d bNumInterfaces 6 bConfigurationValue 1 iConfiguration 4 MBIM_DUN_DUN_DIAG_NMEA bmAttributes 0xa0 (Bus Powered) Remote Wakeup MaxPower 896mA Interface Association: bLength 8 bDescriptorType 11 bFirstInterface 0 bInterfaceCount 2 bFunctionClass 2 Communications bFunctionSubClass 14 bFunctionProtocol 0 iFunction 0 Interface Descriptor: bLength 9 bDescriptorType 4 bInterfaceNumber 0 bAlternateSetting 0 bNumEndpoints 1 bInterfaceClass 2 Communications bInterfaceSubClass 14 bInterfaceProtocol 0 iInterface 5 Fibocom FM101-GL LTE Modem CDC Header: bcdCDC 1.10 CDC Union: bMasterInterface 0 bSlaveInterface 1 CDC MBIM: bcdMBIMVersion 1.00 wMaxControlMessage 4096 bNumberFilters 32 bMaxFilterSize 128 wMaxSegmentSize 2048 bmNetworkCapabilities 0x20 8-byte ntb input size CDC MBIM Extended: bcdMBIMExtendedVersion 1.00 bMaxOutstandingCommandMessages 64 wMTU 1500 Endpoint Descriptor: bLength 7 bDescriptorType 5 bEndpointAddress 0x81 EP 1 IN bmAttributes 3 Transfer Type Interrupt Synch Type None Usage Type Data wMaxPacketSize 0x0040 1x 64 bytes bInterval 9 bMaxBurst 0 Interface Descriptor: bLength 9 bDescriptorType 4 bInterfaceNumber 1 bAlternateSetting 0 bNumEndpoints 0 bInterfaceClass 10 CDC Data bInterfaceSubClass 0 bInterfaceProtocol 2 iInterface 0 Interface Descriptor: bLength 9 bDescriptorType 4 bInterfaceNumber 1 bAlternateSetting 1 bNumEndpoints 2 bInterfaceClass 10 CDC Data bInterfaceSubClass 0 bInterfaceProtocol 2 iInterface 6 MBIM Data Endpoint Descriptor: bLength 7 bDescriptorType 5 bEndpointAddress 0x8e EP 14 IN bmAttributes 2 Transfer Type Bulk Synch Type None Usage Type Data wMaxPacketSize 0x0400 1x 1024 bytes bInterval 0 bMaxBurst 6 Endpoint Descriptor: bLength 7 bDescriptorType 5 bEndpointAddress 0x0f EP 15 OUT bmAttributes 2 Transfer Type Bulk Synch Type None Usage Type Data wMaxPacketSize 0x0400 1x 1024 bytes bInterval 0 bMaxBurst 2 Interface Descriptor: bLength 9 bDescriptorType 4 bInterfaceNumber 2 bAlternateSetting 0 bNumEndpoints 3 bInterfaceClass 255 Vendor Specific Class bInterfaceSubClass 255 Vendor Specific Subclass bInterfaceProtocol 64 iInterface 0 ** UNRECOGNIZED: 05 24 00 10 01 ** UNRECOGNIZED: 05 24 01 00 00 ** UNRECOGNIZED: 04 24 02 02 ** UNRECOGNIZED: 05 24 06 00 00 Endpoint Descriptor: bLength 7 bDescriptorType 5 bEndpointAddress 0x83 EP 3 IN bmAttributes 3 Transfer Type Interrupt Synch Type None Usage Type Data wMaxPacketSize 0x000a 1x 10 bytes bInterval 9 bMaxBurst 0 Endpoint Descriptor: bLength 7 bDescriptorType 5 bEndpointAddress 0x82 EP 2 IN bmAttributes 2 Transfer Type Bulk Synch Type None Usage Type Data wMaxPacketSize 0x0400 1x 1024 bytes bInterval 0 bMaxBurst 0 Endpoint Descriptor: bLength 7 bDescriptorType 5 bEndpointAddress 0x01 EP 1 OUT bmAttributes 2 Transfer Type Bulk Synch Type None Usage Type Data wMaxPacketSize 0x0400 1x 1024 bytes bInterval 0 bMaxBurst 0 Interface Descriptor: bLength 9 bDescriptorType 4 bInterfaceNumber 3 bAlternateSetting 0 bNumEndpoints 3 bInterfaceClass 255 Vendor Specific Class bInterfaceSubClass 255 Vendor Specific Subclass bInterfaceProtocol 64 iInterface 0 ** UNRECOGNIZED: 05 24 00 10 01 ** UNRECOGNIZED: 05 24 01 00 00 ** UNRECOGNIZED: 04 24 02 02 ** UNRECOGNIZED: 05 24 06 00 00 Endpoint Descriptor: bLength 7 bDescriptorType 5 bEndpointAddress 0x85 EP 5 IN bmAttributes 3 Transfer Type Interrupt Synch Type None Usage Type Data wMaxPacketSize 0x000a 1x 10 bytes bInterval 9 bMaxBurst 0 Endpoint Descriptor: bLength 7 bDescriptorType 5 bEndpointAddress 0x84 EP 4 IN bmAttributes 2 Transfer Type Bulk Synch Type None Usage Type Data wMaxPacketSize 0x0400 1x 1024 bytes bInterval 0 bMaxBurst 0 Endpoint Descriptor: bLength 7 bDescriptorType 5 bEndpointAddress 0x02 EP 2 OUT bmAttributes 2 Transfer Type Bulk Synch Type None Usage Type Data wMaxPacketSize 0x0400 1x 1024 bytes bInterval 0 bMaxBurst 0 Interface Descriptor: bLength 9 bDescriptorType 4 bInterfaceNumber 4 bAlternateSetting 0 bNumEndpoints 2 bInterfaceClass 255 Vendor Specific Class bInterfaceSubClass 255 Vendor Specific Subclass bInterfaceProtocol 48 iInterface 0 Endpoint Descriptor: bLength 7 bDescriptorType 5 bEndpointAddress 0x03 EP 3 OUT bmAttributes 2 Transfer Type Bulk Synch Type None Usage Type Data wMaxPacketSize 0x0400 1x 1024 bytes bInterval 0 bMaxBurst 0 Endpoint Descriptor: bLength 7 bDescriptorType 5 bEndpointAddress 0x86 EP 6 IN bmAttributes 2 Transfer Type Bulk Synch Type None Usage Type Data wMaxPacketSize 0x0400 1x 1024 bytes bInterval 0 bMaxBurst 0 Interface Descriptor: bLength 9 bDescriptorType 4 bInterfaceNumber 5 bAlternateSetting 0 bNumEndpoints 3 bInterfaceClass 255 Vendor Specific Class bInterfaceSubClass 0 bInterfaceProtocol 64 iInterface 0 ** UNRECOGNIZED: 05 24 00 10 01 ** UNRECOGNIZED: 05 24 01 00 00 ** UNRECOGNIZED: 04 24 02 02 ** UNRECOGNIZED: 05 24 06 00 00 Endpoint Descriptor: bLength 7 bDescriptorType 5 bEndpointAddress 0x88 EP 8 IN bmAttributes 3 Transfer Type Interrupt Synch Type None Usage Type Data wMaxPacketSize 0x000a 1x 10 bytes bInterval 9 bMaxBurst 0 Endpoint Descriptor: bLength 7 bDescriptorType 5 bEndpointAddress 0x87 EP 7 IN bmAttributes 2 Transfer Type Bulk Synch Type None Usage Type Data wMaxPacketSize 0x0400 1x 1024 bytes bInterval 0 bMaxBurst 0 Endpoint Descriptor: bLength 7 bDescriptorType 5 bEndpointAddress 0x04 EP 4 OUT bmAttributes 2 Transfer Type Bulk Synch Type None Usage Type Data wMaxPacketSize 0x0400 1x 1024 bytes bInterval 0 bMaxBurst 0
T: Bus=02 Lev=01 Prnt=01 Port=03 Cnt=01 Dev#= 85 Spd=5000 MxCh= 0 D: Ver= 3.20 Cls=00(>ifc ) Sub=00 Prot=00 MxPS= 9 #Cfgs= 1 P: Vendor=2cb7 ProdID=01a4 Rev= 5.04 S: Manufacturer=Fibocom Wireless Inc. S: Product=Fibocom FM101-GL Module S: SerialNumber=673326ce C:* #Ifs= 7 Cfg#= 1 Atr=a0 MxPwr=896mA A: FirstIf#= 0 IfCount= 2 Cls=02(comm.) Sub=0e Prot=00 I:* If#= 0 Alt= 0 #EPs= 1 Cls=02(comm.) Sub=0e Prot=00 Driver=cdc_mbim I: If#= 1 Alt= 0 #EPs= 0 Cls=0a(data ) Sub=00 Prot=02 Driver=cdc_mbim I:* If#= 1 Alt= 1 #EPs= 2 Cls=0a(data ) Sub=00 Prot=02 Driver=cdc_mbim I:* If#= 2 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=ff Prot=30 Driver=(none) I:* If#= 3 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=40 Driver=(none) I:* If#= 4 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=42 Prot=01 Driver=(none) I:* If#= 5 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=00 Prot=40 Driver=(none) I:* If#= 6 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=00 Prot=40 Driver=(none)
Bus 002 Device 085: ID 2cb7:01a4 Fibocom Wireless Inc. Fibocom FM101-GL Module Device Descriptor: bLength 18 bDescriptorType 1 bcdUSB 3.20 bDeviceClass 0 bDeviceSubClass 0 bDeviceProtocol 0 bMaxPacketSize0 9 idVendor 0x2cb7 idProduct 0x01a4 bcdDevice 5.04 iManufacturer 1 Fibocom Wireless Inc. iProduct 2 Fibocom FM101-GL Module iSerial 3 673326ce bNumConfigurations 1 Configuration Descriptor: bLength 9 bDescriptorType 2 wTotalLength 0x0180 bNumInterfaces 7 bConfigurationValue 1 iConfiguration 4 MBIM_DIAG_DUN_ADB_GNSS_GNSS bmAttributes 0xa0 (Bus Powered) Remote Wakeup MaxPower 896mA Interface Association: bLength 8 bDescriptorType 11 bFirstInterface 0 bInterfaceCount 2 bFunctionClass 2 Communications bFunctionSubClass 14 bFunctionProtocol 0 iFunction 0 Interface Descriptor: bLength 9 bDescriptorType 4 bInterfaceNumber 0 bAlternateSetting 0 bNumEndpoints 1 bInterfaceClass 2 Communications bInterfaceSubClass 14 bInterfaceProtocol 0 iInterface 5 Fibocom FM101-GL LTE Modem CDC Header: bcdCDC 1.10 CDC Union: bMasterInterface 0 bSlaveInterface 1 CDC MBIM: bcdMBIMVersion 1.00 wMaxControlMessage 4096 bNumberFilters 32 bMaxFilterSize 128 wMaxSegmentSize 2048 bmNetworkCapabilities 0x20 8-byte ntb input size CDC MBIM Extended: bcdMBIMExtendedVersion 1.00 bMaxOutstandingCommandMessages 64 wMTU 1500 Endpoint Descriptor: bLength 7 bDescriptorType 5 bEndpointAddress 0x81 EP 1 IN bmAttributes 3 Transfer Type Interrupt Synch Type None Usage Type Data wMaxPacketSize 0x0040 1x 64 bytes bInterval 9 bMaxBurst 0 Interface Descriptor: bLength 9 bDescriptorType 4 bInterfaceNumber 1 bAlternateSetting 0 bNumEndpoints 0 bInterfaceClass 10 CDC Data bInterfaceSubClass 0 bInterfaceProtocol 2 iInterface 0 Interface Descriptor: bLength 9 bDescriptorType 4 bInterfaceNumber 1 bAlternateSetting 1 bNumEndpoints 2 bInterfaceClass 10 CDC Data bInterfaceSubClass 0 bInterfaceProtocol 2 iInterface 6 MBIM Data Endpoint Descriptor: bLength 7 bDescriptorType 5 bEndpointAddress 0x8e EP 14 IN bmAttributes 2 Transfer Type Bulk Synch Type None Usage Type Data wMaxPacketSize 0x0400 1x 1024 bytes bInterval 0 bMaxBurst 6 Endpoint Descriptor: bLength 7 bDescriptorType 5 bEndpointAddress 0x0f EP 15 OUT bmAttributes 2 Transfer Type Bulk Synch Type None Usage Type Data wMaxPacketSize 0x0400 1x 1024 bytes bInterval 0 bMaxBurst 2 Interface Descriptor: bLength 9 bDescriptorType 4 bInterfaceNumber 2 bAlternateSetting 0 bNumEndpoints 2 bInterfaceClass 255 Vendor Specific Class bInterfaceSubClass 255 Vendor Specific Subclass bInterfaceProtocol 48 iInterface 0 Endpoint Descriptor: bLength 7 bDescriptorType 5 bEndpointAddress 0x01 EP 1 OUT bmAttributes 2 Transfer Type Bulk Synch Type None Usage Type Data wMaxPacketSize 0x0400 1x 1024 bytes bInterval 0 bMaxBurst 0 Endpoint Descriptor: bLength 7 bDescriptorType 5 bEndpointAddress 0x82 EP 2 IN bmAttributes 2 Transfer Type Bulk Synch Type None Usage Type Data wMaxPacketSize 0x0400 1x 1024 bytes bInterval 0 bMaxBurst 0 Interface Descriptor: bLength 9 bDescriptorType 4 bInterfaceNumber 3 bAlternateSetting 0 bNumEndpoints 3 bInterfaceClass 255 Vendor Specific Class bInterfaceSubClass 255 Vendor Specific Subclass bInterfaceProtocol 64 iInterface 0 ** UNRECOGNIZED: 05 24 00 10 01 ** UNRECOGNIZED: 05 24 01 00 00 ** UNRECOGNIZED: 04 24 02 02 ** UNRECOGNIZED: 05 24 06 00 00 Endpoint Descriptor: bLength 7 bDescriptorType 5 bEndpointAddress 0x84 EP 4 IN bmAttributes 3 Transfer Type Interrupt Synch Type None Usage Type Data wMaxPacketSize 0x000a 1x 10 bytes bInterval 9 bMaxBurst 0 Endpoint Descriptor: bLength 7 bDescriptorType 5 bEndpointAddress 0x83 EP 3 IN bmAttributes 2 Transfer Type Bulk Synch Type None Usage Type Data wMaxPacketSize 0x0400 1x 1024 bytes bInterval 0 bMaxBurst 0 Endpoint Descriptor: bLength 7 bDescriptorType 5 bEndpointAddress 0x02 EP 2 OUT bmAttributes 2 Transfer Type Bulk Synch Type None Usage Type Data wMaxPacketSize 0x0400 1x 1024 bytes bInterval 0 bMaxBurst 0 Interface Descriptor: bLength 9 bDescriptorType 4 bInterfaceNumber 4 bAlternateSetting 0 bNumEndpoints 2 bInterfaceClass 255 Vendor Specific Class bInterfaceSubClass 66 bInterfaceProtocol 1 iInterface 8 ADB Interface Endpoint Descriptor: bLength 7 bDescriptorType 5 bEndpointAddress 0x03 EP 3 OUT bmAttributes 2 Transfer Type Bulk Synch Type None Usage Type Data wMaxPacketSize 0x0400 1x 1024 bytes bInterval 0 bMaxBurst 0 Endpoint Descriptor: bLength 7 bDescriptorType 5 bEndpointAddress 0x85 EP 5 IN bmAttributes 2 Transfer Type Bulk Synch Type None Usage Type Data wMaxPacketSize 0x0400 1x 1024 bytes bInterval 0 bMaxBurst 0 Interface Descriptor: bLength 9 bDescriptorType 4 bInterfaceNumber 5 bAlternateSetting 0 bNumEndpoints 3 bInterfaceClass 255 Vendor Specific Class bInterfaceSubClass 0 bInterfaceProtocol 64 iInterface 0 ** UNRECOGNIZED: 05 24 00 10 01 ** UNRECOGNIZED: 05 24 01 00 00 ** UNRECOGNIZED: 04 24 02 02 ** UNRECOGNIZED: 05 24 06 00 00 Endpoint Descriptor: bLength 7 bDescriptorType 5 bEndpointAddress 0x87 EP 7 IN bmAttributes 3 Transfer Type Interrupt Synch Type None Usage Type Data wMaxPacketSize 0x000a 1x 10 bytes bInterval 9 bMaxBurst 0 Endpoint Descriptor: bLength 7 bDescriptorType 5 bEndpointAddress 0x86 EP 6 IN bmAttributes 2 Transfer Type Bulk Synch Type None Usage Type Data wMaxPacketSize 0x0400 1x 1024 bytes bInterval 0 bMaxBurst 0 Endpoint Descriptor: bLength 7 bDescriptorType 5 bEndpointAddress 0x04 EP 4 OUT bmAttributes 2 Transfer Type Bulk Synch Type None Usage Type Data wMaxPacketSize 0x0400 1x 1024 bytes bInterval 0 bMaxBurst 0 Interface Descriptor: bLength 9 bDescriptorType 4 bInterfaceNumber 6 bAlternateSetting 0 bNumEndpoints 3 bInterfaceClass 255 Vendor Specific Class bInterfaceSubClass 0 bInterfaceProtocol 64 iInterface 0 ** UNRECOGNIZED: 05 24 00 10 01 ** UNRECOGNIZED: 05 24 01 00 00 ** UNRECOGNIZED: 04 24 02 02 ** UNRECOGNIZED: 05 24 06 00 00 Endpoint Descriptor: bLength 7 bDescriptorType 5 bEndpointAddress 0x89 EP 9 IN bmAttributes 3 Transfer Type Interrupt Synch Type None Usage Type Data wMaxPacketSize 0x000a 1x 10 bytes bInterval 9 bMaxBurst 0 Endpoint Descriptor: bLength 7 bDescriptorType 5 bEndpointAddress 0x88 EP 8 IN bmAttributes 2 Transfer Type Bulk Synch Type None Usage Type Data wMaxPacketSize 0x0400 1x 1024 bytes bInterval 0 bMaxBurst 0 Endpoint Descriptor: bLength 7 bDescriptorType 5 bEndpointAddress 0x05 EP 5 OUT bmAttributes 2 Transfer Type Bulk Synch Type None Usage Type Data wMaxPacketSize 0x0400 1x 1024 bytes bInterval 0 bMaxBurst 0
Signed-off-by: Mingjie Zhang superzmj@fibocom.com Link: https://lore.kernel.org/r/20211123133757.37475-1-superzmj@fibocom.com Cc: stable@vger.kernel.org Signed-off-by: Johan Hovold johan@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/usb/serial/option.c | 3 +++ 1 file changed, 3 insertions(+)
--- a/drivers/usb/serial/option.c +++ b/drivers/usb/serial/option.c @@ -2096,6 +2096,9 @@ static const struct usb_device_id option { USB_DEVICE_AND_INTERFACE_INFO(0x2cb7, 0x010b, 0xff, 0xff, 0x30) }, /* Fibocom FG150 Diag */ { USB_DEVICE_AND_INTERFACE_INFO(0x2cb7, 0x010b, 0xff, 0, 0) }, /* Fibocom FG150 AT */ { USB_DEVICE_INTERFACE_CLASS(0x2cb7, 0x01a0, 0xff) }, /* Fibocom NL668-AM/NL652-EU (laptop MBIM) */ + { USB_DEVICE_INTERFACE_CLASS(0x2cb7, 0x01a2, 0xff) }, /* Fibocom FM101-GL (laptop MBIM) */ + { USB_DEVICE_INTERFACE_CLASS(0x2cb7, 0x01a4, 0xff), /* Fibocom FM101-GL (laptop MBIM) */ + .driver_info = RSVD(4) }, { USB_DEVICE_INTERFACE_CLASS(0x2df3, 0x9d03, 0xff) }, /* LongSung M5710 */ { USB_DEVICE_INTERFACE_CLASS(0x305a, 0x1404, 0xff) }, /* GosunCn GM500 RNDIS */ { USB_DEVICE_INTERFACE_CLASS(0x305a, 0x1405, 0xff) }, /* GosunCn GM500 MBIM */
From: Minas Harutyunyan Minas.Harutyunyan@synopsys.com
commit 7ad4a0b1d46b2612f4429a72afd8f137d7efa9a9 upstream.
Added updating of request frame number for elapsed frames, otherwise frame number will remain as previous use of request. This will allow function driver to correctly track frames in case of Missed ISOC occurs.
Added setting request actual length to 0 for elapsed frames. In Slave mode when pushing data to RxFIFO by dwords, request actual length incrementing accordingly. But before whole packet will be pushed into RxFIFO and send to host can occurs Missed ISOC and data will not send to host. So, in this case request actual length should be reset to 0.
Fixes: 91bb163e1e4f ("usb: dwc2: gadget: Fix ISOC flow for BDMA and Slave") Cc: stable stable@vger.kernel.org Reviewed-by: John Keeping john@metanate.com Signed-off-by: Minas Harutyunyan Minas.Harutyunyan@synopsys.com Link: https://lore.kernel.org/r/c356baade6e9716d312d43df08d53ae557cb8037.163601127... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/usb/dwc2/gadget.c | 17 ++++++++++++++--- 1 file changed, 14 insertions(+), 3 deletions(-)
--- a/drivers/usb/dwc2/gadget.c +++ b/drivers/usb/dwc2/gadget.c @@ -1198,6 +1198,8 @@ static void dwc2_hsotg_start_req(struct } ctrl |= DXEPCTL_CNAK; } else { + hs_req->req.frame_number = hs_ep->target_frame; + hs_req->req.actual = 0; dwc2_hsotg_complete_request(hsotg, hs_ep, hs_req, -ENODATA); return; } @@ -2855,9 +2857,12 @@ static void dwc2_gadget_handle_ep_disabl
do { hs_req = get_ep_head(hs_ep); - if (hs_req) + if (hs_req) { + hs_req->req.frame_number = hs_ep->target_frame; + hs_req->req.actual = 0; dwc2_hsotg_complete_request(hsotg, hs_ep, hs_req, -ENODATA); + } dwc2_gadget_incr_frame_num(hs_ep); /* Update current frame number value. */ hsotg->frame_number = dwc2_hsotg_read_frameno(hsotg); @@ -2910,8 +2915,11 @@ static void dwc2_gadget_handle_out_token
while (dwc2_gadget_target_frame_elapsed(ep)) { hs_req = get_ep_head(ep); - if (hs_req) + if (hs_req) { + hs_req->req.frame_number = ep->target_frame; + hs_req->req.actual = 0; dwc2_hsotg_complete_request(hsotg, ep, hs_req, -ENODATA); + }
dwc2_gadget_incr_frame_num(ep); /* Update current frame number value. */ @@ -3000,8 +3008,11 @@ static void dwc2_gadget_handle_nak(struc
while (dwc2_gadget_target_frame_elapsed(hs_ep)) { hs_req = get_ep_head(hs_ep); - if (hs_req) + if (hs_req) { + hs_req->req.frame_number = hs_ep->target_frame; + hs_req->req.actual = 0; dwc2_hsotg_complete_request(hsotg, hs_ep, hs_req, -ENODATA); + }
dwc2_gadget_incr_frame_num(hs_ep); /* Update current frame number value. */
From: Nathan Chancellor nathan@kernel.org
commit 310780e825f3ffd211b479b8f828885a6faedd63 upstream.
A new commit in LLVM causes an error on the use of 'long double' when '-mno-x87' is used, which the kernel does through an alias, '-mno-80387' (see the LLVM commit below for more details around why it does this).
drivers/usb/dwc2/hcd_queue.c:1744:25: error: expression requires 'long double' type support, but target 'x86_64-unknown-linux-gnu' does not support it delay = ktime_set(0, DWC2_RETRY_WAIT_DELAY); ^ drivers/usb/dwc2/hcd_queue.c:62:34: note: expanded from macro 'DWC2_RETRY_WAIT_DELAY' #define DWC2_RETRY_WAIT_DELAY (1 * 1E6L) ^ 1 error generated.
This happens due to the use of a 'long double' literal. The 'E6' part of '1E6L' causes the literal to be a 'double' then the 'L' suffix promotes it to 'long double'.
There is no visible reason for a floating point value in this driver, as the value is only used as a parameter to a function that expects an integer type. Use NSEC_PER_MSEC, which is the same integer value as '1E6L', to avoid changing functionality but fix the error.
Link: https://github.com/ClangBuiltLinux/linux/issues/1497 Link: https://github.com/llvm/llvm-project/commit/a8083d42b1c346e21623a1d36d1f0cad... Fixes: 6ed30a7d8ec2 ("usb: dwc2: host: use hrtimer for NAK retries") Cc: stable stable@vger.kernel.org Reviewed-by: Nick Desaulniers ndesaulniers@google.com Reviewed-by: John Keeping john@metanate.com Acked-by: Minas Harutyunyan Minas.Harutyunyan@synopsys.com Signed-off-by: Nathan Chancellor nathan@kernel.org Link: https://lore.kernel.org/r/20211105145802.2520658-1-nathan@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/usb/dwc2/hcd_queue.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/usb/dwc2/hcd_queue.c +++ b/drivers/usb/dwc2/hcd_queue.c @@ -59,7 +59,7 @@ #define DWC2_UNRESERVE_DELAY (msecs_to_jiffies(5))
/* If we get a NAK, wait this long before retrying */ -#define DWC2_RETRY_WAIT_DELAY 1*1E6L +#define DWC2_RETRY_WAIT_DELAY (1 * NSEC_PER_MSEC)
/** * dwc2_periodic_channel_available() - Checks that a channel is available for a
From: Nikolay Aleksandrov nikolay@nvidia.com
commit 1c743127cc54b112b155f434756bd4b5fa565a99 upstream.
When we try to add an IPv6 nexthop and IPv6 is not enabled (!CONFIG_IPV6) we'll hit a NULL pointer dereference[1] in the error path of nh_create_ipv6() due to calling ipv6_stub->fib6_nh_release. The bug has been present since the beginning of IPv6 nexthop gateway support. Commit 1aefd3de7bc6 ("ipv6: Add fib6_nh_init and release to stubs") tells us that only fib6_nh_init has a dummy stub because fib6_nh_release should not be called if fib6_nh_init returns an error, but the commit below added a call to ipv6_stub->fib6_nh_release in its error path. To fix it return the dummy stub's -EAFNOSUPPORT error directly without calling ipv6_stub->fib6_nh_release in nh_create_ipv6()'s error path.
[1] Output is a bit truncated, but it clearly shows the error. BUG: kernel NULL pointer dereference, address: 000000000000000000 #PF: supervisor instruction fetch in kernel modede #PF: error_code(0x0010) - not-present pagege PGD 0 P4D 0 Oops: 0010 [#1] PREEMPT SMP NOPTI CPU: 4 PID: 638 Comm: ip Kdump: loaded Not tainted 5.16.0-rc1+ #446 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.14.0-4.fc34 04/01/2014 RIP: 0010:0x0 Code: Unable to access opcode bytes at RIP 0xffffffffffffffd6. RSP: 0018:ffff888109f5b8f0 EFLAGS: 00010286^Ac RAX: 0000000000000000 RBX: ffff888109f5ba28 RCX: 0000000000000000 RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff8881008a2860 RBP: ffff888109f5b9d8 R08: 0000000000000000 R09: 0000000000000000 R10: ffff888109f5b978 R11: ffff888109f5b948 R12: 00000000ffffff9f R13: ffff8881008a2a80 R14: ffff8881008a2860 R15: ffff8881008a2840 FS: 00007f98de70f100(0000) GS:ffff88822bf00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: ffffffffffffffd6 CR3: 0000000100efc000 CR4: 00000000000006e0 Call Trace: <TASK> nh_create_ipv6+0xed/0x10c rtm_new_nexthop+0x6d7/0x13f3 ? check_preemption_disabled+0x3d/0xf2 ? lock_is_held_type+0xbe/0xfd rtnetlink_rcv_msg+0x23f/0x26a ? check_preemption_disabled+0x3d/0xf2 ? rtnl_calcit.isra.0+0x147/0x147 netlink_rcv_skb+0x61/0xb2 netlink_unicast+0x100/0x187 netlink_sendmsg+0x37f/0x3a0 ? netlink_unicast+0x187/0x187 sock_sendmsg_nosec+0x67/0x9b ____sys_sendmsg+0x19d/0x1f9 ? copy_msghdr_from_user+0x4c/0x5e ? rcu_read_lock_any_held+0x2a/0x78 ___sys_sendmsg+0x6c/0x8c ? asm_sysvec_apic_timer_interrupt+0x12/0x20 ? lockdep_hardirqs_on+0xd9/0x102 ? sockfd_lookup_light+0x69/0x99 __sys_sendmsg+0x50/0x6e do_syscall_64+0xcb/0xf2 entry_SYSCALL_64_after_hwframe+0x44/0xae RIP: 0033:0x7f98dea28914 Code: 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb b5 0f 1f 80 00 00 00 00 48 8d 05 e9 5d 0c 00 8b 00 85 c0 75 13 b8 2e 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 54 c3 0f 1f 00 41 54 41 89 d4 55 48 89 f5 53 RSP: 002b:00007fff859f5e68 EFLAGS: 00000246 ORIG_RAX: 000000000000002e2e RAX: ffffffffffffffda RBX: 00000000619cb810 RCX: 00007f98dea28914 RDX: 0000000000000000 RSI: 00007fff859f5ed0 RDI: 0000000000000003 RBP: 0000000000000000 R08: 0000000000000001 R09: 0000000000000008 R10: fffffffffffffce6 R11: 0000000000000246 R12: 0000000000000001 R13: 000055c0097ae520 R14: 000055c0097957fd R15: 00007fff859f63a0 </TASK> Modules linked in: bridge stp llc bonding virtio_net
Cc: stable@vger.kernel.org Fixes: 53010f991a9f ("nexthop: Add support for IPv6 gateways") Signed-off-by: Nikolay Aleksandrov nikolay@nvidia.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/ipv4/nexthop.c | 10 +++++++--- 1 file changed, 7 insertions(+), 3 deletions(-)
--- a/net/ipv4/nexthop.c +++ b/net/ipv4/nexthop.c @@ -1231,11 +1231,15 @@ static int nh_create_ipv6(struct net *ne /* sets nh_dev if successful */ err = ipv6_stub->fib6_nh_init(net, fib6_nh, &fib6_cfg, GFP_KERNEL, extack); - if (err) + if (err) { + /* IPv6 is not enabled, don't call fib6_nh_release */ + if (err == -EAFNOSUPPORT) + goto out; ipv6_stub->fib6_nh_release(fib6_nh); - else + } else { nh->nh_flags = fib6_nh->fib_nh_flags; - + } +out: return err; }
From: Ondrej Jirman megous@megous.com
commit 362468830dd5bea8bf6ad5203b2ea61f8a4e8288 upstream.
The code that enables either BC_LVL or COMP_CHNG interrupt in tcpm_set_cc wrongly assumes that the interrupt is unmasked by writing 1 to the apropriate bit in the mask register. In fact, interrupts are enabled when the mask is 0, so the tcpm_set_cc enables interrupt for COMP_CHNG when it expects BC_LVL interrupt to be enabled.
This causes inability of the driver to recognize cable unplug events in host mode (unplug is recognized only via a COMP_CHNG interrupt).
In device mode this bug was masked by simultaneous triggering of the VBUS change interrupt, because of loss of VBUS when the port peer is providing power.
Fixes: 48242e30532b ("usb: typec: fusb302: Revert "Resolve fixed power role contract setup"") Cc: stable stable@vger.kernel.org Cc: Hans de Goede hdegoede@redhat.com Reviewed-by: Hans de Goede hdegoede@redhat.com Acked-by: Heikki Krogerus heikki.krogerus@linux.intel.com Signed-off-by: Ondrej Jirman megous@megous.com Link: https://lore.kernel.org/r/20211108102833.2793803-1-megous@megous.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/usb/typec/tcpm/fusb302.c | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-)
--- a/drivers/usb/typec/tcpm/fusb302.c +++ b/drivers/usb/typec/tcpm/fusb302.c @@ -669,25 +669,27 @@ static int tcpm_set_cc(struct tcpc_dev * ret = fusb302_i2c_mask_write(chip, FUSB_REG_MASK, FUSB_REG_MASK_BC_LVL | FUSB_REG_MASK_COMP_CHNG, - FUSB_REG_MASK_COMP_CHNG); + FUSB_REG_MASK_BC_LVL); if (ret < 0) { fusb302_log(chip, "cannot set SRC interrupt, ret=%d", ret); goto done; } chip->intr_comp_chng = true; + chip->intr_bc_lvl = false; break; case TYPEC_CC_RD: ret = fusb302_i2c_mask_write(chip, FUSB_REG_MASK, FUSB_REG_MASK_BC_LVL | FUSB_REG_MASK_COMP_CHNG, - FUSB_REG_MASK_BC_LVL); + FUSB_REG_MASK_COMP_CHNG); if (ret < 0) { fusb302_log(chip, "cannot set SRC interrupt, ret=%d", ret); goto done; } chip->intr_bc_lvl = true; + chip->intr_comp_chng = false; break; default: break;
From: Mathias Nyman mathias.nyman@linux.intel.com
commit 6ae6dc22d2d1ce6aa77a6da8a761e61aca216f8b upstream.
xHC hardware can only have one slot in default state with address 0 waiting for a unique address at a time, otherwise "undefined behavior may occur" according to xhci spec 5.4.3.4
The address0_mutex exists to prevent this across both xhci roothubs.
If hub_port_init() fails, it may unlock the mutex and exit with a xhci slot in default state. If the other xhci roothub calls hub_port_init() at this point we end up with two slots in default state.
Make sure the address0_mutex protects the slot default state across hub_port_init() retries, until slot is addressed or disabled.
Note, one known minor case is not fixed by this patch. If device needs to be reset during resume, but fails all hub_port_init() retries in usb_reset_and_verify_device(), then it's possible the slot is still left in default state when address0_mutex is unlocked.
Cc: stable@vger.kernel.org Fixes: 638139eb95d2 ("usb: hub: allow to process more usb hub events in parallel") Signed-off-by: Mathias Nyman mathias.nyman@linux.intel.com Link: https://lore.kernel.org/r/20211115221630.871204-1-mathias.nyman@linux.intel.... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/usb/core/hub.c | 14 +++++++++++--- 1 file changed, 11 insertions(+), 3 deletions(-)
--- a/drivers/usb/core/hub.c +++ b/drivers/usb/core/hub.c @@ -4609,8 +4609,6 @@ hub_port_init(struct usb_hub *hub, struc if (oldspeed == USB_SPEED_LOW) delay = HUB_LONG_RESET_TIME;
- mutex_lock(hcd->address0_mutex); - /* Reset the device; full speed may morph to high speed */ /* FIXME a USB 2.0 device may morph into SuperSpeed on reset. */ retval = hub_port_reset(hub, port1, udev, delay, false); @@ -4925,7 +4923,6 @@ fail: hub_port_disable(hub, port1, 0); update_devnum(udev, devnum); /* for disconnect processing */ } - mutex_unlock(hcd->address0_mutex); return retval; }
@@ -5070,6 +5067,9 @@ static void hub_port_connect(struct usb_ unit_load = 100;
status = 0; + + mutex_lock(hcd->address0_mutex); + for (i = 0; i < SET_CONFIG_TRIES; i++) {
/* reallocate for each attempt, since references @@ -5106,6 +5106,8 @@ static void hub_port_connect(struct usb_ if (status < 0) goto loop;
+ mutex_unlock(hcd->address0_mutex); + if (udev->quirks & USB_QUIRK_DELAY_INIT) msleep(2000);
@@ -5194,6 +5196,7 @@ static void hub_port_connect(struct usb_
loop_disable: hub_port_disable(hub, port1, 1); + mutex_lock(hcd->address0_mutex); loop: usb_ep0_reinit(udev); release_devnum(udev); @@ -5220,6 +5223,8 @@ loop: }
done: + mutex_unlock(hcd->address0_mutex); + hub_port_disable(hub, port1, 1); if (hcd->driver->relinquish_port && !hub->hdev->parent) { if (status != -ENOTCONN && status != -ENODEV) @@ -5794,6 +5799,8 @@ static int usb_reset_and_verify_device(s bos = udev->bos; udev->bos = NULL;
+ mutex_lock(hcd->address0_mutex); + for (i = 0; i < SET_CONFIG_TRIES; ++i) {
/* ep0 maxpacket size may change; let the HCD know about it. @@ -5803,6 +5810,7 @@ static int usb_reset_and_verify_device(s if (ret >= 0 || ret == -ENOTCONN || ret == -ENODEV) break; } + mutex_unlock(hcd->address0_mutex);
if (ret < 0) goto re_enumerate;
From: Mathias Nyman mathias.nyman@linux.intel.com
commit 6cca13de26eea6d32a98d96d916a048d16a12822 upstream.
Fix the circular lock dependency and unbalanced unlock of addess0_mutex introduced when fixing an address0_mutex enumeration retry race in commit ae6dc22d2d1 ("usb: hub: Fix usb enumeration issue due to address0 race")
Make sure locking order between port_dev->status_lock and address0_mutex is correct, and that address0_mutex is not unlocked in hub_port_connect "done:" codepath which may be reached without locking address0_mutex
Fixes: 6ae6dc22d2d1 ("usb: hub: Fix usb enumeration issue due to address0 race") Cc: stable@vger.kernel.org Reported-by: Marek Szyprowski m.szyprowski@samsung.com Tested-by: Hans de Goede hdegoede@redhat.com Tested-by: Marek Szyprowski m.szyprowski@samsung.com Acked-by: Hans de Goede hdegoede@redhat.com Signed-off-by: Mathias Nyman mathias.nyman@linux.intel.com Link: https://lore.kernel.org/r/20211123101656.1113518-1-mathias.nyman@linux.intel... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/usb/core/hub.c | 19 ++++++++++++------- 1 file changed, 12 insertions(+), 7 deletions(-)
--- a/drivers/usb/core/hub.c +++ b/drivers/usb/core/hub.c @@ -5012,6 +5012,7 @@ static void hub_port_connect(struct usb_ struct usb_port *port_dev = hub->ports[port1 - 1]; struct usb_device *udev = port_dev->child; static int unreliable_port = -1; + bool retry_locked;
/* Disconnect any existing devices under this port */ if (udev) { @@ -5068,9 +5069,10 @@ static void hub_port_connect(struct usb_
status = 0;
- mutex_lock(hcd->address0_mutex); - for (i = 0; i < SET_CONFIG_TRIES; i++) { + usb_lock_port(port_dev); + mutex_lock(hcd->address0_mutex); + retry_locked = true;
/* reallocate for each attempt, since references * to the previous one can escape in various ways @@ -5079,6 +5081,8 @@ static void hub_port_connect(struct usb_ if (!udev) { dev_err(&port_dev->dev, "couldn't allocate usb_device\n"); + mutex_unlock(hcd->address0_mutex); + usb_unlock_port(port_dev); goto done; }
@@ -5100,13 +5104,13 @@ static void hub_port_connect(struct usb_ }
/* reset (non-USB 3.0 devices) and get descriptor */ - usb_lock_port(port_dev); status = hub_port_init(hub, udev, port1, i); - usb_unlock_port(port_dev); if (status < 0) goto loop;
mutex_unlock(hcd->address0_mutex); + usb_unlock_port(port_dev); + retry_locked = false;
if (udev->quirks & USB_QUIRK_DELAY_INIT) msleep(2000); @@ -5196,11 +5200,14 @@ static void hub_port_connect(struct usb_
loop_disable: hub_port_disable(hub, port1, 1); - mutex_lock(hcd->address0_mutex); loop: usb_ep0_reinit(udev); release_devnum(udev); hub_free_dev(udev); + if (retry_locked) { + mutex_unlock(hcd->address0_mutex); + usb_unlock_port(port_dev); + } usb_put_dev(udev); if ((status == -ENOTCONN) || (status == -ENOTSUPP)) break; @@ -5223,8 +5230,6 @@ loop: }
done: - mutex_unlock(hcd->address0_mutex); - hub_port_disable(hub, port1, 1); if (hcd->driver->relinquish_port && !hub->hdev->parent) { if (status != -ENOTCONN && status != -ENODEV)
From: Todd Kjos tkjos@google.com
commit c21a80ca0684ec2910344d72556c816cb8940c01 upstream.
This is a partial revert of commit 29bc22ac5e5b ("binder: use euid from cred instead of using task"). Setting sender_euid using proc->cred caused some Android system test regressions that need further investigation. It is a partial reversion because subsequent patches rely on proc->cred.
Fixes: 29bc22ac5e5b ("binder: use euid from cred instead of using task") Cc: stable@vger.kernel.org # 4.4+ Acked-by: Christian Brauner christian.brauner@ubuntu.com Signed-off-by: Todd Kjos tkjos@google.com Change-Id: I9b1769a3510fed250bb21859ef8beebabe034c66 Link: https://lore.kernel.org/r/20211112180720.2858135-1-tkjos@google.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/android/binder.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/android/binder.c +++ b/drivers/android/binder.c @@ -3095,7 +3095,7 @@ static void binder_transaction(struct bi t->from = thread; else t->from = NULL; - t->sender_euid = proc->cred->euid; + t->sender_euid = task_euid(proc->tsk); t->to_proc = target_proc; t->to_thread = target_thread; t->code = tr->code;
From: Takashi Iwai tiwai@suse.de
commit 76c47183224c86e4011048b80f0e2d0d166f01c2 upstream.
The master and next_conj of rcs_ops are used for iterating the resource list entries, and currently those are supposed to return the current value. The problem is that next_conf may go over the last entry before the loop abort condition is evaluated, and it may return the "current" value that is beyond the array size. It was caught recently as a GPF, for example.
Those return values are, however, never actually evaluated, hence basically we don't have to consider the current value as the return at all. By dropping those return values, the potential out-of-range access above is also fixed automatically.
This patch changes the return type of master and next_conj callbacks to void and drop the superfluous code accordingly.
BugLink: https://bugzilla.kernel.org/show_bug.cgi?id=214985 Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20211118215729.26257-1-tiwai@suse.de Signed-off-by: Takashi Iwai tiwai@suse.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- sound/pci/ctxfi/ctamixer.c | 14 ++++++-------- sound/pci/ctxfi/ctdaio.c | 16 ++++++++-------- sound/pci/ctxfi/ctresource.c | 7 +++---- sound/pci/ctxfi/ctresource.h | 4 ++-- sound/pci/ctxfi/ctsrc.c | 7 +++---- 5 files changed, 22 insertions(+), 26 deletions(-)
--- a/sound/pci/ctxfi/ctamixer.c +++ b/sound/pci/ctxfi/ctamixer.c @@ -23,16 +23,15 @@
#define BLANK_SLOT 4094
-static int amixer_master(struct rsc *rsc) +static void amixer_master(struct rsc *rsc) { rsc->conj = 0; - return rsc->idx = container_of(rsc, struct amixer, rsc)->idx[0]; + rsc->idx = container_of(rsc, struct amixer, rsc)->idx[0]; }
-static int amixer_next_conj(struct rsc *rsc) +static void amixer_next_conj(struct rsc *rsc) { rsc->conj++; - return container_of(rsc, struct amixer, rsc)->idx[rsc->conj]; }
static int amixer_index(const struct rsc *rsc) @@ -331,16 +330,15 @@ int amixer_mgr_destroy(struct amixer_mgr
/* SUM resource management */
-static int sum_master(struct rsc *rsc) +static void sum_master(struct rsc *rsc) { rsc->conj = 0; - return rsc->idx = container_of(rsc, struct sum, rsc)->idx[0]; + rsc->idx = container_of(rsc, struct sum, rsc)->idx[0]; }
-static int sum_next_conj(struct rsc *rsc) +static void sum_next_conj(struct rsc *rsc) { rsc->conj++; - return container_of(rsc, struct sum, rsc)->idx[rsc->conj]; }
static int sum_index(const struct rsc *rsc) --- a/sound/pci/ctxfi/ctdaio.c +++ b/sound/pci/ctxfi/ctdaio.c @@ -51,12 +51,12 @@ static struct daio_rsc_idx idx_20k2[NUM_ [SPDIFIO] = {.left = 0x05, .right = 0x85}, };
-static int daio_master(struct rsc *rsc) +static void daio_master(struct rsc *rsc) { /* Actually, this is not the resource index of DAIO. * For DAO, it is the input mapper index. And, for DAI, * it is the output time-slot index. */ - return rsc->conj = rsc->idx; + rsc->conj = rsc->idx; }
static int daio_index(const struct rsc *rsc) @@ -64,19 +64,19 @@ static int daio_index(const struct rsc * return rsc->conj; }
-static int daio_out_next_conj(struct rsc *rsc) +static void daio_out_next_conj(struct rsc *rsc) { - return rsc->conj += 2; + rsc->conj += 2; }
-static int daio_in_next_conj_20k1(struct rsc *rsc) +static void daio_in_next_conj_20k1(struct rsc *rsc) { - return rsc->conj += 0x200; + rsc->conj += 0x200; }
-static int daio_in_next_conj_20k2(struct rsc *rsc) +static void daio_in_next_conj_20k2(struct rsc *rsc) { - return rsc->conj += 0x100; + rsc->conj += 0x100; }
static const struct rsc_ops daio_out_rsc_ops = { --- a/sound/pci/ctxfi/ctresource.c +++ b/sound/pci/ctxfi/ctresource.c @@ -109,18 +109,17 @@ static int audio_ring_slot(const struct return (rsc->conj << 4) + offset_in_audio_slot_block[rsc->type]; }
-static int rsc_next_conj(struct rsc *rsc) +static void rsc_next_conj(struct rsc *rsc) { unsigned int i; for (i = 0; (i < 8) && (!(rsc->msr & (0x1 << i))); ) i++; rsc->conj += (AUDIO_SLOT_BLOCK_NUM >> i); - return rsc->conj; }
-static int rsc_master(struct rsc *rsc) +static void rsc_master(struct rsc *rsc) { - return rsc->conj = rsc->idx; + rsc->conj = rsc->idx; }
static const struct rsc_ops rsc_generic_ops = { --- a/sound/pci/ctxfi/ctresource.h +++ b/sound/pci/ctxfi/ctresource.h @@ -39,8 +39,8 @@ struct rsc { };
struct rsc_ops { - int (*master)(struct rsc *rsc); /* Move to master resource */ - int (*next_conj)(struct rsc *rsc); /* Move to next conjugate resource */ + void (*master)(struct rsc *rsc); /* Move to master resource */ + void (*next_conj)(struct rsc *rsc); /* Move to next conjugate resource */ int (*index)(const struct rsc *rsc); /* Return the index of resource */ /* Return the output slot number */ int (*output_slot)(const struct rsc *rsc); --- a/sound/pci/ctxfi/ctsrc.c +++ b/sound/pci/ctxfi/ctsrc.c @@ -590,16 +590,15 @@ int src_mgr_destroy(struct src_mgr *src_
/* SRCIMP resource manager operations */
-static int srcimp_master(struct rsc *rsc) +static void srcimp_master(struct rsc *rsc) { rsc->conj = 0; - return rsc->idx = container_of(rsc, struct srcimp, rsc)->idx[0]; + rsc->idx = container_of(rsc, struct srcimp, rsc)->idx[0]; }
-static int srcimp_next_conj(struct rsc *rsc) +static void srcimp_next_conj(struct rsc *rsc) { rsc->conj++; - return container_of(rsc, struct srcimp, rsc)->idx[rsc->conj]; }
static int srcimp_index(const struct rsc *rsc)
From: Hans Verkuil hverkuil-cisco@xs4all.nl
commit 13cbaa4c2b7bf9f8285e1164d005dbf08244ecd5 upstream.
When the reply for a non-blocking transmit arrives, the sequence field for that reply was never filled in, so userspace would have no way of associating the reply to the original transmit.
Copy the sequence field to ensure that this is now possible.
Signed-off-by: Hans Verkuil hverkuil-cisco@xs4all.nl Fixes: 0dbacebede1e ([media] cec: move the CEC framework out of staging and to media) Cc: stable@vger.kernel.org Signed-off-by: Mauro Carvalho Chehab mchehab+huawei@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/media/cec/cec-adap.c | 1 + 1 file changed, 1 insertion(+)
--- a/drivers/media/cec/cec-adap.c +++ b/drivers/media/cec/cec-adap.c @@ -1191,6 +1191,7 @@ void cec_received_msg_ts(struct cec_adap if (abort) dst->rx_status |= CEC_RX_STATUS_FEATURE_ABORT; msg->flags = dst->flags; + msg->sequence = dst->sequence; /* Remove it from the wait_queue */ list_del_init(&data->list);
From: Helge Deller deller@gmx.de
commit 98400ad75e95860e9a10ec78b0b90ab66184a2ce upstream.
This reverts commit 279917e27edc293eb645a25428c6ab3f3bca3f86.
With the CONFIG_HARDENED_USERCOPY option enabled, this patch triggers kernel bugs at runtime:
usercopy: Kernel memory overwrite attempt detected to kernel text (offset 2084839, size 6)! kernel BUG at mm/usercopy.c:99! Backtrace: IAOQ[0]: usercopy_abort+0xc4/0xe8 [<00000000406ed1c8>] __check_object_size+0x174/0x238 [<00000000407086d4>] copy_strings.isra.0+0x3e8/0x708 [<0000000040709a20>] do_execveat_common.isra.0+0x1bc/0x328 [<000000004070b760>] compat_sys_execve+0x7c/0xb8 [<0000000040303eb8>] syscall_exit+0x0/0x14
The problem is, that we have an init section of at least 2MB size which starts at _stext and is freed after bootup.
If then later some kernel data is (temporarily) stored in this free memory, check_kernel_text_object() will trigger a bug since the data appears to be inside the kernel text (>=_stext) area: if (overlaps(ptr, len, _stext, _etext)) usercopy_abort("kernel text");
Signed-off-by: Helge Deller deller@gmx.de Cc: stable@kernel.org # 5.4+ Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/parisc/kernel/vmlinux.lds.S | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-)
--- a/arch/parisc/kernel/vmlinux.lds.S +++ b/arch/parisc/kernel/vmlinux.lds.S @@ -56,8 +56,6 @@ SECTIONS { . = KERNEL_BINARY_TEXT_START;
- _stext = .; /* start of kernel text, includes init code & data */ - __init_begin = .; HEAD_TEXT_SECTION MLONGCALL_DISCARD(INIT_TEXT_SECTION(8)) @@ -81,6 +79,7 @@ SECTIONS /* freed after init ends here */
_text = .; /* Text and read-only data */ + _stext = .; MLONGCALL_KEEP(INIT_TEXT_SECTION(8)) .text ALIGN(PAGE_SIZE) : { TEXT_TEXT
From: Jason Gerecke killertofu@gmail.com
commit 7fb0413baa7f8a04caef0c504df9af7e0623d296 upstream.
The HID descriptor of many of Wacom's touch input devices include a "Confidence" usage that signals if a particular touch collection contains useful data. The driver does not look at this flag, however, which causes even invalid contacts to be reported to userspace. A lucky combination of kernel event filtering and device behavior (specifically: contact ID 0 == invalid, contact ID >0 == valid; and order all data so that all valid contacts are reported before any invalid contacts) spare most devices from any visibly-bad behavior.
The DTH-2452 is one example of an unlucky device that misbehaves. It uses ID 0 for both the first valid contact and all invalid contacts. Because we report both the valid and invalid contacts, the kernel reports that contact 0 first goes down (valid) and then goes up (invalid) in every report. This causes ~100 clicks per second simply by touching the screen.
This patch inroduces new `confidence` flag in our `hid_data` structure. The value is initially set to `true` at the start of a report and can be set to `false` if an invalid touch usage is seen.
Link: https://github.com/linuxwacom/input-wacom/issues/270 Fixes: f8b6a74719b5 ("HID: wacom: generic: Support multiple tools per report") Signed-off-by: Jason Gerecke jason.gerecke@wacom.com Tested-by: Joshua Dickens joshua.dickens@wacom.com Cc: stable@vger.kernel.org Signed-off-by: Jiri Kosina jkosina@suse.cz Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/hid/wacom_wac.c | 8 +++++++- drivers/hid/wacom_wac.h | 1 + 2 files changed, 8 insertions(+), 1 deletion(-)
--- a/drivers/hid/wacom_wac.c +++ b/drivers/hid/wacom_wac.c @@ -2578,6 +2578,9 @@ static void wacom_wac_finger_event(struc return;
switch (equivalent_usage) { + case HID_DG_CONFIDENCE: + wacom_wac->hid_data.confidence = value; + break; case HID_GD_X: wacom_wac->hid_data.x = value; break; @@ -2610,7 +2613,8 @@ static void wacom_wac_finger_event(struc }
if (usage->usage_index + 1 == field->report_count) { - if (equivalent_usage == wacom_wac->hid_data.last_slot_field) + if (equivalent_usage == wacom_wac->hid_data.last_slot_field && + wacom_wac->hid_data.confidence) wacom_wac_finger_slot(wacom_wac, wacom_wac->touch_input); } } @@ -2625,6 +2629,8 @@ static void wacom_wac_finger_pre_report(
wacom_wac->is_invalid_bt_frame = false;
+ hid_data->confidence = true; + for (i = 0; i < report->maxfield; i++) { struct hid_field *field = report->field[i]; int j; --- a/drivers/hid/wacom_wac.h +++ b/drivers/hid/wacom_wac.h @@ -300,6 +300,7 @@ struct hid_data { bool tipswitch; bool barrelswitch; bool barrelswitch2; + bool confidence; int x; int y; int pressure;
From: Noralf Trønnes noralf@tronnes.org
commit 7865dd24934ad580d1bcde8f63c39f324211a23b upstream.
Commit b4a1ed0cd18b ("fbdev: make FB_BACKLIGHT a tristate") forgot to update fbtft breaking its backlight support when FB_BACKLIGHT is a module.
Since FB_TFT selects FB_BACKLIGHT there's no need for this conditional so just remove it and we're good.
Fixes: b4a1ed0cd18b ("fbdev: make FB_BACKLIGHT a tristate") Cc: stable@vger.kernel.org Acked-by: Sam Ravnborg sam@ravnborg.org Signed-off-by: Noralf Trønnes noralf@tronnes.org Link: https://lore.kernel.org/r/20211105204358.2991-1-noralf@tronnes.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/staging/fbtft/fb_ssd1351.c | 4 ---- drivers/staging/fbtft/fbtft-core.c | 9 +-------- 2 files changed, 1 insertion(+), 12 deletions(-)
--- a/drivers/staging/fbtft/fb_ssd1351.c +++ b/drivers/staging/fbtft/fb_ssd1351.c @@ -187,7 +187,6 @@ static struct fbtft_display display = { }, };
-#ifdef CONFIG_FB_BACKLIGHT static int update_onboard_backlight(struct backlight_device *bd) { struct fbtft_par *par = bl_get_data(bd); @@ -231,9 +230,6 @@ static void register_onboard_backlight(s if (!par->fbtftops.unregister_backlight) par->fbtftops.unregister_backlight = fbtft_unregister_backlight; } -#else -static void register_onboard_backlight(struct fbtft_par *par) { }; -#endif
FBTFT_REGISTER_DRIVER(DRVNAME, "solomon,ssd1351", &display);
--- a/drivers/staging/fbtft/fbtft-core.c +++ b/drivers/staging/fbtft/fbtft-core.c @@ -136,7 +136,6 @@ static int fbtft_request_gpios_dt(struct } #endif
-#ifdef CONFIG_FB_BACKLIGHT static int fbtft_backlight_update_status(struct backlight_device *bd) { struct fbtft_par *par = bl_get_data(bd); @@ -169,6 +168,7 @@ void fbtft_unregister_backlight(struct f par->info->bl_dev = NULL; } } +EXPORT_SYMBOL(fbtft_unregister_backlight);
static const struct backlight_ops fbtft_bl_ops = { .get_brightness = fbtft_backlight_get_brightness, @@ -206,12 +206,7 @@ void fbtft_register_backlight(struct fbt if (!par->fbtftops.unregister_backlight) par->fbtftops.unregister_backlight = fbtft_unregister_backlight; } -#else -void fbtft_register_backlight(struct fbtft_par *par) { }; -void fbtft_unregister_backlight(struct fbtft_par *par) { }; -#endif EXPORT_SYMBOL(fbtft_register_backlight); -EXPORT_SYMBOL(fbtft_unregister_backlight);
static void fbtft_set_addr_win(struct fbtft_par *par, int xs, int ys, int xe, int ye) @@ -860,13 +855,11 @@ int fbtft_register_framebuffer(struct fb fb_info->fix.smem_len >> 10, text1, HZ / fb_info->fbdefio->delay, text2);
-#ifdef CONFIG_FB_BACKLIGHT /* Turn on backlight if available */ if (fb_info->bl_dev) { fb_info->bl_dev->props.power = FB_BLANK_UNBLANK; fb_info->bl_dev->ops->update_status(fb_info->bl_dev); } -#endif
return 0;
From: Dan Carpenter dan.carpenter@oracle.com
commit b535917c51acc97fb0761b1edec85f1f3d02bda4 upstream.
The free_rtllib() function frees the "dev" pointer so there is use after free on the next line. Re-arrange things to avoid that.
Fixes: 66898177e7e5 ("staging: rtl8192e: Fix unload/reload problem") Cc: stable stable@vger.kernel.org Signed-off-by: Dan Carpenter dan.carpenter@oracle.com Link: https://lore.kernel.org/r/20211117072016.GA5237@kili Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/staging/rtl8192e/rtl8192e/rtl_core.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
--- a/drivers/staging/rtl8192e/rtl8192e/rtl_core.c +++ b/drivers/staging/rtl8192e/rtl8192e/rtl_core.c @@ -2559,13 +2559,14 @@ static void _rtl92e_pci_disconnect(struc free_irq(dev->irq, dev); priv->irq = 0; } - free_rtllib(dev);
if (dev->mem_start != 0) { iounmap((void __iomem *)dev->mem_start); release_mem_region(pci_resource_start(pdev, 1), pci_resource_len(pdev, 1)); } + + free_rtllib(dev); } else { priv = rtllib_priv(dev); }
From: Stefano Stabellini stefano.stabellini@xilinx.com
commit 08f6c2b09ebd4b326dbe96d13f94fee8f9814c78 upstream.
In case of errors in xenbus_init (e.g. missing xen_store_gfn parameter), we goto out_error but we forget to reset xen_store_domain_type to XS_UNKNOWN. As a consequence xenbus_probe_initcall and other initcalls will still try to initialize xenstore resulting into a crash at boot.
[ 2.479830] Call trace: [ 2.482314] xb_init_comms+0x18/0x150 [ 2.486354] xs_init+0x34/0x138 [ 2.489786] xenbus_probe+0x4c/0x70 [ 2.498432] xenbus_probe_initcall+0x2c/0x7c [ 2.503944] do_one_initcall+0x54/0x1b8 [ 2.507358] kernel_init_freeable+0x1ac/0x210 [ 2.511617] kernel_init+0x28/0x130 [ 2.516112] ret_from_fork+0x10/0x20
Cc: Stable@vger.kernel.org Cc: jbeulich@suse.com Signed-off-by: Stefano Stabellini stefano.stabellini@xilinx.com Link: https://lore.kernel.org/r/20211115222719.2558207-1-sstabellini@kernel.org Reviewed-by: Jan Beulich jbeulich@suse.com Signed-off-by: Boris Ostrovsky boris.ostrovsky@oracle.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/xen/xenbus/xenbus_probe.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-)
--- a/drivers/xen/xenbus/xenbus_probe.c +++ b/drivers/xen/xenbus/xenbus_probe.c @@ -846,7 +846,7 @@ static struct notifier_block xenbus_resu
static int __init xenbus_init(void) { - int err = 0; + int err; uint64_t v = 0; xen_store_domain_type = XS_UNKNOWN;
@@ -920,8 +920,10 @@ static int __init xenbus_init(void) */ proc_create_mount_point("xen"); #endif + return 0;
out_error: + xen_store_domain_type = XS_UNKNOWN; return err; }
From: Stefano Stabellini stefano.stabellini@xilinx.com
commit 36e8f60f0867d3b70d398d653c17108459a04efe upstream.
If the xenstore page hasn't been allocated properly, reading the value of the related hvm_param (HVM_PARAM_STORE_PFN) won't actually return error. Instead, it will succeed and return zero. Instead of attempting to xen_remap a bad guest physical address, detect this condition and return early.
Note that although a guest physical address of zero for HVM_PARAM_STORE_PFN is theoretically possible, it is not a good choice and zero has never been validly used in that capacity.
Also recognize all bits set as an invalid value.
For 32-bit Linux, any pfn above ULONG_MAX would get truncated. Pfns above ULONG_MAX should never be passed by the Xen tools to HVM guests anyway, so check for this condition and return early.
Cc: stable@vger.kernel.org Signed-off-by: Stefano Stabellini stefano.stabellini@xilinx.com Reviewed-by: Juergen Gross jgross@suse.com Reviewed-by: Jan Beulich jbeulich@suse.com Link: https://lore.kernel.org/r/20211123210748.1910236-1-sstabellini@kernel.org Signed-off-by: Boris Ostrovsky boris.ostrovsky@oracle.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/xen/xenbus/xenbus_probe.c | 23 +++++++++++++++++++++++ 1 file changed, 23 insertions(+)
--- a/drivers/xen/xenbus/xenbus_probe.c +++ b/drivers/xen/xenbus/xenbus_probe.c @@ -886,6 +886,29 @@ static int __init xenbus_init(void) err = hvm_get_parameter(HVM_PARAM_STORE_PFN, &v); if (err) goto out_error; + /* + * Uninitialized hvm_params are zero and return no error. + * Although it is theoretically possible to have + * HVM_PARAM_STORE_PFN set to zero on purpose, in reality it is + * not zero when valid. If zero, it means that Xenstore hasn't + * been properly initialized. Instead of attempting to map a + * wrong guest physical address return error. + * + * Also recognize all bits set as an invalid value. + */ + if (!v || !~v) { + err = -ENOENT; + goto out_error; + } + /* Avoid truncation on 32-bit. */ +#if BITS_PER_LONG == 32 + if (v > ULONG_MAX) { + pr_err("%s: cannot handle HVM_PARAM_STORE_PFN=%llx > ULONG_MAX\n", + __func__, v); + err = -EINVAL; + goto out_error; + } +#endif xen_store_gfn = (unsigned long)v; xen_store_interface = xen_remap(xen_store_gfn << XEN_PAGE_SHIFT,
From: Nicholas Piggin npiggin@gmail.com
commit cf0b0e3712f7af90006f8317ff27278094c2c128 upstream.
The POWER9 ERAT flush instruction is a SLBIA with IH=7, which is a reserved value on POWER7/8. On POWER8 this invalidates the SLB entries above index 0, similarly to SLBIA IH=0.
If the SLB entries are invalidated, and then the guest is bypassed, the host SLB does not get re-loaded, so the bolted entries above 0 will be lost. This can result in kernel stack access causing a SLB fault.
Kernel stack access causing a SLB fault was responsible for the infamous mega bug (search "Fix SLB reload bug"). Although since commit 48e7b7695745 ("powerpc/64s/hash: Convert SLB miss handlers to C") that starts using the kernel stack in the SLB miss handler, it might only result in an infinite loop of SLB faults. In any case it's a bug.
Fix this by only executing the instruction on >= POWER9 where IH=7 is defined not to invalidate the SLB. POWER7/8 don't require this ERAT flush.
Fixes: 500871125920 ("KVM: PPC: Book3S HV: Invalidate ERAT when flushing guest TLB entries") Cc: stable@vger.kernel.org # v5.2+ Signed-off-by: Nicholas Piggin npiggin@gmail.com Reviewed-by: Fabiano Rosas farosas@linux.ibm.com Signed-off-by: Michael Ellerman mpe@ellerman.id.au Link: https://lore.kernel.org/r/20211119031627.577853-1-npiggin@gmail.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/powerpc/kvm/book3s_hv_builtin.c | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-)
--- a/arch/powerpc/kvm/book3s_hv_builtin.c +++ b/arch/powerpc/kvm/book3s_hv_builtin.c @@ -821,6 +821,7 @@ static void flush_guest_tlb(struct kvm * "r" (0) : "memory"); } asm volatile("ptesync": : :"memory"); + // POWER9 congruence-class TLBIEL leaves ERAT. Flush it now. asm volatile(PPC_RADIX_INVALIDATE_ERAT_GUEST : : :"memory"); } else { for (set = 0; set < kvm->arch.tlb_sets; ++set) { @@ -831,7 +832,9 @@ static void flush_guest_tlb(struct kvm * rb += PPC_BIT(51); /* increment set number */ } asm volatile("ptesync": : :"memory"); - asm volatile(PPC_ISA_3_0_INVALIDATE_ERAT : : :"memory"); + // POWER9 congruence-class TLBIEL leaves ERAT. Flush it now. + if (cpu_has_feature(CPU_FTR_ARCH_300)) + asm volatile(PPC_ISA_3_0_INVALIDATE_ERAT : : :"memory"); } }
From: Jiri Olsa jolsa@redhat.com
commit 1880ed71ce863318c1ce93bf324876fb5f92854f upstream.
Add missing 'tu' variable initialization in the probes loop, otherwise the head 'tu' is used instead of added probes.
Link: https://lkml.kernel.org/r/20211123142801.182530-1-jolsa@kernel.org
Cc: stable@vger.kernel.org Fixes: 99c9a923e97a ("tracing/uprobe: Fix double perf_event linking on multiprobe uprobe") Acked-by: Masami Hiramatsu mhiramat@kernel.org Signed-off-by: Jiri Olsa jolsa@kernel.org Signed-off-by: Steven Rostedt (VMware) rostedt@goodmis.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- kernel/trace/trace_uprobe.c | 1 + 1 file changed, 1 insertion(+)
--- a/kernel/trace/trace_uprobe.c +++ b/kernel/trace/trace_uprobe.c @@ -1299,6 +1299,7 @@ static int uprobe_perf_open(struct trace return 0;
list_for_each_entry(pos, trace_probe_probe_list(tp), list) { + tu = container_of(pos, struct trace_uprobe, tp); err = uprobe_apply(tu->inode, tu->offset, &tu->consumer, true); if (err) { uprobe_perf_close(call, event);
From: Steven Rostedt (VMware) rostedt@goodmis.org
commit a55f224ff5f238013de8762c4287117e47b86e22 upstream.
If a event is filtered by pid and a trigger that requires processing of the event to happen is a attached to the event, the discard portion does not take the pid filtering into account, and the event will then be recorded when it should not have been.
Cc: stable@vger.kernel.org Fixes: 3fdaf80f4a836 ("tracing: Implement event pid filtering") Signed-off-by: Steven Rostedt (VMware) rostedt@goodmis.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- kernel/trace/trace.h | 24 ++++++++++++++++++------ 1 file changed, 18 insertions(+), 6 deletions(-)
--- a/kernel/trace/trace.h +++ b/kernel/trace/trace.h @@ -1423,14 +1423,26 @@ __event_trigger_test_discard(struct trac if (eflags & EVENT_FILE_FL_TRIGGER_COND) *tt = event_triggers_call(file, entry, event);
- if (test_bit(EVENT_FILE_FL_SOFT_DISABLED_BIT, &file->flags) || - (unlikely(file->flags & EVENT_FILE_FL_FILTERED) && - !filter_match_preds(file->filter, entry))) { - __trace_event_discard_commit(buffer, event); - return true; - } + if (likely(!(file->flags & (EVENT_FILE_FL_SOFT_DISABLED | + EVENT_FILE_FL_FILTERED | + EVENT_FILE_FL_PID_FILTER)))) + return false; + + if (file->flags & EVENT_FILE_FL_SOFT_DISABLED) + goto discard; + + if (file->flags & EVENT_FILE_FL_FILTERED && + !filter_match_preds(file->filter, entry)) + goto discard; + + if ((file->flags & EVENT_FILE_FL_PID_FILTER) && + trace_event_ignore_this_pid(file)) + goto discard;
return false; + discard: + __trace_event_discard_commit(buffer, event); + return true; }
/**
From: Adrian Hunter adrian.hunter@intel.com
commit 3d7c194b7c9ad414264935ad4f943a6ce285ebb1 upstream.
The block layer forces a minimum segment size of PAGE_SIZE, so a segment can be too big for the ADMA table, if PAGE_SIZE >= 64KiB. Fix by writing multiple descriptors, noting that the ADMA table is sized for 4KiB chunks anyway, so it will be big enough.
Reported-and-tested-by: Bough Chen haibo.chen@nxp.com Signed-off-by: Adrian Hunter adrian.hunter@intel.com Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20211115082345.802238-1-adrian.hunter@intel.com Signed-off-by: Ulf Hansson ulf.hansson@linaro.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/mmc/host/sdhci.c | 21 ++++++++++++++++++--- drivers/mmc/host/sdhci.h | 4 +++- 2 files changed, 21 insertions(+), 4 deletions(-)
--- a/drivers/mmc/host/sdhci.c +++ b/drivers/mmc/host/sdhci.c @@ -749,7 +749,19 @@ static void sdhci_adma_table_pre(struct len -= offset; }
- BUG_ON(len > 65536); + /* + * The block layer forces a minimum segment size of PAGE_SIZE, + * so 'len' can be too big here if PAGE_SIZE >= 64KiB. Write + * multiple descriptors, noting that the ADMA table is sized + * for 4KiB chunks anyway, so it will be big enough. + */ + while (len > host->max_adma) { + int n = 32 * 1024; /* 32KiB*/ + + __sdhci_adma_write_desc(host, &desc, addr, n, ADMA2_TRAN_VALID); + addr += n; + len -= n; + }
/* tran, valid */ if (len) @@ -3568,6 +3580,7 @@ struct sdhci_host *sdhci_alloc_host(stru * descriptor for each segment, plus 1 for a nop end descriptor. */ host->adma_table_cnt = SDHCI_MAX_SEGS * 2 + 1; + host->max_adma = 65536;
return host; } @@ -4221,10 +4234,12 @@ int sdhci_setup_host(struct sdhci_host * * be larger than 64 KiB though. */ if (host->flags & SDHCI_USE_ADMA) { - if (host->quirks & SDHCI_QUIRK_BROKEN_ADMA_ZEROLEN_DESC) + if (host->quirks & SDHCI_QUIRK_BROKEN_ADMA_ZEROLEN_DESC) { + host->max_adma = 65532; /* 32-bit alignment */ mmc->max_seg_size = 65535; - else + } else { mmc->max_seg_size = 65536; + } } else { mmc->max_seg_size = mmc->max_req_size; } --- a/drivers/mmc/host/sdhci.h +++ b/drivers/mmc/host/sdhci.h @@ -349,7 +349,8 @@ struct sdhci_adma2_64_desc {
/* * Maximum segments assuming a 512KiB maximum requisition size and a minimum - * 4KiB page size. + * 4KiB page size. Note this also allows enough for multiple descriptors in + * case of PAGE_SIZE >= 64KiB. */ #define SDHCI_MAX_SEGS 128
@@ -547,6 +548,7 @@ struct sdhci_host { unsigned int blocks; /* remaining PIO blocks */
int sg_count; /* Mapped sg entries */ + int max_adma; /* Max. length in ADMA descriptor */
void *adma_table; /* ADMA descriptor table */ void *align_buffer; /* Bounce buffer */
From: Dylan Hung dylan_hung@aspeedtech.com
commit 9dbe33cf371bd70330858370bdbc35c7668f00c3 upstream.
The issue happened randomly in runtime. The message "Link is Down" is popped but soon it recovered to "Link is Up".
The "Link is Down" results from the incorrect read data for reading the PHY register via MDIO bus. The correct sequence for reading the data shall be: 1. fire the command 2. wait for command done (this step was missing) 3. wait for data idle 4. read data from data register
Cc: stable@vger.kernel.org Fixes: f160e99462c6 ("net: phy: Add mdio-aspeed") Reviewed-by: Joel Stanley joel@jms.id.au Signed-off-by: Dylan Hung dylan_hung@aspeedtech.com Reviewed-by: Andrew Lunn andrew@lunn.ch Reviewed-by: Russell King (Oracle) rmk+kernel@armlinux.org.uk Link: https://lore.kernel.org/r/20211125024432.15809-1-dylan_hung@aspeedtech.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/phy/mdio-aspeed.c | 7 +++++++ 1 file changed, 7 insertions(+)
--- a/drivers/net/phy/mdio-aspeed.c +++ b/drivers/net/phy/mdio-aspeed.c @@ -61,6 +61,13 @@ static int aspeed_mdio_read(struct mii_b
iowrite32(ctrl, ctx->base + ASPEED_MDIO_CTRL);
+ rc = readl_poll_timeout(ctx->base + ASPEED_MDIO_CTRL, ctrl, + !(ctrl & ASPEED_MDIO_CTRL_FIRE), + ASPEED_MDIO_INTERVAL_US, + ASPEED_MDIO_TIMEOUT_US); + if (rc < 0) + return rc; + rc = readl_poll_timeout(ctx->base + ASPEED_MDIO_DATA, data, data & ASPEED_MDIO_DATA_IDLE, ASPEED_MDIO_INTERVAL_US,
From: Marek Behún kabel@kernel.org
commit 67cb2a4c93499c2c22704998fd1fd2bc35194d8e upstream.
Avoid code repetition in advk_pcie_rd_conf() by handling errors with goto jump, as is customary in kernel.
Link: https://lore.kernel.org/r/20211005180952.6812-9-kabel@kernel.org Fixes: 43f5c77bcbd2 ("PCI: aardvark: Fix reporting CRS value") Signed-off-by: Marek Behún kabel@kernel.org Signed-off-by: Lorenzo Pieralisi lorenzo.pieralisi@arm.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/pci/controller/pci-aardvark.c | 48 ++++++++++++++-------------------- 1 file changed, 20 insertions(+), 28 deletions(-)
--- a/drivers/pci/controller/pci-aardvark.c +++ b/drivers/pci/controller/pci-aardvark.c @@ -773,18 +773,8 @@ static int advk_pcie_rd_conf(struct pci_ (le16_to_cpu(pcie->bridge.pcie_conf.rootctl) & PCI_EXP_RTCTL_CRSSVE);
- if (advk_pcie_pio_is_running(pcie)) { - /* - * If it is possible return Completion Retry Status so caller - * tries to issue the request again instead of failing. - */ - if (allow_crs) { - *val = CFG_RD_CRS_VAL; - return PCIBIOS_SUCCESSFUL; - } - *val = 0xffffffff; - return PCIBIOS_SET_FAILED; - } + if (advk_pcie_pio_is_running(pcie)) + goto try_crs;
/* Program the control register */ reg = advk_readl(pcie, PIO_CTRL); @@ -808,25 +798,13 @@ static int advk_pcie_rd_conf(struct pci_ advk_writel(pcie, 1, PIO_START);
ret = advk_pcie_wait_pio(pcie); - if (ret < 0) { - /* - * If it is possible return Completion Retry Status so caller - * tries to issue the request again instead of failing. - */ - if (allow_crs) { - *val = CFG_RD_CRS_VAL; - return PCIBIOS_SUCCESSFUL; - } - *val = 0xffffffff; - return PCIBIOS_SET_FAILED; - } + if (ret < 0) + goto try_crs;
/* Check PIO status and get the read result */ ret = advk_pcie_check_pio_status(pcie, allow_crs, val); - if (ret < 0) { - *val = 0xffffffff; - return PCIBIOS_SET_FAILED; - } + if (ret < 0) + goto fail;
if (size == 1) *val = (*val >> (8 * (where & 3))) & 0xff; @@ -834,6 +812,20 @@ static int advk_pcie_rd_conf(struct pci_ *val = (*val >> (8 * (where & 3))) & 0xffff;
return PCIBIOS_SUCCESSFUL; + +try_crs: + /* + * If it is possible, return Completion Retry Status so that caller + * tries to issue the request again instead of failing. + */ + if (allow_crs) { + *val = CFG_RD_CRS_VAL; + return PCIBIOS_SUCCESSFUL; + } + +fail: + *val = 0xffffffff; + return PCIBIOS_SET_FAILED; }
static int advk_pcie_wr_conf(struct pci_bus *bus, u32 devfn,
From: Remi Pommarel repk@triplefau.lt
commit f4c7d053d7f77cd5c1a1ba7c7ce085ddba13d1d7 upstream.
When configuring pcie reset pin from gpio (e.g. initially set by u-boot) to pcie function this pin goes low for a brief moment asserting the PERST# signal. Thus connected device enters fundamental reset process and link configuration can only begin after a minimal 100ms delay (see [1]).
Because the pin configuration comes from the "default" pinctrl it is implicitly configured before the probe callback is called:
driver_probe_device() really_probe() ... pinctrl_bind_pins() /* Here pin goes from gpio to PCIE reset function and PERST# is asserted */ ... drv->probe()
[1] "PCI Express Base Specification", REV. 4.0 PCI Express, February 19 2014, 6.6.1 Conventional Reset
Signed-off-by: Remi Pommarel repk@triplefau.lt Signed-off-by: Lorenzo Pieralisi lorenzo.pieralisi@arm.com Acked-by: Thomas Petazzoni thomas.petazzoni@bootlin.com Signed-off-by: Marek Behún kabel@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/pci/controller/pci-aardvark.c | 8 ++++++++ 1 file changed, 8 insertions(+)
--- a/drivers/pci/controller/pci-aardvark.c +++ b/drivers/pci/controller/pci-aardvark.c @@ -432,6 +432,14 @@ static void advk_pcie_setup_hw(struct ad reg |= PIO_CTRL_ADDR_WIN_DISABLE; advk_writel(pcie, reg, PIO_CTRL);
+ /* + * PERST# signal could have been asserted by pinctrl subsystem before + * probe() callback has been called, making the endpoint going into + * fundamental reset. As required by PCI Express spec a delay for at + * least 100ms after such a reset before link training is needed. + */ + msleep(PCI_PM_D3COLD_WAIT); + /* Start link training */ reg = advk_readl(pcie, PCIE_CORE_LINK_CTRL_STAT_REG); reg |= PCIE_CORE_LINK_TRAINING;
From: Grzegorz Jaszczyk jaz@semihalf.com
commit e078723f9cccd509482fd7f30a4afb1125ca7a2a upstream.
Initialise every multiple-byte field of emulated PCI bridge config space with proper cpu_to_le* macro. This is required since the structure describing config space of emulated bridge assumes little-endian convention.
Signed-off-by: Grzegorz Jaszczyk jaz@semihalf.com Signed-off-by: Lorenzo Pieralisi lorenzo.pieralisi@arm.com Signed-off-by: Marek Behún kabel@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/pci/controller/pci-aardvark.c | 12 +++++++----- 1 file changed, 7 insertions(+), 5 deletions(-)
--- a/drivers/pci/controller/pci-aardvark.c +++ b/drivers/pci/controller/pci-aardvark.c @@ -686,18 +686,20 @@ static int advk_sw_pci_bridge_init(struc struct pci_bridge_emul *bridge = &pcie->bridge; int ret;
- bridge->conf.vendor = advk_readl(pcie, PCIE_CORE_DEV_ID_REG) & 0xffff; - bridge->conf.device = advk_readl(pcie, PCIE_CORE_DEV_ID_REG) >> 16; + bridge->conf.vendor = + cpu_to_le16(advk_readl(pcie, PCIE_CORE_DEV_ID_REG) & 0xffff); + bridge->conf.device = + cpu_to_le16(advk_readl(pcie, PCIE_CORE_DEV_ID_REG) >> 16); bridge->conf.class_revision = - advk_readl(pcie, PCIE_CORE_DEV_REV_REG) & 0xff; + cpu_to_le32(advk_readl(pcie, PCIE_CORE_DEV_REV_REG) & 0xff);
/* Support 32 bits I/O addressing */ bridge->conf.iobase = PCI_IO_RANGE_TYPE_32; bridge->conf.iolimit = PCI_IO_RANGE_TYPE_32;
/* Support 64 bits memory pref */ - bridge->conf.pref_mem_base = PCI_PREF_RANGE_TYPE_64; - bridge->conf.pref_mem_limit = PCI_PREF_RANGE_TYPE_64; + bridge->conf.pref_mem_base = cpu_to_le16(PCI_PREF_RANGE_TYPE_64); + bridge->conf.pref_mem_limit = cpu_to_le16(PCI_PREF_RANGE_TYPE_64);
/* Support interrupt A for MSI feature */ bridge->conf.intpin = PCIE_CORE_INT_A_ASSERT_ENABLE;
From: Pali Rohár pali@kernel.org
commit 6964494582f56a3882c2c53b0edbfe99eb32b2e1 upstream.
Adding even 100ms (PCI_PM_D3COLD_WAIT) delay between enabling link training and starting link training causes detection issues with some buggy cards (such as Compex WLE900VX).
Move the code which enables link training immediately before the one which starts link traning.
This fixes detection issues of Compex WLE900VX card on Turris MOX after cold boot.
Link: https://lore.kernel.org/r/20200430080625.26070-2-pali@kernel.org Fixes: f4c7d053d7f7 ("PCI: aardvark: Wait for endpoint to be ready...") Tested-by: Tomasz Maciej Nowak tmn505@gmail.com Signed-off-by: Pali Rohár pali@kernel.org Signed-off-by: Lorenzo Pieralisi lorenzo.pieralisi@arm.com Acked-by: Rob Herring robh@kernel.org Acked-by: Thomas Petazzoni thomas.petazzoni@bootlin.com Signed-off-by: Marek Behún kabel@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/pci/controller/pci-aardvark.c | 15 +++++++++------ 1 file changed, 9 insertions(+), 6 deletions(-)
--- a/drivers/pci/controller/pci-aardvark.c +++ b/drivers/pci/controller/pci-aardvark.c @@ -394,11 +394,6 @@ static void advk_pcie_setup_hw(struct ad reg |= LANE_COUNT_1; advk_writel(pcie, reg, PCIE_CORE_CTRL0_REG);
- /* Enable link training */ - reg = advk_readl(pcie, PCIE_CORE_CTRL0_REG); - reg |= LINK_TRAINING_EN; - advk_writel(pcie, reg, PCIE_CORE_CTRL0_REG); - /* Enable MSI */ reg = advk_readl(pcie, PCIE_CORE_CTRL2_REG); reg |= PCIE_CORE_CTRL2_MSI_ENABLE; @@ -440,7 +435,15 @@ static void advk_pcie_setup_hw(struct ad */ msleep(PCI_PM_D3COLD_WAIT);
- /* Start link training */ + /* Enable link training */ + reg = advk_readl(pcie, PCIE_CORE_CTRL0_REG); + reg |= LINK_TRAINING_EN; + advk_writel(pcie, reg, PCIE_CORE_CTRL0_REG); + + /* + * Start link training immediately after enabling it. + * This solves problems for some buggy cards. + */ reg = advk_readl(pcie, PCIE_CORE_LINK_CTRL_STAT_REG); reg |= PCIE_CORE_LINK_TRAINING; advk_writel(pcie, reg, PCIE_CORE_LINK_CTRL_STAT_REG);
From: Marek Behún marek.behun@nic.cz
commit 43fc679ced18006b12d918d7a8a4af392b7fbfe7 upstream.
Currently the aardvark driver trains link in PCIe gen2 mode. This may cause some buggy gen1 cards (such as Compex WLE900VX) to be unstable or even not detected. Moreover when ASPM code tries to retrain link second time, these cards may stop responding and link goes down. If gen1 is used this does not happen.
Unconditionally forcing gen1 is not a good solution since it may have performance impact on gen2 cards.
To overcome this, read 'max-link-speed' property (as defined in PCI device tree bindings) and use this as max gen mode. Then iteratively try link training at this mode or lower until successful. After successful link training choose final controller gen based on Negotiated Link Speed from Link Status register, which should match card speed.
Link: https://lore.kernel.org/r/20200430080625.26070-5-pali@kernel.org Tested-by: Tomasz Maciej Nowak tmn505@gmail.com Signed-off-by: Pali Rohár pali@kernel.org Signed-off-by: Marek Behún marek.behun@nic.cz Signed-off-by: Lorenzo Pieralisi lorenzo.pieralisi@arm.com Reviewed-by: Rob Herring robh@kernel.org Acked-by: Thomas Petazzoni thomas.petazzoni@bootlin.com Signed-off-by: Marek Behún kabel@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/pci/controller/pci-aardvark.c | 114 ++++++++++++++++++++++++++-------- 1 file changed, 89 insertions(+), 25 deletions(-)
--- a/drivers/pci/controller/pci-aardvark.c +++ b/drivers/pci/controller/pci-aardvark.c @@ -39,6 +39,7 @@ #define PCIE_CORE_LINK_CTRL_STAT_REG 0xd0 #define PCIE_CORE_LINK_L0S_ENTRY BIT(0) #define PCIE_CORE_LINK_TRAINING BIT(5) +#define PCIE_CORE_LINK_SPEED_SHIFT 16 #define PCIE_CORE_LINK_WIDTH_SHIFT 20 #define PCIE_CORE_ERR_CAPCTL_REG 0x118 #define PCIE_CORE_ERR_CAPCTL_ECRC_CHK_TX BIT(5) @@ -249,6 +250,7 @@ struct advk_pcie { struct mutex msi_used_lock; u16 msi_msg; int root_bus_nr; + int link_gen; struct pci_bridge_emul bridge; };
@@ -309,20 +311,16 @@ static inline bool advk_pcie_link_traini
static int advk_pcie_wait_for_link(struct advk_pcie *pcie) { - struct device *dev = &pcie->pdev->dev; int retries;
/* check if the link is up or not */ for (retries = 0; retries < LINK_WAIT_MAX_RETRIES; retries++) { - if (advk_pcie_link_up(pcie)) { - dev_info(dev, "link up\n"); + if (advk_pcie_link_up(pcie)) return 0; - }
usleep_range(LINK_WAIT_USLEEP_MIN, LINK_WAIT_USLEEP_MAX); }
- dev_err(dev, "link never came up\n"); return -ETIMEDOUT; }
@@ -337,6 +335,85 @@ static void advk_pcie_wait_for_retrain(s } }
+static int advk_pcie_train_at_gen(struct advk_pcie *pcie, int gen) +{ + int ret, neg_gen; + u32 reg; + + /* Setup link speed */ + reg = advk_readl(pcie, PCIE_CORE_CTRL0_REG); + reg &= ~PCIE_GEN_SEL_MSK; + if (gen == 3) + reg |= SPEED_GEN_3; + else if (gen == 2) + reg |= SPEED_GEN_2; + else + reg |= SPEED_GEN_1; + advk_writel(pcie, reg, PCIE_CORE_CTRL0_REG); + + /* + * Enable link training. This is not needed in every call to this + * function, just once suffices, but it does not break anything either. + */ + reg = advk_readl(pcie, PCIE_CORE_CTRL0_REG); + reg |= LINK_TRAINING_EN; + advk_writel(pcie, reg, PCIE_CORE_CTRL0_REG); + + /* + * Start link training immediately after enabling it. + * This solves problems for some buggy cards. + */ + reg = advk_readl(pcie, PCIE_CORE_LINK_CTRL_STAT_REG); + reg |= PCIE_CORE_LINK_TRAINING; + advk_writel(pcie, reg, PCIE_CORE_LINK_CTRL_STAT_REG); + + ret = advk_pcie_wait_for_link(pcie); + if (ret) + return ret; + + reg = advk_readl(pcie, PCIE_CORE_LINK_CTRL_STAT_REG); + neg_gen = (reg >> PCIE_CORE_LINK_SPEED_SHIFT) & 0xf; + + return neg_gen; +} + +static void advk_pcie_train_link(struct advk_pcie *pcie) +{ + struct device *dev = &pcie->pdev->dev; + int neg_gen = -1, gen; + + /* + * Try link training at link gen specified by device tree property + * 'max-link-speed'. If this fails, iteratively train at lower gen. + */ + for (gen = pcie->link_gen; gen > 0; --gen) { + neg_gen = advk_pcie_train_at_gen(pcie, gen); + if (neg_gen > 0) + break; + } + + if (neg_gen < 0) + goto err; + + /* + * After successful training if negotiated gen is lower than requested, + * train again on negotiated gen. This solves some stability issues for + * some buggy gen1 cards. + */ + if (neg_gen < gen) { + gen = neg_gen; + neg_gen = advk_pcie_train_at_gen(pcie, gen); + } + + if (neg_gen == gen) { + dev_info(dev, "link up at gen %i\n", gen); + return; + } + +err: + dev_err(dev, "link never came up\n"); +} + static void advk_pcie_setup_hw(struct advk_pcie *pcie) { u32 reg; @@ -382,12 +459,6 @@ static void advk_pcie_setup_hw(struct ad PCIE_CORE_CTRL2_TD_ENABLE; advk_writel(pcie, reg, PCIE_CORE_CTRL2_REG);
- /* Set GEN2 */ - reg = advk_readl(pcie, PCIE_CORE_CTRL0_REG); - reg &= ~PCIE_GEN_SEL_MSK; - reg |= SPEED_GEN_2; - advk_writel(pcie, reg, PCIE_CORE_CTRL0_REG); - /* Set lane X1 */ reg = advk_readl(pcie, PCIE_CORE_CTRL0_REG); reg &= ~LANE_CNT_MSK; @@ -435,20 +506,7 @@ static void advk_pcie_setup_hw(struct ad */ msleep(PCI_PM_D3COLD_WAIT);
- /* Enable link training */ - reg = advk_readl(pcie, PCIE_CORE_CTRL0_REG); - reg |= LINK_TRAINING_EN; - advk_writel(pcie, reg, PCIE_CORE_CTRL0_REG); - - /* - * Start link training immediately after enabling it. - * This solves problems for some buggy cards. - */ - reg = advk_readl(pcie, PCIE_CORE_LINK_CTRL_STAT_REG); - reg |= PCIE_CORE_LINK_TRAINING; - advk_writel(pcie, reg, PCIE_CORE_LINK_CTRL_STAT_REG); - - advk_pcie_wait_for_link(pcie); + advk_pcie_train_link(pcie);
reg = advk_readl(pcie, PCIE_CORE_CMD_STATUS_REG); reg |= PCIE_CORE_CMD_MEM_ACCESS_EN | @@ -1278,6 +1336,12 @@ static int advk_pcie_probe(struct platfo return ret; }
+ ret = of_pci_get_max_link_speed(dev->of_node); + if (ret <= 0 || ret > 3) + pcie->link_gen = 3; + else + pcie->link_gen = ret; + advk_pcie_setup_hw(pcie);
ret = advk_sw_pci_bridge_init(pcie);
From: Pali Rohár pali@kernel.org
commit 5169a9851daaa2782a7bd2bb83d5b1bd224b2879 upstream.
Add support for issuing PERST via GPIO specified in 'reset-gpios' property (as described in PCI device tree bindings).
Some buggy cards (e.g. Compex WLE900VX or WLE1216) are not detected after reboot when PERST is not issued during driver initialization.
If bootloader already enabled link training then issuing PERST has no effect for some buggy cards (e.g. Compex WLE900VX) and these cards are not detected. We therefore clear the LINK_TRAINING_EN register before.
It was observed that Compex WLE900VX card needs to be in PERST reset for at least 10ms if bootloader enabled link training.
Tested on Turris MOX.
Link: https://lore.kernel.org/r/20200430080625.26070-6-pali@kernel.org Tested-by: Tomasz Maciej Nowak tmn505@gmail.com Signed-off-by: Pali Rohár pali@kernel.org Signed-off-by: Lorenzo Pieralisi lorenzo.pieralisi@arm.com Acked-by: Thomas Petazzoni thomas.petazzoni@bootlin.com Signed-off-by: Marek Behún kabel@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/pci/controller/pci-aardvark.c | 43 +++++++++++++++++++++++++++++++++- 1 file changed, 42 insertions(+), 1 deletion(-)
--- a/drivers/pci/controller/pci-aardvark.c +++ b/drivers/pci/controller/pci-aardvark.c @@ -9,6 +9,7 @@ */
#include <linux/delay.h> +#include <linux/gpio.h> #include <linux/interrupt.h> #include <linux/irq.h> #include <linux/irqdomain.h> @@ -17,6 +18,7 @@ #include <linux/init.h> #include <linux/platform_device.h> #include <linux/of_address.h> +#include <linux/of_gpio.h> #include <linux/of_pci.h>
#include "../pci.h" @@ -252,6 +254,7 @@ struct advk_pcie { int root_bus_nr; int link_gen; struct pci_bridge_emul bridge; + struct gpio_desc *reset_gpio; };
static inline void advk_writel(struct advk_pcie *pcie, u32 val, u64 reg) @@ -414,10 +417,31 @@ err: dev_err(dev, "link never came up\n"); }
+static void advk_pcie_issue_perst(struct advk_pcie *pcie) +{ + u32 reg; + + if (!pcie->reset_gpio) + return; + + /* PERST does not work for some cards when link training is enabled */ + reg = advk_readl(pcie, PCIE_CORE_CTRL0_REG); + reg &= ~LINK_TRAINING_EN; + advk_writel(pcie, reg, PCIE_CORE_CTRL0_REG); + + /* 10ms delay is needed for some cards */ + dev_info(&pcie->pdev->dev, "issuing PERST via reset GPIO for 10ms\n"); + gpiod_set_value_cansleep(pcie->reset_gpio, 1); + usleep_range(10000, 11000); + gpiod_set_value_cansleep(pcie->reset_gpio, 0); +} + static void advk_pcie_setup_hw(struct advk_pcie *pcie) { u32 reg;
+ advk_pcie_issue_perst(pcie); + /* Set to Direct mode */ reg = advk_readl(pcie, CTRL_CONFIG_REG); reg &= ~(CTRL_MODE_MASK << CTRL_MODE_SHIFT); @@ -500,7 +524,8 @@ static void advk_pcie_setup_hw(struct ad
/* * PERST# signal could have been asserted by pinctrl subsystem before - * probe() callback has been called, making the endpoint going into + * probe() callback has been called or issued explicitly by reset gpio + * function advk_pcie_issue_perst(), making the endpoint going into * fundamental reset. As required by PCI Express spec a delay for at * least 100ms after such a reset before link training is needed. */ @@ -1336,6 +1361,22 @@ static int advk_pcie_probe(struct platfo return ret; }
+ pcie->reset_gpio = devm_gpiod_get_from_of_node(dev, dev->of_node, + "reset-gpios", 0, + GPIOD_OUT_LOW, + "pcie1-reset"); + ret = PTR_ERR_OR_ZERO(pcie->reset_gpio); + if (ret) { + if (ret == -ENOENT) { + pcie->reset_gpio = NULL; + } else { + if (ret != -EPROBE_DEFER) + dev_err(dev, "Failed to get reset-gpio: %i\n", + ret); + return ret; + } + } + ret = of_pci_get_max_link_speed(dev->of_node); if (ret <= 0 || ret > 3) pcie->link_gen = 3;
From: Pali Rohár pali@kernel.org
commit 96be36dbffacea0aa9e6ec4839583e79faa141a1 upstream.
PCI-E capability macros are already defined in linux/pci_regs.h. Remove their reimplementation in pcie-aardvark.
Link: https://lore.kernel.org/r/20200430080625.26070-9-pali@kernel.org Tested-by: Tomasz Maciej Nowak tmn505@gmail.com Signed-off-by: Pali Rohár pali@kernel.org Signed-off-by: Lorenzo Pieralisi lorenzo.pieralisi@arm.com Reviewed-by: Rob Herring robh@kernel.org Acked-by: Thomas Petazzoni thomas.petazzoni@bootlin.com Signed-off-by: Marek Behún kabel@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/pci/controller/pci-aardvark.c | 41 ++++++++++++++-------------------- 1 file changed, 18 insertions(+), 23 deletions(-)
--- a/drivers/pci/controller/pci-aardvark.c +++ b/drivers/pci/controller/pci-aardvark.c @@ -32,17 +32,6 @@ #define PCIE_CORE_CMD_MEM_IO_REQ_EN BIT(2) #define PCIE_CORE_DEV_REV_REG 0x8 #define PCIE_CORE_PCIEXP_CAP 0xc0 -#define PCIE_CORE_DEV_CTRL_STATS_REG 0xc8 -#define PCIE_CORE_DEV_CTRL_STATS_RELAX_ORDER_DISABLE (0 << 4) -#define PCIE_CORE_DEV_CTRL_STATS_MAX_PAYLOAD_SZ_SHIFT 5 -#define PCIE_CORE_DEV_CTRL_STATS_SNOOP_DISABLE (0 << 11) -#define PCIE_CORE_DEV_CTRL_STATS_MAX_RD_REQ_SIZE_SHIFT 12 -#define PCIE_CORE_DEV_CTRL_STATS_MAX_RD_REQ_SZ 0x2 -#define PCIE_CORE_LINK_CTRL_STAT_REG 0xd0 -#define PCIE_CORE_LINK_L0S_ENTRY BIT(0) -#define PCIE_CORE_LINK_TRAINING BIT(5) -#define PCIE_CORE_LINK_SPEED_SHIFT 16 -#define PCIE_CORE_LINK_WIDTH_SHIFT 20 #define PCIE_CORE_ERR_CAPCTL_REG 0x118 #define PCIE_CORE_ERR_CAPCTL_ECRC_CHK_TX BIT(5) #define PCIE_CORE_ERR_CAPCTL_ECRC_CHK_TX_EN BIT(6) @@ -267,6 +256,11 @@ static inline u32 advk_readl(struct advk return readl(pcie->base + reg); }
+static inline u16 advk_read16(struct advk_pcie *pcie, u64 reg) +{ + return advk_readl(pcie, (reg & ~0x3)) >> ((reg & 0x3) * 8); +} + static u8 advk_pcie_ltssm_state(struct advk_pcie *pcie) { u32 val; @@ -366,16 +360,16 @@ static int advk_pcie_train_at_gen(struct * Start link training immediately after enabling it. * This solves problems for some buggy cards. */ - reg = advk_readl(pcie, PCIE_CORE_LINK_CTRL_STAT_REG); - reg |= PCIE_CORE_LINK_TRAINING; - advk_writel(pcie, reg, PCIE_CORE_LINK_CTRL_STAT_REG); + reg = advk_readl(pcie, PCIE_CORE_PCIEXP_CAP + PCI_EXP_LNKCTL); + reg |= PCI_EXP_LNKCTL_RL; + advk_writel(pcie, reg, PCIE_CORE_PCIEXP_CAP + PCI_EXP_LNKCTL);
ret = advk_pcie_wait_for_link(pcie); if (ret) return ret;
- reg = advk_readl(pcie, PCIE_CORE_LINK_CTRL_STAT_REG); - neg_gen = (reg >> PCIE_CORE_LINK_SPEED_SHIFT) & 0xf; + reg = advk_read16(pcie, PCIE_CORE_PCIEXP_CAP + PCI_EXP_LNKSTA); + neg_gen = reg & PCI_EXP_LNKSTA_CLS;
return neg_gen; } @@ -470,13 +464,14 @@ static void advk_pcie_setup_hw(struct ad PCIE_CORE_ERR_CAPCTL_ECRC_CHCK_RCV; advk_writel(pcie, reg, PCIE_CORE_ERR_CAPCTL_REG);
- /* Set PCIe Device Control and Status 1 PF0 register */ - reg = PCIE_CORE_DEV_CTRL_STATS_RELAX_ORDER_DISABLE | - (7 << PCIE_CORE_DEV_CTRL_STATS_MAX_PAYLOAD_SZ_SHIFT) | - PCIE_CORE_DEV_CTRL_STATS_SNOOP_DISABLE | - (PCIE_CORE_DEV_CTRL_STATS_MAX_RD_REQ_SZ << - PCIE_CORE_DEV_CTRL_STATS_MAX_RD_REQ_SIZE_SHIFT); - advk_writel(pcie, reg, PCIE_CORE_DEV_CTRL_STATS_REG); + /* Set PCIe Device Control register */ + reg = advk_readl(pcie, PCIE_CORE_PCIEXP_CAP + PCI_EXP_DEVCTL); + reg &= ~PCI_EXP_DEVCTL_RELAX_EN; + reg &= ~PCI_EXP_DEVCTL_NOSNOOP_EN; + reg &= ~PCI_EXP_DEVCTL_READRQ; + reg |= PCI_EXP_DEVCTL_PAYLOAD; /* Set max payload size */ + reg |= PCI_EXP_DEVCTL_READRQ_512B; + advk_writel(pcie, reg, PCIE_CORE_PCIEXP_CAP + PCI_EXP_DEVCTL);
/* Program PCIe Control 2 to disable strict ordering */ reg = PCIE_CORE_CTRL2_RESERVED |
From: Pali Rohár pali@kernel.org
commit 70e380250c3621c55ff218cbaf2272830d9dbb1d upstream.
When there is no PCIe card connected and advk_pcie_rd_conf() or advk_pcie_wr_conf() is called for PCI bus which doesn't belong to emulated root bridge, the aardvark driver throws the following error message:
advk-pcie d0070000.pcie: config read/write timed out
Obviously accessing PCIe registers of disconnected card is not possible.
Extend check in advk_pcie_valid_device() function for validating availability of PCIe bus. If PCIe link is down, then the device is marked as Not Found and the driver does not try to access these registers.
This is just an optimization to prevent accessing PCIe registers when card is disconnected. Trying to access PCIe registers of disconnected card does not cause any crash, kernel just needs to wait for a timeout. So if card disappear immediately after checking for PCIe link (before accessing PCIe registers), it does not cause any problems.
Link: https://lore.kernel.org/r/20200702083036.12230-1-pali@kernel.org Signed-off-by: Pali Rohár pali@kernel.org Signed-off-by: Lorenzo Pieralisi lorenzo.pieralisi@arm.com Signed-off-by: Marek Behún kabel@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/pci/controller/pci-aardvark.c | 7 +++++++ 1 file changed, 7 insertions(+)
--- a/drivers/pci/controller/pci-aardvark.c +++ b/drivers/pci/controller/pci-aardvark.c @@ -806,6 +806,13 @@ static bool advk_pcie_valid_device(struc if ((bus->number == pcie->root_bus_nr) && PCI_SLOT(devfn) != 0) return false;
+ /* + * If the link goes down after we check for link-up, nothing bad + * happens but the config access times out. + */ + if (bus->number != pcie->root_bus_nr && !advk_pcie_link_up(pcie)) + return false; + return true; }
From: Pali Rohár pali@kernel.org
commit b32c012e4b98f0126aa327be2d1f409963057643 upstream.
Include linux/gpio/consumer.h instead of linux/gpio.h, as is said in the latter file.
This was reported by kernel test bot when compiling for s390.
drivers/pci/controller/pci-aardvark.c:350:2: error: implicit declaration of function 'gpiod_set_value_cansleep' [-Werror,-Wimplicit-function-declaration] drivers/pci/controller/pci-aardvark.c:1074:21: error: implicit declaration of function 'devm_gpiod_get_from_of_node' [-Werror,-Wimplicit-function-declaration] drivers/pci/controller/pci-aardvark.c:1076:14: error: use of undeclared identifier 'GPIOD_OUT_LOW'
Link: https://lore.kernel.org/r/202006211118.LxtENQfl%25lkp@intel.com Link: https://lore.kernel.org/r/20200907111038.5811-2-pali@kernel.org Fixes: 5169a9851daa ("PCI: aardvark: Issue PERST via GPIO") Reported-by: kernel test robot lkp@intel.com Signed-off-by: Pali Rohár pali@kernel.org Signed-off-by: Lorenzo Pieralisi lorenzo.pieralisi@arm.com Reviewed-by: Marek Behún marek.behun@nic.cz Signed-off-by: Marek Behún kabel@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/pci/controller/pci-aardvark.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/pci/controller/pci-aardvark.c +++ b/drivers/pci/controller/pci-aardvark.c @@ -9,7 +9,7 @@ */
#include <linux/delay.h> -#include <linux/gpio.h> +#include <linux/gpio/consumer.h> #include <linux/interrupt.h> #include <linux/irq.h> #include <linux/irqdomain.h>
From: Pali Rohár pali@kernel.org
commit d0c6a3475b033960e85ae2bf176b14cab0a627d2 upstream.
Move code which belongs to link training (delays and resets) into advk_pcie_train_link() function, so everything related to link training, including timings is at one place.
After experiments it can be observed that link training in aardvark hardware is very sensitive to timings and delays, so it is a good idea to have this code at the same place as link training calls.
This patch does not change behavior of aardvark initialization.
Link: https://lore.kernel.org/r/20200907111038.5811-6-pali@kernel.org Tested-by: Marek Behún marek.behun@nic.cz Signed-off-by: Pali Rohár pali@kernel.org Signed-off-by: Lorenzo Pieralisi lorenzo.pieralisi@arm.com Signed-off-by: Marek Behún kabel@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/pci/controller/pci-aardvark.c | 64 ++++++++++++++++++---------------- 1 file changed, 34 insertions(+), 30 deletions(-)
--- a/drivers/pci/controller/pci-aardvark.c +++ b/drivers/pci/controller/pci-aardvark.c @@ -332,6 +332,25 @@ static void advk_pcie_wait_for_retrain(s } }
+static void advk_pcie_issue_perst(struct advk_pcie *pcie) +{ + u32 reg; + + if (!pcie->reset_gpio) + return; + + /* PERST does not work for some cards when link training is enabled */ + reg = advk_readl(pcie, PCIE_CORE_CTRL0_REG); + reg &= ~LINK_TRAINING_EN; + advk_writel(pcie, reg, PCIE_CORE_CTRL0_REG); + + /* 10ms delay is needed for some cards */ + dev_info(&pcie->pdev->dev, "issuing PERST via reset GPIO for 10ms\n"); + gpiod_set_value_cansleep(pcie->reset_gpio, 1); + usleep_range(10000, 11000); + gpiod_set_value_cansleep(pcie->reset_gpio, 0); +} + static int advk_pcie_train_at_gen(struct advk_pcie *pcie, int gen) { int ret, neg_gen; @@ -380,6 +399,21 @@ static void advk_pcie_train_link(struct int neg_gen = -1, gen;
/* + * Reset PCIe card via PERST# signal. Some cards are not detected + * during link training when they are in some non-initial state. + */ + advk_pcie_issue_perst(pcie); + + /* + * PERST# signal could have been asserted by pinctrl subsystem before + * probe() callback has been called or issued explicitly by reset gpio + * function advk_pcie_issue_perst(), making the endpoint going into + * fundamental reset. As required by PCI Express spec a delay for at + * least 100ms after such a reset before link training is needed. + */ + msleep(PCI_PM_D3COLD_WAIT); + + /* * Try link training at link gen specified by device tree property * 'max-link-speed'. If this fails, iteratively train at lower gen. */ @@ -411,31 +445,10 @@ err: dev_err(dev, "link never came up\n"); }
-static void advk_pcie_issue_perst(struct advk_pcie *pcie) -{ - u32 reg; - - if (!pcie->reset_gpio) - return; - - /* PERST does not work for some cards when link training is enabled */ - reg = advk_readl(pcie, PCIE_CORE_CTRL0_REG); - reg &= ~LINK_TRAINING_EN; - advk_writel(pcie, reg, PCIE_CORE_CTRL0_REG); - - /* 10ms delay is needed for some cards */ - dev_info(&pcie->pdev->dev, "issuing PERST via reset GPIO for 10ms\n"); - gpiod_set_value_cansleep(pcie->reset_gpio, 1); - usleep_range(10000, 11000); - gpiod_set_value_cansleep(pcie->reset_gpio, 0); -} - static void advk_pcie_setup_hw(struct advk_pcie *pcie) { u32 reg;
- advk_pcie_issue_perst(pcie); - /* Set to Direct mode */ reg = advk_readl(pcie, CTRL_CONFIG_REG); reg &= ~(CTRL_MODE_MASK << CTRL_MODE_SHIFT); @@ -517,15 +530,6 @@ static void advk_pcie_setup_hw(struct ad reg |= PIO_CTRL_ADDR_WIN_DISABLE; advk_writel(pcie, reg, PIO_CTRL);
- /* - * PERST# signal could have been asserted by pinctrl subsystem before - * probe() callback has been called or issued explicitly by reset gpio - * function advk_pcie_issue_perst(), making the endpoint going into - * fundamental reset. As required by PCI Express spec a delay for at - * least 100ms after such a reset before link training is needed. - */ - msleep(PCI_PM_D3COLD_WAIT); - advk_pcie_train_link(pcie);
reg = advk_readl(pcie, PCIE_CORE_CMD_STATUS_REG);
From: Pali Rohár pali@kernel.org
commit 1d1cd163d0de22a4041a6f1aeabcf78f80076539 upstream.
According to PCI Express Base Specifications (rev 4.0, 6.6.1 "Conventional reset"), after fundamental reset a 100ms delay is needed prior to enabling link training.
Update comment in code to reflect this requirement.
Link: https://lore.kernel.org/r/20201202184659.3795-1-pali@kernel.org Signed-off-by: Pali Rohár pali@kernel.org Signed-off-by: Lorenzo Pieralisi lorenzo.pieralisi@arm.com Signed-off-by: Marek Behún kabel@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/pci/controller/pci-aardvark.c | 9 ++++++++- 1 file changed, 8 insertions(+), 1 deletion(-)
--- a/drivers/pci/controller/pci-aardvark.c +++ b/drivers/pci/controller/pci-aardvark.c @@ -339,7 +339,14 @@ static void advk_pcie_issue_perst(struct if (!pcie->reset_gpio) return;
- /* PERST does not work for some cards when link training is enabled */ + /* + * As required by PCI Express spec (PCI Express Base Specification, REV. + * 4.0 PCI Express, February 19 2014, 6.6.1 Conventional Reset) a delay + * for at least 100ms after de-asserting PERST# signal is needed before + * link training is enabled. So ensure that link training is disabled + * prior de-asserting PERST# signal to fulfill that PCI Express spec + * requirement. + */ reg = advk_readl(pcie, PCIE_CORE_CTRL0_REG); reg &= ~LINK_TRAINING_EN; advk_writel(pcie, reg, PCIE_CORE_CTRL0_REG);
From: Russell King rmk+kernel@armlinux.org.uk
commit f8ee579d53aca887d93f5f411462f25c085a5106 upstream.
We allow up to PCI_EXP_SLTSTA2 registers to be accessed, but the pcie_cap_regs_behavior[] array only covers up to PCI_EXP_RTSTA. Expand this array to avoid walking off the end of it.
Do the same for pci_regs_behavior for consistency[], and add a BUILD_BUG_ON() to also check the bridge->conf structure size.
Fixes: 23a5fba4d941 ("PCI: Introduce PCI bridge emulated config space common logic") Link: https://lore.kernel.org/r/E1l6z9W-0006Re-MQ@rmk-PC.armlinux.org.uk Signed-off-by: Russell King rmk+kernel@armlinux.org.uk Signed-off-by: Bjorn Helgaas bhelgaas@google.com Reviewed-by: Pali Rohár pali@kernel.org Signed-off-by: Marek Behún kabel@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/pci/pci-bridge-emul.c | 11 ++++++++--- 1 file changed, 8 insertions(+), 3 deletions(-)
--- a/drivers/pci/pci-bridge-emul.c +++ b/drivers/pci/pci-bridge-emul.c @@ -21,8 +21,9 @@ #include "pci-bridge-emul.h"
#define PCI_BRIDGE_CONF_END PCI_STD_HEADER_SIZEOF +#define PCI_CAP_PCIE_SIZEOF (PCI_EXP_SLTSTA2 + 2) #define PCI_CAP_PCIE_START PCI_BRIDGE_CONF_END -#define PCI_CAP_PCIE_END (PCI_CAP_PCIE_START + PCI_EXP_SLTSTA2 + 2) +#define PCI_CAP_PCIE_END (PCI_CAP_PCIE_START + PCI_CAP_PCIE_SIZEOF)
struct pci_bridge_reg_behavior { /* Read-only bits */ @@ -38,7 +39,8 @@ struct pci_bridge_reg_behavior { u32 rsvd; };
-static const struct pci_bridge_reg_behavior pci_regs_behavior[] = { +static const +struct pci_bridge_reg_behavior pci_regs_behavior[PCI_STD_HEADER_SIZEOF / 4] = { [PCI_VENDOR_ID / 4] = { .ro = ~0 }, [PCI_COMMAND / 4] = { .rw = (PCI_COMMAND_IO | PCI_COMMAND_MEMORY | @@ -173,7 +175,8 @@ static const struct pci_bridge_reg_behav }, };
-static const struct pci_bridge_reg_behavior pcie_cap_regs_behavior[] = { +static const +struct pci_bridge_reg_behavior pcie_cap_regs_behavior[PCI_CAP_PCIE_SIZEOF / 4] = { [PCI_CAP_LIST_ID / 4] = { /* * Capability ID, Next Capability Pointer and @@ -270,6 +273,8 @@ static const struct pci_bridge_reg_behav int pci_bridge_emul_init(struct pci_bridge_emul *bridge, unsigned int flags) { + BUILD_BUG_ON(sizeof(bridge->conf) != PCI_BRIDGE_CONF_END); + bridge->conf.class_revision |= cpu_to_le32(PCI_CLASS_BRIDGE_PCI << 16); bridge->conf.header_type = PCI_HEADER_TYPE_BRIDGE; bridge->conf.cache_line_size = 0x10;
From: Pali Rohár pali@kernel.org
commit 64f160e19e9264a7f6d89c516baae1473b6f8359 upstream.
In commit 6df6ba974a55 ("PCI: aardvark: Remove PCIe outbound window configuration") was removed aardvark PCIe outbound window configuration and commit description said that was recommended solution by HW designers.
But that commit completely removed support for configuring PCIe IO resources without removing PCIe IO 'ranges' from DTS files. After that commit PCIe IO space started to be treated as PCIe MEM space and accessing it just caused kernel crash.
Moreover implementation of PCIe outbound windows prior that commit was incorrect. It completely ignored offset between CPU address and PCIe bus address and expected that in DTS is CPU address always same as PCIe bus address without doing any checks. Also it completely ignored size of every PCIe resource specified in 'ranges' DTS property and expected that every PCIe resource has size 128 MB (also for PCIe IO range). Again without any check. Apparently none of PCIe resource has in DTS specified size of 128 MB. So it was completely broken and thanks to how aardvark mask works, configuration was completely ignored.
This patch reverts back support for PCIe outbound window configuration but implementation is a new without issues mentioned above. PCIe outbound window is required when DTS specify in 'ranges' property non-zero offset between CPU and PCIe address space. To address recommendation by HW designers as specified in commit description of 6df6ba974a55, set default outbound parameters as PCIe MEM access without translation and therefore for this PCIe 'ranges' it is not needed to configure PCIe outbound window. For PCIe IO space is needed to configure aardvark PCIe outbound window.
This patch fixes kernel crash when trying to access PCIe IO space.
Link: https://lore.kernel.org/r/20210624215546.4015-2-pali@kernel.org Signed-off-by: Pali Rohár pali@kernel.org Signed-off-by: Lorenzo Pieralisi lorenzo.pieralisi@arm.com Cc: stable@vger.kernel.org # 6df6ba974a55 ("PCI: aardvark: Remove PCIe outbound window configuration") Signed-off-by: Marek Behún kabel@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/pci/controller/pci-aardvark.c | 190 +++++++++++++++++++++++++++++++++- 1 file changed, 189 insertions(+), 1 deletion(-)
--- a/drivers/pci/controller/pci-aardvark.c +++ b/drivers/pci/controller/pci-aardvark.c @@ -114,6 +114,46 @@ #define PCIE_MSI_PAYLOAD_REG (CONTROL_BASE_ADDR + 0x9C) #define PCIE_MSI_DATA_MASK GENMASK(15, 0)
+/* PCIe window configuration */ +#define OB_WIN_BASE_ADDR 0x4c00 +#define OB_WIN_BLOCK_SIZE 0x20 +#define OB_WIN_COUNT 8 +#define OB_WIN_REG_ADDR(win, offset) (OB_WIN_BASE_ADDR + \ + OB_WIN_BLOCK_SIZE * (win) + \ + (offset)) +#define OB_WIN_MATCH_LS(win) OB_WIN_REG_ADDR(win, 0x00) +#define OB_WIN_ENABLE BIT(0) +#define OB_WIN_MATCH_MS(win) OB_WIN_REG_ADDR(win, 0x04) +#define OB_WIN_REMAP_LS(win) OB_WIN_REG_ADDR(win, 0x08) +#define OB_WIN_REMAP_MS(win) OB_WIN_REG_ADDR(win, 0x0c) +#define OB_WIN_MASK_LS(win) OB_WIN_REG_ADDR(win, 0x10) +#define OB_WIN_MASK_MS(win) OB_WIN_REG_ADDR(win, 0x14) +#define OB_WIN_ACTIONS(win) OB_WIN_REG_ADDR(win, 0x18) +#define OB_WIN_DEFAULT_ACTIONS (OB_WIN_ACTIONS(OB_WIN_COUNT-1) + 0x4) +#define OB_WIN_FUNC_NUM_MASK GENMASK(31, 24) +#define OB_WIN_FUNC_NUM_SHIFT 24 +#define OB_WIN_FUNC_NUM_ENABLE BIT(23) +#define OB_WIN_BUS_NUM_BITS_MASK GENMASK(22, 20) +#define OB_WIN_BUS_NUM_BITS_SHIFT 20 +#define OB_WIN_MSG_CODE_ENABLE BIT(22) +#define OB_WIN_MSG_CODE_MASK GENMASK(21, 14) +#define OB_WIN_MSG_CODE_SHIFT 14 +#define OB_WIN_MSG_PAYLOAD_LEN BIT(12) +#define OB_WIN_ATTR_ENABLE BIT(11) +#define OB_WIN_ATTR_TC_MASK GENMASK(10, 8) +#define OB_WIN_ATTR_TC_SHIFT 8 +#define OB_WIN_ATTR_RELAXED BIT(7) +#define OB_WIN_ATTR_NOSNOOP BIT(6) +#define OB_WIN_ATTR_POISON BIT(5) +#define OB_WIN_ATTR_IDO BIT(4) +#define OB_WIN_TYPE_MASK GENMASK(3, 0) +#define OB_WIN_TYPE_SHIFT 0 +#define OB_WIN_TYPE_MEM 0x0 +#define OB_WIN_TYPE_IO 0x4 +#define OB_WIN_TYPE_CONFIG_TYPE0 0x8 +#define OB_WIN_TYPE_CONFIG_TYPE1 0x9 +#define OB_WIN_TYPE_MSG 0xc + /* LMI registers base address and register offsets */ #define LMI_BASE_ADDR 0x6000 #define CFG_REG (LMI_BASE_ADDR + 0x0) @@ -229,6 +269,13 @@ struct advk_pcie { struct platform_device *pdev; void __iomem *base; struct list_head resources; + struct { + phys_addr_t match; + phys_addr_t remap; + phys_addr_t mask; + u32 actions; + } wins[OB_WIN_COUNT]; + u8 wins_count; struct irq_domain *irq_domain; struct irq_chip irq_chip; raw_spinlock_t irq_lock; @@ -452,9 +499,39 @@ err: dev_err(dev, "link never came up\n"); }
+/* + * Set PCIe address window register which could be used for memory + * mapping. + */ +static void advk_pcie_set_ob_win(struct advk_pcie *pcie, u8 win_num, + phys_addr_t match, phys_addr_t remap, + phys_addr_t mask, u32 actions) +{ + advk_writel(pcie, OB_WIN_ENABLE | + lower_32_bits(match), OB_WIN_MATCH_LS(win_num)); + advk_writel(pcie, upper_32_bits(match), OB_WIN_MATCH_MS(win_num)); + advk_writel(pcie, lower_32_bits(remap), OB_WIN_REMAP_LS(win_num)); + advk_writel(pcie, upper_32_bits(remap), OB_WIN_REMAP_MS(win_num)); + advk_writel(pcie, lower_32_bits(mask), OB_WIN_MASK_LS(win_num)); + advk_writel(pcie, upper_32_bits(mask), OB_WIN_MASK_MS(win_num)); + advk_writel(pcie, actions, OB_WIN_ACTIONS(win_num)); +} + +static void advk_pcie_disable_ob_win(struct advk_pcie *pcie, u8 win_num) +{ + advk_writel(pcie, 0, OB_WIN_MATCH_LS(win_num)); + advk_writel(pcie, 0, OB_WIN_MATCH_MS(win_num)); + advk_writel(pcie, 0, OB_WIN_REMAP_LS(win_num)); + advk_writel(pcie, 0, OB_WIN_REMAP_MS(win_num)); + advk_writel(pcie, 0, OB_WIN_MASK_LS(win_num)); + advk_writel(pcie, 0, OB_WIN_MASK_MS(win_num)); + advk_writel(pcie, 0, OB_WIN_ACTIONS(win_num)); +} + static void advk_pcie_setup_hw(struct advk_pcie *pcie) { u32 reg; + int i;
/* Set to Direct mode */ reg = advk_readl(pcie, CTRL_CONFIG_REG); @@ -528,15 +605,51 @@ static void advk_pcie_setup_hw(struct ad reg = PCIE_IRQ_ALL_MASK & (~PCIE_IRQ_ENABLE_INTS_MASK); advk_writel(pcie, reg, HOST_CTRL_INT_MASK_REG);
+ /* + * Enable AXI address window location generation: + * When it is enabled, the default outbound window + * configurations (Default User Field: 0xD0074CFC) + * are used to transparent address translation for + * the outbound transactions. Thus, PCIe address + * windows are not required for transparent memory + * access when default outbound window configuration + * is set for memory access. + */ reg = advk_readl(pcie, PCIE_CORE_CTRL2_REG); reg |= PCIE_CORE_CTRL2_OB_WIN_ENABLE; advk_writel(pcie, reg, PCIE_CORE_CTRL2_REG);
- /* Bypass the address window mapping for PIO */ + /* + * Set memory access in Default User Field so it + * is not required to configure PCIe address for + * transparent memory access. + */ + advk_writel(pcie, OB_WIN_TYPE_MEM, OB_WIN_DEFAULT_ACTIONS); + + /* + * Bypass the address window mapping for PIO: + * Since PIO access already contains all required + * info over AXI interface by PIO registers, the + * address window is not required. + */ reg = advk_readl(pcie, PIO_CTRL); reg |= PIO_CTRL_ADDR_WIN_DISABLE; advk_writel(pcie, reg, PIO_CTRL);
+ /* + * Configure PCIe address windows for non-memory or + * non-transparent access as by default PCIe uses + * transparent memory access. + */ + for (i = 0; i < pcie->wins_count; i++) + advk_pcie_set_ob_win(pcie, i, + pcie->wins[i].match, pcie->wins[i].remap, + pcie->wins[i].mask, pcie->wins[i].actions); + + /* Disable remaining PCIe outbound windows */ + for (i = pcie->wins_count; i < OB_WIN_COUNT; i++) + advk_pcie_disable_ob_win(pcie, i); + advk_pcie_train_link(pcie);
reg = advk_readl(pcie, PCIE_CORE_CMD_STATUS_REG); @@ -1345,6 +1458,7 @@ static int advk_pcie_probe(struct platfo struct advk_pcie *pcie; struct resource *res; struct pci_host_bridge *bridge; + struct resource_entry *entry; int ret, irq;
bridge = devm_pci_alloc_host_bridge(dev, sizeof(struct advk_pcie)); @@ -1374,6 +1488,80 @@ static int advk_pcie_probe(struct platfo return ret; }
+ resource_list_for_each_entry(entry, &pcie->resources) { + resource_size_t start = entry->res->start; + resource_size_t size = resource_size(entry->res); + unsigned long type = resource_type(entry->res); + u64 win_size; + + /* + * Aardvark hardware allows to configure also PCIe window + * for config type 0 and type 1 mapping, but driver uses + * only PIO for issuing configuration transfers which does + * not use PCIe window configuration. + */ + if (type != IORESOURCE_MEM && type != IORESOURCE_MEM_64 && + type != IORESOURCE_IO) + continue; + + /* + * Skip transparent memory resources. Default outbound access + * configuration is set to transparent memory access so it + * does not need window configuration. + */ + if ((type == IORESOURCE_MEM || type == IORESOURCE_MEM_64) && + entry->offset == 0) + continue; + + /* + * The n-th PCIe window is configured by tuple (match, remap, mask) + * and an access to address A uses this window if A matches the + * match with given mask. + * So every PCIe window size must be a power of two and every start + * address must be aligned to window size. Minimal size is 64 KiB + * because lower 16 bits of mask must be zero. Remapped address + * may have set only bits from the mask. + */ + while (pcie->wins_count < OB_WIN_COUNT && size > 0) { + /* Calculate the largest aligned window size */ + win_size = (1ULL << (fls64(size)-1)) | + (start ? (1ULL << __ffs64(start)) : 0); + win_size = 1ULL << __ffs64(win_size); + if (win_size < 0x10000) + break; + + dev_dbg(dev, + "Configuring PCIe window %d: [0x%llx-0x%llx] as %lu\n", + pcie->wins_count, (unsigned long long)start, + (unsigned long long)start + win_size, type); + + if (type == IORESOURCE_IO) { + pcie->wins[pcie->wins_count].actions = OB_WIN_TYPE_IO; + pcie->wins[pcie->wins_count].match = pci_pio_to_address(start); + } else { + pcie->wins[pcie->wins_count].actions = OB_WIN_TYPE_MEM; + pcie->wins[pcie->wins_count].match = start; + } + pcie->wins[pcie->wins_count].remap = start - entry->offset; + pcie->wins[pcie->wins_count].mask = ~(win_size - 1); + + if (pcie->wins[pcie->wins_count].remap & (win_size - 1)) + break; + + start += win_size; + size -= win_size; + pcie->wins_count++; + } + + if (size > 0) { + dev_err(&pcie->pdev->dev, + "Invalid PCIe region [0x%llx-0x%llx]\n", + (unsigned long long)entry->res->start, + (unsigned long long)entry->res->end + 1); + return -EINVAL; + } + } + pcie->reset_gpio = devm_gpiod_get_from_of_node(dev, dev->of_node, "reset-gpios", 0, GPIOD_OUT_LOW,
From: Pali Rohár pali@kernel.org
commit a4e17d65dafdd3513042d8f00404c9b6068a825c upstream.
Change PCIe Max Payload Size setting in PCIe Device Control register to 512 bytes to align with PCIe Link Initialization sequence as defined in Marvell Armada 3700 Functional Specification. According to the specification, maximal Max Payload Size supported by this device is 512 bytes.
Without this kernel prints suspicious line:
pci 0000:01:00.0: Upstream bridge's Max Payload Size set to 256 (was 16384, max 512)
With this change it changes to:
pci 0000:01:00.0: Upstream bridge's Max Payload Size set to 256 (was 512, max 512)
Link: https://lore.kernel.org/r/20211005180952.6812-3-kabel@kernel.org Fixes: 8c39d710363c ("PCI: aardvark: Add Aardvark PCI host controller driver") Signed-off-by: Pali Rohár pali@kernel.org Signed-off-by: Marek Behún kabel@kernel.org Signed-off-by: Lorenzo Pieralisi lorenzo.pieralisi@arm.com Reviewed-by: Marek Behún kabel@kernel.org Cc: stable@vger.kernel.org Signed-off-by: Marek Behún kabel@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/pci/controller/pci-aardvark.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
--- a/drivers/pci/controller/pci-aardvark.c +++ b/drivers/pci/controller/pci-aardvark.c @@ -565,8 +565,9 @@ static void advk_pcie_setup_hw(struct ad reg = advk_readl(pcie, PCIE_CORE_PCIEXP_CAP + PCI_EXP_DEVCTL); reg &= ~PCI_EXP_DEVCTL_RELAX_EN; reg &= ~PCI_EXP_DEVCTL_NOSNOOP_EN; + reg &= ~PCI_EXP_DEVCTL_PAYLOAD; reg &= ~PCI_EXP_DEVCTL_READRQ; - reg |= PCI_EXP_DEVCTL_PAYLOAD; /* Set max payload size */ + reg |= PCI_EXP_DEVCTL_PAYLOAD_512B; reg |= PCI_EXP_DEVCTL_READRQ_512B; advk_writel(pcie, reg, PCIE_CORE_PCIEXP_CAP + PCI_EXP_DEVCTL);
From: Pali Rohár pali@kernel.org
commit 223dec14a05337a4155f1deed46d2becce4d00fd upstream.
Commit 43f5c77bcbd2 ("PCI: aardvark: Fix reporting CRS value") fixed handling of CRS response and when CRSSVE flag was not enabled it marked CRS response as failed transaction (due to simplicity).
But pci-aardvark.c driver is already waiting up to the PIO_RETRY_CNT count for PIO config response and so we can with a small change implement re-issuing of config requests as described in PCIe base specification.
This change implements re-issuing of config requests when response is CRS. Set upper bound of wait cycles to around PIO_RETRY_CNT, afterwards the transaction is marked as failed and an all-ones value is returned as before.
We do this by returning appropriate error codes from function advk_pcie_check_pio_status(). On CRS we return -EAGAIN and caller then reissues transaction.
Link: https://lore.kernel.org/r/20211005180952.6812-10-kabel@kernel.org Signed-off-by: Pali Rohár pali@kernel.org Signed-off-by: Marek Behún kabel@kernel.org Signed-off-by: Lorenzo Pieralisi lorenzo.pieralisi@arm.com Reviewed-by: Marek Behún kabel@kernel.org Signed-off-by: Marek Behún kabel@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/pci/controller/pci-aardvark.c | 69 +++++++++++++++++++++------------- 1 file changed, 44 insertions(+), 25 deletions(-)
--- a/drivers/pci/controller/pci-aardvark.c +++ b/drivers/pci/controller/pci-aardvark.c @@ -666,6 +666,7 @@ static int advk_pcie_check_pio_status(st u32 reg; unsigned int status; char *strcomp_status, *str_posted; + int ret;
reg = advk_readl(pcie, PIO_STAT); status = (reg & PIO_COMPLETION_STATUS_MASK) >> @@ -690,6 +691,7 @@ static int advk_pcie_check_pio_status(st case PIO_COMPLETION_STATUS_OK: if (reg & PIO_ERR_STATUS) { strcomp_status = "COMP_ERR"; + ret = -EFAULT; break; } /* Get the read result */ @@ -697,9 +699,11 @@ static int advk_pcie_check_pio_status(st *val = advk_readl(pcie, PIO_RD_DATA); /* No error */ strcomp_status = NULL; + ret = 0; break; case PIO_COMPLETION_STATUS_UR: strcomp_status = "UR"; + ret = -EOPNOTSUPP; break; case PIO_COMPLETION_STATUS_CRS: if (allow_crs && val) { @@ -717,6 +721,7 @@ static int advk_pcie_check_pio_status(st */ *val = CFG_RD_CRS_VAL; strcomp_status = NULL; + ret = 0; break; } /* PCIe r4.0, sec 2.3.2, says: @@ -732,21 +737,24 @@ static int advk_pcie_check_pio_status(st * Request and taking appropriate action, e.g., complete the * Request to the host as a failed transaction. * - * To simplify implementation do not re-issue the Configuration - * Request and complete the Request as a failed transaction. + * So return -EAGAIN and caller (pci-aardvark.c driver) will + * re-issue request again up to the PIO_RETRY_CNT retries. */ strcomp_status = "CRS"; + ret = -EAGAIN; break; case PIO_COMPLETION_STATUS_CA: strcomp_status = "CA"; + ret = -ECANCELED; break; default: strcomp_status = "Unknown"; + ret = -EINVAL; break; }
if (!strcomp_status) - return 0; + return ret;
if (reg & PIO_NON_POSTED_REQ) str_posted = "Non-posted"; @@ -756,7 +764,7 @@ static int advk_pcie_check_pio_status(st dev_dbg(dev, "%s PIO Response Status: %s, %#x @ %#x\n", str_posted, strcomp_status, reg, advk_readl(pcie, PIO_ADDR_LS));
- return -EFAULT; + return ret; }
static int advk_pcie_wait_pio(struct advk_pcie *pcie) @@ -764,13 +772,13 @@ static int advk_pcie_wait_pio(struct adv struct device *dev = &pcie->pdev->dev; int i;
- for (i = 0; i < PIO_RETRY_CNT; i++) { + for (i = 1; i <= PIO_RETRY_CNT; i++) { u32 start, isr;
start = advk_readl(pcie, PIO_START); isr = advk_readl(pcie, PIO_ISR); if (!start && isr) - return 0; + return i; udelay(PIO_RETRY_DELAY); }
@@ -974,6 +982,7 @@ static int advk_pcie_rd_conf(struct pci_ int where, int size, u32 *val) { struct advk_pcie *pcie = bus->sysdata; + int retry_count; bool allow_crs; u32 reg; int ret; @@ -1016,16 +1025,22 @@ static int advk_pcie_rd_conf(struct pci_ /* Program the data strobe */ advk_writel(pcie, 0xf, PIO_WR_DATA_STRB);
- /* Clear PIO DONE ISR and start the transfer */ - advk_writel(pcie, 1, PIO_ISR); - advk_writel(pcie, 1, PIO_START); - - ret = advk_pcie_wait_pio(pcie); - if (ret < 0) - goto try_crs; + retry_count = 0; + do { + /* Clear PIO DONE ISR and start the transfer */ + advk_writel(pcie, 1, PIO_ISR); + advk_writel(pcie, 1, PIO_START); + + ret = advk_pcie_wait_pio(pcie); + if (ret < 0) + goto try_crs; + + retry_count += ret; + + /* Check PIO status and get the read result */ + ret = advk_pcie_check_pio_status(pcie, allow_crs, val); + } while (ret == -EAGAIN && retry_count < PIO_RETRY_CNT);
- /* Check PIO status and get the read result */ - ret = advk_pcie_check_pio_status(pcie, allow_crs, val); if (ret < 0) goto fail;
@@ -1057,6 +1072,7 @@ static int advk_pcie_wr_conf(struct pci_ struct advk_pcie *pcie = bus->sysdata; u32 reg; u32 data_strobe = 0x0; + int retry_count; int offset; int ret;
@@ -1098,19 +1114,22 @@ static int advk_pcie_wr_conf(struct pci_ /* Program the data strobe */ advk_writel(pcie, data_strobe, PIO_WR_DATA_STRB);
- /* Clear PIO DONE ISR and start the transfer */ - advk_writel(pcie, 1, PIO_ISR); - advk_writel(pcie, 1, PIO_START); + retry_count = 0; + do { + /* Clear PIO DONE ISR and start the transfer */ + advk_writel(pcie, 1, PIO_ISR); + advk_writel(pcie, 1, PIO_START); + + ret = advk_pcie_wait_pio(pcie); + if (ret < 0) + return PCIBIOS_SET_FAILED;
- ret = advk_pcie_wait_pio(pcie); - if (ret < 0) - return PCIBIOS_SET_FAILED; + retry_count += ret;
- ret = advk_pcie_check_pio_status(pcie, false, NULL); - if (ret < 0) - return PCIBIOS_SET_FAILED; + ret = advk_pcie_check_pio_status(pcie, false, NULL); + } while (ret == -EAGAIN && retry_count < PIO_RETRY_CNT);
- return PCIBIOS_SUCCESSFUL; + return ret < 0 ? PCIBIOS_SET_FAILED : PCIBIOS_SUCCESSFUL; }
static struct pci_ops advk_pcie_ops = {
From: Pali Rohár pali@kernel.org
commit 454c53271fc11f3aa5e44e41fd99ca181bd32c62 upstream.
PCIe config space can be initialized also before pci_bridge_emul_init() call, so move rootcap initialization after PCI config space initialization.
This simplifies the function a little since it removes one if (ret < 0) check.
Link: https://lore.kernel.org/r/20211005180952.6812-11-kabel@kernel.org Signed-off-by: Pali Rohár pali@kernel.org Signed-off-by: Marek Behún kabel@kernel.org Signed-off-by: Lorenzo Pieralisi lorenzo.pieralisi@arm.com Reviewed-by: Marek Behún kabel@kernel.org Signed-off-by: Marek Behún kabel@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/pci/controller/pci-aardvark.c | 14 ++++---------- 1 file changed, 4 insertions(+), 10 deletions(-)
--- a/drivers/pci/controller/pci-aardvark.c +++ b/drivers/pci/controller/pci-aardvark.c @@ -898,7 +898,6 @@ static struct pci_bridge_emul_ops advk_p static int advk_sw_pci_bridge_init(struct advk_pcie *pcie) { struct pci_bridge_emul *bridge = &pcie->bridge; - int ret;
bridge->conf.vendor = cpu_to_le16(advk_readl(pcie, PCIE_CORE_DEV_ID_REG) & 0xffff); @@ -918,19 +917,14 @@ static int advk_sw_pci_bridge_init(struc /* Support interrupt A for MSI feature */ bridge->conf.intpin = PCIE_CORE_INT_A_ASSERT_ENABLE;
+ /* Indicates supports for Completion Retry Status */ + bridge->pcie_conf.rootcap = cpu_to_le16(PCI_EXP_RTCAP_CRSVIS); + bridge->has_pcie = true; bridge->data = pcie; bridge->ops = &advk_pci_bridge_emul_ops;
- /* PCIe config space can be initialized after pci_bridge_emul_init() */ - ret = pci_bridge_emul_init(bridge, 0); - if (ret < 0) - return ret; - - /* Indicates supports for Completion Retry Status */ - bridge->pcie_conf.rootcap = cpu_to_le16(PCI_EXP_RTCAP_CRSVIS); - - return 0; + return pci_bridge_emul_init(bridge, 0); }
static bool advk_pcie_valid_device(struct advk_pcie *pcie, struct pci_bus *bus,
From: Pali Rohár pali@kernel.org
commit f76b36d40beee0a13aa8f6aa011df0d7cbbb8a7f upstream.
Fix multiple link training issues in aardvark driver. The main reason of these issues was misunderstanding of what certain registers do, since their names and comments were misleading: before commit 96be36dbffac ("PCI: aardvark: Replace custom macros by standard linux/pci_regs.h macros"), the pci-aardvark.c driver used custom macros for accessing standard PCIe Root Bridge registers, and misleading comments did not help to understand what the code was really doing.
After doing more tests and experiments I've come to the conclusion that the SPEED_GEN register in aardvark sets the PCIe revision / generation compliance and forces maximal link speed. Both GEN3 and GEN2 values set the read-only PCI_EXP_FLAGS_VERS bits (PCIe capabilities version of Root Bridge) to value 2, while GEN1 value sets PCI_EXP_FLAGS_VERS to 1, which matches with PCI Express specifications revisions 3, 2 and 1 respectively. Changing SPEED_GEN also sets the read-only bits PCI_EXP_LNKCAP_SLS and PCI_EXP_LNKCAP2_SLS to corresponding speed.
(Note that PCI Express rev 1 specification does not define PCI_EXP_LNKCAP2 and PCI_EXP_LNKCTL2 registers and when SPEED_GEN is set to GEN1 (which also sets PCI_EXP_FLAGS_VERS set to 1), lspci cannot access PCI_EXP_LNKCAP2 and PCI_EXP_LNKCTL2 registers.)
Changing PCIe link speed can be done via PCI_EXP_LNKCTL2_TLS bits of PCI_EXP_LNKCTL2 register. Armada 3700 Functional Specifications says that the default value of PCI_EXP_LNKCTL2_TLS is based on SPEED_GEN value, but tests showed that the default value is always 8.0 GT/s, independently of speed set by SPEED_GEN. So after setting SPEED_GEN, we must also set value in PCI_EXP_LNKCTL2 register via PCI_EXP_LNKCTL2_TLS bits.
Triggering PCI_EXP_LNKCTL_RL bit immediately after setting LINK_TRAINING_EN bit actually doesn't do anything. Tests have shown that a delay is needed after enabling LINK_TRAINING_EN bit. As triggering PCI_EXP_LNKCTL_RL currently does nothing, remove it.
Commit 43fc679ced18 ("PCI: aardvark: Improve link training") introduced code which sets SPEED_GEN register based on negotiated link speed from PCI_EXP_LNKSTA_CLS bits of PCI_EXP_LNKSTA register. This code was added to fix detection of Compex WLE900VX (Atheros QCA9880) WiFi GEN1 PCIe cards, as otherwise these cards were "invisible" on PCIe bus (probably because they crashed). But apparently more people reported the same issues with these cards also with other PCIe controllers [1] and I was able to reproduce this issue also with other "noname" WiFi cards based on Atheros QCA9890 chip (with the same PCI vendor/device ids as Atheros QCA9880). So this is not an issue in aardvark but rather an issue in Atheros QCA98xx chips. Also, this issue only exists if the kernel is compiled with PCIe ASPM support, and a generic workaround for this is to change PCIe Bridge to 2.5 GT/s link speed via PCI_EXP_LNKCTL2_TLS_2_5GT bits in PCI_EXP_LNKCTL2 register [2], before triggering PCI_EXP_LNKCTL_RL bit. This workaround also works when SPEED_GEN is set to value GEN2 (5 GT/s). So remove this hack completely in the aardvark driver and always set SPEED_GEN to value from 'max-link-speed' DT property. Fix for Atheros QCA98xx chips is handled separately by patch [2].
These two things (code for triggering PCI_EXP_LNKCTL_RL bit and changing SPEED_GEN value) also explain why commit 6964494582f5 ("PCI: aardvark: Train link immediately after enabling training") somehow fixed detection of those problematic Compex cards with Atheros chips: if triggering link retraining (via PCI_EXP_LNKCTL_RL bit) was done immediately after enabling link training (via LINK_TRAINING_EN), it did nothing. If there was a specific delay, aardvark HW already initialized PCIe link and therefore triggering link retraining caused the above issue. Compex cards triggered link down event and disappeared from the PCIe bus.
Commit f4c7d053d7f7 ("PCI: aardvark: Wait for endpoint to be ready before training link") added 100ms sleep before calling 'Start link training' command and explained that it is a requirement of PCI Express specification. But the code after this 100ms sleep was not doing 'Start link training', rather it triggered PCI_EXP_LNKCTL_RL bit via PCIe Root Bridge to put link into Recovery state.
The required delay after fundamental reset is already done in function advk_pcie_wait_for_link() which also checks whether PCIe link is up. So after removing the code which triggers PCI_EXP_LNKCTL_RL bit on PCIe Root Bridge, there is no need to wait 100ms again. Remove the extra msleep() call and update comment about the delay required by the PCI Express specification.
According to Marvell Armada 3700 Functional Specifications, Link training should be enabled via aardvark register LINK_TRAINING_EN after selecting PCIe generation and x1 lane. There is no need to disable it prior resetting card via PERST# signal. This disabling code was introduced in commit 5169a9851daa ("PCI: aardvark: Issue PERST via GPIO") as a workaround for some Atheros cards. It turns out that this also is Atheros specific issue and affects any PCIe controller, not only aardvark. Moreover this Atheros issue was triggered by juggling with PCI_EXP_LNKCTL_RL, LINK_TRAINING_EN and SPEED_GEN bits interleaved with sleeps. Now, after removing triggering PCI_EXP_LNKCTL_RL, there is no need to explicitly disable LINK_TRAINING_EN bit. So remove this code too. The problematic Compex cards described in previous git commits are correctly detected in advk_pcie_train_link() function even after applying all these changes.
Note that with this patch, and also prior this patch, some NVMe disks which support PCIe GEN3 with 8 GT/s speed are negotiated only at the lowest link speed 2.5 GT/s, independently of SPEED_GEN value. After manually triggering PCI_EXP_LNKCTL_RL bit (e.g. from userspace via setpci), these NVMe disks change link speed to 5 GT/s when SPEED_GEN was configured to GEN2. This issue first needs to be properly investigated. I will send a fix in the future.
On the other hand, some other GEN2 PCIe cards with 5 GT/s speed are autonomously by HW autonegotiated at full 5 GT/s speed without need of any software interaction.
Armada 3700 Functional Specifications describes the following steps for link training: set SPEED_GEN to GEN2, enable LINK_TRAINING_EN, poll until link training is complete, trigger PCI_EXP_LNKCTL_RL, poll until signal rate is 5 GT/s, poll until link training is complete, enable ASPM L0s.
The requirement for triggering PCI_EXP_LNKCTL_RL can be explained by the need to achieve 5 GT/s speed (as changing link speed is done by throw to recovery state entered by PCI_EXP_LNKCTL_RL) or maybe as a part of enabling ASPM L0s (but in this case ASPM L0s should have been enabled prior PCI_EXP_LNKCTL_RL).
It is unknown why the original pci-aardvark.c driver was triggering PCI_EXP_LNKCTL_RL bit before waiting for the link to be up. This does not align with neither PCIe base specifications nor with Armada 3700 Functional Specification. (Note that in older versions of aardvark, this bit was called incorrectly PCIE_CORE_LINK_TRAINING, so this may be the reason.)
It is also unknown why Armada 3700 Functional Specification says that it is needed to trigger PCI_EXP_LNKCTL_RL for GEN2 mode, as according to PCIe base specification 5 GT/s speed negotiation is supposed to be entirely autonomous, even if initial speed is 2.5 GT/s.
[1] - https://lore.kernel.org/linux-pci/87h7l8axqp.fsf@toke.dk/ [2] - https://lore.kernel.org/linux-pci/20210326124326.21163-1-pali@kernel.org/
Link: https://lore.kernel.org/r/20211005180952.6812-12-kabel@kernel.org Signed-off-by: Pali Rohár pali@kernel.org Signed-off-by: Marek Behún kabel@kernel.org Signed-off-by: Lorenzo Pieralisi lorenzo.pieralisi@arm.com Reviewed-by: Marek Behún kabel@kernel.org Signed-off-by: Marek Behún kabel@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/pci/controller/pci-aardvark.c | 119 ++++++++++------------------------ 1 file changed, 35 insertions(+), 84 deletions(-)
--- a/drivers/pci/controller/pci-aardvark.c +++ b/drivers/pci/controller/pci-aardvark.c @@ -303,11 +303,6 @@ static inline u32 advk_readl(struct advk return readl(pcie->base + reg); }
-static inline u16 advk_read16(struct advk_pcie *pcie, u64 reg) -{ - return advk_readl(pcie, (reg & ~0x3)) >> ((reg & 0x3) * 8); -} - static u8 advk_pcie_ltssm_state(struct advk_pcie *pcie) { u32 val; @@ -381,23 +376,9 @@ static void advk_pcie_wait_for_retrain(s
static void advk_pcie_issue_perst(struct advk_pcie *pcie) { - u32 reg; - if (!pcie->reset_gpio) return;
- /* - * As required by PCI Express spec (PCI Express Base Specification, REV. - * 4.0 PCI Express, February 19 2014, 6.6.1 Conventional Reset) a delay - * for at least 100ms after de-asserting PERST# signal is needed before - * link training is enabled. So ensure that link training is disabled - * prior de-asserting PERST# signal to fulfill that PCI Express spec - * requirement. - */ - reg = advk_readl(pcie, PCIE_CORE_CTRL0_REG); - reg &= ~LINK_TRAINING_EN; - advk_writel(pcie, reg, PCIE_CORE_CTRL0_REG); - /* 10ms delay is needed for some cards */ dev_info(&pcie->pdev->dev, "issuing PERST via reset GPIO for 10ms\n"); gpiod_set_value_cansleep(pcie->reset_gpio, 1); @@ -405,54 +386,47 @@ static void advk_pcie_issue_perst(struct gpiod_set_value_cansleep(pcie->reset_gpio, 0); }
-static int advk_pcie_train_at_gen(struct advk_pcie *pcie, int gen) +static void advk_pcie_train_link(struct advk_pcie *pcie) { - int ret, neg_gen; + struct device *dev = &pcie->pdev->dev; u32 reg; + int ret;
- /* Setup link speed */ + /* + * Setup PCIe rev / gen compliance based on device tree property + * 'max-link-speed' which also forces maximal link speed. + */ reg = advk_readl(pcie, PCIE_CORE_CTRL0_REG); reg &= ~PCIE_GEN_SEL_MSK; - if (gen == 3) + if (pcie->link_gen == 3) reg |= SPEED_GEN_3; - else if (gen == 2) + else if (pcie->link_gen == 2) reg |= SPEED_GEN_2; else reg |= SPEED_GEN_1; advk_writel(pcie, reg, PCIE_CORE_CTRL0_REG);
/* - * Enable link training. This is not needed in every call to this - * function, just once suffices, but it does not break anything either. - */ + * Set maximal link speed value also into PCIe Link Control 2 register. + * Armada 3700 Functional Specification says that default value is based + * on SPEED_GEN but tests showed that default value is always 8.0 GT/s. + */ + reg = advk_readl(pcie, PCIE_CORE_PCIEXP_CAP + PCI_EXP_LNKCTL2); + reg &= ~PCI_EXP_LNKCTL2_TLS; + if (pcie->link_gen == 3) + reg |= PCI_EXP_LNKCTL2_TLS_8_0GT; + else if (pcie->link_gen == 2) + reg |= PCI_EXP_LNKCTL2_TLS_5_0GT; + else + reg |= PCI_EXP_LNKCTL2_TLS_2_5GT; + advk_writel(pcie, reg, PCIE_CORE_PCIEXP_CAP + PCI_EXP_LNKCTL2); + + /* Enable link training after selecting PCIe generation */ reg = advk_readl(pcie, PCIE_CORE_CTRL0_REG); reg |= LINK_TRAINING_EN; advk_writel(pcie, reg, PCIE_CORE_CTRL0_REG);
/* - * Start link training immediately after enabling it. - * This solves problems for some buggy cards. - */ - reg = advk_readl(pcie, PCIE_CORE_PCIEXP_CAP + PCI_EXP_LNKCTL); - reg |= PCI_EXP_LNKCTL_RL; - advk_writel(pcie, reg, PCIE_CORE_PCIEXP_CAP + PCI_EXP_LNKCTL); - - ret = advk_pcie_wait_for_link(pcie); - if (ret) - return ret; - - reg = advk_read16(pcie, PCIE_CORE_PCIEXP_CAP + PCI_EXP_LNKSTA); - neg_gen = reg & PCI_EXP_LNKSTA_CLS; - - return neg_gen; -} - -static void advk_pcie_train_link(struct advk_pcie *pcie) -{ - struct device *dev = &pcie->pdev->dev; - int neg_gen = -1, gen; - - /* * Reset PCIe card via PERST# signal. Some cards are not detected * during link training when they are in some non-initial state. */ @@ -462,41 +436,18 @@ static void advk_pcie_train_link(struct * PERST# signal could have been asserted by pinctrl subsystem before * probe() callback has been called or issued explicitly by reset gpio * function advk_pcie_issue_perst(), making the endpoint going into - * fundamental reset. As required by PCI Express spec a delay for at - * least 100ms after such a reset before link training is needed. - */ - msleep(PCI_PM_D3COLD_WAIT); - - /* - * Try link training at link gen specified by device tree property - * 'max-link-speed'. If this fails, iteratively train at lower gen. - */ - for (gen = pcie->link_gen; gen > 0; --gen) { - neg_gen = advk_pcie_train_at_gen(pcie, gen); - if (neg_gen > 0) - break; - } - - if (neg_gen < 0) - goto err; - - /* - * After successful training if negotiated gen is lower than requested, - * train again on negotiated gen. This solves some stability issues for - * some buggy gen1 cards. + * fundamental reset. As required by PCI Express spec (PCI Express + * Base Specification, REV. 4.0 PCI Express, February 19 2014, 6.6.1 + * Conventional Reset) a delay for at least 100ms after such a reset + * before sending a Configuration Request to the device is needed. + * So wait until PCIe link is up. Function advk_pcie_wait_for_link() + * waits for link at least 900ms. */ - if (neg_gen < gen) { - gen = neg_gen; - neg_gen = advk_pcie_train_at_gen(pcie, gen); - } - - if (neg_gen == gen) { - dev_info(dev, "link up at gen %i\n", gen); - return; - } - -err: - dev_err(dev, "link never came up\n"); + ret = advk_pcie_wait_for_link(pcie); + if (ret < 0) + dev_err(dev, "link never came up\n"); + else + dev_info(dev, "link up\n"); }
/*
From: Pali Rohár pali@kernel.org
commit 771153fc884f566a89af2d30033b7f3bc6e24e84 upstream.
From very vague, ambiguous and incomplete information from Marvell we
deduced that the 32-bit Aardvark register at address 0x4 (PCIE_CORE_CMD_STATUS_REG), which is not documented for Root Complex mode in the Functional Specification (only for Endpoint mode), controls two 16-bit PCIe registers: Command Register and Status Registers of PCIe Root Port.
This means that bit 2 controls bus mastering and forwarding of memory and I/O requests in the upstream direction. According to PCI specifications bits [0:2] of Command Register, this should be by default disabled on reset. So explicitly disable these bits at early setup of the Aardvark driver.
Remove code which unconditionally enables all 3 bits and let kernel code (via pci_set_master() function) to handle bus mastering of Root PCIe Bridge via emulated PCI_COMMAND on emulated bridge.
Link: https://lore.kernel.org/r/20211028185659.20329-5-kabel@kernel.org Fixes: 8a3ebd8de328 ("PCI: aardvark: Implement emulated root PCI bridge config space") Signed-off-by: Pali Rohár pali@kernel.org Signed-off-by: Marek Behún kabel@kernel.org Signed-off-by: Lorenzo Pieralisi lorenzo.pieralisi@arm.com Cc: stable@vger.kernel.org # b2a56469d550 ("PCI: aardvark: Add FIXME comment for PCIE_CORE_CMD_STATUS_REG access") Signed-off-by: Marek Behún kabel@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/pci/controller/pci-aardvark.c | 47 +++++++++++++++++++++++++++------- 1 file changed, 38 insertions(+), 9 deletions(-)
--- a/drivers/pci/controller/pci-aardvark.c +++ b/drivers/pci/controller/pci-aardvark.c @@ -27,9 +27,6 @@ /* PCIe core registers */ #define PCIE_CORE_DEV_ID_REG 0x0 #define PCIE_CORE_CMD_STATUS_REG 0x4 -#define PCIE_CORE_CMD_IO_ACCESS_EN BIT(0) -#define PCIE_CORE_CMD_MEM_ACCESS_EN BIT(1) -#define PCIE_CORE_CMD_MEM_IO_REQ_EN BIT(2) #define PCIE_CORE_DEV_REV_REG 0x8 #define PCIE_CORE_PCIEXP_CAP 0xc0 #define PCIE_CORE_ERR_CAPCTL_REG 0x118 @@ -505,6 +502,11 @@ static void advk_pcie_setup_hw(struct ad reg = (PCI_VENDOR_ID_MARVELL << 16) | PCI_VENDOR_ID_MARVELL; advk_writel(pcie, reg, VENDOR_ID_REG);
+ /* Disable Root Bridge I/O space, memory space and bus mastering */ + reg = advk_readl(pcie, PCIE_CORE_CMD_STATUS_REG); + reg &= ~(PCI_COMMAND_IO | PCI_COMMAND_MEMORY | PCI_COMMAND_MASTER); + advk_writel(pcie, reg, PCIE_CORE_CMD_STATUS_REG); + /* Set Advanced Error Capabilities and Control PF0 register */ reg = PCIE_CORE_ERR_CAPCTL_ECRC_CHK_TX | PCIE_CORE_ERR_CAPCTL_ECRC_CHK_TX_EN | @@ -603,12 +605,6 @@ static void advk_pcie_setup_hw(struct ad advk_pcie_disable_ob_win(pcie, i);
advk_pcie_train_link(pcie); - - reg = advk_readl(pcie, PCIE_CORE_CMD_STATUS_REG); - reg |= PCIE_CORE_CMD_MEM_ACCESS_EN | - PCIE_CORE_CMD_IO_ACCESS_EN | - PCIE_CORE_CMD_MEM_IO_REQ_EN; - advk_writel(pcie, reg, PCIE_CORE_CMD_STATUS_REG); }
static int advk_pcie_check_pio_status(struct advk_pcie *pcie, bool allow_crs, u32 *val) @@ -737,6 +733,37 @@ static int advk_pcie_wait_pio(struct adv return -ETIMEDOUT; }
+static pci_bridge_emul_read_status_t +advk_pci_bridge_emul_base_conf_read(struct pci_bridge_emul *bridge, + int reg, u32 *value) +{ + struct advk_pcie *pcie = bridge->data; + + switch (reg) { + case PCI_COMMAND: + *value = advk_readl(pcie, PCIE_CORE_CMD_STATUS_REG); + return PCI_BRIDGE_EMUL_HANDLED; + + default: + return PCI_BRIDGE_EMUL_NOT_HANDLED; + } +} + +static void +advk_pci_bridge_emul_base_conf_write(struct pci_bridge_emul *bridge, + int reg, u32 old, u32 new, u32 mask) +{ + struct advk_pcie *pcie = bridge->data; + + switch (reg) { + case PCI_COMMAND: + advk_writel(pcie, new, PCIE_CORE_CMD_STATUS_REG); + break; + + default: + break; + } +}
static pci_bridge_emul_read_status_t advk_pci_bridge_emul_pcie_conf_read(struct pci_bridge_emul *bridge, @@ -838,6 +865,8 @@ advk_pci_bridge_emul_pcie_conf_write(str }
static struct pci_bridge_emul_ops advk_pci_bridge_emul_ops = { + .read_base = advk_pci_bridge_emul_base_conf_read, + .write_base = advk_pci_bridge_emul_base_conf_write, .read_pcie = advk_pci_bridge_emul_pcie_conf_read, .write_pcie = advk_pci_bridge_emul_pcie_conf_write, };
From: Pali Rohár pali@kernel.org
commit 84e1b4045dc887b78bdc87d92927093dc3a465aa upstream.
Aardvark controller has something like config space of a Root Port available at offset 0x0 of internal registers - these registers are used for implementation of the emulated bridge.
The default value of Class Code of this bridge corresponds to a RAID Mass storage controller, though. (This is probably intended for when the controller is used as Endpoint.)
Change the Class Code to correspond to a PCI Bridge.
Add comment explaining this change.
Link: https://lore.kernel.org/r/20211028185659.20329-6-kabel@kernel.org Fixes: 8a3ebd8de328 ("PCI: aardvark: Implement emulated root PCI bridge config space") Signed-off-by: Pali Rohár pali@kernel.org Signed-off-by: Marek Behún kabel@kernel.org Signed-off-by: Lorenzo Pieralisi lorenzo.pieralisi@arm.com Cc: stable@vger.kernel.org Signed-off-by: Marek Behún kabel@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/pci/controller/pci-aardvark.c | 20 ++++++++++++++++++++ 1 file changed, 20 insertions(+)
--- a/drivers/pci/controller/pci-aardvark.c +++ b/drivers/pci/controller/pci-aardvark.c @@ -502,6 +502,26 @@ static void advk_pcie_setup_hw(struct ad reg = (PCI_VENDOR_ID_MARVELL << 16) | PCI_VENDOR_ID_MARVELL; advk_writel(pcie, reg, VENDOR_ID_REG);
+ /* + * Change Class Code of PCI Bridge device to PCI Bridge (0x600400), + * because the default value is Mass storage controller (0x010400). + * + * Note that this Aardvark PCI Bridge does not have compliant Type 1 + * Configuration Space and it even cannot be accessed via Aardvark's + * PCI config space access method. Something like config space is + * available in internal Aardvark registers starting at offset 0x0 + * and is reported as Type 0. In range 0x10 - 0x34 it has totally + * different registers. + * + * Therefore driver uses emulation of PCI Bridge which emulates + * access to configuration space via internal Aardvark registers or + * emulated configuration buffer. + */ + reg = advk_readl(pcie, PCIE_CORE_DEV_REV_REG); + reg &= ~0xffffff00; + reg |= (PCI_CLASS_BRIDGE_PCI << 8) << 8; + advk_writel(pcie, reg, PCIE_CORE_DEV_REV_REG); + /* Disable Root Bridge I/O space, memory space and bus mastering */ reg = advk_readl(pcie, PCIE_CORE_CMD_STATUS_REG); reg &= ~(PCI_COMMAND_IO | PCI_COMMAND_MEMORY | PCI_COMMAND_MASTER);
From: Pali Rohár pali@kernel.org
commit bc4fac42e5f8460af09c0a7f2f1915be09e20c71 upstream.
Aardvark supports PCIe Hot Reset via PCIE_CORE_CTRL1_REG.
Use it for implementing PCI_BRIDGE_CTL_BUS_RESET bit of PCI_BRIDGE_CONTROL register on emulated bridge.
With this, the function pci_reset_secondary_bus() starts working and can reset connected PCIe card. Custom userspace script [1] which uses setpci can trigger PCIe Hot Reset and reset the card manually.
[1] https://alexforencich.com/wiki/en/pcie/hot-reset-linux
Link: https://lore.kernel.org/r/20211028185659.20329-7-kabel@kernel.org Fixes: 8a3ebd8de328 ("PCI: aardvark: Implement emulated root PCI bridge config space") Signed-off-by: Pali Rohár pali@kernel.org Signed-off-by: Marek Behún kabel@kernel.org Signed-off-by: Lorenzo Pieralisi lorenzo.pieralisi@arm.com Cc: stable@vger.kernel.org Signed-off-by: Marek Behún kabel@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/pci/controller/pci-aardvark.c | 27 +++++++++++++++++++++++++++ 1 file changed, 27 insertions(+)
--- a/drivers/pci/controller/pci-aardvark.c +++ b/drivers/pci/controller/pci-aardvark.c @@ -764,6 +764,22 @@ advk_pci_bridge_emul_base_conf_read(stru *value = advk_readl(pcie, PCIE_CORE_CMD_STATUS_REG); return PCI_BRIDGE_EMUL_HANDLED;
+ case PCI_INTERRUPT_LINE: { + /* + * From the whole 32bit register we support reading from HW only + * one bit: PCI_BRIDGE_CTL_BUS_RESET. + * Other bits are retrieved only from emulated config buffer. + */ + __le32 *cfgspace = (__le32 *)&bridge->conf; + u32 val = le32_to_cpu(cfgspace[PCI_INTERRUPT_LINE / 4]); + if (advk_readl(pcie, PCIE_CORE_CTRL1_REG) & HOT_RESET_GEN) + val |= PCI_BRIDGE_CTL_BUS_RESET << 16; + else + val &= ~(PCI_BRIDGE_CTL_BUS_RESET << 16); + *value = val; + return PCI_BRIDGE_EMUL_HANDLED; + } + default: return PCI_BRIDGE_EMUL_NOT_HANDLED; } @@ -780,6 +796,17 @@ advk_pci_bridge_emul_base_conf_write(str advk_writel(pcie, new, PCIE_CORE_CMD_STATUS_REG); break;
+ case PCI_INTERRUPT_LINE: + if (mask & (PCI_BRIDGE_CTL_BUS_RESET << 16)) { + u32 val = advk_readl(pcie, PCIE_CORE_CTRL1_REG); + if (new & (PCI_BRIDGE_CTL_BUS_RESET << 16)) + val |= HOT_RESET_GEN; + else + val &= ~HOT_RESET_GEN; + advk_writel(pcie, val, PCIE_CORE_CTRL1_REG); + } + break; + default: break; }
From: Marek Behún kabel@kernel.org
commit baf8d6899b1e8906dc076ef26cc633e96a8bb0c3 upstream.
The PWM pins on North Bridge on Armada 37xx can be configured into PWM or GPIO functions. When in PWM function, each pin can also be configured to drive low on 0 and tri-state on 1 (LED mode).
The current definitions handle this by declaring two pin groups for each pin: - group "pwmN" with functions "pwm" and "gpio" - group "ledN_od" ("od" for open drain) with functions "led" and "gpio"
This is semantically incorrect. The correct definition for each pin should be one group with three functions: "pwm", "led" and "gpio".
Change the "pwmN" groups to support "led" function.
Remove "ledN_od" groups. This cannot break backwards compatibility with older device trees: no device tree uses it since there is no PWM driver for this SOC yet. Also "ledN_od" groups are not even documented.
Fixes: b835d6953009 ("pinctrl: armada-37xx: swap polarity on LED group") Signed-off-by: Marek Behún kabel@kernel.org Acked-by: Rob Herring robh@kernel.org Link: https://lore.kernel.org/r/20210719112938.27594-1-kabel@kernel.org Signed-off-by: Linus Walleij linus.walleij@linaro.org Signed-off-by: Marek Behún kabel@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- Documentation/devicetree/bindings/pinctrl/marvell,armada-37xx-pinctrl.txt | 8 ++-- drivers/pinctrl/mvebu/pinctrl-armada-37xx.c | 17 ++++------ 2 files changed, 12 insertions(+), 13 deletions(-)
--- a/Documentation/devicetree/bindings/pinctrl/marvell,armada-37xx-pinctrl.txt +++ b/Documentation/devicetree/bindings/pinctrl/marvell,armada-37xx-pinctrl.txt @@ -43,19 +43,19 @@ group emmc_nb
group pwm0 - pin 11 (GPIO1-11) - - functions pwm, gpio + - functions pwm, led, gpio
group pwm1 - pin 12 - - functions pwm, gpio + - functions pwm, led, gpio
group pwm2 - pin 13 - - functions pwm, gpio + - functions pwm, led, gpio
group pwm3 - pin 14 - - functions pwm, gpio + - functions pwm, led, gpio
group pmic1 - pin 7 --- a/drivers/pinctrl/mvebu/pinctrl-armada-37xx.c +++ b/drivers/pinctrl/mvebu/pinctrl-armada-37xx.c @@ -166,10 +166,14 @@ static struct armada_37xx_pin_group arma PIN_GRP_GPIO("jtag", 20, 5, BIT(0), "jtag"), PIN_GRP_GPIO("sdio0", 8, 3, BIT(1), "sdio"), PIN_GRP_GPIO("emmc_nb", 27, 9, BIT(2), "emmc"), - PIN_GRP_GPIO("pwm0", 11, 1, BIT(3), "pwm"), - PIN_GRP_GPIO("pwm1", 12, 1, BIT(4), "pwm"), - PIN_GRP_GPIO("pwm2", 13, 1, BIT(5), "pwm"), - PIN_GRP_GPIO("pwm3", 14, 1, BIT(6), "pwm"), + PIN_GRP_GPIO_3("pwm0", 11, 1, BIT(3) | BIT(20), 0, BIT(20), BIT(3), + "pwm", "led"), + PIN_GRP_GPIO_3("pwm1", 12, 1, BIT(4) | BIT(21), 0, BIT(21), BIT(4), + "pwm", "led"), + PIN_GRP_GPIO_3("pwm2", 13, 1, BIT(5) | BIT(22), 0, BIT(22), BIT(5), + "pwm", "led"), + PIN_GRP_GPIO_3("pwm3", 14, 1, BIT(6) | BIT(23), 0, BIT(23), BIT(6), + "pwm", "led"), PIN_GRP_GPIO("pmic1", 7, 1, BIT(7), "pmic"), PIN_GRP_GPIO("pmic0", 6, 1, BIT(8), "pmic"), PIN_GRP_GPIO("i2c2", 2, 2, BIT(9), "i2c"), @@ -183,11 +187,6 @@ static struct armada_37xx_pin_group arma PIN_GRP_EXTRA("uart2", 9, 2, BIT(1) | BIT(13) | BIT(14) | BIT(19), BIT(1) | BIT(13) | BIT(14), BIT(1) | BIT(19), 18, 2, "gpio", "uart"), - PIN_GRP_GPIO_2("led0_od", 11, 1, BIT(20), BIT(20), 0, "led"), - PIN_GRP_GPIO_2("led1_od", 12, 1, BIT(21), BIT(21), 0, "led"), - PIN_GRP_GPIO_2("led2_od", 13, 1, BIT(22), BIT(22), 0, "led"), - PIN_GRP_GPIO_2("led3_od", 14, 1, BIT(23), BIT(23), 0, "led"), - };
static struct armada_37xx_pin_group armada_37xx_sb_groups[] = {
From: Marek Behún kabel@kernel.org
commit 715878016984b2617f6c1f177c50039e12e7bd5b upstream.
We found out that we are unable to control the PERST# signal via the default pin dedicated to be PERST# pin (GPIO2[3] pin) on A3700 SOC when this pin is in EP_PCIE1_Resetn mode. There is a register in the PCIe register space called PERSTN_GPIO_EN (D0088004[3]), but changing the value of this register does not change the pin output when measuring with voltmeter.
We do not know if this is a bug in the SOC, or if it works only when PCIe controller is in a certain state.
Commit f4c7d053d7f7 ("PCI: aardvark: Wait for endpoint to be ready before training link") says that when this pin changes pinctrl mode from EP_PCIE1_Resetn to GPIO, the PERST# signal is asserted for a brief moment.
So currently the situation is that on A3700 boards the PERST# signal is asserted in U-Boot (because the code in U-Boot issues reset via this pin via GPIO mode), and then in Linux by the obscure and undocumented mechanism described by the above mentioned commit.
We want to issue PERST# signal in a known way, therefore this patch changes the pcie_reset_pin function from "pcie" to "gpio" and adds the reset-gpios property to the PCIe node in device tree files of EspressoBin and Armada 3720 Dev Board (Turris Mox device tree already has this property and uDPU does not have a PCIe port).
Signed-off-by: Marek Behún marek.behun@nic.cz Cc: Remi Pommarel repk@triplefau.lt Tested-by: Tomasz Maciej Nowak tmn505@gmail.com Acked-by: Thomas Petazzoni thomas.petazzoni@bootlin.com Signed-off-by: Gregory CLEMENT gregory.clement@bootlin.com Signed-off-by: Marek Behún kabel@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/arm64/boot/dts/marvell/armada-3720-db.dts | 3 +++ arch/arm64/boot/dts/marvell/armada-3720-espressobin.dts | 1 + arch/arm64/boot/dts/marvell/armada-3720-turris-mox.dts | 4 ---- arch/arm64/boot/dts/marvell/armada-37xx.dtsi | 2 +- 4 files changed, 5 insertions(+), 5 deletions(-)
--- a/arch/arm64/boot/dts/marvell/armada-3720-db.dts +++ b/arch/arm64/boot/dts/marvell/armada-3720-db.dts @@ -128,6 +128,9 @@
/* CON15(V2.0)/CON17(V1.4) : PCIe / CON15(V2.0)/CON12(V1.4) :mini-PCIe */ &pcie0 { + pinctrl-names = "default"; + pinctrl-0 = <&pcie_reset_pins &pcie_clkreq_pins>; + reset-gpios = <&gpiosb 3 GPIO_ACTIVE_LOW>; status = "okay"; };
--- a/arch/arm64/boot/dts/marvell/armada-3720-espressobin.dts +++ b/arch/arm64/boot/dts/marvell/armada-3720-espressobin.dts @@ -59,6 +59,7 @@ phys = <&comphy1 0>; pinctrl-names = "default"; pinctrl-0 = <&pcie_reset_pins &pcie_clkreq_pins>; + reset-gpios = <&gpiosb 3 GPIO_ACTIVE_LOW>; };
/* J6 */ --- a/arch/arm64/boot/dts/marvell/armada-3720-turris-mox.dts +++ b/arch/arm64/boot/dts/marvell/armada-3720-turris-mox.dts @@ -127,10 +127,6 @@ }; };
-&pcie_reset_pins { - function = "gpio"; -}; - &pcie0 { pinctrl-names = "default"; pinctrl-0 = <&pcie_reset_pins &pcie_clkreq_pins>; --- a/arch/arm64/boot/dts/marvell/armada-37xx.dtsi +++ b/arch/arm64/boot/dts/marvell/armada-37xx.dtsi @@ -318,7 +318,7 @@
pcie_reset_pins: pcie-reset-pins { groups = "pcie1"; - function = "pcie"; + function = "gpio"; };
pcie_clkreq_pins: pcie-clkreq-pins {
From: David Hildenbrand david@redhat.com
commit c1e63117711977cc4295b2ce73de29dd17066c82 upstream.
To clear a user buffer we cannot simply use memset, we have to use clear_user(). With a virtio-mem device that registers a vmcore_cb and has some logically unplugged memory inside an added Linux memory block, I can easily trigger a BUG by copying the vmcore via "cp":
systemd[1]: Starting Kdump Vmcore Save Service... kdump[420]: Kdump is using the default log level(3). kdump[453]: saving to /sysroot/var/crash/127.0.0.1-2021-11-11-14:59:22/ kdump[458]: saving vmcore-dmesg.txt to /sysroot/var/crash/127.0.0.1-2021-11-11-14:59:22/ kdump[465]: saving vmcore-dmesg.txt complete kdump[467]: saving vmcore BUG: unable to handle page fault for address: 00007f2374e01000 #PF: supervisor write access in kernel mode #PF: error_code(0x0003) - permissions violation PGD 7a523067 P4D 7a523067 PUD 7a528067 PMD 7a525067 PTE 800000007048f867 Oops: 0003 [#1] PREEMPT SMP NOPTI CPU: 0 PID: 468 Comm: cp Not tainted 5.15.0+ #6 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.14.0-27-g64f37cc530f1-prebuilt.qemu.org 04/01/2014 RIP: 0010:read_from_oldmem.part.0.cold+0x1d/0x86 Code: ff ff ff e8 05 ff fe ff e9 b9 e9 7f ff 48 89 de 48 c7 c7 38 3b 60 82 e8 f1 fe fe ff 83 fd 08 72 3c 49 8d 7d 08 4c 89 e9 89 e8 <49> c7 45 00 00 00 00 00 49 c7 44 05 f8 00 00 00 00 48 83 e7 f81 RSP: 0018:ffffc9000073be08 EFLAGS: 00010212 RAX: 0000000000001000 RBX: 00000000002fd000 RCX: 00007f2374e01000 RDX: 0000000000000001 RSI: 00000000ffffdfff RDI: 00007f2374e01008 RBP: 0000000000001000 R08: 0000000000000000 R09: ffffc9000073bc50 R10: ffffc9000073bc48 R11: ffffffff829461a8 R12: 000000000000f000 R13: 00007f2374e01000 R14: 0000000000000000 R15: ffff88807bd421e8 FS: 00007f2374e12140(0000) GS:ffff88807f000000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007f2374e01000 CR3: 000000007a4aa000 CR4: 0000000000350eb0 Call Trace: read_vmcore+0x236/0x2c0 proc_reg_read+0x55/0xa0 vfs_read+0x95/0x190 ksys_read+0x4f/0xc0 do_syscall_64+0x3b/0x90 entry_SYSCALL_64_after_hwframe+0x44/0xae
Some x86-64 CPUs have a CPU feature called "Supervisor Mode Access Prevention (SMAP)", which is used to detect wrong access from the kernel to user buffers like this: SMAP triggers a permissions violation on wrong access. In the x86-64 variant of clear_user(), SMAP is properly handled via clac()+stac().
To fix, properly use clear_user() when we're dealing with a user buffer.
Link: https://lkml.kernel.org/r/20211112092750.6921-1-david@redhat.com Fixes: 997c136f518c ("fs/proc/vmcore.c: add hook to read_from_oldmem() to check for non-ram pages") Signed-off-by: David Hildenbrand david@redhat.com Acked-by: Baoquan He bhe@redhat.com Cc: Dave Young dyoung@redhat.com Cc: Baoquan He bhe@redhat.com Cc: Vivek Goyal vgoyal@redhat.com Cc: Philipp Rudo prudo@redhat.com Cc: stable@vger.kernel.org Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: David Hildenbrand david@redhat.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/proc/vmcore.c | 16 ++++++++++------ 1 file changed, 10 insertions(+), 6 deletions(-)
--- a/fs/proc/vmcore.c +++ b/fs/proc/vmcore.c @@ -125,9 +125,13 @@ ssize_t read_from_oldmem(char *buf, size nr_bytes = count;
/* If pfn is not ram, return zeros for sparse dump files */ - if (pfn_is_ram(pfn) == 0) - memset(buf, 0, nr_bytes); - else { + if (pfn_is_ram(pfn) == 0) { + tmp = 0; + if (!userbuf) + memset(buf, 0, nr_bytes); + else if (clear_user(buf, nr_bytes)) + tmp = -EFAULT; + } else { if (encrypted) tmp = copy_oldmem_page_encrypted(pfn, buf, nr_bytes, @@ -136,10 +140,10 @@ ssize_t read_from_oldmem(char *buf, size else tmp = copy_oldmem_page(pfn, buf, nr_bytes, offset, userbuf); - - if (tmp < 0) - return tmp; } + if (tmp < 0) + return tmp; + *ppos += nr_bytes; count -= nr_bytes; buf += nr_bytes;
From: yangxingwu xingwu.yang@gmail.com
[ Upstream commit c95c07836fa4c1767ed11d8eca0769c652760e32 ]
We are changing expire_nodest_conn to work even for reused connections when conn_reuse_mode=0, just as what was done with commit dc7b3eb900aa ("ipvs: Fix reuse connection if real server is dead").
For controlled and persistent connections, the new connection will get the needed real server depending on the rules in ip_vs_check_template().
Fixes: d752c3645717 ("ipvs: allow rescheduling of new connections when port reuse is detected") Co-developed-by: Chuanqi Liu legend050709@qq.com Signed-off-by: Chuanqi Liu legend050709@qq.com Signed-off-by: yangxingwu xingwu.yang@gmail.com Acked-by: Simon Horman horms@verge.net.au Acked-by: Julian Anastasov ja@ssi.bg Signed-off-by: Pablo Neira Ayuso pablo@netfilter.org Signed-off-by: Sasha Levin sashal@kernel.org --- Documentation/networking/ipvs-sysctl.txt | 3 +-- net/netfilter/ipvs/ip_vs_core.c | 8 ++++---- 2 files changed, 5 insertions(+), 6 deletions(-)
diff --git a/Documentation/networking/ipvs-sysctl.txt b/Documentation/networking/ipvs-sysctl.txt index 056898685d408..fc531c29a2e83 100644 --- a/Documentation/networking/ipvs-sysctl.txt +++ b/Documentation/networking/ipvs-sysctl.txt @@ -30,8 +30,7 @@ conn_reuse_mode - INTEGER
0: disable any special handling on port reuse. The new connection will be delivered to the same real server that was - servicing the previous connection. This will effectively - disable expire_nodest_conn. + servicing the previous connection.
bit 1: enable rescheduling of new connections when it is safe. That is, whenever expire_nodest_conn and for TCP sockets, when diff --git a/net/netfilter/ipvs/ip_vs_core.c b/net/netfilter/ipvs/ip_vs_core.c index 89aa1fc334b19..ccd6af1440745 100644 --- a/net/netfilter/ipvs/ip_vs_core.c +++ b/net/netfilter/ipvs/ip_vs_core.c @@ -1982,7 +1982,6 @@ ip_vs_in(struct netns_ipvs *ipvs, unsigned int hooknum, struct sk_buff *skb, int struct ip_vs_proto_data *pd; struct ip_vs_conn *cp; int ret, pkts; - int conn_reuse_mode; struct sock *sk;
/* Already marked as IPVS request or reply? */ @@ -2059,15 +2058,16 @@ ip_vs_in(struct netns_ipvs *ipvs, unsigned int hooknum, struct sk_buff *skb, int cp = INDIRECT_CALL_1(pp->conn_in_get, ip_vs_conn_in_get_proto, ipvs, af, skb, &iph);
- conn_reuse_mode = sysctl_conn_reuse_mode(ipvs); - if (conn_reuse_mode && !iph.fragoffs && is_new_conn(skb, &iph) && cp) { + if (!iph.fragoffs && is_new_conn(skb, &iph) && cp) { + int conn_reuse_mode = sysctl_conn_reuse_mode(ipvs); bool old_ct = false, resched = false;
if (unlikely(sysctl_expire_nodest_conn(ipvs)) && cp->dest && unlikely(!atomic_read(&cp->dest->weight))) { resched = true; old_ct = ip_vs_conn_uses_old_conntrack(cp, skb); - } else if (is_new_conn_expected(cp, conn_reuse_mode)) { + } else if (conn_reuse_mode && + is_new_conn_expected(cp, conn_reuse_mode)) { old_ct = ip_vs_conn_uses_old_conntrack(cp, skb); if (!atomic_read(&cp->n_control)) { resched = true;
From: Florian Fainelli f.fainelli@gmail.com
[ Upstream commit 754c4050a00e802e122690112fc2c3a6abafa7e2 ]
The I2C interrupt controller line is off by 32 because the datasheet describes interrupt inputs into the GIC which are for Shared Peripheral Interrupts and are starting at offset 32. The ARM GIC binding expects the SPI interrupts to be numbered from 0 relative to the SPI base.
Fixes: bb097e3e0045 ("ARM: dts: BCM5301X: Add I2C support to the DT") Tested-by: Christian Lamparter chunkeey@gmail.com Signed-off-by: Florian Fainelli f.fainelli@gmail.com Signed-off-by: Sasha Levin sashal@kernel.org --- arch/arm/boot/dts/bcm5301x.dtsi | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/arm/boot/dts/bcm5301x.dtsi b/arch/arm/boot/dts/bcm5301x.dtsi index 9711170649b69..51f20ded92f04 100644 --- a/arch/arm/boot/dts/bcm5301x.dtsi +++ b/arch/arm/boot/dts/bcm5301x.dtsi @@ -387,7 +387,7 @@ usb3_dmp: syscon@18105000 { i2c0: i2c@18009000 { compatible = "brcm,iproc-i2c"; reg = <0x18009000 0x50>; - interrupts = <GIC_SPI 121 IRQ_TYPE_LEVEL_HIGH>; + interrupts = <GIC_SPI 89 IRQ_TYPE_LEVEL_HIGH>; #address-cells = <1>; #size-cells = <0>; clock-frequency = <100000>;
From: Florian Fainelli f.fainelli@gmail.com
[ Upstream commit 40f7342f0587639e5ad625adaa15efdd3cffb18f ]
The GPIO controller is also an interrupt controller provider and is currently missing the appropriate 'interrupt-controller' and '#interrupt-cells' properties to denote that.
Fixes: fb026d3de33b ("ARM: BCM5301X: Add Broadcom's bus-axi to the DTS file") Signed-off-by: Florian Fainelli f.fainelli@gmail.com Signed-off-by: Sasha Levin sashal@kernel.org --- arch/arm/boot/dts/bcm5301x.dtsi | 2 ++ 1 file changed, 2 insertions(+)
diff --git a/arch/arm/boot/dts/bcm5301x.dtsi b/arch/arm/boot/dts/bcm5301x.dtsi index 51f20ded92f04..05d67f9769118 100644 --- a/arch/arm/boot/dts/bcm5301x.dtsi +++ b/arch/arm/boot/dts/bcm5301x.dtsi @@ -242,6 +242,8 @@ chipcommon: chipcommon@0 {
gpio-controller; #gpio-cells = <2>; + interrupt-controller; + #interrupt-cells = <2>; };
pcie0: pcie@12000 {
From: Srinivas Kandagatla srinivas.kandagatla@linaro.org
[ Upstream commit 861afeac7990587588d057b2c0b3222331c3da29 ]
Stream IDs are reused across multiple BackEnd mixers, do not reset the stream mixers if they are not already set for that particular FrontEnd.
Ex: amixer cset iface=MIXER,name='SLIMBUS_0_RX Audio Mixer MultiMedia1' 1
would set the MultiMedia1 steam for SLIMBUS_0_RX, however doing below command will reset previously setup MultiMedia1 stream, because both of them are using MultiMedia1 PCM stream.
amixer cset iface=MIXER,name='SLIMBUS_2_RX Audio Mixer MultiMedia1' 0
reset the FrontEnd Mixers conditionally to fix this issue.
This is more noticeable in desktop setup, where in alsactl tries to restore the alsa state and overwriting the previous mixer settings.
Fixes: e3a33673e845 ("ASoC: qdsp6: q6routing: Add q6routing driver") Signed-off-by: Srinivas Kandagatla srinivas.kandagatla@linaro.org Link: https://lore.kernel.org/r/20211116114721.12517-3-srinivas.kandagatla@linaro.... Signed-off-by: Mark Brown broonie@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- sound/soc/qcom/qdsp6/q6routing.c | 6 +++++- 1 file changed, 5 insertions(+), 1 deletion(-)
diff --git a/sound/soc/qcom/qdsp6/q6routing.c b/sound/soc/qcom/qdsp6/q6routing.c index 745cc9dd14f38..16f26dd2d59ed 100644 --- a/sound/soc/qcom/qdsp6/q6routing.c +++ b/sound/soc/qcom/qdsp6/q6routing.c @@ -443,7 +443,11 @@ static int msm_routing_put_audio_mixer(struct snd_kcontrol *kcontrol, session->port_id = be_id; snd_soc_dapm_mixer_update_power(dapm, kcontrol, 1, update); } else { - session->port_id = -1; + if (session->port_id == be_id) { + session->port_id = -1; + return 0; + } + snd_soc_dapm_mixer_update_power(dapm, kcontrol, 0, update); }
From: Takashi Iwai tiwai@suse.de
[ Upstream commit 7e567b5ae06315ef2d70666b149962e2bb4b97af ]
snd_ctl_remove() has to be called with card->controls_rwsem held (when called after the card instantiation). This patch add the missing rwsem calls around it.
Fixes: 8a9782346dcc ("ASoC: topology: Add topology core") Signed-off-by: Takashi Iwai tiwai@suse.de Link: https://lore.kernel.org/r/20211116071812.18109-1-tiwai@suse.de Signed-off-by: Mark Brown broonie@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- sound/soc/soc-topology.c | 3 +++ 1 file changed, 3 insertions(+)
diff --git a/sound/soc/soc-topology.c b/sound/soc/soc-topology.c index c367609433bfc..21f859e56b700 100644 --- a/sound/soc/soc-topology.c +++ b/sound/soc/soc-topology.c @@ -2777,6 +2777,7 @@ EXPORT_SYMBOL_GPL(snd_soc_tplg_widget_remove_all); /* remove dynamic controls from the component driver */ int snd_soc_tplg_component_remove(struct snd_soc_component *comp, u32 index) { + struct snd_card *card = comp->card->snd_card; struct snd_soc_dobj *dobj, *next_dobj; int pass = SOC_TPLG_PASS_END;
@@ -2784,6 +2785,7 @@ int snd_soc_tplg_component_remove(struct snd_soc_component *comp, u32 index) while (pass >= SOC_TPLG_PASS_START) {
/* remove mixer controls */ + down_write(&card->controls_rwsem); list_for_each_entry_safe(dobj, next_dobj, &comp->dobj_list, list) {
@@ -2827,6 +2829,7 @@ int snd_soc_tplg_component_remove(struct snd_soc_component *comp, u32 index) break; } } + up_write(&card->controls_rwsem); pass--; }
From: Alexander Aring aahringo@redhat.com
[ Upstream commit 451dc48c806a7ce9fbec5e7a24ccf4b2c936e834 ]
This patch fixes an issue that an u32 netlink value is handled as a signed enum value which doesn't fit into the range of u32 netlink type. If it's handled as -1 value some BIT() evaluation ends in a shift-out-of-bounds issue. To solve the issue we set the to u32 max which is s32 "-1" value to keep backwards compatibility and let the followed enum values start counting at 0. This brings the compiler to never handle the enum as signed and a check if the value is above NL802154_IFTYPE_MAX should filter -1 out.
Fixes: f3ea5e44231a ("ieee802154: add new interface command") Signed-off-by: Alexander Aring aahringo@redhat.com Link: https://lore.kernel.org/r/20211112030916.685793-1-aahringo@redhat.com Signed-off-by: Stefan Schmidt stefan@datenfreihafen.org Signed-off-by: Sasha Levin sashal@kernel.org --- include/net/nl802154.h | 7 ++++--- 1 file changed, 4 insertions(+), 3 deletions(-)
diff --git a/include/net/nl802154.h b/include/net/nl802154.h index ddcee128f5d9a..145acb8f25095 100644 --- a/include/net/nl802154.h +++ b/include/net/nl802154.h @@ -19,6 +19,8 @@ * */
+#include <linux/types.h> + #define NL802154_GENL_NAME "nl802154"
enum nl802154_commands { @@ -150,10 +152,9 @@ enum nl802154_attrs { };
enum nl802154_iftype { - /* for backwards compatibility TODO */ - NL802154_IFTYPE_UNSPEC = -1, + NL802154_IFTYPE_UNSPEC = (~(__u32)0),
- NL802154_IFTYPE_NODE, + NL802154_IFTYPE_NODE = 0, NL802154_IFTYPE_MONITOR, NL802154_IFTYPE_COORD,
From: Peng Fan peng.fan@nxp.com
[ Upstream commit 1446fc6c678e8d8b31606a4b877abe205f344b38 ]
of_genpd_add_provider_onecell may return error, so let's propagate its return value to caller
Link: https://lore.kernel.org/r/20211116064227.20571-1-peng.fan@oss.nxp.com Fixes: 898216c97ed2 ("firmware: arm_scmi: add device power domain support using genpd") Signed-off-by: Peng Fan peng.fan@nxp.com Signed-off-by: Sudeep Holla sudeep.holla@arm.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/firmware/arm_scmi/scmi_pm_domain.c | 4 +--- 1 file changed, 1 insertion(+), 3 deletions(-)
diff --git a/drivers/firmware/arm_scmi/scmi_pm_domain.c b/drivers/firmware/arm_scmi/scmi_pm_domain.c index 041f8152272bf..177874adccf0d 100644 --- a/drivers/firmware/arm_scmi/scmi_pm_domain.c +++ b/drivers/firmware/arm_scmi/scmi_pm_domain.c @@ -106,9 +106,7 @@ static int scmi_pm_domain_probe(struct scmi_device *sdev) scmi_pd_data->domains = domains; scmi_pd_data->num_domains = num_domains;
- of_genpd_add_provider_onecell(np, scmi_pd_data); - - return 0; + return of_genpd_add_provider_onecell(np, scmi_pd_data); }
static const struct scmi_device_id scmi_id_table[] = {
From: Trond Myklebust trond.myklebust@hammerspace.com
[ Upstream commit d3c45824ad65aebf765fcf51366d317a29538820 ]
The failure to retrieve post-op attributes has no bearing on whether or not the clone operation itself was successful. We must therefore ignore the return value of decode_getfattr() when looking at the success or failure of nfs4_xdr_dec_clone().
Fixes: 36022770de6c ("nfs42: add CLONE xdr functions") Signed-off-by: Trond Myklebust trond.myklebust@hammerspace.com Signed-off-by: Sasha Levin sashal@kernel.org --- fs/nfs/nfs42xdr.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-)
diff --git a/fs/nfs/nfs42xdr.c b/fs/nfs/nfs42xdr.c index aed865a846296..2b78f7b8d5467 100644 --- a/fs/nfs/nfs42xdr.c +++ b/fs/nfs/nfs42xdr.c @@ -769,8 +769,7 @@ static int nfs4_xdr_dec_clone(struct rpc_rqst *rqstp, status = decode_clone(xdr); if (status) goto out; - status = decode_getfattr(xdr, res->dst_fattr, res->server); - + decode_getfattr(xdr, res->dst_fattr, res->server); out: res->rpc_status = status; return status;
From: Takashi Iwai tiwai@suse.de
[ Upstream commit 187bea472600dcc8d2eb714335053264dd437172 ]
When CONFIG_FORTIFY_SOURCE is set, memcpy() checks the potential buffer overflow and panics. The code in sofcpga bootstrapping contains the memcpy() calls are mistakenly translated as the shorter size, hence it triggers a panic as if it were overflowing.
This patch changes the secondary_trampoline and *_end definitions to arrays for avoiding the false-positive crash above.
Fixes: 9c4566a117a6 ("ARM: socfpga: Enable SMP for socfpga") Suggested-by: Kees Cook keescook@chromium.org Buglink: https://bugzilla.suse.com/show_bug.cgi?id=1192473 Link: https://lore.kernel.org/r/20211117193244.31162-1-tiwai@suse.de Signed-off-by: Takashi Iwai tiwai@suse.de Signed-off-by: Dinh Nguyen dinguyen@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- arch/arm/mach-socfpga/core.h | 2 +- arch/arm/mach-socfpga/platsmp.c | 8 ++++---- 2 files changed, 5 insertions(+), 5 deletions(-)
diff --git a/arch/arm/mach-socfpga/core.h b/arch/arm/mach-socfpga/core.h index fc2608b18a0d0..18f01190dcfd4 100644 --- a/arch/arm/mach-socfpga/core.h +++ b/arch/arm/mach-socfpga/core.h @@ -33,7 +33,7 @@ extern void __iomem *sdr_ctl_base_addr; u32 socfpga_sdram_self_refresh(u32 sdr_base); extern unsigned int socfpga_sdram_self_refresh_sz;
-extern char secondary_trampoline, secondary_trampoline_end; +extern char secondary_trampoline[], secondary_trampoline_end[];
extern unsigned long socfpga_cpu1start_addr;
diff --git a/arch/arm/mach-socfpga/platsmp.c b/arch/arm/mach-socfpga/platsmp.c index fbb80b883e5dd..201191cf68f32 100644 --- a/arch/arm/mach-socfpga/platsmp.c +++ b/arch/arm/mach-socfpga/platsmp.c @@ -20,14 +20,14 @@
static int socfpga_boot_secondary(unsigned int cpu, struct task_struct *idle) { - int trampoline_size = &secondary_trampoline_end - &secondary_trampoline; + int trampoline_size = secondary_trampoline_end - secondary_trampoline;
if (socfpga_cpu1start_addr) { /* This will put CPU #1 into reset. */ writel(RSTMGR_MPUMODRST_CPU1, rst_manager_base_addr + SOCFPGA_RSTMGR_MODMPURST);
- memcpy(phys_to_virt(0), &secondary_trampoline, trampoline_size); + memcpy(phys_to_virt(0), secondary_trampoline, trampoline_size);
writel(__pa_symbol(secondary_startup), sys_manager_base_addr + (socfpga_cpu1start_addr & 0x000000ff)); @@ -45,12 +45,12 @@ static int socfpga_boot_secondary(unsigned int cpu, struct task_struct *idle)
static int socfpga_a10_boot_secondary(unsigned int cpu, struct task_struct *idle) { - int trampoline_size = &secondary_trampoline_end - &secondary_trampoline; + int trampoline_size = secondary_trampoline_end - secondary_trampoline;
if (socfpga_cpu1start_addr) { writel(RSTMGR_MPUMODRST_CPU1, rst_manager_base_addr + SOCFPGA_A10_RSTMGR_MODMPURST); - memcpy(phys_to_virt(0), &secondary_trampoline, trampoline_size); + memcpy(phys_to_virt(0), secondary_trampoline, trampoline_size);
writel(__pa_symbol(secondary_startup), sys_manager_base_addr + (socfpga_cpu1start_addr & 0x00000fff));
From: Sreekanth Reddy sreekanth.reddy@broadcom.com
[ Upstream commit 0ee4ba13e09c9d9c1cb6abb59da8295d9952328b ]
While looping over shost's sdev list it is possible that one of the drives is getting removed and its sas_target object is freed but its sdev object remains intact.
Consequently, a kernel panic can occur while the driver is trying to access the sas_address field of sas_target object without also checking the sas_target object for NULL.
Link: https://lore.kernel.org/r/20211117104909.2069-1-sreekanth.reddy@broadcom.com Fixes: f92363d12359 ("[SCSI] mpt3sas: add new driver supporting 12GB SAS") Signed-off-by: Sreekanth Reddy sreekanth.reddy@broadcom.com Signed-off-by: Martin K. Petersen martin.petersen@oracle.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/scsi/mpt3sas/mpt3sas_scsih.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/scsi/mpt3sas/mpt3sas_scsih.c b/drivers/scsi/mpt3sas/mpt3sas_scsih.c index 3654cfc4376fa..97c1f242ef0a3 100644 --- a/drivers/scsi/mpt3sas/mpt3sas_scsih.c +++ b/drivers/scsi/mpt3sas/mpt3sas_scsih.c @@ -3387,7 +3387,7 @@ _scsih_ublock_io_device(struct MPT3SAS_ADAPTER *ioc, u64 sas_address)
shost_for_each_device(sdev, ioc->shost) { sas_device_priv_data = sdev->hostdata; - if (!sas_device_priv_data) + if (!sas_device_priv_data || !sas_device_priv_data->sas_target) continue; if (sas_device_priv_data->sas_target->sas_address != sas_address)
From: Dan Carpenter dan.carpenter@oracle.com
[ Upstream commit 96c5f82ef0a145d3e56e5b26f2bf6dcd2ffeae1c ]
The ->gem_create_object() functions are supposed to return NULL if there is an error. None of the callers expect error pointers so returing one will lead to an Oops. See drm_gem_vram_create(), for example.
Fixes: c826a6e10644 ("drm/vc4: Add a BO cache.") Signed-off-by: Dan Carpenter dan.carpenter@oracle.com Signed-off-by: Maxime Ripard maxime@cerno.tech Link: https://patchwork.freedesktop.org/patch/msgid/20211118111416.GC1147@kili Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/vc4/vc4_bo.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/vc4/vc4_bo.c b/drivers/gpu/drm/vc4/vc4_bo.c index 72d30d90b856c..0af246a5609ca 100644 --- a/drivers/gpu/drm/vc4/vc4_bo.c +++ b/drivers/gpu/drm/vc4/vc4_bo.c @@ -389,7 +389,7 @@ struct drm_gem_object *vc4_create_object(struct drm_device *dev, size_t size)
bo = kzalloc(sizeof(*bo), GFP_KERNEL); if (!bo) - return ERR_PTR(-ENOMEM); + return NULL;
bo->madv = VC4_MADV_WILLNEED; refcount_set(&bo->usecnt, 0);
From: Nitesh B Venkatesh nitesh.b.venkatesh@intel.com
[ Upstream commit e792779e6b639c182df91b46ac1e5803460b0b15 ]
Resolve being able to change static values on VF when adaptive interrupt moderation is enabled.
This problem is fixed by checking the interrupt settings is not a combination of change of static value while adaptive interrupt moderation is turned on.
Without this fix, the user would be able to change static values on VF with adaptive moderation enabled.
Fixes: 65e87c0398f5 ("i40evf: support queue-specific settings for interrupt moderation") Signed-off-by: Nitesh B Venkatesh nitesh.b.venkatesh@intel.com Tested-by: George Kuruvinakunnel george.kuruvinakunnel@intel.com Signed-off-by: Tony Nguyen anthony.l.nguyen@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- .../net/ethernet/intel/iavf/iavf_ethtool.c | 30 ++++++++++++++++--- 1 file changed, 26 insertions(+), 4 deletions(-)
diff --git a/drivers/net/ethernet/intel/iavf/iavf_ethtool.c b/drivers/net/ethernet/intel/iavf/iavf_ethtool.c index ad1e796e5544a..4e0e1b02d615e 100644 --- a/drivers/net/ethernet/intel/iavf/iavf_ethtool.c +++ b/drivers/net/ethernet/intel/iavf/iavf_ethtool.c @@ -719,12 +719,31 @@ static int iavf_get_per_queue_coalesce(struct net_device *netdev, u32 queue, * * Change the ITR settings for a specific queue. **/ -static void iavf_set_itr_per_queue(struct iavf_adapter *adapter, - struct ethtool_coalesce *ec, int queue) +static int iavf_set_itr_per_queue(struct iavf_adapter *adapter, + struct ethtool_coalesce *ec, int queue) { struct iavf_ring *rx_ring = &adapter->rx_rings[queue]; struct iavf_ring *tx_ring = &adapter->tx_rings[queue]; struct iavf_q_vector *q_vector; + u16 itr_setting; + + itr_setting = rx_ring->itr_setting & ~IAVF_ITR_DYNAMIC; + + if (ec->rx_coalesce_usecs != itr_setting && + ec->use_adaptive_rx_coalesce) { + netif_info(adapter, drv, adapter->netdev, + "Rx interrupt throttling cannot be changed if adaptive-rx is enabled\n"); + return -EINVAL; + } + + itr_setting = tx_ring->itr_setting & ~IAVF_ITR_DYNAMIC; + + if (ec->tx_coalesce_usecs != itr_setting && + ec->use_adaptive_tx_coalesce) { + netif_info(adapter, drv, adapter->netdev, + "Tx interrupt throttling cannot be changed if adaptive-tx is enabled\n"); + return -EINVAL; + }
rx_ring->itr_setting = ITR_REG_ALIGN(ec->rx_coalesce_usecs); tx_ring->itr_setting = ITR_REG_ALIGN(ec->tx_coalesce_usecs); @@ -747,6 +766,7 @@ static void iavf_set_itr_per_queue(struct iavf_adapter *adapter, * the Tx and Rx ITR values based on the values we have entered * into the q_vector, no need to write the values now. */ + return 0; }
/** @@ -788,9 +808,11 @@ static int __iavf_set_coalesce(struct net_device *netdev, */ if (queue < 0) { for (i = 0; i < adapter->num_active_queues; i++) - iavf_set_itr_per_queue(adapter, ec, i); + if (iavf_set_itr_per_queue(adapter, ec, i)) + return -EINVAL; } else if (queue < adapter->num_active_queues) { - iavf_set_itr_per_queue(adapter, ec, queue); + if (iavf_set_itr_per_queue(adapter, ec, queue)) + return -EINVAL; } else { netif_info(adapter, drv, netdev, "Invalid queue value, queue range is 0 - %d\n", adapter->num_active_queues - 1);
From: Eric Dumazet edumazet@google.com
[ Upstream commit 19d36c5f294879949c9d6f57cb61d39cc4c48553 ]
We deal with IPv6 packets, so we need to use IP6CB(skb)->flags and IP6SKB_REROUTED, instead of IPCB(skb)->flags and IPSKB_REROUTED
Found by code inspection, please double check that fixing this bug does not surface other bugs.
Fixes: 09ee9dba9611 ("ipv6: Reinject IPv6 packets if IPsec policy matches after SNAT") Signed-off-by: Eric Dumazet edumazet@google.com Cc: Tobias Brunner tobias@strongswan.org Cc: Steffen Klassert steffen.klassert@secunet.com Cc: David Ahern dsahern@kernel.org Reviewed-by: David Ahern dsahern@kernel.org Tested-by: Tobias Brunner tobias@strongswan.org Acked-by: Tobias Brunner tobias@strongswan.org Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- net/ipv6/ip6_output.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/net/ipv6/ip6_output.c b/net/ipv6/ip6_output.c index fc913f09606db..d847aa32628da 100644 --- a/net/ipv6/ip6_output.c +++ b/net/ipv6/ip6_output.c @@ -192,7 +192,7 @@ static int __ip6_finish_output(struct net *net, struct sock *sk, struct sk_buff #if defined(CONFIG_NETFILTER) && defined(CONFIG_XFRM) /* Policy lookup after SNAT yielded a new policy */ if (skb_dst(skb)->xfrm) { - IPCB(skb)->flags |= IPSKB_REROUTED; + IP6CB(skb)->flags |= IP6SKB_REROUTED; return dst_output(net, sk, skb); } #endif
From: Diana Wang na.wang@corigine.com
[ Upstream commit 3bd6b2a838ba6a3b86d41b077f570b1b61174def ]
Use nn->tlv_caps.me_freq_mhz instead of nn->me_freq_mhz to check whether rx-usecs/tx-usecs is valid.
This is because nn->tlv_caps.me_freq_mhz represents the clock_freq (MHz) of the flow processing cores (FPC) on the NIC. While nn->me_freq_mhz is not be set.
Fixes: ce991ab6662a ("nfp: read ME frequency from vNIC ctrl memory") Signed-off-by: Diana Wang na.wang@corigine.com Signed-off-by: Simon Horman simon.horman@corigine.com Reviewed-by: Jakub Kicinski kuba@kernel.org Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/netronome/nfp/nfp_net.h | 3 --- drivers/net/ethernet/netronome/nfp/nfp_net_ethtool.c | 2 +- 2 files changed, 1 insertion(+), 4 deletions(-)
diff --git a/drivers/net/ethernet/netronome/nfp/nfp_net.h b/drivers/net/ethernet/netronome/nfp/nfp_net.h index 250f510b1d212..3dcb09f17b77f 100644 --- a/drivers/net/ethernet/netronome/nfp/nfp_net.h +++ b/drivers/net/ethernet/netronome/nfp/nfp_net.h @@ -557,7 +557,6 @@ struct nfp_net_dp { * @exn_name: Name for Exception interrupt * @shared_handler: Handler for shared interrupts * @shared_name: Name for shared interrupt - * @me_freq_mhz: ME clock_freq (MHz) * @reconfig_lock: Protects @reconfig_posted, @reconfig_timer_active, * @reconfig_sync_present and HW reconfiguration request * regs/machinery from async requests (sync must take @@ -639,8 +638,6 @@ struct nfp_net { irq_handler_t shared_handler; char shared_name[IFNAMSIZ + 8];
- u32 me_freq_mhz; - bool link_up; spinlock_t link_status_lock;
diff --git a/drivers/net/ethernet/netronome/nfp/nfp_net_ethtool.c b/drivers/net/ethernet/netronome/nfp/nfp_net_ethtool.c index 2354dec994184..89e578e25ff8f 100644 --- a/drivers/net/ethernet/netronome/nfp/nfp_net_ethtool.c +++ b/drivers/net/ethernet/netronome/nfp/nfp_net_ethtool.c @@ -1269,7 +1269,7 @@ static int nfp_net_set_coalesce(struct net_device *netdev, * ME timestamp ticks. There are 16 ME clock cycles for each timestamp * count. */ - factor = nn->me_freq_mhz / 16; + factor = nn->tlv_caps.me_freq_mhz / 16;
/* Each pair of (usecs, max_frames) fields specifies that interrupts * should be coalesced until
From: Nikolay Aleksandrov nikolay@nvidia.com
[ Upstream commit 8837cbbf854246f5f4d565f21e6baa945d37aded ]
We need a way to release a fib6_nh's per-cpu dsts when replacing nexthops otherwise we can end up with stale per-cpu dsts which hold net device references, so add a new IPv6 stub called fib6_nh_release_dsts. It must be used after an RCU grace period, so no new dsts can be created through a group's nexthop entry. Similar to fib6_nh_release it shouldn't be used if fib6_nh_init has failed so it doesn't need a dummy stub when IPv6 is not enabled.
Fixes: 7bf4796dd099 ("nexthops: add support for replace") Signed-off-by: Nikolay Aleksandrov nikolay@nvidia.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- include/net/ip6_fib.h | 1 + include/net/ipv6_stubs.h | 1 + net/ipv6/af_inet6.c | 1 + net/ipv6/route.c | 19 +++++++++++++++++++ 4 files changed, 22 insertions(+)
diff --git a/include/net/ip6_fib.h b/include/net/ip6_fib.h index bd0f1595bdc71..05ecaefeb6322 100644 --- a/include/net/ip6_fib.h +++ b/include/net/ip6_fib.h @@ -451,6 +451,7 @@ int fib6_nh_init(struct net *net, struct fib6_nh *fib6_nh, struct fib6_config *cfg, gfp_t gfp_flags, struct netlink_ext_ack *extack); void fib6_nh_release(struct fib6_nh *fib6_nh); +void fib6_nh_release_dsts(struct fib6_nh *fib6_nh);
int call_fib6_entry_notifiers(struct net *net, enum fib_event_type event_type, diff --git a/include/net/ipv6_stubs.h b/include/net/ipv6_stubs.h index 3e7d2c0e79ca1..af9e127779adf 100644 --- a/include/net/ipv6_stubs.h +++ b/include/net/ipv6_stubs.h @@ -47,6 +47,7 @@ struct ipv6_stub { struct fib6_config *cfg, gfp_t gfp_flags, struct netlink_ext_ack *extack); void (*fib6_nh_release)(struct fib6_nh *fib6_nh); + void (*fib6_nh_release_dsts)(struct fib6_nh *fib6_nh); void (*fib6_update_sernum)(struct net *net, struct fib6_info *rt); int (*ip6_del_rt)(struct net *net, struct fib6_info *rt); void (*fib6_rt_update)(struct net *net, struct fib6_info *rt, diff --git a/net/ipv6/af_inet6.c b/net/ipv6/af_inet6.c index 14ac1d9112877..942da168f18fb 100644 --- a/net/ipv6/af_inet6.c +++ b/net/ipv6/af_inet6.c @@ -955,6 +955,7 @@ static const struct ipv6_stub ipv6_stub_impl = { .ip6_mtu_from_fib6 = ip6_mtu_from_fib6, .fib6_nh_init = fib6_nh_init, .fib6_nh_release = fib6_nh_release, + .fib6_nh_release_dsts = fib6_nh_release_dsts, .fib6_update_sernum = fib6_update_sernum_stub, .fib6_rt_update = fib6_rt_update, .ip6_del_rt = ip6_del_rt, diff --git a/net/ipv6/route.c b/net/ipv6/route.c index daa876c6ae8db..f36db3dd97346 100644 --- a/net/ipv6/route.c +++ b/net/ipv6/route.c @@ -3585,6 +3585,25 @@ void fib6_nh_release(struct fib6_nh *fib6_nh) fib_nh_common_release(&fib6_nh->nh_common); }
+void fib6_nh_release_dsts(struct fib6_nh *fib6_nh) +{ + int cpu; + + if (!fib6_nh->rt6i_pcpu) + return; + + for_each_possible_cpu(cpu) { + struct rt6_info *pcpu_rt, **ppcpu_rt; + + ppcpu_rt = per_cpu_ptr(fib6_nh->rt6i_pcpu, cpu); + pcpu_rt = xchg(ppcpu_rt, NULL); + if (pcpu_rt) { + dst_dev_put(&pcpu_rt->dst); + dst_release(&pcpu_rt->dst); + } + } +} + static struct fib6_info *ip6_route_info_create(struct fib6_config *cfg, gfp_t gfp_flags, struct netlink_ext_ack *extack)
From: Nikolay Aleksandrov nikolay@nvidia.com
[ Upstream commit 1005f19b9357b81aa64e1decd08d6e332caaa284 ]
When replacing a nexthop group, we must release the IPv6 per-cpu dsts of the removed nexthop entries after an RCU grace period because they contain references to the nexthop's net device and to the fib6 info. With specific series of events[1] we can reach net device refcount imbalance which is unrecoverable. IPv4 is not affected because dsts don't take a refcount on the route.
[1] $ ip nexthop list id 200 via 2002:db8::2 dev bridge.10 scope link onlink id 201 via 2002:db8::3 dev bridge scope link onlink id 203 group 201/200 $ ip -6 route 2001:db8::10 nhid 203 metric 1024 pref medium nexthop via 2002:db8::3 dev bridge weight 1 onlink nexthop via 2002:db8::2 dev bridge.10 weight 1 onlink
Create rt6_info through one of the multipath legs, e.g.: $ taskset -a -c 1 ./pkt_inj 24 bridge.10 2001:db8::10 (pkt_inj is just a custom packet generator, nothing special)
Then remove that leg from the group by replace (let's assume it is id 200 in this case): $ ip nexthop replace id 203 group 201
Now remove the IPv6 route: $ ip -6 route del 2001:db8::10/128
The route won't be really deleted due to the stale rt6_info holding 1 refcnt in nexthop id 200. At this point we have the following reference count dependency: (deleted) IPv6 route holds 1 reference over nhid 203 nh 203 holds 1 ref over id 201 nh 200 holds 1 ref over the net device and the route due to the stale rt6_info
Now to create circular dependency between nh 200 and the IPv6 route, and also to get a reference over nh 200, restore nhid 200 in the group: $ ip nexthop replace id 203 group 201/200
And now we have a permanent circular dependncy because nhid 203 holds a reference over nh 200 and 201, but the route holds a ref over nh 203 and is deleted.
To trigger the bug just delete the group (nhid 203): $ ip nexthop del id 203
It won't really be deleted due to the IPv6 route dependency, and now we have 2 unlinked and deleted objects that reference each other: the group and the IPv6 route. Since the group drops the reference it holds over its entries at free time (i.e. its own refcount needs to drop to 0) that will never happen and we get a permanent ref on them, since one of the entries holds a reference over the IPv6 route it will also never be released.
At this point the dependencies are: (deleted, only unlinked) IPv6 route holds reference over group nh 203 (deleted, only unlinked) group nh 203 holds reference over nh 201 and 200 nh 200 holds 1 ref over the net device and the route due to the stale rt6_info
This is the last point where it can be fixed by running traffic through nh 200, and specifically through the same CPU so the rt6_info (dst) will get released due to the IPv6 genid, that in turn will free the IPv6 route, which in turn will free the ref count over the group nh 203.
If nh 200 is deleted at this point, it will never be released due to the ref from the unlinked group 203, it will only be unlinked: $ ip nexthop del id 200 $ ip nexthop $
Now we can never release that stale rt6_info, we have IPv6 route with ref over group nh 203, group nh 203 with ref over nh 200 and 201, nh 200 with rt6_info (dst) with ref over the net device and the IPv6 route. All of these objects are only unlinked, and cannot be released, thus they can't release their ref counts.
Message from syslogd@dev at Nov 19 14:04:10 ... kernel:[73501.828730] unregister_netdevice: waiting for bridge.10 to become free. Usage count = 3 Message from syslogd@dev at Nov 19 14:04:20 ... kernel:[73512.068811] unregister_netdevice: waiting for bridge.10 to become free. Usage count = 3
Fixes: 7bf4796dd099 ("nexthops: add support for replace") Signed-off-by: Nikolay Aleksandrov nikolay@nvidia.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- net/ipv4/nexthop.c | 25 +++++++++++++++++++++++-- 1 file changed, 23 insertions(+), 2 deletions(-)
diff --git a/net/ipv4/nexthop.c b/net/ipv4/nexthop.c index 8bdffdf0502a1..4d69b3de980a6 100644 --- a/net/ipv4/nexthop.c +++ b/net/ipv4/nexthop.c @@ -839,15 +839,36 @@ static void remove_nexthop(struct net *net, struct nexthop *nh, /* if any FIB entries reference this nexthop, any dst entries * need to be regenerated */ -static void nh_rt_cache_flush(struct net *net, struct nexthop *nh) +static void nh_rt_cache_flush(struct net *net, struct nexthop *nh, + struct nexthop *replaced_nh) { struct fib6_info *f6i; + struct nh_group *nhg; + int i;
if (!list_empty(&nh->fi_list)) rt_cache_flush(net);
list_for_each_entry(f6i, &nh->f6i_list, nh_list) ipv6_stub->fib6_update_sernum(net, f6i); + + /* if an IPv6 group was replaced, we have to release all old + * dsts to make sure all refcounts are released + */ + if (!replaced_nh->is_group) + return; + + /* new dsts must use only the new nexthop group */ + synchronize_net(); + + nhg = rtnl_dereference(replaced_nh->nh_grp); + for (i = 0; i < nhg->num_nh; i++) { + struct nh_grp_entry *nhge = &nhg->nh_entries[i]; + struct nh_info *nhi = rtnl_dereference(nhge->nh->nh_info); + + if (nhi->family == AF_INET6) + ipv6_stub->fib6_nh_release_dsts(&nhi->fib6_nh); + } }
static int replace_nexthop_grp(struct net *net, struct nexthop *old, @@ -994,7 +1015,7 @@ static int replace_nexthop(struct net *net, struct nexthop *old, err = replace_nexthop_single(net, old, new, extack);
if (!err) { - nh_rt_cache_flush(net, old); + nh_rt_cache_flush(net, old, new);
__remove_nexthop(net, new, NULL); nexthop_put(new);
From: Mike Christie michael.christie@oracle.com
[ Upstream commit eb97545d6264b341b06ba7603f52ff6c0b2af6ea ]
This fixes an issue added in commit 4edd8cd4e86d ("scsi: core: sysfs: Fix hang when device state is set via sysfs") where if userspace is requesting to set the device state to SDEV_RUNNING when the state is already SDEV_RUNNING, we return -EINVAL instead of count. The commmit above set ret to count for this case, when it should have set it to 0.
Link: https://lore.kernel.org/r/20211120164917.4924-1-michael.christie@oracle.com Fixes: 4edd8cd4e86d ("scsi: core: sysfs: Fix hang when device state is set via sysfs") Reviewed-by: Lee Duncan lduncan@suse.com Signed-off-by: Mike Christie michael.christie@oracle.com Signed-off-by: Martin K. Petersen martin.petersen@oracle.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/scsi/scsi_sysfs.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/scsi/scsi_sysfs.c b/drivers/scsi/scsi_sysfs.c index 16432d42a50aa..6faf1d6451b0c 100644 --- a/drivers/scsi/scsi_sysfs.c +++ b/drivers/scsi/scsi_sysfs.c @@ -796,7 +796,7 @@ store_state_field(struct device *dev, struct device_attribute *attr,
mutex_lock(&sdev->state_mutex); if (sdev->sdev_state == SDEV_RUNNING && state == SDEV_RUNNING) { - ret = count; + ret = 0; } else { ret = scsi_device_set_state(sdev, state); if (ret == 0 && state == SDEV_RUNNING)
From: Tony Lu tonylu@linux.alibaba.com
[ Upstream commit 606a63c9783a32a45bd2ef0eee393711d75b3284 ]
The side that actively closed socket, it's clcsock doesn't enter TIME_WAIT state, but the passive side does it. It should show the same behavior as TCP sockets.
Consider this, when client actively closes the socket, the clcsock in server enters TIME_WAIT state, which means the address is occupied and won't be reused before TIME_WAIT dismissing. If we restarted server, the service would be unavailable for a long time.
To solve this issue, shutdown the clcsock in [A], perform the TCP active close progress first, before the passive closed side closing it. So that the actively closed side enters TIME_WAIT, not the passive one.
Client | Server close() // client actively close | smc_release() | smc_close_active() // PEERCLOSEWAIT1 | smc_close_final() // abort or closed = 1| smc_cdc_get_slot_and_msg_send() | [A] | |smc_cdc_msg_recv_action() // ACTIVE | queue_work(smc_close_wq, &conn->close_work) | smc_close_passive_work() // PROCESSABORT or APPCLOSEWAIT1 | smc_close_passive_abort_received() // only in abort | |close() // server recv zero, close | smc_release() // PROCESSABORT or APPCLOSEWAIT1 | smc_close_active() | smc_close_abort() or smc_close_final() // CLOSED | smc_cdc_get_slot_and_msg_send() // abort or closed = 1 smc_cdc_msg_recv_action() | smc_clcsock_release() queue_work(smc_close_wq, &conn->close_work) | sock_release(tcp) // actively close clc, enter TIME_WAIT smc_close_passive_work() // PEERCLOSEWAIT1 | smc_conn_free() smc_close_passive_abort_received() // CLOSED| smc_conn_free() | smc_clcsock_release() | sock_release(tcp) // passive close clc |
Link: https://www.spinics.net/lists/netdev/msg780407.html Fixes: b38d732477e4 ("smc: socket closing and linkgroup cleanup") Signed-off-by: Tony Lu tonylu@linux.alibaba.com Reviewed-by: Wen Gu guwen@linux.alibaba.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- net/smc/smc_close.c | 6 ++++++ 1 file changed, 6 insertions(+)
diff --git a/net/smc/smc_close.c b/net/smc/smc_close.c index fc06720b53c14..2eabf39dee74d 100644 --- a/net/smc/smc_close.c +++ b/net/smc/smc_close.c @@ -218,6 +218,12 @@ int smc_close_active(struct smc_sock *smc) if (rc) break; sk->sk_state = SMC_PEERCLOSEWAIT1; + + /* actively shutdown clcsock before peer close it, + * prevent peer from entering TIME_WAIT state. + */ + if (smc->clcsock && smc->clcsock->sk) + rc = kernel_sock_shutdown(smc->clcsock, SHUT_RDWR); } else { /* peer event has changed the state */ goto again;
From: Varun Prakash varun@chelsio.com
[ Upstream commit 102110efdff6beedece6ab9b51664c32ac01e2db ]
Current nvmet_try_send_ddgst() code does not check whether all data digest bytes are transmitted, fix this by returning -EAGAIN if all data digest bytes are not transmitted.
Fixes: 872d26a391da ("nvmet-tcp: add NVMe over TCP target driver") Signed-off-by: Varun Prakash varun@chelsio.com Reviewed-by: Sagi Grimberg sagi@grimberg.me Signed-off-by: Christoph Hellwig hch@lst.de Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/nvme/target/tcp.c | 7 ++++++- 1 file changed, 6 insertions(+), 1 deletion(-)
diff --git a/drivers/nvme/target/tcp.c b/drivers/nvme/target/tcp.c index fac1985870765..4341c72446628 100644 --- a/drivers/nvme/target/tcp.c +++ b/drivers/nvme/target/tcp.c @@ -631,10 +631,11 @@ static int nvmet_try_send_r2t(struct nvmet_tcp_cmd *cmd, bool last_in_batch) static int nvmet_try_send_ddgst(struct nvmet_tcp_cmd *cmd) { struct nvmet_tcp_queue *queue = cmd->queue; + int left = NVME_TCP_DIGEST_LENGTH - cmd->offset; struct msghdr msg = { .msg_flags = MSG_DONTWAIT }; struct kvec iov = { .iov_base = (u8 *)&cmd->exp_ddgst + cmd->offset, - .iov_len = NVME_TCP_DIGEST_LENGTH - cmd->offset + .iov_len = left }; int ret;
@@ -643,6 +644,10 @@ static int nvmet_try_send_ddgst(struct nvmet_tcp_cmd *cmd) return ret;
cmd->offset += ret; + left -= ret; + + if (left) + return -EAGAIN;
if (queue->nvme_sq.sqhd_disabled) { cmd->queue->snd_cmd = NULL;
From: Kumar Thangavel kumarthangavel.hcl@gmail.com
[ Upstream commit ac132852147ad303a938dda318970dd1bbdfda4e ]
Update NC-SI command handler (both standard and OEM) to take into account of payload paddings in allocating skb (in case of payload size is not 32-bit aligned).
The checksum field follows payload field, without taking payload padding into account can cause checksum being truncated, leading to dropped packets.
Fixes: fb4ee67529ff ("net/ncsi: Add NCSI OEM command support") Signed-off-by: Kumar Thangavel thangavel.k@hcl.com Acked-by: Samuel Mendoza-Jonas sam@mendozajonas.com Reviewed-by: Paul Menzel pmenzel@molgen.mpg.de Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- net/ncsi/ncsi-cmd.c | 24 ++++++++++++++++-------- 1 file changed, 16 insertions(+), 8 deletions(-)
diff --git a/net/ncsi/ncsi-cmd.c b/net/ncsi/ncsi-cmd.c index 0187e65176c05..114ef47db76d3 100644 --- a/net/ncsi/ncsi-cmd.c +++ b/net/ncsi/ncsi-cmd.c @@ -18,6 +18,8 @@ #include "internal.h" #include "ncsi-pkt.h"
+static const int padding_bytes = 26; + u32 ncsi_calculate_checksum(unsigned char *data, int len) { u32 checksum = 0; @@ -213,12 +215,17 @@ static int ncsi_cmd_handler_oem(struct sk_buff *skb, { struct ncsi_cmd_oem_pkt *cmd; unsigned int len; + int payload; + /* NC-SI spec DSP_0222_1.2.0, section 8.2.2.2 + * requires payload to be padded with 0 to + * 32-bit boundary before the checksum field. + * Ensure the padding bytes are accounted for in + * skb allocation + */
+ payload = ALIGN(nca->payload, 4); len = sizeof(struct ncsi_cmd_pkt_hdr) + 4; - if (nca->payload < 26) - len += 26; - else - len += nca->payload; + len += max(payload, padding_bytes);
cmd = skb_put_zero(skb, len); memcpy(&cmd->mfr_id, nca->data, nca->payload); @@ -272,6 +279,7 @@ static struct ncsi_request *ncsi_alloc_command(struct ncsi_cmd_arg *nca) struct net_device *dev = nd->dev; int hlen = LL_RESERVED_SPACE(dev); int tlen = dev->needed_tailroom; + int payload; int len = hlen + tlen; struct sk_buff *skb; struct ncsi_request *nr; @@ -281,14 +289,14 @@ static struct ncsi_request *ncsi_alloc_command(struct ncsi_cmd_arg *nca) return NULL;
/* NCSI command packet has 16-bytes header, payload, 4 bytes checksum. + * Payload needs padding so that the checksum field following payload is + * aligned to 32-bit boundary. * The packet needs padding if its payload is less than 26 bytes to * meet 64 bytes minimal ethernet frame length. */ len += sizeof(struct ncsi_cmd_pkt_hdr) + 4; - if (nca->payload < 26) - len += 26; - else - len += nca->payload; + payload = ALIGN(nca->payload, 4); + len += max(payload, padding_bytes);
/* Allocate skb */ skb = alloc_skb(len, GFP_ATOMIC);
From: Thomas Zeitlhofer thomas.zeitlhofer+lkml@ze-it.at
[ Upstream commit cefcf24b4d351daf70ecd945324e200d3736821e ]
Commit 39fbef4b0f77 ("PM: hibernate: Get block device exclusively in swsusp_check()") changed the opening mode of the block device to (FMODE_READ | FMODE_EXCL).
In the corresponding calls to swsusp_close(), the mode is still just FMODE_READ which triggers the warning in blkdev_flush_mapping() on resume from hibernate.
So, use the mode (FMODE_READ | FMODE_EXCL) also when closing the device.
Fixes: 39fbef4b0f77 ("PM: hibernate: Get block device exclusively in swsusp_check()") Signed-off-by: Thomas Zeitlhofer thomas.zeitlhofer+lkml@ze-it.at Signed-off-by: Rafael J. Wysocki rafael.j.wysocki@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- kernel/power/hibernate.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/kernel/power/hibernate.c b/kernel/power/hibernate.c index 69c4cd472def3..6cafb2e910a11 100644 --- a/kernel/power/hibernate.c +++ b/kernel/power/hibernate.c @@ -676,7 +676,7 @@ static int load_image_and_restore(void) goto Unlock;
error = swsusp_read(&flags); - swsusp_close(FMODE_READ); + swsusp_close(FMODE_READ | FMODE_EXCL); if (!error) hibernation_restore(flags & SF_PLATFORM_MODE);
@@ -871,7 +871,7 @@ static int software_resume(void) /* The snapshot device should not be opened while we're running */ if (!atomic_add_unless(&snapshot_device_available, -1, 0)) { error = -EBUSY; - swsusp_close(FMODE_READ); + swsusp_close(FMODE_READ | FMODE_EXCL); goto Unlock; }
@@ -907,7 +907,7 @@ static int software_resume(void) pm_pr_dbg("Hibernation image not present or could not be loaded.\n"); return error; Close_Finish: - swsusp_close(FMODE_READ); + swsusp_close(FMODE_READ | FMODE_EXCL); goto Finish; }
From: Eric Dumazet edumazet@google.com
[ Upstream commit 4e1fddc98d2585ddd4792b5e44433dcee7ece001 ]
While testing BIG TCP patch series, I was expecting that TCP_RR workloads with 80KB requests/answers would send one 80KB TSO packet, then being received as a single GRO packet.
It turns out this was not happening, and the root cause was that cubic Hystart ACK train was triggering after a few (2 or 3) rounds of RPC.
Hystart was wrongly setting CWND/SSTHRESH to 30, while my RPC needed a budget of ~20 segments.
Ideally these TCP_RR flows should not exit slow start.
Cubic Hystart should reset itself at each round, instead of assuming every TCP flow is a bulk one.
Note that even after this patch, Hystart can still trigger, depending on scheduling artifacts, but at a higher CWND/SSTHRESH threshold, keeping optimal TSO packet sizes.
Tested:
ip link set dev eth0 gro_ipv6_max_size 131072 gso_ipv6_max_size 131072 nstat -n; netperf -H ... -t TCP_RR -l 5 -- -r 80000,80000 -K cubic; nstat|egrep "Ip6InReceives|Hystart|Ip6OutRequests"
Before:
8605 Ip6InReceives 87541 0.0 Ip6OutRequests 129496 0.0 TcpExtTCPHystartTrainDetect 1 0.0 TcpExtTCPHystartTrainCwnd 30 0.0
After:
8760 Ip6InReceives 88514 0.0 Ip6OutRequests 87975 0.0
Fixes: ae27e98a5152 ("[TCP] CUBIC v2.3") Co-developed-by: Neal Cardwell ncardwell@google.com Signed-off-by: Neal Cardwell ncardwell@google.com Signed-off-by: Eric Dumazet edumazet@google.com Cc: Stephen Hemminger stephen@networkplumber.org Cc: Yuchung Cheng ycheng@google.com Cc: Soheil Hassas Yeganeh soheil@google.com Link: https://lore.kernel.org/r/20211123202535.1843771-1-eric.dumazet@gmail.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- net/ipv4/tcp_cubic.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-)
diff --git a/net/ipv4/tcp_cubic.c b/net/ipv4/tcp_cubic.c index ee6c38a73325d..44be7a5a13911 100644 --- a/net/ipv4/tcp_cubic.c +++ b/net/ipv4/tcp_cubic.c @@ -341,8 +341,6 @@ static void bictcp_cong_avoid(struct sock *sk, u32 ack, u32 acked) return;
if (tcp_in_slow_start(tp)) { - if (hystart && after(ack, ca->end_seq)) - bictcp_hystart_reset(sk); acked = tcp_slow_start(tp, acked); if (!acked) return; @@ -384,6 +382,9 @@ static void hystart_update(struct sock *sk, u32 delay) if (ca->found & hystart_detect) return;
+ if (after(tp->snd_una, ca->end_seq)) + bictcp_hystart_reset(sk); + if (hystart_detect & HYSTART_ACK_TRAIN) { u32 now = bictcp_clock();
From: Maurizio Lombardi mlombard@redhat.com
[ Upstream commit c024b226a417c4eb9353ff500b1c823165d4d508 ]
Submit I/O requests with the IOCB_NOWAIT flag set only if the underlying filesystem supports it.
Fixes: 50a909db36f2 ("nvmet: use IOCB_NOWAIT for file-ns buffered I/O") Signed-off-by: Maurizio Lombardi mlombard@redhat.com Reviewed-by: Chaitanya Kulkarni kch@nvidia.com Signed-off-by: Christoph Hellwig hch@lst.de Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/nvme/target/io-cmd-file.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/drivers/nvme/target/io-cmd-file.c b/drivers/nvme/target/io-cmd-file.c index 6ca17a0babae2..1c8d16b0245b1 100644 --- a/drivers/nvme/target/io-cmd-file.c +++ b/drivers/nvme/target/io-cmd-file.c @@ -8,6 +8,7 @@ #include <linux/uio.h> #include <linux/falloc.h> #include <linux/file.h> +#include <linux/fs.h> #include "nvmet.h"
#define NVMET_MAX_MPOOL_BVEC 16 @@ -254,7 +255,8 @@ static void nvmet_file_execute_rw(struct nvmet_req *req)
if (req->ns->buffered_io) { if (likely(!req->f.mpool_alloc) && - nvmet_file_execute_io(req, IOCB_NOWAIT)) + (req->ns->file->f_mode & FMODE_NOWAIT) && + nvmet_file_execute_io(req, IOCB_NOWAIT)) return; nvmet_file_submit_buffered_io(req); } else
From: Jesse Brandeburg jesse.brandeburg@intel.com
[ Upstream commit eaeace60778e524a2820d0c0ad60bf80289e292c ]
Oleksandr brought a bug report where netpoll causes trace messages in the log on igb.
Danielle brought this back up as still occurring, so we'll try again.
[22038.710800] ------------[ cut here ]------------ [22038.710801] igb_poll+0x0/0x1440 [igb] exceeded budget in poll [22038.710802] WARNING: CPU: 12 PID: 40362 at net/core/netpoll.c:155 netpoll_poll_dev+0x18a/0x1a0
As Alex suggested, change the driver to return work_done at the exit of napi_poll, which should be safe to do in this driver because it is not polling multiple queues in this single napi context (multiple queues attached to one MSI-X vector). Several other drivers contain the same simple sequence, so I hope this will not create new problems.
Fixes: 16eb8815c235 ("igb: Refactor clean_rx_irq to reduce overhead and improve performance") Reported-by: Oleksandr Natalenko oleksandr@natalenko.name Reported-by: Danielle Ratson danieller@nvidia.com Suggested-by: Alexander Duyck alexander.duyck@gmail.com Signed-off-by: Jesse Brandeburg jesse.brandeburg@intel.com Tested-by: Oleksandr Natalenko oleksandr@natalenko.name Tested-by: Danielle Ratson danieller@nvidia.com Link: https://lore.kernel.org/r/20211123204000.1597971-1-jesse.brandeburg@intel.co... Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/intel/igb/igb_main.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/intel/igb/igb_main.c b/drivers/net/ethernet/intel/igb/igb_main.c index 158feb0ab2739..c11244a9b7e69 100644 --- a/drivers/net/ethernet/intel/igb/igb_main.c +++ b/drivers/net/ethernet/intel/igb/igb_main.c @@ -7752,7 +7752,7 @@ static int igb_poll(struct napi_struct *napi, int budget) if (likely(napi_complete_done(napi, work_done))) igb_ring_irq_enable(q_vector);
- return min(work_done, budget - 1); + return work_done; }
/**
From: Huang Pei huangpei@loongson.cn
[ Upstream commit 41ce097f714401e6ad8f3f5eb30d7f91b0b5e495 ]
It hangup when booting Loongson 3A1000 with BOTH CONFIG_PAGE_SIZE_64KB and CONFIG_MIPS_VA_BITS_48, that it turn out to use 2-level pgtable instead of 3-level. 64KB page size with 2-level pgtable only cover 42 bits VA, use 3-level pgtable to cover all 48 bits VA(55 bits)
Fixes: 1e321fa917fb ("MIPS64: Support of at least 48 bits of SEGBITS) Signed-off-by: Huang Pei huangpei@loongson.cn Signed-off-by: Thomas Bogendoerfer tsbogend@alpha.franken.de Signed-off-by: Sasha Levin sashal@kernel.org --- arch/mips/Kconfig | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/mips/Kconfig b/arch/mips/Kconfig index 9749818eed6d6..2811ecc1f3c71 100644 --- a/arch/mips/Kconfig +++ b/arch/mips/Kconfig @@ -3059,7 +3059,7 @@ config STACKTRACE_SUPPORT config PGTABLE_LEVELS int default 4 if PAGE_SIZE_4KB && MIPS_VA_BITS_48 - default 3 if 64BIT && !PAGE_SIZE_64KB + default 3 if 64BIT && (!PAGE_SIZE_64KB || MIPS_VA_BITS_48) default 2
config MIPS_AUTO_PFN_OFFSET
From: Ziyang Xuan william.xuanziyang@huawei.com
[ Upstream commit 01d9cc2dea3fde3bad6d27f464eff463496e2b00 ]
Inject error before dev_hold(real_dev) in register_vlan_dev(), and execute the following testcase:
ip link add dev dummy1 type dummy ip link add name dummy1.100 link dummy1 type vlan id 100 ip link del dev dummy1
When the dummy netdevice is removed, we will get a WARNING as following:
======================================================================= refcount_t: decrement hit 0; leaking memory. WARNING: CPU: 2 PID: 0 at lib/refcount.c:31 refcount_warn_saturate+0xbf/0x1e0
and an endless loop of:
======================================================================= unregister_netdevice: waiting for dummy1 to become free. Usage count = -1073741824
That is because dev_put(real_dev) in vlan_dev_free() be called without dev_hold(real_dev) in register_vlan_dev(). It makes the refcnt of real_dev underflow.
Move the dev_hold(real_dev) to vlan_dev_init() which is the call-back of ndo_init(). That makes dev_hold() and dev_put() for vlan's real_dev symmetrical.
Fixes: 563bcbae3ba2 ("net: vlan: fix a UAF in vlan_dev_real_dev()") Reported-by: Petr Machata petrm@nvidia.com Suggested-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Ziyang Xuan william.xuanziyang@huawei.com Link: https://lore.kernel.org/r/20211126015942.2918542-1-william.xuanziyang@huawei... Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- net/8021q/vlan.c | 3 --- net/8021q/vlan_dev.c | 3 +++ 2 files changed, 3 insertions(+), 3 deletions(-)
diff --git a/net/8021q/vlan.c b/net/8021q/vlan.c index cd7c0429cddf8..796d95797ab40 100644 --- a/net/8021q/vlan.c +++ b/net/8021q/vlan.c @@ -177,9 +177,6 @@ int register_vlan_dev(struct net_device *dev, struct netlink_ext_ack *extack) if (err) goto out_unregister_netdev;
- /* Account for reference in struct vlan_dev_priv */ - dev_hold(real_dev); - vlan_stacked_transfer_operstate(real_dev, dev, vlan); linkwatch_fire_event(dev); /* _MUST_ call rfc2863_policy() */
diff --git a/net/8021q/vlan_dev.c b/net/8021q/vlan_dev.c index 415a29d42cdf0..589615ec490bb 100644 --- a/net/8021q/vlan_dev.c +++ b/net/8021q/vlan_dev.c @@ -583,6 +583,9 @@ static int vlan_dev_init(struct net_device *dev) if (!vlan->vlan_pcpu_stats) return -ENOMEM;
+ /* Get vlan's reference to real_dev */ + dev_hold(real_dev); + return 0; }
From: Tony Lu tonylu@linux.alibaba.com
[ Upstream commit bacb6c1e47691cda4a95056c21b5487fb7199fcc ]
When applications call shutdown() with SHUT_RDWR in userspace, smc_close_active() calls kernel_sock_shutdown(), and it is called twice in smc_shutdown().
This fixes this by checking sk_state before do clcsock shutdown, and avoids missing the application's call of smc_shutdown().
Link: https://lore.kernel.org/linux-s390/1f67548e-cbf6-0dce-82b5-10288a4583bd@linu... Fixes: 606a63c9783a ("net/smc: Ensure the active closing peer first closes clcsock") Signed-off-by: Tony Lu tonylu@linux.alibaba.com Reviewed-by: Wen Gu guwen@linux.alibaba.com Acked-by: Karsten Graul kgraul@linux.ibm.com Link: https://lore.kernel.org/r/20211126024134.45693-1-tonylu@linux.alibaba.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- net/smc/af_smc.c | 8 +++++++- 1 file changed, 7 insertions(+), 1 deletion(-)
diff --git a/net/smc/af_smc.c b/net/smc/af_smc.c index 6b0f09c5b195f..5e1493f8deba7 100644 --- a/net/smc/af_smc.c +++ b/net/smc/af_smc.c @@ -1658,8 +1658,10 @@ static __poll_t smc_poll(struct file *file, struct socket *sock, static int smc_shutdown(struct socket *sock, int how) { struct sock *sk = sock->sk; + bool do_shutdown = true; struct smc_sock *smc; int rc = -EINVAL; + int old_state; int rc1 = 0;
smc = smc_sk(sk); @@ -1686,7 +1688,11 @@ static int smc_shutdown(struct socket *sock, int how) } switch (how) { case SHUT_RDWR: /* shutdown in both directions */ + old_state = sk->sk_state; rc = smc_close_active(smc); + if (old_state == SMC_ACTIVE && + sk->sk_state == SMC_PEERCLOSEWAIT1) + do_shutdown = false; break; case SHUT_WR: rc = smc_close_shutdown_write(smc); @@ -1696,7 +1702,7 @@ static int smc_shutdown(struct socket *sock, int how) /* nothing more to do because peer is not involved */ break; } - if (smc->clcsock) + if (do_shutdown && smc->clcsock) rc1 = kernel_sock_shutdown(smc->clcsock, how); /* map sock_shutdown_cmd constants to sk_shutdown value range */ sk->sk_shutdown |= how + 1;
From: Guangbin Huang huangguangbin2@huawei.com
[ Upstream commit 8d2ad993aa05c0768f00c886c9d369cd97a337ac ]
When PF is set to multi-TCs and configured mapping relationship between priorities and TCs, the hardware will active these settings for this PF and its VFs.
In this case when VF just uses one TC and its rx packets contain priority, and if the priority is not mapped to TC0, as other TCs of VF is not valid, hardware always put this kind of packets to the queue 0. It cause this kind of packets of VF can not be used RSS function.
To fix this problem, set tc mode of all unused TCs of VF to the setting of TC0, then rx packet with priority which map to unused TC will be direct to TC0.
Fixes: e2cb1dec9779 ("net: hns3: Add HNS3 VF HCL(Hardware Compatibility Layer) Support") Signed-off-by: Guangbin Huang huangguangbin2@huawei.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_main.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_main.c b/drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_main.c index db2e9dd5681eb..ce6a4e1965e1d 100644 --- a/drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_main.c +++ b/drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_main.c @@ -644,9 +644,9 @@ static int hclgevf_set_rss_tc_mode(struct hclgevf_dev *hdev, u16 rss_size) roundup_size = ilog2(roundup_size);
for (i = 0; i < HCLGEVF_MAX_TC_NUM; i++) { - tc_valid[i] = !!(hdev->hw_tc_map & BIT(i)); + tc_valid[i] = 1; tc_size[i] = roundup_size; - tc_offset[i] = rss_size * i; + tc_offset[i] = (hdev->hw_tc_map & BIT(i)) ? rss_size * i : 0; }
hclgevf_cmd_setup_basic_desc(&desc, HCLGEVF_OPC_RSS_TC_MODE, false);
From: Vladimir Oltean vladimir.oltean@nxp.com
[ Upstream commit 8a075464d1e9317ffae0973dfe538a7511291a06 ]
The ocelot driver, when asked to timestamp all receiving packets, 1588 v1 or NTP, says "nah, here's 1588 v2 for you".
According to this discussion: https://patchwork.kernel.org/project/netdevbpf/patch/20211104133204.19757-8-... drivers that downgrade from a wider request to a narrower response (or even a response where the intersection with the request is empty) are buggy, and should return -ERANGE instead. This patch fixes that.
Fixes: 4e3b0468e6d7 ("net: mscc: PTP Hardware Clock (PHC) support") Suggested-by: Richard Cochran richardcochran@gmail.com Signed-off-by: Vladimir Oltean vladimir.oltean@nxp.com Acked-by: Richard Cochran richardcochran@gmail.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/mscc/ocelot.c | 6 ------ 1 file changed, 6 deletions(-)
diff --git a/drivers/net/ethernet/mscc/ocelot.c b/drivers/net/ethernet/mscc/ocelot.c index 6030c90d50ccb..3f83647c5802e 100644 --- a/drivers/net/ethernet/mscc/ocelot.c +++ b/drivers/net/ethernet/mscc/ocelot.c @@ -1024,12 +1024,6 @@ static int ocelot_hwstamp_set(struct ocelot_port *port, struct ifreq *ifr) switch (cfg.rx_filter) { case HWTSTAMP_FILTER_NONE: break; - case HWTSTAMP_FILTER_ALL: - case HWTSTAMP_FILTER_SOME: - case HWTSTAMP_FILTER_PTP_V1_L4_EVENT: - case HWTSTAMP_FILTER_PTP_V1_L4_SYNC: - case HWTSTAMP_FILTER_PTP_V1_L4_DELAY_REQ: - case HWTSTAMP_FILTER_NTP_ALL: case HWTSTAMP_FILTER_PTP_V2_L4_EVENT: case HWTSTAMP_FILTER_PTP_V2_L4_SYNC: case HWTSTAMP_FILTER_PTP_V2_L4_DELAY_REQ:
From: Vladimir Oltean vladimir.oltean@nxp.com
[ Upstream commit c49a35eedfef08bffd46b53c25dbf9d6016a86ff ]
The driver doesn't support RX timestamping for non-PTP packets, but it declares that it does. Restrict the reported RX filters to PTP v2 over L2 and over L4.
Fixes: 4e3b0468e6d7 ("net: mscc: PTP Hardware Clock (PHC) support") Signed-off-by: Vladimir Oltean vladimir.oltean@nxp.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/mscc/ocelot.c | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/mscc/ocelot.c b/drivers/net/ethernet/mscc/ocelot.c index 3f83647c5802e..bf7832b34a000 100644 --- a/drivers/net/ethernet/mscc/ocelot.c +++ b/drivers/net/ethernet/mscc/ocelot.c @@ -1183,7 +1183,10 @@ static int ocelot_get_ts_info(struct net_device *dev, SOF_TIMESTAMPING_RAW_HARDWARE; info->tx_types = BIT(HWTSTAMP_TX_OFF) | BIT(HWTSTAMP_TX_ON) | BIT(HWTSTAMP_TX_ONESTEP_SYNC); - info->rx_filters = BIT(HWTSTAMP_FILTER_NONE) | BIT(HWTSTAMP_FILTER_ALL); + info->rx_filters = BIT(HWTSTAMP_FILTER_NONE) | + BIT(HWTSTAMP_FILTER_PTP_V2_EVENT) | + BIT(HWTSTAMP_FILTER_PTP_V2_L2_EVENT) | + BIT(HWTSTAMP_FILTER_PTP_V2_L4_EVENT);
return 0; }
From: Weichao Guo guoweichao@oppo.com
[ Upstream commit 6663b138ded1a59e630c9e605e42aa7fde490cdc ]
Inconsistent node block will cause a file fail to open or read, which could make the user process crashes or stucks. Let's mark SBI_NEED_FSCK flag to trigger a fix at next fsck time. After unlinking the corrupted file, the user process could regenerate a new one and work correctly.
Signed-off-by: Weichao Guo guoweichao@oppo.com Reviewed-by: Chao Yu chao@kernel.org Signed-off-by: Jaegeuk Kim jaegeuk@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- fs/f2fs/node.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/fs/f2fs/node.c b/fs/f2fs/node.c index 4cb182c20eedd..0cd1d51dde06d 100644 --- a/fs/f2fs/node.c +++ b/fs/f2fs/node.c @@ -1385,6 +1385,7 @@ static struct page *__get_node_page(struct f2fs_sb_info *sbi, pgoff_t nid, nid, nid_of_node(page), ino_of_node(page), ofs_of_node(page), cpver_of_node(page), next_blkaddr_of_node(page)); + set_sbi_flag(sbi, SBI_NEED_FSCK); err = -EINVAL; out_err: ClearPageUptodate(page);
From: Steve French stfrench@microsoft.com
[ Upstream commit 71e6864eacbef0b2645ca043cdfbac272cb6cea3 ]
Linux allows doing a flush/fsync on a file open for read-only, but the protocol does not allow that. If the file passed in on the flush is read-only try to find a writeable handle for the same inode, if that is not possible skip sending the fsync call to the server to avoid breaking the apps.
Reported-by: Julian Sikorski belegdol@gmail.com Tested-by: Julian Sikorski belegdol@gmail.com Suggested-by: Jeremy Allison jra@samba.org Reviewed-by: Paulo Alcantara (SUSE) pc@cjr.nz Signed-off-by: Steve French stfrench@microsoft.com Signed-off-by: Sasha Levin sashal@kernel.org --- fs/cifs/file.c | 35 +++++++++++++++++++++++++++++------ 1 file changed, 29 insertions(+), 6 deletions(-)
diff --git a/fs/cifs/file.c b/fs/cifs/file.c index a9746af5a44db..03c85beecec10 100644 --- a/fs/cifs/file.c +++ b/fs/cifs/file.c @@ -2577,12 +2577,23 @@ int cifs_strict_fsync(struct file *file, loff_t start, loff_t end, tcon = tlink_tcon(smbfile->tlink); if (!(cifs_sb->mnt_cifs_flags & CIFS_MOUNT_NOSSYNC)) { server = tcon->ses->server; - if (server->ops->flush) - rc = server->ops->flush(xid, tcon, &smbfile->fid); - else + if (server->ops->flush == NULL) { rc = -ENOSYS; + goto strict_fsync_exit; + } + + if ((OPEN_FMODE(smbfile->f_flags) & FMODE_WRITE) == 0) { + smbfile = find_writable_file(CIFS_I(inode), FIND_WR_ANY); + if (smbfile) { + rc = server->ops->flush(xid, tcon, &smbfile->fid); + cifsFileInfo_put(smbfile); + } else + cifs_dbg(FYI, "ignore fsync for file not open for write\n"); + } else + rc = server->ops->flush(xid, tcon, &smbfile->fid); }
+strict_fsync_exit: free_xid(xid); return rc; } @@ -2594,6 +2605,7 @@ int cifs_fsync(struct file *file, loff_t start, loff_t end, int datasync) struct cifs_tcon *tcon; struct TCP_Server_Info *server; struct cifsFileInfo *smbfile = file->private_data; + struct inode *inode = file_inode(file); struct cifs_sb_info *cifs_sb = CIFS_FILE_SB(file);
rc = file_write_and_wait_range(file, start, end); @@ -2608,12 +2620,23 @@ int cifs_fsync(struct file *file, loff_t start, loff_t end, int datasync) tcon = tlink_tcon(smbfile->tlink); if (!(cifs_sb->mnt_cifs_flags & CIFS_MOUNT_NOSSYNC)) { server = tcon->ses->server; - if (server->ops->flush) - rc = server->ops->flush(xid, tcon, &smbfile->fid); - else + if (server->ops->flush == NULL) { rc = -ENOSYS; + goto fsync_exit; + } + + if ((OPEN_FMODE(smbfile->f_flags) & FMODE_WRITE) == 0) { + smbfile = find_writable_file(CIFS_I(inode), FIND_WR_ANY); + if (smbfile) { + rc = server->ops->flush(xid, tcon, &smbfile->fid); + cifsFileInfo_put(smbfile); + } else + cifs_dbg(FYI, "ignore fsync for file not open for write\n"); + } else + rc = server->ops->flush(xid, tcon, &smbfile->fid); }
+fsync_exit: free_xid(xid); return rc; }
From: Stefano Garzarella sgarzare@redhat.com
commit 49d8c5ffad07ca014cfae72a1b9b8c52b6ad9cb8 upstream.
The "used length" reported by calling vhost_add_used() must be the number of bytes written by the device (using "in" buffers).
In vhost_vsock_handle_tx_kick() the device only reads the guest buffers (they are all "out" buffers), without writing anything, so we must pass 0 as "used length" to comply virtio spec.
Fixes: 433fc58e6bf2 ("VSOCK: Introduce vhost_vsock.ko") Cc: stable@vger.kernel.org Reported-by: Halil Pasic pasic@linux.ibm.com Suggested-by: Jason Wang jasowang@redhat.com Signed-off-by: Stefano Garzarella sgarzare@redhat.com Link: https://lore.kernel.org/r/20211122163525.294024-2-sgarzare@redhat.com Signed-off-by: Michael S. Tsirkin mst@redhat.com Reviewed-by: Stefan Hajnoczi stefanha@redhat.com Reviewed-by: Halil Pasic pasic@linux.ibm.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/vhost/vsock.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/vhost/vsock.c +++ b/drivers/vhost/vsock.c @@ -491,7 +491,7 @@ static void vhost_vsock_handle_tx_kick(s virtio_transport_free_pkt(pkt);
len += sizeof(pkt->hdr); - vhost_add_used(vq, head, len); + vhost_add_used(vq, head, 0); total_len += len; added = true; } while(likely(!vhost_exceeds_weight(vq, ++pkts, total_len)));
From: Steven Rostedt (VMware) rostedt@goodmis.org
commit 6cb206508b621a9a0a2c35b60540e399225c8243 upstream.
When pid filtering is activated in an instance, all of the events trace files for that instance has the PID_FILTER flag set. This determines whether or not pid filtering needs to be done on the event, otherwise the event is executed as normal.
If pid filtering is enabled when an event is created (via a dynamic event or modules), its flag is not updated to reflect the current state, and the events are not filtered properly.
Cc: stable@vger.kernel.org Fixes: 3fdaf80f4a836 ("tracing: Implement event pid filtering") Signed-off-by: Steven Rostedt (VMware) rostedt@goodmis.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- kernel/trace/trace_events.c | 7 +++++++ 1 file changed, 7 insertions(+)
--- a/kernel/trace/trace_events.c +++ b/kernel/trace/trace_events.c @@ -2247,12 +2247,19 @@ static struct trace_event_file * trace_create_new_event(struct trace_event_call *call, struct trace_array *tr) { + struct trace_pid_list *pid_list; struct trace_event_file *file;
file = kmem_cache_alloc(file_cachep, GFP_TRACE); if (!file) return NULL;
+ pid_list = rcu_dereference_protected(tr->filtered_pids, + lockdep_is_held(&event_mutex)); + + if (pid_list) + file->flags |= EVENT_FILE_FL_PID_FILTER; + file->event_call = call; file->tr = tr; atomic_set(&file->sm_ref, 0);
From: David Hildenbrand david@redhat.com
commit fe3d10024073f06f04c74b9674bd71ccc1d787cf upstream.
We should not walk/touch page tables outside of VMA boundaries when holding only the mmap sem in read mode. Evil user space can modify the VMA layout just before this function runs and e.g., trigger races with page table removal code since commit dd2283f2605e ("mm: mmap: zap pages with read mmap_sem in munmap"). gfn_to_hva() will only translate using KVM memory regions, but won't validate the VMA.
Further, we should not allocate page tables outside of VMA boundaries: if evil user space decides to map hugetlbfs to these ranges, bad things will happen because we suddenly have PTE or PMD page tables where we shouldn't have them.
Similarly, we have to check if we suddenly find a hugetlbfs VMA, before calling get_locked_pte().
Fixes: 2d42f9477320 ("s390/kvm: Add PGSTE manipulation functions") Signed-off-by: David Hildenbrand david@redhat.com Reviewed-by: Claudio Imbrenda imbrenda@linux.ibm.com Acked-by: Heiko Carstens hca@linux.ibm.com Link: https://lore.kernel.org/r/20210909162248.14969-4-david@redhat.com Signed-off-by: Christian Borntraeger borntraeger@de.ibm.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- arch/s390/mm/pgtable.c | 13 +++++++++++++ 1 file changed, 13 insertions(+)
--- a/arch/s390/mm/pgtable.c +++ b/arch/s390/mm/pgtable.c @@ -970,6 +970,7 @@ EXPORT_SYMBOL(get_guest_storage_key); int pgste_perform_essa(struct mm_struct *mm, unsigned long hva, int orc, unsigned long *oldpte, unsigned long *oldpgste) { + struct vm_area_struct *vma; unsigned long pgstev; spinlock_t *ptl; pgste_t pgste; @@ -979,6 +980,10 @@ int pgste_perform_essa(struct mm_struct WARN_ON_ONCE(orc > ESSA_MAX); if (unlikely(orc > ESSA_MAX)) return -EINVAL; + + vma = find_vma(mm, hva); + if (!vma || hva < vma->vm_start || is_vm_hugetlb_page(vma)) + return -EFAULT; ptep = get_locked_pte(mm, hva, &ptl); if (unlikely(!ptep)) return -EFAULT; @@ -1071,10 +1076,14 @@ EXPORT_SYMBOL(pgste_perform_essa); int set_pgste_bits(struct mm_struct *mm, unsigned long hva, unsigned long bits, unsigned long value) { + struct vm_area_struct *vma; spinlock_t *ptl; pgste_t new; pte_t *ptep;
+ vma = find_vma(mm, hva); + if (!vma || hva < vma->vm_start || is_vm_hugetlb_page(vma)) + return -EFAULT; ptep = get_locked_pte(mm, hva, &ptl); if (unlikely(!ptep)) return -EFAULT; @@ -1099,9 +1108,13 @@ EXPORT_SYMBOL(set_pgste_bits); */ int get_pgste(struct mm_struct *mm, unsigned long hva, unsigned long *pgstep) { + struct vm_area_struct *vma; spinlock_t *ptl; pte_t *ptep;
+ vma = find_vma(mm, hva); + if (!vma || hva < vma->vm_start || is_vm_hugetlb_page(vma)) + return -EFAULT; ptep = get_locked_pte(mm, hva, &ptl); if (unlikely(!ptep)) return -EFAULT;
From: Alexander Mikhalitsyn alexander.mikhalitsyn@virtuozzo.com
commit 85b6d24646e4125c591639841169baa98a2da503 upstream.
Currently, the exit_shm() function not designed to work properly when task->sysvshm.shm_clist holds shm objects from different IPC namespaces.
This is a real pain when sysctl kernel.shm_rmid_forced = 1, because it leads to use-after-free (reproducer exists).
This is an attempt to fix the problem by extending exit_shm mechanism to handle shm's destroy from several IPC ns'es.
To achieve that we do several things:
1. add a namespace (non-refcounted) pointer to the struct shmid_kernel
2. during new shm object creation (newseg()/shmget syscall) we initialize this pointer by current task IPC ns
3. exit_shm() fully reworked such that it traverses over all shp's in task->sysvshm.shm_clist and gets IPC namespace not from current task as it was before but from shp's object itself, then call shm_destroy(shp, ns).
Note: We need to be really careful here, because as it was said before (1), our pointer to IPC ns non-refcnt'ed. To be on the safe side we using special helper get_ipc_ns_not_zero() which allows to get IPC ns refcounter only if IPC ns not in the "state of destruction".
Q/A
Q: Why can we access shp->ns memory using non-refcounted pointer? A: Because shp object lifetime is always shorther than IPC namespace lifetime, so, if we get shp object from the task->sysvshm.shm_clist while holding task_lock(task) nobody can steal our namespace.
Q: Does this patch change semantics of unshare/setns/clone syscalls? A: No. It's just fixes non-covered case when process may leave IPC namespace without getting task->sysvshm.shm_clist list cleaned up.
Link: https://lkml.kernel.org/r/67bb03e5-f79c-1815-e2bf-949c67047418@colorfullife.... Link: https://lkml.kernel.org/r/20211109151501.4921-1-manfred@colorfullife.com Fixes: ab602f79915 ("shm: make exit_shm work proportional to task activity") Co-developed-by: Manfred Spraul manfred@colorfullife.com Signed-off-by: Manfred Spraul manfred@colorfullife.com Signed-off-by: Alexander Mikhalitsyn alexander.mikhalitsyn@virtuozzo.com Cc: "Eric W. Biederman" ebiederm@xmission.com Cc: Davidlohr Bueso dave@stgolabs.net Cc: Greg KH gregkh@linuxfoundation.org Cc: Andrei Vagin avagin@gmail.com Cc: Pavel Tikhomirov ptikhomirov@virtuozzo.com Cc: Vasily Averin vvs@virtuozzo.com Cc: stable@vger.kernel.org Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- include/linux/ipc_namespace.h | 15 +++ include/linux/sched/task.h | 2 ipc/shm.c | 189 +++++++++++++++++++++++++++++++----------- 3 files changed, 159 insertions(+), 47 deletions(-)
--- a/include/linux/ipc_namespace.h +++ b/include/linux/ipc_namespace.h @@ -130,6 +130,16 @@ static inline struct ipc_namespace *get_ return ns; }
+static inline struct ipc_namespace *get_ipc_ns_not_zero(struct ipc_namespace *ns) +{ + if (ns) { + if (refcount_inc_not_zero(&ns->count)) + return ns; + } + + return NULL; +} + extern void put_ipc_ns(struct ipc_namespace *ns); #else static inline struct ipc_namespace *copy_ipcs(unsigned long flags, @@ -145,6 +155,11 @@ static inline struct ipc_namespace *get_ { return ns; } + +static inline struct ipc_namespace *get_ipc_ns_not_zero(struct ipc_namespace *ns) +{ + return ns; +}
static inline void put_ipc_ns(struct ipc_namespace *ns) { --- a/include/linux/sched/task.h +++ b/include/linux/sched/task.h @@ -157,7 +157,7 @@ static inline struct vm_struct *task_sta * Protects ->fs, ->files, ->mm, ->group_info, ->comm, keyring * subscriptions and synchronises with wait4(). Also used in procfs. Also * pins the final release of task.io_context. Also protects ->cpuset and - * ->cgroup.subsys[]. And ->vfork_done. + * ->cgroup.subsys[]. And ->vfork_done. And ->sysvshm.shm_clist. * * Nests both inside and outside of read_lock(&tasklist_lock). * It must not be nested with write_lock_irq(&tasklist_lock), --- a/ipc/shm.c +++ b/ipc/shm.c @@ -62,9 +62,18 @@ struct shmid_kernel /* private to the ke struct pid *shm_lprid; struct user_struct *mlock_user;
- /* The task created the shm object. NULL if the task is dead. */ + /* + * The task created the shm object, for + * task_lock(shp->shm_creator) + */ struct task_struct *shm_creator; - struct list_head shm_clist; /* list by creator */ + + /* + * List by creator. task_lock(->shm_creator) required for read/write. + * If list_empty(), then the creator is dead already. + */ + struct list_head shm_clist; + struct ipc_namespace *ns; } __randomize_layout;
/* shm_mode upper byte flags */ @@ -115,6 +124,7 @@ static void do_shm_rmid(struct ipc_names struct shmid_kernel *shp;
shp = container_of(ipcp, struct shmid_kernel, shm_perm); + WARN_ON(ns != shp->ns);
if (shp->shm_nattch) { shp->shm_perm.mode |= SHM_DEST; @@ -225,10 +235,43 @@ static void shm_rcu_free(struct rcu_head kvfree(shp); }
-static inline void shm_rmid(struct ipc_namespace *ns, struct shmid_kernel *s) +/* + * It has to be called with shp locked. + * It must be called before ipc_rmid() + */ +static inline void shm_clist_rm(struct shmid_kernel *shp) +{ + struct task_struct *creator; + + /* ensure that shm_creator does not disappear */ + rcu_read_lock(); + + /* + * A concurrent exit_shm may do a list_del_init() as well. + * Just do nothing if exit_shm already did the work + */ + if (!list_empty(&shp->shm_clist)) { + /* + * shp->shm_creator is guaranteed to be valid *only* + * if shp->shm_clist is not empty. + */ + creator = shp->shm_creator; + + task_lock(creator); + /* + * list_del_init() is a nop if the entry was already removed + * from the list. + */ + list_del_init(&shp->shm_clist); + task_unlock(creator); + } + rcu_read_unlock(); +} + +static inline void shm_rmid(struct shmid_kernel *s) { - list_del(&s->shm_clist); - ipc_rmid(&shm_ids(ns), &s->shm_perm); + shm_clist_rm(s); + ipc_rmid(&shm_ids(s->ns), &s->shm_perm); }
@@ -283,7 +326,7 @@ static void shm_destroy(struct ipc_names shm_file = shp->shm_file; shp->shm_file = NULL; ns->shm_tot -= (shp->shm_segsz + PAGE_SIZE - 1) >> PAGE_SHIFT; - shm_rmid(ns, shp); + shm_rmid(shp); shm_unlock(shp); if (!is_file_hugepages(shm_file)) shmem_lock(shm_file, 0, shp->mlock_user); @@ -306,10 +349,10 @@ static void shm_destroy(struct ipc_names * * 2) sysctl kernel.shm_rmid_forced is set to 1. */ -static bool shm_may_destroy(struct ipc_namespace *ns, struct shmid_kernel *shp) +static bool shm_may_destroy(struct shmid_kernel *shp) { return (shp->shm_nattch == 0) && - (ns->shm_rmid_forced || + (shp->ns->shm_rmid_forced || (shp->shm_perm.mode & SHM_DEST)); }
@@ -340,7 +383,7 @@ static void shm_close(struct vm_area_str ipc_update_pid(&shp->shm_lprid, task_tgid(current)); shp->shm_dtim = ktime_get_real_seconds(); shp->shm_nattch--; - if (shm_may_destroy(ns, shp)) + if (shm_may_destroy(shp)) shm_destroy(ns, shp); else shm_unlock(shp); @@ -361,10 +404,10 @@ static int shm_try_destroy_orphaned(int * * As shp->* are changed under rwsem, it's safe to skip shp locking. */ - if (shp->shm_creator != NULL) + if (!list_empty(&shp->shm_clist)) return 0;
- if (shm_may_destroy(ns, shp)) { + if (shm_may_destroy(shp)) { shm_lock_by_ptr(shp); shm_destroy(ns, shp); } @@ -382,48 +425,97 @@ void shm_destroy_orphaned(struct ipc_nam /* Locking assumes this will only be called with task == current */ void exit_shm(struct task_struct *task) { - struct ipc_namespace *ns = task->nsproxy->ipc_ns; - struct shmid_kernel *shp, *n; + for (;;) { + struct shmid_kernel *shp; + struct ipc_namespace *ns;
- if (list_empty(&task->sysvshm.shm_clist)) - return; + task_lock(task); + + if (list_empty(&task->sysvshm.shm_clist)) { + task_unlock(task); + break; + } + + shp = list_first_entry(&task->sysvshm.shm_clist, struct shmid_kernel, + shm_clist);
- /* - * If kernel.shm_rmid_forced is not set then only keep track of - * which shmids are orphaned, so that a later set of the sysctl - * can clean them up. - */ - if (!ns->shm_rmid_forced) { - down_read(&shm_ids(ns).rwsem); - list_for_each_entry(shp, &task->sysvshm.shm_clist, shm_clist) - shp->shm_creator = NULL; /* - * Only under read lock but we are only called on current - * so no entry on the list will be shared. + * 1) Get pointer to the ipc namespace. It is worth to say + * that this pointer is guaranteed to be valid because + * shp lifetime is always shorter than namespace lifetime + * in which shp lives. + * We taken task_lock it means that shp won't be freed. */ - list_del(&task->sysvshm.shm_clist); - up_read(&shm_ids(ns).rwsem); - return; - } + ns = shp->ns;
- /* - * Destroy all already created segments, that were not yet mapped, - * and mark any mapped as orphan to cover the sysctl toggling. - * Destroy is skipped if shm_may_destroy() returns false. - */ - down_write(&shm_ids(ns).rwsem); - list_for_each_entry_safe(shp, n, &task->sysvshm.shm_clist, shm_clist) { - shp->shm_creator = NULL; + /* + * 2) If kernel.shm_rmid_forced is not set then only keep track of + * which shmids are orphaned, so that a later set of the sysctl + * can clean them up. + */ + if (!ns->shm_rmid_forced) + goto unlink_continue;
- if (shm_may_destroy(ns, shp)) { - shm_lock_by_ptr(shp); - shm_destroy(ns, shp); + /* + * 3) get a reference to the namespace. + * The refcount could be already 0. If it is 0, then + * the shm objects will be free by free_ipc_work(). + */ + ns = get_ipc_ns_not_zero(ns); + if (!ns) { +unlink_continue: + list_del_init(&shp->shm_clist); + task_unlock(task); + continue; } - }
- /* Remove the list head from any segments still attached. */ - list_del(&task->sysvshm.shm_clist); - up_write(&shm_ids(ns).rwsem); + /* + * 4) get a reference to shp. + * This cannot fail: shm_clist_rm() is called before + * ipc_rmid(), thus the refcount cannot be 0. + */ + WARN_ON(!ipc_rcu_getref(&shp->shm_perm)); + + /* + * 5) unlink the shm segment from the list of segments + * created by current. + * This must be done last. After unlinking, + * only the refcounts obtained above prevent IPC_RMID + * from destroying the segment or the namespace. + */ + list_del_init(&shp->shm_clist); + + task_unlock(task); + + /* + * 6) we have all references + * Thus lock & if needed destroy shp. + */ + down_write(&shm_ids(ns).rwsem); + shm_lock_by_ptr(shp); + /* + * rcu_read_lock was implicitly taken in shm_lock_by_ptr, it's + * safe to call ipc_rcu_putref here + */ + ipc_rcu_putref(&shp->shm_perm, shm_rcu_free); + + if (ipc_valid_object(&shp->shm_perm)) { + if (shm_may_destroy(shp)) + shm_destroy(ns, shp); + else + shm_unlock(shp); + } else { + /* + * Someone else deleted the shp from namespace + * idr/kht while we have waited. + * Just unlock and continue. + */ + shm_unlock(shp); + } + + up_write(&shm_ids(ns).rwsem); + put_ipc_ns(ns); /* paired with get_ipc_ns_not_zero */ + } }
static vm_fault_t shm_fault(struct vm_fault *vmf) @@ -680,7 +772,11 @@ static int newseg(struct ipc_namespace * if (error < 0) goto no_id;
+ shp->ns = ns; + + task_lock(current); list_add(&shp->shm_clist, ¤t->sysvshm.shm_clist); + task_unlock(current);
/* * shmid gets reported as "inode#" in /proc/pid/maps. @@ -1575,7 +1671,8 @@ out_nattch: down_write(&shm_ids(ns).rwsem); shp = shm_lock(ns, shmid); shp->shm_nattch--; - if (shm_may_destroy(ns, shp)) + + if (shm_may_destroy(shp)) shm_destroy(ns, shp); else shm_unlock(shp);
From: Lin Ma linma@zju.edu.cn
commit 48b71a9e66c2eab60564b1b1c85f4928ed04e406 upstream.
There are two sites that calls queue_work() after the destroy_workqueue() and lead to possible UAF.
The first site is nci_send_cmd(), which can happen after the nci_close_device as below
nfcmrvl_nci_unregister_dev | nfc_genl_dev_up nci_close_device | flush_workqueue | del_timer_sync | nci_unregister_device | nfc_get_device destroy_workqueue | nfc_dev_up nfc_unregister_device | nci_dev_up device_del | nci_open_device | __nci_request | nci_send_cmd | queue_work !!!
Another site is nci_cmd_timer, awaked by the nci_cmd_work from the nci_send_cmd.
... | ... nci_unregister_device | queue_work destroy_workqueue | nfc_unregister_device | ... device_del | nci_cmd_work | mod_timer | ... | nci_cmd_timer | queue_work !!!
For the above two UAF, the root cause is that the nfc_dev_up can race between the nci_unregister_device routine. Therefore, this patch introduce NCI_UNREG flag to easily eliminate the possible race. In addition, the mutex_lock in nci_close_device can act as a barrier.
Signed-off-by: Lin Ma linma@zju.edu.cn Fixes: 6a2968aaf50c ("NFC: basic NCI protocol implementation") Reviewed-by: Jakub Kicinski kuba@kernel.org Reviewed-by: Krzysztof Kozlowski krzysztof.kozlowski@canonical.com Link: https://lore.kernel.org/r/20211116152732.19238-1-linma@zju.edu.cn Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- include/net/nfc/nci_core.h | 1 + net/nfc/nci/core.c | 19 +++++++++++++++++-- 2 files changed, 18 insertions(+), 2 deletions(-)
--- a/include/net/nfc/nci_core.h +++ b/include/net/nfc/nci_core.h @@ -30,6 +30,7 @@ enum nci_flag { NCI_UP, NCI_DATA_EXCHANGE, NCI_DATA_EXCHANGE_TO, + NCI_UNREG, };
/* NCI device states */ --- a/net/nfc/nci/core.c +++ b/net/nfc/nci/core.c @@ -473,6 +473,11 @@ static int nci_open_device(struct nci_de
mutex_lock(&ndev->req_lock);
+ if (test_bit(NCI_UNREG, &ndev->flags)) { + rc = -ENODEV; + goto done; + } + if (test_bit(NCI_UP, &ndev->flags)) { rc = -EALREADY; goto done; @@ -536,6 +541,10 @@ done: static int nci_close_device(struct nci_dev *ndev) { nci_req_cancel(ndev, ENODEV); + + /* This mutex needs to be held as a barrier for + * caller nci_unregister_device + */ mutex_lock(&ndev->req_lock);
if (!test_and_clear_bit(NCI_UP, &ndev->flags)) { @@ -573,8 +582,8 @@ static int nci_close_device(struct nci_d /* Flush cmd wq */ flush_workqueue(ndev->cmd_wq);
- /* Clear flags */ - ndev->flags = 0; + /* Clear flags except NCI_UNREG */ + ndev->flags &= BIT(NCI_UNREG);
mutex_unlock(&ndev->req_lock);
@@ -1256,6 +1265,12 @@ void nci_unregister_device(struct nci_de { struct nci_conn_info *conn_info, *n;
+ /* This set_bit is not protected with specialized barrier, + * However, it is fine because the mutex_lock(&ndev->req_lock); + * in nci_close_device() will help to emit one. + */ + set_bit(NCI_UNREG, &ndev->flags); + nci_close_device(ndev);
destroy_workqueue(ndev->cmd_wq);
From: Miklos Szeredi mszeredi@redhat.com
commit 473441720c8616dfaf4451f9c7ea14f0eb5e5d65 upstream.
Checking buf->flags should be done before the pipe_buf_release() is called on the pipe buffer, since releasing the buffer might modify the flags.
This is exactly what page_cache_pipe_buf_release() does, and which results in the same VM_BUG_ON_PAGE(PageLRU(page)) that the original patch was trying to fix.
Reported-by: Justin Forbes jmforbes@linuxtx.org Fixes: 712a951025c0 ("fuse: fix page stealing") Cc: stable@vger.kernel.org # v2.6.35 Signed-off-by: Miklos Szeredi mszeredi@redhat.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/fuse/dev.c | 10 +++++----- 1 file changed, 5 insertions(+), 5 deletions(-)
--- a/fs/fuse/dev.c +++ b/fs/fuse/dev.c @@ -839,17 +839,17 @@ static int fuse_try_move_page(struct fus goto out_put_old; }
+ get_page(newpage); + + if (!(buf->flags & PIPE_BUF_FLAG_LRU)) + lru_cache_add_file(newpage); + /* * Release while we have extra ref on stolen page. Otherwise * anon_pipe_buf_release() might think the page can be reused. */ pipe_buf_release(cs->pipe, buf);
- get_page(newpage); - - if (!(buf->flags & PIPE_BUF_FLAG_LRU)) - lru_cache_add_file(newpage); - err = 0; spin_lock(&cs->req->waitq.lock); if (test_bit(FR_ABORTED, &cs->req->flags))
From: Juergen Gross jgross@suse.com
commit 629a5d87e26fe96bcaab44cbb81f5866af6f7008 upstream.
Sync include/xen/interface/io/ring.h with Xen's newest version in order to get the RING_COPY_RESPONSE() and RING_RESPONSE_PROD_OVERFLOW() macros.
Note that this will correct the wrong license info by adding the missing original copyright notice.
Signed-off-by: Juergen Gross jgross@suse.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- include/xen/interface/io/ring.h | 307 +++++++++++++++++++++------------------- 1 file changed, 165 insertions(+), 142 deletions(-)
--- a/include/xen/interface/io/ring.h +++ b/include/xen/interface/io/ring.h @@ -1,21 +1,53 @@ -/* SPDX-License-Identifier: GPL-2.0 */ /****************************************************************************** * ring.h * * Shared producer-consumer ring macros. * + * Permission is hereby granted, free of charge, to any person obtaining a copy + * of this software and associated documentation files (the "Software"), to + * deal in the Software without restriction, including without limitation the + * rights to use, copy, modify, merge, publish, distribute, sublicense, and/or + * sell copies of the Software, and to permit persons to whom the Software is + * furnished to do so, subject to the following conditions: + * + * The above copyright notice and this permission notice shall be included in + * all copies or substantial portions of the Software. + * + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE + * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER + * DEALINGS IN THE SOFTWARE. + * * Tim Deegan and Andrew Warfield November 2004. */
#ifndef __XEN_PUBLIC_IO_RING_H__ #define __XEN_PUBLIC_IO_RING_H__
+/* + * When #include'ing this header, you need to provide the following + * declaration upfront: + * - standard integers types (uint8_t, uint16_t, etc) + * They are provided by stdint.h of the standard headers. + * + * In addition, if you intend to use the FLEX macros, you also need to + * provide the following, before invoking the FLEX macros: + * - size_t + * - memcpy + * - grant_ref_t + * These declarations are provided by string.h of the standard headers, + * and grant_table.h from the Xen public headers. + */ + #include <xen/interface/grant_table.h>
typedef unsigned int RING_IDX;
/* Round a 32-bit unsigned constant down to the nearest power of two. */ -#define __RD2(_x) (((_x) & 0x00000002) ? 0x2 : ((_x) & 0x1)) +#define __RD2(_x) (((_x) & 0x00000002) ? 0x2 : ((_x) & 0x1)) #define __RD4(_x) (((_x) & 0x0000000c) ? __RD2((_x)>>2)<<2 : __RD2(_x)) #define __RD8(_x) (((_x) & 0x000000f0) ? __RD4((_x)>>4)<<4 : __RD4(_x)) #define __RD16(_x) (((_x) & 0x0000ff00) ? __RD8((_x)>>8)<<8 : __RD8(_x)) @@ -27,82 +59,79 @@ typedef unsigned int RING_IDX; * A ring contains as many entries as will fit, rounded down to the nearest * power of two (so we can mask with (size-1) to loop around). */ -#define __CONST_RING_SIZE(_s, _sz) \ - (__RD32(((_sz) - offsetof(struct _s##_sring, ring)) / \ - sizeof(((struct _s##_sring *)0)->ring[0]))) - +#define __CONST_RING_SIZE(_s, _sz) \ + (__RD32(((_sz) - offsetof(struct _s##_sring, ring)) / \ + sizeof(((struct _s##_sring *)0)->ring[0]))) /* * The same for passing in an actual pointer instead of a name tag. */ -#define __RING_SIZE(_s, _sz) \ - (__RD32(((_sz) - (long)&(_s)->ring + (long)(_s)) / sizeof((_s)->ring[0]))) +#define __RING_SIZE(_s, _sz) \ + (__RD32(((_sz) - (long)(_s)->ring + (long)(_s)) / sizeof((_s)->ring[0])))
/* * Macros to make the correct C datatypes for a new kind of ring. * * To make a new ring datatype, you need to have two message structures, - * let's say struct request, and struct response already defined. + * let's say request_t, and response_t already defined. * * In a header where you want the ring datatype declared, you then do: * - * DEFINE_RING_TYPES(mytag, struct request, struct response); + * DEFINE_RING_TYPES(mytag, request_t, response_t); * * These expand out to give you a set of types, as you can see below. * The most important of these are: * - * struct mytag_sring - The shared ring. - * struct mytag_front_ring - The 'front' half of the ring. - * struct mytag_back_ring - The 'back' half of the ring. + * mytag_sring_t - The shared ring. + * mytag_front_ring_t - The 'front' half of the ring. + * mytag_back_ring_t - The 'back' half of the ring. * * To initialize a ring in your code you need to know the location and size * of the shared memory area (PAGE_SIZE, for instance). To initialise * the front half: * - * struct mytag_front_ring front_ring; - * SHARED_RING_INIT((struct mytag_sring *)shared_page); - * FRONT_RING_INIT(&front_ring, (struct mytag_sring *)shared_page, - * PAGE_SIZE); + * mytag_front_ring_t front_ring; + * SHARED_RING_INIT((mytag_sring_t *)shared_page); + * FRONT_RING_INIT(&front_ring, (mytag_sring_t *)shared_page, PAGE_SIZE); * * Initializing the back follows similarly (note that only the front * initializes the shared ring): * - * struct mytag_back_ring back_ring; - * BACK_RING_INIT(&back_ring, (struct mytag_sring *)shared_page, - * PAGE_SIZE); + * mytag_back_ring_t back_ring; + * BACK_RING_INIT(&back_ring, (mytag_sring_t *)shared_page, PAGE_SIZE); */
-#define DEFINE_RING_TYPES(__name, __req_t, __rsp_t) \ - \ -/* Shared ring entry */ \ -union __name##_sring_entry { \ - __req_t req; \ - __rsp_t rsp; \ -}; \ - \ -/* Shared ring page */ \ -struct __name##_sring { \ - RING_IDX req_prod, req_event; \ - RING_IDX rsp_prod, rsp_event; \ - uint8_t pad[48]; \ - union __name##_sring_entry ring[1]; /* variable-length */ \ -}; \ - \ -/* "Front" end's private variables */ \ -struct __name##_front_ring { \ - RING_IDX req_prod_pvt; \ - RING_IDX rsp_cons; \ - unsigned int nr_ents; \ - struct __name##_sring *sring; \ -}; \ - \ -/* "Back" end's private variables */ \ -struct __name##_back_ring { \ - RING_IDX rsp_prod_pvt; \ - RING_IDX req_cons; \ - unsigned int nr_ents; \ - struct __name##_sring *sring; \ -}; - +#define DEFINE_RING_TYPES(__name, __req_t, __rsp_t) \ + \ +/* Shared ring entry */ \ +union __name##_sring_entry { \ + __req_t req; \ + __rsp_t rsp; \ +}; \ + \ +/* Shared ring page */ \ +struct __name##_sring { \ + RING_IDX req_prod, req_event; \ + RING_IDX rsp_prod, rsp_event; \ + uint8_t __pad[48]; \ + union __name##_sring_entry ring[1]; /* variable-length */ \ +}; \ + \ +/* "Front" end's private variables */ \ +struct __name##_front_ring { \ + RING_IDX req_prod_pvt; \ + RING_IDX rsp_cons; \ + unsigned int nr_ents; \ + struct __name##_sring *sring; \ +}; \ + \ +/* "Back" end's private variables */ \ +struct __name##_back_ring { \ + RING_IDX rsp_prod_pvt; \ + RING_IDX req_cons; \ + unsigned int nr_ents; \ + struct __name##_sring *sring; \ +}; \ + \ /* * Macros for manipulating rings. * @@ -119,105 +148,99 @@ struct __name##_back_ring { \ */
/* Initialising empty rings */ -#define SHARED_RING_INIT(_s) do { \ - (_s)->req_prod = (_s)->rsp_prod = 0; \ - (_s)->req_event = (_s)->rsp_event = 1; \ - memset((_s)->pad, 0, sizeof((_s)->pad)); \ +#define SHARED_RING_INIT(_s) do { \ + (_s)->req_prod = (_s)->rsp_prod = 0; \ + (_s)->req_event = (_s)->rsp_event = 1; \ + (void)memset((_s)->__pad, 0, sizeof((_s)->__pad)); \ } while(0)
-#define FRONT_RING_INIT(_r, _s, __size) do { \ - (_r)->req_prod_pvt = 0; \ - (_r)->rsp_cons = 0; \ - (_r)->nr_ents = __RING_SIZE(_s, __size); \ - (_r)->sring = (_s); \ +#define FRONT_RING_ATTACH(_r, _s, _i, __size) do { \ + (_r)->req_prod_pvt = (_i); \ + (_r)->rsp_cons = (_i); \ + (_r)->nr_ents = __RING_SIZE(_s, __size); \ + (_r)->sring = (_s); \ } while (0)
-#define BACK_RING_INIT(_r, _s, __size) do { \ - (_r)->rsp_prod_pvt = 0; \ - (_r)->req_cons = 0; \ - (_r)->nr_ents = __RING_SIZE(_s, __size); \ - (_r)->sring = (_s); \ -} while (0) +#define FRONT_RING_INIT(_r, _s, __size) FRONT_RING_ATTACH(_r, _s, 0, __size)
-/* Initialize to existing shared indexes -- for recovery */ -#define FRONT_RING_ATTACH(_r, _s, __size) do { \ - (_r)->sring = (_s); \ - (_r)->req_prod_pvt = (_s)->req_prod; \ - (_r)->rsp_cons = (_s)->rsp_prod; \ - (_r)->nr_ents = __RING_SIZE(_s, __size); \ +#define BACK_RING_ATTACH(_r, _s, _i, __size) do { \ + (_r)->rsp_prod_pvt = (_i); \ + (_r)->req_cons = (_i); \ + (_r)->nr_ents = __RING_SIZE(_s, __size); \ + (_r)->sring = (_s); \ } while (0)
-#define BACK_RING_ATTACH(_r, _s, __size) do { \ - (_r)->sring = (_s); \ - (_r)->rsp_prod_pvt = (_s)->rsp_prod; \ - (_r)->req_cons = (_s)->req_prod; \ - (_r)->nr_ents = __RING_SIZE(_s, __size); \ -} while (0) +#define BACK_RING_INIT(_r, _s, __size) BACK_RING_ATTACH(_r, _s, 0, __size)
/* How big is this ring? */ -#define RING_SIZE(_r) \ +#define RING_SIZE(_r) \ ((_r)->nr_ents)
/* Number of free requests (for use on front side only). */ -#define RING_FREE_REQUESTS(_r) \ +#define RING_FREE_REQUESTS(_r) \ (RING_SIZE(_r) - ((_r)->req_prod_pvt - (_r)->rsp_cons))
/* Test if there is an empty slot available on the front ring. * (This is only meaningful from the front. ) */ -#define RING_FULL(_r) \ +#define RING_FULL(_r) \ (RING_FREE_REQUESTS(_r) == 0)
/* Test if there are outstanding messages to be processed on a ring. */ -#define RING_HAS_UNCONSUMED_RESPONSES(_r) \ +#define RING_HAS_UNCONSUMED_RESPONSES(_r) \ ((_r)->sring->rsp_prod - (_r)->rsp_cons)
-#define RING_HAS_UNCONSUMED_REQUESTS(_r) \ - ({ \ - unsigned int req = (_r)->sring->req_prod - (_r)->req_cons; \ - unsigned int rsp = RING_SIZE(_r) - \ - ((_r)->req_cons - (_r)->rsp_prod_pvt); \ - req < rsp ? req : rsp; \ - }) +#define RING_HAS_UNCONSUMED_REQUESTS(_r) ({ \ + unsigned int req = (_r)->sring->req_prod - (_r)->req_cons; \ + unsigned int rsp = RING_SIZE(_r) - \ + ((_r)->req_cons - (_r)->rsp_prod_pvt); \ + req < rsp ? req : rsp; \ +})
/* Direct access to individual ring elements, by index. */ -#define RING_GET_REQUEST(_r, _idx) \ +#define RING_GET_REQUEST(_r, _idx) \ (&((_r)->sring->ring[((_idx) & (RING_SIZE(_r) - 1))].req))
+#define RING_GET_RESPONSE(_r, _idx) \ + (&((_r)->sring->ring[((_idx) & (RING_SIZE(_r) - 1))].rsp)) + /* - * Get a local copy of a request. + * Get a local copy of a request/response. * - * Use this in preference to RING_GET_REQUEST() so all processing is + * Use this in preference to RING_GET_{REQUEST,RESPONSE}() so all processing is * done on a local copy that cannot be modified by the other end. * * Note that https://gcc.gnu.org/bugzilla/show_bug.cgi?id=58145 may cause this - * to be ineffective where _req is a struct which consists of only bitfields. + * to be ineffective where dest is a struct which consists of only bitfields. */ -#define RING_COPY_REQUEST(_r, _idx, _req) do { \ - /* Use volatile to force the copy into _req. */ \ - *(_req) = *(volatile typeof(_req))RING_GET_REQUEST(_r, _idx); \ +#define RING_COPY_(type, r, idx, dest) do { \ + /* Use volatile to force the copy into dest. */ \ + *(dest) = *(volatile typeof(dest))RING_GET_##type(r, idx); \ } while (0)
-#define RING_GET_RESPONSE(_r, _idx) \ - (&((_r)->sring->ring[((_idx) & (RING_SIZE(_r) - 1))].rsp)) +#define RING_COPY_REQUEST(r, idx, req) RING_COPY_(REQUEST, r, idx, req) +#define RING_COPY_RESPONSE(r, idx, rsp) RING_COPY_(RESPONSE, r, idx, rsp)
/* Loop termination condition: Would the specified index overflow the ring? */ -#define RING_REQUEST_CONS_OVERFLOW(_r, _cons) \ +#define RING_REQUEST_CONS_OVERFLOW(_r, _cons) \ (((_cons) - (_r)->rsp_prod_pvt) >= RING_SIZE(_r))
/* Ill-behaved frontend determination: Can there be this many requests? */ -#define RING_REQUEST_PROD_OVERFLOW(_r, _prod) \ +#define RING_REQUEST_PROD_OVERFLOW(_r, _prod) \ (((_prod) - (_r)->rsp_prod_pvt) > RING_SIZE(_r))
- -#define RING_PUSH_REQUESTS(_r) do { \ - virt_wmb(); /* back sees requests /before/ updated producer index */ \ - (_r)->sring->req_prod = (_r)->req_prod_pvt; \ +/* Ill-behaved backend determination: Can there be this many responses? */ +#define RING_RESPONSE_PROD_OVERFLOW(_r, _prod) \ + (((_prod) - (_r)->rsp_cons) > RING_SIZE(_r)) + +#define RING_PUSH_REQUESTS(_r) do { \ + virt_wmb(); /* back sees requests /before/ updated producer index */\ + (_r)->sring->req_prod = (_r)->req_prod_pvt; \ } while (0)
-#define RING_PUSH_RESPONSES(_r) do { \ - virt_wmb(); /* front sees responses /before/ updated producer index */ \ - (_r)->sring->rsp_prod = (_r)->rsp_prod_pvt; \ +#define RING_PUSH_RESPONSES(_r) do { \ + virt_wmb(); /* front sees resps /before/ updated producer index */ \ + (_r)->sring->rsp_prod = (_r)->rsp_prod_pvt; \ } while (0)
/* @@ -250,40 +273,40 @@ struct __name##_back_ring { \ * field appropriately. */
-#define RING_PUSH_REQUESTS_AND_CHECK_NOTIFY(_r, _notify) do { \ - RING_IDX __old = (_r)->sring->req_prod; \ - RING_IDX __new = (_r)->req_prod_pvt; \ - virt_wmb(); /* back sees requests /before/ updated producer index */ \ - (_r)->sring->req_prod = __new; \ - virt_mb(); /* back sees new requests /before/ we check req_event */ \ - (_notify) = ((RING_IDX)(__new - (_r)->sring->req_event) < \ - (RING_IDX)(__new - __old)); \ -} while (0) - -#define RING_PUSH_RESPONSES_AND_CHECK_NOTIFY(_r, _notify) do { \ - RING_IDX __old = (_r)->sring->rsp_prod; \ - RING_IDX __new = (_r)->rsp_prod_pvt; \ - virt_wmb(); /* front sees responses /before/ updated producer index */ \ - (_r)->sring->rsp_prod = __new; \ - virt_mb(); /* front sees new responses /before/ we check rsp_event */ \ - (_notify) = ((RING_IDX)(__new - (_r)->sring->rsp_event) < \ - (RING_IDX)(__new - __old)); \ -} while (0) - -#define RING_FINAL_CHECK_FOR_REQUESTS(_r, _work_to_do) do { \ - (_work_to_do) = RING_HAS_UNCONSUMED_REQUESTS(_r); \ - if (_work_to_do) break; \ - (_r)->sring->req_event = (_r)->req_cons + 1; \ - virt_mb(); \ - (_work_to_do) = RING_HAS_UNCONSUMED_REQUESTS(_r); \ -} while (0) - -#define RING_FINAL_CHECK_FOR_RESPONSES(_r, _work_to_do) do { \ - (_work_to_do) = RING_HAS_UNCONSUMED_RESPONSES(_r); \ - if (_work_to_do) break; \ - (_r)->sring->rsp_event = (_r)->rsp_cons + 1; \ - virt_mb(); \ - (_work_to_do) = RING_HAS_UNCONSUMED_RESPONSES(_r); \ +#define RING_PUSH_REQUESTS_AND_CHECK_NOTIFY(_r, _notify) do { \ + RING_IDX __old = (_r)->sring->req_prod; \ + RING_IDX __new = (_r)->req_prod_pvt; \ + virt_wmb(); /* back sees requests /before/ updated producer index */\ + (_r)->sring->req_prod = __new; \ + virt_mb(); /* back sees new requests /before/ we check req_event */ \ + (_notify) = ((RING_IDX)(__new - (_r)->sring->req_event) < \ + (RING_IDX)(__new - __old)); \ +} while (0) + +#define RING_PUSH_RESPONSES_AND_CHECK_NOTIFY(_r, _notify) do { \ + RING_IDX __old = (_r)->sring->rsp_prod; \ + RING_IDX __new = (_r)->rsp_prod_pvt; \ + virt_wmb(); /* front sees resps /before/ updated producer index */ \ + (_r)->sring->rsp_prod = __new; \ + virt_mb(); /* front sees new resps /before/ we check rsp_event */ \ + (_notify) = ((RING_IDX)(__new - (_r)->sring->rsp_event) < \ + (RING_IDX)(__new - __old)); \ +} while (0) + +#define RING_FINAL_CHECK_FOR_REQUESTS(_r, _work_to_do) do { \ + (_work_to_do) = RING_HAS_UNCONSUMED_REQUESTS(_r); \ + if (_work_to_do) break; \ + (_r)->sring->req_event = (_r)->req_cons + 1; \ + virt_mb(); \ + (_work_to_do) = RING_HAS_UNCONSUMED_REQUESTS(_r); \ +} while (0) + +#define RING_FINAL_CHECK_FOR_RESPONSES(_r, _work_to_do) do { \ + (_work_to_do) = RING_HAS_UNCONSUMED_RESPONSES(_r); \ + if (_work_to_do) break; \ + (_r)->sring->rsp_event = (_r)->rsp_cons + 1; \ + virt_mb(); \ + (_work_to_do) = RING_HAS_UNCONSUMED_RESPONSES(_r); \ } while (0)
From: Juergen Gross jgross@suse.com
commit 71b66243f9898d0e54296b4e7035fb33cdcb0707 upstream.
In order to avoid problems in case the backend is modifying a response on the ring page while the frontend has already seen it, just read the response into a local buffer in one go and then operate on that buffer only.
Signed-off-by: Juergen Gross jgross@suse.com Reviewed-by: Jan Beulich jbeulich@suse.com Acked-by: Roger Pau Monné roger.pau@citrix.com Link: https://lore.kernel.org/r/20210730103854.12681-2-jgross@suse.com Signed-off-by: Juergen Gross jgross@suse.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/block/xen-blkfront.c | 35 ++++++++++++++++++----------------- 1 file changed, 18 insertions(+), 17 deletions(-)
--- a/drivers/block/xen-blkfront.c +++ b/drivers/block/xen-blkfront.c @@ -1549,7 +1549,7 @@ static bool blkif_completion(unsigned lo static irqreturn_t blkif_interrupt(int irq, void *dev_id) { struct request *req; - struct blkif_response *bret; + struct blkif_response bret; RING_IDX i, rp; unsigned long flags; struct blkfront_ring_info *rinfo = (struct blkfront_ring_info *)dev_id; @@ -1566,8 +1566,9 @@ static irqreturn_t blkif_interrupt(int i for (i = rinfo->ring.rsp_cons; i != rp; i++) { unsigned long id;
- bret = RING_GET_RESPONSE(&rinfo->ring, i); - id = bret->id; + RING_COPY_RESPONSE(&rinfo->ring, i, &bret); + id = bret.id; + /* * The backend has messed up and given us an id that we would * never have given to it (we stamp it up to BLK_RING_SIZE - @@ -1575,39 +1576,39 @@ static irqreturn_t blkif_interrupt(int i */ if (id >= BLK_RING_SIZE(info)) { WARN(1, "%s: response to %s has incorrect id (%ld)\n", - info->gd->disk_name, op_name(bret->operation), id); + info->gd->disk_name, op_name(bret.operation), id); /* We can't safely get the 'struct request' as * the id is busted. */ continue; } req = rinfo->shadow[id].request;
- if (bret->operation != BLKIF_OP_DISCARD) { + if (bret.operation != BLKIF_OP_DISCARD) { /* * We may need to wait for an extra response if the * I/O request is split in 2 */ - if (!blkif_completion(&id, rinfo, bret)) + if (!blkif_completion(&id, rinfo, &bret)) continue; }
if (add_id_to_freelist(rinfo, id)) { WARN(1, "%s: response to %s (id %ld) couldn't be recycled!\n", - info->gd->disk_name, op_name(bret->operation), id); + info->gd->disk_name, op_name(bret.operation), id); continue; }
- if (bret->status == BLKIF_RSP_OKAY) + if (bret.status == BLKIF_RSP_OKAY) blkif_req(req)->error = BLK_STS_OK; else blkif_req(req)->error = BLK_STS_IOERR;
- switch (bret->operation) { + switch (bret.operation) { case BLKIF_OP_DISCARD: - if (unlikely(bret->status == BLKIF_RSP_EOPNOTSUPP)) { + if (unlikely(bret.status == BLKIF_RSP_EOPNOTSUPP)) { struct request_queue *rq = info->rq; printk(KERN_WARNING "blkfront: %s: %s op failed\n", - info->gd->disk_name, op_name(bret->operation)); + info->gd->disk_name, op_name(bret.operation)); blkif_req(req)->error = BLK_STS_NOTSUPP; info->feature_discard = 0; info->feature_secdiscard = 0; @@ -1617,15 +1618,15 @@ static irqreturn_t blkif_interrupt(int i break; case BLKIF_OP_FLUSH_DISKCACHE: case BLKIF_OP_WRITE_BARRIER: - if (unlikely(bret->status == BLKIF_RSP_EOPNOTSUPP)) { + if (unlikely(bret.status == BLKIF_RSP_EOPNOTSUPP)) { printk(KERN_WARNING "blkfront: %s: %s op failed\n", - info->gd->disk_name, op_name(bret->operation)); + info->gd->disk_name, op_name(bret.operation)); blkif_req(req)->error = BLK_STS_NOTSUPP; } - if (unlikely(bret->status == BLKIF_RSP_ERROR && + if (unlikely(bret.status == BLKIF_RSP_ERROR && rinfo->shadow[id].req.u.rw.nr_segments == 0)) { printk(KERN_WARNING "blkfront: %s: empty %s op failed\n", - info->gd->disk_name, op_name(bret->operation)); + info->gd->disk_name, op_name(bret.operation)); blkif_req(req)->error = BLK_STS_NOTSUPP; } if (unlikely(blkif_req(req)->error)) { @@ -1638,9 +1639,9 @@ static irqreturn_t blkif_interrupt(int i /* fall through */ case BLKIF_OP_READ: case BLKIF_OP_WRITE: - if (unlikely(bret->status != BLKIF_RSP_OKAY)) + if (unlikely(bret.status != BLKIF_RSP_OKAY)) dev_dbg(&info->xbdev->dev, "Bad return from blkdev data " - "request: %x\n", bret->status); + "request: %x\n", bret.status);
break; default:
From: Juergen Gross jgross@suse.com
commit 8f5a695d99000fc3aa73934d7ced33cfc64dcdab upstream.
In order to avoid a malicious backend being able to influence the local copy of a request build the request locally first and then copy it to the ring page instead of doing it the other way round as today.
Signed-off-by: Juergen Gross jgross@suse.com Reviewed-by: Jan Beulich jbeulich@suse.com Acked-by: Roger Pau Monné roger.pau@citrix.com Link: https://lore.kernel.org/r/20210730103854.12681-3-jgross@suse.com Signed-off-by: Juergen Gross jgross@suse.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/block/xen-blkfront.c | 25 +++++++++++++++---------- 1 file changed, 15 insertions(+), 10 deletions(-)
--- a/drivers/block/xen-blkfront.c +++ b/drivers/block/xen-blkfront.c @@ -536,7 +536,7 @@ static unsigned long blkif_ring_get_requ rinfo->shadow[id].status = REQ_WAITING; rinfo->shadow[id].associated_id = NO_ASSOCIATED_ID;
- (*ring_req)->u.rw.id = id; + rinfo->shadow[id].req.u.rw.id = id;
return id; } @@ -544,11 +544,12 @@ static unsigned long blkif_ring_get_requ static int blkif_queue_discard_req(struct request *req, struct blkfront_ring_info *rinfo) { struct blkfront_info *info = rinfo->dev_info; - struct blkif_request *ring_req; + struct blkif_request *ring_req, *final_ring_req; unsigned long id;
/* Fill out a communications ring structure. */ - id = blkif_ring_get_request(rinfo, req, &ring_req); + id = blkif_ring_get_request(rinfo, req, &final_ring_req); + ring_req = &rinfo->shadow[id].req;
ring_req->operation = BLKIF_OP_DISCARD; ring_req->u.discard.nr_sectors = blk_rq_sectors(req); @@ -559,8 +560,8 @@ static int blkif_queue_discard_req(struc else ring_req->u.discard.flag = 0;
- /* Keep a private copy so we can reissue requests when recovering. */ - rinfo->shadow[id].req = *ring_req; + /* Copy the request to the ring page. */ + *final_ring_req = *ring_req;
return 0; } @@ -693,6 +694,7 @@ static int blkif_queue_rw_req(struct req { struct blkfront_info *info = rinfo->dev_info; struct blkif_request *ring_req, *extra_ring_req = NULL; + struct blkif_request *final_ring_req, *final_extra_ring_req = NULL; unsigned long id, extra_id = NO_ASSOCIATED_ID; bool require_extra_req = false; int i; @@ -737,7 +739,8 @@ static int blkif_queue_rw_req(struct req }
/* Fill out a communications ring structure. */ - id = blkif_ring_get_request(rinfo, req, &ring_req); + id = blkif_ring_get_request(rinfo, req, &final_ring_req); + ring_req = &rinfo->shadow[id].req;
num_sg = blk_rq_map_sg(req->q, req, rinfo->shadow[id].sg); num_grant = 0; @@ -788,7 +791,9 @@ static int blkif_queue_rw_req(struct req ring_req->u.rw.nr_segments = num_grant; if (unlikely(require_extra_req)) { extra_id = blkif_ring_get_request(rinfo, req, - &extra_ring_req); + &final_extra_ring_req); + extra_ring_req = &rinfo->shadow[extra_id].req; + /* * Only the first request contains the scatter-gather * list. @@ -830,10 +835,10 @@ static int blkif_queue_rw_req(struct req if (setup.segments) kunmap_atomic(setup.segments);
- /* Keep a private copy so we can reissue requests when recovering. */ - rinfo->shadow[id].req = *ring_req; + /* Copy request(s) to the ring page. */ + *final_ring_req = *ring_req; if (unlikely(require_extra_req)) - rinfo->shadow[extra_id].req = *extra_ring_req; + *final_extra_ring_req = *extra_ring_req;
if (new_persistent_gnts) gnttab_free_grant_references(setup.gref_head);
From: Juergen Gross jgross@suse.com
commit b94e4b147fd1992ad450e1fea1fdaa3738753373 upstream.
Today blkfront will trust the backend to send only sane response data. In order to avoid privilege escalations or crashes in case of malicious backends verify the data to be within expected limits. Especially make sure that the response always references an outstanding request.
Introduce a new state of the ring BLKIF_STATE_ERROR which will be switched to in case an inconsistency is being detected. Recovering from this state is possible only via removing and adding the virtual device again (e.g. via a suspend/resume cycle).
Make all warning messages issued due to valid error responses rate limited in order to avoid message floods being triggered by a malicious backend.
Signed-off-by: Juergen Gross jgross@suse.com Reviewed-by: Jan Beulich jbeulich@suse.com Acked-by: Roger Pau Monné roger.pau@citrix.com Link: https://lore.kernel.org/r/20210730103854.12681-4-jgross@suse.com Signed-off-by: Juergen Gross jgross@suse.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/block/xen-blkfront.c | 70 ++++++++++++++++++++++++++++++++----------- 1 file changed, 53 insertions(+), 17 deletions(-)
--- a/drivers/block/xen-blkfront.c +++ b/drivers/block/xen-blkfront.c @@ -80,6 +80,7 @@ enum blkif_state { BLKIF_STATE_DISCONNECTED, BLKIF_STATE_CONNECTED, BLKIF_STATE_SUSPENDED, + BLKIF_STATE_ERROR, };
struct grant { @@ -89,6 +90,7 @@ struct grant { };
enum blk_req_status { + REQ_PROCESSING, REQ_WAITING, REQ_DONE, REQ_ERROR, @@ -533,7 +535,7 @@ static unsigned long blkif_ring_get_requ
id = get_id_from_freelist(rinfo); rinfo->shadow[id].request = req; - rinfo->shadow[id].status = REQ_WAITING; + rinfo->shadow[id].status = REQ_PROCESSING; rinfo->shadow[id].associated_id = NO_ASSOCIATED_ID;
rinfo->shadow[id].req.u.rw.id = id; @@ -562,6 +564,7 @@ static int blkif_queue_discard_req(struc
/* Copy the request to the ring page. */ *final_ring_req = *ring_req; + rinfo->shadow[id].status = REQ_WAITING;
return 0; } @@ -837,8 +840,11 @@ static int blkif_queue_rw_req(struct req
/* Copy request(s) to the ring page. */ *final_ring_req = *ring_req; - if (unlikely(require_extra_req)) + rinfo->shadow[id].status = REQ_WAITING; + if (unlikely(require_extra_req)) { *final_extra_ring_req = *extra_ring_req; + rinfo->shadow[extra_id].status = REQ_WAITING; + }
if (new_persistent_gnts) gnttab_free_grant_references(setup.gref_head); @@ -1412,8 +1418,8 @@ static enum blk_req_status blkif_rsp_to_ static int blkif_get_final_status(enum blk_req_status s1, enum blk_req_status s2) { - BUG_ON(s1 == REQ_WAITING); - BUG_ON(s2 == REQ_WAITING); + BUG_ON(s1 < REQ_DONE); + BUG_ON(s2 < REQ_DONE);
if (s1 == REQ_ERROR || s2 == REQ_ERROR) return BLKIF_RSP_ERROR; @@ -1446,7 +1452,7 @@ static bool blkif_completion(unsigned lo s->status = blkif_rsp_to_req_status(bret->status);
/* Wait the second response if not yet here. */ - if (s2->status == REQ_WAITING) + if (s2->status < REQ_DONE) return false;
bret->status = blkif_get_final_status(s->status, @@ -1565,11 +1571,17 @@ static irqreturn_t blkif_interrupt(int i
spin_lock_irqsave(&rinfo->ring_lock, flags); again: - rp = rinfo->ring.sring->rsp_prod; - rmb(); /* Ensure we see queued responses up to 'rp'. */ + rp = READ_ONCE(rinfo->ring.sring->rsp_prod); + virt_rmb(); /* Ensure we see queued responses up to 'rp'. */ + if (RING_RESPONSE_PROD_OVERFLOW(&rinfo->ring, rp)) { + pr_alert("%s: illegal number of responses %u\n", + info->gd->disk_name, rp - rinfo->ring.rsp_cons); + goto err; + }
for (i = rinfo->ring.rsp_cons; i != rp; i++) { unsigned long id; + unsigned int op;
RING_COPY_RESPONSE(&rinfo->ring, i, &bret); id = bret.id; @@ -1580,14 +1592,28 @@ static irqreturn_t blkif_interrupt(int i * look in get_id_from_freelist. */ if (id >= BLK_RING_SIZE(info)) { - WARN(1, "%s: response to %s has incorrect id (%ld)\n", - info->gd->disk_name, op_name(bret.operation), id); - /* We can't safely get the 'struct request' as - * the id is busted. */ - continue; + pr_alert("%s: response has incorrect id (%ld)\n", + info->gd->disk_name, id); + goto err; + } + if (rinfo->shadow[id].status != REQ_WAITING) { + pr_alert("%s: response references no pending request\n", + info->gd->disk_name); + goto err; } + + rinfo->shadow[id].status = REQ_PROCESSING; req = rinfo->shadow[id].request;
+ op = rinfo->shadow[id].req.operation; + if (op == BLKIF_OP_INDIRECT) + op = rinfo->shadow[id].req.u.indirect.indirect_op; + if (bret.operation != op) { + pr_alert("%s: response has wrong operation (%u instead of %u)\n", + info->gd->disk_name, bret.operation, op); + goto err; + } + if (bret.operation != BLKIF_OP_DISCARD) { /* * We may need to wait for an extra response if the @@ -1612,7 +1638,8 @@ static irqreturn_t blkif_interrupt(int i case BLKIF_OP_DISCARD: if (unlikely(bret.status == BLKIF_RSP_EOPNOTSUPP)) { struct request_queue *rq = info->rq; - printk(KERN_WARNING "blkfront: %s: %s op failed\n", + + pr_warn_ratelimited("blkfront: %s: %s op failed\n", info->gd->disk_name, op_name(bret.operation)); blkif_req(req)->error = BLK_STS_NOTSUPP; info->feature_discard = 0; @@ -1624,13 +1651,13 @@ static irqreturn_t blkif_interrupt(int i case BLKIF_OP_FLUSH_DISKCACHE: case BLKIF_OP_WRITE_BARRIER: if (unlikely(bret.status == BLKIF_RSP_EOPNOTSUPP)) { - printk(KERN_WARNING "blkfront: %s: %s op failed\n", + pr_warn_ratelimited("blkfront: %s: %s op failed\n", info->gd->disk_name, op_name(bret.operation)); blkif_req(req)->error = BLK_STS_NOTSUPP; } if (unlikely(bret.status == BLKIF_RSP_ERROR && rinfo->shadow[id].req.u.rw.nr_segments == 0)) { - printk(KERN_WARNING "blkfront: %s: empty %s op failed\n", + pr_warn_ratelimited("blkfront: %s: empty %s op failed\n", info->gd->disk_name, op_name(bret.operation)); blkif_req(req)->error = BLK_STS_NOTSUPP; } @@ -1645,8 +1672,9 @@ static irqreturn_t blkif_interrupt(int i case BLKIF_OP_READ: case BLKIF_OP_WRITE: if (unlikely(bret.status != BLKIF_RSP_OKAY)) - dev_dbg(&info->xbdev->dev, "Bad return from blkdev data " - "request: %x\n", bret.status); + dev_dbg_ratelimited(&info->xbdev->dev, + "Bad return from blkdev data request: %#x\n", + bret.status);
break; default: @@ -1671,6 +1699,14 @@ static irqreturn_t blkif_interrupt(int i spin_unlock_irqrestore(&rinfo->ring_lock, flags);
return IRQ_HANDLED; + + err: + info->connected = BLKIF_STATE_ERROR; + + spin_unlock_irqrestore(&rinfo->ring_lock, flags); + + pr_alert("%s disabled for further use\n", info->gd->disk_name); + return IRQ_HANDLED; }
From: Juergen Gross jgross@suse.com
commit 8446066bf8c1f9f7b7412c43fbea0fb87464d75b upstream.
In order to avoid problems in case the backend is modifying a response on the ring page while the frontend has already seen it, just read the response into a local buffer in one go and then operate on that buffer only.
Signed-off-by: Juergen Gross jgross@suse.com Reviewed-by: Jan Beulich jbeulich@suse.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/xen-netfront.c | 38 +++++++++++++++++++------------------- 1 file changed, 19 insertions(+), 19 deletions(-)
--- a/drivers/net/xen-netfront.c +++ b/drivers/net/xen-netfront.c @@ -387,13 +387,13 @@ static void xennet_tx_buf_gc(struct netf rmb(); /* Ensure we see responses up to 'rp'. */
for (cons = queue->tx.rsp_cons; cons != prod; cons++) { - struct xen_netif_tx_response *txrsp; + struct xen_netif_tx_response txrsp;
- txrsp = RING_GET_RESPONSE(&queue->tx, cons); - if (txrsp->status == XEN_NETIF_RSP_NULL) + RING_COPY_RESPONSE(&queue->tx, cons, &txrsp); + if (txrsp.status == XEN_NETIF_RSP_NULL) continue;
- id = txrsp->id; + id = txrsp.id; skb = queue->tx_skbs[id].skb; if (unlikely(gnttab_query_foreign_access( queue->grant_tx_ref[id]) != 0)) { @@ -741,7 +741,7 @@ static int xennet_get_extras(struct netf RING_IDX rp)
{ - struct xen_netif_extra_info *extra; + struct xen_netif_extra_info extra; struct device *dev = &queue->info->netdev->dev; RING_IDX cons = queue->rx.rsp_cons; int err = 0; @@ -757,24 +757,22 @@ static int xennet_get_extras(struct netf break; }
- extra = (struct xen_netif_extra_info *) - RING_GET_RESPONSE(&queue->rx, ++cons); + RING_COPY_RESPONSE(&queue->rx, ++cons, &extra);
- if (unlikely(!extra->type || - extra->type >= XEN_NETIF_EXTRA_TYPE_MAX)) { + if (unlikely(!extra.type || + extra.type >= XEN_NETIF_EXTRA_TYPE_MAX)) { if (net_ratelimit()) dev_warn(dev, "Invalid extra type: %d\n", - extra->type); + extra.type); err = -EINVAL; } else { - memcpy(&extras[extra->type - 1], extra, - sizeof(*extra)); + extras[extra.type - 1] = extra; }
skb = xennet_get_rx_skb(queue, cons); ref = xennet_get_rx_ref(queue, cons); xennet_move_rx_slot(queue, skb, ref); - } while (extra->flags & XEN_NETIF_EXTRA_FLAG_MORE); + } while (extra.flags & XEN_NETIF_EXTRA_FLAG_MORE);
queue->rx.rsp_cons = cons; return err; @@ -784,7 +782,7 @@ static int xennet_get_responses(struct n struct netfront_rx_info *rinfo, RING_IDX rp, struct sk_buff_head *list) { - struct xen_netif_rx_response *rx = &rinfo->rx; + struct xen_netif_rx_response *rx = &rinfo->rx, rx_local; struct xen_netif_extra_info *extras = rinfo->extras; struct device *dev = &queue->info->netdev->dev; RING_IDX cons = queue->rx.rsp_cons; @@ -842,7 +840,8 @@ next: break; }
- rx = RING_GET_RESPONSE(&queue->rx, cons + slots); + RING_COPY_RESPONSE(&queue->rx, cons + slots, &rx_local); + rx = &rx_local; skb = xennet_get_rx_skb(queue, cons + slots); ref = xennet_get_rx_ref(queue, cons + slots); slots++; @@ -897,10 +896,11 @@ static int xennet_fill_frags(struct netf struct sk_buff *nskb;
while ((nskb = __skb_dequeue(list))) { - struct xen_netif_rx_response *rx = - RING_GET_RESPONSE(&queue->rx, ++cons); + struct xen_netif_rx_response rx; skb_frag_t *nfrag = &skb_shinfo(nskb)->frags[0];
+ RING_COPY_RESPONSE(&queue->rx, ++cons, &rx); + if (skb_shinfo(skb)->nr_frags == MAX_SKB_FRAGS) { unsigned int pull_to = NETFRONT_SKB_CB(skb)->pull_to;
@@ -915,7 +915,7 @@ static int xennet_fill_frags(struct netf
skb_add_rx_frag(skb, skb_shinfo(skb)->nr_frags, skb_frag_page(nfrag), - rx->offset, rx->status, PAGE_SIZE); + rx.offset, rx.status, PAGE_SIZE);
skb_shinfo(nskb)->nr_frags = 0; kfree_skb(nskb); @@ -1013,7 +1013,7 @@ static int xennet_poll(struct napi_struc i = queue->rx.rsp_cons; work_done = 0; while ((i != rp) && (work_done < budget)) { - memcpy(rx, RING_GET_RESPONSE(&queue->rx, i), sizeof(*rx)); + RING_COPY_RESPONSE(&queue->rx, i, rx); memset(extras, 0, sizeof(rinfo.extras));
err = xennet_get_responses(queue, &rinfo, rp, &tmpq);
From: Juergen Gross jgross@suse.com
commit 162081ec33c2686afa29d91bf8d302824aa846c7 upstream.
In order to avoid a malicious backend being able to influence the local processing of a request build the request locally first and then copy it to the ring page. Any reading from the request influencing the processing in the frontend needs to be done on the local instance.
Signed-off-by: Juergen Gross jgross@suse.com Reviewed-by: Jan Beulich jbeulich@suse.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/xen-netfront.c | 78 ++++++++++++++++++++------------------------- 1 file changed, 36 insertions(+), 42 deletions(-)
--- a/drivers/net/xen-netfront.c +++ b/drivers/net/xen-netfront.c @@ -423,7 +423,8 @@ struct xennet_gnttab_make_txreq { struct netfront_queue *queue; struct sk_buff *skb; struct page *page; - struct xen_netif_tx_request *tx; /* Last request */ + struct xen_netif_tx_request *tx; /* Last request on ring page */ + struct xen_netif_tx_request tx_local; /* Last request local copy*/ unsigned int size; };
@@ -451,30 +452,27 @@ static void xennet_tx_setup_grant(unsign queue->grant_tx_page[id] = page; queue->grant_tx_ref[id] = ref;
- tx->id = id; - tx->gref = ref; - tx->offset = offset; - tx->size = len; - tx->flags = 0; + info->tx_local.id = id; + info->tx_local.gref = ref; + info->tx_local.offset = offset; + info->tx_local.size = len; + info->tx_local.flags = 0; + + *tx = info->tx_local;
info->tx = tx; - info->size += tx->size; + info->size += info->tx_local.size; }
static struct xen_netif_tx_request *xennet_make_first_txreq( - struct netfront_queue *queue, struct sk_buff *skb, - struct page *page, unsigned int offset, unsigned int len) + struct xennet_gnttab_make_txreq *info, + unsigned int offset, unsigned int len) { - struct xennet_gnttab_make_txreq info = { - .queue = queue, - .skb = skb, - .page = page, - .size = 0, - }; + info->size = 0;
- gnttab_for_one_grant(page, offset, len, xennet_tx_setup_grant, &info); + gnttab_for_one_grant(info->page, offset, len, xennet_tx_setup_grant, info);
- return info.tx; + return info->tx; }
static void xennet_make_one_txreq(unsigned long gfn, unsigned int offset, @@ -487,35 +485,27 @@ static void xennet_make_one_txreq(unsign xennet_tx_setup_grant(gfn, offset, len, data); }
-static struct xen_netif_tx_request *xennet_make_txreqs( - struct netfront_queue *queue, struct xen_netif_tx_request *tx, - struct sk_buff *skb, struct page *page, +static void xennet_make_txreqs( + struct xennet_gnttab_make_txreq *info, + struct page *page, unsigned int offset, unsigned int len) { - struct xennet_gnttab_make_txreq info = { - .queue = queue, - .skb = skb, - .tx = tx, - }; - /* Skip unused frames from start of page */ page += offset >> PAGE_SHIFT; offset &= ~PAGE_MASK;
while (len) { - info.page = page; - info.size = 0; + info->page = page; + info->size = 0;
gnttab_foreach_grant_in_range(page, offset, len, xennet_make_one_txreq, - &info); + info);
page++; offset = 0; - len -= info.size; + len -= info->size; } - - return info.tx; }
/* @@ -568,7 +558,7 @@ static netdev_tx_t xennet_start_xmit(str { struct netfront_info *np = netdev_priv(dev); struct netfront_stats *tx_stats = this_cpu_ptr(np->tx_stats); - struct xen_netif_tx_request *tx, *first_tx; + struct xen_netif_tx_request *first_tx; unsigned int i; int notify; int slots; @@ -577,6 +567,7 @@ static netdev_tx_t xennet_start_xmit(str unsigned int len; unsigned long flags; struct netfront_queue *queue = NULL; + struct xennet_gnttab_make_txreq info = { }; unsigned int num_queues = dev->real_num_tx_queues; u16 queue_index; struct sk_buff *nskb; @@ -634,21 +625,24 @@ static netdev_tx_t xennet_start_xmit(str }
/* First request for the linear area. */ - first_tx = tx = xennet_make_first_txreq(queue, skb, - page, offset, len); - offset += tx->size; + info.queue = queue; + info.skb = skb; + info.page = page; + first_tx = xennet_make_first_txreq(&info, offset, len); + offset += info.tx_local.size; if (offset == PAGE_SIZE) { page++; offset = 0; } - len -= tx->size; + len -= info.tx_local.size;
if (skb->ip_summed == CHECKSUM_PARTIAL) /* local packet? */ - tx->flags |= XEN_NETTXF_csum_blank | XEN_NETTXF_data_validated; + first_tx->flags |= XEN_NETTXF_csum_blank | + XEN_NETTXF_data_validated; else if (skb->ip_summed == CHECKSUM_UNNECESSARY) /* remote but checksummed. */ - tx->flags |= XEN_NETTXF_data_validated; + first_tx->flags |= XEN_NETTXF_data_validated;
/* Optional extra info after the first request. */ if (skb_shinfo(skb)->gso_size) { @@ -657,7 +651,7 @@ static netdev_tx_t xennet_start_xmit(str gso = (struct xen_netif_extra_info *) RING_GET_REQUEST(&queue->tx, queue->tx.req_prod_pvt++);
- tx->flags |= XEN_NETTXF_extra_info; + first_tx->flags |= XEN_NETTXF_extra_info;
gso->u.gso.size = skb_shinfo(skb)->gso_size; gso->u.gso.type = (skb_shinfo(skb)->gso_type & SKB_GSO_TCPV6) ? @@ -671,12 +665,12 @@ static netdev_tx_t xennet_start_xmit(str }
/* Requests for the rest of the linear area. */ - tx = xennet_make_txreqs(queue, tx, skb, page, offset, len); + xennet_make_txreqs(&info, page, offset, len);
/* Requests for all the frags. */ for (i = 0; i < skb_shinfo(skb)->nr_frags; i++) { skb_frag_t *frag = &skb_shinfo(skb)->frags[i]; - tx = xennet_make_txreqs(queue, tx, skb, skb_frag_page(frag), + xennet_make_txreqs(&info, skb_frag_page(frag), skb_frag_off(frag), skb_frag_size(frag)); }
From: Juergen Gross jgross@suse.com
commit 21631d2d741a64a073e167c27769e73bc7844a2f upstream.
The tx_skb_freelist elements are in a single linked list with the request id used as link reference. The per element link field is in a union with the skb pointer of an in use request.
Move the link reference out of the union in order to enable a later reuse of it for requests which need a populated skb pointer.
Rename add_id_to_freelist() and get_id_from_freelist() to add_id_to_list() and get_id_from_list() in order to prepare using those for other lists as well. Define ~0 as value to indicate the end of a list and place that value into the link for a request not being on the list.
When freeing a skb zero the skb pointer in the request. Use a NULL value of the skb pointer instead of skb_entry_is_link() for deciding whether a request has a skb linked to it.
Remove skb_entry_set_link() and open code it instead as it is really trivial now.
Signed-off-by: Juergen Gross jgross@suse.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/xen-netfront.c | 61 ++++++++++++++++++--------------------------- 1 file changed, 25 insertions(+), 36 deletions(-)
--- a/drivers/net/xen-netfront.c +++ b/drivers/net/xen-netfront.c @@ -121,17 +121,11 @@ struct netfront_queue {
/* * {tx,rx}_skbs store outstanding skbuffs. Free tx_skb entries - * are linked from tx_skb_freelist through skb_entry.link. - * - * NB. Freelist index entries are always going to be less than - * PAGE_OFFSET, whereas pointers to skbs will always be equal or - * greater than PAGE_OFFSET: we use this property to distinguish - * them. + * are linked from tx_skb_freelist through tx_link. */ - union skb_entry { - struct sk_buff *skb; - unsigned long link; - } tx_skbs[NET_TX_RING_SIZE]; + struct sk_buff *tx_skbs[NET_TX_RING_SIZE]; + unsigned short tx_link[NET_TX_RING_SIZE]; +#define TX_LINK_NONE 0xffff grant_ref_t gref_tx_head; grant_ref_t grant_tx_ref[NET_TX_RING_SIZE]; struct page *grant_tx_page[NET_TX_RING_SIZE]; @@ -169,33 +163,25 @@ struct netfront_rx_info { struct xen_netif_extra_info extras[XEN_NETIF_EXTRA_TYPE_MAX - 1]; };
-static void skb_entry_set_link(union skb_entry *list, unsigned short id) -{ - list->link = id; -} - -static int skb_entry_is_link(const union skb_entry *list) -{ - BUILD_BUG_ON(sizeof(list->skb) != sizeof(list->link)); - return (unsigned long)list->skb < PAGE_OFFSET; -} - /* * Access macros for acquiring freeing slots in tx_skbs[]. */
-static void add_id_to_freelist(unsigned *head, union skb_entry *list, - unsigned short id) +static void add_id_to_list(unsigned *head, unsigned short *list, + unsigned short id) { - skb_entry_set_link(&list[id], *head); + list[id] = *head; *head = id; }
-static unsigned short get_id_from_freelist(unsigned *head, - union skb_entry *list) +static unsigned short get_id_from_list(unsigned *head, unsigned short *list) { unsigned int id = *head; - *head = list[id].link; + + if (id != TX_LINK_NONE) { + *head = list[id]; + list[id] = TX_LINK_NONE; + } return id; }
@@ -394,7 +380,8 @@ static void xennet_tx_buf_gc(struct netf continue;
id = txrsp.id; - skb = queue->tx_skbs[id].skb; + skb = queue->tx_skbs[id]; + queue->tx_skbs[id] = NULL; if (unlikely(gnttab_query_foreign_access( queue->grant_tx_ref[id]) != 0)) { pr_alert("%s: warning -- grant still in use by backend domain\n", @@ -407,7 +394,7 @@ static void xennet_tx_buf_gc(struct netf &queue->gref_tx_head, queue->grant_tx_ref[id]); queue->grant_tx_ref[id] = GRANT_INVALID_REF; queue->grant_tx_page[id] = NULL; - add_id_to_freelist(&queue->tx_skb_freelist, queue->tx_skbs, id); + add_id_to_list(&queue->tx_skb_freelist, queue->tx_link, id); dev_kfree_skb_irq(skb); }
@@ -440,7 +427,7 @@ static void xennet_tx_setup_grant(unsign struct netfront_queue *queue = info->queue; struct sk_buff *skb = info->skb;
- id = get_id_from_freelist(&queue->tx_skb_freelist, queue->tx_skbs); + id = get_id_from_list(&queue->tx_skb_freelist, queue->tx_link); tx = RING_GET_REQUEST(&queue->tx, queue->tx.req_prod_pvt++); ref = gnttab_claim_grant_reference(&queue->gref_tx_head); WARN_ON_ONCE(IS_ERR_VALUE((unsigned long)(int)ref)); @@ -448,7 +435,7 @@ static void xennet_tx_setup_grant(unsign gnttab_grant_foreign_access_ref(ref, queue->info->xbdev->otherend_id, gfn, GNTMAP_readonly);
- queue->tx_skbs[id].skb = skb; + queue->tx_skbs[id] = skb; queue->grant_tx_page[id] = page; queue->grant_tx_ref[id] = ref;
@@ -1129,17 +1116,18 @@ static void xennet_release_tx_bufs(struc
for (i = 0; i < NET_TX_RING_SIZE; i++) { /* Skip over entries which are actually freelist references */ - if (skb_entry_is_link(&queue->tx_skbs[i])) + if (!queue->tx_skbs[i]) continue;
- skb = queue->tx_skbs[i].skb; + skb = queue->tx_skbs[i]; + queue->tx_skbs[i] = NULL; get_page(queue->grant_tx_page[i]); gnttab_end_foreign_access(queue->grant_tx_ref[i], GNTMAP_readonly, (unsigned long)page_address(queue->grant_tx_page[i])); queue->grant_tx_page[i] = NULL; queue->grant_tx_ref[i] = GRANT_INVALID_REF; - add_id_to_freelist(&queue->tx_skb_freelist, queue->tx_skbs, i); + add_id_to_list(&queue->tx_skb_freelist, queue->tx_link, i); dev_kfree_skb_irq(skb); } } @@ -1621,13 +1609,14 @@ static int xennet_init_queue(struct netf snprintf(queue->name, sizeof(queue->name), "vif%s-q%u", devid, queue->id);
- /* Initialise tx_skbs as a free chain containing every entry. */ + /* Initialise tx_skb_freelist as a free chain containing every entry. */ queue->tx_skb_freelist = 0; for (i = 0; i < NET_TX_RING_SIZE; i++) { - skb_entry_set_link(&queue->tx_skbs[i], i+1); + queue->tx_link[i] = i + 1; queue->grant_tx_ref[i] = GRANT_INVALID_REF; queue->grant_tx_page[i] = NULL; } + queue->tx_link[NET_TX_RING_SIZE - 1] = TX_LINK_NONE;
/* Clear out rx_skbs */ for (i = 0; i < NET_RX_RING_SIZE; i++) {
From: Juergen Gross jgross@suse.com
commit a884daa61a7d91650987e855464526aef219590f upstream.
Today netfront will trust the backend to send only sane response data. In order to avoid privilege escalations or crashes in case of malicious backends verify the data to be within expected limits. Especially make sure that the response always references an outstanding request.
Note that only the tx queue needs special id handling, as for the rx queue the id is equal to the index in the ring page.
Introduce a new indicator for the device whether it is broken and let the device stop working when it is set. Set this indicator in case the backend sets any weird data.
Signed-off-by: Juergen Gross jgross@suse.com Reviewed-by: Jan Beulich jbeulich@suse.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/xen-netfront.c | 80 ++++++++++++++++++++++++++++++++++++++++++--- 1 file changed, 75 insertions(+), 5 deletions(-)
--- a/drivers/net/xen-netfront.c +++ b/drivers/net/xen-netfront.c @@ -126,10 +126,12 @@ struct netfront_queue { struct sk_buff *tx_skbs[NET_TX_RING_SIZE]; unsigned short tx_link[NET_TX_RING_SIZE]; #define TX_LINK_NONE 0xffff +#define TX_PENDING 0xfffe grant_ref_t gref_tx_head; grant_ref_t grant_tx_ref[NET_TX_RING_SIZE]; struct page *grant_tx_page[NET_TX_RING_SIZE]; unsigned tx_skb_freelist; + unsigned int tx_pend_queue;
spinlock_t rx_lock ____cacheline_aligned_in_smp; struct xen_netif_rx_front_ring rx; @@ -155,6 +157,9 @@ struct netfront_info { struct netfront_stats __percpu *rx_stats; struct netfront_stats __percpu *tx_stats;
+ /* Is device behaving sane? */ + bool broken; + atomic_t rx_gso_checksum_fixup; };
@@ -337,7 +342,7 @@ static int xennet_open(struct net_device unsigned int i = 0; struct netfront_queue *queue = NULL;
- if (!np->queues) + if (!np->queues || np->broken) return -ENODEV;
for (i = 0; i < num_queues; ++i) { @@ -365,11 +370,17 @@ static void xennet_tx_buf_gc(struct netf unsigned short id; struct sk_buff *skb; bool more_to_do; + const struct device *dev = &queue->info->netdev->dev;
BUG_ON(!netif_carrier_ok(queue->info->netdev));
do { prod = queue->tx.sring->rsp_prod; + if (RING_RESPONSE_PROD_OVERFLOW(&queue->tx, prod)) { + dev_alert(dev, "Illegal number of responses %u\n", + prod - queue->tx.rsp_cons); + goto err; + } rmb(); /* Ensure we see responses up to 'rp'. */
for (cons = queue->tx.rsp_cons; cons != prod; cons++) { @@ -379,14 +390,27 @@ static void xennet_tx_buf_gc(struct netf if (txrsp.status == XEN_NETIF_RSP_NULL) continue;
- id = txrsp.id; + id = txrsp.id; + if (id >= RING_SIZE(&queue->tx)) { + dev_alert(dev, + "Response has incorrect id (%u)\n", + id); + goto err; + } + if (queue->tx_link[id] != TX_PENDING) { + dev_alert(dev, + "Response for inactive request\n"); + goto err; + } + + queue->tx_link[id] = TX_LINK_NONE; skb = queue->tx_skbs[id]; queue->tx_skbs[id] = NULL; if (unlikely(gnttab_query_foreign_access( queue->grant_tx_ref[id]) != 0)) { - pr_alert("%s: warning -- grant still in use by backend domain\n", - __func__); - BUG(); + dev_alert(dev, + "Grant still in use by backend domain\n"); + goto err; } gnttab_end_foreign_access_ref( queue->grant_tx_ref[id], GNTMAP_readonly); @@ -404,6 +428,12 @@ static void xennet_tx_buf_gc(struct netf } while (more_to_do);
xennet_maybe_wake_tx(queue); + + return; + + err: + queue->info->broken = true; + dev_alert(dev, "Disabled for further use\n"); }
struct xennet_gnttab_make_txreq { @@ -447,6 +477,12 @@ static void xennet_tx_setup_grant(unsign
*tx = info->tx_local;
+ /* + * Put the request in the pending queue, it will be set to be pending + * when the producer index is about to be raised. + */ + add_id_to_list(&queue->tx_pend_queue, queue->tx_link, id); + info->tx = tx; info->size += info->tx_local.size; } @@ -539,6 +575,15 @@ static u16 xennet_select_queue(struct ne return queue_idx; }
+static void xennet_mark_tx_pending(struct netfront_queue *queue) +{ + unsigned int i; + + while ((i = get_id_from_list(&queue->tx_pend_queue, queue->tx_link)) != + TX_LINK_NONE) + queue->tx_link[i] = TX_PENDING; +} + #define MAX_XEN_SKB_FRAGS (65536 / XEN_PAGE_SIZE + 1)
static netdev_tx_t xennet_start_xmit(struct sk_buff *skb, struct net_device *dev) @@ -562,6 +607,8 @@ static netdev_tx_t xennet_start_xmit(str /* Drop the packet if no queues are set up */ if (num_queues < 1) goto drop; + if (unlikely(np->broken)) + goto drop; /* Determine which queue to transmit this SKB on */ queue_index = skb_get_queue_mapping(skb); queue = &np->queues[queue_index]; @@ -665,6 +712,8 @@ static netdev_tx_t xennet_start_xmit(str /* First request has the packet length. */ first_tx->size = skb->len;
+ xennet_mark_tx_pending(queue); + RING_PUSH_REQUESTS_AND_CHECK_NOTIFY(&queue->tx, notify); if (notify) notify_remote_via_irq(queue->tx_irq); @@ -989,6 +1038,13 @@ static int xennet_poll(struct napi_struc skb_queue_head_init(&tmpq);
rp = queue->rx.sring->rsp_prod; + if (RING_RESPONSE_PROD_OVERFLOW(&queue->rx, rp)) { + dev_alert(&dev->dev, "Illegal number of responses %u\n", + rp - queue->rx.rsp_cons); + queue->info->broken = true; + spin_unlock(&queue->rx_lock); + return 0; + } rmb(); /* Ensure we see queued responses up to 'rp'. */
i = queue->rx.rsp_cons; @@ -1207,6 +1263,9 @@ static irqreturn_t xennet_tx_interrupt(i struct netfront_queue *queue = dev_id; unsigned long flags;
+ if (queue->info->broken) + return IRQ_HANDLED; + spin_lock_irqsave(&queue->tx_lock, flags); xennet_tx_buf_gc(queue); spin_unlock_irqrestore(&queue->tx_lock, flags); @@ -1219,6 +1278,9 @@ static irqreturn_t xennet_rx_interrupt(i struct netfront_queue *queue = dev_id; struct net_device *dev = queue->info->netdev;
+ if (queue->info->broken) + return IRQ_HANDLED; + if (likely(netif_carrier_ok(dev) && RING_HAS_UNCONSUMED_RESPONSES(&queue->rx))) napi_schedule(&queue->napi); @@ -1240,6 +1302,10 @@ static void xennet_poll_controller(struc struct netfront_info *info = netdev_priv(dev); unsigned int num_queues = dev->real_num_tx_queues; unsigned int i; + + if (info->broken) + return; + for (i = 0; i < num_queues; ++i) xennet_interrupt(0, &info->queues[i]); } @@ -1611,6 +1677,7 @@ static int xennet_init_queue(struct netf
/* Initialise tx_skb_freelist as a free chain containing every entry. */ queue->tx_skb_freelist = 0; + queue->tx_pend_queue = TX_LINK_NONE; for (i = 0; i < NET_TX_RING_SIZE; i++) { queue->tx_link[i] = i + 1; queue->grant_tx_ref[i] = GRANT_INVALID_REF; @@ -1821,6 +1888,9 @@ static int talk_to_netback(struct xenbus if (info->queues) xennet_destroy_queues(info);
+ /* For the case of a reconnect reset the "broken" indicator. */ + info->broken = false; + err = xennet_create_queues(info, &num_queues); if (err < 0) { xenbus_dev_fatal(dev, err, "creating queues");
From: Juergen Gross jgross@suse.com
commit e679004dec37566f658a255157d3aed9d762a2b7 upstream.
Xen frontends shouldn't BUG() in case of illegal data received from their backends. So replace the BUG_ON()s when reading illegal data from the ring page with negative return values.
This is commit e679004dec37566f upstream.
Reviewed-by: Jan Beulich jbeulich@suse.com Signed-off-by: Juergen Gross jgross@suse.com Link: https://lore.kernel.org/r/20210707091045.460-1-jgross@suse.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/tty/hvc/hvc_xen.c | 17 ++++++++++++++--- 1 file changed, 14 insertions(+), 3 deletions(-)
--- a/drivers/tty/hvc/hvc_xen.c +++ b/drivers/tty/hvc/hvc_xen.c @@ -86,7 +86,11 @@ static int __write_console(struct xencon cons = intf->out_cons; prod = intf->out_prod; mb(); /* update queue values before going on */ - BUG_ON((prod - cons) > sizeof(intf->out)); + + if ((prod - cons) > sizeof(intf->out)) { + pr_err_once("xencons: Illegal ring page indices"); + return -EINVAL; + }
while ((sent < len) && ((prod - cons) < sizeof(intf->out))) intf->out[MASK_XENCONS_IDX(prod++, intf->out)] = data[sent++]; @@ -114,7 +118,10 @@ static int domU_write_console(uint32_t v */ while (len) { int sent = __write_console(cons, data, len); - + + if (sent < 0) + return sent; + data += sent; len -= sent;
@@ -138,7 +145,11 @@ static int domU_read_console(uint32_t vt cons = intf->in_cons; prod = intf->in_prod; mb(); /* get pointers before reading ring */ - BUG_ON((prod - cons) > sizeof(intf->in)); + + if ((prod - cons) > sizeof(intf->in)) { + pr_err_once("xencons: Illegal ring page indices"); + return -EINVAL; + }
while (cons != prod && recv < len) buf[recv++] = intf->in[MASK_XENCONS_IDX(cons++, intf->in)];
On 11/29/21 11:17 AM, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 5.4.163 release. There are 92 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Wed, 01 Dec 2021 18:16:51 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.4.163-rc1... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.4.y and the diffstat can be found below.
thanks,
greg k-h
Compiled and booted on my test system. No dmesg regressions.
Tested-by: Shuah Khan skhan@linuxfoundation.org
thanks, -- Shuah
On 2021/11/30 2:17, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 5.4.163 release. There are 92 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Wed, 01 Dec 2021 18:16:51 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.4.163-rc1... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.4.y and the diffstat can be found below.
thanks,
greg k-h
Tested on arm64 and x86 for 5.4.163-rc1,
Kernel repo: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git Branch: linux-5.4.y Version: 5.4.163-rc1 Commit: ddf64c6cf75aa84461ca3c666915638d4677912e Compiler: gcc version 7.3.0 (GCC)
arm64: -------------------------------------------------------------------- Testcase Result Summary: total: 9015 passed: 9015 failed: 0 timeout: 0 --------------------------------------------------------------------
x86: -------------------------------------------------------------------- Testcase Result Summary: total: 9015 passed: 9015 failed: 0 timeout: 0 --------------------------------------------------------------------
Tested-by: Hulk Robot hulkrobot@huawei.com
On 11/29/2021 10:17 AM, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 5.4.163 release. There are 92 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Wed, 01 Dec 2021 18:16:51 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.4.163-rc1... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.4.y and the diffstat can be found below.
thanks,
greg k-h
On ARCH_BRCMSTB using 32-bit and 64-bit ARM kernels:
Tested-by: Florian Fainelli f.fainelli@gmail.com
On Mon, 29 Nov 2021 19:17:29 +0100, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 5.4.163 release. There are 92 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Wed, 01 Dec 2021 18:16:51 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.4.163-rc1... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.4.y and the diffstat can be found below.
thanks,
greg k-h
All tests passing for Tegra ...
Test results for stable-v5.4: 10 builds: 10 pass, 0 fail 26 boots: 26 pass, 0 fail 59 tests: 59 pass, 0 fail
Linux version: 5.4.163-rc1-g53673b0fe933 Boards tested: tegra124-jetson-tk1, tegra186-p2771-0000, tegra194-p2972-0000, tegra20-ventana, tegra210-p2371-2180, tegra210-p3450-0000, tegra30-cardhu-a04
Tested-by: Jon Hunter jonathanh@nvidia.com
Jon
On Mon, 29 Nov 2021 at 23:53, Greg Kroah-Hartman gregkh@linuxfoundation.org wrote:
This is the start of the stable review cycle for the 5.4.163 release. There are 92 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Wed, 01 Dec 2021 18:16:51 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.4.163-rc1... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.4.y and the diffstat can be found below.
thanks,
greg k-h
Results from Linaro’s test farm. No regressions on arm64, arm, x86_64, and i386.
Tested-by: Linux Kernel Functional Testing lkft@linaro.org
## Build * kernel: 5.4.163-rc1 * git: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git * git branch: linux-5.4.y * git commit: 53673b0fe933a85d63ae4a4ed78945e5a2ce1b8f * git describe: v5.4.162-93-g53673b0fe933 * test details: https://qa-reports.linaro.org/lkft/linux-stable-rc-linux-5.4.y/build/v5.4.16...
## No regressions (compared to v5.4.161-101-ge0db70362c5c)
## No fixes (compared to v5.4.161-101-ge0db70362c5c)
## Test result summary total: 84761, pass: 69858, fail: 802, skip: 12926, xfail: 1175
## Build Summary * arc: 10 total, 10 passed, 0 failed * arm: 258 total, 258 passed, 0 failed * arm64: 74 total, 48 passed, 26 failed * dragonboard-410c: 1 total, 1 passed, 0 failed * hi6220-hikey: 1 total, 1 passed, 0 failed * i386: 20 total, 20 passed, 0 failed * juno-r2: 1 total, 1 passed, 0 failed * mips: 34 total, 34 passed, 0 failed * parisc: 12 total, 12 passed, 0 failed * powerpc: 54 total, 48 passed, 6 failed * riscv: 24 total, 24 passed, 0 failed * s390: 12 total, 12 passed, 0 failed * sh: 20 total, 20 passed, 0 failed * sparc: 12 total, 12 passed, 0 failed * x15: 1 total, 1 passed, 0 failed * x86: 1 total, 1 passed, 0 failed * x86_64: 36 total, 36 passed, 0 failed
## Test suites summary * fwts * igt-gpu-tools * kselftest-android * kselftest-arm64 * kselftest-arm64/arm64.btitest.bti_c_func * kselftest-arm64/arm64.btitest.bti_j_func * kselftest-arm64/arm64.btitest.bti_jc_func * kselftest-arm64/arm64.btitest.bti_none_func * kselftest-arm64/arm64.btitest.nohint_func * kselftest-arm64/arm64.btitest.paciasp_func * kselftest-arm64/arm64.nobtitest.bti_c_func * kselftest-arm64/arm64.nobtitest.bti_j_func * kselftest-arm64/arm64.nobtitest.bti_jc_func * kselftest-arm64/arm64.nobtitest.bti_none_func * kselftest-arm64/arm64.nobtitest.nohint_func * kselftest-arm64/arm64.nobtitest.paciasp_func * kselftest-bpf * kselftest-breakpoints * kselftest-capabilities * kselftest-cgroup * kselftest-clone3 * kselftest-core * kselftest-cpu-hotplug * kselftest-cpufreq * kselftest-drivers * kselftest-efivarfs * kselftest-filesystems * kselftest-firmware * kselftest-fpu * kselftest-futex * kselftest-gpio * kselftest-intel_pstate * kselftest-ipc * kselftest-ir * kselftest-kcmp * kselftest-kexec * kselftest-kvm * kselftest-lib * kselftest-livepatch * kselftest-membarrier * kselftest-memfd * kselftest-memory-hotplug * kselftest-mincore * kselftest-mount * kselftest-mqueue * kselftest-net * kselftest-netfilter * kselftest-nsfs * kselftest-openat2 * kselftest-pid_namespace * kselftest-pidfd * kselftest-proc * kselftest-pstore * kselftest-ptrace * kselftest-rseq * kselftest-rtc * kselftest-seccomp * kselftest-sigaltstack * kselftest-size * kselftest-splice * kselftest-static_keys * kselftest-sync * kselftest-sysctl * kselftest-tc-testing * kselftest-timens * kselftest-timers * kselftest-tmpfs * kselftest-tpm2 * kselftest-user * kselftest-vm * kselftest-x86 * kselftest-zram * kvm-unit-tests * libgpiod * libhugetlbfs * linux-log-parser * ltp-cap_bounds-tests * ltp-commands-tests * ltp-containers-tests * ltp-controllers-tests * ltp-cpuhotplug-tests * ltp-crypto-tests * ltp-cve-tests * ltp-dio-tests * ltp-fcntl-locktests-tests * ltp-filecaps-tests * ltp-fs-tests * ltp-fs_bind-tests * ltp-fs_perms_simple-tests * ltp-fsx-tests * ltp-hugetlb-tests * ltp-io-tests * ltp-ipc-tests * ltp-math-tests * ltp-mm-tests * ltp-nptl-tests * ltp-open-posix-tests * ltp-pty-tests * ltp-sched-tests * ltp-securebits-tests * ltp-syscalls-tests * ltp-tracing-tests * network-basic-tests * packetdrill * perf * rcutorture * ssuite * v4l2-compliance
-- Linaro LKFT https://lkft.linaro.org
Hi Greg,
On Mon, Nov 29, 2021 at 07:17:29PM +0100, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 5.4.163 release. There are 92 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Wed, 01 Dec 2021 18:16:51 +0000. Anything received after that time might be too late.
Build test: mips (gcc version 11.2.1 20211112): 65 configs -> no new failure arm (gcc version 11.2.1 20211112): 107 configs -> no new failure arm64 (gcc version 11.2.1 20211112): 2 configs -> no failure x86_64 (gcc version 11.2.1 20211112): 4 configs -> no failure
Boot test: x86_64: Booted on my test laptop. No regression. x86_64: Booted on qemu. No regression. [1]
[1]. https://openqa.qa.codethink.co.uk/tests/453
Tested-by: Sudip Mukherjee sudip.mukherjee@codethink.co.uk
-- Regards Sudip
On Mon, Nov 29, 2021 at 07:17:29PM +0100, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 5.4.163 release. There are 92 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Wed, 01 Dec 2021 18:16:51 +0000. Anything received after that time might be too late.
Build results: total: 157 pass: 157 fail: 0 Qemu test results: total: 446 pass: 446 fail: 0
Tested-by: Guenter Roeck linux@roeck-us.net
Guenter
linux-stable-mirror@lists.linaro.org