This is the start of the stable review cycle for the 5.1.13 release. There are 98 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Sat 22 Jun 2019 05:42:15 PM UTC. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.1.13-rc1.... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.1.y and the diffstat can be found below.
thanks,
greg k-h
------------- Pseudo-Shortlog of commits:
Greg Kroah-Hartman gregkh@linuxfoundation.org Linux 5.1.13-rc1
Andrea Arcangeli aarcange@redhat.com coredump: fix race condition between collapse_huge_page() and core dumping
Sagi Grimberg sagi@grimberg.me nvme-tcp: fix queue mapping when queue count is limited
Sagi Grimberg sagi@grimberg.me nvme-tcp: fix possible null deref on a timed out io queue connect
Sagi Grimberg sagi@grimberg.me nvme-tcp: rename function to have nvme_tcp prefix
Yang Shi yang.shi@linux.alibaba.com mm: mmu_gather: remove __tlb_reset_range() for force flush
Tobin C. Harding tobin@kernel.org ocfs2: fix error path kobject memory leak
Amit Cohen amitc@mellanox.com mlxsw: spectrum: Prevent force of 56G
Jason Yan yanaijie@huawei.com scsi: libsas: delete sas port if expander discover failed
YueHaibing yuehaibing@huawei.com scsi: scsi_dh_alua: Fix possible null-ptr-deref
Lianbo Jiang lijiang@redhat.com scsi: smartpqi: properly set both the DMA mask and the coherent DMA mask
Varun Prakash varun@chelsio.com scsi: libcxgbi: add a check for NULL pointer in cxgbi_check_route()
Max Uvarov muvarov@gmail.com net: phy: dp83867: Set up RGMII TX delay
Max Uvarov muvarov@gmail.com net: phy: dp83867: increase SGMII autoneg timer duration
Max Uvarov muvarov@gmail.com net: phy: dp83867: fix speed 10 in sgmii mode
Russell King rmk+kernel@armlinux.org.uk net: phylink: ensure consistent phy interface mode
Jes Sorensen jsorensen@fb.com blk-mq: Fix memory leak in error handling
Yoshihiro Shimoda yoshihiro.shimoda.uh@renesas.com net: sh_eth: fix mdio access in sh_eth_close() for R-Car Gen2 and RZ/A1 SoCs
Sami Tolvanen samitolvanen@google.com arm64: use the correct function type for __arm64_sys_ni_syscall
Sami Tolvanen samitolvanen@google.com arm64: use the correct function type in SYSCALL_DEFINE0
Sami Tolvanen samitolvanen@google.com arm64: fix syscall_fn_t type
Geert Uytterhoeven geert@linux-m68k.org ALSA: fireface: Use ULL suffixes for 64-bit constants
Paul Mackerras paulus@ozlabs.org KVM: PPC: Book3S HV: Don't take kvm->lock around kvm_for_each_vcpu
Paul Mackerras paulus@ozlabs.org KVM: PPC: Book3S: Use new mutex to synchronize access to rtas token list
Paul Mackerras paulus@ozlabs.org KVM: PPC: Book3S HV: Use new mutex to synchronize MMU setup
Gen Zhang blackgod016574@gmail.com dfs_cache: fix a wrong use of kfree in flush_cache_ent()
Ross Lagerwall ross.lagerwall@citrix.com xenbus: Avoid deadlock during suspend due to open transactions
YueHaibing yuehaibing@huawei.com xen/pvcalls: Remove set but not used variable
Madalin Bucur madalin.bucur@nxp.com dpaa_eth: use only online CPU portals
Randy Dunlap rdunlap@infradead.org ia64: fix build errors by exporting paddr_to_nid()
Thomas Richter tmricht@linux.ibm.com perf record: Fix s390 missing module symbol and warning for non-root users
Namhyung Kim namhyung@kernel.org perf namespace: Protect reading thread's namespace
Harald Freudenberger freude@linux.ibm.com s390/zcrypt: Fix wrong dispatching for control domain CPRBs
Shawn Landden shawn@git.icu perf data: Fix 'strncat may truncate' build failure with recent gcc
Sahitya Tummala stummala@codeaurora.org configfs: Fix use-after-free when accessing sd->s_dentry
Bard Liao yung-chuan.liao@linux.intel.com ALSA: hda - Force polling mode on CNL for fixing codec communication
Yingjoe Chen yingjoe.chen@mediatek.com i2c: dev: fix potential memory leak in i2cdev_ioctl_rdwr
Dmitry Bogdanov dmitry.bogdanov@aquantia.com net: aquantia: fix LRO with FCS error
Igor Russkikh Igor.Russkikh@aquantia.com net: aquantia: tx clean budget logic error
Lucas Stach l.stach@pengutronix.de drm/etnaviv: lock MMU while dumping core
Rafael J. Wysocki rafael.j.wysocki@intel.com ACPI/PCI: PM: Add missing wakeup.flags.valid checks
Kees Cook keescook@chromium.org net: tulip: de4x5: Drop redundant MODULE_DEVICE_TABLE()
Ioana Radulescu ruxandra.radulescu@nxp.com dpaa2-eth: Use PTR_ERR_OR_ZERO where appropriate
Ioana Radulescu ruxandra.radulescu@nxp.com dpaa2-eth: Fix potential spectre issue
Pavel Begunkov asml.silence@gmail.com io_uring: Fix __io_uring_register() false success
Biao Huang biao.huang@mediatek.com net: stmmac: dwmac-mediatek: modify csr_clk value to fix mdio read/write fail
Biao Huang biao.huang@mediatek.com net: stmmac: fix csr_clk can't be zero issue
Biao Huang biao.huang@mediatek.com net: stmmac: update rx tail pointer register to fix rx dma hang issue.
Randy Dunlap rdunlap@infradead.org gpio: fix gpio-adp5588 build errors
Peter Zijlstra peterz@infradead.org perf/ring-buffer: Always use {READ,WRITE}_ONCE() for rb->user_page data
Peter Zijlstra peterz@infradead.org perf/ring_buffer: Add ordering to rb->nest increment
Yabin Cui yabinc@google.com perf/ring_buffer: Fix exposing a temporarily decreased data_head
Frank van der Linden fllinden@amazon.com x86/CPU/AMD: Don't force the CPB cap when running under a hypervisor
Dan Carpenter dan.carpenter@oracle.com mISDN: make sure device name is NUL terminated
Jia-Ju Bai baijiaju1990@gmail.com usb: xhci: Fix a potential null pointer dereference in xhci_debugfs_create_endpoint()
Anju T Sudhakar anju@linux.vnet.ibm.com powerpc/powernv: Return for invalid IMC domain
Tony Lindgren tony@atomide.com clk: ti: clkctrl: Fix clkdm_clk handling
Jeffrin Jose T jeffrin@rajagiritech.edu.in selftests: netfilter: missing error check when setting up veth interface
YueHaibing yuehaibing@huawei.com ipvs: Fix use-after-free in ip_vs_in
Phil Sutter phil@nwl.cc netfilter: nft_fib: Fix existence check support
Jagdish Motwani jagdish.motwani@sophos.com netfilter: nf_queue: fix reinject verdict handling
Stephane Eranian eranian@google.com perf/x86/intel/ds: Fix EVENT vs. UEVENT PEBS constraints
Florian Westphal fw@strlen.de netfilter: nf_tables: fix oops during rule dump
Kai-Heng Feng kai.heng.feng@canonical.com pinctrl: intel: Clear interrupt status in mask/unmask callback
Dan Carpenter dan.carpenter@oracle.com staging: wilc1000: Fix some double unlock bugs in wilc_wlan_cleanup()
Dan Carpenter dan.carpenter@oracle.com Staging: vc04_services: Fix a couple error codes
Chengguang Xu cgxu519@gmail.com staging: erofs: set sb->s_root to NULL when failing from __getname()
Steve Moskovchenko stevemo@skydio.com iio: imu: mpu6050: Fix FIFO layout for ICM20602
Alaa Hleihel alaa@mellanox.com net/mlx5e: Avoid detaching non-existing netdev under switchdev mode
Willem de Bruijn willemb@google.com net: correct udp zerocopy refcnt also when zerocopy only on append
Eli Britstein elibr@mellanox.com net/mlx5e: Support tagged tunnel over bond
Petr Machata petrm@mellanox.com mlxsw: spectrum_buffers: Reduce pool size on Spectrum-2
Raed Salem raeds@mellanox.com net/mlx5e: Fix source port matching in fdb peer flow rule
Jiri Pirko jiri@mellanox.com mlxsw: spectrum_flower: Fix TOS matching
Chris Mi chrism@mellanox.com net/mlx5e: Add ndo_set_feature for uplink representor
Ido Schimmel idosch@mellanox.com mlxsw: spectrum_router: Refresh nexthop neighbour when it becomes dead
Edward Srouji edwards@mellanox.com net/mlx5: Update pci error handler entries and command translation
Maxime Chevallier maxime.chevallier@bootlin.com net: ethtool: Allow matching on vlan DEI bit
Robert Hancock hancock@sedsystems.ca net: dsa: microchip: Don't try to read stats for unused ports
Maxime Chevallier maxime.chevallier@bootlin.com net: mvpp2: prs: Use the correct helpers when removing all VID filters
Maxime Chevallier maxime.chevallier@bootlin.com net: mvpp2: prs: Fix parser range for VID filtering
Stefano Brivio sbrivio@redhat.com geneve: Don't assume linear buffers in error handler
Stefano Brivio sbrivio@redhat.com vxlan: Don't assume linear buffers in error handler
Alaa Hleihel alaa@mellanox.com net/mlx5: Avoid reloading already removed devices
Stephen Barber smbarber@chromium.org vsock/virtio: set SOCK_DONE on peer shutdown
Xin Long lucien.xin@gmail.com tipc: purge deferredq list for each grp member in tipc_group_delete
John Paul Adrian Glaubitz glaubitz@physik.fu-berlin.de sunhv: Fix device naming inconsistency between sunhv_console and sunhv_reg
Neil Horman nhorman@tuxdriver.com sctp: Free cookie before we memdup a new one
Young Xiao 92siuyang@gmail.com nfc: Ensure presence of required attributes in the deactivate_target handler
John Fastabend john.fastabend@gmail.com net: tls, correctly account for copied bytes with multiple sk_msgs
Taehee Yoo ap420073@gmail.com net: openvswitch: do not free vport if register_netdevice() is failed.
Linus Walleij linus.walleij@linaro.org net: dsa: rtl8366: Fix up VLAN filtering
Eric Dumazet edumazet@google.com neigh: fix use-after-free read in pneigh_get_next
Jeremy Sowden jeremy@azazel.net lapb: fixed leak of control-blocks.
Eric Dumazet edumazet@google.com ipv6: flowlabel: fl6_sock_lookup() must use atomic_inc_not_zero
Haiyang Zhang haiyangz@microsoft.com hv_netvsc: Set probe mode to sync
Ivan Vecera ivecera@redhat.com be2net: Fix number of Rx queues used for flow hashing
Eric Dumazet edumazet@google.com ax25: fix inconsistent lock state in ax25_destroy_timer
Florian Westphal fw@strlen.de netfilter: nat: fix udp checksum corruption
-------------
Diffstat:
Makefile | 4 +- arch/arm64/include/asm/syscall.h | 2 +- arch/arm64/include/asm/syscall_wrapper.h | 18 +++--- arch/arm64/kernel/sys.c | 14 +++-- arch/arm64/kernel/sys32.c | 7 +-- arch/ia64/mm/numa.c | 1 + arch/powerpc/include/asm/kvm_host.h | 2 + arch/powerpc/kvm/book3s.c | 1 + arch/powerpc/kvm/book3s_64_mmu_hv.c | 36 +++++------ arch/powerpc/kvm/book3s_hv.c | 40 +++++++----- arch/powerpc/kvm/book3s_rtas.c | 14 ++--- arch/powerpc/platforms/powernv/opal-imc.c | 4 ++ arch/s390/include/asm/ap.h | 4 +- arch/x86/events/intel/ds.c | 28 ++++----- arch/x86/kernel/cpu/amd.c | 7 ++- block/blk-mq.c | 5 +- drivers/acpi/device_pm.c | 4 +- drivers/clk/ti/clkctrl.c | 8 +-- drivers/gpio/Kconfig | 1 + drivers/gpu/drm/etnaviv/etnaviv_dump.c | 5 ++ drivers/i2c/i2c-dev.c | 1 + drivers/iio/imu/inv_mpu6050/inv_mpu_core.c | 46 ++++++++++++-- drivers/iio/imu/inv_mpu6050/inv_mpu_iio.h | 20 +++++- drivers/iio/imu/inv_mpu6050/inv_mpu_ring.c | 3 + drivers/isdn/mISDN/socket.c | 5 +- drivers/net/dsa/microchip/ksz_common.c | 3 + drivers/net/dsa/rtl8366.c | 7 ++- drivers/net/ethernet/aquantia/atlantic/aq_ring.c | 7 ++- .../ethernet/aquantia/atlantic/hw_atl/hw_atl_b0.c | 61 +++++++++--------- drivers/net/ethernet/dec/tulip/de4x5.c | 1 - drivers/net/ethernet/emulex/benet/be_ethtool.c | 2 +- drivers/net/ethernet/freescale/dpaa/dpaa_eth.c | 9 ++- drivers/net/ethernet/freescale/dpaa/dpaa_ethtool.c | 4 +- drivers/net/ethernet/freescale/dpaa2/dpaa2-eth.c | 4 +- .../net/ethernet/freescale/dpaa2/dpaa2-ethtool.c | 3 + drivers/net/ethernet/marvell/mvpp2/mvpp2_prs.c | 23 +++---- drivers/net/ethernet/mellanox/mlx5/core/cmd.c | 8 +++ drivers/net/ethernet/mellanox/mlx5/core/dev.c | 25 +++++++- drivers/net/ethernet/mellanox/mlx5/core/en.h | 1 + .../net/ethernet/mellanox/mlx5/core/en/tc_tun.c | 11 ++-- drivers/net/ethernet/mellanox/mlx5/core/en_main.c | 8 ++- drivers/net/ethernet/mellanox/mlx5/core/en_rep.c | 10 +-- drivers/net/ethernet/mellanox/mlx5/core/en_tc.c | 3 - drivers/net/ethernet/mellanox/mlxsw/spectrum.c | 4 ++ .../net/ethernet/mellanox/mlxsw/spectrum_buffers.c | 4 +- .../net/ethernet/mellanox/mlxsw/spectrum_flower.c | 4 +- .../net/ethernet/mellanox/mlxsw/spectrum_router.c | 73 +++++++++++++++++++++- drivers/net/ethernet/renesas/sh_eth.c | 4 ++ .../net/ethernet/stmicro/stmmac/dwmac-mediatek.c | 2 - drivers/net/ethernet/stmicro/stmmac/stmmac_main.c | 7 ++- .../net/ethernet/stmicro/stmmac/stmmac_platform.c | 5 +- drivers/net/geneve.c | 2 +- drivers/net/hyperv/netvsc_drv.c | 2 +- drivers/net/phy/dp83867.c | 39 +++++++++++- drivers/net/phy/phylink.c | 10 ++- drivers/net/vxlan.c | 2 +- drivers/nvme/host/tcp.c | 70 ++++++++++++++++----- drivers/pci/pci-acpi.c | 3 +- drivers/pinctrl/intel/pinctrl-intel.c | 37 ++--------- drivers/s390/crypto/ap_bus.c | 26 ++++++-- drivers/s390/crypto/ap_bus.h | 3 + drivers/s390/crypto/zcrypt_api.c | 17 ++++- drivers/scsi/cxgbi/libcxgbi.c | 4 ++ drivers/scsi/device_handler/scsi_dh_alua.c | 6 +- drivers/scsi/libsas/sas_expander.c | 2 + drivers/scsi/smartpqi/smartpqi_init.c | 2 +- drivers/staging/erofs/super.c | 1 + .../vc04_services/bcm2835-camera/controls.c | 4 +- drivers/staging/wilc1000/wilc_wlan.c | 8 ++- drivers/tty/serial/sunhv.c | 2 +- drivers/usb/host/xhci-debugfs.c | 3 + drivers/xen/pvcalls-front.c | 4 -- drivers/xen/xenbus/xenbus.h | 3 + drivers/xen/xenbus/xenbus_dev_frontend.c | 18 ++++++ drivers/xen/xenbus/xenbus_xs.c | 7 ++- fs/cifs/dfs_cache.c | 4 +- fs/configfs/dir.c | 14 ++--- fs/io_uring.c | 2 +- fs/ocfs2/filecheck.c | 1 + include/linux/sched/mm.h | 4 ++ include/net/flow_dissector.h | 1 + include/net/netfilter/nft_fib.h | 2 +- kernel/events/ring_buffer.c | 39 +++++++++--- mm/khugepaged.c | 3 + mm/mmu_gather.c | 24 +++++-- net/ax25/ax25_route.c | 2 + net/core/ethtool.c | 5 ++ net/core/neighbour.c | 7 +++ net/ipv4/ip_output.c | 2 +- net/ipv4/netfilter/nft_fib_ipv4.c | 23 +------ net/ipv6/ip6_flowlabel.c | 7 ++- net/ipv6/ip6_output.c | 2 +- net/ipv6/netfilter/nft_fib_ipv6.c | 16 +---- net/lapb/lapb_iface.c | 1 + net/netfilter/ipvs/ip_vs_core.c | 2 +- net/netfilter/nf_nat_helper.c | 2 +- net/netfilter/nf_queue.c | 1 + net/netfilter/nf_tables_api.c | 20 +++--- net/netfilter/nft_fib.c | 6 +- net/nfc/netlink.c | 3 +- net/openvswitch/vport-internal_dev.c | 18 ++++-- net/sctp/sm_make_chunk.c | 8 +++ net/tipc/group.c | 1 + net/tls/tls_sw.c | 1 - net/vmw_vsock/virtio_transport_common.c | 4 +- sound/firewire/fireface/ff-protocol-latter.c | 10 +-- sound/pci/hda/hda_intel.c | 5 +- tools/perf/arch/s390/util/machine.c | 9 ++- tools/perf/util/data-convert-bt.c | 2 +- tools/perf/util/thread.c | 15 ++++- tools/testing/selftests/netfilter/nft_nat.sh | 6 +- 111 files changed, 760 insertions(+), 360 deletions(-)
From: Florian Westphal fw@strlen.de
commit 6bac76db1da3cb162c425d58ae421486f8e43955 upstream.
Due to copy&paste error nf_nat_mangle_udp_packet passes IPPROTO_TCP, resulting in incorrect udp checksum when payload had to be mangled.
Fixes: dac3fe72596f9 ("netfilter: nat: remove csum_recalc hook") Reported-by: Marc Haber mh+netdev@zugschlus.de Tested-by: Marc Haber mh+netdev@zugschlus.de Signed-off-by: Florian Westphal fw@strlen.de Signed-off-by: Pablo Neira Ayuso pablo@netfilter.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- net/netfilter/nf_nat_helper.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/net/netfilter/nf_nat_helper.c +++ b/net/netfilter/nf_nat_helper.c @@ -170,7 +170,7 @@ nf_nat_mangle_udp_packet(struct sk_buff if (!udph->check && skb->ip_summed != CHECKSUM_PARTIAL) return true;
- nf_nat_csum_recalc(skb, nf_ct_l3num(ct), IPPROTO_TCP, + nf_nat_csum_recalc(skb, nf_ct_l3num(ct), IPPROTO_UDP, udph, &udph->check, datalen, oldlen);
return true;
From: Eric Dumazet edumazet@google.com
[ Upstream commit d4d5d8e83c9616aeef28a2869cea49cc3fb35526 ]
Before thread in process context uses bh_lock_sock() we must disable bh.
sysbot reported :
WARNING: inconsistent lock state 5.2.0-rc3+ #32 Not tainted
inconsistent {SOFTIRQ-ON-W} -> {IN-SOFTIRQ-W} usage. blkid/26581 [HC0[0]:SC1[1]:HE1:SE0] takes: 00000000e0da85ee (slock-AF_AX25){+.?.}, at: spin_lock include/linux/spinlock.h:338 [inline] 00000000e0da85ee (slock-AF_AX25){+.?.}, at: ax25_destroy_timer+0x53/0xc0 net/ax25/af_ax25.c:275 {SOFTIRQ-ON-W} state was registered at: lock_acquire+0x16f/0x3f0 kernel/locking/lockdep.c:4303 __raw_spin_lock include/linux/spinlock_api_smp.h:142 [inline] _raw_spin_lock+0x2f/0x40 kernel/locking/spinlock.c:151 spin_lock include/linux/spinlock.h:338 [inline] ax25_rt_autobind+0x3ca/0x720 net/ax25/ax25_route.c:429 ax25_connect.cold+0x30/0xa4 net/ax25/af_ax25.c:1221 __sys_connect+0x264/0x330 net/socket.c:1834 __do_sys_connect net/socket.c:1845 [inline] __se_sys_connect net/socket.c:1842 [inline] __x64_sys_connect+0x73/0xb0 net/socket.c:1842 do_syscall_64+0xfd/0x680 arch/x86/entry/common.c:301 entry_SYSCALL_64_after_hwframe+0x49/0xbe irq event stamp: 2272 hardirqs last enabled at (2272): [<ffffffff810065f3>] trace_hardirqs_on_thunk+0x1a/0x1c hardirqs last disabled at (2271): [<ffffffff8100660f>] trace_hardirqs_off_thunk+0x1a/0x1c softirqs last enabled at (1522): [<ffffffff87400654>] __do_softirq+0x654/0x94c kernel/softirq.c:320 softirqs last disabled at (2267): [<ffffffff81449010>] invoke_softirq kernel/softirq.c:374 [inline] softirqs last disabled at (2267): [<ffffffff81449010>] irq_exit+0x180/0x1d0 kernel/softirq.c:414
other info that might help us debug this: Possible unsafe locking scenario:
CPU0 ---- lock(slock-AF_AX25); <Interrupt> lock(slock-AF_AX25);
*** DEADLOCK ***
1 lock held by blkid/26581: #0: 0000000010fd154d ((&ax25->dtimer)){+.-.}, at: lockdep_copy_map include/linux/lockdep.h:175 [inline] #0: 0000000010fd154d ((&ax25->dtimer)){+.-.}, at: call_timer_fn+0xe0/0x720 kernel/time/timer.c:1312
stack backtrace: CPU: 1 PID: 26581 Comm: blkid Not tainted 5.2.0-rc3+ #32 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Call Trace: <IRQ> __dump_stack lib/dump_stack.c:77 [inline] dump_stack+0x172/0x1f0 lib/dump_stack.c:113 print_usage_bug.cold+0x393/0x4a2 kernel/locking/lockdep.c:2935 valid_state kernel/locking/lockdep.c:2948 [inline] mark_lock_irq kernel/locking/lockdep.c:3138 [inline] mark_lock+0xd46/0x1370 kernel/locking/lockdep.c:3513 mark_irqflags kernel/locking/lockdep.c:3391 [inline] __lock_acquire+0x159f/0x5490 kernel/locking/lockdep.c:3745 lock_acquire+0x16f/0x3f0 kernel/locking/lockdep.c:4303 __raw_spin_lock include/linux/spinlock_api_smp.h:142 [inline] _raw_spin_lock+0x2f/0x40 kernel/locking/spinlock.c:151 spin_lock include/linux/spinlock.h:338 [inline] ax25_destroy_timer+0x53/0xc0 net/ax25/af_ax25.c:275 call_timer_fn+0x193/0x720 kernel/time/timer.c:1322 expire_timers kernel/time/timer.c:1366 [inline] __run_timers kernel/time/timer.c:1685 [inline] __run_timers kernel/time/timer.c:1653 [inline] run_timer_softirq+0x66f/0x1740 kernel/time/timer.c:1698 __do_softirq+0x25c/0x94c kernel/softirq.c:293 invoke_softirq kernel/softirq.c:374 [inline] irq_exit+0x180/0x1d0 kernel/softirq.c:414 exiting_irq arch/x86/include/asm/apic.h:536 [inline] smp_apic_timer_interrupt+0x13b/0x550 arch/x86/kernel/apic/apic.c:1068 apic_timer_interrupt+0xf/0x20 arch/x86/entry/entry_64.S:806 </IRQ> RIP: 0033:0x7f858d5c3232 Code: 8b 61 08 48 8b 84 24 d8 00 00 00 4c 89 44 24 28 48 8b ac 24 d0 00 00 00 4c 8b b4 24 e8 00 00 00 48 89 7c 24 68 48 89 4c 24 78 <48> 89 44 24 58 8b 84 24 e0 00 00 00 89 84 24 84 00 00 00 8b 84 24 RSP: 002b:00007ffcaf0cf5c0 EFLAGS: 00000206 ORIG_RAX: ffffffffffffff13 RAX: 00007f858d7d27a8 RBX: 00007f858d7d8820 RCX: 00007f858d3940d8 RDX: 00007ffcaf0cf798 RSI: 00000000f5e616f3 RDI: 00007f858d394fee RBP: 0000000000000000 R08: 00007ffcaf0cf780 R09: 00007f858d7db480 R10: 0000000000000000 R11: 0000000009691a75 R12: 0000000000000005 R13: 00000000f5e616f3 R14: 0000000000000000 R15: 00007ffcaf0cf798
Signed-off-by: Eric Dumazet edumazet@google.com Reported-by: syzbot syzkaller@googlegroups.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/ax25/ax25_route.c | 2 ++ 1 file changed, 2 insertions(+)
--- a/net/ax25/ax25_route.c +++ b/net/ax25/ax25_route.c @@ -429,9 +429,11 @@ int ax25_rt_autobind(ax25_cb *ax25, ax25 }
if (ax25->sk != NULL) { + local_bh_disable(); bh_lock_sock(ax25->sk); sock_reset_flag(ax25->sk, SOCK_ZAPPED); bh_unlock_sock(ax25->sk); + local_bh_enable(); }
put:
From: Ivan Vecera ivecera@redhat.com
[ Upstream commit 718f4a2537089ea41903bf357071306163bc7c04 ]
Number of Rx queues used for flow hashing returned by the driver is incorrect and this bug prevents user to use the last Rx queue in indirection table.
Let's say we have a NIC with 6 combined queues:
[root@sm-03 ~]# ethtool -l enp4s0f0 Channel parameters for enp4s0f0: Pre-set maximums: RX: 5 TX: 5 Other: 0 Combined: 6 Current hardware settings: RX: 0 TX: 0 Other: 0 Combined: 6
Default indirection table maps all (6) queues equally but the driver reports only 5 rings available.
[root@sm-03 ~]# ethtool -x enp4s0f0 RX flow hash indirection table for enp4s0f0 with 5 RX ring(s): 0: 0 1 2 3 4 5 0 1 8: 2 3 4 5 0 1 2 3 16: 4 5 0 1 2 3 4 5 24: 0 1 2 3 4 5 0 1 ...
Now change indirection table somehow:
[root@sm-03 ~]# ethtool -X enp4s0f0 weight 1 1 [root@sm-03 ~]# ethtool -x enp4s0f0 RX flow hash indirection table for enp4s0f0 with 6 RX ring(s): 0: 0 0 0 0 0 0 0 0 ... 64: 1 1 1 1 1 1 1 1 ...
Now it is not possible to change mapping back to equal (default) state:
[root@sm-03 ~]# ethtool -X enp4s0f0 equal 6 Cannot set RX flow hash configuration: Invalid argument
Fixes: 594ad54a2c3b ("be2net: Add support for setting and getting rx flow hash options") Reported-by: Tianhao tizhao@redhat.com Signed-off-by: Ivan Vecera ivecera@redhat.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/ethernet/emulex/benet/be_ethtool.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/net/ethernet/emulex/benet/be_ethtool.c +++ b/drivers/net/ethernet/emulex/benet/be_ethtool.c @@ -1105,7 +1105,7 @@ static int be_get_rxnfc(struct net_devic cmd->data = be_get_rss_hash_opts(adapter, cmd->flow_type); break; case ETHTOOL_GRXRINGS: - cmd->data = adapter->num_rx_qs - 1; + cmd->data = adapter->num_rx_qs; break; default: return -EINVAL;
From: Haiyang Zhang haiyangz@microsoft.com
[ Upstream commit 9a33629ba6b26caebd73e3c581ba1e6068c696a7 ]
For better consistency of synthetic NIC names, we set the probe mode to PROBE_FORCE_SYNCHRONOUS. So the names can be aligned with the vmbus channel offer sequence.
Fixes: af0a5646cb8d ("use the new async probing feature for the hyperv drivers") Signed-off-by: Haiyang Zhang haiyangz@microsoft.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/hyperv/netvsc_drv.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/net/hyperv/netvsc_drv.c +++ b/drivers/net/hyperv/netvsc_drv.c @@ -2414,7 +2414,7 @@ static struct hv_driver netvsc_drv = { .probe = netvsc_probe, .remove = netvsc_remove, .driver = { - .probe_type = PROBE_PREFER_ASYNCHRONOUS, + .probe_type = PROBE_FORCE_SYNCHRONOUS, }, };
From: Eric Dumazet edumazet@google.com
[ Upstream commit 65a3c497c0e965a552008db8bc2653f62bc925a1 ]
Before taking a refcount, make sure the object is not already scheduled for deletion.
Same fix is needed in ipv6_flowlabel_opt()
Fixes: 18367681a10b ("ipv6 flowlabel: Convert np->ipv6_fl_list to RCU.") Signed-off-by: Eric Dumazet edumazet@google.com Cc: Willem de Bruijn willemb@google.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/ipv6/ip6_flowlabel.c | 7 ++++--- 1 file changed, 4 insertions(+), 3 deletions(-)
--- a/net/ipv6/ip6_flowlabel.c +++ b/net/ipv6/ip6_flowlabel.c @@ -254,9 +254,9 @@ struct ip6_flowlabel *fl6_sock_lookup(st rcu_read_lock_bh(); for_each_sk_fl_rcu(np, sfl) { struct ip6_flowlabel *fl = sfl->fl; - if (fl->label == label) { + + if (fl->label == label && atomic_inc_not_zero(&fl->users)) { fl->lastuse = jiffies; - atomic_inc(&fl->users); rcu_read_unlock_bh(); return fl; } @@ -622,7 +622,8 @@ int ipv6_flowlabel_opt(struct sock *sk, goto done; } fl1 = sfl->fl; - atomic_inc(&fl1->users); + if (!atomic_inc_not_zero(&fl1->users)) + fl1 = NULL; break; } }
From: Jeremy Sowden jeremy@azazel.net
[ Upstream commit 6be8e297f9bcea666ea85ac7a6cd9d52d6deaf92 ]
lapb_register calls lapb_create_cb, which initializes the control- block's ref-count to one, and __lapb_insert_cb, which increments it when adding the new block to the list of blocks.
lapb_unregister calls __lapb_remove_cb, which decrements the ref-count when removing control-block from the list of blocks, and calls lapb_put itself to decrement the ref-count before returning.
However, lapb_unregister also calls __lapb_devtostruct to look up the right control-block for the given net_device, and __lapb_devtostruct also bumps the ref-count, which means that when lapb_unregister returns the ref-count is still 1 and the control-block is leaked.
Call lapb_put after __lapb_devtostruct to fix leak.
Reported-by: syzbot+afb980676c836b4a0afa@syzkaller.appspotmail.com Signed-off-by: Jeremy Sowden jeremy@azazel.net Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/lapb/lapb_iface.c | 1 + 1 file changed, 1 insertion(+)
--- a/net/lapb/lapb_iface.c +++ b/net/lapb/lapb_iface.c @@ -182,6 +182,7 @@ int lapb_unregister(struct net_device *d lapb = __lapb_devtostruct(dev); if (!lapb) goto out; + lapb_put(lapb);
lapb_stop_t1timer(lapb); lapb_stop_t2timer(lapb);
From: Eric Dumazet edumazet@google.com
[ Upstream commit f3e92cb8e2eb8c27d109e6fd73d3a69a8c09e288 ]
Nine years ago, I added RCU handling to neighbours, not pneighbours. (pneigh are not commonly used)
Unfortunately I missed that /proc dump operations would use a common entry and exit point : neigh_seq_start() and neigh_seq_stop()
We need to read_lock(tbl->lock) or risk use-after-free while iterating the pneigh structures.
We might later convert pneigh to RCU and revert this patch.
sysbot reported :
BUG: KASAN: use-after-free in pneigh_get_next.isra.0+0x24b/0x280 net/core/neighbour.c:3158 Read of size 8 at addr ffff888097f2a700 by task syz-executor.0/9825
CPU: 1 PID: 9825 Comm: syz-executor.0 Not tainted 5.2.0-rc4+ #32 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Call Trace: __dump_stack lib/dump_stack.c:77 [inline] dump_stack+0x172/0x1f0 lib/dump_stack.c:113 print_address_description.cold+0x7c/0x20d mm/kasan/report.c:188 __kasan_report.cold+0x1b/0x40 mm/kasan/report.c:317 kasan_report+0x12/0x20 mm/kasan/common.c:614 __asan_report_load8_noabort+0x14/0x20 mm/kasan/generic_report.c:132 pneigh_get_next.isra.0+0x24b/0x280 net/core/neighbour.c:3158 neigh_seq_next+0xdb/0x210 net/core/neighbour.c:3240 seq_read+0x9cf/0x1110 fs/seq_file.c:258 proc_reg_read+0x1fc/0x2c0 fs/proc/inode.c:221 do_loop_readv_writev fs/read_write.c:714 [inline] do_loop_readv_writev fs/read_write.c:701 [inline] do_iter_read+0x4a4/0x660 fs/read_write.c:935 vfs_readv+0xf0/0x160 fs/read_write.c:997 kernel_readv fs/splice.c:359 [inline] default_file_splice_read+0x475/0x890 fs/splice.c:414 do_splice_to+0x127/0x180 fs/splice.c:877 splice_direct_to_actor+0x2d2/0x970 fs/splice.c:954 do_splice_direct+0x1da/0x2a0 fs/splice.c:1063 do_sendfile+0x597/0xd00 fs/read_write.c:1464 __do_sys_sendfile64 fs/read_write.c:1525 [inline] __se_sys_sendfile64 fs/read_write.c:1511 [inline] __x64_sys_sendfile64+0x1dd/0x220 fs/read_write.c:1511 do_syscall_64+0xfd/0x680 arch/x86/entry/common.c:301 entry_SYSCALL_64_after_hwframe+0x49/0xbe RIP: 0033:0x4592c9 Code: fd b7 fb ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 cb b7 fb ff c3 66 2e 0f 1f 84 00 00 00 00 RSP: 002b:00007f4aab51dc78 EFLAGS: 00000246 ORIG_RAX: 0000000000000028 RAX: ffffffffffffffda RBX: 0000000000000004 RCX: 00000000004592c9 RDX: 0000000000000000 RSI: 0000000000000004 RDI: 0000000000000005 RBP: 000000000075bf20 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000080000000 R11: 0000000000000246 R12: 00007f4aab51e6d4 R13: 00000000004c689d R14: 00000000004db828 R15: 00000000ffffffff
Allocated by task 9827: save_stack+0x23/0x90 mm/kasan/common.c:71 set_track mm/kasan/common.c:79 [inline] __kasan_kmalloc mm/kasan/common.c:489 [inline] __kasan_kmalloc.constprop.0+0xcf/0xe0 mm/kasan/common.c:462 kasan_kmalloc+0x9/0x10 mm/kasan/common.c:503 __do_kmalloc mm/slab.c:3660 [inline] __kmalloc+0x15c/0x740 mm/slab.c:3669 kmalloc include/linux/slab.h:552 [inline] pneigh_lookup+0x19c/0x4a0 net/core/neighbour.c:731 arp_req_set_public net/ipv4/arp.c:1010 [inline] arp_req_set+0x613/0x720 net/ipv4/arp.c:1026 arp_ioctl+0x652/0x7f0 net/ipv4/arp.c:1226 inet_ioctl+0x2a0/0x340 net/ipv4/af_inet.c:926 sock_do_ioctl+0xd8/0x2f0 net/socket.c:1043 sock_ioctl+0x3ed/0x780 net/socket.c:1194 vfs_ioctl fs/ioctl.c:46 [inline] file_ioctl fs/ioctl.c:509 [inline] do_vfs_ioctl+0xd5f/0x1380 fs/ioctl.c:696 ksys_ioctl+0xab/0xd0 fs/ioctl.c:713 __do_sys_ioctl fs/ioctl.c:720 [inline] __se_sys_ioctl fs/ioctl.c:718 [inline] __x64_sys_ioctl+0x73/0xb0 fs/ioctl.c:718 do_syscall_64+0xfd/0x680 arch/x86/entry/common.c:301 entry_SYSCALL_64_after_hwframe+0x49/0xbe
Freed by task 9824: save_stack+0x23/0x90 mm/kasan/common.c:71 set_track mm/kasan/common.c:79 [inline] __kasan_slab_free+0x102/0x150 mm/kasan/common.c:451 kasan_slab_free+0xe/0x10 mm/kasan/common.c:459 __cache_free mm/slab.c:3432 [inline] kfree+0xcf/0x220 mm/slab.c:3755 pneigh_ifdown_and_unlock net/core/neighbour.c:812 [inline] __neigh_ifdown+0x236/0x2f0 net/core/neighbour.c:356 neigh_ifdown+0x20/0x30 net/core/neighbour.c:372 arp_ifdown+0x1d/0x21 net/ipv4/arp.c:1274 inetdev_destroy net/ipv4/devinet.c:319 [inline] inetdev_event+0xa14/0x11f0 net/ipv4/devinet.c:1544 notifier_call_chain+0xc2/0x230 kernel/notifier.c:95 __raw_notifier_call_chain kernel/notifier.c:396 [inline] raw_notifier_call_chain+0x2e/0x40 kernel/notifier.c:403 call_netdevice_notifiers_info+0x3f/0x90 net/core/dev.c:1749 call_netdevice_notifiers_extack net/core/dev.c:1761 [inline] call_netdevice_notifiers net/core/dev.c:1775 [inline] rollback_registered_many+0x9b9/0xfc0 net/core/dev.c:8178 rollback_registered+0x109/0x1d0 net/core/dev.c:8220 unregister_netdevice_queue net/core/dev.c:9267 [inline] unregister_netdevice_queue+0x1ee/0x2c0 net/core/dev.c:9260 unregister_netdevice include/linux/netdevice.h:2631 [inline] __tun_detach+0xd8a/0x1040 drivers/net/tun.c:724 tun_detach drivers/net/tun.c:741 [inline] tun_chr_close+0xe0/0x180 drivers/net/tun.c:3451 __fput+0x2ff/0x890 fs/file_table.c:280 ____fput+0x16/0x20 fs/file_table.c:313 task_work_run+0x145/0x1c0 kernel/task_work.c:113 tracehook_notify_resume include/linux/tracehook.h:185 [inline] exit_to_usermode_loop+0x273/0x2c0 arch/x86/entry/common.c:168 prepare_exit_to_usermode arch/x86/entry/common.c:199 [inline] syscall_return_slowpath arch/x86/entry/common.c:279 [inline] do_syscall_64+0x58e/0x680 arch/x86/entry/common.c:304 entry_SYSCALL_64_after_hwframe+0x49/0xbe
The buggy address belongs to the object at ffff888097f2a700 which belongs to the cache kmalloc-64 of size 64 The buggy address is located 0 bytes inside of 64-byte region [ffff888097f2a700, ffff888097f2a740) The buggy address belongs to the page: page:ffffea00025fca80 refcount:1 mapcount:0 mapping:ffff8880aa400340 index:0x0 flags: 0x1fffc0000000200(slab) raw: 01fffc0000000200 ffffea000250d548 ffffea00025726c8 ffff8880aa400340 raw: 0000000000000000 ffff888097f2a000 0000000100000020 0000000000000000 page dumped because: kasan: bad access detected
Memory state around the buggy address: ffff888097f2a600: 00 00 00 00 00 00 00 00 fc fc fc fc fc fc fc fc ffff888097f2a680: fb fb fb fb fb fb fb fb fc fc fc fc fc fc fc fc
ffff888097f2a700: fb fb fb fb fb fb fb fb fc fc fc fc fc fc fc fc
^ ffff888097f2a780: fb fb fb fb fb fb fb fb fc fc fc fc fc fc fc fc ffff888097f2a800: fb fb fb fb fb fb fb fb fc fc fc fc fc fc fc fc
Fixes: 767e97e1e0db ("neigh: RCU conversion of struct neighbour") Signed-off-by: Eric Dumazet edumazet@google.com Reported-by: syzbot syzkaller@googlegroups.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/core/neighbour.c | 7 +++++++ 1 file changed, 7 insertions(+)
--- a/net/core/neighbour.c +++ b/net/core/neighbour.c @@ -3199,6 +3199,7 @@ static void *neigh_get_idx_any(struct se }
void *neigh_seq_start(struct seq_file *seq, loff_t *pos, struct neigh_table *tbl, unsigned int neigh_seq_flags) + __acquires(tbl->lock) __acquires(rcu_bh) { struct neigh_seq_state *state = seq->private; @@ -3209,6 +3210,7 @@ void *neigh_seq_start(struct seq_file *s
rcu_read_lock_bh(); state->nht = rcu_dereference_bh(tbl->nht); + read_lock(&tbl->lock);
return *pos ? neigh_get_idx_any(seq, pos) : SEQ_START_TOKEN; } @@ -3242,8 +3244,13 @@ out: EXPORT_SYMBOL(neigh_seq_next);
void neigh_seq_stop(struct seq_file *seq, void *v) + __releases(tbl->lock) __releases(rcu_bh) { + struct neigh_seq_state *state = seq->private; + struct neigh_table *tbl = state->tbl; + + read_unlock(&tbl->lock); rcu_read_unlock_bh(); } EXPORT_SYMBOL(neigh_seq_stop);
From: Linus Walleij linus.walleij@linaro.org
[ Upstream commit 760c80b70bed2cd01630e8595d1bbde910339f31 ]
We get this regression when using RTL8366RB as part of a bridge with OpenWrt:
WARNING: CPU: 0 PID: 1347 at net/switchdev/switchdev.c:291 switchdev_port_attr_set_now+0x80/0xa4 lan0: Commit of attribute (id=7) failed. (...) realtek-smi switch lan0: failed to initialize vlan filtering on this port
This is because it is trying to disable VLAN filtering on VLAN0, as we have forgot to add 1 to the port number to get the right VLAN in rtl8366_vlan_filtering(): when we initialize the VLAN we associate VLAN1 with port 0, VLAN2 with port 1 etc, so we need to add 1 to the port offset.
Fixes: d8652956cf37 ("net: dsa: realtek-smi: Add Realtek SMI driver") Signed-off-by: Linus Walleij linus.walleij@linaro.org Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/dsa/rtl8366.c | 7 ++++--- 1 file changed, 4 insertions(+), 3 deletions(-)
--- a/drivers/net/dsa/rtl8366.c +++ b/drivers/net/dsa/rtl8366.c @@ -307,7 +307,8 @@ int rtl8366_vlan_filtering(struct dsa_sw struct rtl8366_vlan_4k vlan4k; int ret;
- if (!smi->ops->is_vlan_valid(smi, port)) + /* Use VLAN nr port + 1 since VLAN0 is not valid */ + if (!smi->ops->is_vlan_valid(smi, port + 1)) return -EINVAL;
dev_info(smi->dev, "%s filtering on port %d\n", @@ -318,12 +319,12 @@ int rtl8366_vlan_filtering(struct dsa_sw * The hardware support filter ID (FID) 0..7, I have no clue how to * support this in the driver when the callback only says on/off. */ - ret = smi->ops->get_vlan_4k(smi, port, &vlan4k); + ret = smi->ops->get_vlan_4k(smi, port + 1, &vlan4k); if (ret) return ret;
/* Just set the filter to FID 1 for now then */ - ret = rtl8366_set_vlan(smi, port, + ret = rtl8366_set_vlan(smi, port + 1, vlan4k.member, vlan4k.untag, 1);
From: Taehee Yoo ap420073@gmail.com
[ Upstream commit 309b66970ee2abf721ecd0876a48940fa0b99a35 ]
In order to create an internal vport, internal_dev_create() is used and that calls register_netdevice() internally. If register_netdevice() fails, it calls dev->priv_destructor() to free private data of netdev. actually, a private data of this is a vport.
Hence internal_dev_create() should not free and use a vport after failure of register_netdevice().
Test command ovs-dpctl add-dp bonding_masters
Splat looks like: [ 1035.667767] kasan: GPF could be caused by NULL-ptr deref or user memory access [ 1035.675958] general protection fault: 0000 [#1] SMP DEBUG_PAGEALLOC KASAN PTI [ 1035.676916] CPU: 1 PID: 1028 Comm: ovs-vswitchd Tainted: G B 5.2.0-rc3+ #240 [ 1035.676916] RIP: 0010:internal_dev_create+0x2e5/0x4e0 [openvswitch] [ 1035.676916] Code: 48 c1 ea 03 80 3c 02 00 0f 85 9f 01 00 00 4c 8b 23 48 b8 00 00 00 00 00 fc ff df 49 8d bc 24 60 05 00 00 48 89 fa 48 c1 ea 03 <80> 3c 02 00 0f 85 86 01 00 00 49 8b bc 24 60 05 00 00 e8 e4 68 f4 [ 1035.713720] RSP: 0018:ffff88810dcb7578 EFLAGS: 00010206 [ 1035.713720] RAX: dffffc0000000000 RBX: ffff88810d13fe08 RCX: ffffffff84297704 [ 1035.713720] RDX: 00000000000000ac RSI: 0000000000000000 RDI: 0000000000000560 [ 1035.713720] RBP: 00000000ffffffef R08: fffffbfff0d3b881 R09: fffffbfff0d3b881 [ 1035.713720] R10: 0000000000000001 R11: fffffbfff0d3b880 R12: 0000000000000000 [ 1035.768776] R13: 0000607ee460b900 R14: ffff88810dcb7690 R15: ffff88810dcb7698 [ 1035.777709] FS: 00007f02095fc980(0000) GS:ffff88811b400000(0000) knlGS:0000000000000000 [ 1035.777709] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 1035.777709] CR2: 00007ffdf01d2f28 CR3: 0000000108258000 CR4: 00000000001006e0 [ 1035.777709] Call Trace: [ 1035.777709] ovs_vport_add+0x267/0x4f0 [openvswitch] [ 1035.777709] new_vport+0x15/0x1e0 [openvswitch] [ 1035.777709] ovs_vport_cmd_new+0x567/0xd10 [openvswitch] [ 1035.777709] ? ovs_dp_cmd_dump+0x490/0x490 [openvswitch] [ 1035.777709] ? __kmalloc+0x131/0x2e0 [ 1035.777709] ? genl_family_rcv_msg+0xa54/0x1030 [ 1035.777709] genl_family_rcv_msg+0x63a/0x1030 [ 1035.777709] ? genl_unregister_family+0x630/0x630 [ 1035.841681] ? debug_show_all_locks+0x2d0/0x2d0 [ ... ]
Fixes: cf124db566e6 ("net: Fix inconsistent teardown and release of private netdev state.") Signed-off-by: Taehee Yoo ap420073@gmail.com Reviewed-by: Greg Rose gvrose8192@gmail.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/openvswitch/vport-internal_dev.c | 18 ++++++++++++------ 1 file changed, 12 insertions(+), 6 deletions(-)
--- a/net/openvswitch/vport-internal_dev.c +++ b/net/openvswitch/vport-internal_dev.c @@ -170,7 +170,9 @@ static struct vport *internal_dev_create { struct vport *vport; struct internal_dev *internal_dev; + struct net_device *dev; int err; + bool free_vport = true;
vport = ovs_vport_alloc(0, &ovs_internal_vport_ops, parms); if (IS_ERR(vport)) { @@ -178,8 +180,9 @@ static struct vport *internal_dev_create goto error; }
- vport->dev = alloc_netdev(sizeof(struct internal_dev), - parms->name, NET_NAME_USER, do_setup); + dev = alloc_netdev(sizeof(struct internal_dev), + parms->name, NET_NAME_USER, do_setup); + vport->dev = dev; if (!vport->dev) { err = -ENOMEM; goto error_free_vport; @@ -200,8 +203,10 @@ static struct vport *internal_dev_create
rtnl_lock(); err = register_netdevice(vport->dev); - if (err) + if (err) { + free_vport = false; goto error_unlock; + }
dev_set_promiscuity(vport->dev, 1); rtnl_unlock(); @@ -211,11 +216,12 @@ static struct vport *internal_dev_create
error_unlock: rtnl_unlock(); - free_percpu(vport->dev->tstats); + free_percpu(dev->tstats); error_free_netdev: - free_netdev(vport->dev); + free_netdev(dev); error_free_vport: - ovs_vport_free(vport); + if (free_vport) + ovs_vport_free(vport); error: return ERR_PTR(err); }
From: John Fastabend john.fastabend@gmail.com
[ Upstream commit 648ee6cea7dde4a5cdf817e5d964fd60b22006a4 ]
tls_sw_do_sendpage needs to return the total number of bytes sent regardless of how many sk_msgs are allocated. Unfortunately, copied (the value we return up the stack) is zero'd before each new sk_msg is allocated so we only return the copied size of the last sk_msg used.
The caller (splice, etc.) of sendpage will then believe only part of its data was sent and send the missing chunks again. However, because the data actually was sent the receiver will get multiple copies of the same data.
To reproduce this do multiple sendfile calls with a length close to the max record size. This will in turn call splice/sendpage, sendpage may use multiple sk_msg in this case and then returns the incorrect number of bytes. This will cause splice to resend creating duplicate data on the receiver. Andre created a C program that can easily generate this case so we will push a similar selftest for this to bpf-next shortly.
The fix is to _not_ zero the copied field so that the total sent bytes is returned.
Reported-by: Steinar H. Gunderson steinar+kernel@gunderson.no Reported-by: Andre Tomt andre@tomt.net Tested-by: Andre Tomt andre@tomt.net Fixes: d829e9c4112b ("tls: convert to generic sk_msg interface") Signed-off-by: John Fastabend john.fastabend@gmail.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/tls/tls_sw.c | 1 - 1 file changed, 1 deletion(-)
--- a/net/tls/tls_sw.c +++ b/net/tls/tls_sw.c @@ -1128,7 +1128,6 @@ static int tls_sw_do_sendpage(struct soc
full_record = false; record_room = TLS_MAX_PAYLOAD_SIZE - msg_pl->sg.size; - copied = 0; copy = size; if (copy >= record_room) { copy = record_room;
From: Young Xiao 92siuyang@gmail.com
[ Upstream commit 385097a3675749cbc9e97c085c0e5dfe4269ca51 ]
Check that the NFC_ATTR_TARGET_INDEX attributes (in addition to NFC_ATTR_DEVICE_INDEX) are provided by the netlink client prior to accessing them. This prevents potential unhandled NULL pointer dereference exceptions which can be triggered by malicious user-mode programs, if they omit one or both of these attributes.
Signed-off-by: Young Xiao 92siuyang@gmail.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/nfc/netlink.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
--- a/net/nfc/netlink.c +++ b/net/nfc/netlink.c @@ -922,7 +922,8 @@ static int nfc_genl_deactivate_target(st u32 device_idx, target_idx; int rc;
- if (!info->attrs[NFC_ATTR_DEVICE_INDEX]) + if (!info->attrs[NFC_ATTR_DEVICE_INDEX] || + !info->attrs[NFC_ATTR_TARGET_INDEX]) return -EINVAL;
device_idx = nla_get_u32(info->attrs[NFC_ATTR_DEVICE_INDEX]);
From: Neil Horman nhorman@tuxdriver.com
[ Upstream commit ce950f1050cece5e406a5cde723c69bba60e1b26 ]
Based on comments from Xin, even after fixes for our recent syzbot report of cookie memory leaks, its possible to get a resend of an INIT chunk which would lead to us leaking cookie memory.
To ensure that we don't leak cookie memory, free any previously allocated cookie first.
Change notes v1->v2 update subsystem tag in subject (davem) repeat kfree check for peer_random and peer_hmacs (xin)
v2->v3 net->sctp also free peer_chunks
v3->v4 fix subject tags
v4->v5 remove cut line
Signed-off-by: Neil Horman nhorman@tuxdriver.com Reported-by: syzbot+f7e9153b037eac9b1df8@syzkaller.appspotmail.com CC: Marcelo Ricardo Leitner marcelo.leitner@gmail.com CC: Xin Long lucien.xin@gmail.com CC: "David S. Miller" davem@davemloft.net CC: netdev@vger.kernel.org Acked-by: Marcelo Ricardo Leitner marcelo.leitner@gmail.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/sctp/sm_make_chunk.c | 8 ++++++++ 1 file changed, 8 insertions(+)
--- a/net/sctp/sm_make_chunk.c +++ b/net/sctp/sm_make_chunk.c @@ -2600,6 +2600,8 @@ do_addr_param: case SCTP_PARAM_STATE_COOKIE: asoc->peer.cookie_len = ntohs(param.p->length) - sizeof(struct sctp_paramhdr); + if (asoc->peer.cookie) + kfree(asoc->peer.cookie); asoc->peer.cookie = kmemdup(param.cookie->body, asoc->peer.cookie_len, gfp); if (!asoc->peer.cookie) retval = 0; @@ -2664,6 +2666,8 @@ do_addr_param: goto fall_through;
/* Save peer's random parameter */ + if (asoc->peer.peer_random) + kfree(asoc->peer.peer_random); asoc->peer.peer_random = kmemdup(param.p, ntohs(param.p->length), gfp); if (!asoc->peer.peer_random) { @@ -2677,6 +2681,8 @@ do_addr_param: goto fall_through;
/* Save peer's HMAC list */ + if (asoc->peer.peer_hmacs) + kfree(asoc->peer.peer_hmacs); asoc->peer.peer_hmacs = kmemdup(param.p, ntohs(param.p->length), gfp); if (!asoc->peer.peer_hmacs) { @@ -2692,6 +2698,8 @@ do_addr_param: if (!ep->auth_enable) goto fall_through;
+ if (asoc->peer.peer_chunks) + kfree(asoc->peer.peer_chunks); asoc->peer.peer_chunks = kmemdup(param.p, ntohs(param.p->length), gfp); if (!asoc->peer.peer_chunks)
From: John Paul Adrian Glaubitz glaubitz@physik.fu-berlin.de
[ Upstream commit 07a6d63eb1b54b5fb38092780fe618dfe1d96e23 ]
In d5a2aa24, the name in struct console sunhv_console was changed from "ttyS" to "ttyHV" while the name in struct uart_ops sunhv_pops remained unchanged.
This results in the hypervisor console device to be listed as "ttyHV0" under /proc/consoles while the device node is still named "ttyS0":
root@osaka:~# cat /proc/consoles ttyHV0 -W- (EC p ) 4:64 tty0 -WU (E ) 4:1 root@osaka:~# readlink /sys/dev/char/4:64 ../../devices/root/f02836f0/f0285690/tty/ttyS0 root@osaka:~#
This means that any userland code which tries to determine the name of the device file of the hypervisor console device can not rely on the information provided by /proc/consoles. In particular, booting current versions of debian- installer inside a SPARC LDOM will fail with the installer unable to determine the console device.
After renaming the device in struct uart_ops sunhv_pops to "ttyHV" as well, the inconsistency is fixed and it is possible again to determine the name of the device file of the hypervisor console device by reading the contents of /proc/console:
root@osaka:~# cat /proc/consoles ttyHV0 -W- (EC p ) 4:64 tty0 -WU (E ) 4:1 root@osaka:~# readlink /sys/dev/char/4:64 ../../devices/root/f02836f0/f0285690/tty/ttyHV0 root@osaka:~#
With this change, debian-installer works correctly when installing inside a SPARC LDOM.
Signed-off-by: John Paul Adrian Glaubitz glaubitz@physik.fu-berlin.de Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/tty/serial/sunhv.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/tty/serial/sunhv.c +++ b/drivers/tty/serial/sunhv.c @@ -397,7 +397,7 @@ static const struct uart_ops sunhv_pops static struct uart_driver sunhv_reg = { .owner = THIS_MODULE, .driver_name = "sunhv", - .dev_name = "ttyS", + .dev_name = "ttyHV", .major = TTY_MAJOR, };
From: Xin Long lucien.xin@gmail.com
[ Upstream commit 5cf02612b33f104fe1015b2dfaf1758ad3675588 ]
Syzbot reported a memleak caused by grp members' deferredq list not purged when the grp is be deleted.
The issue occurs when more(msg_grp_bc_seqno(hdr), m->bc_rcv_nxt) in tipc_group_filter_msg() and the skb will stay in deferredq.
So fix it by calling __skb_queue_purge for each member's deferredq in tipc_group_delete() when a tipc sk leaves the grp.
Fixes: b87a5ea31c93 ("tipc: guarantee group unicast doesn't bypass group broadcast") Reported-by: syzbot+78fbe679c8ca8d264a8d@syzkaller.appspotmail.com Signed-off-by: Xin Long lucien.xin@gmail.com Acked-by: Ying Xue ying.xue@windriver.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/tipc/group.c | 1 + 1 file changed, 1 insertion(+)
--- a/net/tipc/group.c +++ b/net/tipc/group.c @@ -218,6 +218,7 @@ void tipc_group_delete(struct net *net,
rbtree_postorder_for_each_entry_safe(m, tmp, tree, tree_node) { tipc_group_proto_xmit(grp, m, GRP_LEAVE_MSG, &xmitq); + __skb_queue_purge(&m->deferredq); list_del(&m->list); kfree(m); }
From: Stephen Barber smbarber@chromium.org
[ Upstream commit 42f5cda5eaf4396a939ae9bb43bb8d1d09c1b15c ]
Set the SOCK_DONE flag to match the TCP_CLOSING state when a peer has shut down and there is nothing left to read.
This fixes the following bug: 1) Peer sends SHUTDOWN(RDWR). 2) Socket enters TCP_CLOSING but SOCK_DONE is not set. 3) read() returns -ENOTCONN until close() is called, then returns 0.
Signed-off-by: Stephen Barber smbarber@chromium.org Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/vmw_vsock/virtio_transport_common.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-)
--- a/net/vmw_vsock/virtio_transport_common.c +++ b/net/vmw_vsock/virtio_transport_common.c @@ -871,8 +871,10 @@ virtio_transport_recv_connected(struct s if (le32_to_cpu(pkt->hdr.flags) & VIRTIO_VSOCK_SHUTDOWN_SEND) vsk->peer_shutdown |= SEND_SHUTDOWN; if (vsk->peer_shutdown == SHUTDOWN_MASK && - vsock_stream_has_data(vsk) <= 0) + vsock_stream_has_data(vsk) <= 0) { + sock_set_flag(sk, SOCK_DONE); sk->sk_state = TCP_CLOSING; + } if (le32_to_cpu(pkt->hdr.flags)) sk->sk_state_change(sk); break;
From: Alaa Hleihel alaa@mellanox.com
Prior to reloading a device we must first verify that it was not already removed. Otherwise, the attempt to remove the device will do nothing, and in that case we will end up proceeding with adding an new device that no one was expecting to remove, leaving behind used resources such as EQs that causes a failure to destroy comp EQs and syndrome (0x30f433).
Fix that by making sure that we try to remove and add a device (based on a protocol) only if the device is already added.
Fixes: c5447c70594b ("net/mlx5: E-Switch, Reload IB interface when switching devlink modes") Signed-off-by: Alaa Hleihel alaa@mellanox.com Signed-off-by: Saeed Mahameed saeedm@mellanox.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/ethernet/mellanox/mlx5/core/dev.c | 25 +++++++++++++++++++++++-- 1 file changed, 23 insertions(+), 2 deletions(-)
--- a/drivers/net/ethernet/mellanox/mlx5/core/dev.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/dev.c @@ -248,11 +248,32 @@ void mlx5_unregister_interface(struct ml } EXPORT_SYMBOL(mlx5_unregister_interface);
+/* Must be called with intf_mutex held */ +static bool mlx5_has_added_dev_by_protocol(struct mlx5_core_dev *mdev, int protocol) +{ + struct mlx5_device_context *dev_ctx; + struct mlx5_interface *intf; + bool found = false; + + list_for_each_entry(intf, &intf_list, list) { + if (intf->protocol == protocol) { + dev_ctx = mlx5_get_device(intf, &mdev->priv); + if (dev_ctx && test_bit(MLX5_INTERFACE_ADDED, &dev_ctx->state)) + found = true; + break; + } + } + + return found; +} + void mlx5_reload_interface(struct mlx5_core_dev *mdev, int protocol) { mutex_lock(&mlx5_intf_mutex); - mlx5_remove_dev_by_protocol(mdev, protocol); - mlx5_add_dev_by_protocol(mdev, protocol); + if (mlx5_has_added_dev_by_protocol(mdev, protocol)) { + mlx5_remove_dev_by_protocol(mdev, protocol); + mlx5_add_dev_by_protocol(mdev, protocol); + } mutex_unlock(&mlx5_intf_mutex); }
From: Stefano Brivio sbrivio@redhat.com
[ Upstream commit 8399a6930d12f5965230f4ff058228a4cc80c0b9 ]
In commit c3a43b9fec8a ("vxlan: ICMP error lookup handler") I wrongly assumed buffers from icmp_socket_deliver() would be linear. This is not the case: icmp_socket_deliver() only guarantees we have 8 bytes of linear data.
Eric fixed this same issue for fou and fou6 in commits 26fc181e6cac ("fou, fou6: do not assume linear skbs") and 5355ed6388e2 ("fou, fou6: avoid uninit-value in gue_err() and gue6_err()").
Use pskb_may_pull() instead of checking skb->len, and take into account the fact we later access the VXLAN header with udp_hdr(), so we also need to sum skb_transport_header() here.
Reported-by: Guillaume Nault gnault@redhat.com Fixes: c3a43b9fec8a ("vxlan: ICMP error lookup handler") Signed-off-by: Stefano Brivio sbrivio@redhat.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/vxlan.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/net/vxlan.c +++ b/drivers/net/vxlan.c @@ -1765,7 +1765,7 @@ static int vxlan_err_lookup(struct sock struct vxlanhdr *hdr; __be32 vni;
- if (skb->len < VXLAN_HLEN) + if (!pskb_may_pull(skb, skb_transport_offset(skb) + VXLAN_HLEN)) return -EINVAL;
hdr = vxlan_hdr(skb);
From: Stefano Brivio sbrivio@redhat.com
[ Upstream commit eccc73a6b2cb6c04bfbc40a0769f3c428dfba232 ]
In commit a07966447f39 ("geneve: ICMP error lookup handler") I wrongly assumed buffers from icmp_socket_deliver() would be linear. This is not the case: icmp_socket_deliver() only guarantees we have 8 bytes of linear data.
Eric fixed this same issue for fou and fou6 in commits 26fc181e6cac ("fou, fou6: do not assume linear skbs") and 5355ed6388e2 ("fou, fou6: avoid uninit-value in gue_err() and gue6_err()").
Use pskb_may_pull() instead of checking skb->len, and take into account the fact we later access the GENEVE header with udp_hdr(), so we also need to sum skb_transport_header() here.
Reported-by: Guillaume Nault gnault@redhat.com Fixes: a07966447f39 ("geneve: ICMP error lookup handler") Signed-off-by: Stefano Brivio sbrivio@redhat.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/geneve.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/net/geneve.c +++ b/drivers/net/geneve.c @@ -396,7 +396,7 @@ static int geneve_udp_encap_err_lookup(s u8 zero_vni[3] = { 0 }; u8 *vni = zero_vni;
- if (skb->len < GENEVE_BASE_HLEN) + if (!pskb_may_pull(skb, skb_transport_offset(skb) + GENEVE_BASE_HLEN)) return -EINVAL;
geneveh = geneve_hdr(skb);
From: Maxime Chevallier maxime.chevallier@bootlin.com
[ Upstream commit 46b0090a6636cf34c0e856f15dd03e15ba4cdda6 ]
VID filtering is implemented in the Header Parser, with one range of 11 vids being assigned for each no-loopback port.
Make sure we use the per-port range when looking for existing entries in the Parser.
Since we used a global range instead of a per-port one, this causes VIDs to be removed from the whitelist from all ports of the same PPv2 instance.
Fixes: 56beda3db602 ("net: mvpp2: Add hardware offloading for VLAN filtering") Suggested-by: Yuri Chipchev yuric@marvell.com Signed-off-by: Maxime Chevallier maxime.chevallier@bootlin.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/ethernet/marvell/mvpp2/mvpp2_prs.c | 17 ++++++++--------- 1 file changed, 8 insertions(+), 9 deletions(-)
--- a/drivers/net/ethernet/marvell/mvpp2/mvpp2_prs.c +++ b/drivers/net/ethernet/marvell/mvpp2/mvpp2_prs.c @@ -1905,8 +1905,7 @@ static int mvpp2_prs_ip6_init(struct mvp }
/* Find tcam entry with matched pair <vid,port> */ -static int mvpp2_prs_vid_range_find(struct mvpp2 *priv, int pmap, u16 vid, - u16 mask) +static int mvpp2_prs_vid_range_find(struct mvpp2_port *port, u16 vid, u16 mask) { unsigned char byte[2], enable[2]; struct mvpp2_prs_entry pe; @@ -1914,13 +1913,13 @@ static int mvpp2_prs_vid_range_find(stru int tid;
/* Go through the all entries with MVPP2_PRS_LU_VID */ - for (tid = MVPP2_PE_VID_FILT_RANGE_START; - tid <= MVPP2_PE_VID_FILT_RANGE_END; tid++) { - if (!priv->prs_shadow[tid].valid || - priv->prs_shadow[tid].lu != MVPP2_PRS_LU_VID) + for (tid = MVPP2_PRS_VID_PORT_FIRST(port->id); + tid <= MVPP2_PRS_VID_PORT_LAST(port->id); tid++) { + if (!port->priv->prs_shadow[tid].valid || + port->priv->prs_shadow[tid].lu != MVPP2_PRS_LU_VID) continue;
- mvpp2_prs_init_from_hw(priv, &pe, tid); + mvpp2_prs_init_from_hw(port->priv, &pe, tid);
mvpp2_prs_tcam_data_byte_get(&pe, 2, &byte[0], &enable[0]); mvpp2_prs_tcam_data_byte_get(&pe, 3, &byte[1], &enable[1]); @@ -1950,7 +1949,7 @@ int mvpp2_prs_vid_entry_add(struct mvpp2 memset(&pe, 0, sizeof(pe));
/* Scan TCAM and see if entry with this <vid,port> already exist */ - tid = mvpp2_prs_vid_range_find(priv, (1 << port->id), vid, mask); + tid = mvpp2_prs_vid_range_find(port, vid, mask);
reg_val = mvpp2_read(priv, MVPP2_MH_REG(port->id)); if (reg_val & MVPP2_DSA_EXTENDED) @@ -2008,7 +2007,7 @@ void mvpp2_prs_vid_entry_remove(struct m int tid;
/* Scan TCAM and see if entry with this <vid,port> already exist */ - tid = mvpp2_prs_vid_range_find(priv, (1 << port->id), vid, 0xfff); + tid = mvpp2_prs_vid_range_find(port, vid, 0xfff);
/* No such entry */ if (tid < 0)
From: Maxime Chevallier maxime.chevallier@bootlin.com
[ Upstream commit 6b7a3430c163455cf8a514d636bda52b04654972 ]
When removing all VID filters, the mvpp2_prs_vid_entry_remove would be called with the TCAM id incorrectly used as a VID, causing the wrong TCAM entries to be invalidated.
Fix this by directly invalidating entries in the VID range.
Fixes: 56beda3db602 ("net: mvpp2: Add hardware offloading for VLAN filtering") Suggested-by: Yuri Chipchev yuric@marvell.com Signed-off-by: Maxime Chevallier maxime.chevallier@bootlin.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/ethernet/marvell/mvpp2/mvpp2_prs.c | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-)
--- a/drivers/net/ethernet/marvell/mvpp2/mvpp2_prs.c +++ b/drivers/net/ethernet/marvell/mvpp2/mvpp2_prs.c @@ -2025,8 +2025,10 @@ void mvpp2_prs_vid_remove_all(struct mvp
for (tid = MVPP2_PRS_VID_PORT_FIRST(port->id); tid <= MVPP2_PRS_VID_PORT_LAST(port->id); tid++) { - if (priv->prs_shadow[tid].valid) - mvpp2_prs_vid_entry_remove(port, tid); + if (priv->prs_shadow[tid].valid) { + mvpp2_prs_hw_inv(priv, tid); + priv->prs_shadow[tid].valid = false; + } } }
From: Robert Hancock hancock@sedsystems.ca
[ Upstream commit 6bb9e376c2a4cc5120c3bf5fd3048b9a0a6ec1f8 ]
If some of the switch ports were not listed in the device tree, due to being unused, the ksz_mib_read_work function ended up accessing a NULL dp->slave pointer and causing an oops. Skip checking statistics for any unused ports.
Fixes: 7c6ff470aa867f53 ("net: dsa: microchip: add MIB counter reading support") Signed-off-by: Robert Hancock hancock@sedsystems.ca Reviewed-by: Vivien Didelot vivien.didelot@gmail.com Reviewed-by: Andrew Lunn andrew@lunn.ch Reviewed-by: Florian Fainelli f.fainelli@gmail.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/dsa/microchip/ksz_common.c | 3 +++ 1 file changed, 3 insertions(+)
--- a/drivers/net/dsa/microchip/ksz_common.c +++ b/drivers/net/dsa/microchip/ksz_common.c @@ -83,6 +83,9 @@ static void ksz_mib_read_work(struct wor int i;
for (i = 0; i < dev->mib_port_cnt; i++) { + if (dsa_is_unused_port(dev->ds, i)) + continue; + p = &dev->ports[i]; mib = &p->mib; mutex_lock(&mib->cnt_mutex);
From: Maxime Chevallier maxime.chevallier@bootlin.com
[ Upstream commit f0d2ca1531377e7da888913e277eefac05a59b6f ]
Using ethtool, users can specify a classification action matching on the full vlan tag, which includes the DEI bit (also previously called CFI).
However, when converting the ethool_flow_spec to a flow_rule, we use dissector keys to represent the matching patterns.
Since the vlan dissector key doesn't include the DEI bit, this information was silently discarded when translating the ethtool flow spec in to a flow_rule.
This commit adds the DEI bit into the vlan dissector key, and allows propagating the information to the driver when parsing the ethtool flow spec.
Fixes: eca4205f9ec3 ("ethtool: add ethtool_rx_flow_spec to flow_rule structure translator") Reported-by: Michał Mirosław mirq-linux@rere.qmqm.pl Signed-off-by: Maxime Chevallier maxime.chevallier@bootlin.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- include/net/flow_dissector.h | 1 + net/core/ethtool.c | 5 +++++ 2 files changed, 6 insertions(+)
--- a/include/net/flow_dissector.h +++ b/include/net/flow_dissector.h @@ -46,6 +46,7 @@ struct flow_dissector_key_tags {
struct flow_dissector_key_vlan { u16 vlan_id:12, + vlan_dei:1, vlan_priority:3; __be16 vlan_tpid; }; --- a/net/core/ethtool.c +++ b/net/core/ethtool.c @@ -3022,6 +3022,11 @@ ethtool_rx_flow_rule_create(const struct match->mask.vlan.vlan_id = ntohs(ext_m_spec->vlan_tci) & 0x0fff;
+ match->key.vlan.vlan_dei = + !!(ext_h_spec->vlan_tci & htons(0x1000)); + match->mask.vlan.vlan_dei = + !!(ext_m_spec->vlan_tci & htons(0x1000)); + match->key.vlan.vlan_priority = (ntohs(ext_h_spec->vlan_tci) & 0xe000) >> 13; match->mask.vlan.vlan_priority =
From: Edward Srouji edwards@mellanox.com
Add missing entries for create/destroy UCTX and UMEM commands. This could get us wrong "unknown FW command" error in flows where we unbind the device or reset the driver.
Also the translation of these commands from opcodes to string was missing.
Fixes: 6e3722baac04 ("IB/mlx5: Use the correct commands for UMEM and UCTX allocation") Signed-off-by: Edward Srouji edwards@mellanox.com Signed-off-by: Saeed Mahameed saeedm@mellanox.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/ethernet/mellanox/mlx5/core/cmd.c | 8 ++++++++ 1 file changed, 8 insertions(+)
--- a/drivers/net/ethernet/mellanox/mlx5/core/cmd.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/cmd.c @@ -441,6 +441,10 @@ static int mlx5_internal_err_ret_value(s case MLX5_CMD_OP_CREATE_GENERAL_OBJECT: case MLX5_CMD_OP_MODIFY_GENERAL_OBJECT: case MLX5_CMD_OP_QUERY_GENERAL_OBJECT: + case MLX5_CMD_OP_CREATE_UCTX: + case MLX5_CMD_OP_DESTROY_UCTX: + case MLX5_CMD_OP_CREATE_UMEM: + case MLX5_CMD_OP_DESTROY_UMEM: case MLX5_CMD_OP_ALLOC_MEMIC: *status = MLX5_DRIVER_STATUS_ABORTED; *synd = MLX5_DRIVER_SYND; @@ -629,6 +633,10 @@ const char *mlx5_command_str(int command MLX5_COMMAND_STR_CASE(ALLOC_MEMIC); MLX5_COMMAND_STR_CASE(DEALLOC_MEMIC); MLX5_COMMAND_STR_CASE(QUERY_HOST_PARAMS); + MLX5_COMMAND_STR_CASE(CREATE_UCTX); + MLX5_COMMAND_STR_CASE(DESTROY_UCTX); + MLX5_COMMAND_STR_CASE(CREATE_UMEM); + MLX5_COMMAND_STR_CASE(DESTROY_UMEM); default: return "unknown command opcode"; } }
From: Ido Schimmel idosch@mellanox.com
The driver tries to periodically refresh neighbours that are used to reach nexthops. This is done by periodically calling neigh_event_send().
However, if the neighbour becomes dead, there is nothing we can do to return it to a connected state and the above function call is basically a NOP.
This results in the nexthop never being written to the device's adjacency table and therefore never used to forward packets.
Fix this by dropping our reference from the dead neighbour and associating the nexthop with a new neigbhour which we will try to refresh.
Fixes: a7ff87acd995 ("mlxsw: spectrum_router: Implement next-hop routing") Signed-off-by: Ido Schimmel idosch@mellanox.com Reported-by: Alex Veber alexve@mellanox.com Tested-by: Alex Veber alexve@mellanox.com Acked-by: Jiri Pirko jiri@mellanox.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c | 73 +++++++++++++++++- 1 file changed, 70 insertions(+), 3 deletions(-)
--- a/drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c +++ b/drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c @@ -2363,7 +2363,7 @@ static void mlxsw_sp_router_probe_unreso static void mlxsw_sp_nexthop_neigh_update(struct mlxsw_sp *mlxsw_sp, struct mlxsw_sp_neigh_entry *neigh_entry, - bool removing); + bool removing, bool dead);
static enum mlxsw_reg_rauht_op mlxsw_sp_rauht_op(bool adding) { @@ -2494,7 +2494,8 @@ static void mlxsw_sp_router_neigh_event_
memcpy(neigh_entry->ha, ha, ETH_ALEN); mlxsw_sp_neigh_entry_update(mlxsw_sp, neigh_entry, entry_connected); - mlxsw_sp_nexthop_neigh_update(mlxsw_sp, neigh_entry, !entry_connected); + mlxsw_sp_nexthop_neigh_update(mlxsw_sp, neigh_entry, !entry_connected, + dead);
if (!neigh_entry->connected && list_empty(&neigh_entry->nexthop_list)) mlxsw_sp_neigh_entry_destroy(mlxsw_sp, neigh_entry); @@ -3458,13 +3459,79 @@ static void __mlxsw_sp_nexthop_neigh_upd nh->update = 1; }
+static int +mlxsw_sp_nexthop_dead_neigh_replace(struct mlxsw_sp *mlxsw_sp, + struct mlxsw_sp_neigh_entry *neigh_entry) +{ + struct neighbour *n, *old_n = neigh_entry->key.n; + struct mlxsw_sp_nexthop *nh; + bool entry_connected; + u8 nud_state, dead; + int err; + + nh = list_first_entry(&neigh_entry->nexthop_list, + struct mlxsw_sp_nexthop, neigh_list_node); + + n = neigh_lookup(nh->nh_grp->neigh_tbl, &nh->gw_addr, nh->rif->dev); + if (!n) { + n = neigh_create(nh->nh_grp->neigh_tbl, &nh->gw_addr, + nh->rif->dev); + if (IS_ERR(n)) + return PTR_ERR(n); + neigh_event_send(n, NULL); + } + + mlxsw_sp_neigh_entry_remove(mlxsw_sp, neigh_entry); + neigh_entry->key.n = n; + err = mlxsw_sp_neigh_entry_insert(mlxsw_sp, neigh_entry); + if (err) + goto err_neigh_entry_insert; + + read_lock_bh(&n->lock); + nud_state = n->nud_state; + dead = n->dead; + read_unlock_bh(&n->lock); + entry_connected = nud_state & NUD_VALID && !dead; + + list_for_each_entry(nh, &neigh_entry->nexthop_list, + neigh_list_node) { + neigh_release(old_n); + neigh_clone(n); + __mlxsw_sp_nexthop_neigh_update(nh, !entry_connected); + mlxsw_sp_nexthop_group_refresh(mlxsw_sp, nh->nh_grp); + } + + neigh_release(n); + + return 0; + +err_neigh_entry_insert: + neigh_entry->key.n = old_n; + mlxsw_sp_neigh_entry_insert(mlxsw_sp, neigh_entry); + neigh_release(n); + return err; +} + static void mlxsw_sp_nexthop_neigh_update(struct mlxsw_sp *mlxsw_sp, struct mlxsw_sp_neigh_entry *neigh_entry, - bool removing) + bool removing, bool dead) { struct mlxsw_sp_nexthop *nh;
+ if (list_empty(&neigh_entry->nexthop_list)) + return; + + if (dead) { + int err; + + err = mlxsw_sp_nexthop_dead_neigh_replace(mlxsw_sp, + neigh_entry); + if (err) + dev_err(mlxsw_sp->bus_info->dev, "Failed to replace dead neigh\n"); + return; + } + list_for_each_entry(nh, &neigh_entry->nexthop_list, neigh_list_node) { __mlxsw_sp_nexthop_neigh_update(nh, removing);
From: Chris Mi chrism@mellanox.com
After we have a dedicated uplink representor, the new netdev ops doesn't support ndo_set_feature. Because of that, we can't change some features, eg. rxvlan. Now add it back.
In this patch, I also do a cleanup for the features flag handling, eg. remove duplicate NETIF_F_HW_TC flag setting.
Fixes: aec002f6f82c ("net/mlx5e: Uninstantiate esw manager vport netdev on switchdev mode") Signed-off-by: Chris Mi chrism@mellanox.com Reviewed-by: Roi Dayan roid@mellanox.com Reviewed-by: Vlad Buslov vladbu@mellanox.com Signed-off-by: Saeed Mahameed saeedm@mellanox.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/ethernet/mellanox/mlx5/core/en.h | 1 + drivers/net/ethernet/mellanox/mlx5/core/en_main.c | 3 +-- drivers/net/ethernet/mellanox/mlx5/core/en_rep.c | 10 ++++++---- 3 files changed, 8 insertions(+), 6 deletions(-)
--- a/drivers/net/ethernet/mellanox/mlx5/core/en.h +++ b/drivers/net/ethernet/mellanox/mlx5/core/en.h @@ -1059,6 +1059,7 @@ void mlx5e_del_vxlan_port(struct net_dev netdev_features_t mlx5e_features_check(struct sk_buff *skb, struct net_device *netdev, netdev_features_t features); +int mlx5e_set_features(struct net_device *netdev, netdev_features_t features); #ifdef CONFIG_MLX5_ESWITCH int mlx5e_set_vf_mac(struct net_device *dev, int vf, u8 *mac); int mlx5e_set_vf_rate(struct net_device *dev, int vf, int min_tx_rate, int max_tx_rate); --- a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c @@ -3698,8 +3698,7 @@ static int mlx5e_handle_feature(struct n return 0; }
-static int mlx5e_set_features(struct net_device *netdev, - netdev_features_t features) +int mlx5e_set_features(struct net_device *netdev, netdev_features_t features) { netdev_features_t oper_features = netdev->features; int err = 0; --- a/drivers/net/ethernet/mellanox/mlx5/core/en_rep.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_rep.c @@ -1350,6 +1350,7 @@ static const struct net_device_ops mlx5e .ndo_get_vf_stats = mlx5e_get_vf_stats, .ndo_set_vf_vlan = mlx5e_uplink_rep_set_vf_vlan, .ndo_get_port_parent_id = mlx5e_rep_get_port_parent_id, + .ndo_set_features = mlx5e_set_features, };
bool mlx5e_eswitch_rep(struct net_device *netdev) @@ -1423,10 +1424,9 @@ static void mlx5e_build_rep_netdev(struc
netdev->watchdog_timeo = 15 * HZ;
+ netdev->features |= NETIF_F_NETNS_LOCAL;
- netdev->features |= NETIF_F_HW_TC | NETIF_F_NETNS_LOCAL; - netdev->hw_features |= NETIF_F_HW_TC; - + netdev->hw_features |= NETIF_F_HW_TC; netdev->hw_features |= NETIF_F_SG; netdev->hw_features |= NETIF_F_IP_CSUM; netdev->hw_features |= NETIF_F_IPV6_CSUM; @@ -1435,7 +1435,9 @@ static void mlx5e_build_rep_netdev(struc netdev->hw_features |= NETIF_F_TSO6; netdev->hw_features |= NETIF_F_RXCSUM;
- if (rep->vport != MLX5_VPORT_UPLINK) + if (rep->vport == MLX5_VPORT_UPLINK) + netdev->hw_features |= NETIF_F_HW_VLAN_CTAG_RX; + else netdev->features |= NETIF_F_VLAN_CHALLENGED;
netdev->features |= netdev->hw_features;
From: Jiri Pirko jiri@mellanox.com
The TOS value was not extracted correctly. Fix it.
Fixes: 87996f91f739 ("mlxsw: spectrum_flower: Add support for ip tos") Reported-by: Alexander Petrovskiy alexpe@mellanox.com Signed-off-by: Jiri Pirko jiri@mellanox.com Signed-off-by: Ido Schimmel idosch@mellanox.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/ethernet/mellanox/mlxsw/spectrum_flower.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
--- a/drivers/net/ethernet/mellanox/mlxsw/spectrum_flower.c +++ b/drivers/net/ethernet/mellanox/mlxsw/spectrum_flower.c @@ -247,8 +247,8 @@ static int mlxsw_sp_flower_parse_ip(stru match.mask->tos & 0x3);
mlxsw_sp_acl_rulei_keymask_u32(rulei, MLXSW_AFK_ELEMENT_IP_DSCP, - match.key->tos >> 6, - match.mask->tos >> 6); + match.key->tos >> 2, + match.mask->tos >> 2);
return 0; }
From: Raed Salem raeds@mellanox.com
The cited commit changed the initialization placement of the eswitch attributes so it is done prior to parse tc actions function call, including among others the in_rep and in_mdev fields which are mistakenly reassigned inside the parse actions function.
This breaks the source port matching criteria of the peer redirect rule.
Fix by removing the now redundant reassignment of the already initialized fields.
Fixes: 988ab9c7363a ("net/mlx5e: Introduce mlx5e_flow_esw_attr_init() helper") Signed-off-by: Raed Salem raeds@mellanox.com Reviewed-by: Roi Dayan roid@mellanox.com Signed-off-by: Saeed Mahameed saeedm@mellanox.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/ethernet/mellanox/mlx5/core/en_tc.c | 3 --- 1 file changed, 3 deletions(-)
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c @@ -2572,9 +2572,6 @@ static int parse_tc_fdb_actions(struct m if (!flow_action_has_entries(flow_action)) return -EINVAL;
- attr->in_rep = rpriv->rep; - attr->in_mdev = priv->mdev; - flow_action_for_each(i, act, flow_action) { switch (act->id) { case FLOW_ACTION_DROP:
From: Petr Machata petrm@mellanox.com
Due to an issue on Spectrum-2, in front-panel ports split four ways, 2 out of 32 port buffers cannot be used. To work around this, the next FW release will mark them as unused, and will report correspondingly lower total shared buffer size. mlxsw will pick up the new value through a query to cap_total_buffer_size resource. However the initial size for shared buffer pool 0 is hard-coded and therefore needs to be updated.
Thus reduce the pool size by 2.7 MiB (which corresponds to 2/32 of the total size of 42 MiB), and round down to the whole number of cells.
Fixes: fe099bf682ab ("mlxsw: spectrum_buffers: Add Spectrum-2 shared buffer configuration") Signed-off-by: Petr Machata petrm@mellanox.com Acked-by: Jiri Pirko jiri@mellanox.com Signed-off-by: Ido Schimmel idosch@mellanox.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/ethernet/mellanox/mlxsw/spectrum_buffers.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
--- a/drivers/net/ethernet/mellanox/mlxsw/spectrum_buffers.c +++ b/drivers/net/ethernet/mellanox/mlxsw/spectrum_buffers.c @@ -411,9 +411,9 @@ static const struct mlxsw_sp_sb_pr mlxsw MLXSW_SP_SB_PR(MLXSW_REG_SBPR_MODE_STATIC, MLXSW_SP_SB_INFI), };
-#define MLXSW_SP2_SB_PR_INGRESS_SIZE 40960000 +#define MLXSW_SP2_SB_PR_INGRESS_SIZE 38128752 +#define MLXSW_SP2_SB_PR_EGRESS_SIZE 38128752 #define MLXSW_SP2_SB_PR_INGRESS_MNG_SIZE (200 * 1000) -#define MLXSW_SP2_SB_PR_EGRESS_SIZE 40960000
static const struct mlxsw_sp_sb_pr mlxsw_sp2_sb_prs[] = { /* Ingress pools. */
From: Eli Britstein elibr@mellanox.com
Stacked devices like bond interface may have a VLAN device on top of them. Detect lag state correctly under this condition, and return the correct routed net device, according to it the encap header is built.
Fixes: e32ee6c78efa ("net/mlx5e: Support tunnel encap over tagged Ethernet") Signed-off-by: Eli Britstein elibr@mellanox.com Reviewed-by: Roi Dayan roid@mellanox.com Signed-off-by: Saeed Mahameed saeedm@mellanox.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/ethernet/mellanox/mlx5/core/en/tc_tun.c | 11 ++++++----- 1 file changed, 6 insertions(+), 5 deletions(-)
--- a/drivers/net/ethernet/mellanox/mlx5/core/en/tc_tun.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/en/tc_tun.c @@ -11,24 +11,25 @@ static int get_route_and_out_devs(struct struct net_device **route_dev, struct net_device **out_dev) { + struct net_device *uplink_dev, *uplink_upper, *real_dev; struct mlx5_eswitch *esw = priv->mdev->priv.eswitch; - struct net_device *uplink_dev, *uplink_upper; bool dst_is_lag_dev;
+ real_dev = is_vlan_dev(dev) ? vlan_dev_real_dev(dev) : dev; uplink_dev = mlx5_eswitch_uplink_get_proto_dev(esw, REP_ETH); uplink_upper = netdev_master_upper_dev_get(uplink_dev); dst_is_lag_dev = (uplink_upper && netif_is_lag_master(uplink_upper) && - dev == uplink_upper && + real_dev == uplink_upper && mlx5_lag_is_sriov(priv->mdev));
/* if the egress device isn't on the same HW e-switch or * it's a LAG device, use the uplink */ - if (!netdev_port_same_parent_id(priv->netdev, dev) || + if (!netdev_port_same_parent_id(priv->netdev, real_dev) || dst_is_lag_dev) { - *route_dev = uplink_dev; - *out_dev = *route_dev; + *route_dev = dev; + *out_dev = uplink_dev; } else { *route_dev = dev; if (is_vlan_dev(*route_dev))
From: Willem de Bruijn willemb@google.com
[ Upstream commit 522924b583082f51b8a2406624a2f27c22119b20 ]
The below patch fixes an incorrect zerocopy refcnt increment when appending with MSG_MORE to an existing zerocopy udp skb.
send(.., MSG_ZEROCOPY | MSG_MORE); // refcnt 1 send(.., MSG_ZEROCOPY | MSG_MORE); // refcnt still 1 (bar frags)
But it missed that zerocopy need not be passed at the first send. The right test whether the uarg is newly allocated and thus has extra refcnt 1 is not !skb, but !skb_zcopy.
send(.., MSG_MORE); // <no uarg> send(.., MSG_ZEROCOPY); // refcnt 1
Fixes: 100f6d8e09905 ("net: correct zerocopy refcnt with udp MSG_MORE") Reported-by: syzbot syzkaller@googlegroups.com Signed-off-by: Willem de Bruijn willemb@google.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/ipv4/ip_output.c | 2 +- net/ipv6/ip6_output.c | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-)
--- a/net/ipv4/ip_output.c +++ b/net/ipv4/ip_output.c @@ -923,7 +923,7 @@ static int __ip_append_data(struct sock uarg = sock_zerocopy_realloc(sk, length, skb_zcopy(skb)); if (!uarg) return -ENOBUFS; - extra_uref = !skb; /* only extra ref if !MSG_MORE */ + extra_uref = !skb_zcopy(skb); /* only ref on new uarg */ if (rt->dst.dev->features & NETIF_F_SG && csummode == CHECKSUM_PARTIAL) { paged = true; --- a/net/ipv6/ip6_output.c +++ b/net/ipv6/ip6_output.c @@ -1344,7 +1344,7 @@ emsgsize: uarg = sock_zerocopy_realloc(sk, length, skb_zcopy(skb)); if (!uarg) return -ENOBUFS; - extra_uref = !skb; /* only extra ref if !MSG_MORE */ + extra_uref = !skb_zcopy(skb); /* only ref on new uarg */ if (rt->dst.dev->features & NETIF_F_SG && csummode == CHECKSUM_PARTIAL) { paged = true;
From: Alaa Hleihel alaa@mellanox.com
After introducing dedicated uplink representor, the netdev instance set over the esw manager vport (PF) became no longer in use, so it was removed in the cited commit once we're on switchdev mode. However, the mlx5e_detach function was not updated accordingly, and it still tries to detach a non-existing netdev, causing a kernel crash.
This patch fixes this issue.
Fixes: aec002f6f82c ("net/mlx5e: Uninstantiate esw manager vport netdev on switchdev mode") Signed-off-by: Alaa Hleihel alaa@mellanox.com Reviewed-by: Roi Dayan roid@mellanox.com Signed-off-by: Saeed Mahameed saeedm@mellanox.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/ethernet/mellanox/mlx5/core/en_main.c | 5 +++++ 1 file changed, 5 insertions(+)
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c @@ -5165,6 +5165,11 @@ static void mlx5e_detach(struct mlx5_cor struct mlx5e_priv *priv = vpriv; struct net_device *netdev = priv->netdev;
+#ifdef CONFIG_MLX5_ESWITCH + if (MLX5_ESWITCH_MANAGER(mdev) && vpriv == mdev) + return; +#endif + if (!netif_device_present(netdev)) return;
[ Upstream commit 1615fe41a1959a2ee2814ba62736b2bb54e9802a ]
The MPU6050 driver has recently gained support for the ICM20602 IMU, which is very similar to MPU6xxx. However, the ICM20602's FIFO data specifically includes temperature readings, which were not present on MPU6xxx parts. As a result, the driver will under-read the ICM20602's FIFO register, causing the same (partial) sample to be returned for all reads, until the FIFO overflows.
Fix this by adding a table of scan elements specifically for the ICM20602, which takes the extra temperature data into consideration.
While we're at it, fix the temperature offset and scaling on ICM20602, since it uses different scale/offset constants than the rest of the MPU6xxx devices.
Signed-off-by: Steve Moskovchenko stevemo@skydio.com Fixes: 22904bdff978 ("iio: imu: mpu6050: Add support for the ICM 20602 IMU") Signed-off-by: Jonathan Cameron Jonathan.Cameron@huawei.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/iio/imu/inv_mpu6050/inv_mpu_core.c | 46 ++++++++++++++++++++-- drivers/iio/imu/inv_mpu6050/inv_mpu_iio.h | 20 +++++++++- drivers/iio/imu/inv_mpu6050/inv_mpu_ring.c | 3 ++ 3 files changed, 64 insertions(+), 5 deletions(-)
diff --git a/drivers/iio/imu/inv_mpu6050/inv_mpu_core.c b/drivers/iio/imu/inv_mpu6050/inv_mpu_core.c index 650de0fefb7b..385f14a4d5a7 100644 --- a/drivers/iio/imu/inv_mpu6050/inv_mpu_core.c +++ b/drivers/iio/imu/inv_mpu6050/inv_mpu_core.c @@ -471,7 +471,10 @@ inv_mpu6050_read_raw(struct iio_dev *indio_dev, return IIO_VAL_INT_PLUS_MICRO; case IIO_TEMP: *val = 0; - *val2 = INV_MPU6050_TEMP_SCALE; + if (st->chip_type == INV_ICM20602) + *val2 = INV_ICM20602_TEMP_SCALE; + else + *val2 = INV_MPU6050_TEMP_SCALE;
return IIO_VAL_INT_PLUS_MICRO; default: @@ -480,7 +483,10 @@ inv_mpu6050_read_raw(struct iio_dev *indio_dev, case IIO_CHAN_INFO_OFFSET: switch (chan->type) { case IIO_TEMP: - *val = INV_MPU6050_TEMP_OFFSET; + if (st->chip_type == INV_ICM20602) + *val = INV_ICM20602_TEMP_OFFSET; + else + *val = INV_MPU6050_TEMP_OFFSET;
return IIO_VAL_INT; default: @@ -845,6 +851,32 @@ static const struct iio_chan_spec inv_mpu_channels[] = { INV_MPU6050_CHAN(IIO_ACCEL, IIO_MOD_Z, INV_MPU6050_SCAN_ACCL_Z), };
+static const struct iio_chan_spec inv_icm20602_channels[] = { + IIO_CHAN_SOFT_TIMESTAMP(INV_ICM20602_SCAN_TIMESTAMP), + { + .type = IIO_TEMP, + .info_mask_separate = BIT(IIO_CHAN_INFO_RAW) + | BIT(IIO_CHAN_INFO_OFFSET) + | BIT(IIO_CHAN_INFO_SCALE), + .scan_index = INV_ICM20602_SCAN_TEMP, + .scan_type = { + .sign = 's', + .realbits = 16, + .storagebits = 16, + .shift = 0, + .endianness = IIO_BE, + }, + }, + + INV_MPU6050_CHAN(IIO_ANGL_VEL, IIO_MOD_X, INV_ICM20602_SCAN_GYRO_X), + INV_MPU6050_CHAN(IIO_ANGL_VEL, IIO_MOD_Y, INV_ICM20602_SCAN_GYRO_Y), + INV_MPU6050_CHAN(IIO_ANGL_VEL, IIO_MOD_Z, INV_ICM20602_SCAN_GYRO_Z), + + INV_MPU6050_CHAN(IIO_ACCEL, IIO_MOD_Y, INV_ICM20602_SCAN_ACCL_Y), + INV_MPU6050_CHAN(IIO_ACCEL, IIO_MOD_X, INV_ICM20602_SCAN_ACCL_X), + INV_MPU6050_CHAN(IIO_ACCEL, IIO_MOD_Z, INV_ICM20602_SCAN_ACCL_Z), +}; + /* * The user can choose any frequency between INV_MPU6050_MIN_FIFO_RATE and * INV_MPU6050_MAX_FIFO_RATE, but only these frequencies are matched by the @@ -1100,8 +1132,14 @@ int inv_mpu_core_probe(struct regmap *regmap, int irq, const char *name, indio_dev->name = name; else indio_dev->name = dev_name(dev); - indio_dev->channels = inv_mpu_channels; - indio_dev->num_channels = ARRAY_SIZE(inv_mpu_channels); + + if (chip_type == INV_ICM20602) { + indio_dev->channels = inv_icm20602_channels; + indio_dev->num_channels = ARRAY_SIZE(inv_icm20602_channels); + } else { + indio_dev->channels = inv_mpu_channels; + indio_dev->num_channels = ARRAY_SIZE(inv_mpu_channels); + }
indio_dev->info = &mpu_info; indio_dev->modes = INDIO_BUFFER_TRIGGERED; diff --git a/drivers/iio/imu/inv_mpu6050/inv_mpu_iio.h b/drivers/iio/imu/inv_mpu6050/inv_mpu_iio.h index 325afd9f5f61..3d5fe4474378 100644 --- a/drivers/iio/imu/inv_mpu6050/inv_mpu_iio.h +++ b/drivers/iio/imu/inv_mpu6050/inv_mpu_iio.h @@ -208,6 +208,9 @@ struct inv_mpu6050_state { #define INV_MPU6050_BYTES_PER_3AXIS_SENSOR 6 #define INV_MPU6050_FIFO_COUNT_BYTE 2
+/* ICM20602 FIFO samples include temperature readings */ +#define INV_ICM20602_BYTES_PER_TEMP_SENSOR 2 + /* mpu6500 registers */ #define INV_MPU6500_REG_ACCEL_CONFIG_2 0x1D #define INV_MPU6500_REG_ACCEL_OFFSET 0x77 @@ -229,6 +232,9 @@ struct inv_mpu6050_state { #define INV_MPU6050_GYRO_CONFIG_FSR_SHIFT 3 #define INV_MPU6050_ACCL_CONFIG_FSR_SHIFT 3
+#define INV_ICM20602_TEMP_OFFSET 8170 +#define INV_ICM20602_TEMP_SCALE 3060 + /* 6 + 6 round up and plus 8 */ #define INV_MPU6050_OUTPUT_DATA_SIZE 24
@@ -270,7 +276,7 @@ struct inv_mpu6050_state { #define INV_ICM20608_WHOAMI_VALUE 0xAF #define INV_ICM20602_WHOAMI_VALUE 0x12
-/* scan element definition */ +/* scan element definition for generic MPU6xxx devices */ enum inv_mpu6050_scan { INV_MPU6050_SCAN_ACCL_X, INV_MPU6050_SCAN_ACCL_Y, @@ -281,6 +287,18 @@ enum inv_mpu6050_scan { INV_MPU6050_SCAN_TIMESTAMP, };
+/* scan element definition for ICM20602, which includes temperature */ +enum inv_icm20602_scan { + INV_ICM20602_SCAN_ACCL_X, + INV_ICM20602_SCAN_ACCL_Y, + INV_ICM20602_SCAN_ACCL_Z, + INV_ICM20602_SCAN_TEMP, + INV_ICM20602_SCAN_GYRO_X, + INV_ICM20602_SCAN_GYRO_Y, + INV_ICM20602_SCAN_GYRO_Z, + INV_ICM20602_SCAN_TIMESTAMP, +}; + enum inv_mpu6050_filter_e { INV_MPU6050_FILTER_256HZ_NOLPF2 = 0, INV_MPU6050_FILTER_188HZ, diff --git a/drivers/iio/imu/inv_mpu6050/inv_mpu_ring.c b/drivers/iio/imu/inv_mpu6050/inv_mpu_ring.c index 548e042f7b5b..57bd11bde56b 100644 --- a/drivers/iio/imu/inv_mpu6050/inv_mpu_ring.c +++ b/drivers/iio/imu/inv_mpu6050/inv_mpu_ring.c @@ -207,6 +207,9 @@ irqreturn_t inv_mpu6050_read_fifo(int irq, void *p) if (st->chip_config.gyro_fifo_enable) bytes_per_datum += INV_MPU6050_BYTES_PER_3AXIS_SENSOR;
+ if (st->chip_type == INV_ICM20602) + bytes_per_datum += INV_ICM20602_BYTES_PER_TEMP_SENSOR; + /* * read fifo_count register to know how many bytes are inside the FIFO * right now
[ Upstream commit f2dcb8841e6b155da098edae09125859ef7e853d ]
Set sb->s_root to NULL when failing from __getname(), so that we can avoid double dput and unnecessary operations in generic_shutdown_super().
Signed-off-by: Chengguang Xu cgxu519@gmail.com Reviewed-by: Chao Yu yuchao0@huawei.com Reviewed-by: Gao Xiang gaoxiang25@huawei.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/staging/erofs/super.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/drivers/staging/erofs/super.c b/drivers/staging/erofs/super.c index 15c784fba879..c8981662a49b 100644 --- a/drivers/staging/erofs/super.c +++ b/drivers/staging/erofs/super.c @@ -459,6 +459,7 @@ static int erofs_read_super(struct super_block *sb, */ err_devname: dput(sb->s_root); + sb->s_root = NULL; err_iget: #ifdef EROFS_FS_HAS_MANAGED_CACHE iput(sbi->managed_cache);
[ Upstream commit ca4e4efbefbbdde0a7bb3023ea08d491f4daf9b9 ]
These are accidentally returning positive EINVAL instead of negative -EINVAL. Some of the callers treat positive values as success.
Fixes: 7b3ad5abf027 ("staging: Import the BCM2835 MMAL-based V4L2 camera driver.") Signed-off-by: Dan Carpenter dan.carpenter@oracle.com Acked-by: Stefan Wahren stefan.wahren@i2se.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/staging/vc04_services/bcm2835-camera/controls.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/staging/vc04_services/bcm2835-camera/controls.c b/drivers/staging/vc04_services/bcm2835-camera/controls.c index a2c55cb2192a..52f3c4be5ff8 100644 --- a/drivers/staging/vc04_services/bcm2835-camera/controls.c +++ b/drivers/staging/vc04_services/bcm2835-camera/controls.c @@ -576,7 +576,7 @@ static int ctrl_set_image_effect(struct bm2835_mmal_dev *dev, dev->colourfx.enable ? "true" : "false", dev->colourfx.u, dev->colourfx.v, ret, (ret == 0 ? 0 : -EINVAL)); - return (ret == 0 ? 0 : EINVAL); + return (ret == 0 ? 0 : -EINVAL); }
static int ctrl_set_colfx(struct bm2835_mmal_dev *dev, @@ -600,7 +600,7 @@ static int ctrl_set_colfx(struct bm2835_mmal_dev *dev, "%s: After: mmal_ctrl:%p ctrl id:0x%x ctrl val:%d ret %d(%d)\n", __func__, mmal_ctrl, ctrl->id, ctrl->val, ret, (ret == 0 ? 0 : -EINVAL)); - return (ret == 0 ? 0 : EINVAL); + return (ret == 0 ? 0 : -EINVAL); }
static int ctrl_set_bitrate(struct bm2835_mmal_dev *dev,
[ Upstream commit fea69916360468e364a4988db25a5afa835f3406 ]
If ->hif_read_reg() or ->hif_write_reg() fail then the code unlocks and keeps executing. It should just return.
Fixes: c5c77ba18ea6 ("staging: wilc1000: Add SDIO/SPI 802.11 driver") Signed-off-by: Dan Carpenter dan.carpenter@oracle.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/staging/wilc1000/wilc_wlan.c | 8 ++++++-- 1 file changed, 6 insertions(+), 2 deletions(-)
diff --git a/drivers/staging/wilc1000/wilc_wlan.c b/drivers/staging/wilc1000/wilc_wlan.c index c2389695fe20..70b1ab21f8a3 100644 --- a/drivers/staging/wilc1000/wilc_wlan.c +++ b/drivers/staging/wilc1000/wilc_wlan.c @@ -1076,13 +1076,17 @@ void wilc_wlan_cleanup(struct net_device *dev) acquire_bus(wilc, WILC_BUS_ACQUIRE_AND_WAKEUP);
ret = wilc->hif_func->hif_read_reg(wilc, WILC_GP_REG_0, ®); - if (!ret) + if (!ret) { release_bus(wilc, WILC_BUS_RELEASE_ALLOW_SLEEP); + return; + }
ret = wilc->hif_func->hif_write_reg(wilc, WILC_GP_REG_0, (reg | ABORT_INT)); - if (!ret) + if (!ret) { release_bus(wilc, WILC_BUS_RELEASE_ALLOW_SLEEP); + return; + }
release_bus(wilc, WILC_BUS_RELEASE_ALLOW_SLEEP); wilc->hif_func->hif_deinit(NULL);
[ Upstream commit 670784fb4ebe54434e263837390e358405031d9e ]
Commit a939bb57cd47 ("pinctrl: intel: implement gpio_irq_enable") was added because clearing interrupt status bit is required to avoid unexpected behavior.
Turns out the unmask callback also needs the fix, which can solve weird IRQ triggering issues on I2C touchpad ELAN1200.
Signed-off-by: Kai-Heng Feng kai.heng.feng@canonical.com Signed-off-by: Andy Shevchenko andriy.shevchenko@linux.intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/pinctrl/intel/pinctrl-intel.c | 37 +++++---------------------- 1 file changed, 6 insertions(+), 31 deletions(-)
diff --git a/drivers/pinctrl/intel/pinctrl-intel.c b/drivers/pinctrl/intel/pinctrl-intel.c index 70638b74f9d6..95d224404c7c 100644 --- a/drivers/pinctrl/intel/pinctrl-intel.c +++ b/drivers/pinctrl/intel/pinctrl-intel.c @@ -913,35 +913,6 @@ static void intel_gpio_irq_ack(struct irq_data *d) } }
-static void intel_gpio_irq_enable(struct irq_data *d) -{ - struct gpio_chip *gc = irq_data_get_irq_chip_data(d); - struct intel_pinctrl *pctrl = gpiochip_get_data(gc); - const struct intel_community *community; - const struct intel_padgroup *padgrp; - int pin; - - pin = intel_gpio_to_pin(pctrl, irqd_to_hwirq(d), &community, &padgrp); - if (pin >= 0) { - unsigned int gpp, gpp_offset, is_offset; - unsigned long flags; - u32 value; - - gpp = padgrp->reg_num; - gpp_offset = padgroup_offset(padgrp, pin); - is_offset = community->is_offset + gpp * 4; - - raw_spin_lock_irqsave(&pctrl->lock, flags); - /* Clear interrupt status first to avoid unexpected interrupt */ - writel(BIT(gpp_offset), community->regs + is_offset); - - value = readl(community->regs + community->ie_offset + gpp * 4); - value |= BIT(gpp_offset); - writel(value, community->regs + community->ie_offset + gpp * 4); - raw_spin_unlock_irqrestore(&pctrl->lock, flags); - } -} - static void intel_gpio_irq_mask_unmask(struct irq_data *d, bool mask) { struct gpio_chip *gc = irq_data_get_irq_chip_data(d); @@ -954,15 +925,20 @@ static void intel_gpio_irq_mask_unmask(struct irq_data *d, bool mask) if (pin >= 0) { unsigned int gpp, gpp_offset; unsigned long flags; - void __iomem *reg; + void __iomem *reg, *is; u32 value;
gpp = padgrp->reg_num; gpp_offset = padgroup_offset(padgrp, pin);
reg = community->regs + community->ie_offset + gpp * 4; + is = community->regs + community->is_offset + gpp * 4;
raw_spin_lock_irqsave(&pctrl->lock, flags); + + /* Clear interrupt status first to avoid unexpected interrupt */ + writel(BIT(gpp_offset), is); + value = readl(reg); if (mask) value &= ~BIT(gpp_offset); @@ -1106,7 +1082,6 @@ static irqreturn_t intel_gpio_irq(int irq, void *data)
static struct irq_chip intel_gpio_irqchip = { .name = "intel-gpio", - .irq_enable = intel_gpio_irq_enable, .irq_ack = intel_gpio_irq_ack, .irq_mask = intel_gpio_irq_mask, .irq_unmask = intel_gpio_irq_unmask,
[ Upstream commit 2c82c7e724ff51cab78e1afd5c2aaa31994fe41e ]
We can oops in nf_tables_fill_rule_info().
Its not possible to fetch previous element in rcu-protected lists when deletions are not prevented somehow: list_del_rcu poisons the ->prev pointer value.
Before rcu-conversion this was safe as dump operations did hold nfnetlink mutex.
Pass previous rule as argument, obtained by keeping a pointer to the previous rule during traversal.
Fixes: d9adf22a291883 ("netfilter: nf_tables: use call_rcu in netlink dumps") Signed-off-by: Florian Westphal fw@strlen.de Signed-off-by: Pablo Neira Ayuso pablo@netfilter.org Signed-off-by: Sasha Levin sashal@kernel.org --- net/netfilter/nf_tables_api.c | 20 +++++++++++--------- 1 file changed, 11 insertions(+), 9 deletions(-)
diff --git a/net/netfilter/nf_tables_api.c b/net/netfilter/nf_tables_api.c index aa5e7b00a581..101975386547 100644 --- a/net/netfilter/nf_tables_api.c +++ b/net/netfilter/nf_tables_api.c @@ -2261,13 +2261,13 @@ static int nf_tables_fill_rule_info(struct sk_buff *skb, struct net *net, u32 flags, int family, const struct nft_table *table, const struct nft_chain *chain, - const struct nft_rule *rule) + const struct nft_rule *rule, + const struct nft_rule *prule) { struct nlmsghdr *nlh; struct nfgenmsg *nfmsg; const struct nft_expr *expr, *next; struct nlattr *list; - const struct nft_rule *prule; u16 type = nfnl_msg_type(NFNL_SUBSYS_NFTABLES, event);
nlh = nlmsg_put(skb, portid, seq, type, sizeof(struct nfgenmsg), flags); @@ -2287,8 +2287,7 @@ static int nf_tables_fill_rule_info(struct sk_buff *skb, struct net *net, NFTA_RULE_PAD)) goto nla_put_failure;
- if ((event != NFT_MSG_DELRULE) && (rule->list.prev != &chain->rules)) { - prule = list_prev_entry(rule, list); + if (event != NFT_MSG_DELRULE && prule) { if (nla_put_be64(skb, NFTA_RULE_POSITION, cpu_to_be64(prule->handle), NFTA_RULE_PAD)) @@ -2335,7 +2334,7 @@ static void nf_tables_rule_notify(const struct nft_ctx *ctx,
err = nf_tables_fill_rule_info(skb, ctx->net, ctx->portid, ctx->seq, event, 0, ctx->family, ctx->table, - ctx->chain, rule); + ctx->chain, rule, NULL); if (err < 0) { kfree_skb(skb); goto err; @@ -2360,12 +2359,13 @@ static int __nf_tables_dump_rules(struct sk_buff *skb, const struct nft_chain *chain) { struct net *net = sock_net(skb->sk); + const struct nft_rule *rule, *prule; unsigned int s_idx = cb->args[0]; - const struct nft_rule *rule;
+ prule = NULL; list_for_each_entry_rcu(rule, &chain->rules, list) { if (!nft_is_active(net, rule)) - goto cont; + goto cont_skip; if (*idx < s_idx) goto cont; if (*idx > s_idx) { @@ -2377,11 +2377,13 @@ static int __nf_tables_dump_rules(struct sk_buff *skb, NFT_MSG_NEWRULE, NLM_F_MULTI | NLM_F_APPEND, table->family, - table, chain, rule) < 0) + table, chain, rule, prule) < 0) return 1;
nl_dump_check_consistent(cb, nlmsg_hdr(skb)); cont: + prule = rule; +cont_skip: (*idx)++; } return 0; @@ -2537,7 +2539,7 @@ static int nf_tables_getrule(struct net *net, struct sock *nlsk,
err = nf_tables_fill_rule_info(skb2, net, NETLINK_CB(skb).portid, nlh->nlmsg_seq, NFT_MSG_NEWRULE, 0, - family, table, chain, rule); + family, table, chain, rule, NULL); if (err < 0) goto err;
[ Upstream commit 23e3983a466cd540ffdd2bbc6e0c51e31934f941 ]
This patch fixes an bug revealed by the following commit:
6b89d4c1ae85 ("perf/x86/intel: Fix INTEL_FLAGS_EVENT_CONSTRAINT* masking")
That patch modified INTEL_FLAGS_EVENT_CONSTRAINT() to only look at the event code when matching a constraint. If code+umask were needed, then the INTEL_FLAGS_UEVENT_CONSTRAINT() macro was needed instead. This broke with some of the constraints for PEBS events.
Several of them, including the one used for cycles:p, cycles:pp, cycles:ppp fell in that category and caused the event to be rejected in PEBS mode. In other words, on some platforms a cmdline such as:
$ perf top -e cycles:pp
would fail with -EINVAL.
This patch fixes this bug by properly using INTEL_FLAGS_UEVENT_CONSTRAINT() when needed in the PEBS constraint tables.
Reported-by: Ingo Molnar mingo@kernel.org Signed-off-by: Stephane Eranian eranian@google.com Cc: Alexander Shishkin alexander.shishkin@linux.intel.com Cc: Arnaldo Carvalho de Melo acme@redhat.com Cc: Jiri Olsa jolsa@redhat.com Cc: Linus Torvalds torvalds@linux-foundation.org Cc: Peter Zijlstra peterz@infradead.org Cc: Thomas Gleixner tglx@linutronix.de Cc: Vince Weaver vincent.weaver@maine.edu Cc: kan.liang@intel.com Link: http://lkml.kernel.org/r/20190521005246.423-1-eranian@google.com Signed-off-by: Ingo Molnar mingo@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- arch/x86/events/intel/ds.c | 28 ++++++++++++++-------------- 1 file changed, 14 insertions(+), 14 deletions(-)
diff --git a/arch/x86/events/intel/ds.c b/arch/x86/events/intel/ds.c index 10c99ce1fead..b71adf603b86 100644 --- a/arch/x86/events/intel/ds.c +++ b/arch/x86/events/intel/ds.c @@ -684,7 +684,7 @@ struct event_constraint intel_core2_pebs_event_constraints[] = { INTEL_FLAGS_UEVENT_CONSTRAINT(0x1fc7, 0x1), /* SIMD_INST_RETURED.ANY */ INTEL_FLAGS_EVENT_CONSTRAINT(0xcb, 0x1), /* MEM_LOAD_RETIRED.* */ /* INST_RETIRED.ANY_P, inv=1, cmask=16 (cycles:p). */ - INTEL_FLAGS_EVENT_CONSTRAINT(0x108000c0, 0x01), + INTEL_FLAGS_UEVENT_CONSTRAINT(0x108000c0, 0x01), EVENT_CONSTRAINT_END };
@@ -693,7 +693,7 @@ struct event_constraint intel_atom_pebs_event_constraints[] = { INTEL_FLAGS_UEVENT_CONSTRAINT(0x00c5, 0x1), /* MISPREDICTED_BRANCH_RETIRED */ INTEL_FLAGS_EVENT_CONSTRAINT(0xcb, 0x1), /* MEM_LOAD_RETIRED.* */ /* INST_RETIRED.ANY_P, inv=1, cmask=16 (cycles:p). */ - INTEL_FLAGS_EVENT_CONSTRAINT(0x108000c0, 0x01), + INTEL_FLAGS_UEVENT_CONSTRAINT(0x108000c0, 0x01), /* Allow all events as PEBS with no flags */ INTEL_ALL_EVENT_CONSTRAINT(0, 0x1), EVENT_CONSTRAINT_END @@ -701,7 +701,7 @@ struct event_constraint intel_atom_pebs_event_constraints[] = {
struct event_constraint intel_slm_pebs_event_constraints[] = { /* INST_RETIRED.ANY_P, inv=1, cmask=16 (cycles:p). */ - INTEL_FLAGS_EVENT_CONSTRAINT(0x108000c0, 0x1), + INTEL_FLAGS_UEVENT_CONSTRAINT(0x108000c0, 0x1), /* Allow all events as PEBS with no flags */ INTEL_ALL_EVENT_CONSTRAINT(0, 0x1), EVENT_CONSTRAINT_END @@ -726,7 +726,7 @@ struct event_constraint intel_nehalem_pebs_event_constraints[] = { INTEL_FLAGS_EVENT_CONSTRAINT(0xcb, 0xf), /* MEM_LOAD_RETIRED.* */ INTEL_FLAGS_EVENT_CONSTRAINT(0xf7, 0xf), /* FP_ASSIST.* */ /* INST_RETIRED.ANY_P, inv=1, cmask=16 (cycles:p). */ - INTEL_FLAGS_EVENT_CONSTRAINT(0x108000c0, 0x0f), + INTEL_FLAGS_UEVENT_CONSTRAINT(0x108000c0, 0x0f), EVENT_CONSTRAINT_END };
@@ -743,7 +743,7 @@ struct event_constraint intel_westmere_pebs_event_constraints[] = { INTEL_FLAGS_EVENT_CONSTRAINT(0xcb, 0xf), /* MEM_LOAD_RETIRED.* */ INTEL_FLAGS_EVENT_CONSTRAINT(0xf7, 0xf), /* FP_ASSIST.* */ /* INST_RETIRED.ANY_P, inv=1, cmask=16 (cycles:p). */ - INTEL_FLAGS_EVENT_CONSTRAINT(0x108000c0, 0x0f), + INTEL_FLAGS_UEVENT_CONSTRAINT(0x108000c0, 0x0f), EVENT_CONSTRAINT_END };
@@ -752,7 +752,7 @@ struct event_constraint intel_snb_pebs_event_constraints[] = { INTEL_PLD_CONSTRAINT(0x01cd, 0x8), /* MEM_TRANS_RETIRED.LAT_ABOVE_THR */ INTEL_PST_CONSTRAINT(0x02cd, 0x8), /* MEM_TRANS_RETIRED.PRECISE_STORES */ /* UOPS_RETIRED.ALL, inv=1, cmask=16 (cycles:p). */ - INTEL_FLAGS_EVENT_CONSTRAINT(0x108001c2, 0xf), + INTEL_FLAGS_UEVENT_CONSTRAINT(0x108001c2, 0xf), INTEL_EXCLEVT_CONSTRAINT(0xd0, 0xf), /* MEM_UOP_RETIRED.* */ INTEL_EXCLEVT_CONSTRAINT(0xd1, 0xf), /* MEM_LOAD_UOPS_RETIRED.* */ INTEL_EXCLEVT_CONSTRAINT(0xd2, 0xf), /* MEM_LOAD_UOPS_LLC_HIT_RETIRED.* */ @@ -767,9 +767,9 @@ struct event_constraint intel_ivb_pebs_event_constraints[] = { INTEL_PLD_CONSTRAINT(0x01cd, 0x8), /* MEM_TRANS_RETIRED.LAT_ABOVE_THR */ INTEL_PST_CONSTRAINT(0x02cd, 0x8), /* MEM_TRANS_RETIRED.PRECISE_STORES */ /* UOPS_RETIRED.ALL, inv=1, cmask=16 (cycles:p). */ - INTEL_FLAGS_EVENT_CONSTRAINT(0x108001c2, 0xf), + INTEL_FLAGS_UEVENT_CONSTRAINT(0x108001c2, 0xf), /* INST_RETIRED.PREC_DIST, inv=1, cmask=16 (cycles:ppp). */ - INTEL_FLAGS_EVENT_CONSTRAINT(0x108001c0, 0x2), + INTEL_FLAGS_UEVENT_CONSTRAINT(0x108001c0, 0x2), INTEL_EXCLEVT_CONSTRAINT(0xd0, 0xf), /* MEM_UOP_RETIRED.* */ INTEL_EXCLEVT_CONSTRAINT(0xd1, 0xf), /* MEM_LOAD_UOPS_RETIRED.* */ INTEL_EXCLEVT_CONSTRAINT(0xd2, 0xf), /* MEM_LOAD_UOPS_LLC_HIT_RETIRED.* */ @@ -783,9 +783,9 @@ struct event_constraint intel_hsw_pebs_event_constraints[] = { INTEL_FLAGS_UEVENT_CONSTRAINT(0x01c0, 0x2), /* INST_RETIRED.PRECDIST */ INTEL_PLD_CONSTRAINT(0x01cd, 0xf), /* MEM_TRANS_RETIRED.* */ /* UOPS_RETIRED.ALL, inv=1, cmask=16 (cycles:p). */ - INTEL_FLAGS_EVENT_CONSTRAINT(0x108001c2, 0xf), + INTEL_FLAGS_UEVENT_CONSTRAINT(0x108001c2, 0xf), /* INST_RETIRED.PREC_DIST, inv=1, cmask=16 (cycles:ppp). */ - INTEL_FLAGS_EVENT_CONSTRAINT(0x108001c0, 0x2), + INTEL_FLAGS_UEVENT_CONSTRAINT(0x108001c0, 0x2), INTEL_FLAGS_UEVENT_CONSTRAINT_DATALA_NA(0x01c2, 0xf), /* UOPS_RETIRED.ALL */ INTEL_FLAGS_UEVENT_CONSTRAINT_DATALA_XLD(0x11d0, 0xf), /* MEM_UOPS_RETIRED.STLB_MISS_LOADS */ INTEL_FLAGS_UEVENT_CONSTRAINT_DATALA_XLD(0x21d0, 0xf), /* MEM_UOPS_RETIRED.LOCK_LOADS */ @@ -806,9 +806,9 @@ struct event_constraint intel_bdw_pebs_event_constraints[] = { INTEL_FLAGS_UEVENT_CONSTRAINT(0x01c0, 0x2), /* INST_RETIRED.PRECDIST */ INTEL_PLD_CONSTRAINT(0x01cd, 0xf), /* MEM_TRANS_RETIRED.* */ /* UOPS_RETIRED.ALL, inv=1, cmask=16 (cycles:p). */ - INTEL_FLAGS_EVENT_CONSTRAINT(0x108001c2, 0xf), + INTEL_FLAGS_UEVENT_CONSTRAINT(0x108001c2, 0xf), /* INST_RETIRED.PREC_DIST, inv=1, cmask=16 (cycles:ppp). */ - INTEL_FLAGS_EVENT_CONSTRAINT(0x108001c0, 0x2), + INTEL_FLAGS_UEVENT_CONSTRAINT(0x108001c0, 0x2), INTEL_FLAGS_UEVENT_CONSTRAINT_DATALA_NA(0x01c2, 0xf), /* UOPS_RETIRED.ALL */ INTEL_FLAGS_UEVENT_CONSTRAINT_DATALA_LD(0x11d0, 0xf), /* MEM_UOPS_RETIRED.STLB_MISS_LOADS */ INTEL_FLAGS_UEVENT_CONSTRAINT_DATALA_LD(0x21d0, 0xf), /* MEM_UOPS_RETIRED.LOCK_LOADS */ @@ -829,9 +829,9 @@ struct event_constraint intel_bdw_pebs_event_constraints[] = { struct event_constraint intel_skl_pebs_event_constraints[] = { INTEL_FLAGS_UEVENT_CONSTRAINT(0x1c0, 0x2), /* INST_RETIRED.PREC_DIST */ /* INST_RETIRED.PREC_DIST, inv=1, cmask=16 (cycles:ppp). */ - INTEL_FLAGS_EVENT_CONSTRAINT(0x108001c0, 0x2), + INTEL_FLAGS_UEVENT_CONSTRAINT(0x108001c0, 0x2), /* INST_RETIRED.TOTAL_CYCLES_PS (inv=1, cmask=16) (cycles:p). */ - INTEL_FLAGS_EVENT_CONSTRAINT(0x108000c0, 0x0f), + INTEL_FLAGS_UEVENT_CONSTRAINT(0x108000c0, 0x0f), INTEL_PLD_CONSTRAINT(0x1cd, 0xf), /* MEM_TRANS_RETIRED.* */ INTEL_FLAGS_UEVENT_CONSTRAINT_DATALA_LD(0x11d0, 0xf), /* MEM_INST_RETIRED.STLB_MISS_LOADS */ INTEL_FLAGS_UEVENT_CONSTRAINT_DATALA_ST(0x12d0, 0xf), /* MEM_INST_RETIRED.STLB_MISS_STORES */
[ Upstream commit 946c0d8e6ed43dae6527e878d0077c1e11015db0 ]
This patch fixes netfilter hook traversal when there are more than 1 hooks returning NF_QUEUE verdict. When the first queue reinjects the packet, 'nf_reinject' starts traversing hooks with a proper hook_index. However, if it again receives a NF_QUEUE verdict (by some other netfilter hook), it queues the packet with a wrong hook_index. So, when the second queue reinjects the packet, it re-executes hooks in between.
Fixes: 960632ece694 ("netfilter: convert hook list to an array") Signed-off-by: Jagdish Motwani jagdish.motwani@sophos.com Signed-off-by: Pablo Neira Ayuso pablo@netfilter.org Signed-off-by: Sasha Levin sashal@kernel.org --- net/netfilter/nf_queue.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/net/netfilter/nf_queue.c b/net/netfilter/nf_queue.c index a36a77bae1d6..5b86574e7b89 100644 --- a/net/netfilter/nf_queue.c +++ b/net/netfilter/nf_queue.c @@ -254,6 +254,7 @@ static unsigned int nf_iterate(struct sk_buff *skb, repeat: verdict = nf_hook_entry_hookfn(hook, skb, state); if (verdict != NF_ACCEPT) { + *index = i; if (verdict != NF_REPEAT) return verdict; goto repeat;
[ Upstream commit e633508a95289489d28faacb68b32c3e7e68ef6f ]
NFTA_FIB_F_PRESENT flag was not always honored since eval functions did not call nft_fib_store_result in all cases.
Given that in all callsites there is a struct net_device pointer available which holds the interface data to be stored in destination register, simplify nft_fib_store_result() to just accept that pointer instead of the nft_pktinfo pointer and interface index. This also allows to drop the index to interface lookup previously needed to get the name associated with given index.
Fixes: 055c4b34b94f6 ("netfilter: nft_fib: Support existence check") Signed-off-by: Phil Sutter phil@nwl.cc Signed-off-by: Pablo Neira Ayuso pablo@netfilter.org Signed-off-by: Sasha Levin sashal@kernel.org --- include/net/netfilter/nft_fib.h | 2 +- net/ipv4/netfilter/nft_fib_ipv4.c | 23 +++-------------------- net/ipv6/netfilter/nft_fib_ipv6.c | 16 ++-------------- net/netfilter/nft_fib.c | 6 +++--- 4 files changed, 9 insertions(+), 38 deletions(-)
diff --git a/include/net/netfilter/nft_fib.h b/include/net/netfilter/nft_fib.h index a88f92737308..e4c4d8eaca8c 100644 --- a/include/net/netfilter/nft_fib.h +++ b/include/net/netfilter/nft_fib.h @@ -34,5 +34,5 @@ void nft_fib6_eval(const struct nft_expr *expr, struct nft_regs *regs, const struct nft_pktinfo *pkt);
void nft_fib_store_result(void *reg, const struct nft_fib *priv, - const struct nft_pktinfo *pkt, int index); + const struct net_device *dev); #endif diff --git a/net/ipv4/netfilter/nft_fib_ipv4.c b/net/ipv4/netfilter/nft_fib_ipv4.c index 94eb25bc8d7e..c8888e52591f 100644 --- a/net/ipv4/netfilter/nft_fib_ipv4.c +++ b/net/ipv4/netfilter/nft_fib_ipv4.c @@ -58,11 +58,6 @@ void nft_fib4_eval_type(const struct nft_expr *expr, struct nft_regs *regs, } EXPORT_SYMBOL_GPL(nft_fib4_eval_type);
-static int get_ifindex(const struct net_device *dev) -{ - return dev ? dev->ifindex : 0; -} - void nft_fib4_eval(const struct nft_expr *expr, struct nft_regs *regs, const struct nft_pktinfo *pkt) { @@ -94,8 +89,7 @@ void nft_fib4_eval(const struct nft_expr *expr, struct nft_regs *regs,
if (nft_hook(pkt) == NF_INET_PRE_ROUTING && nft_fib_is_loopback(pkt->skb, nft_in(pkt))) { - nft_fib_store_result(dest, priv, pkt, - nft_in(pkt)->ifindex); + nft_fib_store_result(dest, priv, nft_in(pkt)); return; }
@@ -108,8 +102,7 @@ void nft_fib4_eval(const struct nft_expr *expr, struct nft_regs *regs, if (ipv4_is_zeronet(iph->saddr)) { if (ipv4_is_lbcast(iph->daddr) || ipv4_is_local_multicast(iph->daddr)) { - nft_fib_store_result(dest, priv, pkt, - get_ifindex(pkt->skb->dev)); + nft_fib_store_result(dest, priv, pkt->skb->dev); return; } } @@ -150,17 +143,7 @@ void nft_fib4_eval(const struct nft_expr *expr, struct nft_regs *regs, found = oif; }
- switch (priv->result) { - case NFT_FIB_RESULT_OIF: - *dest = found->ifindex; - break; - case NFT_FIB_RESULT_OIFNAME: - strncpy((char *)dest, found->name, IFNAMSIZ); - break; - default: - WARN_ON_ONCE(1); - break; - } + nft_fib_store_result(dest, priv, found); } EXPORT_SYMBOL_GPL(nft_fib4_eval);
diff --git a/net/ipv6/netfilter/nft_fib_ipv6.c b/net/ipv6/netfilter/nft_fib_ipv6.c index 73cdc0bc63f7..ec068b0cffca 100644 --- a/net/ipv6/netfilter/nft_fib_ipv6.c +++ b/net/ipv6/netfilter/nft_fib_ipv6.c @@ -169,8 +169,7 @@ void nft_fib6_eval(const struct nft_expr *expr, struct nft_regs *regs,
if (nft_hook(pkt) == NF_INET_PRE_ROUTING && nft_fib_is_loopback(pkt->skb, nft_in(pkt))) { - nft_fib_store_result(dest, priv, pkt, - nft_in(pkt)->ifindex); + nft_fib_store_result(dest, priv, nft_in(pkt)); return; }
@@ -187,18 +186,7 @@ void nft_fib6_eval(const struct nft_expr *expr, struct nft_regs *regs, if (oif && oif != rt->rt6i_idev->dev) goto put_rt_err;
- switch (priv->result) { - case NFT_FIB_RESULT_OIF: - *dest = rt->rt6i_idev->dev->ifindex; - break; - case NFT_FIB_RESULT_OIFNAME: - strncpy((char *)dest, rt->rt6i_idev->dev->name, IFNAMSIZ); - break; - default: - WARN_ON_ONCE(1); - break; - } - + nft_fib_store_result(dest, priv, rt->rt6i_idev->dev); put_rt_err: ip6_rt_put(rt); } diff --git a/net/netfilter/nft_fib.c b/net/netfilter/nft_fib.c index 21df8cccea65..77f00a99dfab 100644 --- a/net/netfilter/nft_fib.c +++ b/net/netfilter/nft_fib.c @@ -135,17 +135,17 @@ int nft_fib_dump(struct sk_buff *skb, const struct nft_expr *expr) EXPORT_SYMBOL_GPL(nft_fib_dump);
void nft_fib_store_result(void *reg, const struct nft_fib *priv, - const struct nft_pktinfo *pkt, int index) + const struct net_device *dev) { - struct net_device *dev; u32 *dreg = reg; + int index;
switch (priv->result) { case NFT_FIB_RESULT_OIF: + index = dev ? dev->ifindex : 0; *dreg = (priv->flags & NFTA_FIB_F_PRESENT) ? !!index : index; break; case NFT_FIB_RESULT_OIFNAME: - dev = dev_get_by_index_rcu(nft_net(pkt), index); if (priv->flags & NFTA_FIB_F_PRESENT) *dreg = !!dev; else
[ Upstream commit 719c7d563c17b150877cee03a4b812a424989dfa ]
BUG: KASAN: use-after-free in ip_vs_in.part.29+0xe8/0xd20 [ip_vs] Read of size 4 at addr ffff8881e9b26e2c by task sshd/5603
CPU: 0 PID: 5603 Comm: sshd Not tainted 4.19.39+ #30 Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011 Call Trace: dump_stack+0x71/0xab print_address_description+0x6a/0x270 kasan_report+0x179/0x2c0 ip_vs_in.part.29+0xe8/0xd20 [ip_vs] ip_vs_in+0xd8/0x170 [ip_vs] nf_hook_slow+0x5f/0xe0 __ip_local_out+0x1d5/0x250 ip_local_out+0x19/0x60 __tcp_transmit_skb+0xba1/0x14f0 tcp_write_xmit+0x41f/0x1ed0 ? _copy_from_iter_full+0xca/0x340 __tcp_push_pending_frames+0x52/0x140 tcp_sendmsg_locked+0x787/0x1600 ? tcp_sendpage+0x60/0x60 ? inet_sk_set_state+0xb0/0xb0 tcp_sendmsg+0x27/0x40 sock_sendmsg+0x6d/0x80 sock_write_iter+0x121/0x1c0 ? sock_sendmsg+0x80/0x80 __vfs_write+0x23e/0x370 vfs_write+0xe7/0x230 ksys_write+0xa1/0x120 ? __ia32_sys_read+0x50/0x50 ? __audit_syscall_exit+0x3ce/0x450 do_syscall_64+0x73/0x200 entry_SYSCALL_64_after_hwframe+0x44/0xa9 RIP: 0033:0x7ff6f6147c60 Code: 73 01 c3 48 8b 0d 28 12 2d 00 f7 d8 64 89 01 48 83 c8 ff c3 66 0f 1f 44 00 00 83 3d 5d 73 2d 00 00 75 10 b8 01 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 31 c3 48 83 RSP: 002b:00007ffd772ead18 EFLAGS: 00000246 ORIG_RAX: 0000000000000001 RAX: ffffffffffffffda RBX: 0000000000000034 RCX: 00007ff6f6147c60 RDX: 0000000000000034 RSI: 000055df30a31270 RDI: 0000000000000003 RBP: 000055df30a31270 R08: 0000000000000000 R09: 0000000000000000 R10: 00007ffd772ead70 R11: 0000000000000246 R12: 00007ffd772ead74 R13: 00007ffd772eae20 R14: 00007ffd772eae24 R15: 000055df2f12ddc0
Allocated by task 6052: kasan_kmalloc+0xa0/0xd0 __kmalloc+0x10a/0x220 ops_init+0x97/0x190 register_pernet_operations+0x1ac/0x360 register_pernet_subsys+0x24/0x40 0xffffffffc0ea016d do_one_initcall+0x8b/0x253 do_init_module+0xe3/0x335 load_module+0x2fc0/0x3890 __do_sys_finit_module+0x192/0x1c0 do_syscall_64+0x73/0x200 entry_SYSCALL_64_after_hwframe+0x44/0xa9
Freed by task 6067: __kasan_slab_free+0x130/0x180 kfree+0x90/0x1a0 ops_free_list.part.7+0xa6/0xc0 unregister_pernet_operations+0x18b/0x1f0 unregister_pernet_subsys+0x1d/0x30 ip_vs_cleanup+0x1d/0xd2f [ip_vs] __x64_sys_delete_module+0x20c/0x300 do_syscall_64+0x73/0x200 entry_SYSCALL_64_after_hwframe+0x44/0xa9
The buggy address belongs to the object at ffff8881e9b26600 which belongs to the cache kmalloc-4096 of size 4096 The buggy address is located 2092 bytes inside of 4096-byte region [ffff8881e9b26600, ffff8881e9b27600) The buggy address belongs to the page: page:ffffea0007a6c800 count:1 mapcount:0 mapping:ffff888107c0e600 index:0x0 compound_mapcount: 0 flags: 0x17ffffc0008100(slab|head) raw: 0017ffffc0008100 dead000000000100 dead000000000200 ffff888107c0e600 raw: 0000000000000000 0000000080070007 00000001ffffffff 0000000000000000 page dumped because: kasan: bad access detected
while unregistering ipvs module, ops_free_list calls __ip_vs_cleanup, then nf_unregister_net_hooks be called to do remove nf hook entries. It need a RCU period to finish, however net->ipvs is set to NULL immediately, which will trigger NULL pointer dereference when a packet is hooked and handled by ip_vs_in where net->ipvs is dereferenced.
Another scene is ops_free_list call ops_free to free the net_generic directly while __ip_vs_cleanup finished, then calling ip_vs_in will triggers use-after-free.
This patch moves nf_unregister_net_hooks from __ip_vs_cleanup() to __ip_vs_dev_cleanup(), where rcu_barrier() is called by unregister_pernet_device -> unregister_pernet_operations, that will do the needed grace period.
Reported-by: Hulk Robot hulkci@huawei.com Fixes: efe41606184e ("ipvs: convert to use pernet nf_hook api") Suggested-by: Julian Anastasov ja@ssi.bg Signed-off-by: YueHaibing yuehaibing@huawei.com Acked-by: Julian Anastasov ja@ssi.bg Signed-off-by: Simon Horman horms@verge.net.au Signed-off-by: Pablo Neira Ayuso pablo@netfilter.org Signed-off-by: Sasha Levin sashal@kernel.org --- net/netfilter/ipvs/ip_vs_core.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/net/netfilter/ipvs/ip_vs_core.c b/net/netfilter/ipvs/ip_vs_core.c index 14457551bcb4..8ebf21149ec3 100644 --- a/net/netfilter/ipvs/ip_vs_core.c +++ b/net/netfilter/ipvs/ip_vs_core.c @@ -2312,7 +2312,6 @@ static void __net_exit __ip_vs_cleanup(struct net *net) { struct netns_ipvs *ipvs = net_ipvs(net);
- nf_unregister_net_hooks(net, ip_vs_ops, ARRAY_SIZE(ip_vs_ops)); ip_vs_service_net_cleanup(ipvs); /* ip_vs_flush() with locks */ ip_vs_conn_net_cleanup(ipvs); ip_vs_app_net_cleanup(ipvs); @@ -2327,6 +2326,7 @@ static void __net_exit __ip_vs_dev_cleanup(struct net *net) { struct netns_ipvs *ipvs = net_ipvs(net); EnterFunction(2); + nf_unregister_net_hooks(net, ip_vs_ops, ARRAY_SIZE(ip_vs_ops)); ipvs->enable = 0; /* Disable packet reception */ smp_wmb(); ip_vs_sync_net_cleanup(ipvs);
[ Upstream commit 82ce6eb1dd13fd12e449b2ee2c2ec051e6f52c43 ]
A test for the basic NAT functionality uses ip command which needs veth device. There is a condition where the kernel support for veth is not compiled into the kernel and the test script breaks. This patch contains code for reasonable error display and correct code exit.
Signed-off-by: Jeffrin Jose T jeffrin@rajagiritech.edu.in Acked-by: Florian Westphal fw@strlen.de Signed-off-by: Pablo Neira Ayuso pablo@netfilter.org Signed-off-by: Sasha Levin sashal@kernel.org --- tools/testing/selftests/netfilter/nft_nat.sh | 6 +++++- 1 file changed, 5 insertions(+), 1 deletion(-)
diff --git a/tools/testing/selftests/netfilter/nft_nat.sh b/tools/testing/selftests/netfilter/nft_nat.sh index 3194007cf8d1..a59c5fd4e987 100755 --- a/tools/testing/selftests/netfilter/nft_nat.sh +++ b/tools/testing/selftests/netfilter/nft_nat.sh @@ -23,7 +23,11 @@ ip netns add ns0 ip netns add ns1 ip netns add ns2
-ip link add veth0 netns ns0 type veth peer name eth0 netns ns1 +ip link add veth0 netns ns0 type veth peer name eth0 netns ns1 > /dev/null 2>&1 +if [ $? -ne 0 ];then + echo "SKIP: No virtual ethernet pair device support in kernel" + exit $ksft_skip +fi ip link add veth1 netns ns0 type veth peer name eth0 netns ns2
ip -net ns0 link set lo up
[ Upstream commit 1cc54078d104f5b4d7e9f8d55362efa5a8daffdb ]
We need to always call clkdm_clk_enable() and clkdm_clk_disable() even the clkctrl clock(s) enabled for the domain do not have any gate register bits. Otherwise clockdomains may never get enabled except when devices get probed with the legacy "ti,hwmods" devicetree property.
Fixes: 88a172526c32 ("clk: ti: add support for clkctrl clocks") Signed-off-by: Tony Lindgren tony@atomide.com Signed-off-by: Stephen Boyd sboyd@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/clk/ti/clkctrl.c | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/drivers/clk/ti/clkctrl.c b/drivers/clk/ti/clkctrl.c index 639f515e08f0..3325ee43bcc1 100644 --- a/drivers/clk/ti/clkctrl.c +++ b/drivers/clk/ti/clkctrl.c @@ -137,9 +137,6 @@ static int _omap4_clkctrl_clk_enable(struct clk_hw *hw) int ret; union omap4_timeout timeout = { 0 };
- if (!clk->enable_bit) - return 0; - if (clk->clkdm) { ret = ti_clk_ll_ops->clkdm_clk_enable(clk->clkdm, hw->clk); if (ret) { @@ -151,6 +148,9 @@ static int _omap4_clkctrl_clk_enable(struct clk_hw *hw) } }
+ if (!clk->enable_bit) + return 0; + val = ti_clk_ll_ops->clk_readl(&clk->enable_reg);
val &= ~OMAP4_MODULEMODE_MASK; @@ -179,7 +179,7 @@ static void _omap4_clkctrl_clk_disable(struct clk_hw *hw) union omap4_timeout timeout = { 0 };
if (!clk->enable_bit) - return; + goto exit;
val = ti_clk_ll_ops->clk_readl(&clk->enable_reg);
[ Upstream commit b59bd3527fe3c1939340df558d7f9d568fc9f882 ]
Currently init_imc_pmu() can fail either because we try to register an IMC unit with an invalid domain (i.e an IMC node not supported by the kernel) or something went wrong while registering a valid IMC unit. In both the cases kernel provides a 'Register failed' error message.
For example when trace-imc node is not supported by the kernel, but skiboot advertises a trace-imc node we print:
IMC Unknown Device type IMC PMU (null) Register failed
To avoid confusion just print the unknown device type message, before attempting PMU registration, so the second message isn't printed.
Fixes: 8f95faaac56c ("powerpc/powernv: Detect and create IMC device") Reported-by: Pavaman Subramaniyam pavsubra@in.ibm.com Signed-off-by: Anju T Sudhakar anju@linux.vnet.ibm.com Reviewed-by: Madhavan Srinivasan maddy@linux.vnet.ibm.com [mpe: Reword change log a bit] Signed-off-by: Michael Ellerman mpe@ellerman.id.au Signed-off-by: Sasha Levin sashal@kernel.org --- arch/powerpc/platforms/powernv/opal-imc.c | 4 ++++ 1 file changed, 4 insertions(+)
diff --git a/arch/powerpc/platforms/powernv/opal-imc.c b/arch/powerpc/platforms/powernv/opal-imc.c index 3d27f02695e4..828f6656f8f7 100644 --- a/arch/powerpc/platforms/powernv/opal-imc.c +++ b/arch/powerpc/platforms/powernv/opal-imc.c @@ -161,6 +161,10 @@ static int imc_pmu_create(struct device_node *parent, int pmu_index, int domain) struct imc_pmu *pmu_ptr; u32 offset;
+ /* Return for unknown domain */ + if (domain < 0) + return -EINVAL; + /* memory for pmu */ pmu_ptr = kzalloc(sizeof(*pmu_ptr), GFP_KERNEL); if (!pmu_ptr)
[ Upstream commit 5bce256f0b528624a34fe907db385133bb7be33e ]
In xhci_debugfs_create_slot(), kzalloc() can fail and dev->debugfs_private will be NULL. In xhci_debugfs_create_endpoint(), dev->debugfs_private is used without any null-pointer check, and can cause a null pointer dereference.
To fix this bug, a null-pointer check is added in xhci_debugfs_create_endpoint().
This bug is found by a runtime fuzzing tool named FIZZER written by us.
[subjet line change change, add potential -Mathais] Signed-off-by: Jia-Ju Bai baijiaju1990@gmail.com Reviewed-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Mathias Nyman mathias.nyman@linux.intel.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/usb/host/xhci-debugfs.c | 3 +++ 1 file changed, 3 insertions(+)
diff --git a/drivers/usb/host/xhci-debugfs.c b/drivers/usb/host/xhci-debugfs.c index cadc01336bf8..7ba6afc7ef23 100644 --- a/drivers/usb/host/xhci-debugfs.c +++ b/drivers/usb/host/xhci-debugfs.c @@ -440,6 +440,9 @@ void xhci_debugfs_create_endpoint(struct xhci_hcd *xhci, struct xhci_ep_priv *epriv; struct xhci_slot_priv *spriv = dev->debugfs_private;
+ if (!spriv) + return; + if (spriv->eps[ep_index]) return;
[ Upstream commit ccfb62f27beb295103e9392462b20a6ed807d0ea ]
The user can change the device_name with the IMSETDEVNAME ioctl, but we need to ensure that the user's name is NUL terminated. Otherwise it could result in a buffer overflow when we copy the name back to the user with IMGETDEVINFO ioctl.
I also changed two strcpy() calls which handle the name to strscpy(). Hopefully, there aren't any other ways to create a too long name, but it's nice to do this as a kernel hardening measure.
Signed-off-by: Dan Carpenter dan.carpenter@oracle.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/isdn/mISDN/socket.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-)
diff --git a/drivers/isdn/mISDN/socket.c b/drivers/isdn/mISDN/socket.c index a14e35d40538..84e1d4c2db66 100644 --- a/drivers/isdn/mISDN/socket.c +++ b/drivers/isdn/mISDN/socket.c @@ -393,7 +393,7 @@ data_sock_ioctl(struct socket *sock, unsigned int cmd, unsigned long arg) memcpy(di.channelmap, dev->channelmap, sizeof(di.channelmap)); di.nrbchan = dev->nrbchan; - strcpy(di.name, dev_name(&dev->dev)); + strscpy(di.name, dev_name(&dev->dev), sizeof(di.name)); if (copy_to_user((void __user *)arg, &di, sizeof(di))) err = -EFAULT; } else @@ -676,7 +676,7 @@ base_sock_ioctl(struct socket *sock, unsigned int cmd, unsigned long arg) memcpy(di.channelmap, dev->channelmap, sizeof(di.channelmap)); di.nrbchan = dev->nrbchan; - strcpy(di.name, dev_name(&dev->dev)); + strscpy(di.name, dev_name(&dev->dev), sizeof(di.name)); if (copy_to_user((void __user *)arg, &di, sizeof(di))) err = -EFAULT; } else @@ -690,6 +690,7 @@ base_sock_ioctl(struct socket *sock, unsigned int cmd, unsigned long arg) err = -EFAULT; break; } + dn.name[sizeof(dn.name) - 1] = '\0'; dev = get_mdevice(dn.id); if (dev) err = device_rename(&dev->dev, dn.name);
[ Upstream commit 2ac44ab608705948564791ce1d15d43ba81a1e38 ]
For F17h AMD CPUs, the CPB capability ('Core Performance Boost') is forcibly set, because some versions of that chip incorrectly report that they do not have it.
However, a hypervisor may filter out the CPB capability, for good reasons. For example, KVM currently does not emulate setting the CPB bit in MSR_K7_HWCR, and unchecked MSR access errors will be thrown when trying to set it as a guest:
unchecked MSR access error: WRMSR to 0xc0010015 (tried to write 0x0000000001000011) at rIP: 0xffffffff890638f4 (native_write_msr+0x4/0x20)
Call Trace: boost_set_msr+0x50/0x80 [acpi_cpufreq] cpuhp_invoke_callback+0x86/0x560 sort_range+0x20/0x20 cpuhp_thread_fun+0xb0/0x110 smpboot_thread_fn+0xef/0x160 kthread+0x113/0x130 kthread_create_worker_on_cpu+0x70/0x70 ret_from_fork+0x35/0x40
To avoid this issue, don't forcibly set the CPB capability for a CPU when running under a hypervisor.
Signed-off-by: Frank van der Linden fllinden@amazon.com Acked-by: Borislav Petkov bp@suse.de Cc: Andy Lutomirski luto@kernel.org Cc: Linus Torvalds torvalds@linux-foundation.org Cc: Peter Zijlstra peterz@infradead.org Cc: Thomas Gleixner tglx@linutronix.de Cc: bp@alien8.de Cc: jiaxun.yang@flygoat.com Fixes: 0237199186e7 ("x86/CPU/AMD: Set the CPB bit unconditionally on F17h") Link: http://lkml.kernel.org/r/20190522221745.GA15789@dev-dsk-fllinden-2c-c1893d73... [ Minor edits to the changelog. ] Signed-off-by: Ingo Molnar mingo@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- arch/x86/kernel/cpu/amd.c | 7 +++++-- 1 file changed, 5 insertions(+), 2 deletions(-)
diff --git a/arch/x86/kernel/cpu/amd.c b/arch/x86/kernel/cpu/amd.c index 01004bfb1a1b..524709dcf749 100644 --- a/arch/x86/kernel/cpu/amd.c +++ b/arch/x86/kernel/cpu/amd.c @@ -820,8 +820,11 @@ static void init_amd_zn(struct cpuinfo_x86 *c) { set_cpu_cap(c, X86_FEATURE_ZEN);
- /* Fix erratum 1076: CPB feature bit not being set in CPUID. */ - if (!cpu_has(c, X86_FEATURE_CPB)) + /* + * Fix erratum 1076: CPB feature bit not being set in CPUID. + * Always set it, except when running under a hypervisor. + */ + if (!cpu_has(c, X86_FEATURE_HYPERVISOR) && !cpu_has(c, X86_FEATURE_CPB)) set_cpu_cap(c, X86_FEATURE_CPB); }
[ Upstream commit 1b038c6e05ff70a1e66e3e571c2e6106bdb75f53 ]
In perf_output_put_handle(), an IRQ/NMI can happen in below location and write records to the same ring buffer:
... local_dec_and_test(&rb->nest) ... <-- an IRQ/NMI can happen here rb->user_page->data_head = head; ...
In this case, a value A is written to data_head in the IRQ, then a value B is written to data_head after the IRQ. And A > B. As a result, data_head is temporarily decreased from A to B. And a reader may see data_head < data_tail if it read the buffer frequently enough, which creates unexpected behaviors.
This can be fixed by moving dec(&rb->nest) to after updating data_head, which prevents the IRQ/NMI above from updating data_head.
[ Split up by peterz. ]
Signed-off-by: Yabin Cui yabinc@google.com Signed-off-by: Peter Zijlstra (Intel) peterz@infradead.org Cc: Alexander Shishkin alexander.shishkin@linux.intel.com Cc: Arnaldo Carvalho de Melo acme@kernel.org Cc: Arnaldo Carvalho de Melo acme@redhat.com Cc: Jiri Olsa jolsa@redhat.com Cc: Linus Torvalds torvalds@linux-foundation.org Cc: Namhyung Kim namhyung@kernel.org Cc: Peter Zijlstra peterz@infradead.org Cc: Stephane Eranian eranian@google.com Cc: Thomas Gleixner tglx@linutronix.de Cc: Vince Weaver vincent.weaver@maine.edu Cc: mark.rutland@arm.com Fixes: ef60777c9abd ("perf: Optimize the perf_output() path by removing IRQ-disables") Link: http://lkml.kernel.org/r/20190517115418.224478157@infradead.org Signed-off-by: Ingo Molnar mingo@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- kernel/events/ring_buffer.c | 24 ++++++++++++++++++++---- 1 file changed, 20 insertions(+), 4 deletions(-)
diff --git a/kernel/events/ring_buffer.c b/kernel/events/ring_buffer.c index 674b35383491..009467a60578 100644 --- a/kernel/events/ring_buffer.c +++ b/kernel/events/ring_buffer.c @@ -51,11 +51,18 @@ static void perf_output_put_handle(struct perf_output_handle *handle) head = local_read(&rb->head);
/* - * IRQ/NMI can happen here, which means we can miss a head update. + * IRQ/NMI can happen here and advance @rb->head, causing our + * load above to be stale. */
- if (!local_dec_and_test(&rb->nest)) + /* + * If this isn't the outermost nesting, we don't have to update + * @rb->user_page->data_head. + */ + if (local_read(&rb->nest) > 1) { + local_dec(&rb->nest); goto out; + }
/* * Since the mmap() consumer (userspace) can run on a different CPU: @@ -87,9 +94,18 @@ static void perf_output_put_handle(struct perf_output_handle *handle) rb->user_page->data_head = head;
/* - * Now check if we missed an update -- rely on previous implied - * compiler barriers to force a re-read. + * We must publish the head before decrementing the nest count, + * otherwise an IRQ/NMI can publish a more recent head value and our + * write will (temporarily) publish a stale value. + */ + barrier(); + local_set(&rb->nest, 0); + + /* + * Ensure we decrement @rb->nest before we validate the @rb->head. + * Otherwise we cannot be sure we caught the 'last' nested update. */ + barrier(); if (unlikely(head != local_read(&rb->head))) { local_inc(&rb->nest); goto again;
[ Upstream commit 3f9fbe9bd86c534eba2faf5d840fd44c6049f50e ]
Similar to how decrementing rb->next too early can cause data_head to (temporarily) be observed to go backward, so too can this happen when we increment too late.
This barrier() ensures the rb->head load happens after the increment, both the one in the 'goto again' path, as the one from perf_output_get_handle() -- albeit very unlikely to matter for the latter.
Suggested-by: Yabin Cui yabinc@google.com Signed-off-by: Peter Zijlstra (Intel) peterz@infradead.org Cc: Alexander Shishkin alexander.shishkin@linux.intel.com Cc: Arnaldo Carvalho de Melo acme@redhat.com Cc: Jiri Olsa jolsa@redhat.com Cc: Linus Torvalds torvalds@linux-foundation.org Cc: Peter Zijlstra peterz@infradead.org Cc: Stephane Eranian eranian@google.com Cc: Thomas Gleixner tglx@linutronix.de Cc: Vince Weaver vincent.weaver@maine.edu Cc: acme@kernel.org Cc: mark.rutland@arm.com Cc: namhyung@kernel.org Fixes: ef60777c9abd ("perf: Optimize the perf_output() path by removing IRQ-disables") Link: http://lkml.kernel.org/r/20190517115418.309516009@infradead.org Signed-off-by: Ingo Molnar mingo@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- kernel/events/ring_buffer.c | 9 +++++++++ 1 file changed, 9 insertions(+)
diff --git a/kernel/events/ring_buffer.c b/kernel/events/ring_buffer.c index 009467a60578..4b5f8d932400 100644 --- a/kernel/events/ring_buffer.c +++ b/kernel/events/ring_buffer.c @@ -48,6 +48,15 @@ static void perf_output_put_handle(struct perf_output_handle *handle) unsigned long head;
again: + /* + * In order to avoid publishing a head value that goes backwards, + * we must ensure the load of @rb->head happens after we've + * incremented @rb->nest. + * + * Otherwise we can observe a @rb->head value before one published + * by an IRQ/NMI happening between the load and the increment. + */ + barrier(); head = local_read(&rb->head);
/*
[ Upstream commit 4d839dd9e4356bbacf3eb0ab13a549b83b008c21 ]
We must use {READ,WRITE}_ONCE() on rb->user_page data such that concurrent usage will see whole values. A few key sites were missing this.
Suggested-by: Yabin Cui yabinc@google.com Signed-off-by: Peter Zijlstra (Intel) peterz@infradead.org Cc: Alexander Shishkin alexander.shishkin@linux.intel.com Cc: Arnaldo Carvalho de Melo acme@redhat.com Cc: Jiri Olsa jolsa@redhat.com Cc: Linus Torvalds torvalds@linux-foundation.org Cc: Peter Zijlstra peterz@infradead.org Cc: Stephane Eranian eranian@google.com Cc: Thomas Gleixner tglx@linutronix.de Cc: Vince Weaver vincent.weaver@maine.edu Cc: acme@kernel.org Cc: mark.rutland@arm.com Cc: namhyung@kernel.org Fixes: 7b732a750477 ("perf_counter: new output ABI - part 1") Link: http://lkml.kernel.org/r/20190517115418.394192145@infradead.org Signed-off-by: Ingo Molnar mingo@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- kernel/events/ring_buffer.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/kernel/events/ring_buffer.c b/kernel/events/ring_buffer.c index 4b5f8d932400..7a0c73e4b3eb 100644 --- a/kernel/events/ring_buffer.c +++ b/kernel/events/ring_buffer.c @@ -100,7 +100,7 @@ static void perf_output_put_handle(struct perf_output_handle *handle) * See perf_output_begin(). */ smp_wmb(); /* B, matches C */ - rb->user_page->data_head = head; + WRITE_ONCE(rb->user_page->data_head, head);
/* * We must publish the head before decrementing the nest count, @@ -496,7 +496,7 @@ void perf_aux_output_end(struct perf_output_handle *handle, unsigned long size) perf_event_aux_event(handle->event, aux_head, size, handle->aux_flags);
- rb->user_page->aux_head = rb->aux_head; + WRITE_ONCE(rb->user_page->aux_head, rb->aux_head); if (rb_need_aux_wakeup(rb)) wakeup = true;
@@ -528,7 +528,7 @@ int perf_aux_output_skip(struct perf_output_handle *handle, unsigned long size)
rb->aux_head += size;
- rb->user_page->aux_head = rb->aux_head; + WRITE_ONCE(rb->user_page->aux_head, rb->aux_head); if (rb_need_aux_wakeup(rb)) { perf_output_wakeup(handle); handle->wakeup = rb->aux_wakeup + rb->aux_watermark;
[ Upstream commit e9646f0f5bb62b7d43f0968f39d536cfe7123b53 ]
The gpio-adp5588 driver uses interfaces that are provided by GPIOLIB_IRQCHIP, so select that symbol in its Kconfig entry.
Fixes these build errors:
../drivers/gpio/gpio-adp5588.c: In function ‘adp5588_irq_handler’: ../drivers/gpio/gpio-adp5588.c:266:26: error: ‘struct gpio_chip’ has no member named ‘irq’ dev->gpio_chip.irq.domain, gpio)); ^ ../drivers/gpio/gpio-adp5588.c: In function ‘adp5588_irq_setup’: ../drivers/gpio/gpio-adp5588.c:298:2: error: implicit declaration of function ‘gpiochip_irqchip_add_nested’ [-Werror=implicit-function-declaration] ret = gpiochip_irqchip_add_nested(&dev->gpio_chip, ^ ../drivers/gpio/gpio-adp5588.c:307:2: error: implicit declaration of function ‘gpiochip_set_nested_irqchip’ [-Werror=implicit-function-declaration] gpiochip_set_nested_irqchip(&dev->gpio_chip, ^
Fixes: 459773ae8dbb ("gpio: adp5588-gpio: support interrupt controller") Reported-by: kbuild test robot lkp@intel.com Signed-off-by: Randy Dunlap rdunlap@infradead.org Cc: linux-gpio@vger.kernel.org Reviewed-by: Bartosz Golaszewski bgolaszewski@baylibre.com Acked-by: Michael Hennerich michael.hennerich@analog.com Signed-off-by: Linus Walleij linus.walleij@linaro.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpio/Kconfig | 1 + 1 file changed, 1 insertion(+)
diff --git a/drivers/gpio/Kconfig b/drivers/gpio/Kconfig index 3f50526a771f..864a1ba7aa3a 100644 --- a/drivers/gpio/Kconfig +++ b/drivers/gpio/Kconfig @@ -824,6 +824,7 @@ config GPIO_ADP5588 config GPIO_ADP5588_IRQ bool "Interrupt controller support for ADP5588" depends on GPIO_ADP5588=y + select GPIOLIB_IRQCHIP help Say yes here to enable the adp5588 to be used as an interrupt controller. It requires the driver to be built in the kernel.
[ Upstream commit 4523a5611526709ec9b4e2574f1bb7818212651e ]
Currently we will not update the receive descriptor tail pointer in stmmac_rx_refill. Rx dma will think no available descriptors and stop once received packets exceed DMA_RX_SIZE, so that the rx only test will fail.
Update the receive tail pointer in stmmac_rx_refill to add more descriptors to the rx channel, so packets can be received continually
Fixes: 54139cf3bb33 ("net: stmmac: adding multiple buffers for rx") Signed-off-by: Biao Huang biao.huang@mediatek.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/stmicro/stmmac/stmmac_main.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c index 3c409862c52e..8cebc44108b2 100644 --- a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c +++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c @@ -3338,6 +3338,7 @@ static inline void stmmac_rx_refill(struct stmmac_priv *priv, u32 queue) entry = STMMAC_GET_ENTRY(entry, DMA_RX_SIZE); } rx_q->dirty_rx = entry; + stmmac_set_rx_tail_ptr(priv, priv->ioaddr, rx_q->rx_tail_addr, queue); }
/**
[ Upstream commit 5e7f7fc538d894b2d9aa41876b8dcf35f5fe11e6 ]
The specific clk_csr value can be zero, and stmmac_clk is necessary for MDC clock which can be set dynamically. So, change the condition from plat->clk_csr to plat->stmmac_clk to fix clk_csr can't be zero issue.
Fixes: cd7201f477b9 ("stmmac: MDC clock dynamically based on the csr clock input") Signed-off-by: Biao Huang biao.huang@mediatek.com Acked-by: Alexandre TORGUE alexandre.torgue@st.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/stmicro/stmmac/stmmac_main.c | 6 +++--- drivers/net/ethernet/stmicro/stmmac/stmmac_platform.c | 5 ++++- 2 files changed, 7 insertions(+), 4 deletions(-)
diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c index 8cebc44108b2..635d88d82610 100644 --- a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c +++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c @@ -4380,10 +4380,10 @@ int stmmac_dvr_probe(struct device *device, * set the MDC clock dynamically according to the csr actual * clock input. */ - if (!priv->plat->clk_csr) - stmmac_clk_csr_set(priv); - else + if (priv->plat->clk_csr >= 0) priv->clk_csr = priv->plat->clk_csr; + else + stmmac_clk_csr_set(priv);
stmmac_check_pcs_mode(priv);
diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_platform.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_platform.c index 3031f2bf15d6..f45bfbef97d0 100644 --- a/drivers/net/ethernet/stmicro/stmmac/stmmac_platform.c +++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_platform.c @@ -408,7 +408,10 @@ stmmac_probe_config_dt(struct platform_device *pdev, const char **mac) /* Default to phy auto-detection */ plat->phy_addr = -1;
- /* Get clk_csr from device tree */ + /* Default to get clk_csr from stmmac_clk_crs_set(), + * or get clk_csr from device tree. + */ + plat->clk_csr = -1; of_property_read_u32(np, "clk_csr", &plat->clk_csr);
/* "snps,phy-addr" is not a standard property. Mark it as deprecated
[ Upstream commit f4ca7a9260dfe700f2a16f0881825de625067515 ]
1. the frequency of csr clock is 66.5MHz, so the csr_clk value should be 0 other than 5. 2. the csr_clk can be got from device tree, so remove initialization here.
Fixes: 9992f37e346b ("stmmac: dwmac-mediatek: add support for mt2712") Signed-off-by: Biao Huang biao.huang@mediatek.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/stmicro/stmmac/dwmac-mediatek.c | 2 -- 1 file changed, 2 deletions(-)
diff --git a/drivers/net/ethernet/stmicro/stmmac/dwmac-mediatek.c b/drivers/net/ethernet/stmicro/stmmac/dwmac-mediatek.c index bf2562995fc8..126b66bb73a6 100644 --- a/drivers/net/ethernet/stmicro/stmmac/dwmac-mediatek.c +++ b/drivers/net/ethernet/stmicro/stmmac/dwmac-mediatek.c @@ -346,8 +346,6 @@ static int mediatek_dwmac_probe(struct platform_device *pdev) return PTR_ERR(plat_dat);
plat_dat->interface = priv_plat->phy_mode; - /* clk_csr_i = 250-300MHz & MDC = clk_csr_i/124 */ - plat_dat->clk_csr = 5; plat_dat->has_gmac4 = 1; plat_dat->has_gmac = 0; plat_dat->pmt = 0;
[ Upstream commit a278682dad37fd2f8d2f30d8e84e376a856ab472 ]
If io_copy_iov() fails, it will break the loop and report success, albeit partially completed operation.
Signed-off-by: Pavel Begunkov asml.silence@gmail.com Signed-off-by: Jens Axboe axboe@kernel.dk Signed-off-by: Sasha Levin sashal@kernel.org --- fs/io_uring.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/fs/io_uring.c b/fs/io_uring.c index 4e32a033394c..e82adbf8adc1 100644 --- a/fs/io_uring.c +++ b/fs/io_uring.c @@ -2506,7 +2506,7 @@ static int io_sqe_buffer_register(struct io_ring_ctx *ctx, void __user *arg,
ret = io_copy_iov(ctx, &iov, arg, i); if (ret) - break; + goto err;
/* * Don't impose further limits on the size and buffer
[ Upstream commit 5a20a093d965560f632b2ec325f8876918f78165 ]
Smatch reports a potential spectre vulnerability in the dpaa2-eth driver, where the value of rxnfc->fs.location (which is provided from user-space) is used as index in an array.
Add a call to array_index_nospec() to sanitize the access.
Signed-off-by: Ioana Radulescu ruxandra.radulescu@nxp.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/freescale/dpaa2/dpaa2-ethtool.c | 3 +++ 1 file changed, 3 insertions(+)
diff --git a/drivers/net/ethernet/freescale/dpaa2/dpaa2-ethtool.c b/drivers/net/ethernet/freescale/dpaa2/dpaa2-ethtool.c index 591dfcf76adb..0610fc0bebc2 100644 --- a/drivers/net/ethernet/freescale/dpaa2/dpaa2-ethtool.c +++ b/drivers/net/ethernet/freescale/dpaa2/dpaa2-ethtool.c @@ -4,6 +4,7 @@ */
#include <linux/net_tstamp.h> +#include <linux/nospec.h>
#include "dpni.h" /* DPNI_LINK_OPT_* */ #include "dpaa2-eth.h" @@ -589,6 +590,8 @@ static int dpaa2_eth_get_rxnfc(struct net_device *net_dev, case ETHTOOL_GRXCLSRULE: if (rxnfc->fs.location >= max_rules) return -EINVAL; + rxnfc->fs.location = array_index_nospec(rxnfc->fs.location, + max_rules); if (!priv->cls_rules[rxnfc->fs.location].in_use) return -EINVAL; rxnfc->fs = priv->cls_rules[rxnfc->fs.location].fs;
[ Upstream commit bd8460fa4de46e9d6177af4fe33bf0763a7af4b7 ]
Use PTR_ERR_OR_ZERO instead of PTR_ERR in cases where zero is a valid input. Reported by smatch.
Signed-off-by: Ioana Radulescu ruxandra.radulescu@nxp.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/freescale/dpaa2/dpaa2-eth.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/net/ethernet/freescale/dpaa2/dpaa2-eth.c b/drivers/net/ethernet/freescale/dpaa2/dpaa2-eth.c index 57cbaa38d247..df371c81a706 100644 --- a/drivers/net/ethernet/freescale/dpaa2/dpaa2-eth.c +++ b/drivers/net/ethernet/freescale/dpaa2/dpaa2-eth.c @@ -1966,7 +1966,7 @@ alloc_channel(struct dpaa2_eth_priv *priv)
channel->dpcon = setup_dpcon(priv); if (IS_ERR_OR_NULL(channel->dpcon)) { - err = PTR_ERR(channel->dpcon); + err = PTR_ERR_OR_ZERO(channel->dpcon); goto err_setup; }
@@ -2022,7 +2022,7 @@ static int setup_dpio(struct dpaa2_eth_priv *priv) /* Try to allocate a channel */ channel = alloc_channel(priv); if (IS_ERR_OR_NULL(channel)) { - err = PTR_ERR(channel); + err = PTR_ERR_OR_ZERO(channel); if (err != -EPROBE_DEFER) dev_info(dev, "No affine channel for cpu %d and above\n", i);
[ Upstream commit 3e66b7cc50ef921121babc91487e1fb98af1ba6e ]
Building with Clang reports the redundant use of MODULE_DEVICE_TABLE():
drivers/net/ethernet/dec/tulip/de4x5.c:2110:1: error: redefinition of '__mod_eisa__de4x5_eisa_ids_device_table' MODULE_DEVICE_TABLE(eisa, de4x5_eisa_ids); ^ ./include/linux/module.h:229:21: note: expanded from macro 'MODULE_DEVICE_TABLE' extern typeof(name) __mod_##type##__##name##_device_table \ ^ <scratch space>:90:1: note: expanded from here __mod_eisa__de4x5_eisa_ids_device_table ^ drivers/net/ethernet/dec/tulip/de4x5.c:2100:1: note: previous definition is here MODULE_DEVICE_TABLE(eisa, de4x5_eisa_ids); ^ ./include/linux/module.h:229:21: note: expanded from macro 'MODULE_DEVICE_TABLE' extern typeof(name) __mod_##type##__##name##_device_table \ ^ <scratch space>:85:1: note: expanded from here __mod_eisa__de4x5_eisa_ids_device_table ^
This drops the one further from the table definition to match the common use of MODULE_DEVICE_TABLE().
Fixes: 07563c711fbc ("EISA bus MODALIAS attributes support") Signed-off-by: Kees Cook keescook@chromium.org Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/dec/tulip/de4x5.c | 1 - 1 file changed, 1 deletion(-)
diff --git a/drivers/net/ethernet/dec/tulip/de4x5.c b/drivers/net/ethernet/dec/tulip/de4x5.c index 66535d1653f6..f16853c3c851 100644 --- a/drivers/net/ethernet/dec/tulip/de4x5.c +++ b/drivers/net/ethernet/dec/tulip/de4x5.c @@ -2107,7 +2107,6 @@ static struct eisa_driver de4x5_eisa_driver = { .remove = de4x5_eisa_remove, } }; -MODULE_DEVICE_TABLE(eisa, de4x5_eisa_ids); #endif
#ifdef CONFIG_PCI
[ Upstream commit 9a51c6b1f9e0239a9435db036b212498a2a3b75c ]
Both acpi_pci_need_resume() and acpi_dev_needs_resume() check if the current ACPI wakeup configuration of the device matches what is expected as far as system wakeup from sleep states is concerned, as reflected by the device_may_wakeup() return value for the device.
However, they only should do that if wakeup.flags.valid is set for the device's ACPI companion, because otherwise the wakeup.prepare_count value for it is meaningless.
Add the missing wakeup.flags.valid checks to these functions.
Signed-off-by: Rafael J. Wysocki rafael.j.wysocki@intel.com Reviewed-by: Mika Westerberg mika.westerberg@linux.intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/acpi/device_pm.c | 4 ++-- drivers/pci/pci-acpi.c | 3 ++- 2 files changed, 4 insertions(+), 3 deletions(-)
diff --git a/drivers/acpi/device_pm.c b/drivers/acpi/device_pm.c index 824ae985ad93..ccb59768b1f3 100644 --- a/drivers/acpi/device_pm.c +++ b/drivers/acpi/device_pm.c @@ -949,8 +949,8 @@ static bool acpi_dev_needs_resume(struct device *dev, struct acpi_device *adev) u32 sys_target = acpi_target_system_state(); int ret, state;
- if (!pm_runtime_suspended(dev) || !adev || - device_may_wakeup(dev) != !!adev->wakeup.prepare_count) + if (!pm_runtime_suspended(dev) || !adev || (adev->wakeup.flags.valid && + device_may_wakeup(dev) != !!adev->wakeup.prepare_count)) return true;
if (sys_target == ACPI_STATE_S0) diff --git a/drivers/pci/pci-acpi.c b/drivers/pci/pci-acpi.c index e1949f7efd9c..bf32fde328c2 100644 --- a/drivers/pci/pci-acpi.c +++ b/drivers/pci/pci-acpi.c @@ -666,7 +666,8 @@ static bool acpi_pci_need_resume(struct pci_dev *dev) if (!adev || !acpi_device_power_manageable(adev)) return false;
- if (device_may_wakeup(&dev->dev) != !!adev->wakeup.prepare_count) + if (adev->wakeup.flags.valid && + device_may_wakeup(&dev->dev) != !!adev->wakeup.prepare_count) return true;
if (acpi_target_system_state() == ACPI_STATE_S0)
[ Upstream commit 1396500d673bd027683a0609ff84dca7eb6ea2e7 ]
The devcoredump needs to operate on a stable state of the MMU while it is writing the MMU state to the coredump. The missing lock allowed both the userspace submit, as well as the GPU job finish paths to mutate the MMU state while a coredump is under way.
Fixes: a8c21a5451d8 (drm/etnaviv: add initial etnaviv DRM driver) Reported-by: David Jander david@protonic.nl Signed-off-by: Lucas Stach l.stach@pengutronix.de Tested-by: David Jander david@protonic.nl Reviewed-by: Philipp Zabel p.zabel@pengutronix.de Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/etnaviv/etnaviv_dump.c | 5 +++++ 1 file changed, 5 insertions(+)
diff --git a/drivers/gpu/drm/etnaviv/etnaviv_dump.c b/drivers/gpu/drm/etnaviv/etnaviv_dump.c index 33854c94cb85..515515ef24f9 100644 --- a/drivers/gpu/drm/etnaviv/etnaviv_dump.c +++ b/drivers/gpu/drm/etnaviv/etnaviv_dump.c @@ -125,6 +125,8 @@ void etnaviv_core_dump(struct etnaviv_gpu *gpu) return; etnaviv_dump_core = false;
+ mutex_lock(&gpu->mmu->lock); + mmu_size = etnaviv_iommu_dump_size(gpu->mmu);
/* We always dump registers, mmu, ring and end marker */ @@ -167,6 +169,7 @@ void etnaviv_core_dump(struct etnaviv_gpu *gpu) iter.start = __vmalloc(file_size, GFP_KERNEL | __GFP_NOWARN | __GFP_NORETRY, PAGE_KERNEL); if (!iter.start) { + mutex_unlock(&gpu->mmu->lock); dev_warn(gpu->dev, "failed to allocate devcoredump file\n"); return; } @@ -234,6 +237,8 @@ void etnaviv_core_dump(struct etnaviv_gpu *gpu) obj->base.size); }
+ mutex_unlock(&gpu->mmu->lock); + etnaviv_core_dump_header(&iter, ETDUMP_BUF_END, iter.data);
dev_coredumpv(gpu->dev, iter.start, iter.data - iter.start, GFP_KERNEL);
[ Upstream commit 31bafc49a7736989e4c2d9f7280002c66536e590 ]
In case no other traffic happening on the ring, full tx cleanup may not be completed. That may cause socket buffer to overflow and tx traffic to stuck until next activity on the ring happens.
This is due to logic error in budget variable decrementor. Variable is compared with zero, and then post decremented, causing it to become MAX_INT. Solution is remove decrementor from the `for` statement and rewrite it in a clear way.
Fixes: b647d3980948e ("net: aquantia: Add tx clean budget and valid budget handling logic") Signed-off-by: Igor Russkikh igor.russkikh@aquantia.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/aquantia/atlantic/aq_ring.c | 7 ++++--- 1 file changed, 4 insertions(+), 3 deletions(-)
diff --git a/drivers/net/ethernet/aquantia/atlantic/aq_ring.c b/drivers/net/ethernet/aquantia/atlantic/aq_ring.c index e2ffb159cbe2..bf4aa7060f1a 100644 --- a/drivers/net/ethernet/aquantia/atlantic/aq_ring.c +++ b/drivers/net/ethernet/aquantia/atlantic/aq_ring.c @@ -139,10 +139,10 @@ void aq_ring_queue_stop(struct aq_ring_s *ring) bool aq_ring_tx_clean(struct aq_ring_s *self) { struct device *dev = aq_nic_get_dev(self->aq_nic); - unsigned int budget = AQ_CFG_TX_CLEAN_BUDGET; + unsigned int budget;
- for (; self->sw_head != self->hw_head && budget--; - self->sw_head = aq_ring_next_dx(self, self->sw_head)) { + for (budget = AQ_CFG_TX_CLEAN_BUDGET; + budget && self->sw_head != self->hw_head; budget--) { struct aq_ring_buff_s *buff = &self->buff_ring[self->sw_head];
if (likely(buff->is_mapped)) { @@ -167,6 +167,7 @@ bool aq_ring_tx_clean(struct aq_ring_s *self)
buff->pa = 0U; buff->eop_index = 0xffffU; + self->sw_head = aq_ring_next_dx(self, self->sw_head); }
return !!budget;
[ Upstream commit eaeb3b7494ba9159323814a8ce8af06a9277d99b ]
Driver stops producing skbs on ring if a packet with FCS error was coalesced into LRO session. Ring gets hang forever.
Thats a logical error in driver processing descriptors: When rx_stat indicates MAC Error, next pointer and eop flags are not filled. This confuses driver so it waits for descriptor 0 to be filled by HW.
Solution is fill next pointer and eop flag even for packets with FCS error.
Fixes: bab6de8fd180b ("net: ethernet: aquantia: Atlantic A0 and B0 specific functions.") Signed-off-by: Igor Russkikh igor.russkikh@aquantia.com Signed-off-by: Dmitry Bogdanov dmitry.bogdanov@aquantia.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- .../aquantia/atlantic/hw_atl/hw_atl_b0.c | 61 ++++++++++--------- 1 file changed, 32 insertions(+), 29 deletions(-)
diff --git a/drivers/net/ethernet/aquantia/atlantic/hw_atl/hw_atl_b0.c b/drivers/net/ethernet/aquantia/atlantic/hw_atl/hw_atl_b0.c index b31dba1b1a55..ec302fdfec63 100644 --- a/drivers/net/ethernet/aquantia/atlantic/hw_atl/hw_atl_b0.c +++ b/drivers/net/ethernet/aquantia/atlantic/hw_atl/hw_atl_b0.c @@ -702,38 +702,41 @@ static int hw_atl_b0_hw_ring_rx_receive(struct aq_hw_s *self, if ((rx_stat & BIT(0)) || rxd_wb->type & 0x1000U) { /* MAC error or DMA error */ buff->is_error = 1U; - } else { - if (self->aq_nic_cfg->is_rss) { - /* last 4 byte */ - u16 rss_type = rxd_wb->type & 0xFU; - - if (rss_type && rss_type < 0x8U) { - buff->is_hash_l4 = (rss_type == 0x4 || - rss_type == 0x5); - buff->rss_hash = rxd_wb->rss_hash; - } + } + if (self->aq_nic_cfg->is_rss) { + /* last 4 byte */ + u16 rss_type = rxd_wb->type & 0xFU; + + if (rss_type && rss_type < 0x8U) { + buff->is_hash_l4 = (rss_type == 0x4 || + rss_type == 0x5); + buff->rss_hash = rxd_wb->rss_hash; } + }
- if (HW_ATL_B0_RXD_WB_STAT2_EOP & rxd_wb->status) { - buff->len = rxd_wb->pkt_len % - AQ_CFG_RX_FRAME_MAX; - buff->len = buff->len ? - buff->len : AQ_CFG_RX_FRAME_MAX; - buff->next = 0U; - buff->is_eop = 1U; + if (HW_ATL_B0_RXD_WB_STAT2_EOP & rxd_wb->status) { + buff->len = rxd_wb->pkt_len % + AQ_CFG_RX_FRAME_MAX; + buff->len = buff->len ? + buff->len : AQ_CFG_RX_FRAME_MAX; + buff->next = 0U; + buff->is_eop = 1U; + } else { + buff->len = + rxd_wb->pkt_len > AQ_CFG_RX_FRAME_MAX ? + AQ_CFG_RX_FRAME_MAX : rxd_wb->pkt_len; + + if (HW_ATL_B0_RXD_WB_STAT2_RSCCNT & + rxd_wb->status) { + /* LRO */ + buff->next = rxd_wb->next_desc_ptr; + ++ring->stats.rx.lro_packets; } else { - if (HW_ATL_B0_RXD_WB_STAT2_RSCCNT & - rxd_wb->status) { - /* LRO */ - buff->next = rxd_wb->next_desc_ptr; - ++ring->stats.rx.lro_packets; - } else { - /* jumbo */ - buff->next = - aq_ring_next_dx(ring, - ring->hw_head); - ++ring->stats.rx.jumbo_packets; - } + /* jumbo */ + buff->next = + aq_ring_next_dx(ring, + ring->hw_head); + ++ring->stats.rx.jumbo_packets; } } }
[ Upstream commit a0692f0eef91354b62c2b4c94954536536be5425 ]
If I2C_M_RECV_LEN check failed, msgs[i].buf allocated by memdup_user will not be freed. Pump index up so it will be freed.
Fixes: 838bfa6049fb ("i2c-dev: Add support for I2C_M_RECV_LEN") Signed-off-by: Yingjoe Chen yingjoe.chen@mediatek.com Signed-off-by: Wolfram Sang wsa@the-dreams.de Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/i2c/i2c-dev.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/drivers/i2c/i2c-dev.c b/drivers/i2c/i2c-dev.c index 3f7b9af11137..776f36690448 100644 --- a/drivers/i2c/i2c-dev.c +++ b/drivers/i2c/i2c-dev.c @@ -283,6 +283,7 @@ static noinline int i2cdev_ioctl_rdwr(struct i2c_client *client, msgs[i].len < 1 || msgs[i].buf[0] < 1 || msgs[i].len < msgs[i].buf[0] + I2C_SMBUS_BLOCK_MAX) { + i++; res = -EINVAL; break; }
[ Upstream commit fa763f1b2858752e6150ffff46886a1b7faffc82 ]
We observed the same issue as reported by commit a8d7bde23e7130686b7662 ("ALSA: hda - Force polling mode on CFL for fixing codec communication") We don't have a better solution. So apply the same workaround to CNL.
Signed-off-by: Bard Liao yung-chuan.liao@linux.intel.com Signed-off-by: Takashi Iwai tiwai@suse.de Signed-off-by: Sasha Levin sashal@kernel.org --- sound/pci/hda/hda_intel.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-)
diff --git a/sound/pci/hda/hda_intel.c b/sound/pci/hda/hda_intel.c index 789308f54785..5c29d6490a18 100644 --- a/sound/pci/hda/hda_intel.c +++ b/sound/pci/hda/hda_intel.c @@ -375,6 +375,7 @@ enum {
#define IS_BXT(pci) ((pci)->vendor == 0x8086 && (pci)->device == 0x5a98) #define IS_CFL(pci) ((pci)->vendor == 0x8086 && (pci)->device == 0xa348) +#define IS_CNL(pci) ((pci)->vendor == 0x8086 && (pci)->device == 0x9dc8)
static char *driver_short_names[] = { [AZX_DRIVER_ICH] = "HDA Intel", @@ -1700,8 +1701,8 @@ static int azx_create(struct snd_card *card, struct pci_dev *pci, else chip->bdl_pos_adj = bdl_pos_adj[dev];
- /* Workaround for a communication error on CFL (bko#199007) */ - if (IS_CFL(pci)) + /* Workaround for a communication error on CFL (bko#199007) and CNL */ + if (IS_CFL(pci) || IS_CNL(pci)) chip->polling_mode = 1;
err = azx_bus_init(chip, model[dev], &pci_hda_io_ops);
[ Upstream commit f6122ed2a4f9c9c1c073ddf6308d1b2ac10e0781 ]
In the vfs_statx() context, during path lookup, the dentry gets added to sd->s_dentry via configfs_attach_attr(). In the end, vfs_statx() kills the dentry by calling path_put(), which invokes configfs_d_iput(). Ideally, this dentry must be removed from sd->s_dentry but it doesn't if the sd->s_count >= 3. As a result, sd->s_dentry is holding reference to a stale dentry pointer whose memory is already freed up. This results in use-after-free issue, when this stale sd->s_dentry is accessed later in configfs_readdir() path.
This issue can be easily reproduced, by running the LTP test case - sh fs_racer_file_list.sh /config (https://github.com/linux-test-project/ltp/blob/master/testcases/kernel/fs/ra...)
Fixes: 76ae281f6307 ('configfs: fix race between dentry put and lookup') Signed-off-by: Sahitya Tummala stummala@codeaurora.org Signed-off-by: Christoph Hellwig hch@lst.de Signed-off-by: Sasha Levin sashal@kernel.org --- fs/configfs/dir.c | 14 ++++++-------- 1 file changed, 6 insertions(+), 8 deletions(-)
diff --git a/fs/configfs/dir.c b/fs/configfs/dir.c index 920d350df37b..809c1edffbaf 100644 --- a/fs/configfs/dir.c +++ b/fs/configfs/dir.c @@ -58,15 +58,13 @@ static void configfs_d_iput(struct dentry * dentry, if (sd) { /* Coordinate with configfs_readdir */ spin_lock(&configfs_dirent_lock); - /* Coordinate with configfs_attach_attr where will increase - * sd->s_count and update sd->s_dentry to new allocated one. - * Only set sd->dentry to null when this dentry is the only - * sd owner. - * If not do so, configfs_d_iput may run just after - * configfs_attach_attr and set sd->s_dentry to null - * even it's still in use. + /* + * Set sd->s_dentry to null only when this dentry is the one + * that is going to be killed. Otherwise configfs_d_iput may + * run just after configfs_attach_attr and set sd->s_dentry to + * NULL even it's still in use. */ - if (atomic_read(&sd->s_count) <= 2) + if (sd->s_dentry == dentry) sd->s_dentry = NULL;
spin_unlock(&configfs_dirent_lock);
[ Upstream commit 97acec7df172cd1e450f81f5e293c0aa145a2797 ]
This strncat() is safe because the buffer was allocated with zalloc(), however gcc doesn't know that. Since the string always has 4 non-null bytes, just use memcpy() here.
CC /home/shawn/linux/tools/perf/util/data-convert-bt.o In file included from /usr/include/string.h:494, from /home/shawn/linux/tools/lib/traceevent/event-parse.h:27, from util/data-convert-bt.c:22: In function ‘strncat’, inlined from ‘string_set_value’ at util/data-convert-bt.c:274:4: /usr/include/powerpc64le-linux-gnu/bits/string_fortified.h:136:10: error: ‘__builtin_strncat’ output may be truncated copying 4 bytes from a string of length 4 [-Werror=stringop-truncation] 136 | return __builtin___strncat_chk (__dest, __src, __len, __bos (__dest)); | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Signed-off-by: Shawn Landden shawn@git.icu Cc: Adrian Hunter adrian.hunter@intel.com Cc: Jiri Olsa jolsa@redhat.com Cc: Namhyung Kim namhyung@kernel.org Cc: Wang Nan wangnan0@huawei.com LPU-Reference: 20190518183238.10954-1-shawn@git.icu Link: https://lkml.kernel.org/n/tip-289f1jice17ta7tr3tstm9jm@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo acme@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- tools/perf/util/data-convert-bt.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/tools/perf/util/data-convert-bt.c b/tools/perf/util/data-convert-bt.c index 26af43ad9ddd..53d49fd8b8ae 100644 --- a/tools/perf/util/data-convert-bt.c +++ b/tools/perf/util/data-convert-bt.c @@ -271,7 +271,7 @@ static int string_set_value(struct bt_ctf_field *field, const char *string) if (i > 0) strncpy(buffer, string, i); } - strncat(buffer + p, numstr, 4); + memcpy(buffer + p, numstr, 4); p += 3; } }
[ Upstream commit 7379e652797c0b9b5f6caea1576f2dff9ce6a708 ]
The zcrypt device driver does not handle CPRBs which address a control domain correctly. This fix introduces a workaround: The domain field of the request CPRB is checked if there is a valid domain value in there. If this is true and the value is a control only domain (a domain which is enabled in the crypto config ADM mask but disabled in the AQM mask) the CPRB is forwarded to the default usage domain. If there is no default domain, the request is rejected with an ENODEV.
This fix is important for maintaining crypto adapters. For example one LPAR can use a crypto adapter domain ('Control and Usage') but another LPAR needs to be able to maintain this adapter domain ('Control'). Scenarios like this did not work properly and the patch enables this.
Signed-off-by: Harald Freudenberger freude@linux.ibm.com Signed-off-by: Heiko Carstens heiko.carstens@de.ibm.com Signed-off-by: Sasha Levin sashal@kernel.org --- arch/s390/include/asm/ap.h | 4 ++-- drivers/s390/crypto/ap_bus.c | 26 ++++++++++++++++++++++---- drivers/s390/crypto/ap_bus.h | 3 +++ drivers/s390/crypto/zcrypt_api.c | 17 ++++++++++++++--- 4 files changed, 41 insertions(+), 9 deletions(-)
diff --git a/arch/s390/include/asm/ap.h b/arch/s390/include/asm/ap.h index e94a0a28b5eb..aea32dda3d14 100644 --- a/arch/s390/include/asm/ap.h +++ b/arch/s390/include/asm/ap.h @@ -160,8 +160,8 @@ struct ap_config_info { unsigned char Nd; /* max # of Domains - 1 */ unsigned char _reserved3[10]; unsigned int apm[8]; /* AP ID mask */ - unsigned int aqm[8]; /* AP queue mask */ - unsigned int adm[8]; /* AP domain mask */ + unsigned int aqm[8]; /* AP (usage) queue mask */ + unsigned int adm[8]; /* AP (control) domain mask */ unsigned char _reserved4[16]; } __aligned(8);
diff --git a/drivers/s390/crypto/ap_bus.c b/drivers/s390/crypto/ap_bus.c index 1546389d71db..6717536a633c 100644 --- a/drivers/s390/crypto/ap_bus.c +++ b/drivers/s390/crypto/ap_bus.c @@ -254,19 +254,37 @@ static inline int ap_test_config_card_id(unsigned int id) }
/* - * ap_test_config_domain(): Test, whether an AP usage domain is configured. + * ap_test_config_usage_domain(): Test, whether an AP usage domain + * is configured. * @domain AP usage domain ID * * Returns 0 if the usage domain is not configured * 1 if the usage domain is configured or * if the configuration information is not available */ -static inline int ap_test_config_domain(unsigned int domain) +int ap_test_config_usage_domain(unsigned int domain) { if (!ap_configuration) /* QCI not supported */ return domain < 16; return ap_test_config(ap_configuration->aqm, domain); } +EXPORT_SYMBOL(ap_test_config_usage_domain); + +/* + * ap_test_config_ctrl_domain(): Test, whether an AP control domain + * is configured. + * @domain AP control domain ID + * + * Returns 1 if the control domain is configured + * 0 in all other cases + */ +int ap_test_config_ctrl_domain(unsigned int domain) +{ + if (!ap_configuration) /* QCI not supported */ + return 0; + return ap_test_config(ap_configuration->adm, domain); +} +EXPORT_SYMBOL(ap_test_config_ctrl_domain);
/** * ap_query_queue(): Check if an AP queue is available. @@ -1267,7 +1285,7 @@ static void ap_select_domain(void) best_domain = -1; max_count = 0; for (i = 0; i < AP_DOMAINS; i++) { - if (!ap_test_config_domain(i) || + if (!ap_test_config_usage_domain(i) || !test_bit_inv(i, ap_perms.aqm)) continue; count = 0; @@ -1442,7 +1460,7 @@ static void _ap_scan_bus_adapter(int id) (void *)(long) qid, __match_queue_device_with_qid); aq = dev ? to_ap_queue(dev) : NULL; - if (!ap_test_config_domain(dom)) { + if (!ap_test_config_usage_domain(dom)) { if (dev) { /* Queue device exists but has been * removed from configuration. diff --git a/drivers/s390/crypto/ap_bus.h b/drivers/s390/crypto/ap_bus.h index 15a98a673c5c..6f3cf37776ca 100644 --- a/drivers/s390/crypto/ap_bus.h +++ b/drivers/s390/crypto/ap_bus.h @@ -251,6 +251,9 @@ void ap_wait(enum ap_wait wait); void ap_request_timeout(struct timer_list *t); void ap_bus_force_rescan(void);
+int ap_test_config_usage_domain(unsigned int domain); +int ap_test_config_ctrl_domain(unsigned int domain); + void ap_queue_init_reply(struct ap_queue *aq, struct ap_message *ap_msg); struct ap_queue *ap_queue_create(ap_qid_t qid, int device_type); void ap_queue_prepare_remove(struct ap_queue *aq); diff --git a/drivers/s390/crypto/zcrypt_api.c b/drivers/s390/crypto/zcrypt_api.c index c31b2d31cd83..03b1853464db 100644 --- a/drivers/s390/crypto/zcrypt_api.c +++ b/drivers/s390/crypto/zcrypt_api.c @@ -822,7 +822,7 @@ static long _zcrypt_send_cprb(struct ap_perms *perms, struct ap_message ap_msg; unsigned int weight, pref_weight; unsigned int func_code; - unsigned short *domain; + unsigned short *domain, tdom; int qid = 0, rc = -ENODEV; struct module *mod;
@@ -834,6 +834,17 @@ static long _zcrypt_send_cprb(struct ap_perms *perms, if (rc) goto out;
+ /* + * If a valid target domain is set and this domain is NOT a usage + * domain but a control only domain, use the default domain as target. + */ + tdom = *domain; + if (tdom >= 0 && tdom < AP_DOMAINS && + !ap_test_config_usage_domain(tdom) && + ap_test_config_ctrl_domain(tdom) && + ap_domain_index >= 0) + tdom = ap_domain_index; + pref_zc = NULL; pref_zq = NULL; spin_lock(&zcrypt_list_lock); @@ -856,8 +867,8 @@ static long _zcrypt_send_cprb(struct ap_perms *perms, /* check if device is online and eligible */ if (!zq->online || !zq->ops->send_cprb || - ((*domain != (unsigned short) AUTOSELECT) && - (*domain != AP_QID_QUEUE(zq->queue->qid)))) + (tdom != (unsigned short) AUTOSELECT && + tdom != AP_QID_QUEUE(zq->queue->qid))) continue; /* check if device node has admission for this queue */ if (!zcrypt_check_queue(perms,
[ Upstream commit 6584140ba9e6762dd7ec73795243289b914f31f9 ]
It seems that the current code lacks holding the namespace lock in thread__namespaces(). Otherwise it can see inconsistent results.
Signed-off-by: Namhyung Kim namhyung@kernel.org Cc: Hari Bathini hbathini@linux.vnet.ibm.com Cc: Jiri Olsa jolsa@redhat.com Cc: Krister Johansen kjlx@templeofstupid.com Link: http://lkml.kernel.org/r/20190522053250.207156-2-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo acme@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- tools/perf/util/thread.c | 15 +++++++++++++-- 1 file changed, 13 insertions(+), 2 deletions(-)
diff --git a/tools/perf/util/thread.c b/tools/perf/util/thread.c index 50678d318185..b800752745af 100644 --- a/tools/perf/util/thread.c +++ b/tools/perf/util/thread.c @@ -132,7 +132,7 @@ void thread__put(struct thread *thread) } }
-struct namespaces *thread__namespaces(const struct thread *thread) +static struct namespaces *__thread__namespaces(const struct thread *thread) { if (list_empty(&thread->namespaces_list)) return NULL; @@ -140,10 +140,21 @@ struct namespaces *thread__namespaces(const struct thread *thread) return list_first_entry(&thread->namespaces_list, struct namespaces, list); }
+struct namespaces *thread__namespaces(const struct thread *thread) +{ + struct namespaces *ns; + + down_read((struct rw_semaphore *)&thread->namespaces_lock); + ns = __thread__namespaces(thread); + up_read((struct rw_semaphore *)&thread->namespaces_lock); + + return ns; +} + static int __thread__set_namespaces(struct thread *thread, u64 timestamp, struct namespaces_event *event) { - struct namespaces *new, *curr = thread__namespaces(thread); + struct namespaces *new, *curr = __thread__namespaces(thread);
new = namespaces__new(event); if (!new)
[ Upstream commit 6738028dd57df064b969d8392c943ef3b3ae705d ]
Command 'perf record' and 'perf report' on a system without kernel debuginfo packages uses /proc/kallsyms and /proc/modules to find addresses for kernel and module symbols. On x86 this works for root and non-root users.
On s390, when invoked as non-root user, many of the following warnings are shown and module symbols are missing:
proc/{kallsyms,modules} inconsistency while looking for "[sha1_s390]" module!
Command 'perf record' creates a list of module start addresses by parsing the output of /proc/modules and creates a PERF_RECORD_MMAP record for the kernel and each module. The following function call sequence is executed:
machine__create_kernel_maps machine__create_module modules__parse machine__create_module --> for each line in /proc/modules arch__fix_module_text_start
Function arch__fix_module_text_start() is s390 specific. It opens file /sys/module/<name>/sections/.text to extract the module's .text section start address. On s390 the module loader prepends a header before the first section, whereas on x86 the module's text section address is identical the the module's load address.
However module section files are root readable only. For non-root the read operation fails and machine__create_module() returns an error. Command perf record does not generate any PERF_RECORD_MMAP record for loaded modules. Later command perf report complains about missing module maps.
To fix this function arch__fix_module_text_start() always returns success. For root users there is no change, for non-root users the module's load address is used as module's text start address (the prepended header then counts as part of the text section).
This enable non-root users to use module symbols and avoid the warning when perf report is executed.
Output before:
[tmricht@m83lp54 perf]$ ./perf report -D | fgrep MMAP 0 0x168 [0x50]: PERF_RECORD_MMAP ... x [kernel.kallsyms]_text
Output after:
[tmricht@m83lp54 perf]$ ./perf report -D | fgrep MMAP 0 0x168 [0x50]: PERF_RECORD_MMAP ... x [kernel.kallsyms]_text 0 0x1b8 [0x98]: PERF_RECORD_MMAP ... x /lib/modules/.../autofs4.ko.xz 0 0x250 [0xa8]: PERF_RECORD_MMAP ... x /lib/modules/.../sha_common.ko.xz 0 0x2f8 [0x98]: PERF_RECORD_MMAP ... x /lib/modules/.../des_generic.ko.xz
Signed-off-by: Thomas Richter tmricht@linux.ibm.com Reviewed-by: Hendrik Brueckner brueckner@linux.ibm.com Cc: Heiko Carstens heiko.carstens@de.ibm.com Link: http://lkml.kernel.org/r/20190522144601.50763-4-tmricht@linux.ibm.com Signed-off-by: Arnaldo Carvalho de Melo acme@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- tools/perf/arch/s390/util/machine.c | 9 ++++++--- 1 file changed, 6 insertions(+), 3 deletions(-)
diff --git a/tools/perf/arch/s390/util/machine.c b/tools/perf/arch/s390/util/machine.c index 0b2054007314..a19690a17291 100644 --- a/tools/perf/arch/s390/util/machine.c +++ b/tools/perf/arch/s390/util/machine.c @@ -5,16 +5,19 @@ #include "util.h" #include "machine.h" #include "api/fs/fs.h" +#include "debug.h"
int arch__fix_module_text_start(u64 *start, const char *name) { + u64 m_start = *start; char path[PATH_MAX];
snprintf(path, PATH_MAX, "module/%.*s/sections/.text", (int)strlen(name) - 2, name + 1); - - if (sysfs__read_ull(path, (unsigned long long *)start) < 0) - return -1; + if (sysfs__read_ull(path, (unsigned long long *)start) < 0) { + pr_debug2("Using module %s start:%#lx\n", path, m_start); + *start = m_start; + }
return 0; }
[ Upstream commit 9a626c4a6326da4433a0d4d4a8a7d1571caf1ed3 ]
Fix build errors on ia64 when DISCONTIGMEM=y and NUMA=y by exporting paddr_to_nid().
Fixes these build errors:
ERROR: "paddr_to_nid" [sound/core/snd-pcm.ko] undefined! ERROR: "paddr_to_nid" [net/sunrpc/sunrpc.ko] undefined! ERROR: "paddr_to_nid" [fs/cifs/cifs.ko] undefined! ERROR: "paddr_to_nid" [drivers/video/fbdev/core/fb.ko] undefined! ERROR: "paddr_to_nid" [drivers/usb/mon/usbmon.ko] undefined! ERROR: "paddr_to_nid" [drivers/usb/core/usbcore.ko] undefined! ERROR: "paddr_to_nid" [drivers/md/raid1.ko] undefined! ERROR: "paddr_to_nid" [drivers/md/dm-mod.ko] undefined! ERROR: "paddr_to_nid" [drivers/md/dm-crypt.ko] undefined! ERROR: "paddr_to_nid" [drivers/md/dm-bufio.ko] undefined! ERROR: "paddr_to_nid" [drivers/ide/ide-core.ko] undefined! ERROR: "paddr_to_nid" [drivers/ide/ide-cd_mod.ko] undefined! ERROR: "paddr_to_nid" [drivers/gpu/drm/drm.ko] undefined! ERROR: "paddr_to_nid" [drivers/char/agp/agpgart.ko] undefined! ERROR: "paddr_to_nid" [drivers/block/nbd.ko] undefined! ERROR: "paddr_to_nid" [drivers/block/loop.ko] undefined! ERROR: "paddr_to_nid" [drivers/block/brd.ko] undefined! ERROR: "paddr_to_nid" [crypto/ccm.ko] undefined!
Reported-by: kbuild test robot lkp@intel.com Signed-off-by: Randy Dunlap rdunlap@infradead.org Cc: Tony Luck tony.luck@intel.com Cc: Fenghua Yu fenghua.yu@intel.com Cc: linux-ia64@vger.kernel.org Signed-off-by: Tony Luck tony.luck@intel.com Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- arch/ia64/mm/numa.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/arch/ia64/mm/numa.c b/arch/ia64/mm/numa.c index a03803506b0c..5e1015eb6d0d 100644 --- a/arch/ia64/mm/numa.c +++ b/arch/ia64/mm/numa.c @@ -55,6 +55,7 @@ paddr_to_nid(unsigned long paddr)
return (i < num_node_memblks) ? node_memblk[i].nid : (num_node_memblks ? -1 : 0); } +EXPORT_SYMBOL(paddr_to_nid);
#if defined(CONFIG_SPARSEMEM) && defined(CONFIG_NUMA) /*
[ Upstream commit 7aae703f8096d21e34ce5f34f16715587bc30902 ]
Make sure only the portals for the online CPUs are used. Without this change, there are issues when someone boots with maxcpus=n, with n < actual number of cores available as frames either received or corresponding to the transmit confirmation path would be offered for dequeue to the offline CPU portals, getting lost.
Signed-off-by: Madalin Bucur madalin.bucur@nxp.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/freescale/dpaa/dpaa_eth.c | 9 ++++----- drivers/net/ethernet/freescale/dpaa/dpaa_ethtool.c | 4 ++-- 2 files changed, 6 insertions(+), 7 deletions(-)
diff --git a/drivers/net/ethernet/freescale/dpaa/dpaa_eth.c b/drivers/net/ethernet/freescale/dpaa/dpaa_eth.c index d3f2408dc9e8..f38c3fa7d705 100644 --- a/drivers/net/ethernet/freescale/dpaa/dpaa_eth.c +++ b/drivers/net/ethernet/freescale/dpaa/dpaa_eth.c @@ -780,7 +780,7 @@ static void dpaa_eth_add_channel(u16 channel) struct qman_portal *portal; int cpu;
- for_each_cpu(cpu, cpus) { + for_each_cpu_and(cpu, cpus, cpu_online_mask) { portal = qman_get_affine_portal(cpu); qman_p_static_dequeue_add(portal, pool); } @@ -896,7 +896,7 @@ static void dpaa_fq_setup(struct dpaa_priv *priv, u16 channels[NR_CPUS]; struct dpaa_fq *fq;
- for_each_cpu(cpu, affine_cpus) + for_each_cpu_and(cpu, affine_cpus, cpu_online_mask) channels[num_portals++] = qman_affine_channel(cpu);
if (num_portals == 0) @@ -2174,7 +2174,6 @@ static int dpaa_eth_poll(struct napi_struct *napi, int budget) if (cleaned < budget) { napi_complete_done(napi, cleaned); qman_p_irqsource_add(np->p, QM_PIRQ_DQRI); - } else if (np->down) { qman_p_irqsource_add(np->p, QM_PIRQ_DQRI); } @@ -2448,7 +2447,7 @@ static void dpaa_eth_napi_enable(struct dpaa_priv *priv) struct dpaa_percpu_priv *percpu_priv; int i;
- for_each_possible_cpu(i) { + for_each_online_cpu(i) { percpu_priv = per_cpu_ptr(priv->percpu_priv, i);
percpu_priv->np.down = 0; @@ -2461,7 +2460,7 @@ static void dpaa_eth_napi_disable(struct dpaa_priv *priv) struct dpaa_percpu_priv *percpu_priv; int i;
- for_each_possible_cpu(i) { + for_each_online_cpu(i) { percpu_priv = per_cpu_ptr(priv->percpu_priv, i);
percpu_priv->np.down = 1; diff --git a/drivers/net/ethernet/freescale/dpaa/dpaa_ethtool.c b/drivers/net/ethernet/freescale/dpaa/dpaa_ethtool.c index bdee441bc3b7..7ce2e99b594d 100644 --- a/drivers/net/ethernet/freescale/dpaa/dpaa_ethtool.c +++ b/drivers/net/ethernet/freescale/dpaa/dpaa_ethtool.c @@ -569,7 +569,7 @@ static int dpaa_set_coalesce(struct net_device *dev, qman_dqrr_get_ithresh(portal, &prev_thresh);
/* set new values */ - for_each_cpu(cpu, cpus) { + for_each_cpu_and(cpu, cpus, cpu_online_mask) { portal = qman_get_affine_portal(cpu); res = qman_portal_set_iperiod(portal, period); if (res) @@ -586,7 +586,7 @@ static int dpaa_set_coalesce(struct net_device *dev,
revert_values: /* restore previous values */ - for_each_cpu(cpu, cpus) { + for_each_cpu_and(cpu, cpus, cpu_online_mask) { if (!needs_revert[cpu]) continue; portal = qman_get_affine_portal(cpu);
[ Upstream commit 41349672e3cbc2e8349831f21253509c3415aa2b ]
Fixes gcc '-Wunused-but-set-variable' warning:
drivers/xen/pvcalls-front.c: In function pvcalls_front_sendmsg: drivers/xen/pvcalls-front.c:543:25: warning: variable bedata set but not used [-Wunused-but-set-variable] drivers/xen/pvcalls-front.c: In function pvcalls_front_recvmsg: drivers/xen/pvcalls-front.c:638:25: warning: variable bedata set but not used [-Wunused-but-set-variable]
They are never used since introduction.
Signed-off-by: YueHaibing yuehaibing@huawei.com Reviewed-by: Juergen Gross jgross@suse.com Signed-off-by: Boris Ostrovsky boris.ostrovsky@oracle.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/xen/pvcalls-front.c | 4 ---- 1 file changed, 4 deletions(-)
diff --git a/drivers/xen/pvcalls-front.c b/drivers/xen/pvcalls-front.c index 8a249c95c193..d7438fdc5706 100644 --- a/drivers/xen/pvcalls-front.c +++ b/drivers/xen/pvcalls-front.c @@ -540,7 +540,6 @@ static int __write_ring(struct pvcalls_data_intf *intf, int pvcalls_front_sendmsg(struct socket *sock, struct msghdr *msg, size_t len) { - struct pvcalls_bedata *bedata; struct sock_mapping *map; int sent, tot_sent = 0; int count = 0, flags; @@ -552,7 +551,6 @@ int pvcalls_front_sendmsg(struct socket *sock, struct msghdr *msg, map = pvcalls_enter_sock(sock); if (IS_ERR(map)) return PTR_ERR(map); - bedata = dev_get_drvdata(&pvcalls_front_dev->dev);
mutex_lock(&map->active.out_mutex); if ((flags & MSG_DONTWAIT) && !pvcalls_front_write_todo(map)) { @@ -635,7 +633,6 @@ static int __read_ring(struct pvcalls_data_intf *intf, int pvcalls_front_recvmsg(struct socket *sock, struct msghdr *msg, size_t len, int flags) { - struct pvcalls_bedata *bedata; int ret; struct sock_mapping *map;
@@ -645,7 +642,6 @@ int pvcalls_front_recvmsg(struct socket *sock, struct msghdr *msg, size_t len, map = pvcalls_enter_sock(sock); if (IS_ERR(map)) return PTR_ERR(map); - bedata = dev_get_drvdata(&pvcalls_front_dev->dev);
mutex_lock(&map->active.in_mutex); if (len > XEN_FLEX_RING_SIZE(PVCALLS_RING_ORDER))
[ Upstream commit d10e0cc113c9e1b64b5c6e3db37b5c839794f3df ]
During a suspend/resume, the xenwatch thread waits for all outstanding xenstore requests and transactions to complete. This does not work correctly for transactions started by userspace because it waits for them to complete after freezing userspace threads which means the transactions have no way of completing, resulting in a deadlock. This is trivial to reproduce by running this script and then suspending the VM:
import pyxs, time c = pyxs.client.Client(xen_bus_path="/dev/xen/xenbus") c.connect() c.transaction() time.sleep(3600)
Even if this deadlock were resolved, misbehaving userspace should not prevent a VM from being migrated. So, instead of waiting for these transactions to complete before suspending, store the current generation id for each transaction when it is started. The global generation id is incremented during resume. If the caller commits the transaction and the generation id does not match the current generation id, return EAGAIN so that they try again. If the transaction was instead discarded, return OK since no changes were made anyway.
This only affects users of the xenbus file interface. In-kernel users of xenbus are assumed to be well-behaved and complete all transactions before freezing.
Signed-off-by: Ross Lagerwall ross.lagerwall@citrix.com Reviewed-by: Juergen Gross jgross@suse.com Signed-off-by: Boris Ostrovsky boris.ostrovsky@oracle.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/xen/xenbus/xenbus.h | 3 +++ drivers/xen/xenbus/xenbus_dev_frontend.c | 18 ++++++++++++++++++ drivers/xen/xenbus/xenbus_xs.c | 7 +++++-- 3 files changed, 26 insertions(+), 2 deletions(-)
diff --git a/drivers/xen/xenbus/xenbus.h b/drivers/xen/xenbus/xenbus.h index 092981171df1..d75a2385b37c 100644 --- a/drivers/xen/xenbus/xenbus.h +++ b/drivers/xen/xenbus/xenbus.h @@ -83,6 +83,7 @@ struct xb_req_data { int num_vecs; int err; enum xb_req_state state; + bool user_req; void (*cb)(struct xb_req_data *); void *par; }; @@ -133,4 +134,6 @@ void xenbus_ring_ops_init(void); int xenbus_dev_request_and_reply(struct xsd_sockmsg *msg, void *par); void xenbus_dev_queue_reply(struct xb_req_data *req);
+extern unsigned int xb_dev_generation_id; + #endif diff --git a/drivers/xen/xenbus/xenbus_dev_frontend.c b/drivers/xen/xenbus/xenbus_dev_frontend.c index 0782ff3c2273..39c63152a358 100644 --- a/drivers/xen/xenbus/xenbus_dev_frontend.c +++ b/drivers/xen/xenbus/xenbus_dev_frontend.c @@ -62,6 +62,8 @@
#include "xenbus.h"
+unsigned int xb_dev_generation_id; + /* * An element of a list of outstanding transactions, for which we're * still waiting a reply. @@ -69,6 +71,7 @@ struct xenbus_transaction_holder { struct list_head list; struct xenbus_transaction handle; + unsigned int generation_id; };
/* @@ -441,6 +444,7 @@ static int xenbus_write_transaction(unsigned msg_type, rc = -ENOMEM; goto out; } + trans->generation_id = xb_dev_generation_id; list_add(&trans->list, &u->transactions); } else if (msg->hdr.tx_id != 0 && !xenbus_get_transaction(u, msg->hdr.tx_id)) @@ -449,6 +453,20 @@ static int xenbus_write_transaction(unsigned msg_type, !(msg->hdr.len == 2 && (!strcmp(msg->body, "T") || !strcmp(msg->body, "F")))) return xenbus_command_reply(u, XS_ERROR, "EINVAL"); + else if (msg_type == XS_TRANSACTION_END) { + trans = xenbus_get_transaction(u, msg->hdr.tx_id); + if (trans && trans->generation_id != xb_dev_generation_id) { + list_del(&trans->list); + kfree(trans); + if (!strcmp(msg->body, "T")) + return xenbus_command_reply(u, XS_ERROR, + "EAGAIN"); + else + return xenbus_command_reply(u, + XS_TRANSACTION_END, + "OK"); + } + }
rc = xenbus_dev_request_and_reply(&msg->hdr, u); if (rc && trans) { diff --git a/drivers/xen/xenbus/xenbus_xs.c b/drivers/xen/xenbus/xenbus_xs.c index 49a3874ae6bb..ddc18da61834 100644 --- a/drivers/xen/xenbus/xenbus_xs.c +++ b/drivers/xen/xenbus/xenbus_xs.c @@ -105,6 +105,7 @@ static void xs_suspend_enter(void)
static void xs_suspend_exit(void) { + xb_dev_generation_id++; spin_lock(&xs_state_lock); xs_suspend_active--; spin_unlock(&xs_state_lock); @@ -125,7 +126,7 @@ static uint32_t xs_request_enter(struct xb_req_data *req) spin_lock(&xs_state_lock); }
- if (req->type == XS_TRANSACTION_START) + if (req->type == XS_TRANSACTION_START && !req->user_req) xs_state_users++; xs_state_users++; rq_id = xs_request_id++; @@ -140,7 +141,7 @@ void xs_request_exit(struct xb_req_data *req) spin_lock(&xs_state_lock); xs_state_users--; if ((req->type == XS_TRANSACTION_START && req->msg.type == XS_ERROR) || - (req->type == XS_TRANSACTION_END && + (req->type == XS_TRANSACTION_END && !req->user_req && !WARN_ON_ONCE(req->msg.type == XS_ERROR && !strcmp(req->body, "ENOENT")))) xs_state_users--; @@ -286,6 +287,7 @@ int xenbus_dev_request_and_reply(struct xsd_sockmsg *msg, void *par) req->num_vecs = 1; req->cb = xenbus_dev_queue_reply; req->par = par; + req->user_req = true;
xs_send(req, msg);
@@ -313,6 +315,7 @@ static void *xs_talkv(struct xenbus_transaction t, req->vec = iovec; req->num_vecs = num_vecs; req->cb = xs_wake_up; + req->user_req = false;
msg.req_id = 0; msg.tx_id = t.id;
[ Upstream commit 50fbc13dc12666f3604dc2555a47fc8c4e29162b ]
In flush_cache_ent(), 'ce->ce_path' is allocated by kstrdup_const(). It should be freed by kfree_const(), rather than kfree().
Signed-off-by: Gen Zhang blackgod016574@gmail.com Reviewed-by: Paulo Alcantara palcantara@suse.de Signed-off-by: Steve French stfrench@microsoft.com Signed-off-by: Sasha Levin sashal@kernel.org --- fs/cifs/dfs_cache.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/fs/cifs/dfs_cache.c b/fs/cifs/dfs_cache.c index 09b7d0d4f6e4..007cfa39be5f 100644 --- a/fs/cifs/dfs_cache.c +++ b/fs/cifs/dfs_cache.c @@ -131,7 +131,7 @@ static inline void flush_cache_ent(struct dfs_cache_entry *ce) return;
hlist_del_init_rcu(&ce->ce_hlist); - kfree(ce->ce_path); + kfree_const(ce->ce_path); free_tgts(ce); dfs_cache_count--; call_rcu(&ce->ce_rcu, free_cache_entry); @@ -421,7 +421,7 @@ alloc_cache_entry(const char *path, const struct dfs_info3_param *refs,
rc = copy_ref_data(refs, numrefs, ce, NULL); if (rc) { - kfree(ce->ce_path); + kfree_const(ce->ce_path); kmem_cache_free(dfs_cache_slab, ce); ce = ERR_PTR(rc); }
[ Upstream commit 0d4ee88d92884c661fcafd5576da243aa943dc24 ]
Currently the HV KVM code uses kvm->lock in conjunction with a flag, kvm->arch.mmu_ready, to synchronize MMU setup and hold off vcpu execution until the MMU-related data structures are ready. However, this means that kvm->lock is being taken inside vcpu->mutex, which is contrary to Documentation/virtual/kvm/locking.txt and results in lockdep warnings.
To fix this, we add a new mutex, kvm->arch.mmu_setup_lock, which nests inside the vcpu mutexes, and is taken in the places where kvm->lock was taken that are related to MMU setup.
Additionally we take the new mutex in the vcpu creation code at the point where we are creating a new vcore, in order to provide mutual exclusion with kvmppc_update_lpcr() and ensure that an update to kvm->arch.lpcr doesn't get missed, which could otherwise lead to a stale vcore->lpcr value.
Signed-off-by: Paul Mackerras paulus@ozlabs.org Signed-off-by: Sasha Levin sashal@kernel.org --- arch/powerpc/include/asm/kvm_host.h | 1 + arch/powerpc/kvm/book3s_64_mmu_hv.c | 36 ++++++++++++++--------------- arch/powerpc/kvm/book3s_hv.c | 31 ++++++++++++++++++------- 3 files changed, 42 insertions(+), 26 deletions(-)
diff --git a/arch/powerpc/include/asm/kvm_host.h b/arch/powerpc/include/asm/kvm_host.h index e6b5bb012ccb..8d3658275a34 100644 --- a/arch/powerpc/include/asm/kvm_host.h +++ b/arch/powerpc/include/asm/kvm_host.h @@ -317,6 +317,7 @@ struct kvm_arch { #endif struct kvmppc_ops *kvm_ops; #ifdef CONFIG_KVM_BOOK3S_HV_POSSIBLE + struct mutex mmu_setup_lock; /* nests inside vcpu mutexes */ u64 l1_ptcr; int max_nested_lpid; struct kvm_nested_guest *nested_guests[KVM_MAX_NESTED_GUESTS]; diff --git a/arch/powerpc/kvm/book3s_64_mmu_hv.c b/arch/powerpc/kvm/book3s_64_mmu_hv.c index be7bc070eae5..c1ced22455f9 100644 --- a/arch/powerpc/kvm/book3s_64_mmu_hv.c +++ b/arch/powerpc/kvm/book3s_64_mmu_hv.c @@ -63,7 +63,7 @@ struct kvm_resize_hpt { struct work_struct work; u32 order;
- /* These fields protected by kvm->lock */ + /* These fields protected by kvm->arch.mmu_setup_lock */
/* Possible values and their usage: * <0 an error occurred during allocation, @@ -73,7 +73,7 @@ struct kvm_resize_hpt { int error;
/* Private to the work thread, until error != -EBUSY, - * then protected by kvm->lock. + * then protected by kvm->arch.mmu_setup_lock. */ struct kvm_hpt_info hpt; }; @@ -139,7 +139,7 @@ long kvmppc_alloc_reset_hpt(struct kvm *kvm, int order) long err = -EBUSY; struct kvm_hpt_info info;
- mutex_lock(&kvm->lock); + mutex_lock(&kvm->arch.mmu_setup_lock); if (kvm->arch.mmu_ready) { kvm->arch.mmu_ready = 0; /* order mmu_ready vs. vcpus_running */ @@ -183,7 +183,7 @@ long kvmppc_alloc_reset_hpt(struct kvm *kvm, int order) /* Ensure that each vcpu will flush its TLB on next entry. */ cpumask_setall(&kvm->arch.need_tlb_flush);
- mutex_unlock(&kvm->lock); + mutex_unlock(&kvm->arch.mmu_setup_lock); return err; }
@@ -1447,7 +1447,7 @@ static void resize_hpt_pivot(struct kvm_resize_hpt *resize)
static void resize_hpt_release(struct kvm *kvm, struct kvm_resize_hpt *resize) { - if (WARN_ON(!mutex_is_locked(&kvm->lock))) + if (WARN_ON(!mutex_is_locked(&kvm->arch.mmu_setup_lock))) return;
if (!resize) @@ -1474,14 +1474,14 @@ static void resize_hpt_prepare_work(struct work_struct *work) if (WARN_ON(resize->error != -EBUSY)) return;
- mutex_lock(&kvm->lock); + mutex_lock(&kvm->arch.mmu_setup_lock);
/* Request is still current? */ if (kvm->arch.resize_hpt == resize) { /* We may request large allocations here: - * do not sleep with kvm->lock held for a while. + * do not sleep with kvm->arch.mmu_setup_lock held for a while. */ - mutex_unlock(&kvm->lock); + mutex_unlock(&kvm->arch.mmu_setup_lock);
resize_hpt_debug(resize, "resize_hpt_prepare_work(): order = %d\n", resize->order); @@ -1494,9 +1494,9 @@ static void resize_hpt_prepare_work(struct work_struct *work) if (WARN_ON(err == -EBUSY)) err = -EINPROGRESS;
- mutex_lock(&kvm->lock); + mutex_lock(&kvm->arch.mmu_setup_lock); /* It is possible that kvm->arch.resize_hpt != resize - * after we grab kvm->lock again. + * after we grab kvm->arch.mmu_setup_lock again. */ }
@@ -1505,7 +1505,7 @@ static void resize_hpt_prepare_work(struct work_struct *work) if (kvm->arch.resize_hpt != resize) resize_hpt_release(kvm, resize);
- mutex_unlock(&kvm->lock); + mutex_unlock(&kvm->arch.mmu_setup_lock); }
long kvm_vm_ioctl_resize_hpt_prepare(struct kvm *kvm, @@ -1522,7 +1522,7 @@ long kvm_vm_ioctl_resize_hpt_prepare(struct kvm *kvm, if (shift && ((shift < 18) || (shift > 46))) return -EINVAL;
- mutex_lock(&kvm->lock); + mutex_lock(&kvm->arch.mmu_setup_lock);
resize = kvm->arch.resize_hpt;
@@ -1565,7 +1565,7 @@ long kvm_vm_ioctl_resize_hpt_prepare(struct kvm *kvm, ret = 100; /* estimated time in ms */
out: - mutex_unlock(&kvm->lock); + mutex_unlock(&kvm->arch.mmu_setup_lock); return ret; }
@@ -1588,7 +1588,7 @@ long kvm_vm_ioctl_resize_hpt_commit(struct kvm *kvm, if (shift && ((shift < 18) || (shift > 46))) return -EINVAL;
- mutex_lock(&kvm->lock); + mutex_lock(&kvm->arch.mmu_setup_lock);
resize = kvm->arch.resize_hpt;
@@ -1625,7 +1625,7 @@ long kvm_vm_ioctl_resize_hpt_commit(struct kvm *kvm, smp_mb(); out_no_hpt: resize_hpt_release(kvm, resize); - mutex_unlock(&kvm->lock); + mutex_unlock(&kvm->arch.mmu_setup_lock); return ret; }
@@ -1868,7 +1868,7 @@ static ssize_t kvm_htab_write(struct file *file, const char __user *buf, return -EINVAL;
/* lock out vcpus from running while we're doing this */ - mutex_lock(&kvm->lock); + mutex_lock(&kvm->arch.mmu_setup_lock); mmu_ready = kvm->arch.mmu_ready; if (mmu_ready) { kvm->arch.mmu_ready = 0; /* temporarily */ @@ -1876,7 +1876,7 @@ static ssize_t kvm_htab_write(struct file *file, const char __user *buf, smp_mb(); if (atomic_read(&kvm->arch.vcpus_running)) { kvm->arch.mmu_ready = 1; - mutex_unlock(&kvm->lock); + mutex_unlock(&kvm->arch.mmu_setup_lock); return -EBUSY; } } @@ -1963,7 +1963,7 @@ static ssize_t kvm_htab_write(struct file *file, const char __user *buf, /* Order HPTE updates vs. mmu_ready */ smp_wmb(); kvm->arch.mmu_ready = mmu_ready; - mutex_unlock(&kvm->lock); + mutex_unlock(&kvm->arch.mmu_setup_lock);
if (err) return err; diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c index bd68b3e59de5..9f49087c3a41 100644 --- a/arch/powerpc/kvm/book3s_hv.c +++ b/arch/powerpc/kvm/book3s_hv.c @@ -2257,11 +2257,17 @@ static struct kvm_vcpu *kvmppc_core_vcpu_create_hv(struct kvm *kvm, pr_devel("KVM: collision on id %u", id); vcore = NULL; } else if (!vcore) { + /* + * Take mmu_setup_lock for mutual exclusion + * with kvmppc_update_lpcr(). + */ err = -ENOMEM; vcore = kvmppc_vcore_create(kvm, id & ~(kvm->arch.smt_mode - 1)); + mutex_lock(&kvm->arch.mmu_setup_lock); kvm->arch.vcores[core] = vcore; kvm->arch.online_vcores++; + mutex_unlock(&kvm->arch.mmu_setup_lock); } } mutex_unlock(&kvm->lock); @@ -3821,7 +3827,7 @@ static int kvmhv_setup_mmu(struct kvm_vcpu *vcpu) int r = 0; struct kvm *kvm = vcpu->kvm;
- mutex_lock(&kvm->lock); + mutex_lock(&kvm->arch.mmu_setup_lock); if (!kvm->arch.mmu_ready) { if (!kvm_is_radix(kvm)) r = kvmppc_hv_setup_htab_rma(vcpu); @@ -3831,7 +3837,7 @@ static int kvmhv_setup_mmu(struct kvm_vcpu *vcpu) kvm->arch.mmu_ready = 1; } } - mutex_unlock(&kvm->lock); + mutex_unlock(&kvm->arch.mmu_setup_lock); return r; }
@@ -4439,7 +4445,8 @@ static void kvmppc_core_commit_memory_region_hv(struct kvm *kvm,
/* * Update LPCR values in kvm->arch and in vcores. - * Caller must hold kvm->lock. + * Caller must hold kvm->arch.mmu_setup_lock (for mutual exclusion + * of kvm->arch.lpcr update). */ void kvmppc_update_lpcr(struct kvm *kvm, unsigned long lpcr, unsigned long mask) { @@ -4491,7 +4498,7 @@ void kvmppc_setup_partition_table(struct kvm *kvm)
/* * Set up HPT (hashed page table) and RMA (real-mode area). - * Must be called with kvm->lock held. + * Must be called with kvm->arch.mmu_setup_lock held. */ static int kvmppc_hv_setup_htab_rma(struct kvm_vcpu *vcpu) { @@ -4579,7 +4586,10 @@ static int kvmppc_hv_setup_htab_rma(struct kvm_vcpu *vcpu) goto out_srcu; }
-/* Must be called with kvm->lock held and mmu_ready = 0 and no vcpus running */ +/* + * Must be called with kvm->arch.mmu_setup_lock held and + * mmu_ready = 0 and no vcpus running. + */ int kvmppc_switch_mmu_to_hpt(struct kvm *kvm) { if (nesting_enabled(kvm)) @@ -4596,7 +4606,10 @@ int kvmppc_switch_mmu_to_hpt(struct kvm *kvm) return 0; }
-/* Must be called with kvm->lock held and mmu_ready = 0 and no vcpus running */ +/* + * Must be called with kvm->arch.mmu_setup_lock held and + * mmu_ready = 0 and no vcpus running. + */ int kvmppc_switch_mmu_to_radix(struct kvm *kvm) { int err; @@ -4701,6 +4714,8 @@ static int kvmppc_core_init_vm_hv(struct kvm *kvm) char buf[32]; int ret;
+ mutex_init(&kvm->arch.mmu_setup_lock); + /* Allocate the guest's logical partition ID */
lpid = kvmppc_alloc_lpid(); @@ -5226,7 +5241,7 @@ static int kvmhv_configure_mmu(struct kvm *kvm, struct kvm_ppc_mmuv3_cfg *cfg) if (kvmhv_on_pseries() && !radix) return -EINVAL;
- mutex_lock(&kvm->lock); + mutex_lock(&kvm->arch.mmu_setup_lock); if (radix != kvm_is_radix(kvm)) { if (kvm->arch.mmu_ready) { kvm->arch.mmu_ready = 0; @@ -5254,7 +5269,7 @@ static int kvmhv_configure_mmu(struct kvm *kvm, struct kvm_ppc_mmuv3_cfg *cfg) err = 0;
out_unlock: - mutex_unlock(&kvm->lock); + mutex_unlock(&kvm->arch.mmu_setup_lock); return err; }
[ Upstream commit 1659e27d2bc1ef47b6d031abe01b467f18cb72d9 ]
Currently the Book 3S KVM code uses kvm->lock to synchronize access to the kvm->arch.rtas_tokens list. Because this list is scanned inside kvmppc_rtas_hcall(), which is called with the vcpu mutex held, taking kvm->lock cause a lock inversion problem, which could lead to a deadlock.
To fix this, we add a new mutex, kvm->arch.rtas_token_lock, which nests inside the vcpu mutexes, and use that instead of kvm->lock when accessing the rtas token list.
This removes the lockdep_assert_held() in kvmppc_rtas_tokens_free(). At this point we don't hold the new mutex, but that is OK because kvmppc_rtas_tokens_free() is only called when the whole VM is being destroyed, and at that point nothing can be looking up a token in the list.
Signed-off-by: Paul Mackerras paulus@ozlabs.org Signed-off-by: Sasha Levin sashal@kernel.org --- arch/powerpc/include/asm/kvm_host.h | 1 + arch/powerpc/kvm/book3s.c | 1 + arch/powerpc/kvm/book3s_rtas.c | 14 ++++++-------- 3 files changed, 8 insertions(+), 8 deletions(-)
diff --git a/arch/powerpc/include/asm/kvm_host.h b/arch/powerpc/include/asm/kvm_host.h index 8d3658275a34..1f9eb75ce95a 100644 --- a/arch/powerpc/include/asm/kvm_host.h +++ b/arch/powerpc/include/asm/kvm_host.h @@ -305,6 +305,7 @@ struct kvm_arch { #ifdef CONFIG_PPC_BOOK3S_64 struct list_head spapr_tce_tables; struct list_head rtas_tokens; + struct mutex rtas_token_lock; DECLARE_BITMAP(enabled_hcalls, MAX_HCALL_OPCODE/4 + 1); #endif #ifdef CONFIG_KVM_MPIC diff --git a/arch/powerpc/kvm/book3s.c b/arch/powerpc/kvm/book3s.c index 10c5579d20ce..020304403bae 100644 --- a/arch/powerpc/kvm/book3s.c +++ b/arch/powerpc/kvm/book3s.c @@ -878,6 +878,7 @@ int kvmppc_core_init_vm(struct kvm *kvm) #ifdef CONFIG_PPC64 INIT_LIST_HEAD_RCU(&kvm->arch.spapr_tce_tables); INIT_LIST_HEAD(&kvm->arch.rtas_tokens); + mutex_init(&kvm->arch.rtas_token_lock); #endif
return kvm->arch.kvm_ops->init_vm(kvm); diff --git a/arch/powerpc/kvm/book3s_rtas.c b/arch/powerpc/kvm/book3s_rtas.c index 4e178c4c1ea5..b7ae3dfbf00e 100644 --- a/arch/powerpc/kvm/book3s_rtas.c +++ b/arch/powerpc/kvm/book3s_rtas.c @@ -146,7 +146,7 @@ static int rtas_token_undefine(struct kvm *kvm, char *name) { struct rtas_token_definition *d, *tmp;
- lockdep_assert_held(&kvm->lock); + lockdep_assert_held(&kvm->arch.rtas_token_lock);
list_for_each_entry_safe(d, tmp, &kvm->arch.rtas_tokens, list) { if (rtas_name_matches(d->handler->name, name)) { @@ -167,7 +167,7 @@ static int rtas_token_define(struct kvm *kvm, char *name, u64 token) bool found; int i;
- lockdep_assert_held(&kvm->lock); + lockdep_assert_held(&kvm->arch.rtas_token_lock);
list_for_each_entry(d, &kvm->arch.rtas_tokens, list) { if (d->token == token) @@ -206,14 +206,14 @@ int kvm_vm_ioctl_rtas_define_token(struct kvm *kvm, void __user *argp) if (copy_from_user(&args, argp, sizeof(args))) return -EFAULT;
- mutex_lock(&kvm->lock); + mutex_lock(&kvm->arch.rtas_token_lock);
if (args.token) rc = rtas_token_define(kvm, args.name, args.token); else rc = rtas_token_undefine(kvm, args.name);
- mutex_unlock(&kvm->lock); + mutex_unlock(&kvm->arch.rtas_token_lock);
return rc; } @@ -245,7 +245,7 @@ int kvmppc_rtas_hcall(struct kvm_vcpu *vcpu) orig_rets = args.rets; args.rets = &args.args[be32_to_cpu(args.nargs)];
- mutex_lock(&vcpu->kvm->lock); + mutex_lock(&vcpu->kvm->arch.rtas_token_lock);
rc = -ENOENT; list_for_each_entry(d, &vcpu->kvm->arch.rtas_tokens, list) { @@ -256,7 +256,7 @@ int kvmppc_rtas_hcall(struct kvm_vcpu *vcpu) } }
- mutex_unlock(&vcpu->kvm->lock); + mutex_unlock(&vcpu->kvm->arch.rtas_token_lock);
if (rc == 0) { args.rets = orig_rets; @@ -282,8 +282,6 @@ void kvmppc_rtas_tokens_free(struct kvm *kvm) { struct rtas_token_definition *d, *tmp;
- lockdep_assert_held(&kvm->lock); - list_for_each_entry_safe(d, tmp, &kvm->arch.rtas_tokens, list) { list_del(&d->list); kfree(d);
[ Upstream commit 5a3f49364c3ffa1107bd88f8292406e98c5d206c ]
Currently the HV KVM code takes the kvm->lock around calls to kvm_for_each_vcpu() and kvm_get_vcpu_by_id() (which can call kvm_for_each_vcpu() internally). However, that leads to a lock order inversion problem, because these are called in contexts where the vcpu mutex is held, but the vcpu mutexes nest within kvm->lock according to Documentation/virtual/kvm/locking.txt. Hence there is a possibility of deadlock.
To fix this, we simply don't take the kvm->lock mutex around these calls. This is safe because the implementations of kvm_for_each_vcpu() and kvm_get_vcpu_by_id() have been designed to be able to be called locklessly.
Signed-off-by: Paul Mackerras paulus@ozlabs.org Reviewed-by: Cédric Le Goater clg@kaod.org Signed-off-by: Paul Mackerras paulus@ozlabs.org Signed-off-by: Sasha Levin sashal@kernel.org --- arch/powerpc/kvm/book3s_hv.c | 9 +-------- 1 file changed, 1 insertion(+), 8 deletions(-)
diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c index 9f49087c3a41..6d4f0f72231f 100644 --- a/arch/powerpc/kvm/book3s_hv.c +++ b/arch/powerpc/kvm/book3s_hv.c @@ -445,12 +445,7 @@ static void kvmppc_dump_regs(struct kvm_vcpu *vcpu)
static struct kvm_vcpu *kvmppc_find_vcpu(struct kvm *kvm, int id) { - struct kvm_vcpu *ret; - - mutex_lock(&kvm->lock); - ret = kvm_get_vcpu_by_id(kvm, id); - mutex_unlock(&kvm->lock); - return ret; + return kvm_get_vcpu_by_id(kvm, id); }
static void init_vpa(struct kvm_vcpu *vcpu, struct lppaca *vpa) @@ -1502,7 +1497,6 @@ static void kvmppc_set_lpcr(struct kvm_vcpu *vcpu, u64 new_lpcr, struct kvmppc_vcore *vc = vcpu->arch.vcore; u64 mask;
- mutex_lock(&kvm->lock); spin_lock(&vc->lock); /* * If ILE (interrupt little-endian) has changed, update the @@ -1542,7 +1536,6 @@ static void kvmppc_set_lpcr(struct kvm_vcpu *vcpu, u64 new_lpcr, mask &= 0xFFFFFFFF; vc->lpcr = (vc->lpcr & ~mask) | (new_lpcr & mask); spin_unlock(&vc->lock); - mutex_unlock(&kvm->lock); }
static int kvmppc_get_one_reg_hv(struct kvm_vcpu *vcpu, u64 id,
[ Upstream commit 6954158a16404e7091cea494cd0a435ca2f90388 ]
With gcc 4.1:
sound/firewire/fireface/ff-protocol-latter.c: In function ‘latter_switch_fetching_mode’: sound/firewire/fireface/ff-protocol-latter.c:97: warning: integer constant is too large for ‘long’ type sound/firewire/fireface/ff-protocol-latter.c: In function ‘latter_begin_session’: sound/firewire/fireface/ff-protocol-latter.c:170: warning: integer constant is too large for ‘long’ type sound/firewire/fireface/ff-protocol-latter.c:197: warning: integer constant is too large for ‘long’ type sound/firewire/fireface/ff-protocol-latter.c:205: warning: integer constant is too large for ‘long’ type sound/firewire/fireface/ff-protocol-latter.c: In function ‘latter_finish_session’: sound/firewire/fireface/ff-protocol-latter.c:214: warning: integer constant is too large for ‘long’ type
Fix this by adding the missing "ULL" suffixes. Add the same suffix to the last constant, to maintain consistency.
Fixes: fd1cc9de64c2ca6c ("ALSA: fireface: add support for Fireface UCX") Signed-off-by: Geert Uytterhoeven geert@linux-m68k.org Reviewed-by: Takashi Sakamoto o-takashi@sakamocchi.jp Signed-off-by: Takashi Iwai tiwai@suse.de Signed-off-by: Sasha Levin sashal@kernel.org --- sound/firewire/fireface/ff-protocol-latter.c | 10 +++++----- 1 file changed, 5 insertions(+), 5 deletions(-)
diff --git a/sound/firewire/fireface/ff-protocol-latter.c b/sound/firewire/fireface/ff-protocol-latter.c index c8236ff89b7f..b30d02d359b1 100644 --- a/sound/firewire/fireface/ff-protocol-latter.c +++ b/sound/firewire/fireface/ff-protocol-latter.c @@ -9,11 +9,11 @@
#include "ff.h"
-#define LATTER_STF 0xffff00000004 -#define LATTER_ISOC_CHANNELS 0xffff00000008 -#define LATTER_ISOC_START 0xffff0000000c -#define LATTER_FETCH_MODE 0xffff00000010 -#define LATTER_SYNC_STATUS 0x0000801c0000 +#define LATTER_STF 0xffff00000004ULL +#define LATTER_ISOC_CHANNELS 0xffff00000008ULL +#define LATTER_ISOC_START 0xffff0000000cULL +#define LATTER_FETCH_MODE 0xffff00000010ULL +#define LATTER_SYNC_STATUS 0x0000801c0000ULL
static int parse_clock_bits(u32 data, unsigned int *rate, enum snd_ff_clock_src *src)
[ Upstream commit 8ef8f368ce72b5e17f7c1f1ef15c38dcfd0fef64 ]
Syscall wrappers in <asm/syscall_wrapper.h> use const struct pt_regs * as the argument type. Use const in syscall_fn_t as well to fix indirect call type mismatches with Control-Flow Integrity checking.
Signed-off-by: Sami Tolvanen samitolvanen@google.com Reviewed-by: Mark Rutland mark.rutland@arm.com Signed-off-by: Will Deacon will.deacon@arm.com Signed-off-by: Sasha Levin sashal@kernel.org --- arch/arm64/include/asm/syscall.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/arm64/include/asm/syscall.h b/arch/arm64/include/asm/syscall.h index a179df3674a1..6206ab9bfcfc 100644 --- a/arch/arm64/include/asm/syscall.h +++ b/arch/arm64/include/asm/syscall.h @@ -20,7 +20,7 @@ #include <linux/compat.h> #include <linux/err.h>
-typedef long (*syscall_fn_t)(struct pt_regs *regs); +typedef long (*syscall_fn_t)(const struct pt_regs *regs);
extern const syscall_fn_t sys_call_table[];
[ Upstream commit 0e358bd7b7ebd27e491dabed938eae254c17fe3b ]
Although a syscall defined using SYSCALL_DEFINE0 doesn't accept parameters, use the correct function type to avoid indirect call type mismatches with Control-Flow Integrity checking.
Signed-off-by: Sami Tolvanen samitolvanen@google.com Signed-off-by: Will Deacon will.deacon@arm.com Signed-off-by: Sasha Levin sashal@kernel.org --- arch/arm64/include/asm/syscall_wrapper.h | 18 +++++++++--------- 1 file changed, 9 insertions(+), 9 deletions(-)
diff --git a/arch/arm64/include/asm/syscall_wrapper.h b/arch/arm64/include/asm/syscall_wrapper.h index a4477e515b79..507d0ee6bc69 100644 --- a/arch/arm64/include/asm/syscall_wrapper.h +++ b/arch/arm64/include/asm/syscall_wrapper.h @@ -30,10 +30,10 @@ } \ static inline long __do_compat_sys##name(__MAP(x,__SC_DECL,__VA_ARGS__))
-#define COMPAT_SYSCALL_DEFINE0(sname) \ - asmlinkage long __arm64_compat_sys_##sname(void); \ - ALLOW_ERROR_INJECTION(__arm64_compat_sys_##sname, ERRNO); \ - asmlinkage long __arm64_compat_sys_##sname(void) +#define COMPAT_SYSCALL_DEFINE0(sname) \ + asmlinkage long __arm64_compat_sys_##sname(const struct pt_regs *__unused); \ + ALLOW_ERROR_INJECTION(__arm64_compat_sys_##sname, ERRNO); \ + asmlinkage long __arm64_compat_sys_##sname(const struct pt_regs *__unused)
#define COND_SYSCALL_COMPAT(name) \ cond_syscall(__arm64_compat_sys_##name); @@ -62,11 +62,11 @@ static inline long __do_sys##name(__MAP(x,__SC_DECL,__VA_ARGS__))
#ifndef SYSCALL_DEFINE0 -#define SYSCALL_DEFINE0(sname) \ - SYSCALL_METADATA(_##sname, 0); \ - asmlinkage long __arm64_sys_##sname(void); \ - ALLOW_ERROR_INJECTION(__arm64_sys_##sname, ERRNO); \ - asmlinkage long __arm64_sys_##sname(void) +#define SYSCALL_DEFINE0(sname) \ + SYSCALL_METADATA(_##sname, 0); \ + asmlinkage long __arm64_sys_##sname(const struct pt_regs *__unused); \ + ALLOW_ERROR_INJECTION(__arm64_sys_##sname, ERRNO); \ + asmlinkage long __arm64_sys_##sname(const struct pt_regs *__unused) #endif
#ifndef COND_SYSCALL
[ Upstream commit 1e29ab3186e33c77dbb2d7566172a205b59fa390 ]
Calling sys_ni_syscall through a syscall_fn_t pointer trips indirect call Control-Flow Integrity checking due to a function type mismatch. Use SYSCALL_DEFINE0 for __arm64_sys_ni_syscall instead and remove the now unnecessary casts.
Signed-off-by: Sami Tolvanen samitolvanen@google.com Signed-off-by: Will Deacon will.deacon@arm.com Signed-off-by: Sasha Levin sashal@kernel.org --- arch/arm64/kernel/sys.c | 14 +++++++++----- arch/arm64/kernel/sys32.c | 7 ++----- 2 files changed, 11 insertions(+), 10 deletions(-)
diff --git a/arch/arm64/kernel/sys.c b/arch/arm64/kernel/sys.c index 162a95ed0881..fe20c461582a 100644 --- a/arch/arm64/kernel/sys.c +++ b/arch/arm64/kernel/sys.c @@ -47,22 +47,26 @@ SYSCALL_DEFINE1(arm64_personality, unsigned int, personality) return ksys_personality(personality); }
+asmlinkage long sys_ni_syscall(void); + +asmlinkage long __arm64_sys_ni_syscall(const struct pt_regs *__unused) +{ + return sys_ni_syscall(); +} + /* * Wrappers to pass the pt_regs argument. */ #define __arm64_sys_personality __arm64_sys_arm64_personality
-asmlinkage long sys_ni_syscall(const struct pt_regs *); -#define __arm64_sys_ni_syscall sys_ni_syscall - #undef __SYSCALL #define __SYSCALL(nr, sym) asmlinkage long __arm64_##sym(const struct pt_regs *); #include <asm/unistd.h>
#undef __SYSCALL -#define __SYSCALL(nr, sym) [nr] = (syscall_fn_t)__arm64_##sym, +#define __SYSCALL(nr, sym) [nr] = __arm64_##sym,
const syscall_fn_t sys_call_table[__NR_syscalls] = { - [0 ... __NR_syscalls - 1] = (syscall_fn_t)sys_ni_syscall, + [0 ... __NR_syscalls - 1] = __arm64_sys_ni_syscall, #include <asm/unistd.h> }; diff --git a/arch/arm64/kernel/sys32.c b/arch/arm64/kernel/sys32.c index 0f8bcb7de700..3c80a40c1c9d 100644 --- a/arch/arm64/kernel/sys32.c +++ b/arch/arm64/kernel/sys32.c @@ -133,17 +133,14 @@ COMPAT_SYSCALL_DEFINE6(aarch32_fallocate, int, fd, int, mode, return ksys_fallocate(fd, mode, arg_u64(offset), arg_u64(len)); }
-asmlinkage long sys_ni_syscall(const struct pt_regs *); -#define __arm64_sys_ni_syscall sys_ni_syscall - #undef __SYSCALL #define __SYSCALL(nr, sym) asmlinkage long __arm64_##sym(const struct pt_regs *); #include <asm/unistd32.h>
#undef __SYSCALL -#define __SYSCALL(nr, sym) [nr] = (syscall_fn_t)__arm64_##sym, +#define __SYSCALL(nr, sym) [nr] = __arm64_##sym,
const syscall_fn_t compat_sys_call_table[__NR_compat_syscalls] = { - [0 ... __NR_compat_syscalls - 1] = (syscall_fn_t)sys_ni_syscall, + [0 ... __NR_compat_syscalls - 1] = __arm64_sys_ni_syscall, #include <asm/unistd32.h> };
[ Upstream commit 315ca92dd863fecbffc0bb52ae0ac11e0398726a ]
The sh_eth_close() resets the MAC and then calls phy_stop() so that mdio read access result is incorrect without any error according to kernel trace like below:
ifconfig-216 [003] .n.. 109.133124: mdio_access: ee700000.ethernet-ffffffff read phy:0x01 reg:0x00 val:0xffff
According to the hardware manual, the RMII mode should be set to 1 before operation the Ethernet MAC. However, the previous code was not set to 1 after the driver issued the soft_reset in sh_eth_dev_exit() so that the mdio read access result seemed incorrect. To fix the issue, this patch adds a condition and set the RMII mode register in sh_eth_dev_exit() for R-Car Gen2 and RZ/A1 SoCs.
Note that when I have tried to move the sh_eth_dev_exit() calling after phy_stop() on sh_eth_close(), but it gets worse (kernel panic happened and it seems that a register is accessed while the clock is off).
Signed-off-by: Yoshihiro Shimoda yoshihiro.shimoda.uh@renesas.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/renesas/sh_eth.c | 4 ++++ 1 file changed, 4 insertions(+)
diff --git a/drivers/net/ethernet/renesas/sh_eth.c b/drivers/net/ethernet/renesas/sh_eth.c index e33af371b169..48967dd27bbf 100644 --- a/drivers/net/ethernet/renesas/sh_eth.c +++ b/drivers/net/ethernet/renesas/sh_eth.c @@ -1594,6 +1594,10 @@ static void sh_eth_dev_exit(struct net_device *ndev) sh_eth_get_stats(ndev); mdp->cd->soft_reset(ndev);
+ /* Set the RMII mode again if required */ + if (mdp->cd->rmiimode) + sh_eth_write(ndev, 0x1, RMIIMODE); + /* Set MAC address again */ update_mac_address(ndev); }
[ Upstream commit 41de54c64811bf087c8464fdeb43c6ad8be2686b ]
If blk_mq_init_allocated_queue() fails, make sure to free the poll stat callback struct allocated.
Signed-off-by: Jes Sorensen jsorensen@fb.com Signed-off-by: Jens Axboe axboe@kernel.dk Signed-off-by: Sasha Levin sashal@kernel.org --- block/blk-mq.c | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/block/blk-mq.c b/block/blk-mq.c index 11efca3534ad..00b826399228 100644 --- a/block/blk-mq.c +++ b/block/blk-mq.c @@ -2846,7 +2846,7 @@ struct request_queue *blk_mq_init_allocated_queue(struct blk_mq_tag_set *set, goto err_exit;
if (blk_mq_alloc_ctxs(q)) - goto err_exit; + goto err_poll;
/* init q->mq_kobj and sw queues' kobjects */ blk_mq_sysfs_init(q); @@ -2907,6 +2907,9 @@ struct request_queue *blk_mq_init_allocated_queue(struct blk_mq_tag_set *set, kfree(q->queue_hw_ctx); err_sys_init: blk_mq_sysfs_deinit(q); +err_poll: + blk_stat_free_callback(q->poll_cb); + q->poll_cb = NULL; err_exit: q->mq_ops = NULL; return ERR_PTR(-ENOMEM);
[ Upstream commit c678726305b9425454be7c8a7624290b602602fc ]
Ensure that we supply the same phy interface mode to mac_link_down() as we did for the corresponding mac_link_up() call. This ensures that MAC drivers that use the phy interface mode in these methods can depend on mac_link_down() always corresponding to a mac_link_up() call for the same interface mode.
Signed-off-by: Russell King rmk+kernel@armlinux.org.uk Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/phy/phylink.c | 10 +++++++--- 1 file changed, 7 insertions(+), 3 deletions(-)
diff --git a/drivers/net/phy/phylink.c b/drivers/net/phy/phylink.c index d8f919fe49fd..507baa10ec70 100644 --- a/drivers/net/phy/phylink.c +++ b/drivers/net/phy/phylink.c @@ -51,6 +51,10 @@ struct phylink {
/* The link configuration settings */ struct phylink_link_state link_config; + + /* The current settings */ + phy_interface_t cur_interface; + struct gpio_desc *link_gpio; struct timer_list link_poll; void (*get_fixed_state)(struct net_device *dev, @@ -453,12 +457,12 @@ static void phylink_resolve(struct work_struct *w) if (!link_state.link) { netif_carrier_off(ndev); pl->ops->mac_link_down(ndev, pl->link_an_mode, - pl->phy_state.interface); + pl->cur_interface); netdev_info(ndev, "Link is Down\n"); } else { + pl->cur_interface = link_state.interface; pl->ops->mac_link_up(ndev, pl->link_an_mode, - pl->phy_state.interface, - pl->phydev); + pl->cur_interface, pl->phydev);
netif_carrier_on(ndev);
[ Upstream commit 333061b924539c0de081339643f45514f5f1c1e6 ]
For supporting 10Mps speed in SGMII mode DP83867_10M_SGMII_RATE_ADAPT bit of DP83867_10M_SGMII_CFG register has to be cleared by software. That does not affect speeds 100 and 1000 so can be done on init.
Signed-off-by: Max Uvarov muvarov@gmail.com Cc: Heiner Kallweit hkallweit1@gmail.com Reviewed-by: Florian Fainelli f.fainelli@gmail.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/phy/dp83867.c | 17 +++++++++++++++++ 1 file changed, 17 insertions(+)
diff --git a/drivers/net/phy/dp83867.c b/drivers/net/phy/dp83867.c index 8448d01819ef..29cae4de9a4f 100644 --- a/drivers/net/phy/dp83867.c +++ b/drivers/net/phy/dp83867.c @@ -30,6 +30,8 @@ #define DP83867_STRAP_STS1 0x006E #define DP83867_RGMIIDCTL 0x0086 #define DP83867_IO_MUX_CFG 0x0170 +#define DP83867_10M_SGMII_CFG 0x016F +#define DP83867_10M_SGMII_RATE_ADAPT_MASK BIT(7)
#define DP83867_SW_RESET BIT(15) #define DP83867_SW_RESTART BIT(14) @@ -277,6 +279,21 @@ static int dp83867_config_init(struct phy_device *phydev) DP83867_IO_MUX_CFG_IO_IMPEDANCE_CTRL); }
+ if (phydev->interface == PHY_INTERFACE_MODE_SGMII) { + /* For support SPEED_10 in SGMII mode + * DP83867_10M_SGMII_RATE_ADAPT bit + * has to be cleared by software. That + * does not affect SPEED_100 and + * SPEED_1000. + */ + ret = phy_modify_mmd(phydev, DP83867_DEVADDR, + DP83867_10M_SGMII_CFG, + DP83867_10M_SGMII_RATE_ADAPT_MASK, + 0); + if (ret) + return ret; + } + /* Enable Interrupt output INT_OE in CFG3 register */ if (phy_interrupt_is_valid(phydev)) { val = phy_read(phydev, DP83867_CFG3);
[ Upstream commit 1a97a477e666cbdededab93bd3754e508f0c09d7 ]
After reset SGMII Autoneg timer is set to 2us (bits 6 and 5 are 01). That is not enough to finalize autonegatiation on some devices. Increase this timer duration to maximum supported 16ms.
Signed-off-by: Max Uvarov muvarov@gmail.com Cc: Heiner Kallweit hkallweit1@gmail.com Reviewed-by: Florian Fainelli f.fainelli@gmail.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/phy/dp83867.c | 18 ++++++++++++++++++ 1 file changed, 18 insertions(+)
diff --git a/drivers/net/phy/dp83867.c b/drivers/net/phy/dp83867.c index 29cae4de9a4f..ffaf67bdb140 100644 --- a/drivers/net/phy/dp83867.c +++ b/drivers/net/phy/dp83867.c @@ -26,6 +26,12 @@
/* Extended Registers */ #define DP83867_CFG4 0x0031 +#define DP83867_CFG4_SGMII_ANEG_MASK (BIT(5) | BIT(6)) +#define DP83867_CFG4_SGMII_ANEG_TIMER_11MS (3 << 5) +#define DP83867_CFG4_SGMII_ANEG_TIMER_800US (2 << 5) +#define DP83867_CFG4_SGMII_ANEG_TIMER_2US (1 << 5) +#define DP83867_CFG4_SGMII_ANEG_TIMER_16MS (0 << 5) + #define DP83867_RGMIICTL 0x0032 #define DP83867_STRAP_STS1 0x006E #define DP83867_RGMIIDCTL 0x0086 @@ -292,6 +298,18 @@ static int dp83867_config_init(struct phy_device *phydev) 0); if (ret) return ret; + + /* After reset SGMII Autoneg timer is set to 2us (bits 6 and 5 + * are 01). That is not enough to finalize autoneg on some + * devices. Increase this timer duration to maximum 16ms. + */ + ret = phy_modify_mmd(phydev, DP83867_DEVADDR, + DP83867_CFG4, + DP83867_CFG4_SGMII_ANEG_MASK, + DP83867_CFG4_SGMII_ANEG_TIMER_16MS); + + if (ret) + return ret; }
/* Enable Interrupt output INT_OE in CFG3 register */
[ Upstream commit 2b892649254fec01678c64f16427622b41fa27f4 ]
PHY_INTERFACE_MODE_RGMII_RXID is less then TXID so code to set tx delay is never called.
Fixes: 2a10154abcb75 ("net: phy: dp83867: Add TI dp83867 phy") Signed-off-by: Max Uvarov muvarov@gmail.com Cc: Florian Fainelli f.fainelli@gmail.com Reviewed-by: Florian Fainelli f.fainelli@gmail.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/phy/dp83867.c | 4 +--- 1 file changed, 1 insertion(+), 3 deletions(-)
diff --git a/drivers/net/phy/dp83867.c b/drivers/net/phy/dp83867.c index ffaf67bdb140..2995a1788ceb 100644 --- a/drivers/net/phy/dp83867.c +++ b/drivers/net/phy/dp83867.c @@ -255,10 +255,8 @@ static int dp83867_config_init(struct phy_device *phydev) ret = phy_write(phydev, MII_DP83867_PHYCTRL, val); if (ret) return ret; - }
- if ((phydev->interface >= PHY_INTERFACE_MODE_RGMII_ID) && - (phydev->interface <= PHY_INTERFACE_MODE_RGMII_RXID)) { + /* Set up RGMII delays */ val = phy_read_mmd(phydev, DP83867_DEVADDR, DP83867_RGMIICTL);
if (phydev->interface == PHY_INTERFACE_MODE_RGMII_ID)
[ Upstream commit cc555759117e8349088e0c5d19f2f2a500bafdbd ]
ip_dev_find() can return NULL so add a check for NULL pointer.
Signed-off-by: Varun Prakash varun@chelsio.com Signed-off-by: Martin K. Petersen martin.petersen@oracle.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/scsi/cxgbi/libcxgbi.c | 4 ++++ 1 file changed, 4 insertions(+)
diff --git a/drivers/scsi/cxgbi/libcxgbi.c b/drivers/scsi/cxgbi/libcxgbi.c index 006372b3fba2..a50734f3c486 100644 --- a/drivers/scsi/cxgbi/libcxgbi.c +++ b/drivers/scsi/cxgbi/libcxgbi.c @@ -641,6 +641,10 @@ cxgbi_check_route(struct sockaddr *dst_addr, int ifindex)
if (ndev->flags & IFF_LOOPBACK) { ndev = ip_dev_find(&init_net, daddr->sin_addr.s_addr); + if (!ndev) { + err = -ENETUNREACH; + goto rel_neigh; + } mtu = ndev->mtu; pr_info("rt dev %s, loopback -> %s, mtu %u.\n", n->dev->name, ndev->name, mtu);
[ Upstream commit 1d94f06e7f5df4064ef336b7b710f50143b64a53 ]
When SME is enabled, the smartpqi driver won't work on the HP DL385 G10 machine, which causes the failure of kernel boot because it fails to allocate pqi error buffer. Please refer to the kernel log: .... [ 9.431749] usbcore: registered new interface driver uas [ 9.441524] Microsemi PQI Driver (v1.1.4-130) [ 9.442956] i40e 0000:04:00.0: fw 6.70.48768 api 1.7 nvm 10.2.5 [ 9.447237] smartpqi 0000:23:00.0: Microsemi Smart Family Controller found Starting dracut initqueue hook... [ OK ] Started Show Plymouth Boot Scre[ 9.471654] Broadcom NetXtreme-C/E driver bnxt_en v1.9.1 en. [ OK ] Started Forward Password Requests to Plymouth Directory Watch. [[0;[ 9.487108] smartpqi 0000:23:00.0: failed to allocate PQI error buffer .... [ 139.050544] dracut-initqueue[949]: Warning: dracut-initqueue timeout - starting timeout scripts [ 139.589779] dracut-initqueue[949]: Warning: dracut-initqueue timeout - starting timeout scripts
Basically, the fact that the coherent DMA mask value wasn't set caused the driver to fall back to SWIOTLB when SME is active.
For correct operation, lets call the dma_set_mask_and_coherent() to properly set the mask for both streaming and coherent, in order to inform the kernel about the devices DMA addressing capabilities.
Signed-off-by: Lianbo Jiang lijiang@redhat.com Acked-by: Don Brace don.brace@microsemi.com Tested-by: Don Brace don.brace@microsemi.com Signed-off-by: Martin K. Petersen martin.petersen@oracle.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/scsi/smartpqi/smartpqi_init.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/scsi/smartpqi/smartpqi_init.c b/drivers/scsi/smartpqi/smartpqi_init.c index 75ec43aa8df3..531824afba5f 100644 --- a/drivers/scsi/smartpqi/smartpqi_init.c +++ b/drivers/scsi/smartpqi/smartpqi_init.c @@ -7285,7 +7285,7 @@ static int pqi_pci_init(struct pqi_ctrl_info *ctrl_info) else mask = DMA_BIT_MASK(32);
- rc = dma_set_mask(&ctrl_info->pci_dev->dev, mask); + rc = dma_set_mask_and_coherent(&ctrl_info->pci_dev->dev, mask); if (rc) { dev_err(&ctrl_info->pci_dev->dev, "failed to set DMA mask\n"); goto disable_device;
[ Upstream commit 12e750bc62044de096ab9a95201213fd912b9994 ]
If alloc_workqueue fails in alua_init, it should return -ENOMEM, otherwise it will trigger null-ptr-deref while unloading module which calls destroy_workqueue dereference wq->lock like this:
BUG: KASAN: null-ptr-deref in __lock_acquire+0x6b4/0x1ee0 Read of size 8 at addr 0000000000000080 by task syz-executor.0/7045
CPU: 0 PID: 7045 Comm: syz-executor.0 Tainted: G C 5.1.0+ #28 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1ubuntu1 Call Trace: dump_stack+0xa9/0x10e __kasan_report+0x171/0x18d ? __lock_acquire+0x6b4/0x1ee0 kasan_report+0xe/0x20 __lock_acquire+0x6b4/0x1ee0 lock_acquire+0xb4/0x1b0 __mutex_lock+0xd8/0xb90 drain_workqueue+0x25/0x290 destroy_workqueue+0x1f/0x3f0 __x64_sys_delete_module+0x244/0x330 do_syscall_64+0x72/0x2a0 entry_SYSCALL_64_after_hwframe+0x49/0xbe
Reported-by: Hulk Robot hulkci@huawei.com Fixes: 03197b61c5ec ("scsi_dh_alua: Use workqueue for RTPG") Signed-off-by: YueHaibing yuehaibing@huawei.com Reviewed-by: Bart Van Assche bvanassche@acm.org Signed-off-by: Martin K. Petersen martin.petersen@oracle.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/scsi/device_handler/scsi_dh_alua.c | 6 ++---- 1 file changed, 2 insertions(+), 4 deletions(-)
diff --git a/drivers/scsi/device_handler/scsi_dh_alua.c b/drivers/scsi/device_handler/scsi_dh_alua.c index d7ac498ba35a..2a9dcb8973b7 100644 --- a/drivers/scsi/device_handler/scsi_dh_alua.c +++ b/drivers/scsi/device_handler/scsi_dh_alua.c @@ -1174,10 +1174,8 @@ static int __init alua_init(void) int r;
kaluad_wq = alloc_workqueue("kaluad", WQ_MEM_RECLAIM, 0); - if (!kaluad_wq) { - /* Temporary failure, bypass */ - return SCSI_DH_DEV_TEMP_BUSY; - } + if (!kaluad_wq) + return -ENOMEM;
r = scsi_register_device_handler(&alua_dh); if (r != 0) {
[ Upstream commit 3b0541791453fbe7f42867e310e0c9eb6295364d ]
The sas_port(phy->port) allocated in sas_ex_discover_expander() will not be deleted when the expander failed to discover. This will cause resource leak and a further issue of kernel BUG like below:
[159785.843156] port-2:17:29: trying to add phy phy-2:17:29 fails: it's already part of another port [159785.852144] ------------[ cut here ]------------ [159785.856833] kernel BUG at drivers/scsi/scsi_transport_sas.c:1086! [159785.863000] Internal error: Oops - BUG: 0 [#1] SMP [159785.867866] CPU: 39 PID: 16993 Comm: kworker/u96:2 Tainted: G W OE 4.19.25-vhulk1901.1.0.h111.aarch64 #1 [159785.878458] Hardware name: Huawei Technologies Co., Ltd. Hi1620EVBCS/Hi1620EVBCS, BIOS Hi1620 CS B070 1P TA 03/21/2019 [159785.889231] Workqueue: 0000:74:02.0_disco_q sas_discover_domain [159785.895224] pstate: 40c00009 (nZcv daif +PAN +UAO) [159785.900094] pc : sas_port_add_phy+0x188/0x1b8 [159785.904524] lr : sas_port_add_phy+0x188/0x1b8 [159785.908952] sp : ffff0001120e3b80 [159785.912341] x29: ffff0001120e3b80 x28: 0000000000000000 [159785.917727] x27: ffff802ade8f5400 x26: ffff0000681b7560 [159785.923111] x25: ffff802adf11a800 x24: ffff0000680e8000 [159785.928496] x23: ffff802ade8f5728 x22: ffff802ade8f5708 [159785.933880] x21: ffff802adea2db40 x20: ffff802ade8f5400 [159785.939264] x19: ffff802adea2d800 x18: 0000000000000010 [159785.944649] x17: 00000000821bf734 x16: ffff00006714faa0 [159785.950033] x15: ffff0000e8ab4ecf x14: 7261702079646165 [159785.955417] x13: 726c612073277469 x12: ffff00006887b830 [159785.960802] x11: ffff00006773eaa0 x10: 7968702079687020 [159785.966186] x9 : 0000000000002453 x8 : 726f702072656874 [159785.971570] x7 : 6f6e6120666f2074 x6 : ffff802bcfb21290 [159785.976955] x5 : ffff802bcfb21290 x4 : 0000000000000000 [159785.982339] x3 : ffff802bcfb298c8 x2 : 337752b234c2ab00 [159785.987723] x1 : 337752b234c2ab00 x0 : 0000000000000000 [159785.993108] Process kworker/u96:2 (pid: 16993, stack limit = 0x0000000072dae094) [159786.000576] Call trace: [159786.003097] sas_port_add_phy+0x188/0x1b8 [159786.007179] sas_ex_get_linkrate.isra.5+0x134/0x140 [159786.012130] sas_ex_discover_expander+0x128/0x408 [159786.016906] sas_ex_discover_dev+0x218/0x4c8 [159786.021249] sas_ex_discover_devices+0x9c/0x1a8 [159786.025852] sas_discover_root_expander+0x134/0x160 [159786.030802] sas_discover_domain+0x1b8/0x1e8 [159786.035148] process_one_work+0x1b4/0x3f8 [159786.039230] worker_thread+0x54/0x470 [159786.042967] kthread+0x134/0x138 [159786.046269] ret_from_fork+0x10/0x18 [159786.049918] Code: 91322300 f0004402 91178042 97fe4c9b (d4210000) [159786.056083] Modules linked in: hns3_enet_ut(OE) hclge(OE) hnae3(OE) hisi_sas_test_hw(OE) hisi_sas_test_main(OE) serdes(OE) [159786.067202] ---[ end trace 03622b9e2d99e196 ]--- [159786.071893] Kernel panic - not syncing: Fatal exception [159786.077190] SMP: stopping secondary CPUs [159786.081192] Kernel Offset: disabled [159786.084753] CPU features: 0x2,a2a00a38
Fixes: 2908d778ab3e ("[SCSI] aic94xx: new driver") Reported-by: Jian Luo luojian5@huawei.com Signed-off-by: Jason Yan yanaijie@huawei.com CC: John Garry john.garry@huawei.com Signed-off-by: Martin K. Petersen martin.petersen@oracle.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/scsi/libsas/sas_expander.c | 2 ++ 1 file changed, 2 insertions(+)
diff --git a/drivers/scsi/libsas/sas_expander.c b/drivers/scsi/libsas/sas_expander.c index 3611a4ef0d15..7c2d78d189e4 100644 --- a/drivers/scsi/libsas/sas_expander.c +++ b/drivers/scsi/libsas/sas_expander.c @@ -1014,6 +1014,8 @@ static struct domain_device *sas_ex_discover_expander( list_del(&child->dev_list_node); spin_unlock_irq(&parent->port->dev_list_lock); sas_put_device(child); + sas_port_delete(phy->port); + phy->port = NULL; return NULL; } list_add_tail(&child->siblings, &parent->ex_dev.children);
[ Upstream commit 275e928f19117d22f6d26dee94548baf4041b773 ]
Force of 56G is not supported by hardware in Ethernet devices. This configuration fails with a bad parameter error from firmware.
Add check of this case. Instead of trying to set 56G with autoneg off, return a meaningful error.
Fixes: 56ade8fe3fe1 ("mlxsw: spectrum: Add initial support for Spectrum ASIC") Signed-off-by: Amit Cohen amitc@mellanox.com Acked-by: Jiri Pirko jiri@mellanox.com Signed-off-by: Ido Schimmel idosch@mellanox.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/mellanox/mlxsw/spectrum.c | 4 ++++ 1 file changed, 4 insertions(+)
diff --git a/drivers/net/ethernet/mellanox/mlxsw/spectrum.c b/drivers/net/ethernet/mellanox/mlxsw/spectrum.c index 6b8aa3761899..f4acb38569e1 100644 --- a/drivers/net/ethernet/mellanox/mlxsw/spectrum.c +++ b/drivers/net/ethernet/mellanox/mlxsw/spectrum.c @@ -3110,6 +3110,10 @@ mlxsw_sp_port_set_link_ksettings(struct net_device *dev, ops->reg_ptys_eth_unpack(mlxsw_sp, ptys_pl, ð_proto_cap, NULL, NULL);
autoneg = cmd->base.autoneg == AUTONEG_ENABLE; + if (!autoneg && cmd->base.speed == SPEED_56000) { + netdev_err(dev, "56G not supported with autoneg off\n"); + return -EINVAL; + } eth_proto_new = autoneg ? ops->to_ptys_advert_link(mlxsw_sp, cmd) : ops->to_ptys_speed(mlxsw_sp, cmd->base.speed);
[ Upstream commit b9fba67b3806e21b98bd5a98dc3921a8e9b42d61 ]
If a call to kobject_init_and_add() fails we should call kobject_put() otherwise we leak memory.
Add call to kobject_put() in the error path of call to kobject_init_and_add(). Please note, this has the side effect that the release method is called if kobject_init_and_add() fails.
Link: http://lkml.kernel.org/r/20190513033458.2824-1-tobin@kernel.org Signed-off-by: Tobin C. Harding tobin@kernel.org Reviewed-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Reviewed-by: Joseph Qi joseph.qi@linux.alibaba.com Cc: Mark Fasheh mark@fasheh.com Cc: Joel Becker jlbec@evilplan.org Cc: Junxiao Bi junxiao.bi@oracle.com Cc: Changwei Ge gechangwei@live.cn Cc: Gang He ghe@suse.com Cc: Jun Piao piaojun@huawei.com Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- fs/ocfs2/filecheck.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/fs/ocfs2/filecheck.c b/fs/ocfs2/filecheck.c index f65f2b2f594d..1906cc962c4d 100644 --- a/fs/ocfs2/filecheck.c +++ b/fs/ocfs2/filecheck.c @@ -193,6 +193,7 @@ int ocfs2_filecheck_create_sysfs(struct ocfs2_super *osb) ret = kobject_init_and_add(&entry->fs_kobj, &ocfs2_ktype_filecheck, NULL, "filecheck"); if (ret) { + kobject_put(&entry->fs_kobj); kfree(fcheck); return ret; }
From: Yang Shi yang.shi@linux.alibaba.com
commit 7a30df49f63ad92318ddf1f7498d1129a77dd4bd upstream.
A few new fields were added to mmu_gather to make TLB flush smarter for huge page by telling what level of page table is changed.
__tlb_reset_range() is used to reset all these page table state to unchanged, which is called by TLB flush for parallel mapping changes for the same range under non-exclusive lock (i.e. read mmap_sem).
Before commit dd2283f2605e ("mm: mmap: zap pages with read mmap_sem in munmap"), the syscalls (e.g. MADV_DONTNEED, MADV_FREE) which may update PTEs in parallel don't remove page tables. But, the forementioned commit may do munmap() under read mmap_sem and free page tables. This may result in program hang on aarch64 reported by Jan Stancek. The problem could be reproduced by his test program with slightly modified below.
---8<---
static int map_size = 4096; static int num_iter = 500; static long threads_total;
static void *distant_area;
void *map_write_unmap(void *ptr) { int *fd = ptr; unsigned char *map_address; int i, j = 0;
for (i = 0; i < num_iter; i++) { map_address = mmap(distant_area, (size_t) map_size, PROT_WRITE | PROT_READ, MAP_SHARED | MAP_ANONYMOUS, -1, 0); if (map_address == MAP_FAILED) { perror("mmap"); exit(1); }
for (j = 0; j < map_size; j++) map_address[j] = 'b';
if (munmap(map_address, map_size) == -1) { perror("munmap"); exit(1); } }
return NULL; }
void *dummy(void *ptr) { return NULL; }
int main(void) { pthread_t thid[2];
/* hint for mmap in map_write_unmap() */ distant_area = mmap(0, DISTANT_MMAP_SIZE, PROT_WRITE | PROT_READ, MAP_ANONYMOUS | MAP_PRIVATE, -1, 0); munmap(distant_area, (size_t)DISTANT_MMAP_SIZE); distant_area += DISTANT_MMAP_SIZE / 2;
while (1) { pthread_create(&thid[0], NULL, map_write_unmap, NULL); pthread_create(&thid[1], NULL, dummy, NULL);
pthread_join(thid[0], NULL); pthread_join(thid[1], NULL); } } ---8<---
The program may bring in parallel execution like below:
t1 t2 munmap(map_address) downgrade_write(&mm->mmap_sem); unmap_region() tlb_gather_mmu() inc_tlb_flush_pending(tlb->mm); free_pgtables() tlb->freed_tables = 1 tlb->cleared_pmds = 1
pthread_exit() madvise(thread_stack, 8M, MADV_DONTNEED) zap_page_range() tlb_gather_mmu() inc_tlb_flush_pending(tlb->mm);
tlb_finish_mmu() if (mm_tlb_flush_nested(tlb->mm)) __tlb_reset_range()
__tlb_reset_range() would reset freed_tables and cleared_* bits, but this may cause inconsistency for munmap() which do free page tables. Then it may result in some architectures, e.g. aarch64, may not flush TLB completely as expected to have stale TLB entries remained.
Use fullmm flush since it yields much better performance on aarch64 and non-fullmm doesn't yields significant difference on x86.
The original proposed fix came from Jan Stancek who mainly debugged this issue, I just wrapped up everything together.
Jan's testing results:
v5.2-rc2-24-gbec7550cca10 -------------------------- mean stddev real 37.382 2.780 user 1.420 0.078 sys 54.658 1.855
v5.2-rc2-24-gbec7550cca10 + "mm: mmu_gather: remove __tlb_reset_range() for force flush" ---------------------------------------------------------------------------------------_ mean stddev real 37.119 2.105 user 1.548 0.087 sys 55.698 1.357
[akpm@linux-foundation.org: coding-style fixes] Link: http://lkml.kernel.org/r/1558322252-113575-1-git-send-email-yang.shi@linux.a... Fixes: dd2283f2605e ("mm: mmap: zap pages with read mmap_sem in munmap") Signed-off-by: Yang Shi yang.shi@linux.alibaba.com Signed-off-by: Jan Stancek jstancek@redhat.com Reported-by: Jan Stancek jstancek@redhat.com Tested-by: Jan Stancek jstancek@redhat.com Suggested-by: Will Deacon will.deacon@arm.com Tested-by: Will Deacon will.deacon@arm.com Acked-by: Will Deacon will.deacon@arm.com Cc: Peter Zijlstra peterz@infradead.org Cc: Nick Piggin npiggin@gmail.com Cc: "Aneesh Kumar K.V" aneesh.kumar@linux.ibm.com Cc: Nadav Amit namit@vmware.com Cc: Minchan Kim minchan@kernel.org Cc: Mel Gorman mgorman@suse.de Cc: stable@vger.kernel.org [4.20+] Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- mm/mmu_gather.c | 24 +++++++++++++++++++----- 1 file changed, 19 insertions(+), 5 deletions(-)
--- a/mm/mmu_gather.c +++ b/mm/mmu_gather.c @@ -93,8 +93,17 @@ void arch_tlb_finish_mmu(struct mmu_gath struct mmu_gather_batch *batch, *next;
if (force) { + /* + * The aarch64 yields better performance with fullmm by + * avoiding multiple CPUs spamming TLBI messages at the + * same time. + * + * On x86 non-fullmm doesn't yield significant difference + * against fullmm. + */ + tlb->fullmm = 1; __tlb_reset_range(tlb); - __tlb_adjust_range(tlb, start, end - start); + tlb->freed_tables = 1; }
tlb_flush_mmu(tlb); @@ -249,10 +258,15 @@ void tlb_finish_mmu(struct mmu_gather *t { /* * If there are parallel threads are doing PTE changes on same range - * under non-exclusive lock(e.g., mmap_sem read-side) but defer TLB - * flush by batching, a thread has stable TLB entry can fail to flush - * the TLB by observing pte_none|!pte_dirty, for example so flush TLB - * forcefully if we detect parallel PTE batching threads. + * under non-exclusive lock (e.g., mmap_sem read-side) but defer TLB + * flush by batching, one thread may end up seeing inconsistent PTEs + * and result in having stale TLB entries. So flush TLB forcefully + * if we detect parallel PTE batching threads. + * + * However, some syscalls, e.g. munmap(), may free page tables, this + * needs force flush everything in the given range. Otherwise this + * may result in having stale TLB entries for some architectures, + * e.g. aarch64, that could specify flush what level TLB. */ bool force = mm_tlb_flush_nested(tlb->mm);
From: Sagi Grimberg sagi@grimberg.me
commit efb973b19b88642bb7e08b8ce8e03b0bbd2a7e2a upstream.
usually nvme_ prefix is for core functions. While we're cleaning up, remove redundant empty lines
Signed-off-by: Sagi Grimberg sagi@grimberg.me Reviewed-by: Minwoo Im minwoo.im@samsung.com Signed-off-by: Christoph Hellwig hch@lst.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/nvme/host/tcp.c | 10 ++++------ 1 file changed, 4 insertions(+), 6 deletions(-)
--- a/drivers/nvme/host/tcp.c +++ b/drivers/nvme/host/tcp.c @@ -473,7 +473,6 @@ static int nvme_tcp_handle_c2h_data(stru }
return 0; - }
static int nvme_tcp_handle_comp(struct nvme_tcp_queue *queue, @@ -634,7 +633,6 @@ static inline void nvme_tcp_end_request( nvme_end_request(rq, cpu_to_le16(status << 1), res); }
- static int nvme_tcp_recv_data(struct nvme_tcp_queue *queue, struct sk_buff *skb, unsigned int *offset, size_t *len) { @@ -1535,7 +1533,7 @@ out_free_queue: return ret; }
-static int nvme_tcp_alloc_io_queues(struct nvme_ctrl *ctrl) +static int __nvme_tcp_alloc_io_queues(struct nvme_ctrl *ctrl) { int i, ret;
@@ -1565,7 +1563,7 @@ static unsigned int nvme_tcp_nr_io_queue return nr_io_queues; }
-static int nvme_alloc_io_queues(struct nvme_ctrl *ctrl) +static int nvme_tcp_alloc_io_queues(struct nvme_ctrl *ctrl) { unsigned int nr_io_queues; int ret; @@ -1582,7 +1580,7 @@ static int nvme_alloc_io_queues(struct n dev_info(ctrl->device, "creating %d I/O queues.\n", nr_io_queues);
- return nvme_tcp_alloc_io_queues(ctrl); + return __nvme_tcp_alloc_io_queues(ctrl); }
static void nvme_tcp_destroy_io_queues(struct nvme_ctrl *ctrl, bool remove) @@ -1599,7 +1597,7 @@ static int nvme_tcp_configure_io_queues( { int ret;
- ret = nvme_alloc_io_queues(ctrl); + ret = nvme_tcp_alloc_io_queues(ctrl); if (ret) return ret;
From: Sagi Grimberg sagi@grimberg.me
commit f34e25898a608380a60135288019c4cb6013bec8 upstream.
If I/O queue connect times out, we might have freed the queue socket already, so check for that on the error path in nvme_tcp_start_queue.
Signed-off-by: Sagi Grimberg sagi@grimberg.me Signed-off-by: Christoph Hellwig hch@lst.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/nvme/host/tcp.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
--- a/drivers/nvme/host/tcp.c +++ b/drivers/nvme/host/tcp.c @@ -1423,7 +1423,8 @@ static int nvme_tcp_start_queue(struct n if (!ret) { set_bit(NVME_TCP_Q_LIVE, &ctrl->queues[idx].flags); } else { - __nvme_tcp_stop_queue(&ctrl->queues[idx]); + if (test_bit(NVME_TCP_Q_ALLOCATED, &ctrl->queues[idx].flags)) + __nvme_tcp_stop_queue(&ctrl->queues[idx]); dev_err(nctrl->device, "failed to connect queue: %d ret=%d\n", idx, ret); }
From: Sagi Grimberg sagi@grimberg.me
commit 6486199378a505c58fddc47459631235c9fb7638 upstream.
When the controller supports less queues than requested, we should make sure that queue mapping does the right thing and not assume that all queues are available. This fixes a crash when the controller supports less queues than requested.
The rules are: 1. if no write queues are requested, we assign the available queues to the default queue map. The default and read queue maps share the existing queues. 2. if write queues are requested: - first make sure that read queue map gets the requested nr_io_queues count - then grant the default queue map the minimum between the requested nr_write_queues and the remaining queues. If there are no available queues to dedicate to the default queue map, fallback to (1) and share all the queues in the existing queue map.
Also, provide a log indication on how we constructed the different queue maps.
Reported-by: Harris, James R james.r.harris@intel.com Tested-by: Jim Harris james.r.harris@intel.com Cc: stable@vger.kernel.org # v5.0+ Suggested-by: Roy Shterman roys@lightbitslabs.com Signed-off-by: Sagi Grimberg sagi@grimberg.me Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/nvme/host/tcp.c | 57 ++++++++++++++++++++++++++++++++++++++++++------ 1 file changed, 50 insertions(+), 7 deletions(-)
--- a/drivers/nvme/host/tcp.c +++ b/drivers/nvme/host/tcp.c @@ -111,6 +111,7 @@ struct nvme_tcp_ctrl { struct work_struct err_work; struct delayed_work connect_work; struct nvme_tcp_request async_req; + u32 io_queues[HCTX_MAX_TYPES]; };
static LIST_HEAD(nvme_tcp_ctrl_list); @@ -1564,6 +1565,35 @@ static unsigned int nvme_tcp_nr_io_queue return nr_io_queues; }
+static void nvme_tcp_set_io_queues(struct nvme_ctrl *nctrl, + unsigned int nr_io_queues) +{ + struct nvme_tcp_ctrl *ctrl = to_tcp_ctrl(nctrl); + struct nvmf_ctrl_options *opts = nctrl->opts; + + if (opts->nr_write_queues && opts->nr_io_queues < nr_io_queues) { + /* + * separate read/write queues + * hand out dedicated default queues only after we have + * sufficient read queues. + */ + ctrl->io_queues[HCTX_TYPE_READ] = opts->nr_io_queues; + nr_io_queues -= ctrl->io_queues[HCTX_TYPE_READ]; + ctrl->io_queues[HCTX_TYPE_DEFAULT] = + min(opts->nr_write_queues, nr_io_queues); + nr_io_queues -= ctrl->io_queues[HCTX_TYPE_DEFAULT]; + } else { + /* + * shared read/write queues + * either no write queues were requested, or we don't have + * sufficient queue count to have dedicated default queues. + */ + ctrl->io_queues[HCTX_TYPE_DEFAULT] = + min(opts->nr_io_queues, nr_io_queues); + nr_io_queues -= ctrl->io_queues[HCTX_TYPE_DEFAULT]; + } +} + static int nvme_tcp_alloc_io_queues(struct nvme_ctrl *ctrl) { unsigned int nr_io_queues; @@ -1581,6 +1611,8 @@ static int nvme_tcp_alloc_io_queues(stru dev_info(ctrl->device, "creating %d I/O queues.\n", nr_io_queues);
+ nvme_tcp_set_io_queues(ctrl, nr_io_queues); + return __nvme_tcp_alloc_io_queues(ctrl); }
@@ -2089,23 +2121,34 @@ static blk_status_t nvme_tcp_queue_rq(st static int nvme_tcp_map_queues(struct blk_mq_tag_set *set) { struct nvme_tcp_ctrl *ctrl = set->driver_data; + struct nvmf_ctrl_options *opts = ctrl->ctrl.opts;
- set->map[HCTX_TYPE_DEFAULT].queue_offset = 0; - set->map[HCTX_TYPE_READ].nr_queues = ctrl->ctrl.opts->nr_io_queues; - if (ctrl->ctrl.opts->nr_write_queues) { + if (opts->nr_write_queues && ctrl->io_queues[HCTX_TYPE_READ]) { /* separate read/write queues */ set->map[HCTX_TYPE_DEFAULT].nr_queues = - ctrl->ctrl.opts->nr_write_queues; + ctrl->io_queues[HCTX_TYPE_DEFAULT]; + set->map[HCTX_TYPE_DEFAULT].queue_offset = 0; + set->map[HCTX_TYPE_READ].nr_queues = + ctrl->io_queues[HCTX_TYPE_READ]; set->map[HCTX_TYPE_READ].queue_offset = - ctrl->ctrl.opts->nr_write_queues; + ctrl->io_queues[HCTX_TYPE_DEFAULT]; } else { - /* mixed read/write queues */ + /* shared read/write queues */ set->map[HCTX_TYPE_DEFAULT].nr_queues = - ctrl->ctrl.opts->nr_io_queues; + ctrl->io_queues[HCTX_TYPE_DEFAULT]; + set->map[HCTX_TYPE_DEFAULT].queue_offset = 0; + set->map[HCTX_TYPE_READ].nr_queues = + ctrl->io_queues[HCTX_TYPE_DEFAULT]; set->map[HCTX_TYPE_READ].queue_offset = 0; } blk_mq_map_queues(&set->map[HCTX_TYPE_DEFAULT]); blk_mq_map_queues(&set->map[HCTX_TYPE_READ]); + + dev_info(ctrl->ctrl.device, + "mapped %d/%d default/read queues.\n", + ctrl->io_queues[HCTX_TYPE_DEFAULT], + ctrl->io_queues[HCTX_TYPE_READ]); + return 0; }
From: Andrea Arcangeli aarcange@redhat.com
commit 59ea6d06cfa9247b586a695c21f94afa7183af74 upstream.
When fixing the race conditions between the coredump and the mmap_sem holders outside the context of the process, we focused on mmget_not_zero()/get_task_mm() callers in 04f5866e41fb70 ("coredump: fix race condition between mmget_not_zero()/get_task_mm() and core dumping"), but those aren't the only cases where the mmap_sem can be taken outside of the context of the process as Michal Hocko noticed while backporting that commit to older -stable kernels.
If mmgrab() is called in the context of the process, but then the mm_count reference is transferred outside the context of the process, that can also be a problem if the mmap_sem has to be taken for writing through that mm_count reference.
khugepaged registration calls mmgrab() in the context of the process, but the mmap_sem for writing is taken later in the context of the khugepaged kernel thread.
collapse_huge_page() after taking the mmap_sem for writing doesn't modify any vma, so it's not obvious that it could cause a problem to the coredump, but it happens to modify the pmd in a way that breaks an invariant that pmd_trans_huge_lock() relies upon. collapse_huge_page() needs the mmap_sem for writing just to block concurrent page faults that call pmd_trans_huge_lock().
Specifically the invariant that "!pmd_trans_huge()" cannot become a "pmd_trans_huge()" doesn't hold while collapse_huge_page() runs.
The coredump will call __get_user_pages() without mmap_sem for reading, which eventually can invoke a lockless page fault which will need a functional pmd_trans_huge_lock().
So collapse_huge_page() needs to use mmget_still_valid() to check it's not running concurrently with the coredump... as long as the coredump can invoke page faults without holding the mmap_sem for reading.
This has "Fixes: khugepaged" to facilitate backporting, but in my view it's more a bug in the coredump code that will eventually have to be rewritten to stop invoking page faults without the mmap_sem for reading. So the long term plan is still to drop all mmget_still_valid().
Link: http://lkml.kernel.org/r/20190607161558.32104-1-aarcange@redhat.com Fixes: ba76149f47d8 ("thp: khugepaged") Signed-off-by: Andrea Arcangeli aarcange@redhat.com Reported-by: Michal Hocko mhocko@suse.com Acked-by: Michal Hocko mhocko@suse.com Acked-by: Kirill A. Shutemov kirill.shutemov@linux.intel.com Cc: Oleg Nesterov oleg@redhat.com Cc: Jann Horn jannh@google.com Cc: Hugh Dickins hughd@google.com Cc: Mike Rapoport rppt@linux.vnet.ibm.com Cc: Mike Kravetz mike.kravetz@oracle.com Cc: Peter Xu peterx@redhat.com Cc: Jason Gunthorpe jgg@mellanox.com Cc: stable@vger.kernel.org Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- include/linux/sched/mm.h | 4 ++++ mm/khugepaged.c | 3 +++ 2 files changed, 7 insertions(+)
--- a/include/linux/sched/mm.h +++ b/include/linux/sched/mm.h @@ -54,6 +54,10 @@ static inline void mmdrop(struct mm_stru * followed by taking the mmap_sem for writing before modifying the * vmas or anything the coredump pretends not to change from under it. * + * It also has to be called when mmgrab() is used in the context of + * the process, but then the mm_count refcount is transferred outside + * the context of the process to run down_write() on that pinned mm. + * * NOTE: find_extend_vma() called from GUP context is the only place * that can modify the "mm" (notably the vm_start/end) under mmap_sem * for reading and outside the context of the process, so it is also --- a/mm/khugepaged.c +++ b/mm/khugepaged.c @@ -1004,6 +1004,9 @@ static void collapse_huge_page(struct mm * handled by the anon_vma lock + PG_lock. */ down_write(&mm->mmap_sem); + result = SCAN_ANY_PROCESS; + if (!mmget_still_valid(mm)) + goto out; result = hugepage_vma_revalidate(mm, address, &vma); if (result) goto out;
On Thu, Jun 20, 2019 at 07:56:27PM +0200, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 5.1.13 release. There are 98 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Sat 22 Jun 2019 05:42:15 PM UTC. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.1.13-rc1.... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.1.y and the diffstat can be found below.
thanks,
greg k-h
Compiled and booted. No regressions on x86_64.
THX
Jiunn
On Thu, Jun 20, 2019 at 06:48:40PM -0500, Jiunn Chang wrote:
On Thu, Jun 20, 2019 at 07:56:27PM +0200, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 5.1.13 release. There are 98 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Sat 22 Jun 2019 05:42:15 PM UTC. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.1.13-rc1.... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.1.y and the diffstat can be found below.
thanks,
greg k-h
Compiled and booted. No regressions on x86_64.
thanks for testing!
stable-rc/linux-5.1.y boot: 130 boots: 1 failed, 127 passed with 2 offline (v5.1.12-99-g10bbe23e94c5)
Full Boot Summary: https://kernelci.org/boot/all/job/stable-rc/branch/linux-5.1.y/kernel/v5.1.1... Full Build Summary: https://kernelci.org/build/stable-rc/branch/linux-5.1.y/kernel/v5.1.12-99-g1...
Tree: stable-rc Branch: linux-5.1.y Git Describe: v5.1.12-99-g10bbe23e94c5 Git Commit: 10bbe23e94c5975292d0a3ff74893d1625c1e07c Git URL: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git Tested: 76 unique boards, 25 SoC families, 16 builds out of 209
Boot Failure Detected:
arm: multi_v7_defconfig: gcc-8: bcm4708-smartrg-sr400ac: 1 failed lab
Offline Platforms:
arm:
tegra_defconfig: gcc-8 tegra30-beaver: 1 offline lab
multi_v7_defconfig: gcc-8 tegra30-beaver: 1 offline lab
--- For more info write to info@kernelci.org
On Thu, 20 Jun 2019 at 23:44, Greg Kroah-Hartman gregkh@linuxfoundation.org wrote:
This is the start of the stable review cycle for the 5.1.13 release. There are 98 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Sat 22 Jun 2019 05:42:15 PM UTC. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.1.13-rc1.... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.1.y and the diffstat can be found below.
thanks,
greg k-h
Results from Linaro’s test farm. No regressions on arm64, arm, x86_64, and i386.
Summary ------------------------------------------------------------------------
kernel: 5.1.13-rc1 git repo: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git git branch: linux-5.1.y git commit: 10bbe23e94c5975292d0a3ff74893d1625c1e07c git describe: v5.1.12-99-g10bbe23e94c5 Test details: https://qa-reports.linaro.org/lkft/linux-stable-rc-5.1-oe/build/v5.1.12-99-g...
No regressions (compared to build v5.1.12)
No fixes (compared to build v5.1.12)
Ran 24447 total tests in the following environments and test suites.
Environments -------------- - dragonboard-410c - hi6220-hikey - i386 - juno-r2 - qemu_arm - qemu_arm64 - qemu_i386 - qemu_x86_64 - x15 - x86
Test Suites ----------- * build * install-android-platform-tools-r2600 * kselftest * libgpiod * libhugetlbfs * ltp-cap_bounds-tests * ltp-commands-tests * ltp-containers-tests * ltp-cpuhotplug-tests * ltp-cve-tests * ltp-dio-tests * ltp-fcntl-locktests-tests * ltp-filecaps-tests * ltp-fs_bind-tests * ltp-fs_perms_simple-tests * ltp-fsx-tests * ltp-hugetlb-tests * ltp-io-tests * ltp-ipc-tests * ltp-math-tests * ltp-mm-tests * ltp-nptl-tests * ltp-pty-tests * ltp-sched-tests * ltp-securebits-tests * ltp-syscalls-tests * ltp-timers-tests * perf * spectre-meltdown-checker-test * v4l2-compliance * ltp-fs-tests * network-basic-tests * ltp-open-posix-tests * kvm-unit-tests * kselftest-vsyscall-mode-native * kselftest-vsyscall-mode-none
On Fri, Jun 21, 2019 at 09:25:54AM +0530, Naresh Kamboju wrote:
On Thu, 20 Jun 2019 at 23:44, Greg Kroah-Hartman gregkh@linuxfoundation.org wrote:
This is the start of the stable review cycle for the 5.1.13 release. There are 98 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Sat 22 Jun 2019 05:42:15 PM UTC. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.1.13-rc1.... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.1.y and the diffstat can be found below.
thanks,
greg k-h
Results from Linaro’s test farm. No regressions on arm64, arm, x86_64, and i386.
Wonderful, thanks for testing all of these and letting me know.
greg k-h
On 6/20/19 10:56 AM, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 5.1.13 release. There are 98 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Sat 22 Jun 2019 05:42:15 PM UTC. Anything received after that time might be too late.
Build results: total: 159 pass: 159 fail: 0 Qemu test results: total: 364 pass: 364 fail: 0
Guenter
On Fri, Jun 21, 2019 at 05:45:58PM -0700, Guenter Roeck wrote:
On 6/20/19 10:56 AM, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 5.1.13 release. There are 98 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Sat 22 Jun 2019 05:42:15 PM UTC. Anything received after that time might be too late.
Build results: total: 159 pass: 159 fail: 0 Qemu test results: total: 364 pass: 364 fail: 0
Wonderful! Thanks for testing all of these and letting me know.
greg k-h
linux-stable-mirror@lists.linaro.org