This is the start of the stable review cycle for the 4.9.181 release. There are 83 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Tue 11 Jun 2019 04:39:58 PM UTC. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.9.181-rc1... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-4.9.y and the diffstat can be found below.
thanks,
greg k-h
------------- Pseudo-Shortlog of commits:
Greg Kroah-Hartman gregkh@linuxfoundation.org Linux 4.9.181-rc1
Kirill Smelkov kirr@nexedi.com fuse: Add FOPEN_STREAM to use stream_open()
Kirill Smelkov kirr@nexedi.com fs: stream_open - opener for stream-like files so that read and write can run simultaneously without deadlock
Jiri Slaby jslaby@suse.cz TTY: serial_core, add ->install
Chris Wilson chris@chris-wilson.co.uk drm/i915: Fix I915_EXEC_RING_MASK
Christian König christian.koenig@amd.com drm/radeon: prefer lower reference dividers
Patrik Jakobsson patrik.r.jakobsson@gmail.com drm/gma500/cdv: Check vbt config bits when detecting lvds panels
Dan Carpenter dan.carpenter@oracle.com genwqe: Prevent an integer overflow in the ioctl
Greg Kroah-Hartman gregkh@linuxfoundation.org Revert "MIPS: perf: ath79: Fix perfcount IRQ assignment"
Paul Burton paul.burton@mips.com MIPS: pistachio: Build uImage.gz by default
Jiri Kosina jkosina@suse.cz x86/power: Fix 'nosmt' vs hibernation triple fault during resume
Miklos Szeredi mszeredi@redhat.com fuse: fallocate: fix return with locked inode
John David Anglin dave.anglin@bell.net parisc: Use implicit space register selection for loading the coherence index of I/O pdirs
Linus Torvalds torvalds@linux-foundation.org rcu: locking and unlocking need to always be at least barriers
Hangbin Liu liuhangbin@gmail.com Revert "fib_rules: return 0 directly if an exactly same rule exists when NLM_F_EXCL not supplied"
Greg Kroah-Hartman gregkh@linuxfoundation.org Revert "fib_rules: fix error in backport of e9919a24d302 ("fib_rules: return 0...")"
Olivier Matz olivier.matz@6wind.com ipv6: use READ_ONCE() for inet->hdrincl as in ipv4
Olivier Matz olivier.matz@6wind.com ipv6: fix EFAULT on sendto with icmpv6 and hdrincl
Paolo Abeni pabeni@redhat.com pktgen: do not sleep with the thread lock held.
Zhu Yanjun yanjun.zhu@oracle.com net: rds: fix memory leak in rds_ib_flush_mr_pool
Erez Alfasi ereza@mellanox.com net/mlx4_en: ethtool, Remove unsupported SFP EEPROM high pages query
David Ahern dsahern@gmail.com neighbor: Call __ipv4_neigh_lookup_noref in neigh_xmit
Vivien Didelot vivien.didelot@gmail.com ethtool: fix potential userspace buffer overflow
Nadav Amit namit@vmware.com media: uvcvideo: Fix uvc_alloc_entity() allocation alignment
Ard Biesheuvel ard.biesheuvel@linaro.org efi/libstub: Unify command line param parsing
Greg Kroah-Hartman gregkh@linuxfoundation.org Revert "x86/build: Move _etext to actual end of .text"
Linus Torvalds torvalds@linux-foundation.org mm: make page ref count overflow check tighter and more explicit
Linus Torvalds torvalds@linux-foundation.org mm: prevent get_user_pages() from overflowing page refcount
Punit Agrawal punit.agrawal@arm.com mm, gup: ensure real head page is ref-counted when using hugepages
Will Deacon will.deacon@arm.com mm, gup: remove broken VM_BUG_ON_PAGE compound check for hugepages
Matthew Wilcox willy@infradead.org fs: prevent page refcount overflow in pipe_buf_get
Todd Kjos tkjos@android.com binder: replace "%p" with "%pK"
Ben Hutchings ben.hutchings@codethink.co.uk binder: Replace "%p" with "%pK" for stable
Arend van Spriel arend.vanspriel@broadcom.com brcmfmac: add subtype check for event handling in data path
Arend van Spriel arend.vanspriel@broadcom.com brcmfmac: assure SSID length from firmware is limited
Arend Van Spriel arend.vanspriel@broadcom.com brcmfmac: add length checks in scheduled scan result handler
Thomas Hellstrom thellstrom@vmware.com drm/vmwgfx: Don't send drm sysfs hotplug events on initial master set
Kees Cook keescook@chromium.org gcc-plugins: Fix build failures under Darwin host
Roberto Bergantinos Corpas rbergant@redhat.com CIFS: cifs_read_allocate_pages: don't iterate through whole page array on ENOMEM
Dan Carpenter dan.carpenter@oracle.com staging: vc04_services: prevent integer overflow in create_pagelist()
Jonathan Corbet corbet@lwn.net docs: Fix conf.py for Sphinx 2.0
Zhenliang Wei weizhenliang@huawei.com kernel/signal.c: trace_signal_deliver when signal_group_exit
Jiri Slaby jslaby@suse.cz memcg: make it work on sparse non-0-node systems
Joe Burmeister joe.burmeister@devtank.co.uk tty: max310x: Fix external crystal register setup
Jorge Ramirez-Ortiz jorge.ramirez-ortiz@linaro.org tty: serial: msm_serial: Fix XON/XOFF
Lyude Paul lyude@redhat.com drm/nouveau/i2c: Disable i2c bus access after ->fini()
Kailang Yang kailang@realtek.com ALSA: hda/realtek - Set default power save node to 0
Ravi Bangoria ravi.bangoria@linux.ibm.com powerpc/perf: Fix MMCRA corruption by bhrb_filter
Filipe Manana fdmanana@suse.com Btrfs: fix race updating log root item during fsync
Steffen Maier maier@linux.ibm.com scsi: zfcp: fix to prevent port_remove with pure auto scan LUNs (only sdevs)
Steffen Maier maier@linux.ibm.com scsi: zfcp: fix missing zfcp_port reference put on -EBUSY from port_remove
Mauro Carvalho Chehab mchehab+samsung@kernel.org media: smsusb: better handle optional alignment
Alan Stern stern@rowland.harvard.edu media: usb: siano: Fix false-positive "uninitialized variable" warning
Alan Stern stern@rowland.harvard.edu media: usb: siano: Fix general protection fault in smsusb
Oliver Neukum oneukum@suse.com USB: rio500: fix memory leak in close after disconnect
Oliver Neukum oneukum@suse.com USB: rio500: refuse more than one device at a time
Maximilian Luz luzmaximilian@gmail.com USB: Add LPM quirk for Surface Dock GigE adapter
Oliver Neukum oneukum@suse.com USB: sisusbvga: fix oops in error path of sisusb_probe
Alan Stern stern@rowland.harvard.edu USB: Fix slab-out-of-bounds write in usb_get_bos_descriptor
Shuah Khan skhan@linuxfoundation.org usbip: usbip_host: fix stub_dev lock context imbalance regression
Shuah Khan skhan@linuxfoundation.org usbip: usbip_host: fix BUG: sleeping function called from invalid context
Carsten Schmid carsten_schmid@mentor.com usb: xhci: avoid null pointer deref when bos field is NULL
Andrey Smirnov andrew.smirnov@gmail.com xhci: Convert xhci_handshake() to use readl_poll_timeout_atomic()
Fabio Estevam festevam@gmail.com xhci: Use %zu for printing size_t type
Henry Lin henryl@nvidia.com xhci: update bounce buffer with correct sg num
Rasmus Villemoes linux@rasmusvillemoes.dk include/linux/bitops.h: sanitize rotate primitives
James Clarke jrtc27@jrtc27.com sparc64: Fix regression in non-hypervisor TLB flush xcall
Junwei Hu hujunwei4@huawei.com tipc: fix modprobe tipc failed after switch order of device registration
David S. Miller davem@davemloft.net Revert "tipc: fix modprobe tipc failed after switch order of device registration"
Konrad Rzeszutek Wilk konrad.wilk@oracle.com xen/pciback: Don't disable PCI_COMMAND on PCI device reset.
Daniel Axtens dja@axtens.net crypto: vmx - ghash: do nosimd fallback manually
Antoine Tenart antoine.tenart@bootlin.com net: mvpp2: fix bad MVPP2_TXQ_SCHED_TOKEN_CNTR_REG queue value
Jisheng Zhang Jisheng.Zhang@synaptics.com net: mvneta: Fix err code path of probe
Rasmus Villemoes rasmus.villemoes@prevas.dk net: dsa: mv88e6xxx: fix handling of upper half of STATS_TYPE_PORT
Eric Dumazet edumazet@google.com ipv4/igmp: fix build error if !CONFIG_IP_MULTICAST
Eric Dumazet edumazet@google.com ipv4/igmp: fix another memory leak in igmpv3_del_delrec()
Michael Chan michael.chan@broadcom.com bnxt_en: Fix aggregation buffer leak under OOM condition.
Chris Packham chris.packham@alliedtelesis.co.nz tipc: Avoid copying bytes beyond the supplied data
Kloetzke Jan Jan.Kloetzke@preh.de usbnet: fix kernel crash after disconnect
Jisheng Zhang Jisheng.Zhang@synaptics.com net: stmmac: fix reset gpio free missing
Eric Dumazet edumazet@google.com net-gro: fix use-after-free read in napi_gro_frags()
Andy Duan fugang.duan@nxp.com net: fec: fix the clk mismatch in failed_reset path
Eric Dumazet edumazet@google.com llc: fix skb leak in llc_build_and_send_ui_pkt()
Mike Manning mmanning@vyatta.att-mail.com ipv6: Consider sk_bound_dev_if when binding a raw socket to an address
-------------
Diffstat:
Documentation/conf.py | 2 +- Makefile | 4 +- arch/mips/ath79/setup.c | 6 + arch/mips/pistachio/Platform | 1 + arch/powerpc/perf/core-book3s.c | 6 +- arch/powerpc/perf/power8-pmu.c | 3 + arch/powerpc/perf/power9-pmu.c | 3 + arch/sparc/mm/ultra.S | 4 +- arch/x86/kernel/vmlinux.lds.S | 6 +- arch/x86/power/cpu.c | 10 + arch/x86/power/hibernate_64.c | 33 ++ drivers/android/binder.c | 36 +- drivers/crypto/vmx/ghash.c | 213 +++++------- drivers/firmware/efi/libstub/arm-stub.c | 23 +- drivers/firmware/efi/libstub/arm64-stub.c | 4 +- drivers/firmware/efi/libstub/efi-stub-helper.c | 19 +- drivers/firmware/efi/libstub/efistub.h | 2 + drivers/gpu/drm/gma500/cdv_intel_lvds.c | 3 + drivers/gpu/drm/gma500/intel_bios.c | 3 + drivers/gpu/drm/gma500/psb_drv.h | 1 + drivers/gpu/drm/nouveau/include/nvkm/subdev/i2c.h | 2 + drivers/gpu/drm/nouveau/nvkm/subdev/i2c/aux.c | 26 +- drivers/gpu/drm/nouveau/nvkm/subdev/i2c/aux.h | 2 + drivers/gpu/drm/nouveau/nvkm/subdev/i2c/base.c | 15 + drivers/gpu/drm/nouveau/nvkm/subdev/i2c/bus.c | 21 +- drivers/gpu/drm/nouveau/nvkm/subdev/i2c/bus.h | 1 + drivers/gpu/drm/radeon/radeon_display.c | 4 +- drivers/gpu/drm/vmwgfx/vmwgfx_drv.c | 8 +- drivers/irqchip/irq-ath79-misc.c | 11 - drivers/media/usb/siano/smsusb.c | 33 +- drivers/media/usb/uvc/uvc_driver.c | 2 +- drivers/misc/genwqe/card_dev.c | 2 + drivers/misc/genwqe/card_utils.c | 4 + drivers/net/dsa/mv88e6xxx/chip.c | 2 +- drivers/net/ethernet/broadcom/bnxt/bnxt.c | 2 + drivers/net/ethernet/freescale/fec_main.c | 2 +- drivers/net/ethernet/marvell/mvneta.c | 4 +- drivers/net/ethernet/marvell/mvpp2.c | 10 +- drivers/net/ethernet/mellanox/mlx4/en_ethtool.c | 4 +- drivers/net/ethernet/mellanox/mlx4/port.c | 5 - drivers/net/ethernet/stmicro/stmmac/stmmac_mdio.c | 3 +- drivers/net/usb/usbnet.c | 6 + .../broadcom/brcm80211/brcmfmac/cfg80211.c | 16 +- .../wireless/broadcom/brcm80211/brcmfmac/core.c | 5 +- .../wireless/broadcom/brcm80211/brcmfmac/fweh.h | 16 +- .../wireless/broadcom/brcm80211/brcmfmac/msgbuf.c | 2 +- drivers/parisc/ccio-dma.c | 4 +- drivers/parisc/sba_iommu.c | 3 +- drivers/s390/scsi/zfcp_ext.h | 1 + drivers/s390/scsi/zfcp_scsi.c | 9 + drivers/s390/scsi/zfcp_sysfs.c | 55 +++- drivers/s390/scsi/zfcp_unit.c | 8 +- .../interface/vchiq_arm/vchiq_2835_arm.c | 9 + drivers/tty/serial/max310x.c | 2 +- drivers/tty/serial/msm_serial.c | 5 +- drivers/tty/serial/serial_core.c | 24 +- drivers/usb/core/config.c | 4 +- drivers/usb/core/quirks.c | 3 + drivers/usb/host/xhci-ring.c | 17 +- drivers/usb/host/xhci.c | 24 +- drivers/usb/misc/rio500.c | 41 ++- drivers/usb/misc/sisusbvga/sisusb.c | 15 +- drivers/usb/usbip/stub_dev.c | 75 +++-- drivers/xen/xen-pciback/pciback_ops.c | 2 - drivers/xen/xenbus/xenbus_dev_frontend.c | 2 +- fs/btrfs/tree-log.c | 8 +- fs/cifs/file.c | 4 +- fs/fuse/dev.c | 12 +- fs/fuse/file.c | 6 +- fs/open.c | 18 + fs/pipe.c | 4 +- fs/read_write.c | 5 +- fs/splice.c | 12 +- include/linux/bitops.h | 16 +- include/linux/cpu.h | 4 + include/linux/efi.h | 2 +- include/linux/fs.h | 4 + include/linux/list_lru.h | 1 + include/linux/mm.h | 6 +- include/linux/pipe_fs_i.h | 10 +- include/linux/rcupdate.h | 6 +- include/uapi/drm/i915_drm.h | 2 +- include/uapi/linux/fuse.h | 2 + include/uapi/linux/tipc_config.h | 10 +- kernel/cpu.c | 4 +- kernel/power/hibernate.c | 9 + kernel/signal.c | 2 + kernel/trace/trace.c | 6 +- mm/gup.c | 54 ++- mm/hugetlb.c | 16 +- mm/list_lru.c | 8 +- net/core/dev.c | 2 +- net/core/ethtool.c | 5 +- net/core/fib_rules.c | 7 +- net/core/neighbour.c | 9 +- net/core/pktgen.c | 11 + net/ipv4/igmp.c | 47 ++- net/ipv6/raw.c | 27 +- net/llc/llc_output.c | 2 + net/rds/ib_rdma.c | 10 +- net/tipc/core.c | 32 +- net/tipc/subscr.c | 14 +- net/tipc/subscr.h | 5 +- scripts/coccinelle/api/stream_open.cocci | 363 +++++++++++++++++++++ scripts/gcc-plugins/gcc-common.h | 4 + sound/pci/hda/patch_realtek.c | 2 +- 106 files changed, 1223 insertions(+), 441 deletions(-)
From: Mike Manning mmanning@vyatta.att-mail.com
[ Upstream commit 72f7cfab6f93a8ea825fab8ccfb016d064269f7f ]
IPv6 does not consider if the socket is bound to a device when binding to an address. The result is that a socket can be bound to eth0 and then bound to the address of eth1. If the device is a VRF, the result is that a socket can only be bound to an address in the default VRF.
Resolve by considering the device if sk_bound_dev_if is set.
Signed-off-by: Mike Manning mmanning@vyatta.att-mail.com Reviewed-by: David Ahern dsahern@gmail.com Tested-by: David Ahern dsahern@gmail.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/ipv6/raw.c | 2 ++ 1 file changed, 2 insertions(+)
--- a/net/ipv6/raw.c +++ b/net/ipv6/raw.c @@ -283,7 +283,9 @@ static int rawv6_bind(struct sock *sk, s /* Binding to link-local address requires an interface */ if (!sk->sk_bound_dev_if) goto out_unlock; + }
+ if (sk->sk_bound_dev_if) { err = -ENODEV; dev = dev_get_by_index_rcu(sock_net(sk), sk->sk_bound_dev_if);
From: Eric Dumazet edumazet@google.com
[ Upstream commit 8fb44d60d4142cd2a440620cd291d346e23c131e ]
If llc_mac_hdr_init() returns an error, we must drop the skb since no llc_build_and_send_ui_pkt() caller will take care of this.
BUG: memory leak unreferenced object 0xffff8881202b6800 (size 2048): comm "syz-executor907", pid 7074, jiffies 4294943781 (age 8.590s) hex dump (first 32 bytes): 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ 1a 00 07 40 00 00 00 00 00 00 00 00 00 00 00 00 ...@............ backtrace: [<00000000e25b5abe>] kmemleak_alloc_recursive include/linux/kmemleak.h:55 [inline] [<00000000e25b5abe>] slab_post_alloc_hook mm/slab.h:439 [inline] [<00000000e25b5abe>] slab_alloc mm/slab.c:3326 [inline] [<00000000e25b5abe>] __do_kmalloc mm/slab.c:3658 [inline] [<00000000e25b5abe>] __kmalloc+0x161/0x2c0 mm/slab.c:3669 [<00000000a1ae188a>] kmalloc include/linux/slab.h:552 [inline] [<00000000a1ae188a>] sk_prot_alloc+0xd6/0x170 net/core/sock.c:1608 [<00000000ded25bbe>] sk_alloc+0x35/0x2f0 net/core/sock.c:1662 [<000000002ecae075>] llc_sk_alloc+0x35/0x170 net/llc/llc_conn.c:950 [<00000000551f7c47>] llc_ui_create+0x7b/0x140 net/llc/af_llc.c:173 [<0000000029027f0e>] __sock_create+0x164/0x250 net/socket.c:1430 [<000000008bdec225>] sock_create net/socket.c:1481 [inline] [<000000008bdec225>] __sys_socket+0x69/0x110 net/socket.c:1523 [<00000000b6439228>] __do_sys_socket net/socket.c:1532 [inline] [<00000000b6439228>] __se_sys_socket net/socket.c:1530 [inline] [<00000000b6439228>] __x64_sys_socket+0x1e/0x30 net/socket.c:1530 [<00000000cec820c1>] do_syscall_64+0x76/0x1a0 arch/x86/entry/common.c:301 [<000000000c32554f>] entry_SYSCALL_64_after_hwframe+0x44/0xa9
BUG: memory leak unreferenced object 0xffff88811d750d00 (size 224): comm "syz-executor907", pid 7074, jiffies 4294943781 (age 8.600s) hex dump (first 32 bytes): 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ 00 f0 0c 24 81 88 ff ff 00 68 2b 20 81 88 ff ff ...$.....h+ .... backtrace: [<0000000053026172>] kmemleak_alloc_recursive include/linux/kmemleak.h:55 [inline] [<0000000053026172>] slab_post_alloc_hook mm/slab.h:439 [inline] [<0000000053026172>] slab_alloc_node mm/slab.c:3269 [inline] [<0000000053026172>] kmem_cache_alloc_node+0x153/0x2a0 mm/slab.c:3579 [<00000000fa8f3c30>] __alloc_skb+0x6e/0x210 net/core/skbuff.c:198 [<00000000d96fdafb>] alloc_skb include/linux/skbuff.h:1058 [inline] [<00000000d96fdafb>] alloc_skb_with_frags+0x5f/0x250 net/core/skbuff.c:5327 [<000000000a34a2e7>] sock_alloc_send_pskb+0x269/0x2a0 net/core/sock.c:2225 [<00000000ee39999b>] sock_alloc_send_skb+0x32/0x40 net/core/sock.c:2242 [<00000000e034d810>] llc_ui_sendmsg+0x10a/0x540 net/llc/af_llc.c:933 [<00000000c0bc8445>] sock_sendmsg_nosec net/socket.c:652 [inline] [<00000000c0bc8445>] sock_sendmsg+0x54/0x70 net/socket.c:671 [<000000003b687167>] __sys_sendto+0x148/0x1f0 net/socket.c:1964 [<00000000922d78d9>] __do_sys_sendto net/socket.c:1976 [inline] [<00000000922d78d9>] __se_sys_sendto net/socket.c:1972 [inline] [<00000000922d78d9>] __x64_sys_sendto+0x2a/0x30 net/socket.c:1972 [<00000000cec820c1>] do_syscall_64+0x76/0x1a0 arch/x86/entry/common.c:301 [<000000000c32554f>] entry_SYSCALL_64_after_hwframe+0x44/0xa9
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Eric Dumazet edumazet@google.com Reported-by: syzbot syzkaller@googlegroups.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/llc/llc_output.c | 2 ++ 1 file changed, 2 insertions(+)
--- a/net/llc/llc_output.c +++ b/net/llc/llc_output.c @@ -72,6 +72,8 @@ int llc_build_and_send_ui_pkt(struct llc rc = llc_mac_hdr_init(skb, skb->dev->dev_addr, dmac); if (likely(!rc)) rc = dev_queue_xmit(skb); + else + kfree_skb(skb); return rc; }
From: Andy Duan fugang.duan@nxp.com
[ Upstream commit ce8d24f9a5965a58c588f9342689702a1024433c ]
Fix the clk mismatch in the error path "failed_reset" because below error path will disable clk_ahb and clk_ipg directly, it should use pm_runtime_put_noidle() instead of pm_runtime_put() to avoid to call runtime resume callback.
Reported-by: Baruch Siach baruch@tkos.co.il Signed-off-by: Fugang Duan fugang.duan@nxp.com Tested-by: Baruch Siach baruch@tkos.co.il Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/ethernet/freescale/fec_main.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/net/ethernet/freescale/fec_main.c +++ b/drivers/net/ethernet/freescale/fec_main.c @@ -3508,7 +3508,7 @@ failed_init: if (fep->reg_phy) regulator_disable(fep->reg_phy); failed_reset: - pm_runtime_put(&pdev->dev); + pm_runtime_put_noidle(&pdev->dev); pm_runtime_disable(&pdev->dev); failed_regulator: failed_clk_ipg:
From: Eric Dumazet edumazet@google.com
[ Upstream commit a4270d6795b0580287453ea55974d948393e66ef ]
If a network driver provides to napi_gro_frags() an skb with a page fragment of exactly 14 bytes, the call to gro_pull_from_frag0() will 'consume' the fragment by calling skb_frag_unref(skb, 0), and the page might be freed and reused.
Reading eth->h_proto at the end of napi_frags_skb() might read mangled data, or crash under specific debugging features.
BUG: KASAN: use-after-free in napi_frags_skb net/core/dev.c:5833 [inline] BUG: KASAN: use-after-free in napi_gro_frags+0xc6f/0xd10 net/core/dev.c:5841 Read of size 2 at addr ffff88809366840c by task syz-executor599/8957
CPU: 1 PID: 8957 Comm: syz-executor599 Not tainted 5.2.0-rc1+ #32 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Call Trace: __dump_stack lib/dump_stack.c:77 [inline] dump_stack+0x172/0x1f0 lib/dump_stack.c:113 print_address_description.cold+0x7c/0x20d mm/kasan/report.c:188 __kasan_report.cold+0x1b/0x40 mm/kasan/report.c:317 kasan_report+0x12/0x20 mm/kasan/common.c:614 __asan_report_load_n_noabort+0xf/0x20 mm/kasan/generic_report.c:142 napi_frags_skb net/core/dev.c:5833 [inline] napi_gro_frags+0xc6f/0xd10 net/core/dev.c:5841 tun_get_user+0x2f3c/0x3ff0 drivers/net/tun.c:1991 tun_chr_write_iter+0xbd/0x156 drivers/net/tun.c:2037 call_write_iter include/linux/fs.h:1872 [inline] do_iter_readv_writev+0x5f8/0x8f0 fs/read_write.c:693 do_iter_write fs/read_write.c:970 [inline] do_iter_write+0x184/0x610 fs/read_write.c:951 vfs_writev+0x1b3/0x2f0 fs/read_write.c:1015 do_writev+0x15b/0x330 fs/read_write.c:1058
Fixes: a50e233c50db ("net-gro: restore frag0 optimization") Signed-off-by: Eric Dumazet edumazet@google.com Reported-by: syzbot syzkaller@googlegroups.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/core/dev.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/net/core/dev.c +++ b/net/core/dev.c @@ -4828,7 +4828,6 @@ static struct sk_buff *napi_frags_skb(st skb_reset_mac_header(skb); skb_gro_reset_offset(skb);
- eth = skb_gro_header_fast(skb, 0); if (unlikely(skb_gro_header_hard(skb, hlen))) { eth = skb_gro_header_slow(skb, hlen, 0); if (unlikely(!eth)) { @@ -4838,6 +4837,7 @@ static struct sk_buff *napi_frags_skb(st return NULL; } } else { + eth = (const struct ethhdr *)skb->data; gro_pull_from_frag0(skb, hlen); NAPI_GRO_CB(skb)->frag0 += hlen; NAPI_GRO_CB(skb)->frag0_len -= hlen;
From: Jisheng Zhang Jisheng.Zhang@synaptics.com
[ Upstream commit 49ce881c0d4c4a7a35358d9dccd5f26d0e56fc61 ]
Commit 984203ceff27 ("net: stmmac: mdio: remove reset gpio free") removed the reset gpio free, when the driver is unbinded or rmmod, we miss the gpio free.
This patch uses managed API to request the reset gpio, so that the gpio could be freed properly.
Fixes: 984203ceff27 ("net: stmmac: mdio: remove reset gpio free") Signed-off-by: Jisheng Zhang Jisheng.Zhang@synaptics.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/ethernet/stmicro/stmmac/stmmac_mdio.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
--- a/drivers/net/ethernet/stmicro/stmmac/stmmac_mdio.c +++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_mdio.c @@ -240,7 +240,8 @@ int stmmac_mdio_reset(struct mii_bus *bu of_property_read_u32_array(np, "snps,reset-delays-us", data->delays, 3);
- if (gpio_request(data->reset_gpio, "mdio-reset")) + if (devm_gpio_request(priv->device, data->reset_gpio, + "mdio-reset")) return 0; }
From: Kloetzke Jan Jan.Kloetzke@preh.de
[ Upstream commit ad70411a978d1e6e97b1e341a7bde9a79af0c93d ]
When disconnecting cdc_ncm the kernel sporadically crashes shortly after the disconnect:
[ 57.868812] Unable to handle kernel NULL pointer dereference at virtual address 00000000 ... [ 58.006653] PC is at 0x0 [ 58.009202] LR is at call_timer_fn+0xec/0x1b4 [ 58.013567] pc : [<0000000000000000>] lr : [<ffffff80080f5130>] pstate: 00000145 [ 58.020976] sp : ffffff8008003da0 [ 58.024295] x29: ffffff8008003da0 x28: 0000000000000001 [ 58.029618] x27: 000000000000000a x26: 0000000000000100 [ 58.034941] x25: 0000000000000000 x24: ffffff8008003e68 [ 58.040263] x23: 0000000000000000 x22: 0000000000000000 [ 58.045587] x21: 0000000000000000 x20: ffffffc68fac1808 [ 58.050910] x19: 0000000000000100 x18: 0000000000000000 [ 58.056232] x17: 0000007f885aff8c x16: 0000007f883a9f10 [ 58.061556] x15: 0000000000000001 x14: 000000000000006e [ 58.066878] x13: 0000000000000000 x12: 00000000000000ba [ 58.072201] x11: ffffffc69ff1db30 x10: 0000000000000020 [ 58.077524] x9 : 8000100008001000 x8 : 0000000000000001 [ 58.082847] x7 : 0000000000000800 x6 : ffffff8008003e70 [ 58.088169] x5 : ffffffc69ff17a28 x4 : 00000000ffff138b [ 58.093492] x3 : 0000000000000000 x2 : 0000000000000000 [ 58.098814] x1 : 0000000000000000 x0 : 0000000000000000 ... [ 58.205800] [< (null)>] (null) [ 58.210521] [<ffffff80080f5298>] expire_timers+0xa0/0x14c [ 58.215937] [<ffffff80080f542c>] run_timer_softirq+0xe8/0x128 [ 58.221702] [<ffffff8008081120>] __do_softirq+0x298/0x348 [ 58.227118] [<ffffff80080a6304>] irq_exit+0x74/0xbc [ 58.232009] [<ffffff80080e17dc>] __handle_domain_irq+0x78/0xac [ 58.237857] [<ffffff8008080cf4>] gic_handle_irq+0x80/0xac ...
The crash happens roughly 125..130ms after the disconnect. This correlates with the 'delay' timer that is started on certain USB tx/rx errors in the URB completion handler.
The problem is a race of usbnet_stop() with usbnet_start_xmit(). In usbnet_stop() we call usbnet_terminate_urbs() to cancel all URBs in flight. This only makes sense if no new URBs are submitted concurrently, though. But the usbnet_start_xmit() can run at the same time on another CPU which almost unconditionally submits an URB. The error callback of the new URB will then schedule the timer after it was already stopped.
The fix adds a check if the tx queue is stopped after the tx list lock has been taken. This should reliably prevent the submission of new URBs while usbnet_terminate_urbs() does its job. The same thing is done on the rx side even though it might be safe due to other flags that are checked there.
Signed-off-by: Jan Klötzke Jan.Kloetzke@preh.de Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/usb/usbnet.c | 6 ++++++ 1 file changed, 6 insertions(+)
--- a/drivers/net/usb/usbnet.c +++ b/drivers/net/usb/usbnet.c @@ -508,6 +508,7 @@ static int rx_submit (struct usbnet *dev
if (netif_running (dev->net) && netif_device_present (dev->net) && + test_bit(EVENT_DEV_OPEN, &dev->flags) && !test_bit (EVENT_RX_HALT, &dev->flags) && !test_bit (EVENT_DEV_ASLEEP, &dev->flags)) { switch (retval = usb_submit_urb (urb, GFP_ATOMIC)) { @@ -1394,6 +1395,11 @@ netdev_tx_t usbnet_start_xmit (struct sk spin_unlock_irqrestore(&dev->txq.lock, flags); goto drop; } + if (netif_queue_stopped(net)) { + usb_autopm_put_interface_async(dev->intf); + spin_unlock_irqrestore(&dev->txq.lock, flags); + goto drop; + }
#ifdef CONFIG_PM /* if this triggers the device is still a sleep */
From: Chris Packham chris.packham@alliedtelesis.co.nz
TLV_SET is called with a data pointer and a len parameter that tells us how many bytes are pointed to by data. When invoking memcpy() we need to careful to only copy len bytes.
Previously we would copy TLV_LENGTH(len) bytes which would copy an extra 4 bytes past the end of the data pointer which newer GCC versions complain about.
In file included from test.c:17: In function 'TLV_SET', inlined from 'test' at test.c:186:5: /usr/include/linux/tipc_config.h:317:3: warning: 'memcpy' forming offset [33, 36] is out of the bounds [0, 32] of object 'bearer_name' with type 'char[32]' [-Warray-bounds] memcpy(TLV_DATA(tlv_ptr), data, tlv_len); ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ test.c: In function 'test': test.c::161:10: note: 'bearer_name' declared here char bearer_name[TIPC_MAX_BEARER_NAME]; ^~~~~~~~~~~
We still want to ensure any padding bytes at the end are initialised, do this with a explicit memset() rather than copy bytes past the end of data. Apply the same logic to TCM_SET.
Signed-off-by: Chris Packham chris.packham@alliedtelesis.co.nz Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- include/uapi/linux/tipc_config.h | 10 +++++++--- 1 file changed, 7 insertions(+), 3 deletions(-)
--- a/include/uapi/linux/tipc_config.h +++ b/include/uapi/linux/tipc_config.h @@ -301,8 +301,10 @@ static inline int TLV_SET(void *tlv, __u tlv_ptr = (struct tlv_desc *)tlv; tlv_ptr->tlv_type = htons(type); tlv_ptr->tlv_len = htons(tlv_len); - if (len && data) - memcpy(TLV_DATA(tlv_ptr), data, tlv_len); + if (len && data) { + memcpy(TLV_DATA(tlv_ptr), data, len); + memset(TLV_DATA(tlv_ptr) + len, 0, TLV_SPACE(len) - tlv_len); + } return TLV_SPACE(len); }
@@ -399,8 +401,10 @@ static inline int TCM_SET(void *msg, __u tcm_hdr->tcm_len = htonl(msg_len); tcm_hdr->tcm_type = htons(cmd); tcm_hdr->tcm_flags = htons(flags); - if (data_len && data) + if (data_len && data) { memcpy(TCM_DATA(msg), data, data_len); + memset(TCM_DATA(msg) + data_len, 0, TCM_SPACE(data_len) - msg_len); + } return TCM_SPACE(data_len); }
From: Michael Chan michael.chan@broadcom.com
[ Upstream commit 296d5b54163964b7ae536b8b57dfbd21d4e868e1 ]
For every RX packet, the driver replenishes all buffers used for that packet and puts them back into the RX ring and RX aggregation ring. In one code path where the RX packet has one RX buffer and one or more aggregation buffers, we missed recycling the aggregation buffer(s) if we are unable to allocate a new SKB buffer. This leads to the aggregation ring slowly running out of buffers over time. Fix it by properly recycling the aggregation buffers.
Fixes: c0c050c58d84 ("bnxt_en: New Broadcom ethernet driver.") Reported-by: Rakesh Hemnani rhemnani@fb.com Signed-off-by: Michael Chan michael.chan@broadcom.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/ethernet/broadcom/bnxt/bnxt.c | 2 ++ 1 file changed, 2 insertions(+)
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c +++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c @@ -1425,6 +1425,8 @@ static int bnxt_rx_pkt(struct bnxt *bp, skb = bnxt_copy_skb(bnapi, data, len, dma_addr); bnxt_reuse_rx_data(rxr, cons, data); if (!skb) { + if (agg_bufs) + bnxt_reuse_rx_agg_bufs(bnapi, cp_cons, agg_bufs); rc = -ENOMEM; goto next_rx; }
From: Eric Dumazet edumazet@google.com
[ Upstream commit 3580d04aa674383c42de7b635d28e52a1e5bc72c ]
syzbot reported memory leaks [1] that I have back tracked to a missing cleanup from igmpv3_del_delrec() when (im->sfmode != MCAST_INCLUDE)
Add ip_sf_list_clear_all() and kfree_pmc() helpers to explicitely handle the cleanups before freeing.
[1]
BUG: memory leak unreferenced object 0xffff888123e32b00 (size 64): comm "softirq", pid 0, jiffies 4294942968 (age 8.010s) hex dump (first 32 bytes): 00 00 00 00 00 00 00 00 e0 00 00 01 00 00 00 00 ................ 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ backtrace: [<000000006105011b>] kmemleak_alloc_recursive include/linux/kmemleak.h:55 [inline] [<000000006105011b>] slab_post_alloc_hook mm/slab.h:439 [inline] [<000000006105011b>] slab_alloc mm/slab.c:3326 [inline] [<000000006105011b>] kmem_cache_alloc_trace+0x13d/0x280 mm/slab.c:3553 [<000000004bba8073>] kmalloc include/linux/slab.h:547 [inline] [<000000004bba8073>] kzalloc include/linux/slab.h:742 [inline] [<000000004bba8073>] ip_mc_add1_src net/ipv4/igmp.c:1961 [inline] [<000000004bba8073>] ip_mc_add_src+0x36b/0x400 net/ipv4/igmp.c:2085 [<00000000a46a65a0>] ip_mc_msfilter+0x22d/0x310 net/ipv4/igmp.c:2475 [<000000005956ca89>] do_ip_setsockopt.isra.0+0x1795/0x1930 net/ipv4/ip_sockglue.c:957 [<00000000848e2d2f>] ip_setsockopt+0x3b/0xb0 net/ipv4/ip_sockglue.c:1246 [<00000000b9db185c>] udp_setsockopt+0x4e/0x90 net/ipv4/udp.c:2616 [<000000003028e438>] sock_common_setsockopt+0x38/0x50 net/core/sock.c:3130 [<0000000015b65589>] __sys_setsockopt+0x98/0x120 net/socket.c:2078 [<00000000ac198ef0>] __do_sys_setsockopt net/socket.c:2089 [inline] [<00000000ac198ef0>] __se_sys_setsockopt net/socket.c:2086 [inline] [<00000000ac198ef0>] __x64_sys_setsockopt+0x26/0x30 net/socket.c:2086 [<000000000a770437>] do_syscall_64+0x76/0x1a0 arch/x86/entry/common.c:301 [<00000000d3adb93b>] entry_SYSCALL_64_after_hwframe+0x44/0xa9
Fixes: 9c8bb163ae78 ("igmp, mld: Fix memory leak in igmpv3/mld_del_delrec()") Signed-off-by: Eric Dumazet edumazet@google.com Cc: Hangbin Liu liuhangbin@gmail.com Reported-by: syzbot syzkaller@googlegroups.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/ipv4/igmp.c | 47 ++++++++++++++++++++++++++++++----------------- 1 file changed, 30 insertions(+), 17 deletions(-)
--- a/net/ipv4/igmp.c +++ b/net/ipv4/igmp.c @@ -635,6 +635,24 @@ static void igmpv3_clear_zeros(struct ip } }
+static void ip_sf_list_clear_all(struct ip_sf_list *psf) +{ + struct ip_sf_list *next; + + while (psf) { + next = psf->sf_next; + kfree(psf); + psf = next; + } +} + +static void kfree_pmc(struct ip_mc_list *pmc) +{ + ip_sf_list_clear_all(pmc->sources); + ip_sf_list_clear_all(pmc->tomb); + kfree(pmc); +} + static void igmpv3_send_cr(struct in_device *in_dev) { struct ip_mc_list *pmc, *pmc_prev, *pmc_next; @@ -671,7 +689,7 @@ static void igmpv3_send_cr(struct in_dev else in_dev->mc_tomb = pmc_next; in_dev_put(pmc->interface); - kfree(pmc); + kfree_pmc(pmc); } else pmc_prev = pmc; } @@ -1195,12 +1213,16 @@ static void igmpv3_del_delrec(struct in_ im->crcount = in_dev->mr_qrv ?: net->ipv4.sysctl_igmp_qrv; if (im->sfmode == MCAST_INCLUDE) { im->tomb = pmc->tomb; + pmc->tomb = NULL; + im->sources = pmc->sources; + pmc->sources = NULL; + for (psf = im->sources; psf; psf = psf->sf_next) psf->sf_crcount = im->crcount; } in_dev_put(pmc->interface); - kfree(pmc); + kfree_pmc(pmc); } spin_unlock_bh(&im->lock); } @@ -1221,21 +1243,18 @@ static void igmpv3_clear_delrec(struct i nextpmc = pmc->next; ip_mc_clear_src(pmc); in_dev_put(pmc->interface); - kfree(pmc); + kfree_pmc(pmc); } /* clear dead sources, too */ rcu_read_lock(); for_each_pmc_rcu(in_dev, pmc) { - struct ip_sf_list *psf, *psf_next; + struct ip_sf_list *psf;
spin_lock_bh(&pmc->lock); psf = pmc->tomb; pmc->tomb = NULL; spin_unlock_bh(&pmc->lock); - for (; psf; psf = psf_next) { - psf_next = psf->sf_next; - kfree(psf); - } + ip_sf_list_clear_all(psf); } rcu_read_unlock(); } @@ -2099,7 +2118,7 @@ static int ip_mc_add_src(struct in_devic
static void ip_mc_clear_src(struct ip_mc_list *pmc) { - struct ip_sf_list *psf, *nextpsf, *tomb, *sources; + struct ip_sf_list *tomb, *sources;
spin_lock_bh(&pmc->lock); tomb = pmc->tomb; @@ -2111,14 +2130,8 @@ static void ip_mc_clear_src(struct ip_mc pmc->sfcount[MCAST_EXCLUDE] = 1; spin_unlock_bh(&pmc->lock);
- for (psf = tomb; psf; psf = nextpsf) { - nextpsf = psf->sf_next; - kfree(psf); - } - for (psf = sources; psf; psf = nextpsf) { - nextpsf = psf->sf_next; - kfree(psf); - } + ip_sf_list_clear_all(tomb); + ip_sf_list_clear_all(sources); }
/* Join a multicast group
From: Eric Dumazet edumazet@google.com
[ Upstream commit 903869bd10e6719b9df6718e785be7ec725df59f ]
ip_sf_list_clear_all() needs to be defined even if !CONFIG_IP_MULTICAST
Fixes: 3580d04aa674 ("ipv4/igmp: fix another memory leak in igmpv3_del_delrec()") Signed-off-by: Eric Dumazet edumazet@google.com Reported-by: kbuild test robot lkp@intel.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/ipv4/igmp.c | 22 +++++++++++----------- 1 file changed, 11 insertions(+), 11 deletions(-)
--- a/net/ipv4/igmp.c +++ b/net/ipv4/igmp.c @@ -190,6 +190,17 @@ static void ip_ma_put(struct ip_mc_list pmc != NULL; \ pmc = rtnl_dereference(pmc->next_rcu))
+static void ip_sf_list_clear_all(struct ip_sf_list *psf) +{ + struct ip_sf_list *next; + + while (psf) { + next = psf->sf_next; + kfree(psf); + psf = next; + } +} + #ifdef CONFIG_IP_MULTICAST
/* @@ -635,17 +646,6 @@ static void igmpv3_clear_zeros(struct ip } }
-static void ip_sf_list_clear_all(struct ip_sf_list *psf) -{ - struct ip_sf_list *next; - - while (psf) { - next = psf->sf_next; - kfree(psf); - psf = next; - } -} - static void kfree_pmc(struct ip_mc_list *pmc) { ip_sf_list_clear_all(pmc->sources);
From: Rasmus Villemoes rasmus.villemoes@prevas.dk
[ Upstream commit 84b3fd1fc9592d431e23b077e692fa4e3fd0f086 ]
Currently, the upper half of a 4-byte STATS_TYPE_PORT statistic ends up in bits 47:32 of the return value, instead of bits 31:16 as they should.
Fixes: 6e46e2d821bb ("net: dsa: mv88e6xxx: Fix u64 statistics") Signed-off-by: Rasmus Villemoes rasmus.villemoes@prevas.dk Reviewed-by: Vivien Didelot vivien.didelot@gmail.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/dsa/mv88e6xxx/chip.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/net/dsa/mv88e6xxx/chip.c +++ b/drivers/net/dsa/mv88e6xxx/chip.c @@ -789,7 +789,7 @@ static uint64_t _mv88e6xxx_get_ethtool_s err = mv88e6xxx_port_read(chip, port, s->reg + 1, ®); if (err) return UINT64_MAX; - high = reg; + low |= ((u32)reg) << 16; } break; case BANK0:
From: Jisheng Zhang Jisheng.Zhang@synaptics.com
[ Upstream commit d484e06e25ebb937d841dac02ac1fe76ec7d4ddd ]
Fix below issues in err code path of probe: 1. we don't need to unregister_netdev() because the netdev isn't registered. 2. when register_netdev() fails, we also need to destroy bm pool for HWBM case.
Fixes: dc35a10f68d3 ("net: mvneta: bm: add support for hardware buffer management") Signed-off-by: Jisheng Zhang Jisheng.Zhang@synaptics.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/ethernet/marvell/mvneta.c | 4 +--- 1 file changed, 1 insertion(+), 3 deletions(-)
--- a/drivers/net/ethernet/marvell/mvneta.c +++ b/drivers/net/ethernet/marvell/mvneta.c @@ -4162,7 +4162,7 @@ static int mvneta_probe(struct platform_ err = register_netdev(dev); if (err < 0) { dev_err(&pdev->dev, "failed to register\n"); - goto err_free_stats; + goto err_netdev; }
netdev_info(dev, "Using %s mac address %pM\n", mac_from, @@ -4181,13 +4181,11 @@ static int mvneta_probe(struct platform_ return 0;
err_netdev: - unregister_netdev(dev); if (pp->bm_priv) { mvneta_bm_pool_destroy(pp->bm_priv, pp->pool_long, 1 << pp->id); mvneta_bm_pool_destroy(pp->bm_priv, pp->pool_short, 1 << pp->id); } -err_free_stats: free_percpu(pp->stats); err_free_ports: free_percpu(pp->ports);
From: Antoine Tenart antoine.tenart@bootlin.com
[ Upstream commit 21808437214637952b61beaba6034d97880fbeb3 ]
MVPP2_TXQ_SCHED_TOKEN_CNTR_REG() expects the logical queue id but the current code is passing the global tx queue offset, so it ends up writing to unknown registers (between 0x8280 and 0x82fc, which seemed to be unused by the hardware). This fixes the issue by using the logical queue id instead.
Fixes: 3f518509dedc ("ethernet: Add new driver for Marvell Armada 375 network unit") Signed-off-by: Antoine Tenart antoine.tenart@bootlin.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/ethernet/marvell/mvpp2.c | 10 ++++------ 1 file changed, 4 insertions(+), 6 deletions(-)
--- a/drivers/net/ethernet/marvell/mvpp2.c +++ b/drivers/net/ethernet/marvell/mvpp2.c @@ -3938,7 +3938,7 @@ static inline void mvpp2_gmac_max_rx_siz /* Set defaults to the MVPP2 port */ static void mvpp2_defaults_set(struct mvpp2_port *port) { - int tx_port_num, val, queue, ptxq, lrxq; + int tx_port_num, val, queue, lrxq;
/* Configure port to loopback if needed */ if (port->flags & MVPP2_F_LOOPBACK) @@ -3958,11 +3958,9 @@ static void mvpp2_defaults_set(struct mv mvpp2_write(port->priv, MVPP2_TXP_SCHED_CMD_1_REG, 0);
/* Close bandwidth for all queues */ - for (queue = 0; queue < MVPP2_MAX_TXQ; queue++) { - ptxq = mvpp2_txq_phys(port->id, queue); + for (queue = 0; queue < MVPP2_MAX_TXQ; queue++) mvpp2_write(port->priv, - MVPP2_TXQ_SCHED_TOKEN_CNTR_REG(ptxq), 0); - } + MVPP2_TXQ_SCHED_TOKEN_CNTR_REG(queue), 0);
/* Set refill period to 1 usec, refill tokens * and bucket size to maximum @@ -4709,7 +4707,7 @@ static void mvpp2_txq_deinit(struct mvpp txq->descs_phys = 0;
/* Set minimum bandwidth for disabled TXQs */ - mvpp2_write(port->priv, MVPP2_TXQ_SCHED_TOKEN_CNTR_REG(txq->id), 0); + mvpp2_write(port->priv, MVPP2_TXQ_SCHED_TOKEN_CNTR_REG(txq->log_id), 0);
/* Set Tx descriptors queue starting address and size */ mvpp2_write(port->priv, MVPP2_TXQ_NUM_REG, txq->id);
From: Daniel Axtens dja@axtens.net
commit 357d065a44cdd77ed5ff35155a989f2a763e96ef upstream.
VMX ghash was using a fallback that did not support interleaving simd and nosimd operations, leading to failures in the extended test suite.
If I understood correctly, Eric's suggestion was to use the same data format that the generic code uses, allowing us to call into it with the same contexts. I wasn't able to get that to work - I think there's a very different key structure and data layout being used.
So instead steal the arm64 approach and perform the fallback operations directly if required.
Fixes: cc333cd68dfa ("crypto: vmx - Adding GHASH routines for VMX module") Cc: stable@vger.kernel.org # v4.1+ Reported-by: Eric Biggers ebiggers@google.com Signed-off-by: Daniel Axtens dja@axtens.net Acked-by: Ard Biesheuvel ard.biesheuvel@linaro.org Tested-by: Michael Ellerman mpe@ellerman.id.au Signed-off-by: Herbert Xu herbert@gondor.apana.org.au Signed-off-by: Daniel Axtens dja@axtens.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/crypto/vmx/ghash.c | 213 ++++++++++++++++++--------------------------- 1 file changed, 87 insertions(+), 126 deletions(-)
--- a/drivers/crypto/vmx/ghash.c +++ b/drivers/crypto/vmx/ghash.c @@ -1,22 +1,14 @@ +// SPDX-License-Identifier: GPL-2.0 /** * GHASH routines supporting VMX instructions on the Power 8 * - * Copyright (C) 2015 International Business Machines Inc. - * - * This program is free software; you can redistribute it and/or modify - * it under the terms of the GNU General Public License as published by - * the Free Software Foundation; version 2 only. - * - * This program is distributed in the hope that it will be useful, - * but WITHOUT ANY WARRANTY; without even the implied warranty of - * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the - * GNU General Public License for more details. - * - * You should have received a copy of the GNU General Public License - * along with this program; if not, write to the Free Software - * Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA. + * Copyright (C) 2015, 2019 International Business Machines Inc. * * Author: Marcelo Henrique Cerri mhcerri@br.ibm.com + * + * Extended by Daniel Axtens dja@axtens.net to replace the fallback + * mechanism. The new approach is based on arm64 code, which is: + * Copyright (C) 2014 - 2018 Linaro Ltd. ard.biesheuvel@linaro.org */
#include <linux/types.h> @@ -39,71 +31,25 @@ void gcm_ghash_p8(u64 Xi[2], const u128 const u8 *in, size_t len);
struct p8_ghash_ctx { + /* key used by vector asm */ u128 htable[16]; - struct crypto_shash *fallback; + /* key used by software fallback */ + be128 key; };
struct p8_ghash_desc_ctx { u64 shash[2]; u8 buffer[GHASH_DIGEST_SIZE]; int bytes; - struct shash_desc fallback_desc; };
-static int p8_ghash_init_tfm(struct crypto_tfm *tfm) -{ - const char *alg = "ghash-generic"; - struct crypto_shash *fallback; - struct crypto_shash *shash_tfm = __crypto_shash_cast(tfm); - struct p8_ghash_ctx *ctx = crypto_tfm_ctx(tfm); - - fallback = crypto_alloc_shash(alg, 0, CRYPTO_ALG_NEED_FALLBACK); - if (IS_ERR(fallback)) { - printk(KERN_ERR - "Failed to allocate transformation for '%s': %ld\n", - alg, PTR_ERR(fallback)); - return PTR_ERR(fallback); - } - - crypto_shash_set_flags(fallback, - crypto_shash_get_flags((struct crypto_shash - *) tfm)); - - /* Check if the descsize defined in the algorithm is still enough. */ - if (shash_tfm->descsize < sizeof(struct p8_ghash_desc_ctx) - + crypto_shash_descsize(fallback)) { - printk(KERN_ERR - "Desc size of the fallback implementation (%s) does not match the expected value: %lu vs %u\n", - alg, - shash_tfm->descsize - sizeof(struct p8_ghash_desc_ctx), - crypto_shash_descsize(fallback)); - return -EINVAL; - } - ctx->fallback = fallback; - - return 0; -} - -static void p8_ghash_exit_tfm(struct crypto_tfm *tfm) -{ - struct p8_ghash_ctx *ctx = crypto_tfm_ctx(tfm); - - if (ctx->fallback) { - crypto_free_shash(ctx->fallback); - ctx->fallback = NULL; - } -} - static int p8_ghash_init(struct shash_desc *desc) { - struct p8_ghash_ctx *ctx = crypto_tfm_ctx(crypto_shash_tfm(desc->tfm)); struct p8_ghash_desc_ctx *dctx = shash_desc_ctx(desc);
dctx->bytes = 0; memset(dctx->shash, 0, GHASH_DIGEST_SIZE); - dctx->fallback_desc.tfm = ctx->fallback; - dctx->fallback_desc.flags = desc->flags; - return crypto_shash_init(&dctx->fallback_desc); + return 0; }
static int p8_ghash_setkey(struct crypto_shash *tfm, const u8 *key, @@ -121,7 +67,51 @@ static int p8_ghash_setkey(struct crypto disable_kernel_vsx(); pagefault_enable(); preempt_enable(); - return crypto_shash_setkey(ctx->fallback, key, keylen); + + memcpy(&ctx->key, key, GHASH_BLOCK_SIZE); + + return 0; +} + +static inline void __ghash_block(struct p8_ghash_ctx *ctx, + struct p8_ghash_desc_ctx *dctx) +{ + if (!IN_INTERRUPT) { + preempt_disable(); + pagefault_disable(); + enable_kernel_vsx(); + gcm_ghash_p8(dctx->shash, ctx->htable, + dctx->buffer, GHASH_DIGEST_SIZE); + disable_kernel_vsx(); + pagefault_enable(); + preempt_enable(); + } else { + crypto_xor((u8 *)dctx->shash, dctx->buffer, GHASH_BLOCK_SIZE); + gf128mul_lle((be128 *)dctx->shash, &ctx->key); + } +} + +static inline void __ghash_blocks(struct p8_ghash_ctx *ctx, + struct p8_ghash_desc_ctx *dctx, + const u8 *src, unsigned int srclen) +{ + if (!IN_INTERRUPT) { + preempt_disable(); + pagefault_disable(); + enable_kernel_vsx(); + gcm_ghash_p8(dctx->shash, ctx->htable, + src, srclen); + disable_kernel_vsx(); + pagefault_enable(); + preempt_enable(); + } else { + while (srclen >= GHASH_BLOCK_SIZE) { + crypto_xor((u8 *)dctx->shash, src, GHASH_BLOCK_SIZE); + gf128mul_lle((be128 *)dctx->shash, &ctx->key); + srclen -= GHASH_BLOCK_SIZE; + src += GHASH_BLOCK_SIZE; + } + } }
static int p8_ghash_update(struct shash_desc *desc, @@ -131,49 +121,33 @@ static int p8_ghash_update(struct shash_ struct p8_ghash_ctx *ctx = crypto_tfm_ctx(crypto_shash_tfm(desc->tfm)); struct p8_ghash_desc_ctx *dctx = shash_desc_ctx(desc);
- if (IN_INTERRUPT) { - return crypto_shash_update(&dctx->fallback_desc, src, - srclen); - } else { - if (dctx->bytes) { - if (dctx->bytes + srclen < GHASH_DIGEST_SIZE) { - memcpy(dctx->buffer + dctx->bytes, src, - srclen); - dctx->bytes += srclen; - return 0; - } + if (dctx->bytes) { + if (dctx->bytes + srclen < GHASH_DIGEST_SIZE) { memcpy(dctx->buffer + dctx->bytes, src, - GHASH_DIGEST_SIZE - dctx->bytes); - preempt_disable(); - pagefault_disable(); - enable_kernel_vsx(); - gcm_ghash_p8(dctx->shash, ctx->htable, - dctx->buffer, GHASH_DIGEST_SIZE); - disable_kernel_vsx(); - pagefault_enable(); - preempt_enable(); - src += GHASH_DIGEST_SIZE - dctx->bytes; - srclen -= GHASH_DIGEST_SIZE - dctx->bytes; - dctx->bytes = 0; + srclen); + dctx->bytes += srclen; + return 0; } - len = srclen & ~(GHASH_DIGEST_SIZE - 1); - if (len) { - preempt_disable(); - pagefault_disable(); - enable_kernel_vsx(); - gcm_ghash_p8(dctx->shash, ctx->htable, src, len); - disable_kernel_vsx(); - pagefault_enable(); - preempt_enable(); - src += len; - srclen -= len; - } - if (srclen) { - memcpy(dctx->buffer, src, srclen); - dctx->bytes = srclen; - } - return 0; + memcpy(dctx->buffer + dctx->bytes, src, + GHASH_DIGEST_SIZE - dctx->bytes); + + __ghash_block(ctx, dctx); + + src += GHASH_DIGEST_SIZE - dctx->bytes; + srclen -= GHASH_DIGEST_SIZE - dctx->bytes; + dctx->bytes = 0; + } + len = srclen & ~(GHASH_DIGEST_SIZE - 1); + if (len) { + __ghash_blocks(ctx, dctx, src, len); + src += len; + srclen -= len; } + if (srclen) { + memcpy(dctx->buffer, src, srclen); + dctx->bytes = srclen; + } + return 0; }
static int p8_ghash_final(struct shash_desc *desc, u8 *out) @@ -182,25 +156,14 @@ static int p8_ghash_final(struct shash_d struct p8_ghash_ctx *ctx = crypto_tfm_ctx(crypto_shash_tfm(desc->tfm)); struct p8_ghash_desc_ctx *dctx = shash_desc_ctx(desc);
- if (IN_INTERRUPT) { - return crypto_shash_final(&dctx->fallback_desc, out); - } else { - if (dctx->bytes) { - for (i = dctx->bytes; i < GHASH_DIGEST_SIZE; i++) - dctx->buffer[i] = 0; - preempt_disable(); - pagefault_disable(); - enable_kernel_vsx(); - gcm_ghash_p8(dctx->shash, ctx->htable, - dctx->buffer, GHASH_DIGEST_SIZE); - disable_kernel_vsx(); - pagefault_enable(); - preempt_enable(); - dctx->bytes = 0; - } - memcpy(out, dctx->shash, GHASH_DIGEST_SIZE); - return 0; + if (dctx->bytes) { + for (i = dctx->bytes; i < GHASH_DIGEST_SIZE; i++) + dctx->buffer[i] = 0; + __ghash_block(ctx, dctx); + dctx->bytes = 0; } + memcpy(out, dctx->shash, GHASH_DIGEST_SIZE); + return 0; }
struct shash_alg p8_ghash_alg = { @@ -215,11 +178,9 @@ struct shash_alg p8_ghash_alg = { .cra_name = "ghash", .cra_driver_name = "p8_ghash", .cra_priority = 1000, - .cra_flags = CRYPTO_ALG_TYPE_SHASH | CRYPTO_ALG_NEED_FALLBACK, + .cra_flags = CRYPTO_ALG_TYPE_SHASH, .cra_blocksize = GHASH_BLOCK_SIZE, .cra_ctxsize = sizeof(struct p8_ghash_ctx), .cra_module = THIS_MODULE, - .cra_init = p8_ghash_init_tfm, - .cra_exit = p8_ghash_exit_tfm, }, };
From: Konrad Rzeszutek Wilk konrad.wilk@oracle.com
commit 7681f31ec9cdacab4fd10570be924f2cef6669ba upstream.
There is no need for this at all. Worst it means that if the guest tries to write to BARs it could lead (on certain platforms) to PCI SERR errors.
Please note that with af6fc858a35b90e89ea7a7ee58e66628c55c776b "xen-pciback: limit guest control of command register" a guest is still allowed to enable those control bits (safely), but is not allowed to disable them and that therefore a well behaved frontend which enables things before using them will still function correctly.
This is done via an write to the configuration register 0x4 which triggers on the backend side: command_write - pci_enable_device - pci_enable_device_flags - do_pci_enable_device - pcibios_enable_device -pci_enable_resourcess [which enables the PCI_COMMAND_MEMORY|PCI_COMMAND_IO]
However guests (and drivers) which don't do this could cause problems, including the security issues which XSA-120 sought to address.
Reported-by: Jan Beulich jbeulich@suse.com Signed-off-by: Konrad Rzeszutek Wilk konrad.wilk@oracle.com Reviewed-by: Prarit Bhargava prarit@redhat.com Signed-off-by: Juergen Gross jgross@suse.com Cc: Ben Hutchings ben.hutchings@codethink.co.uk Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/xen/xen-pciback/pciback_ops.c | 2 -- 1 file changed, 2 deletions(-)
--- a/drivers/xen/xen-pciback/pciback_ops.c +++ b/drivers/xen/xen-pciback/pciback_ops.c @@ -126,8 +126,6 @@ void xen_pcibk_reset_device(struct pci_d if (pci_is_enabled(dev)) pci_disable_device(dev);
- pci_write_config_word(dev, PCI_COMMAND, 0); - dev->is_busmaster = 0; } else { pci_read_config_word(dev, PCI_COMMAND, &cmd);
From: David S. Miller davem@davemloft.net
commit 5593530e56943182ebb6d81eca8a3be6db6dbba4 upstream.
This reverts commit 532b0f7ece4cb2ffd24dc723ddf55242d1188e5e.
More revisions coming up.
Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- net/tipc/core.c | 14 +++++++------- 1 file changed, 7 insertions(+), 7 deletions(-)
--- a/net/tipc/core.c +++ b/net/tipc/core.c @@ -62,10 +62,6 @@ static int __net_init tipc_init_net(stru INIT_LIST_HEAD(&tn->node_list); spin_lock_init(&tn->node_list_lock);
- err = tipc_socket_init(); - if (err) - goto out_socket; - err = tipc_sk_rht_init(net); if (err) goto out_sk_rht; @@ -92,8 +88,6 @@ out_subscr: out_nametbl: tipc_sk_rht_destroy(net); out_sk_rht: - tipc_socket_stop(); -out_socket: return err; }
@@ -104,7 +98,6 @@ static void __net_exit tipc_exit_net(str tipc_bcast_stop(net); tipc_nametbl_stop(net); tipc_sk_rht_destroy(net); - tipc_socket_stop(); }
static struct pernet_operations tipc_net_ops = { @@ -140,6 +133,10 @@ static int __init tipc_init(void) if (err) goto out_pernet;
+ err = tipc_socket_init(); + if (err) + goto out_socket; + err = tipc_bearer_setup(); if (err) goto out_bearer; @@ -147,6 +144,8 @@ static int __init tipc_init(void) pr_info("Started in single node mode\n"); return 0; out_bearer: + tipc_socket_stop(); +out_socket: unregister_pernet_subsys(&tipc_net_ops); out_pernet: tipc_unregister_sysctl(); @@ -162,6 +161,7 @@ out_netlink: static void __exit tipc_exit(void) { tipc_bearer_cleanup(); + tipc_socket_stop(); unregister_pernet_subsys(&tipc_net_ops); tipc_netlink_stop(); tipc_netlink_compat_stop();
From: Junwei Hu hujunwei4@huawei.com
commit 526f5b851a96566803ee4bee60d0a34df56c77f8 upstream.
Error message printed: modprobe: ERROR: could not insert 'tipc': Address family not supported by protocol. when modprobe tipc after the following patch: switch order of device registration, commit 7e27e8d6130c ("tipc: switch order of device registration to fix a crash")
Because sock_create_kern(net, AF_TIPC, ...) called by tipc_topsrv_create_listener() in the initialization process of tipc_init_net(), so tipc_socket_init() must be execute before that. Meanwhile, tipc_net_id need to be initialized when sock_create() called, and tipc_socket_init() is no need to be called for each namespace.
I add a variable tipc_topsrv_net_ops, and split the register_pernet_subsys() of tipc into two parts, and split tipc_socket_init() with initialization of pernet params.
By the way, I fixed resources rollback error when tipc_bcast_init() failed in tipc_init_net().
Fixes: 7e27e8d6130c ("tipc: switch order of device registration to fix a crash") Signed-off-by: Junwei Hu hujunwei4@huawei.com Reported-by: Wang Wang wangwang2@huawei.com Reported-by: syzbot+1e8114b61079bfe9cbc5@syzkaller.appspotmail.com Reviewed-by: Kang Zhou zhoukang7@huawei.com Reviewed-by: Suanming Mou mousuanming@huawei.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- net/tipc/core.c | 18 ++++++++++++------ net/tipc/subscr.c | 14 ++++++++++++-- net/tipc/subscr.h | 5 +++-- 3 files changed, 27 insertions(+), 10 deletions(-)
--- a/net/tipc/core.c +++ b/net/tipc/core.c @@ -71,9 +71,6 @@ static int __net_init tipc_init_net(stru goto out_nametbl;
INIT_LIST_HEAD(&tn->dist_queue); - err = tipc_topsrv_start(net); - if (err) - goto out_subscr;
err = tipc_bcast_init(net); if (err) @@ -82,8 +79,6 @@ static int __net_init tipc_init_net(stru return 0;
out_bclink: - tipc_bcast_stop(net); -out_subscr: tipc_nametbl_stop(net); out_nametbl: tipc_sk_rht_destroy(net); @@ -93,7 +88,6 @@ out_sk_rht:
static void __net_exit tipc_exit_net(struct net *net) { - tipc_topsrv_stop(net); tipc_net_stop(net); tipc_bcast_stop(net); tipc_nametbl_stop(net); @@ -107,6 +101,11 @@ static struct pernet_operations tipc_net .size = sizeof(struct tipc_net), };
+static struct pernet_operations tipc_topsrv_net_ops = { + .init = tipc_topsrv_init_net, + .exit = tipc_topsrv_exit_net, +}; + static int __init tipc_init(void) { int err; @@ -137,6 +136,10 @@ static int __init tipc_init(void) if (err) goto out_socket;
+ err = register_pernet_subsys(&tipc_topsrv_net_ops); + if (err) + goto out_pernet_topsrv; + err = tipc_bearer_setup(); if (err) goto out_bearer; @@ -144,6 +147,8 @@ static int __init tipc_init(void) pr_info("Started in single node mode\n"); return 0; out_bearer: + unregister_pernet_subsys(&tipc_topsrv_net_ops); +out_pernet_topsrv: tipc_socket_stop(); out_socket: unregister_pernet_subsys(&tipc_net_ops); @@ -161,6 +166,7 @@ out_netlink: static void __exit tipc_exit(void) { tipc_bearer_cleanup(); + unregister_pernet_subsys(&tipc_topsrv_net_ops); tipc_socket_stop(); unregister_pernet_subsys(&tipc_net_ops); tipc_netlink_stop(); --- a/net/tipc/subscr.c +++ b/net/tipc/subscr.c @@ -358,7 +358,7 @@ static void *tipc_subscrb_connect_cb(int return (void *)tipc_subscrb_create(conid); }
-int tipc_topsrv_start(struct net *net) +static int tipc_topsrv_start(struct net *net) { struct tipc_net *tn = net_generic(net, tipc_net_id); const char name[] = "topology_server"; @@ -396,7 +396,7 @@ int tipc_topsrv_start(struct net *net) return tipc_server_start(topsrv); }
-void tipc_topsrv_stop(struct net *net) +static void tipc_topsrv_stop(struct net *net) { struct tipc_net *tn = net_generic(net, tipc_net_id); struct tipc_server *topsrv = tn->topsrv; @@ -405,3 +405,13 @@ void tipc_topsrv_stop(struct net *net) kfree(topsrv->saddr); kfree(topsrv); } + +int __net_init tipc_topsrv_init_net(struct net *net) +{ + return tipc_topsrv_start(net); +} + +void __net_exit tipc_topsrv_exit_net(struct net *net) +{ + tipc_topsrv_stop(net); +} --- a/net/tipc/subscr.h +++ b/net/tipc/subscr.h @@ -75,7 +75,8 @@ void tipc_subscrp_report_overlap(struct void tipc_subscrp_convert_seq(struct tipc_name_seq *in, int swap, struct tipc_name_seq *out); u32 tipc_subscrp_convert_seq_type(u32 type, int swap); -int tipc_topsrv_start(struct net *net); -void tipc_topsrv_stop(struct net *net); + +int __net_init tipc_topsrv_init_net(struct net *net); +void __net_exit tipc_topsrv_exit_net(struct net *net);
#endif
From: James Clarke jrtc27@jrtc27.com
commit d3c976c14ad8af421134c428b0a89ff8dd3bd8f8 upstream.
Previously, %g2 would end up with the value PAGE_SIZE, but after the commit mentioned below it ends up with the value 1 due to being reused for a different purpose. We need it to be PAGE_SIZE as we use it to step through pages in our demap loop, otherwise we set different flags in the low 12 bits of the address written to, thereby doing things other than a nucleus page flush.
Fixes: a74ad5e660a9 ("sparc64: Handle extremely large kernel TLB range flushes more gracefully.") Reported-by: Meelis Roos mroos@linux.ee Tested-by: Meelis Roos mroos@linux.ee Signed-off-by: James Clarke jrtc27@jrtc27.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- arch/sparc/mm/ultra.S | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
--- a/arch/sparc/mm/ultra.S +++ b/arch/sparc/mm/ultra.S @@ -586,7 +586,7 @@ xcall_flush_tlb_kernel_range: /* 44 insn sub %g7, %g1, %g3 srlx %g3, 18, %g2 brnz,pn %g2, 2f - add %g2, 1, %g2 + sethi %hi(PAGE_SIZE), %g2 sub %g3, %g2, %g3 or %g1, 0x20, %g1 ! Nucleus 1: stxa %g0, [%g1 + %g3] ASI_DMMU_DEMAP @@ -750,7 +750,7 @@ __cheetah_xcall_flush_tlb_kernel_range: sub %g7, %g1, %g3 srlx %g3, 18, %g2 brnz,pn %g2, 2f - add %g2, 1, %g2 + sethi %hi(PAGE_SIZE), %g2 sub %g3, %g2, %g3 or %g1, 0x20, %g1 ! Nucleus 1: stxa %g0, [%g1 + %g3] ASI_DMMU_DEMAP
From: Rasmus Villemoes linux@rasmusvillemoes.dk
commit ef4d6f6b275c498f8e5626c99dbeefdc5027f843 upstream.
The ror32 implementation (word >> shift) | (word << (32 - shift) has undefined behaviour if shift is outside the [1, 31] range. Similarly for the 64 bit variants. Most callers pass a compile-time constant (naturally in that range), but there's an UBSAN report that these may actually be called with a shift count of 0.
Instead of special-casing that, we can make them DTRT for all values of shift while also avoiding UB. For some reason, this was already partly done for rol32 (which was well-defined for [0, 31]). gcc 8 recognizes these patterns as rotates, so for example
__u32 rol32(__u32 word, unsigned int shift) { return (word << (shift & 31)) | (word >> ((-shift) & 31)); }
compiles to
0000000000000020 <rol32>: 20: 89 f8 mov %edi,%eax 22: 89 f1 mov %esi,%ecx 24: d3 c0 rol %cl,%eax 26: c3 retq
Older compilers unfortunately do not do as well, but this only affects the small minority of users that don't pass constants.
Due to integer promotions, ro[lr]8 were already well-defined for shifts in [0, 8], and ro[lr]16 were mostly well-defined for shifts in [0, 16] (only mostly - u16 gets promoted to _signed_ int, so if bit 15 is set, word << 16 is undefined). For consistency, update those as well.
Link: http://lkml.kernel.org/r/20190410211906.2190-1-linux@rasmusvillemoes.dk Signed-off-by: Rasmus Villemoes linux@rasmusvillemoes.dk Reported-by: Ido Schimmel idosch@mellanox.com Tested-by: Ido Schimmel idosch@mellanox.com Reviewed-by: Will Deacon will.deacon@arm.com Cc: Vadim Pasternak vadimp@mellanox.com Cc: Andrey Ryabinin aryabinin@virtuozzo.com Cc: Jacek Anaszewski jacek.anaszewski@gmail.com Cc: Pavel Machek pavel@ucw.cz Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Matthias Kaehlcke mka@chromium.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- include/linux/bitops.h | 16 ++++++++-------- 1 file changed, 8 insertions(+), 8 deletions(-)
--- a/include/linux/bitops.h +++ b/include/linux/bitops.h @@ -58,7 +58,7 @@ static __always_inline unsigned long hwe */ static inline __u64 rol64(__u64 word, unsigned int shift) { - return (word << shift) | (word >> (64 - shift)); + return (word << (shift & 63)) | (word >> ((-shift) & 63)); }
/** @@ -68,7 +68,7 @@ static inline __u64 rol64(__u64 word, un */ static inline __u64 ror64(__u64 word, unsigned int shift) { - return (word >> shift) | (word << (64 - shift)); + return (word >> (shift & 63)) | (word << ((-shift) & 63)); }
/** @@ -78,7 +78,7 @@ static inline __u64 ror64(__u64 word, un */ static inline __u32 rol32(__u32 word, unsigned int shift) { - return (word << shift) | (word >> ((-shift) & 31)); + return (word << (shift & 31)) | (word >> ((-shift) & 31)); }
/** @@ -88,7 +88,7 @@ static inline __u32 rol32(__u32 word, un */ static inline __u32 ror32(__u32 word, unsigned int shift) { - return (word >> shift) | (word << (32 - shift)); + return (word >> (shift & 31)) | (word << ((-shift) & 31)); }
/** @@ -98,7 +98,7 @@ static inline __u32 ror32(__u32 word, un */ static inline __u16 rol16(__u16 word, unsigned int shift) { - return (word << shift) | (word >> (16 - shift)); + return (word << (shift & 15)) | (word >> ((-shift) & 15)); }
/** @@ -108,7 +108,7 @@ static inline __u16 rol16(__u16 word, un */ static inline __u16 ror16(__u16 word, unsigned int shift) { - return (word >> shift) | (word << (16 - shift)); + return (word >> (shift & 15)) | (word << ((-shift) & 15)); }
/** @@ -118,7 +118,7 @@ static inline __u16 ror16(__u16 word, un */ static inline __u8 rol8(__u8 word, unsigned int shift) { - return (word << shift) | (word >> (8 - shift)); + return (word << (shift & 7)) | (word >> ((-shift) & 7)); }
/** @@ -128,7 +128,7 @@ static inline __u8 rol8(__u8 word, unsig */ static inline __u8 ror8(__u8 word, unsigned int shift) { - return (word >> shift) | (word << (8 - shift)); + return (word >> (shift & 7)) | (word << ((-shift) & 7)); }
/**
From: Henry Lin henryl@nvidia.com
commit 597c56e372dab2c7f79b8d700aad3a5deebf9d1b upstream.
This change fixes a data corruption issue occurred on USB hard disk for the case that bounce buffer is used during transferring data.
While updating data between sg list and bounce buffer, current implementation passes mapped sg number (urb->num_mapped_sgs) to sg_pcopy_from_buffer() and sg_pcopy_to_buffer(). This causes data not get copied if target buffer is located in the elements after mapped sg elements. This change passes sg number for full list to fix issue.
Besides, for copying data from bounce buffer, calling dma_unmap_single() on the bounce buffer before copying data to sg list can avoid cache issue.
Fixes: f9c589e142d0 ("xhci: TD-fragment, align the unsplittable case with a bounce buffer") Cc: stable@vger.kernel.org # v4.8+ Signed-off-by: Henry Lin henryl@nvidia.com Signed-off-by: Mathias Nyman mathias.nyman@linux.intel.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/usb/host/xhci-ring.c | 17 +++++++++++++---- 1 file changed, 13 insertions(+), 4 deletions(-)
--- a/drivers/usb/host/xhci-ring.c +++ b/drivers/usb/host/xhci-ring.c @@ -678,6 +678,7 @@ void xhci_unmap_td_bounce_buffer(struct struct device *dev = xhci_to_hcd(xhci)->self.controller; struct xhci_segment *seg = td->bounce_seg; struct urb *urb = td->urb; + size_t len;
if (!seg || !urb) return; @@ -688,11 +689,14 @@ void xhci_unmap_td_bounce_buffer(struct return; }
- /* for in tranfers we need to copy the data from bounce to sg */ - sg_pcopy_from_buffer(urb->sg, urb->num_mapped_sgs, seg->bounce_buf, - seg->bounce_len, seg->bounce_offs); dma_unmap_single(dev, seg->bounce_dma, ring->bounce_buf_len, DMA_FROM_DEVICE); + /* for in tranfers we need to copy the data from bounce to sg */ + len = sg_pcopy_from_buffer(urb->sg, urb->num_sgs, seg->bounce_buf, + seg->bounce_len, seg->bounce_offs); + if (len != seg->bounce_len) + xhci_warn(xhci, "WARN Wrong bounce buffer read length: %ld != %d\n", + len, seg->bounce_len); seg->bounce_len = 0; seg->bounce_offs = 0; } @@ -3163,6 +3167,7 @@ static int xhci_align_td(struct xhci_hcd unsigned int unalign; unsigned int max_pkt; u32 new_buff_len; + size_t len;
max_pkt = GET_MAX_PACKET(usb_endpoint_maxp(&urb->ep->desc)); unalign = (enqd_len + *trb_buff_len) % max_pkt; @@ -3193,8 +3198,12 @@ static int xhci_align_td(struct xhci_hcd
/* create a max max_pkt sized bounce buffer pointed to by last trb */ if (usb_urb_dir_out(urb)) { - sg_pcopy_to_buffer(urb->sg, urb->num_mapped_sgs, + len = sg_pcopy_to_buffer(urb->sg, urb->num_sgs, seg->bounce_buf, new_buff_len, enqd_len); + if (len != seg->bounce_len) + xhci_warn(xhci, + "WARN Wrong bounce buffer write length: %ld != %d\n", + len, seg->bounce_len); seg->bounce_dma = dma_map_single(dev, seg->bounce_buf, max_pkt, DMA_TO_DEVICE); } else {
From: Fabio Estevam festevam@gmail.com
commit c1a145a3ed9a40f3b6145feb97789e8eb49c5566 upstream.
Commit 597c56e372da ("xhci: update bounce buffer with correct sg num") caused the following build warnings:
drivers/usb/host/xhci-ring.c:676:19: warning: format '%ld' expects argument of type 'long int', but argument 3 has type 'size_t {aka unsigned int}' [-Wformat=]
Use %zu for printing size_t type in order to fix the warnings.
Fixes: 597c56e372da ("xhci: update bounce buffer with correct sg num") Reported-by: kbuild test robot lkp@intel.com Signed-off-by: Fabio Estevam festevam@gmail.com Cc: stable stable@vger.kernel.org Acked-by: Mathias Nyman mathias.nyman@linux.intel.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/usb/host/xhci-ring.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
--- a/drivers/usb/host/xhci-ring.c +++ b/drivers/usb/host/xhci-ring.c @@ -695,7 +695,7 @@ void xhci_unmap_td_bounce_buffer(struct len = sg_pcopy_from_buffer(urb->sg, urb->num_sgs, seg->bounce_buf, seg->bounce_len, seg->bounce_offs); if (len != seg->bounce_len) - xhci_warn(xhci, "WARN Wrong bounce buffer read length: %ld != %d\n", + xhci_warn(xhci, "WARN Wrong bounce buffer read length: %zu != %d\n", len, seg->bounce_len); seg->bounce_len = 0; seg->bounce_offs = 0; @@ -3202,7 +3202,7 @@ static int xhci_align_td(struct xhci_hcd seg->bounce_buf, new_buff_len, enqd_len); if (len != seg->bounce_len) xhci_warn(xhci, - "WARN Wrong bounce buffer write length: %ld != %d\n", + "WARN Wrong bounce buffer write length: %zu != %d\n", len, seg->bounce_len); seg->bounce_dma = dma_map_single(dev, seg->bounce_buf, max_pkt, DMA_TO_DEVICE);
From: Andrey Smirnov andrew.smirnov@gmail.com
commit f7fac17ca925faa03fc5eb854c081a24075f8bad upstream.
Xhci_handshake() implements the algorithm already captured by readl_poll_timeout_atomic(). Convert the former to use the latter to avoid repetition.
Turned out this patch also fixes a bug on the AMD Stoneyridge platform where usleep(1) sometimes takes over 10ms. This means a 5 second timeout can easily take over 15 seconds which will trigger the watchdog and reboot the system.
[Add info about patch fixing a bug to commit message -Mathias] Signed-off-by: Andrey Smirnov andrew.smirnov@gmail.com Tested-by: Raul E Rangel rrangel@chromium.org Reviewed-by: Raul E Rangel rrangel@chromium.org Cc: stable@vger.kernel.org Signed-off-by: Mathias Nyman mathias.nyman@linux.intel.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/usb/host/xhci.c | 22 ++++++++++------------ 1 file changed, 10 insertions(+), 12 deletions(-)
--- a/drivers/usb/host/xhci.c +++ b/drivers/usb/host/xhci.c @@ -21,6 +21,7 @@ */
#include <linux/pci.h> +#include <linux/iopoll.h> #include <linux/irq.h> #include <linux/log2.h> #include <linux/module.h> @@ -47,7 +48,6 @@ static unsigned int quirks; module_param(quirks, uint, S_IRUGO); MODULE_PARM_DESC(quirks, "Bit flags for quirks to be enabled as default");
-/* TODO: copied from ehci-hcd.c - can this be refactored? */ /* * xhci_handshake - spin reading hc until handshake completes or fails * @ptr: address of hc register to be read @@ -64,18 +64,16 @@ MODULE_PARM_DESC(quirks, "Bit flags for int xhci_handshake(void __iomem *ptr, u32 mask, u32 done, int usec) { u32 result; + int ret;
- do { - result = readl(ptr); - if (result == ~(u32)0) /* card removed */ - return -ENODEV; - result &= mask; - if (result == done) - return 0; - udelay(1); - usec--; - } while (usec > 0); - return -ETIMEDOUT; + ret = readl_poll_timeout_atomic(ptr, result, + (result & mask) == done || + result == U32_MAX, + 1, usec); + if (result == U32_MAX) /* card removed */ + return -ENODEV; + + return ret; }
/*
From: Carsten Schmid carsten_schmid@mentor.com
commit 7aa1bb2ffd84d6b9b5f546b079bb15cd0ab6e76e upstream.
With defective USB sticks we see the following error happen: usb 1-3: new high-speed USB device number 6 using xhci_hcd usb 1-3: device descriptor read/64, error -71 usb 1-3: device descriptor read/64, error -71 usb 1-3: new high-speed USB device number 7 using xhci_hcd usb 1-3: device descriptor read/64, error -71 usb 1-3: unable to get BOS descriptor set usb 1-3: New USB device found, idVendor=0781, idProduct=5581 usb 1-3: New USB device strings: Mfr=1, Product=2, SerialNumber=3 ... BUG: unable to handle kernel NULL pointer dereference at 0000000000000008
This comes from the following place: [ 1660.215380] IP: xhci_set_usb2_hardware_lpm+0xdf/0x3d0 [xhci_hcd] [ 1660.222092] PGD 0 P4D 0 [ 1660.224918] Oops: 0000 [#1] PREEMPT SMP NOPTI [ 1660.425520] CPU: 1 PID: 38 Comm: kworker/1:1 Tainted: P U W O 4.14.67-apl #1 [ 1660.434277] Workqueue: usb_hub_wq hub_event [usbcore] [ 1660.439918] task: ffffa295b6ae4c80 task.stack: ffffad4580150000 [ 1660.446532] RIP: 0010:xhci_set_usb2_hardware_lpm+0xdf/0x3d0 [xhci_hcd] [ 1660.453821] RSP: 0018:ffffad4580153c70 EFLAGS: 00010046 [ 1660.459655] RAX: 0000000000000000 RBX: ffffa295b4d7c000 RCX: 0000000000000002 [ 1660.467625] RDX: 0000000000000002 RSI: ffffffff984a55b2 RDI: ffffffff984a55b2 [ 1660.475586] RBP: ffffad4580153cc8 R08: 0000000000d6520a R09: 0000000000000001 [ 1660.483556] R10: ffffad4580a004a0 R11: 0000000000000286 R12: ffffa295b4d7c000 [ 1660.491525] R13: 0000000000010648 R14: ffffa295a84e1800 R15: 0000000000000000 [ 1660.499494] FS: 0000000000000000(0000) GS:ffffa295bfc80000(0000) knlGS:0000000000000000 [ 1660.508530] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 1660.514947] CR2: 0000000000000008 CR3: 000000025a114000 CR4: 00000000003406a0 [ 1660.522917] Call Trace: [ 1660.525657] usb_set_usb2_hardware_lpm+0x3d/0x70 [usbcore] [ 1660.531792] usb_disable_device+0x242/0x260 [usbcore] [ 1660.537439] usb_disconnect+0xc1/0x2b0 [usbcore] [ 1660.542600] hub_event+0x596/0x18f0 [usbcore] [ 1660.547467] ? trace_preempt_on+0xdf/0x100 [ 1660.552040] ? process_one_work+0x1c1/0x410 [ 1660.556708] process_one_work+0x1d2/0x410 [ 1660.561184] ? preempt_count_add.part.3+0x21/0x60 [ 1660.566436] worker_thread+0x2d/0x3f0 [ 1660.570522] kthread+0x122/0x140 [ 1660.574123] ? process_one_work+0x410/0x410 [ 1660.578792] ? kthread_create_on_node+0x60/0x60 [ 1660.583849] ret_from_fork+0x3a/0x50 [ 1660.587839] Code: 00 49 89 c3 49 8b 84 24 50 16 00 00 8d 4a ff 48 8d 04 c8 48 89 ca 4c 8b 10 45 8b 6a 04 48 8b 00 48 89 45 c0 49 8b 86 80 03 00 00 <48> 8b 40 08 8b 40 03 0f 1f 44 00 00 45 85 ff 0f 84 81 01 00 00 [ 1660.608980] RIP: xhci_set_usb2_hardware_lpm+0xdf/0x3d0 [xhci_hcd] RSP: ffffad4580153c70 [ 1660.617921] CR2: 0000000000000008
Tracking this down shows that udev->bos is NULL in the following code: (xhci.c, in xhci_set_usb2_hardware_lpm) field = le32_to_cpu(udev->bos->ext_cap->bmAttributes); <<<<<<< here
xhci_dbg(xhci, "%s port %d USB2 hardware LPM\n", enable ? "enable" : "disable", port_num + 1);
if (enable) { /* Host supports BESL timeout instead of HIRD */ if (udev->usb2_hw_lpm_besl_capable) { /* if device doesn't have a preferred BESL value use a * default one which works with mixed HIRD and BESL * systems. See XHCI_DEFAULT_BESL definition in xhci.h */ if ((field & USB_BESL_SUPPORT) && (field & USB_BESL_BASELINE_VALID)) hird = USB_GET_BESL_BASELINE(field); else hird = udev->l1_params.besl;
The failing case is when disabling LPM. So it is sufficient to avoid access to udev->bos by moving the instruction into the "enable" clause.
Cc: Stable stable@vger.kernel.org Signed-off-by: Carsten Schmid carsten_schmid@mentor.com Signed-off-by: Mathias Nyman mathias.nyman@linux.intel.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/usb/host/xhci.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/usb/host/xhci.c +++ b/drivers/usb/host/xhci.c @@ -4172,7 +4172,6 @@ int xhci_set_usb2_hardware_lpm(struct us pm_addr = port_array[port_num] + PORTPMSC; pm_val = readl(pm_addr); hlpm_addr = port_array[port_num] + PORTHLPMC; - field = le32_to_cpu(udev->bos->ext_cap->bmAttributes);
xhci_dbg(xhci, "%s port %d USB2 hardware LPM\n", enable ? "enable" : "disable", port_num + 1); @@ -4184,6 +4183,7 @@ int xhci_set_usb2_hardware_lpm(struct us * default one which works with mixed HIRD and BESL * systems. See XHCI_DEFAULT_BESL definition in xhci.h */ + field = le32_to_cpu(udev->bos->ext_cap->bmAttributes); if ((field & USB_BESL_SUPPORT) && (field & USB_BESL_BASELINE_VALID)) hird = USB_GET_BESL_BASELINE(field);
From: Shuah Khan skhan@linuxfoundation.org
commit 0c9e8b3cad654bfc499c10b652fbf8f0b890af8f upstream.
stub_probe() and stub_disconnect() call functions which could call sleeping function in invalid context whil holding busid_lock.
Fix the problem by refining the lock holds to short critical sections to change the busid_priv fields. This fix restructures the code to limit the lock holds in stub_probe() and stub_disconnect().
stub_probe():
[15217.927028] BUG: sleeping function called from invalid context at mm/slab.h:418 [15217.927038] in_atomic(): 1, irqs_disabled(): 0, pid: 29087, name: usbip [15217.927044] 5 locks held by usbip/29087: [15217.927047] #0: 0000000091647f28 (sb_writers#6){....}, at: vfs_write+0x191/0x1c0 [15217.927062] #1: 000000008f9ba75b (&of->mutex){....}, at: kernfs_fop_write+0xf7/0x1b0 [15217.927072] #2: 00000000872e5b4b (&dev->mutex){....}, at: __device_driver_lock+0x3b/0x50 [15217.927082] #3: 00000000e74ececc (&dev->mutex){....}, at: __device_driver_lock+0x46/0x50 [15217.927090] #4: 00000000b20abbe0 (&(&busid_table[i].busid_lock)->rlock){....}, at: get_busid_priv+0x48/0x60 [usbip_host] [15217.927103] CPU: 3 PID: 29087 Comm: usbip Tainted: G W 5.1.0-rc6+ #40 [15217.927106] Hardware name: Dell Inc. OptiPlex 790/0HY9JP, BIOS A18 09/24/2013 [15217.927109] Call Trace: [15217.927118] dump_stack+0x63/0x85 [15217.927127] ___might_sleep+0xff/0x120 [15217.927133] __might_sleep+0x4a/0x80 [15217.927143] kmem_cache_alloc_trace+0x1aa/0x210 [15217.927156] stub_probe+0xe8/0x440 [usbip_host] [15217.927171] usb_probe_device+0x34/0x70
stub_disconnect():
[15279.182478] BUG: sleeping function called from invalid context at kernel/locking/mutex.c:908 [15279.182487] in_atomic(): 1, irqs_disabled(): 0, pid: 29114, name: usbip [15279.182492] 5 locks held by usbip/29114: [15279.182494] #0: 0000000091647f28 (sb_writers#6){....}, at: vfs_write+0x191/0x1c0 [15279.182506] #1: 00000000702cf0f3 (&of->mutex){....}, at: kernfs_fop_write+0xf7/0x1b0 [15279.182514] #2: 00000000872e5b4b (&dev->mutex){....}, at: __device_driver_lock+0x3b/0x50 [15279.182522] #3: 00000000e74ececc (&dev->mutex){....}, at: __device_driver_lock+0x46/0x50 [15279.182529] #4: 00000000b20abbe0 (&(&busid_table[i].busid_lock)->rlock){....}, at: get_busid_priv+0x48/0x60 [usbip_host] [15279.182541] CPU: 0 PID: 29114 Comm: usbip Tainted: G W 5.1.0-rc6+ #40 [15279.182543] Hardware name: Dell Inc. OptiPlex 790/0HY9JP, BIOS A18 09/24/2013 [15279.182546] Call Trace: [15279.182554] dump_stack+0x63/0x85 [15279.182561] ___might_sleep+0xff/0x120 [15279.182566] __might_sleep+0x4a/0x80 [15279.182574] __mutex_lock+0x55/0x950 [15279.182582] ? get_busid_priv+0x48/0x60 [usbip_host] [15279.182587] ? reacquire_held_locks+0xec/0x1a0 [15279.182591] ? get_busid_priv+0x48/0x60 [usbip_host] [15279.182597] ? find_held_lock+0x94/0xa0 [15279.182609] mutex_lock_nested+0x1b/0x20 [15279.182614] ? mutex_lock_nested+0x1b/0x20 [15279.182618] kernfs_remove_by_name_ns+0x2a/0x90 [15279.182625] sysfs_remove_file_ns+0x15/0x20 [15279.182629] device_remove_file+0x19/0x20 [15279.182634] stub_disconnect+0x6d/0x180 [usbip_host] [15279.182643] usb_unbind_device+0x27/0x60
Signed-off-by: Shuah Khan skhan@linuxfoundation.org Cc: stable stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/usb/usbip/stub_dev.c | 65 ++++++++++++++++++++++++++++--------------- 1 file changed, 43 insertions(+), 22 deletions(-)
--- a/drivers/usb/usbip/stub_dev.c +++ b/drivers/usb/usbip/stub_dev.c @@ -315,9 +315,17 @@ static int stub_probe(struct usb_device const char *udev_busid = dev_name(&udev->dev); struct bus_id_priv *busid_priv; int rc = 0; + char save_status;
dev_dbg(&udev->dev, "Enter probe\n");
+ /* Not sure if this is our device. Allocate here to avoid + * calling alloc while holding busid_table lock. + */ + sdev = stub_device_alloc(udev); + if (!sdev) + return -ENOMEM; + /* check we should claim or not by busid_table */ busid_priv = get_busid_priv(udev_busid); if (!busid_priv || (busid_priv->status == STUB_BUSID_REMOV) || @@ -332,14 +340,14 @@ static int stub_probe(struct usb_device * See driver_probe_device() in driver/base/dd.c */ rc = -ENODEV; - goto call_put_busid_priv; + goto sdev_free; }
if (udev->descriptor.bDeviceClass == USB_CLASS_HUB) { dev_dbg(&udev->dev, "%s is a usb hub device... skip!\n", udev_busid); rc = -ENODEV; - goto call_put_busid_priv; + goto sdev_free; }
if (!strcmp(udev->bus->bus_name, "vhci_hcd")) { @@ -348,15 +356,9 @@ static int stub_probe(struct usb_device udev_busid);
rc = -ENODEV; - goto call_put_busid_priv; + goto sdev_free; }
- /* ok, this is my device */ - sdev = stub_device_alloc(udev); - if (!sdev) { - rc = -ENOMEM; - goto call_put_busid_priv; - }
dev_info(&udev->dev, "usbip-host: register new device (bus %u dev %u)\n", @@ -366,9 +368,13 @@ static int stub_probe(struct usb_device
/* set private data to usb_device */ dev_set_drvdata(&udev->dev, sdev); + busid_priv->sdev = sdev; busid_priv->udev = udev;
+ save_status = busid_priv->status; + busid_priv->status = STUB_BUSID_ALLOC; + /* * Claim this hub port. * It doesn't matter what value we pass as owner @@ -381,15 +387,16 @@ static int stub_probe(struct usb_device goto err_port; }
+ /* release the busid_lock */ + put_busid_priv(busid_priv); + rc = stub_add_files(&udev->dev); if (rc) { dev_err(&udev->dev, "stub_add_files for %s\n", udev_busid); goto err_files; } - busid_priv->status = STUB_BUSID_ALLOC;
- rc = 0; - goto call_put_busid_priv; + return 0;
err_files: usb_hub_release_port(udev->parent, udev->portnum, @@ -398,23 +405,24 @@ err_port: dev_set_drvdata(&udev->dev, NULL); usb_put_dev(udev);
+ /* we already have busid_priv, just lock busid_lock */ + spin_lock(&busid_priv->busid_lock); busid_priv->sdev = NULL; + busid_priv->status = save_status; +sdev_free: stub_device_free(sdev); - -call_put_busid_priv: + /* release the busid_lock */ put_busid_priv(busid_priv); + return rc; }
static void shutdown_busid(struct bus_id_priv *busid_priv) { - if (busid_priv->sdev && !busid_priv->shutdown_busid) { - busid_priv->shutdown_busid = 1; - usbip_event_add(&busid_priv->sdev->ud, SDEV_EVENT_REMOVED); + usbip_event_add(&busid_priv->sdev->ud, SDEV_EVENT_REMOVED);
- /* wait for the stop of the event handler */ - usbip_stop_eh(&busid_priv->sdev->ud); - } + /* wait for the stop of the event handler */ + usbip_stop_eh(&busid_priv->sdev->ud); }
/* @@ -446,6 +454,9 @@ static void stub_disconnect(struct usb_d
dev_set_drvdata(&udev->dev, NULL);
+ /* release busid_lock before call to remove device files */ + put_busid_priv(busid_priv); + /* * NOTE: rx/tx threads are invoked for each usb_device. */ @@ -456,18 +467,27 @@ static void stub_disconnect(struct usb_d (struct usb_dev_state *) udev); if (rc) { dev_dbg(&udev->dev, "unable to release port\n"); - goto call_put_busid_priv; + return; }
/* If usb reset is called from event handler */ if (usbip_in_eh(current)) - goto call_put_busid_priv; + return; + + /* we already have busid_priv, just lock busid_lock */ + spin_lock(&busid_priv->busid_lock); + if (!busid_priv->shutdown_busid) + busid_priv->shutdown_busid = 1; + /* release busid_lock */ + put_busid_priv(busid_priv);
/* shutdown the current connection */ shutdown_busid(busid_priv);
usb_put_dev(sdev->udev);
+ /* we already have busid_priv, just lock busid_lock */ + spin_lock(&busid_priv->busid_lock); /* free sdev */ busid_priv->sdev = NULL; stub_device_free(sdev); @@ -476,6 +496,7 @@ static void stub_disconnect(struct usb_d busid_priv->status = STUB_BUSID_ADDED;
call_put_busid_priv: + /* release busid_lock */ put_busid_priv(busid_priv); }
From: Shuah Khan skhan@linuxfoundation.org
commit 3ea3091f1bd8586125848c62be295910e9802af0 upstream.
Fix the following sparse context imbalance regression introduced in a patch that fixed sleeping function called from invalid context bug.
kbuild test robot reported on:
tree/branch: https://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb.git usb-linus
Regressions in current branch:
drivers/usb/usbip/stub_dev.c:399:9: sparse: sparse: context imbalance in 'stub_probe' - different lock contexts for basic block drivers/usb/usbip/stub_dev.c:418:13: sparse: sparse: context imbalance in 'stub_disconnect' - different lock contexts for basic block drivers/usb/usbip/stub_dev.c:464:1-10: second lock on line 476
Error ids grouped by kconfigs:
recent_errors ├── i386-allmodconfig │ └── drivers-usb-usbip-stub_dev.c:second-lock-on-line ├── x86_64-allmodconfig │ ├── drivers-usb-usbip-stub_dev.c:sparse:sparse:context-imbalance-in-stub_disconnect-different-lock-contexts-for-basic-block │ └── drivers-usb-usbip-stub_dev.c:sparse:sparse:context-imbalance-in-stub_probe-different-lock-contexts-for-basic-block └── x86_64-allyesconfig └── drivers-usb-usbip-stub_dev.c:second-lock-on-line
This is a real problem in an error leg where spin_lock() is called on an already held lock.
Fix the imbalance in stub_probe() and stub_disconnect().
Signed-off-by: Shuah Khan skhan@linuxfoundation.org Fixes: 0c9e8b3cad65 ("usbip: usbip_host: fix BUG: sleeping function called from invalid context") Cc: stable stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/usb/usbip/stub_dev.c | 36 +++++++++++++++++++++++------------- 1 file changed, 23 insertions(+), 13 deletions(-)
--- a/drivers/usb/usbip/stub_dev.c +++ b/drivers/usb/usbip/stub_dev.c @@ -340,14 +340,17 @@ static int stub_probe(struct usb_device * See driver_probe_device() in driver/base/dd.c */ rc = -ENODEV; - goto sdev_free; + if (!busid_priv) + goto sdev_free; + + goto call_put_busid_priv; }
if (udev->descriptor.bDeviceClass == USB_CLASS_HUB) { dev_dbg(&udev->dev, "%s is a usb hub device... skip!\n", udev_busid); rc = -ENODEV; - goto sdev_free; + goto call_put_busid_priv; }
if (!strcmp(udev->bus->bus_name, "vhci_hcd")) { @@ -356,7 +359,7 @@ static int stub_probe(struct usb_device udev_busid);
rc = -ENODEV; - goto sdev_free; + goto call_put_busid_priv; }
@@ -375,6 +378,9 @@ static int stub_probe(struct usb_device save_status = busid_priv->status; busid_priv->status = STUB_BUSID_ALLOC;
+ /* release the busid_lock */ + put_busid_priv(busid_priv); + /* * Claim this hub port. * It doesn't matter what value we pass as owner @@ -387,9 +393,6 @@ static int stub_probe(struct usb_device goto err_port; }
- /* release the busid_lock */ - put_busid_priv(busid_priv); - rc = stub_add_files(&udev->dev); if (rc) { dev_err(&udev->dev, "stub_add_files for %s\n", udev_busid); @@ -409,11 +412,17 @@ err_port: spin_lock(&busid_priv->busid_lock); busid_priv->sdev = NULL; busid_priv->status = save_status; -sdev_free: - stub_device_free(sdev); + spin_unlock(&busid_priv->busid_lock); + /* lock is released - go to free */ + goto sdev_free; + +call_put_busid_priv: /* release the busid_lock */ put_busid_priv(busid_priv);
+sdev_free: + stub_device_free(sdev); + return rc; }
@@ -449,7 +458,9 @@ static void stub_disconnect(struct usb_d /* get stub_device */ if (!sdev) { dev_err(&udev->dev, "could not get device"); - goto call_put_busid_priv; + /* release busid_lock */ + put_busid_priv(busid_priv); + return; }
dev_set_drvdata(&udev->dev, NULL); @@ -479,7 +490,7 @@ static void stub_disconnect(struct usb_d if (!busid_priv->shutdown_busid) busid_priv->shutdown_busid = 1; /* release busid_lock */ - put_busid_priv(busid_priv); + spin_unlock(&busid_priv->busid_lock);
/* shutdown the current connection */ shutdown_busid(busid_priv); @@ -494,10 +505,9 @@ static void stub_disconnect(struct usb_d
if (busid_priv->status == STUB_BUSID_ALLOC) busid_priv->status = STUB_BUSID_ADDED; - -call_put_busid_priv: /* release busid_lock */ - put_busid_priv(busid_priv); + spin_unlock(&busid_priv->busid_lock); + return; }
#ifdef CONFIG_PM
From: Alan Stern stern@rowland.harvard.edu
commit a03ff54460817c76105f81f3aa8ef655759ccc9a upstream.
The syzkaller USB fuzzer found a slab-out-of-bounds write bug in the USB core, caused by a failure to check the actual size of a BOS descriptor. This patch adds a check to make sure the descriptor is at least as large as it is supposed to be, so that the code doesn't inadvertently access memory beyond the end of the allocated region when assigning to dev->bos->desc->bNumDeviceCaps later on.
Signed-off-by: Alan Stern stern@rowland.harvard.edu Reported-and-tested-by: syzbot+71f1e64501a309fcc012@syzkaller.appspotmail.com CC: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/usb/core/config.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
--- a/drivers/usb/core/config.c +++ b/drivers/usb/core/config.c @@ -931,8 +931,8 @@ int usb_get_bos_descriptor(struct usb_de
/* Get BOS descriptor */ ret = usb_get_descriptor(dev, USB_DT_BOS, 0, bos, USB_DT_BOS_SIZE); - if (ret < USB_DT_BOS_SIZE) { - dev_err(ddev, "unable to get BOS descriptor\n"); + if (ret < USB_DT_BOS_SIZE || bos->bLength < USB_DT_BOS_SIZE) { + dev_err(ddev, "unable to get BOS descriptor or descriptor too short\n"); if (ret >= 0) ret = -ENOMSG; kfree(bos);
From: Oliver Neukum oneukum@suse.com
commit 9a5729f68d3a82786aea110b1bfe610be318f80a upstream.
The pointer used to log a failure of usb_register_dev() must be set before the error is logged.
v2: fix that minor is not available before registration
Signed-off-by: oliver Neukum oneukum@suse.com Reported-by: syzbot+a0cbdbd6d169020c8959@syzkaller.appspotmail.com Fixes: 7b5cd5fefbe02 ("USB: SisUSB2VGA: Convert printk to dev_* macros") Cc: stable stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/usb/misc/sisusbvga/sisusb.c | 15 ++++++++------- 1 file changed, 8 insertions(+), 7 deletions(-)
--- a/drivers/usb/misc/sisusbvga/sisusb.c +++ b/drivers/usb/misc/sisusbvga/sisusb.c @@ -3041,6 +3041,13 @@ static int sisusb_probe(struct usb_inter
mutex_init(&(sisusb->lock));
+ sisusb->sisusb_dev = dev; + sisusb->vrambase = SISUSB_PCI_MEMBASE; + sisusb->mmiobase = SISUSB_PCI_MMIOBASE; + sisusb->mmiosize = SISUSB_PCI_MMIOSIZE; + sisusb->ioportbase = SISUSB_PCI_IOPORTBASE; + /* Everything else is zero */ + /* Register device */ retval = usb_register_dev(intf, &usb_sisusb_class); if (retval) { @@ -3051,13 +3058,7 @@ static int sisusb_probe(struct usb_inter goto error_1; }
- sisusb->sisusb_dev = dev; - sisusb->minor = intf->minor; - sisusb->vrambase = SISUSB_PCI_MEMBASE; - sisusb->mmiobase = SISUSB_PCI_MMIOBASE; - sisusb->mmiosize = SISUSB_PCI_MMIOSIZE; - sisusb->ioportbase = SISUSB_PCI_IOPORTBASE; - /* Everything else is zero */ + sisusb->minor = intf->minor;
/* Allocate buffers */ sisusb->ibufsize = SISUSB_IBUF_SIZE;
From: Maximilian Luz luzmaximilian@gmail.com
commit ea261113385ac0a71c2838185f39e8452d54b152 upstream.
Without USB_QUIRK_NO_LPM ethernet will not work and rtl8152 will complain with
r8152 <device...>: Stop submitting intr, status -71
Adding the quirk resolves this. As the dock is externally powered, this should not have any drawbacks.
Signed-off-by: Maximilian Luz luzmaximilian@gmail.com Cc: stable stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/usb/core/quirks.c | 3 +++ 1 file changed, 3 insertions(+)
--- a/drivers/usb/core/quirks.c +++ b/drivers/usb/core/quirks.c @@ -64,6 +64,9 @@ static const struct usb_device_id usb_qu /* Microsoft LifeCam-VX700 v2.0 */ { USB_DEVICE(0x045e, 0x0770), .driver_info = USB_QUIRK_RESET_RESUME },
+ /* Microsoft Surface Dock Ethernet (RTL8153 GigE) */ + { USB_DEVICE(0x045e, 0x07c6), .driver_info = USB_QUIRK_NO_LPM }, + /* Cherry Stream G230 2.0 (G85-231) and 3.0 (G85-232) */ { USB_DEVICE(0x046a, 0x0023), .driver_info = USB_QUIRK_RESET_RESUME },
From: Oliver Neukum oneukum@suse.com
commit 3864d33943b4a76c6e64616280e98d2410b1190f upstream.
This driver is using a global variable. It cannot handle more than one device at a time. The issue has been existing since the dawn of the driver.
Signed-off-by: Oliver Neukum oneukum@suse.com Reported-by: syzbot+35f04d136fc975a70da4@syzkaller.appspotmail.com Cc: stable stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/usb/misc/rio500.c | 24 ++++++++++++++++++------ 1 file changed, 18 insertions(+), 6 deletions(-)
--- a/drivers/usb/misc/rio500.c +++ b/drivers/usb/misc/rio500.c @@ -464,15 +464,23 @@ static int probe_rio(struct usb_interfac { struct usb_device *dev = interface_to_usbdev(intf); struct rio_usb_data *rio = &rio_instance; - int retval; + int retval = 0;
- dev_info(&intf->dev, "USB Rio found at address %d\n", dev->devnum); + mutex_lock(&rio500_mutex); + if (rio->present) { + dev_info(&intf->dev, "Second USB Rio at address %d refused\n", dev->devnum); + retval = -EBUSY; + goto bail_out; + } else { + dev_info(&intf->dev, "USB Rio found at address %d\n", dev->devnum); + }
retval = usb_register_dev(intf, &usb_rio_class); if (retval) { dev_err(&dev->dev, "Not able to get a minor for this device.\n"); - return -ENOMEM; + retval = -ENOMEM; + goto bail_out; }
rio->rio_dev = dev; @@ -481,7 +489,8 @@ static int probe_rio(struct usb_interfac dev_err(&dev->dev, "probe_rio: Not enough memory for the output buffer\n"); usb_deregister_dev(intf, &usb_rio_class); - return -ENOMEM; + retval = -ENOMEM; + goto bail_out; } dev_dbg(&intf->dev, "obuf address:%p\n", rio->obuf);
@@ -490,7 +499,8 @@ static int probe_rio(struct usb_interfac "probe_rio: Not enough memory for the input buffer\n"); usb_deregister_dev(intf, &usb_rio_class); kfree(rio->obuf); - return -ENOMEM; + retval = -ENOMEM; + goto bail_out; } dev_dbg(&intf->dev, "ibuf address:%p\n", rio->ibuf);
@@ -498,8 +508,10 @@ static int probe_rio(struct usb_interfac
usb_set_intfdata (intf, rio); rio->present = 1; +bail_out: + mutex_unlock(&rio500_mutex);
- return 0; + return retval; }
static void disconnect_rio(struct usb_interface *intf)
From: Oliver Neukum oneukum@suse.com
commit e0feb73428b69322dd5caae90b0207de369b5575 upstream.
If a disconnected device is closed, rio_close() must free the buffers.
Signed-off-by: Oliver Neukum oneukum@suse.com Cc: stable stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/usb/misc/rio500.c | 17 +++++++++++++++-- 1 file changed, 15 insertions(+), 2 deletions(-)
--- a/drivers/usb/misc/rio500.c +++ b/drivers/usb/misc/rio500.c @@ -103,9 +103,22 @@ static int close_rio(struct inode *inode { struct rio_usb_data *rio = &rio_instance;
- rio->isopen = 0; + /* against disconnect() */ + mutex_lock(&rio500_mutex); + mutex_lock(&(rio->lock));
- dev_info(&rio->rio_dev->dev, "Rio closed.\n"); + rio->isopen = 0; + if (!rio->present) { + /* cleanup has been delayed */ + kfree(rio->ibuf); + kfree(rio->obuf); + rio->ibuf = NULL; + rio->obuf = NULL; + } else { + dev_info(&rio->rio_dev->dev, "Rio closed.\n"); + } + mutex_unlock(&(rio->lock)); + mutex_unlock(&rio500_mutex); return 0; }
From: Alan Stern stern@rowland.harvard.edu
commit 31e0456de5be379b10fea0fa94a681057114a96e upstream.
The syzkaller USB fuzzer found a general-protection-fault bug in the smsusb part of the Siano DVB driver. The fault occurs during probe because the driver assumes without checking that the device has both IN and OUT endpoints and the IN endpoint is ep1.
By slightly rearranging the driver's initialization code, we can make the appropriate checks early on and thus avoid the problem. If the expected endpoints aren't present, the new code safely returns -ENODEV from the probe routine.
Signed-off-by: Alan Stern stern@rowland.harvard.edu Reported-and-tested-by: syzbot+53f029db71c19a47325a@syzkaller.appspotmail.com CC: stable@vger.kernel.org Reviewed-by: Johan Hovold johan@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/media/usb/siano/smsusb.c | 33 ++++++++++++++++++++------------- 1 file changed, 20 insertions(+), 13 deletions(-)
--- a/drivers/media/usb/siano/smsusb.c +++ b/drivers/media/usb/siano/smsusb.c @@ -402,6 +402,7 @@ static int smsusb_init_device(struct usb struct smsusb_device_t *dev; void *mdev; int i, rc; + int in_maxp;
/* create device object */ dev = kzalloc(sizeof(struct smsusb_device_t), GFP_KERNEL); @@ -413,6 +414,24 @@ static int smsusb_init_device(struct usb dev->udev = interface_to_usbdev(intf); dev->state = SMSUSB_DISCONNECTED;
+ for (i = 0; i < intf->cur_altsetting->desc.bNumEndpoints; i++) { + struct usb_endpoint_descriptor *desc = + &intf->cur_altsetting->endpoint[i].desc; + + if (desc->bEndpointAddress & USB_DIR_IN) { + dev->in_ep = desc->bEndpointAddress; + in_maxp = usb_endpoint_maxp(desc); + } else { + dev->out_ep = desc->bEndpointAddress; + } + } + + pr_debug("in_ep = %02x, out_ep = %02x\n", dev->in_ep, dev->out_ep); + if (!dev->in_ep || !dev->out_ep) { /* Missing endpoints? */ + smsusb_term_device(intf); + return -ENODEV; + } + params.device_type = sms_get_board(board_id)->type;
switch (params.device_type) { @@ -427,24 +446,12 @@ static int smsusb_init_device(struct usb /* fall-thru */ default: dev->buffer_size = USB2_BUFFER_SIZE; - dev->response_alignment = - le16_to_cpu(dev->udev->ep_in[1]->desc.wMaxPacketSize) - - sizeof(struct sms_msg_hdr); + dev->response_alignment = in_maxp - sizeof(struct sms_msg_hdr);
params.flags |= SMS_DEVICE_FAMILY2; break; }
- for (i = 0; i < intf->cur_altsetting->desc.bNumEndpoints; i++) { - if (intf->cur_altsetting->endpoint[i].desc. bEndpointAddress & USB_DIR_IN) - dev->in_ep = intf->cur_altsetting->endpoint[i].desc.bEndpointAddress; - else - dev->out_ep = intf->cur_altsetting->endpoint[i].desc.bEndpointAddress; - } - - pr_debug("in_ep = %02x, out_ep = %02x\n", - dev->in_ep, dev->out_ep); - params.device = &dev->udev->dev; params.buffer_size = dev->buffer_size; params.num_buffers = MAX_BUFFERS;
From: Alan Stern stern@rowland.harvard.edu
commit 45457c01171fd1488a7000d1751c06ed8560ee38 upstream.
GCC complains about an apparently uninitialized variable recently added to smsusb_init_device(). It's a false positive, but to silence the warning this patch adds a trivial initialization.
Signed-off-by: Alan Stern stern@rowland.harvard.edu Reported-by: kbuild test robot lkp@intel.com CC: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/media/usb/siano/smsusb.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/media/usb/siano/smsusb.c +++ b/drivers/media/usb/siano/smsusb.c @@ -402,7 +402,7 @@ static int smsusb_init_device(struct usb struct smsusb_device_t *dev; void *mdev; int i, rc; - int in_maxp; + int in_maxp = 0;
/* create device object */ dev = kzalloc(sizeof(struct smsusb_device_t), GFP_KERNEL);
From: Mauro Carvalho Chehab mchehab+samsung@kernel.org
commit a47686636d84eaec5c9c6e84bd5f96bed34d526d upstream.
Most Siano devices require an alignment for the response.
Changeset f3be52b0056a ("media: usb: siano: Fix general protection fault in smsusb") changed the logic with gets such aligment, but it now produces a sparce warning:
drivers/media/usb/siano/smsusb.c: In function 'smsusb_init_device': drivers/media/usb/siano/smsusb.c:447:37: warning: 'in_maxp' may be used uninitialized in this function [-Wmaybe-uninitialized] 447 | dev->response_alignment = in_maxp - sizeof(struct sms_msg_hdr); | ~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~
The sparse message itself is bogus, but a broken (or fake) USB eeprom could produce a negative value for response_alignment.
So, change the code in order to check if the result is not negative.
Fixes: 31e0456de5be ("media: usb: siano: Fix general protection fault in smsusb") CC: stable@vger.kernel.org Signed-off-by: Mauro Carvalho Chehab mchehab+samsung@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/media/usb/siano/smsusb.c | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-)
--- a/drivers/media/usb/siano/smsusb.c +++ b/drivers/media/usb/siano/smsusb.c @@ -402,7 +402,7 @@ static int smsusb_init_device(struct usb struct smsusb_device_t *dev; void *mdev; int i, rc; - int in_maxp = 0; + int align = 0;
/* create device object */ dev = kzalloc(sizeof(struct smsusb_device_t), GFP_KERNEL); @@ -420,14 +420,14 @@ static int smsusb_init_device(struct usb
if (desc->bEndpointAddress & USB_DIR_IN) { dev->in_ep = desc->bEndpointAddress; - in_maxp = usb_endpoint_maxp(desc); + align = usb_endpoint_maxp(desc) - sizeof(struct sms_msg_hdr); } else { dev->out_ep = desc->bEndpointAddress; } }
pr_debug("in_ep = %02x, out_ep = %02x\n", dev->in_ep, dev->out_ep); - if (!dev->in_ep || !dev->out_ep) { /* Missing endpoints? */ + if (!dev->in_ep || !dev->out_ep || align < 0) { /* Missing endpoints? */ smsusb_term_device(intf); return -ENODEV; } @@ -446,7 +446,7 @@ static int smsusb_init_device(struct usb /* fall-thru */ default: dev->buffer_size = USB2_BUFFER_SIZE; - dev->response_alignment = in_maxp - sizeof(struct sms_msg_hdr); + dev->response_alignment = align;
params.flags |= SMS_DEVICE_FAMILY2; break;
From: Steffen Maier maier@linux.ibm.com
commit d27e5e07f9c49bf2a6a4ef254ce531c1b4fb5a38 upstream.
With this early return due to zfcp_unit child(ren), we don't use the zfcp_port reference from the earlier zfcp_get_port_by_wwpn() anymore and need to put it.
Signed-off-by: Steffen Maier maier@linux.ibm.com Fixes: d99b601b6338 ("[SCSI] zfcp: restore refcount check on port_remove") Cc: stable@vger.kernel.org #3.7+ Reviewed-by: Jens Remus jremus@linux.ibm.com Reviewed-by: Benjamin Block bblock@linux.ibm.com Signed-off-by: Martin K. Petersen martin.petersen@oracle.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/s390/scsi/zfcp_sysfs.c | 1 + 1 file changed, 1 insertion(+)
--- a/drivers/s390/scsi/zfcp_sysfs.c +++ b/drivers/s390/scsi/zfcp_sysfs.c @@ -263,6 +263,7 @@ static ssize_t zfcp_sysfs_port_remove_st if (atomic_read(&port->units) > 0) { retval = -EBUSY; mutex_unlock(&zfcp_sysfs_port_units_mutex); + put_device(&port->dev); /* undo zfcp_get_port_by_wwpn() */ goto out; } /* port is about to be removed, so no more unit_add */
From: Steffen Maier maier@linux.ibm.com
commit ef4021fe5fd77ced0323cede27979d80a56211ca upstream.
When the user tries to remove a zfcp port via sysfs, we only rejected it if there are zfcp unit children under the port. With purely automatically scanned LUNs there are no zfcp units but only SCSI devices. In such cases, the port_remove erroneously continued. We close the port and this implicitly closes all LUNs under the port. The SCSI devices survive with their private zfcp_scsi_dev still holding a reference to the "removed" zfcp_port (still allocated but invisible in sysfs) [zfcp_get_port_by_wwpn in zfcp_scsi_slave_alloc]. This is not a problem as long as the fc_rport stays blocked. Once (auto) port scan brings back the removed port, we unblock its fc_rport again by design. However, there is no mechanism that would recover (open) the LUNs under the port (no "ersfs_3" without zfcp_unit [zfcp_erp_strategy_followup_success]). Any pending or new I/O to such LUN leads to repeated:
Done: NEEDS_RETRY Result: hostbyte=DID_IMM_RETRY driverbyte=DRIVER_OK
See also v4.10 commit 6f2ce1c6af37 ("scsi: zfcp: fix rport unblock race with LUN recovery"). Even a manual LUN recovery (echo 0 > /sys/bus/scsi/devices/H:C:T:L/zfcp_failed) does not help, as the LUN links to the old "removed" port which remains to lack ZFCP_STATUS_COMMON_RUNNING [zfcp_erp_required_act]. The only workaround is to first ensure that the fc_rport is blocked (e.g. port_remove again in case it was re-discovered by (auto) port scan), then delete the SCSI devices, and finally re-discover by (auto) port scan. The port scan includes an fc_rport unblock, which in turn triggers a new scan on the scsi target to freshly get new pure auto scan LUNs.
Fix this by rejecting port_remove also if there are SCSI devices (even without any zfcp_unit) under this port. Re-use mechanics from v3.7 commit d99b601b6338 ("[SCSI] zfcp: restore refcount check on port_remove"). However, we have to give up zfcp_sysfs_port_units_mutex earlier in unit_add to prevent a deadlock with scsi_host scan taking shost->scan_mutex first and then zfcp_sysfs_port_units_mutex now in our zfcp_scsi_slave_alloc().
Signed-off-by: Steffen Maier maier@linux.ibm.com Fixes: b62a8d9b45b9 ("[SCSI] zfcp: Use SCSI device data zfcp scsi dev instead of zfcp unit") Fixes: f8210e34887e ("[SCSI] zfcp: Allow midlayer to scan for LUNs when running in NPIV mode") Cc: stable@vger.kernel.org #2.6.37+ Reviewed-by: Benjamin Block bblock@linux.ibm.com Signed-off-by: Martin K. Petersen martin.petersen@oracle.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/s390/scsi/zfcp_ext.h | 1 drivers/s390/scsi/zfcp_scsi.c | 9 ++++++ drivers/s390/scsi/zfcp_sysfs.c | 54 ++++++++++++++++++++++++++++++++++++----- drivers/s390/scsi/zfcp_unit.c | 8 +++++- 4 files changed, 65 insertions(+), 7 deletions(-)
--- a/drivers/s390/scsi/zfcp_ext.h +++ b/drivers/s390/scsi/zfcp_ext.h @@ -161,6 +161,7 @@ extern const struct attribute_group *zfc extern struct mutex zfcp_sysfs_port_units_mutex; extern struct device_attribute *zfcp_sysfs_sdev_attrs[]; extern struct device_attribute *zfcp_sysfs_shost_attrs[]; +bool zfcp_sysfs_port_is_removing(const struct zfcp_port *const port);
/* zfcp_unit.c */ extern int zfcp_unit_add(struct zfcp_port *, u64); --- a/drivers/s390/scsi/zfcp_scsi.c +++ b/drivers/s390/scsi/zfcp_scsi.c @@ -124,6 +124,15 @@ static int zfcp_scsi_slave_alloc(struct
zfcp_sdev->erp_action.port = port;
+ mutex_lock(&zfcp_sysfs_port_units_mutex); + if (zfcp_sysfs_port_is_removing(port)) { + /* port is already gone */ + mutex_unlock(&zfcp_sysfs_port_units_mutex); + put_device(&port->dev); /* undo zfcp_get_port_by_wwpn() */ + return -ENXIO; + } + mutex_unlock(&zfcp_sysfs_port_units_mutex); + unit = zfcp_unit_find(port, zfcp_scsi_dev_lun(sdev)); if (unit) put_device(&unit->dev); --- a/drivers/s390/scsi/zfcp_sysfs.c +++ b/drivers/s390/scsi/zfcp_sysfs.c @@ -237,6 +237,53 @@ static ZFCP_DEV_ATTR(adapter, port_resca
DEFINE_MUTEX(zfcp_sysfs_port_units_mutex);
+static void zfcp_sysfs_port_set_removing(struct zfcp_port *const port) +{ + lockdep_assert_held(&zfcp_sysfs_port_units_mutex); + atomic_set(&port->units, -1); +} + +bool zfcp_sysfs_port_is_removing(const struct zfcp_port *const port) +{ + lockdep_assert_held(&zfcp_sysfs_port_units_mutex); + return atomic_read(&port->units) == -1; +} + +static bool zfcp_sysfs_port_in_use(struct zfcp_port *const port) +{ + struct zfcp_adapter *const adapter = port->adapter; + unsigned long flags; + struct scsi_device *sdev; + bool in_use = true; + + mutex_lock(&zfcp_sysfs_port_units_mutex); + if (atomic_read(&port->units) > 0) + goto unlock_port_units_mutex; /* zfcp_unit(s) under port */ + + spin_lock_irqsave(adapter->scsi_host->host_lock, flags); + __shost_for_each_device(sdev, adapter->scsi_host) { + const struct zfcp_scsi_dev *zsdev = sdev_to_zfcp(sdev); + + if (sdev->sdev_state == SDEV_DEL || + sdev->sdev_state == SDEV_CANCEL) + continue; + if (zsdev->port != port) + continue; + /* alive scsi_device under port of interest */ + goto unlock_host_lock; + } + + /* port is about to be removed, so no more unit_add or slave_alloc */ + zfcp_sysfs_port_set_removing(port); + in_use = false; + +unlock_host_lock: + spin_unlock_irqrestore(adapter->scsi_host->host_lock, flags); +unlock_port_units_mutex: + mutex_unlock(&zfcp_sysfs_port_units_mutex); + return in_use; +} + static ssize_t zfcp_sysfs_port_remove_store(struct device *dev, struct device_attribute *attr, const char *buf, size_t count) @@ -259,16 +306,11 @@ static ssize_t zfcp_sysfs_port_remove_st else retval = 0;
- mutex_lock(&zfcp_sysfs_port_units_mutex); - if (atomic_read(&port->units) > 0) { + if (zfcp_sysfs_port_in_use(port)) { retval = -EBUSY; - mutex_unlock(&zfcp_sysfs_port_units_mutex); put_device(&port->dev); /* undo zfcp_get_port_by_wwpn() */ goto out; } - /* port is about to be removed, so no more unit_add */ - atomic_set(&port->units, -1); - mutex_unlock(&zfcp_sysfs_port_units_mutex);
write_lock_irq(&adapter->port_list_lock); list_del(&port->list); --- a/drivers/s390/scsi/zfcp_unit.c +++ b/drivers/s390/scsi/zfcp_unit.c @@ -123,7 +123,7 @@ int zfcp_unit_add(struct zfcp_port *port int retval = 0;
mutex_lock(&zfcp_sysfs_port_units_mutex); - if (atomic_read(&port->units) == -1) { + if (zfcp_sysfs_port_is_removing(port)) { /* port is already gone */ retval = -ENODEV; goto out; @@ -167,8 +167,14 @@ int zfcp_unit_add(struct zfcp_port *port write_lock_irq(&port->unit_list_lock); list_add_tail(&unit->list, &port->unit_list); write_unlock_irq(&port->unit_list_lock); + /* + * lock order: shost->scan_mutex before zfcp_sysfs_port_units_mutex + * due to zfcp_unit_scsi_scan() => zfcp_scsi_slave_alloc() + */ + mutex_unlock(&zfcp_sysfs_port_units_mutex);
zfcp_unit_scsi_scan(unit); + return retval;
out: mutex_unlock(&zfcp_sysfs_port_units_mutex);
From: Filipe Manana fdmanana@suse.com
commit 06989c799f04810f6876900d4760c0edda369cf7 upstream.
When syncing the log, the final phase of a fsync operation, we need to either create a log root's item or update the existing item in the log tree of log roots, and that depends on the current value of the log root's log_transid - if it's 1 we need to create the log root item, otherwise it must exist already and we update it. Since there is no synchronization between updating the log_transid and checking it for deciding whether the log root's item needs to be created or updated, we end up with a tiny race window that results in attempts to update the item to fail because the item was not yet created:
CPU 1 CPU 2
btrfs_sync_log()
lock root->log_mutex
set log root's log_transid to 1
unlock root->log_mutex
btrfs_sync_log()
lock root->log_mutex
sets log root's log_transid to 2
unlock root->log_mutex
update_log_root()
sees log root's log_transid with a value of 2
calls btrfs_update_root(), which fails with -EUCLEAN and causes transaction abort
Until recently the race lead to a BUG_ON at btrfs_update_root(), but after the recent commit 7ac1e464c4d47 ("btrfs: Don't panic when we can't find a root key") we just abort the current transaction.
A sample trace of the BUG_ON() on a SLE12 kernel:
------------[ cut here ]------------ kernel BUG at ../fs/btrfs/root-tree.c:157! Oops: Exception in kernel mode, sig: 5 [#1] SMP NR_CPUS=2048 NUMA pSeries (...) Supported: Yes, External CPU: 78 PID: 76303 Comm: rtas_errd Tainted: G X 4.4.156-94.57-default #1 task: c00000ffa906d010 ti: c00000ff42b08000 task.ti: c00000ff42b08000 NIP: d000000036ae5cdc LR: d000000036ae5cd8 CTR: 0000000000000000 REGS: c00000ff42b0b860 TRAP: 0700 Tainted: G X (4.4.156-94.57-default) MSR: 8000000002029033 <SF,VEC,EE,ME,IR,DR,RI,LE> CR: 22444484 XER: 20000000 CFAR: d000000036aba66c SOFTE: 1 GPR00: d000000036ae5cd8 c00000ff42b0bae0 d000000036bda220 0000000000000054 GPR04: 0000000000000001 0000000000000000 c00007ffff8d37c8 0000000000000000 GPR08: c000000000e19c00 0000000000000000 0000000000000000 3736343438312079 GPR12: 3930373337303434 c000000007a3a800 00000000007fffff 0000000000000023 GPR16: c00000ffa9d26028 c00000ffa9d261f8 0000000000000010 c00000ffa9d2ab28 GPR20: c00000ff42b0bc48 0000000000000001 c00000ff9f0d9888 0000000000000001 GPR24: c00000ffa9d26000 c00000ffa9d261e8 c00000ffa9d2a800 c00000ff9f0d9888 GPR28: c00000ffa9d26028 c00000ffa9d2aa98 0000000000000001 c00000ffa98f5b20 NIP [d000000036ae5cdc] btrfs_update_root+0x25c/0x4e0 [btrfs] LR [d000000036ae5cd8] btrfs_update_root+0x258/0x4e0 [btrfs] Call Trace: [c00000ff42b0bae0] [d000000036ae5cd8] btrfs_update_root+0x258/0x4e0 [btrfs] (unreliable) [c00000ff42b0bba0] [d000000036b53610] btrfs_sync_log+0x2d0/0xc60 [btrfs] [c00000ff42b0bce0] [d000000036b1785c] btrfs_sync_file+0x44c/0x4e0 [btrfs] [c00000ff42b0bd80] [c00000000032e300] vfs_fsync_range+0x70/0x120 [c00000ff42b0bdd0] [c00000000032e44c] do_fsync+0x5c/0xb0 [c00000ff42b0be10] [c00000000032e8dc] SyS_fdatasync+0x2c/0x40 [c00000ff42b0be30] [c000000000009488] system_call+0x3c/0x100 Instruction dump: 7f43d378 4bffebb9 60000000 88d90008 3d220000 e8b90000 3b390009 e87a01f0 e8898e08 e8f90000 4bfd48e5 60000000 <0fe00000> e95b0060 39200004 394a0ea0 ---[ end trace 8f2dc8f919cabab8 ]---
So fix this by doing the check of log_transid and updating or creating the log root's item while holding the root's log_mutex.
Fixes: 7237f1833601d ("Btrfs: fix tree logs parallel sync") CC: stable@vger.kernel.org # 4.4+ Signed-off-by: Filipe Manana fdmanana@suse.com Signed-off-by: David Sterba dsterba@suse.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- fs/btrfs/tree-log.c | 8 ++++++-- 1 file changed, 6 insertions(+), 2 deletions(-)
--- a/fs/btrfs/tree-log.c +++ b/fs/btrfs/tree-log.c @@ -2827,6 +2827,12 @@ int btrfs_sync_log(struct btrfs_trans_ha log->log_transid = root->log_transid; root->log_start_pid = 0; /* + * Update or create log root item under the root's log_mutex to prevent + * races with concurrent log syncs that can lead to failure to update + * log root item because it was not created yet. + */ + ret = update_log_root(trans, log); + /* * IO has been started, blocks of the log tree have WRITTEN flag set * in their headers. new modifications of the log will be written to * new positions. so it's safe to allow log writers to go in. @@ -2845,8 +2851,6 @@ int btrfs_sync_log(struct btrfs_trans_ha
mutex_unlock(&log_root_tree->log_mutex);
- ret = update_log_root(trans, log); - mutex_lock(&log_root_tree->log_mutex); if (atomic_dec_and_test(&log_root_tree->log_writers)) { /*
From: Ravi Bangoria ravi.bangoria@linux.ibm.com
commit 3202e35ec1c8fc19cea24253ff83edf702a60a02 upstream.
Consider a scenario where user creates two events:
1st event: attr.sample_type |= PERF_SAMPLE_BRANCH_STACK; attr.branch_sample_type = PERF_SAMPLE_BRANCH_ANY; fd = perf_event_open(attr, 0, 1, -1, 0);
This sets cpuhw->bhrb_filter to 0 and returns valid fd.
2nd event: attr.sample_type |= PERF_SAMPLE_BRANCH_STACK; attr.branch_sample_type = PERF_SAMPLE_BRANCH_CALL; fd = perf_event_open(attr, 0, 1, -1, 0);
It overrides cpuhw->bhrb_filter to -1 and returns with error.
Now if power_pmu_enable() gets called by any path other than power_pmu_add(), ppmu->config_bhrb(-1) will set MMCRA to -1.
Fixes: 3925f46bb590 ("powerpc/perf: Enable branch stack sampling framework") Cc: stable@vger.kernel.org # v3.10+ Signed-off-by: Ravi Bangoria ravi.bangoria@linux.ibm.com Reviewed-by: Madhavan Srinivasan maddy@linux.vnet.ibm.com Signed-off-by: Michael Ellerman mpe@ellerman.id.au Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- arch/powerpc/perf/core-book3s.c | 6 ++++-- arch/powerpc/perf/power8-pmu.c | 3 +++ arch/powerpc/perf/power9-pmu.c | 3 +++ 3 files changed, 10 insertions(+), 2 deletions(-)
--- a/arch/powerpc/perf/core-book3s.c +++ b/arch/powerpc/perf/core-book3s.c @@ -1800,6 +1800,7 @@ static int power_pmu_event_init(struct p int n; int err; struct cpu_hw_events *cpuhw; + u64 bhrb_filter;
if (!ppmu) return -ENOENT; @@ -1896,13 +1897,14 @@ static int power_pmu_event_init(struct p err = power_check_constraints(cpuhw, events, cflags, n + 1);
if (has_branch_stack(event)) { - cpuhw->bhrb_filter = ppmu->bhrb_filter_map( + bhrb_filter = ppmu->bhrb_filter_map( event->attr.branch_sample_type);
- if (cpuhw->bhrb_filter == -1) { + if (bhrb_filter == -1) { put_cpu_var(cpu_hw_events); return -EOPNOTSUPP; } + cpuhw->bhrb_filter = bhrb_filter; }
put_cpu_var(cpu_hw_events); --- a/arch/powerpc/perf/power8-pmu.c +++ b/arch/powerpc/perf/power8-pmu.c @@ -29,6 +29,7 @@ enum { #define POWER8_MMCRA_IFM1 0x0000000040000000UL #define POWER8_MMCRA_IFM2 0x0000000080000000UL #define POWER8_MMCRA_IFM3 0x00000000C0000000UL +#define POWER8_MMCRA_BHRB_MASK 0x00000000C0000000UL
/* Table of alternatives, sorted by column 0 */ static const unsigned int event_alternatives[][MAX_ALT] = { @@ -262,6 +263,8 @@ static u64 power8_bhrb_filter_map(u64 br
static void power8_config_bhrb(u64 pmu_bhrb_filter) { + pmu_bhrb_filter &= POWER8_MMCRA_BHRB_MASK; + /* Enable BHRB filter in PMU */ mtspr(SPRN_MMCRA, (mfspr(SPRN_MMCRA) | pmu_bhrb_filter)); } --- a/arch/powerpc/perf/power9-pmu.c +++ b/arch/powerpc/perf/power9-pmu.c @@ -30,6 +30,7 @@ enum { #define POWER9_MMCRA_IFM1 0x0000000040000000UL #define POWER9_MMCRA_IFM2 0x0000000080000000UL #define POWER9_MMCRA_IFM3 0x00000000C0000000UL +#define POWER9_MMCRA_BHRB_MASK 0x00000000C0000000UL
GENERIC_EVENT_ATTR(cpu-cycles, PM_CYC); GENERIC_EVENT_ATTR(stalled-cycles-frontend, PM_ICT_NOSLOT_CYC); @@ -177,6 +178,8 @@ static u64 power9_bhrb_filter_map(u64 br
static void power9_config_bhrb(u64 pmu_bhrb_filter) { + pmu_bhrb_filter &= POWER9_MMCRA_BHRB_MASK; + /* Enable BHRB filter in PMU */ mtspr(SPRN_MMCRA, (mfspr(SPRN_MMCRA) | pmu_bhrb_filter)); }
From: Kailang Yang kailang@realtek.com
commit 317d9313925cd8388304286c0d3c8dda7f060a2d upstream.
I measured power consumption between power_save_node=1 and power_save_node=0. It's almost the same. Codec will enter to runtime suspend and suspend. That pin also will enter to D3. Don't need to enter to D3 by single pin. So, Disable power_save_node as default. It will avoid more issues. Windows Driver also has not this option at runtime PM.
Signed-off-by: Kailang Yang kailang@realtek.com Cc: stable@vger.kernel.org Signed-off-by: Takashi Iwai tiwai@suse.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- sound/pci/hda/patch_realtek.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/sound/pci/hda/patch_realtek.c +++ b/sound/pci/hda/patch_realtek.c @@ -6317,7 +6317,7 @@ static int patch_alc269(struct hda_codec
spec = codec->spec; spec->gen.shared_mic_vref_pin = 0x18; - codec->power_save_node = 1; + codec->power_save_node = 0;
#ifdef CONFIG_PM codec->patch_ops.suspend = alc269_suspend;
From: Lyude Paul lyude@redhat.com
commit 342406e4fbba9a174125fbfe6aeac3d64ef90f76 upstream.
For a while, we've had the problem of i2c bus access not grabbing a runtime PM ref when it's being used in userspace by i2c-dev, resulting in nouveau spamming the kernel log with errors if anything attempts to access the i2c bus while the GPU is in runtime suspend. An example:
[ 130.078386] nouveau 0000:01:00.0: i2c: aux 000d: begin idle timeout ffffffff
Since the GPU is in runtime suspend, the MMIO region that the i2c bus is on isn't accessible. On x86, the standard behavior for accessing an unavailable MMIO region is to just return ~0.
Except, that turned out to be a lie. While computers with a clean concious will return ~0 in this scenario, some machines will actually completely hang a CPU on certian bad MMIO accesses. This was witnessed with someone's Lenovo ThinkPad P50, where sensors-detect attempting to access the i2c bus while the GPU was suspended would result in a CPU hang:
CPU: 5 PID: 12438 Comm: sensors-detect Not tainted 5.0.0-0.rc4.git3.1.fc30.x86_64 #1 Hardware name: LENOVO 20EQS64N17/20EQS64N17, BIOS N1EET74W (1.47 ) 11/21/2017 RIP: 0010:ioread32+0x2b/0x30 Code: 81 ff ff ff 03 00 77 20 48 81 ff 00 00 01 00 76 05 0f b7 d7 ed c3 48 c7 c6 e1 0c 36 96 e8 2d ff ff ff b8 ff ff ff ff c3 8b 07 <c3> 0f 1f 40 00 49 89 f0 48 81 fe ff ff 03 00 76 04 40 88 3e c3 48 RSP: 0018:ffffaac3c5007b48 EFLAGS: 00000292 ORIG_RAX: ffffffffffffff13 RAX: 0000000001111000 RBX: 0000000001111000 RCX: 0000043017a97186 RDX: 0000000000000aaa RSI: 0000000000000005 RDI: ffffaac3c400e4e4 RBP: ffff9e6443902c00 R08: ffffaac3c400e4e4 R09: ffffaac3c5007be7 R10: 0000000000000004 R11: 0000000000000001 R12: ffff9e6445dd0000 R13: 000000000000e4e4 R14: 00000000000003c4 R15: 0000000000000000 FS: 00007f253155a740(0000) GS:ffff9e644f600000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00005630d1500358 CR3: 0000000417c44006 CR4: 00000000003606e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: g94_i2c_aux_xfer+0x326/0x850 [nouveau] nvkm_i2c_aux_i2c_xfer+0x9e/0x140 [nouveau] __i2c_transfer+0x14b/0x620 i2c_smbus_xfer_emulated+0x159/0x680 ? _raw_spin_unlock_irqrestore+0x1/0x60 ? rt_mutex_slowlock.constprop.0+0x13d/0x1e0 ? __lock_is_held+0x59/0xa0 __i2c_smbus_xfer+0x138/0x5a0 i2c_smbus_xfer+0x4f/0x80 i2cdev_ioctl_smbus+0x162/0x2d0 [i2c_dev] i2cdev_ioctl+0x1db/0x2c0 [i2c_dev] do_vfs_ioctl+0x408/0x750 ksys_ioctl+0x5e/0x90 __x64_sys_ioctl+0x16/0x20 do_syscall_64+0x60/0x1e0 entry_SYSCALL_64_after_hwframe+0x49/0xbe RIP: 0033:0x7f25317f546b Code: 0f 1e fa 48 8b 05 1d da 0c 00 64 c7 00 26 00 00 00 48 c7 c0 ff ff ff ff c3 66 0f 1f 44 00 00 f3 0f 1e fa b8 10 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d ed d9 0c 00 f7 d8 64 89 01 48 RSP: 002b:00007ffc88caab68 EFLAGS: 00000246 ORIG_RAX: 0000000000000010 RAX: ffffffffffffffda RBX: 00005630d0fe7260 RCX: 00007f25317f546b RDX: 00005630d1598e80 RSI: 0000000000000720 RDI: 0000000000000003 RBP: 00005630d155b968 R08: 0000000000000001 R09: 00005630d15a1da0 R10: 0000000000000070 R11: 0000000000000246 R12: 00005630d1598e80 R13: 00005630d12f3d28 R14: 0000000000000720 R15: 00005630d12f3ce0 watchdog: BUG: soft lockup - CPU#5 stuck for 23s! [sensors-detect:12438]
Yikes! While I wanted to try to make it so that accessing an i2c bus on nouveau would wake up the GPU as needed, airlied pointed out that pretty much any usecase for userspace accessing an i2c bus on a GPU (mainly for the DDC brightness control that some displays have) is going to only be useful while there's at least one display enabled on the GPU anyway, and the GPU never sleeps while there's displays running.
Since teaching the i2c bus to wake up the GPU on userspace accesses is a good deal more difficult than it might seem, mostly due to the fact that we have to use the i2c bus during runtime resume of the GPU, we instead opt for the easiest solution: don't let userspace access i2c busses on the GPU at all while it's in runtime suspend.
Changes since v1: * Also disable i2c busses that run over DP AUX
Signed-off-by: Lyude Paul lyude@redhat.com Cc: stable@vger.kernel.org Signed-off-by: Ben Skeggs bskeggs@redhat.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/gpu/drm/nouveau/include/nvkm/subdev/i2c.h | 2 + drivers/gpu/drm/nouveau/nvkm/subdev/i2c/aux.c | 26 +++++++++++++++++++++- drivers/gpu/drm/nouveau/nvkm/subdev/i2c/aux.h | 2 + drivers/gpu/drm/nouveau/nvkm/subdev/i2c/base.c | 15 ++++++++++++ drivers/gpu/drm/nouveau/nvkm/subdev/i2c/bus.c | 21 ++++++++++++++++- drivers/gpu/drm/nouveau/nvkm/subdev/i2c/bus.h | 1 6 files changed, 65 insertions(+), 2 deletions(-)
--- a/drivers/gpu/drm/nouveau/include/nvkm/subdev/i2c.h +++ b/drivers/gpu/drm/nouveau/include/nvkm/subdev/i2c.h @@ -37,6 +37,7 @@ struct nvkm_i2c_bus { struct mutex mutex; struct list_head head; struct i2c_adapter i2c; + u8 enabled; };
int nvkm_i2c_bus_acquire(struct nvkm_i2c_bus *); @@ -56,6 +57,7 @@ struct nvkm_i2c_aux { struct mutex mutex; struct list_head head; struct i2c_adapter i2c; + u8 enabled;
u32 intr; }; --- a/drivers/gpu/drm/nouveau/nvkm/subdev/i2c/aux.c +++ b/drivers/gpu/drm/nouveau/nvkm/subdev/i2c/aux.c @@ -105,9 +105,15 @@ nvkm_i2c_aux_acquire(struct nvkm_i2c_aux { struct nvkm_i2c_pad *pad = aux->pad; int ret; + AUX_TRACE(aux, "acquire"); mutex_lock(&aux->mutex); - ret = nvkm_i2c_pad_acquire(pad, NVKM_I2C_PAD_AUX); + + if (aux->enabled) + ret = nvkm_i2c_pad_acquire(pad, NVKM_I2C_PAD_AUX); + else + ret = -EIO; + if (ret) mutex_unlock(&aux->mutex); return ret; @@ -141,6 +147,24 @@ nvkm_i2c_aux_del(struct nvkm_i2c_aux **p } }
+void +nvkm_i2c_aux_init(struct nvkm_i2c_aux *aux) +{ + AUX_TRACE(aux, "init"); + mutex_lock(&aux->mutex); + aux->enabled = true; + mutex_unlock(&aux->mutex); +} + +void +nvkm_i2c_aux_fini(struct nvkm_i2c_aux *aux) +{ + AUX_TRACE(aux, "fini"); + mutex_lock(&aux->mutex); + aux->enabled = false; + mutex_unlock(&aux->mutex); +} + int nvkm_i2c_aux_ctor(const struct nvkm_i2c_aux_func *func, struct nvkm_i2c_pad *pad, int id, --- a/drivers/gpu/drm/nouveau/nvkm/subdev/i2c/aux.h +++ b/drivers/gpu/drm/nouveau/nvkm/subdev/i2c/aux.h @@ -14,6 +14,8 @@ int nvkm_i2c_aux_ctor(const struct nvkm_ int nvkm_i2c_aux_new_(const struct nvkm_i2c_aux_func *, struct nvkm_i2c_pad *, int id, struct nvkm_i2c_aux **); void nvkm_i2c_aux_del(struct nvkm_i2c_aux **); +void nvkm_i2c_aux_init(struct nvkm_i2c_aux *); +void nvkm_i2c_aux_fini(struct nvkm_i2c_aux *); int nvkm_i2c_aux_xfer(struct nvkm_i2c_aux *, bool retry, u8 type, u32 addr, u8 *data, u8 size);
--- a/drivers/gpu/drm/nouveau/nvkm/subdev/i2c/base.c +++ b/drivers/gpu/drm/nouveau/nvkm/subdev/i2c/base.c @@ -160,8 +160,18 @@ nvkm_i2c_fini(struct nvkm_subdev *subdev { struct nvkm_i2c *i2c = nvkm_i2c(subdev); struct nvkm_i2c_pad *pad; + struct nvkm_i2c_bus *bus; + struct nvkm_i2c_aux *aux; u32 mask;
+ list_for_each_entry(aux, &i2c->aux, head) { + nvkm_i2c_aux_fini(aux); + } + + list_for_each_entry(bus, &i2c->bus, head) { + nvkm_i2c_bus_fini(bus); + } + if ((mask = (1 << i2c->func->aux) - 1), i2c->func->aux_stat) { i2c->func->aux_mask(i2c, NVKM_I2C_ANY, mask, 0); i2c->func->aux_stat(i2c, &mask, &mask, &mask, &mask); @@ -180,6 +190,7 @@ nvkm_i2c_init(struct nvkm_subdev *subdev struct nvkm_i2c *i2c = nvkm_i2c(subdev); struct nvkm_i2c_bus *bus; struct nvkm_i2c_pad *pad; + struct nvkm_i2c_aux *aux;
list_for_each_entry(pad, &i2c->pad, head) { nvkm_i2c_pad_init(pad); @@ -189,6 +200,10 @@ nvkm_i2c_init(struct nvkm_subdev *subdev nvkm_i2c_bus_init(bus); }
+ list_for_each_entry(aux, &i2c->aux, head) { + nvkm_i2c_aux_init(aux); + } + return 0; }
--- a/drivers/gpu/drm/nouveau/nvkm/subdev/i2c/bus.c +++ b/drivers/gpu/drm/nouveau/nvkm/subdev/i2c/bus.c @@ -110,6 +110,19 @@ nvkm_i2c_bus_init(struct nvkm_i2c_bus *b BUS_TRACE(bus, "init"); if (bus->func->init) bus->func->init(bus); + + mutex_lock(&bus->mutex); + bus->enabled = true; + mutex_unlock(&bus->mutex); +} + +void +nvkm_i2c_bus_fini(struct nvkm_i2c_bus *bus) +{ + BUS_TRACE(bus, "fini"); + mutex_lock(&bus->mutex); + bus->enabled = false; + mutex_unlock(&bus->mutex); }
void @@ -126,9 +139,15 @@ nvkm_i2c_bus_acquire(struct nvkm_i2c_bus { struct nvkm_i2c_pad *pad = bus->pad; int ret; + BUS_TRACE(bus, "acquire"); mutex_lock(&bus->mutex); - ret = nvkm_i2c_pad_acquire(pad, NVKM_I2C_PAD_I2C); + + if (bus->enabled) + ret = nvkm_i2c_pad_acquire(pad, NVKM_I2C_PAD_I2C); + else + ret = -EIO; + if (ret) mutex_unlock(&bus->mutex); return ret; --- a/drivers/gpu/drm/nouveau/nvkm/subdev/i2c/bus.h +++ b/drivers/gpu/drm/nouveau/nvkm/subdev/i2c/bus.h @@ -17,6 +17,7 @@ int nvkm_i2c_bus_new_(const struct nvkm_ int id, struct nvkm_i2c_bus **); void nvkm_i2c_bus_del(struct nvkm_i2c_bus **); void nvkm_i2c_bus_init(struct nvkm_i2c_bus *); +void nvkm_i2c_bus_fini(struct nvkm_i2c_bus *);
int nvkm_i2c_bit_xfer(struct nvkm_i2c_bus *, struct i2c_msg *, int);
From: Jorge Ramirez-Ortiz jorge.ramirez-ortiz@linaro.org
commit 61c0e37950b88bad590056286c1d766b1f167f4e upstream.
When the tty layer requests the uart to throttle, the current code executing in msm_serial will trigger "Bad mode in Error Handler" and generate an invalid stack frame in pstore before rebooting (that is if pstore is indeed configured: otherwise the user shall just notice a reboot with no further information dumped to the console).
This patch replaces the PIO byte accessor with the word accessor already used in PIO mode.
Fixes: 68252424a7c7 ("tty: serial: msm: Support big-endian CPUs") Cc: stable@vger.kernel.org Signed-off-by: Jorge Ramirez-Ortiz jorge.ramirez-ortiz@linaro.org Reviewed-by: Bjorn Andersson bjorn.andersson@linaro.org Reviewed-by: Stephen Boyd swboyd@chromium.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/tty/serial/msm_serial.c | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-)
--- a/drivers/tty/serial/msm_serial.c +++ b/drivers/tty/serial/msm_serial.c @@ -868,6 +868,7 @@ static void msm_handle_tx(struct uart_po struct circ_buf *xmit = &msm_port->uart.state->xmit; struct msm_dma *dma = &msm_port->tx_dma; unsigned int pio_count, dma_count, dma_min; + char buf[4] = { 0 }; void __iomem *tf; int err = 0;
@@ -877,10 +878,12 @@ static void msm_handle_tx(struct uart_po else tf = port->membase + UART_TF;
+ buf[0] = port->x_char; + if (msm_port->is_uartdm) msm_reset_dm_count(port, 1);
- iowrite8_rep(tf, &port->x_char, 1); + iowrite32_rep(tf, buf, 1); port->icount.tx++; port->x_char = 0; return;
From: Joe Burmeister joe.burmeister@devtank.co.uk
commit 5d24f455c182d5116dd5db8e1dc501115ecc9c2c upstream.
The datasheet states:
Bit 4: ClockEnSet the ClockEn bit high to enable an external clocking (crystal or clock generator at XIN). Set the ClockEn bit to 0 to disable clocking Bit 1: CrystalEnSet the CrystalEn bit high to enable the crystal oscillator. When using an external clock source at XIN, CrystalEn must be set low.
The bit 4, MAX310X_CLKSRC_EXTCLK_BIT, should be set and was not.
This was required to make the MAX3107 with an external crystal on our board able to send or receive data.
Signed-off-by: Joe Burmeister joe.burmeister@devtank.co.uk Cc: stable stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/tty/serial/max310x.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/tty/serial/max310x.c +++ b/drivers/tty/serial/max310x.c @@ -579,7 +579,7 @@ static int max310x_set_ref_clk(struct ma }
/* Configure clock source */ - clksrc = xtal ? MAX310X_CLKSRC_CRYST_BIT : MAX310X_CLKSRC_EXTCLK_BIT; + clksrc = MAX310X_CLKSRC_EXTCLK_BIT | (xtal ? MAX310X_CLKSRC_CRYST_BIT : 0);
/* Configure PLL */ if (pllcfg) {
From: Jiri Slaby jslaby@suse.cz
commit 3e8589963773a5c23e2f1fe4bcad0e9a90b7f471 upstream.
We have a single node system with node 0 disabled: Scanning NUMA topology in Northbridge 24 Number of physical nodes 2 Skipping disabled node 0 Node 1 MemBase 0000000000000000 Limit 00000000fbff0000 NODE_DATA(1) allocated [mem 0xfbfda000-0xfbfeffff]
This causes crashes in memcg when system boots: BUG: unable to handle kernel NULL pointer dereference at 0000000000000008 #PF error: [normal kernel read fault] ... RIP: 0010:list_lru_add+0x94/0x170 ... Call Trace: d_lru_add+0x44/0x50 dput.part.34+0xfc/0x110 __fput+0x108/0x230 task_work_run+0x9f/0xc0 exit_to_usermode_loop+0xf5/0x100
It is reproducible as far as 4.12. I did not try older kernels. You have to have a new enough systemd, e.g. 241 (the reason is unknown -- was not investigated). Cannot be reproduced with systemd 234.
The system crashes because the size of lru array is never updated in memcg_update_all_list_lrus and the reads are past the zero-sized array, causing dereferences of random memory.
The root cause are list_lru_memcg_aware checks in the list_lru code. The test in list_lru_memcg_aware is broken: it assumes node 0 is always present, but it is not true on some systems as can be seen above.
So fix this by avoiding checks on node 0. Remember the memcg-awareness by a bool flag in struct list_lru.
Link: http://lkml.kernel.org/r/20190522091940.3615-1-jslaby@suse.cz Fixes: 60d3fd32a7a9 ("list_lru: introduce per-memcg lists") Signed-off-by: Jiri Slaby jslaby@suse.cz Acked-by: Michal Hocko mhocko@suse.com Suggested-by: Vladimir Davydov vdavydov.dev@gmail.com Acked-by: Vladimir Davydov vdavydov.dev@gmail.com Reviewed-by: Shakeel Butt shakeelb@google.com Cc: Johannes Weiner hannes@cmpxchg.org Cc: Raghavendra K T raghavendra.kt@linux.vnet.ibm.com Cc: stable@vger.kernel.org Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- include/linux/list_lru.h | 1 + mm/list_lru.c | 8 +++----- 2 files changed, 4 insertions(+), 5 deletions(-)
--- a/include/linux/list_lru.h +++ b/include/linux/list_lru.h @@ -51,6 +51,7 @@ struct list_lru { struct list_lru_node *node; #if defined(CONFIG_MEMCG) && !defined(CONFIG_SLOB) struct list_head list; + bool memcg_aware; #endif };
--- a/mm/list_lru.c +++ b/mm/list_lru.c @@ -42,11 +42,7 @@ static void list_lru_unregister(struct l #if defined(CONFIG_MEMCG) && !defined(CONFIG_SLOB) static inline bool list_lru_memcg_aware(struct list_lru *lru) { - /* - * This needs node 0 to be always present, even - * in the systems supporting sparse numa ids. - */ - return !!lru->node[0].memcg_lrus; + return lru->memcg_aware; }
static inline struct list_lru_one * @@ -389,6 +385,8 @@ static int memcg_init_list_lru(struct li { int i;
+ lru->memcg_aware = memcg_aware; + if (!memcg_aware) return 0;
From: Zhenliang Wei weizhenliang@huawei.com
commit 98af37d624ed8c83f1953b1b6b2f6866011fc064 upstream.
In the fixes commit, removing SIGKILL from each thread signal mask and executing "goto fatal" directly will skip the call to "trace_signal_deliver". At this point, the delivery tracking of the SIGKILL signal will be inaccurate.
Therefore, we need to add trace_signal_deliver before "goto fatal" after executing sigdelset.
Note: SEND_SIG_NOINFO matches the fact that SIGKILL doesn't have any info.
Link: http://lkml.kernel.org/r/20190425025812.91424-1-weizhenliang@huawei.com Fixes: cf43a757fd4944 ("signal: Restore the stop PTRACE_EVENT_EXIT") Signed-off-by: Zhenliang Wei weizhenliang@huawei.com Reviewed-by: Christian Brauner christian@brauner.io Reviewed-by: Oleg Nesterov oleg@redhat.com Cc: Eric W. Biederman ebiederm@xmission.com Cc: Ivan Delalande colona@arista.com Cc: Arnd Bergmann arnd@arndb.de Cc: Thomas Gleixner tglx@linutronix.de Cc: Deepa Dinamani deepa.kernel@gmail.com Cc: Greg Kroah-Hartman gregkh@linuxfoundation.org Cc: stable@vger.kernel.org Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- kernel/signal.c | 2 ++ 1 file changed, 2 insertions(+)
--- a/kernel/signal.c +++ b/kernel/signal.c @@ -2244,6 +2244,8 @@ relock: if (signal_group_exit(signal)) { ksig->info.si_signo = signr = SIGKILL; sigdelset(¤t->pending.signal, SIGKILL); + trace_signal_deliver(SIGKILL, SEND_SIG_NOINFO, + &sighand->action[SIGKILL - 1]); recalc_sigpending(); goto fatal; }
From: Jonathan Corbet corbet@lwn.net
commit 3bc8088464712fdcb078eefb68837ccfcc413c88 upstream.
Our version check in Documentation/conf.py never envisioned a world where Sphinx moved beyond 1.x. Now that the unthinkable has happened, fix our version check to handle higher version numbers correctly.
Cc: stable@vger.kernel.org Signed-off-by: Jonathan Corbet corbet@lwn.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- Documentation/conf.py | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/Documentation/conf.py +++ b/Documentation/conf.py @@ -37,7 +37,7 @@ from load_config import loadConfig extensions = ['kernel-doc', 'rstFlatTable', 'kernel_include', 'cdomain']
# The name of the math extension changed on Sphinx 1.4 -if major == 1 and minor > 3: +if (major == 1 and minor > 3) or (major > 1): extensions.append("sphinx.ext.imgmath") else: extensions.append("sphinx.ext.pngmath")
From: Dan Carpenter dan.carpenter@oracle.com
commit ca641bae6da977d638458e78cd1487b6160a2718 upstream.
The create_pagelist() "count" parameter comes from the user in vchiq_ioctl() and it could overflow. If you look at how create_page() is called in vchiq_prepare_bulk_data(), then the "size" variable is an int so it doesn't make sense to allow negatives or larger than INT_MAX.
I don't know this code terribly well, but I believe that typical values of "count" are typically quite low and I don't think this check will affect normal valid uses at all.
The "pagelist_size" calculation can also overflow on 32 bit systems, but not on 64 bit systems. I have added an integer overflow check for that as well.
The Raspberry PI doesn't offer the same level of memory protection that x86 does so these sorts of bugs are probably not super critical to fix.
Fixes: 71bad7f08641 ("staging: add bcm2708 vchiq driver") Signed-off-by: Dan Carpenter dan.carpenter@oracle.com Cc: stable stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/staging/vc04_services/interface/vchiq_arm/vchiq_2835_arm.c | 9 +++++++++ 1 file changed, 9 insertions(+)
--- a/drivers/staging/vc04_services/interface/vchiq_arm/vchiq_2835_arm.c +++ b/drivers/staging/vc04_services/interface/vchiq_arm/vchiq_2835_arm.c @@ -381,9 +381,18 @@ create_pagelist(char __user *buf, size_t int run, addridx, actual_pages; unsigned long *need_release;
+ if (count >= INT_MAX - PAGE_SIZE) + return NULL; + offset = (unsigned int)buf & (PAGE_SIZE - 1); num_pages = (count + offset + PAGE_SIZE - 1) / PAGE_SIZE;
+ if (num_pages > (SIZE_MAX - sizeof(PAGELIST_T) - + sizeof(struct vchiq_pagelist_info)) / + (sizeof(u32) + sizeof(pages[0]) + + sizeof(struct scatterlist))) + return NULL; + *ppagelist = NULL;
/* Allocate enough storage to hold the page pointers and the page
Hi.
On 6/9/19 6:42 PM, Greg Kroah-Hartman wrote:
From: Dan Carpenter dan.carpenter@oracle.com
commit ca641bae6da977d638458e78cd1487b6160a2718 upstream.
This commit breaks the kernel build because the vchiq_pagelist_info struct is not defined in v4.9.182.
It was only added in v4.10, in commit 4807f2c0e684e907c501cb96049809d7a957dbc2.
Best regards,
Martin Weinelt
In file included from ./include/uapi/linux/posix_types.h:4:0, from ./include/uapi/linux/types.h:13, from ./include/linux/compiler.h:224, from ./include/linux/linkage.h:4, from ./include/linux/kernel.h:6, from drivers/staging/vc04_services/interface/vchiq_arm/vchiq_2835_arm.c:34: drivers/staging/vc04_services/interface/vchiq_arm/vchiq_2835_arm.c: In function 'create_pagelist': ./include/linux/stddef.h:7:14: warning: return makes integer from pointer without a cast [-Wint-conversion] #define NULL ((void *)0) ^ drivers/staging/vc04_services/interface/vchiq_arm/vchiq_2835_arm.c:385:10: note: in expansion of macro 'NULL' return NULL; ^~~~ drivers/staging/vc04_services/interface/vchiq_arm/vchiq_2835_arm.c:391:12: error: invalid application of 'sizeof' to incomplete type 'struct vchiq_pagelist_info' sizeof(struct vchiq_pagelist_info)) / ^~~~~~ In file included from ./include/uapi/linux/posix_types.h:4:0, from ./include/uapi/linux/types.h:13, from ./include/linux/compiler.h:224, from ./include/linux/linkage.h:4, from ./include/linux/kernel.h:6, from drivers/staging/vc04_services/interface/vchiq_arm/vchiq_2835_arm.c:34: ./include/linux/stddef.h:7:14: warning: return makes integer from pointer without a cast [-Wint-conversion] #define NULL ((void *)0) ^ drivers/staging/vc04_services/interface/vchiq_arm/vchiq_2835_arm.c:394:10: note: in expansion of macro 'NULL' return NULL; ^~~~
The create_pagelist() "count" parameter comes from the user in vchiq_ioctl() and it could overflow. If you look at how create_page() is called in vchiq_prepare_bulk_data(), then the "size" variable is an int so it doesn't make sense to allow negatives or larger than INT_MAX.
I don't know this code terribly well, but I believe that typical values of "count" are typically quite low and I don't think this check will affect normal valid uses at all.
The "pagelist_size" calculation can also overflow on 32 bit systems, but not on 64 bit systems. I have added an integer overflow check for that as well.
The Raspberry PI doesn't offer the same level of memory protection that x86 does so these sorts of bugs are probably not super critical to fix.
Fixes: 71bad7f08641 ("staging: add bcm2708 vchiq driver") Signed-off-by: Dan Carpenter dan.carpenter@oracle.com Cc: stable stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
drivers/staging/vc04_services/interface/vchiq_arm/vchiq_2835_arm.c | 9 +++++++++ 1 file changed, 9 insertions(+)
--- a/drivers/staging/vc04_services/interface/vchiq_arm/vchiq_2835_arm.c +++ b/drivers/staging/vc04_services/interface/vchiq_arm/vchiq_2835_arm.c @@ -381,9 +381,18 @@ create_pagelist(char __user *buf, size_t int run, addridx, actual_pages; unsigned long *need_release;
- if (count >= INT_MAX - PAGE_SIZE)
return NULL;
- offset = (unsigned int)buf & (PAGE_SIZE - 1); num_pages = (count + offset + PAGE_SIZE - 1) / PAGE_SIZE;
- if (num_pages > (SIZE_MAX - sizeof(PAGELIST_T) -
sizeof(struct vchiq_pagelist_info)) /
(sizeof(u32) + sizeof(pages[0]) +
sizeof(struct scatterlist)))
return NULL;
- *ppagelist = NULL;
/* Allocate enough storage to hold the page pointers and the page
On Wed, Jun 19, 2019 at 06:02:07PM +0200, Martin Weinelt wrote:
Hi.
On 6/9/19 6:42 PM, Greg Kroah-Hartman wrote:
From: Dan Carpenter dan.carpenter@oracle.com
commit ca641bae6da977d638458e78cd1487b6160a2718 upstream.
This commit breaks the kernel build because the vchiq_pagelist_info struct is not defined in v4.9.182.
It was only added in v4.10, in commit 4807f2c0e684e907c501cb96049809d7a957dbc2.
Best regards,
Martin Weinelt
In file included from ./include/uapi/linux/posix_types.h:4:0, from ./include/uapi/linux/types.h:13, from ./include/linux/compiler.h:224, from ./include/linux/linkage.h:4, from ./include/linux/kernel.h:6, from drivers/staging/vc04_services/interface/vchiq_arm/vchiq_2835_arm.c:34: drivers/staging/vc04_services/interface/vchiq_arm/vchiq_2835_arm.c: In function 'create_pagelist': ./include/linux/stddef.h:7:14: warning: return makes integer from pointer without a cast [-Wint-conversion] #define NULL ((void *)0) ^ drivers/staging/vc04_services/interface/vchiq_arm/vchiq_2835_arm.c:385:10: note: in expansion of macro 'NULL' return NULL; ^~~~ drivers/staging/vc04_services/interface/vchiq_arm/vchiq_2835_arm.c:391:12: error: invalid application of 'sizeof' to incomplete type 'struct vchiq_pagelist_info' sizeof(struct vchiq_pagelist_info)) / ^~~~~~ In file included from ./include/uapi/linux/posix_types.h:4:0, from ./include/uapi/linux/types.h:13, from ./include/linux/compiler.h:224, from ./include/linux/linkage.h:4, from ./include/linux/kernel.h:6, from drivers/staging/vc04_services/interface/vchiq_arm/vchiq_2835_arm.c:34: ./include/linux/stddef.h:7:14: warning: return makes integer from pointer without a cast [-Wint-conversion] #define NULL ((void *)0) ^ drivers/staging/vc04_services/interface/vchiq_arm/vchiq_2835_arm.c:394:10: note: in expansion of macro 'NULL' return NULL; ^~~~
Really? How come all of the built tests still succeed?
Ah, arm systems :(
Odd that we didn't catch this already, sorry about that. And that was my fault in the backport, which the build tests did catch. Odd that it didn't catch the failure after that...
Anyway, thanks, I'll go revert this.
greg k-h
From: Roberto Bergantinos Corpas rbergant@redhat.com
commit 31fad7d41e73731f05b8053d17078638cf850fa6 upstream.
In cifs_read_allocate_pages, in case of ENOMEM, we go through whole rdata->pages array but we have failed the allocation before nr_pages, therefore we may end up calling put_page with NULL pointer, causing oops
Signed-off-by: Roberto Bergantinos Corpas rbergant@redhat.com Acked-by: Pavel Shilovsky pshilov@microsoft.com Signed-off-by: Steve French stfrench@microsoft.com CC: Stable stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- fs/cifs/file.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-)
--- a/fs/cifs/file.c +++ b/fs/cifs/file.c @@ -2892,7 +2892,9 @@ cifs_read_allocate_pages(struct cifs_rea }
if (rc) { - for (i = 0; i < nr_pages; i++) { + unsigned int nr_page_failed = i; + + for (i = 0; i < nr_page_failed; i++) { put_page(rdata->pages[i]); rdata->pages[i] = NULL; }
From: Kees Cook keescook@chromium.org
commit 7210e060155b9cf557fb13128353c3e494fa5ed3 upstream.
The gcc-common.h file did not take into account certain macros that might have already been defined in the build environment. This updates the header to avoid redefining the macros, as seen on a Darwin host using gcc 4.9.2:
HOSTCXX -fPIC scripts/gcc-plugins/arm_ssp_per_task_plugin.o - due to: scripts/gcc-plugins/gcc-common.h In file included from scripts/gcc-plugins/arm_ssp_per_task_plugin.c:3:0: scripts/gcc-plugins/gcc-common.h:153:0: warning: "__unused" redefined ^ In file included from /usr/include/stdio.h:64:0, from /Users/hns/Documents/Projects/QuantumSTEP/System/Library/Frameworks/System.framework/Versions-jessie/x86_64-apple-darwin15.0.0/gcc/arm-linux-gnueabi/bin/../lib/gcc/arm-linux-gnueabi/4.9.2/plugin/include/system.h:40, from /Users/hns/Documents/Projects/QuantumSTEP/System/Library/Frameworks/System.framework/Versions-jessie/x86_64-apple-darwin15.0.0/gcc/arm-linux-gnueabi/bin/../lib/gcc/arm-linux-gnueabi/4.9.2/plugin/include/gcc-plugin.h:28, from /Users/hns/Documents/Projects/QuantumSTEP/System/Library/Frameworks/System.framework/Versions-jessie/x86_64-apple-darwin15.0.0/gcc/arm-linux-gnueabi/bin/../lib/gcc/arm-linux-gnueabi/4.9.2/plugin/include/plugin.h:23, from scripts/gcc-plugins/gcc-common.h:9, from scripts/gcc-plugins/arm_ssp_per_task_plugin.c:3: /usr/include/sys/cdefs.h:161:0: note: this is the location of the previous definition ^
Reported-and-tested-by: "H. Nikolaus Schaller" hns@goldelico.com Fixes: 189af4657186 ("ARM: smp: add support for per-task stack canaries") Cc: stable@vger.kernel.org Signed-off-by: Kees Cook keescook@chromium.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- scripts/gcc-plugins/gcc-common.h | 4 ++++ 1 file changed, 4 insertions(+)
--- a/scripts/gcc-plugins/gcc-common.h +++ b/scripts/gcc-plugins/gcc-common.h @@ -135,8 +135,12 @@ extern void print_gimple_expr(FILE *, gi extern void dump_gimple_stmt(pretty_printer *, gimple, int, int); #endif
+#ifndef __unused #define __unused __attribute__((__unused__)) +#endif +#ifndef __visible #define __visible __attribute__((visibility("default"))) +#endif
#define DECL_NAME_POINTER(node) IDENTIFIER_POINTER(DECL_NAME(node)) #define DECL_NAME_LENGTH(node) IDENTIFIER_LENGTH(DECL_NAME(node))
From: Thomas Hellstrom thellstrom@vmware.com
commit 63cb44441826e842b7285575b96db631cc9f2505 upstream.
This may confuse user-space clients like plymouth that opens a drm file descriptor as a result of a hotplug event and then generates a new event...
Cc: stable@vger.kernel.org Fixes: 5ea1734827bb ("drm/vmwgfx: Send a hotplug event at master_set") Signed-off-by: Thomas Hellstrom thellstrom@vmware.com Reviewed-by: Deepak Rawat drawat@vmware.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/gpu/drm/vmwgfx/vmwgfx_drv.c | 8 +++++++- 1 file changed, 7 insertions(+), 1 deletion(-)
--- a/drivers/gpu/drm/vmwgfx/vmwgfx_drv.c +++ b/drivers/gpu/drm/vmwgfx/vmwgfx_drv.c @@ -1245,7 +1245,13 @@ static int vmw_master_set(struct drm_dev }
dev_priv->active_master = vmaster; - drm_sysfs_hotplug_event(dev); + + /* + * Inform a new master that the layout may have changed while + * it was gone. + */ + if (!from_open) + drm_sysfs_hotplug_event(dev);
return 0; }
From: Arend Van Spriel arend.vanspriel@broadcom.com
commit 4835f37e3bafc138f8bfa3cbed2920dd56fed283 upstream.
Assure the event data buffer is long enough to hold the array of netinfo items and that SSID length does not exceed the maximum of 32 characters as per 802.11 spec.
Reviewed-by: Hante Meuleman hante.meuleman@broadcom.com Reviewed-by: Pieter-Paul Giesberts pieter-paul.giesberts@broadcom.com Reviewed-by: Franky Lin franky.lin@broadcom.com Signed-off-by: Arend van Spriel arend.vanspriel@broadcom.com Signed-off-by: Kalle Valo kvalo@codeaurora.org [bwh: Backported to 4.9: - Move the assignment to "data" along with the assignment to "netinfo_start" that depends on it - Adjust context, indentation] Signed-off-by: Ben Hutchings ben@decadent.org.uk Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/wireless/broadcom/brcm80211/brcmfmac/cfg80211.c | 14 +++++++++--- 1 file changed, 11 insertions(+), 3 deletions(-)
--- a/drivers/net/wireless/broadcom/brcm80211/brcmfmac/cfg80211.c +++ b/drivers/net/wireless/broadcom/brcm80211/brcmfmac/cfg80211.c @@ -3220,6 +3220,7 @@ brcmf_notify_sched_scan_results(struct b struct brcmf_pno_scanresults_le *pfn_result; u32 result_count; u32 status; + u32 datalen;
brcmf_dbg(SCAN, "Enter\n");
@@ -3245,6 +3246,14 @@ brcmf_notify_sched_scan_results(struct b if (result_count > 0) { int i;
+ data += sizeof(struct brcmf_pno_scanresults_le); + netinfo_start = (struct brcmf_pno_net_info_le *)data; + datalen = e->datalen - ((void *)netinfo_start - (void *)pfn_result); + if (datalen < result_count * sizeof(*netinfo)) { + brcmf_err("insufficient event data\n"); + goto out_err; + } + request = kzalloc(sizeof(*request), GFP_KERNEL); ssid = kcalloc(result_count, sizeof(*ssid), GFP_KERNEL); channel = kcalloc(result_count, sizeof(*channel), GFP_KERNEL); @@ -3254,9 +3263,6 @@ brcmf_notify_sched_scan_results(struct b }
request->wiphy = wiphy; - data += sizeof(struct brcmf_pno_scanresults_le); - netinfo_start = (struct brcmf_pno_net_info_le *)data; - for (i = 0; i < result_count; i++) { netinfo = &netinfo_start[i]; if (!netinfo) { @@ -3266,6 +3272,8 @@ brcmf_notify_sched_scan_results(struct b goto out_err; }
+ if (netinfo->SSID_len > IEEE80211_MAX_SSID_LEN) + netinfo->SSID_len = IEEE80211_MAX_SSID_LEN; brcmf_dbg(SCAN, "SSID:%s Channel:%d\n", netinfo->SSID, netinfo->channel); memcpy(ssid[i].ssid, netinfo->SSID, netinfo->SSID_len);
From: Arend van Spriel arend.vanspriel@broadcom.com
commit 1b5e2423164b3670e8bc9174e4762d297990deff upstream.
The SSID length as received from firmware should not exceed IEEE80211_MAX_SSID_LEN as that would result in heap overflow.
Reviewed-by: Hante Meuleman hante.meuleman@broadcom.com Reviewed-by: Pieter-Paul Giesberts pieter-paul.giesberts@broadcom.com Reviewed-by: Franky Lin franky.lin@broadcom.com Signed-off-by: Arend van Spriel arend.vanspriel@broadcom.com Signed-off-by: Kalle Valo kvalo@codeaurora.org [bwh: Backported to 4.9: adjust context] Signed-off-by: Ben Hutchings ben.hutchings@codethink.co.uk Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/wireless/broadcom/brcm80211/brcmfmac/cfg80211.c | 2 ++ 1 file changed, 2 insertions(+)
--- a/drivers/net/wireless/broadcom/brcm80211/brcmfmac/cfg80211.c +++ b/drivers/net/wireless/broadcom/brcm80211/brcmfmac/cfg80211.c @@ -3579,6 +3579,8 @@ brcmf_wowl_nd_results(struct brcmf_if *i
data += sizeof(struct brcmf_pno_scanresults_le); netinfo = (struct brcmf_pno_net_info_le *)data; + if (netinfo->SSID_len > IEEE80211_MAX_SSID_LEN) + netinfo->SSID_len = IEEE80211_MAX_SSID_LEN; memcpy(cfg->wowl.nd->ssid.ssid, netinfo->SSID, netinfo->SSID_len); cfg->wowl.nd->ssid.ssid_len = netinfo->SSID_len; cfg->wowl.nd->n_channels = 1;
From: Arend van Spriel arend.vanspriel@broadcom.com
commit a4176ec356c73a46c07c181c6d04039fafa34a9f upstream.
For USB there is no separate channel being used to pass events from firmware to the host driver and as such are passed over the data path. In order to detect mock event messages an additional check is needed on event subtype. This check is added conditionally using unlikely() keyword.
Reviewed-by: Hante Meuleman hante.meuleman@broadcom.com Reviewed-by: Pieter-Paul Giesberts pieter-paul.giesberts@broadcom.com Reviewed-by: Franky Lin franky.lin@broadcom.com Signed-off-by: Arend van Spriel arend.vanspriel@broadcom.com Signed-off-by: Kalle Valo kvalo@codeaurora.org Signed-off-by: Ben Hutchings ben.hutchings@codethink.co.uk Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/wireless/broadcom/brcm80211/brcmfmac/core.c | 5 ++-- drivers/net/wireless/broadcom/brcm80211/brcmfmac/fweh.h | 16 ++++++++++---- drivers/net/wireless/broadcom/brcm80211/brcmfmac/msgbuf.c | 2 - 3 files changed, 16 insertions(+), 7 deletions(-)
--- a/drivers/net/wireless/broadcom/brcm80211/brcmfmac/core.c +++ b/drivers/net/wireless/broadcom/brcm80211/brcmfmac/core.c @@ -339,7 +339,8 @@ void brcmf_rx_frame(struct device *dev, } else { /* Process special event packets */ if (handle_event) - brcmf_fweh_process_skb(ifp->drvr, skb); + brcmf_fweh_process_skb(ifp->drvr, skb, + BCMILCP_SUBTYPE_VENDOR_LONG);
brcmf_netif_rx(ifp, skb); } @@ -356,7 +357,7 @@ void brcmf_rx_event(struct device *dev, if (brcmf_rx_hdrpull(drvr, skb, &ifp)) return;
- brcmf_fweh_process_skb(ifp->drvr, skb); + brcmf_fweh_process_skb(ifp->drvr, skb, 0); brcmu_pkt_buf_free_skb(skb); }
--- a/drivers/net/wireless/broadcom/brcm80211/brcmfmac/fweh.h +++ b/drivers/net/wireless/broadcom/brcm80211/brcmfmac/fweh.h @@ -181,7 +181,7 @@ enum brcmf_fweh_event_code { */ #define BRCM_OUI "\x00\x10\x18" #define BCMILCP_BCM_SUBTYPE_EVENT 1 - +#define BCMILCP_SUBTYPE_VENDOR_LONG 32769
/** * struct brcm_ethhdr - broadcom specific ether header. @@ -302,10 +302,10 @@ void brcmf_fweh_process_event(struct brc void brcmf_fweh_p2pdev_setup(struct brcmf_if *ifp, bool ongoing);
static inline void brcmf_fweh_process_skb(struct brcmf_pub *drvr, - struct sk_buff *skb) + struct sk_buff *skb, u16 stype) { struct brcmf_event *event_packet; - u16 usr_stype; + u16 subtype, usr_stype;
/* only process events when protocol matches */ if (skb->protocol != cpu_to_be16(ETH_P_LINK_CTL)) @@ -314,8 +314,16 @@ static inline void brcmf_fweh_process_sk if ((skb->len + ETH_HLEN) < sizeof(*event_packet)) return;
- /* check for BRCM oui match */ event_packet = (struct brcmf_event *)skb_mac_header(skb); + + /* check subtype if needed */ + if (unlikely(stype)) { + subtype = get_unaligned_be16(&event_packet->hdr.subtype); + if (subtype != stype) + return; + } + + /* check for BRCM oui match */ if (memcmp(BRCM_OUI, &event_packet->hdr.oui[0], sizeof(event_packet->hdr.oui))) return; --- a/drivers/net/wireless/broadcom/brcm80211/brcmfmac/msgbuf.c +++ b/drivers/net/wireless/broadcom/brcm80211/brcmfmac/msgbuf.c @@ -1114,7 +1114,7 @@ static void brcmf_msgbuf_process_event(s
skb->protocol = eth_type_trans(skb, ifp->ndev);
- brcmf_fweh_process_skb(ifp->drvr, skb); + brcmf_fweh_process_skb(ifp->drvr, skb, 0);
exit: brcmu_pkt_buf_free_skb(skb);
From: Ben Hutchings ben.hutchings@codethink.co.uk
This was done as part of upstream commits fdfb4a99b6ab "8inder: separate binder allocator structure from binder proc", 19c987241ca1 "binder: separate out binder_alloc functions", and 7a4408c6bd3e "binder: make sure accesses to proc/thread are safe". However, those commits made lots of other changes that are not suitable for stable.
Signed-off-by: Ben Hutchings ben.hutchings@codethink.co.uk Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/android/binder.c | 28 ++++++++++++++-------------- 1 file changed, 14 insertions(+), 14 deletions(-)
--- a/drivers/android/binder.c +++ b/drivers/android/binder.c @@ -488,7 +488,7 @@ static void binder_insert_free_buffer(st new_buffer_size = binder_buffer_size(proc, new_buffer);
binder_debug(BINDER_DEBUG_BUFFER_ALLOC, - "%d: add free buffer, size %zd, at %p\n", + "%d: add free buffer, size %zd, at %pK\n", proc->pid, new_buffer_size, new_buffer);
while (*p) { @@ -566,7 +566,7 @@ static int binder_update_page_range(stru struct mm_struct *mm;
binder_debug(BINDER_DEBUG_BUFFER_ALLOC, - "%d: %s pages %p-%p\n", proc->pid, + "%d: %s pages %pK-%pK\n", proc->pid, allocate ? "allocate" : "free", start, end);
if (end <= start) @@ -606,7 +606,7 @@ static int binder_update_page_range(stru BUG_ON(*page); *page = alloc_page(GFP_KERNEL | __GFP_HIGHMEM | __GFP_ZERO); if (*page == NULL) { - pr_err("%d: binder_alloc_buf failed for page at %p\n", + pr_err("%d: binder_alloc_buf failed for page at %pK\n", proc->pid, page_addr); goto err_alloc_page_failed; } @@ -615,7 +615,7 @@ static int binder_update_page_range(stru flush_cache_vmap((unsigned long)page_addr, (unsigned long)page_addr + PAGE_SIZE); if (ret != 1) { - pr_err("%d: binder_alloc_buf failed to map page at %p in kernel\n", + pr_err("%d: binder_alloc_buf failed to map page at %pK in kernel\n", proc->pid, page_addr); goto err_map_kernel_failed; } @@ -719,7 +719,7 @@ static struct binder_buffer *binder_allo }
binder_debug(BINDER_DEBUG_BUFFER_ALLOC, - "%d: binder_alloc_buf size %zd got buffer %p size %zd\n", + "%d: binder_alloc_buf size %zd got buffer %pK size %zd\n", proc->pid, size, buffer, buffer_size);
has_page_addr = @@ -749,7 +749,7 @@ static struct binder_buffer *binder_allo binder_insert_free_buffer(proc, new_buffer); } binder_debug(BINDER_DEBUG_BUFFER_ALLOC, - "%d: binder_alloc_buf size %zd got %p\n", + "%d: binder_alloc_buf size %zd got %pK\n", proc->pid, size, buffer); buffer->data_size = data_size; buffer->offsets_size = offsets_size; @@ -789,7 +789,7 @@ static void binder_delete_free_buffer(st if (buffer_end_page(prev) == buffer_end_page(buffer)) free_page_end = 0; binder_debug(BINDER_DEBUG_BUFFER_ALLOC, - "%d: merge free, buffer %p share page with %p\n", + "%d: merge free, buffer %pK share page with %pK\n", proc->pid, buffer, prev); }
@@ -802,14 +802,14 @@ static void binder_delete_free_buffer(st buffer_start_page(buffer)) free_page_start = 0; binder_debug(BINDER_DEBUG_BUFFER_ALLOC, - "%d: merge free, buffer %p share page with %p\n", + "%d: merge free, buffer %pK share page with %pK\n", proc->pid, buffer, prev); } } list_del(&buffer->entry); if (free_page_start || free_page_end) { binder_debug(BINDER_DEBUG_BUFFER_ALLOC, - "%d: merge free, buffer %p do not share page%s%s with %p or %p\n", + "%d: merge free, buffer %pK do not share page%s%s with %pK or %pK\n", proc->pid, buffer, free_page_start ? "" : " end", free_page_end ? "" : " start", prev, next); binder_update_page_range(proc, 0, free_page_start ? @@ -830,7 +830,7 @@ static void binder_free_buf(struct binde ALIGN(buffer->offsets_size, sizeof(void *));
binder_debug(BINDER_DEBUG_BUFFER_ALLOC, - "%d: binder_free_buf %p size %zd buffer_size %zd\n", + "%d: binder_free_buf %pK size %zd buffer_size %zd\n", proc->pid, buffer, size, buffer_size);
BUG_ON(buffer->free); @@ -2930,7 +2930,7 @@ static int binder_mmap(struct file *filp #ifdef CONFIG_CPU_CACHE_VIPT if (cache_is_vipt_aliasing()) { while (CACHE_COLOUR((vma->vm_start ^ (uint32_t)proc->buffer))) { - pr_info("binder_mmap: %d %lx-%lx maps %p bad alignment\n", proc->pid, vma->vm_start, vma->vm_end, proc->buffer); + pr_info("binder_mmap: %d %lx-%lx maps %pK bad alignment\n", proc->pid, vma->vm_start, vma->vm_end, proc->buffer); vma->vm_start += PAGE_SIZE; } } @@ -3191,7 +3191,7 @@ static void binder_deferred_release(stru
page_addr = proc->buffer + i * PAGE_SIZE; binder_debug(BINDER_DEBUG_BUFFER_ALLOC, - "%s: %d: page %d at %p not freed\n", + "%s: %d: page %d at %pK not freed\n", __func__, proc->pid, i, page_addr); unmap_kernel_range((unsigned long)page_addr, PAGE_SIZE); __free_page(proc->pages[i]); @@ -3294,7 +3294,7 @@ static void print_binder_transaction(str static void print_binder_buffer(struct seq_file *m, const char *prefix, struct binder_buffer *buffer) { - seq_printf(m, "%s %d: %p size %zd:%zd %s\n", + seq_printf(m, "%s %d: %pK size %zd:%zd %s\n", prefix, buffer->debug_id, buffer->data, buffer->data_size, buffer->offsets_size, buffer->transaction ? "active" : "delivered"); @@ -3397,7 +3397,7 @@ static void print_binder_node(struct seq
static void print_binder_ref(struct seq_file *m, struct binder_ref *ref) { - seq_printf(m, " ref %d: desc %d %snode %d s %d w %d d %p\n", + seq_printf(m, " ref %d: desc %d %snode %d s %d w %d d %pK\n", ref->debug_id, ref->desc, ref->node->proc ? "" : "dead ", ref->node->debug_id, ref->strong, ref->weak, ref->death); }
From: Todd Kjos tkjos@android.com
commit 8ca86f1639ec5890d400fff9211aca22d0a392eb upstream.
The format specifier "%p" can leak kernel addresses. Use "%pK" instead. There were 4 remaining cases in binder.c.
Signed-off-by: Todd Kjos tkjos@google.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org [bwh: Backported to 4.9: adjust context] Signed-off-by: Ben Hutchings ben.hutchings@codethink.co.uk Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/android/binder.c | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-)
--- a/drivers/android/binder.c +++ b/drivers/android/binder.c @@ -1260,7 +1260,7 @@ static void binder_transaction_buffer_re int debug_id = buffer->debug_id;
binder_debug(BINDER_DEBUG_TRANSACTION, - "%d buffer release %d, size %zd-%zd, failed at %p\n", + "%d buffer release %d, size %zd-%zd, failed at %pK\n", proc->pid, buffer->debug_id, buffer->data_size, buffer->offsets_size, failed_at);
@@ -2123,7 +2123,7 @@ static int binder_thread_write(struct bi } } binder_debug(BINDER_DEBUG_DEAD_BINDER, - "%d:%d BC_DEAD_BINDER_DONE %016llx found %p\n", + "%d:%d BC_DEAD_BINDER_DONE %016llx found %pK\n", proc->pid, thread->pid, (u64)cookie, death); if (death == NULL) { @@ -3272,7 +3272,7 @@ static void print_binder_transaction(str struct binder_transaction *t) { seq_printf(m, - "%s %d: %p from %d:%d to %d:%d code %x flags %x pri %ld r%d", + "%s %d: %pK from %d:%d to %d:%d code %x flags %x pri %ld r%d", prefix, t->debug_id, t, t->from ? t->from->proc->pid : 0, t->from ? t->from->pid : 0, @@ -3286,7 +3286,7 @@ static void print_binder_transaction(str if (t->buffer->target_node) seq_printf(m, " node %d", t->buffer->target_node->debug_id); - seq_printf(m, " size %zd:%zd data %p\n", + seq_printf(m, " size %zd:%zd data %pK\n", t->buffer->data_size, t->buffer->offsets_size, t->buffer->data); }
From: Matthew Wilcox willy@infradead.org
commit 15fab63e1e57be9fdb5eec1bbc5916e9825e9acb upstream.
Change pipe_buf_get() to return a bool indicating whether it succeeded in raising the refcount of the page (if the thing in the pipe is a page). This removes another mechanism for overflowing the page refcount. All callers converted to handle a failure.
Reported-by: Jann Horn jannh@google.com Signed-off-by: Matthew Wilcox willy@infradead.org Signed-off-by: Linus Torvalds torvalds@linux-foundation.org [bwh: Backported to 4.9: adjust context] Signed-off-by: Ben Hutchings ben.hutchings@codethink.co.uk Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/fuse/dev.c | 12 ++++++------ fs/pipe.c | 4 ++-- fs/splice.c | 12 ++++++++++-- include/linux/pipe_fs_i.h | 10 ++++++---- kernel/trace/trace.c | 6 +++++- 5 files changed, 29 insertions(+), 15 deletions(-)
--- a/fs/fuse/dev.c +++ b/fs/fuse/dev.c @@ -1975,10 +1975,8 @@ static ssize_t fuse_dev_splice_write(str rem += pipe->bufs[(pipe->curbuf + idx) & (pipe->buffers - 1)].len;
ret = -EINVAL; - if (rem < len) { - pipe_unlock(pipe); - goto out; - } + if (rem < len) + goto out_free;
rem = len; while (rem) { @@ -1996,7 +1994,9 @@ static ssize_t fuse_dev_splice_write(str pipe->curbuf = (pipe->curbuf + 1) & (pipe->buffers - 1); pipe->nrbufs--; } else { - pipe_buf_get(pipe, ibuf); + if (!pipe_buf_get(pipe, ibuf)) + goto out_free; + *obuf = *ibuf; obuf->flags &= ~PIPE_BUF_FLAG_GIFT; obuf->len = rem; @@ -2019,11 +2019,11 @@ static ssize_t fuse_dev_splice_write(str ret = fuse_dev_do_write(fud, &cs, len);
pipe_lock(pipe); +out_free: for (idx = 0; idx < nbuf; idx++) pipe_buf_release(pipe, &bufs[idx]); pipe_unlock(pipe);
-out: kfree(bufs); return ret; } --- a/fs/pipe.c +++ b/fs/pipe.c @@ -193,9 +193,9 @@ EXPORT_SYMBOL(generic_pipe_buf_steal); * in the tee() system call, when we duplicate the buffers in one * pipe into another. */ -void generic_pipe_buf_get(struct pipe_inode_info *pipe, struct pipe_buffer *buf) +bool generic_pipe_buf_get(struct pipe_inode_info *pipe, struct pipe_buffer *buf) { - get_page(buf->page); + return try_get_page(buf->page); } EXPORT_SYMBOL(generic_pipe_buf_get);
--- a/fs/splice.c +++ b/fs/splice.c @@ -1585,7 +1585,11 @@ retry: * Get a reference to this pipe buffer, * so we can copy the contents over. */ - pipe_buf_get(ipipe, ibuf); + if (!pipe_buf_get(ipipe, ibuf)) { + if (ret == 0) + ret = -EFAULT; + break; + } *obuf = *ibuf;
/* @@ -1659,7 +1663,11 @@ static int link_pipe(struct pipe_inode_i * Get a reference to this pipe buffer, * so we can copy the contents over. */ - pipe_buf_get(ipipe, ibuf); + if (!pipe_buf_get(ipipe, ibuf)) { + if (ret == 0) + ret = -EFAULT; + break; + }
obuf = opipe->bufs + nbuf; *obuf = *ibuf; --- a/include/linux/pipe_fs_i.h +++ b/include/linux/pipe_fs_i.h @@ -107,18 +107,20 @@ struct pipe_buf_operations { /* * Get a reference to the pipe buffer. */ - void (*get)(struct pipe_inode_info *, struct pipe_buffer *); + bool (*get)(struct pipe_inode_info *, struct pipe_buffer *); };
/** * pipe_buf_get - get a reference to a pipe_buffer * @pipe: the pipe that the buffer belongs to * @buf: the buffer to get a reference to + * + * Return: %true if the reference was successfully obtained. */ -static inline void pipe_buf_get(struct pipe_inode_info *pipe, +static inline __must_check bool pipe_buf_get(struct pipe_inode_info *pipe, struct pipe_buffer *buf) { - buf->ops->get(pipe, buf); + return buf->ops->get(pipe, buf); }
/** @@ -178,7 +180,7 @@ struct pipe_inode_info *alloc_pipe_info( void free_pipe_info(struct pipe_inode_info *);
/* Generic pipe buffer ops functions */ -void generic_pipe_buf_get(struct pipe_inode_info *, struct pipe_buffer *); +bool generic_pipe_buf_get(struct pipe_inode_info *, struct pipe_buffer *); int generic_pipe_buf_confirm(struct pipe_inode_info *, struct pipe_buffer *); int generic_pipe_buf_steal(struct pipe_inode_info *, struct pipe_buffer *); void generic_pipe_buf_release(struct pipe_inode_info *, struct pipe_buffer *); --- a/kernel/trace/trace.c +++ b/kernel/trace/trace.c @@ -6145,12 +6145,16 @@ static void buffer_pipe_buf_release(stru buf->private = 0; }
-static void buffer_pipe_buf_get(struct pipe_inode_info *pipe, +static bool buffer_pipe_buf_get(struct pipe_inode_info *pipe, struct pipe_buffer *buf) { struct buffer_ref *ref = (struct buffer_ref *)buf->private;
+ if (ref->ref > INT_MAX/2) + return false; + ref->ref++; + return true; }
/* Pipe buffer operations for a buffer. */
From: Will Deacon will.deacon@arm.com
commit a3e328556d41bb61c55f9dfcc62d6a826ea97b85 upstream.
When operating on hugepages with DEBUG_VM enabled, the GUP code checks the compound head for each tail page prior to calling page_cache_add_speculative. This is broken, because on the fast-GUP path (where we don't hold any page table locks) we can be racing with a concurrent invocation of split_huge_page_to_list.
split_huge_page_to_list deals with this race by using page_ref_freeze to freeze the page and force concurrent GUPs to fail whilst the component pages are modified. This modification includes clearing the compound_head field for the tail pages, so checking this prior to a successful call to page_cache_add_speculative can lead to false positives: In fact, page_cache_add_speculative *already* has this check once the page refcount has been successfully updated, so we can simply remove the broken calls to VM_BUG_ON_PAGE.
Link: http://lkml.kernel.org/r/20170522133604.11392-2-punit.agrawal@arm.com Signed-off-by: Will Deacon will.deacon@arm.com Signed-off-by: Punit Agrawal punit.agrawal@arm.com Acked-by: Steve Capper steve.capper@arm.com Acked-by: Kirill A. Shutemov kirill.shutemov@linux.intel.com Cc: Aneesh Kumar K.V aneesh.kumar@linux.vnet.ibm.com Cc: Catalin Marinas catalin.marinas@arm.com Cc: Naoya Horiguchi n-horiguchi@ah.jp.nec.com Cc: Mark Rutland mark.rutland@arm.com Cc: Hillf Danton hillf.zj@alibaba-inc.com Cc: Michal Hocko mhocko@suse.com Cc: Mike Kravetz mike.kravetz@oracle.com Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Ben Hutchings ben.hutchings@codethink.co.uk Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- mm/gup.c | 3 --- 1 file changed, 3 deletions(-)
--- a/mm/gup.c +++ b/mm/gup.c @@ -1316,7 +1316,6 @@ static int gup_huge_pmd(pmd_t orig, pmd_ head = pmd_page(orig); page = head + ((addr & ~PMD_MASK) >> PAGE_SHIFT); do { - VM_BUG_ON_PAGE(compound_head(page) != head, page); pages[*nr] = page; (*nr)++; page++; @@ -1351,7 +1350,6 @@ static int gup_huge_pud(pud_t orig, pud_ head = pud_page(orig); page = head + ((addr & ~PUD_MASK) >> PAGE_SHIFT); do { - VM_BUG_ON_PAGE(compound_head(page) != head, page); pages[*nr] = page; (*nr)++; page++; @@ -1387,7 +1385,6 @@ static int gup_huge_pgd(pgd_t orig, pgd_ head = pgd_page(orig); page = head + ((addr & ~PGDIR_MASK) >> PAGE_SHIFT); do { - VM_BUG_ON_PAGE(compound_head(page) != head, page); pages[*nr] = page; (*nr)++; page++;
From: Punit Agrawal punit.agrawal@arm.com
commit d63206ee32b6e64b0e12d46e5d6004afd9913713 upstream.
When speculatively taking references to a hugepage using page_cache_add_speculative() in gup_huge_pmd(), it is assumed that the page returned by pmd_page() is the head page. Although normally true, this assumption doesn't hold when the hugepage comprises of successive page table entries such as when using contiguous bit on arm64 at PTE or PMD levels.
This can be addressed by ensuring that the page passed to page_cache_add_speculative() is the real head or by de-referencing the head page within the function.
We take the first approach to keep the usage pattern aligned with page_cache_get_speculative() where users already pass the appropriate page, i.e., the de-referenced head.
Apply the same logic to fix gup_huge_[pud|pgd]() as well.
[punit.agrawal@arm.com: fix arm64 ltp failure] Link: http://lkml.kernel.org/r/20170619170145.25577-5-punit.agrawal@arm.com Link: http://lkml.kernel.org/r/20170522133604.11392-3-punit.agrawal@arm.com Signed-off-by: Punit Agrawal punit.agrawal@arm.com Acked-by: Steve Capper steve.capper@arm.com Cc: Michal Hocko mhocko@suse.com Cc: "Kirill A. Shutemov" kirill.shutemov@linux.intel.com Cc: Aneesh Kumar K.V aneesh.kumar@linux.vnet.ibm.com Cc: Catalin Marinas catalin.marinas@arm.com Cc: Will Deacon will.deacon@arm.com Cc: Naoya Horiguchi n-horiguchi@ah.jp.nec.com Cc: Mark Rutland mark.rutland@arm.com Cc: Hillf Danton hillf.zj@alibaba-inc.com Cc: Mike Kravetz mike.kravetz@oracle.com Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Ben Hutchings ben.hutchings@codethink.co.uk Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- mm/gup.c | 12 ++++++------ 1 file changed, 6 insertions(+), 6 deletions(-)
--- a/mm/gup.c +++ b/mm/gup.c @@ -1313,8 +1313,7 @@ static int gup_huge_pmd(pmd_t orig, pmd_ return 0;
refs = 0; - head = pmd_page(orig); - page = head + ((addr & ~PMD_MASK) >> PAGE_SHIFT); + page = pmd_page(orig) + ((addr & ~PMD_MASK) >> PAGE_SHIFT); do { pages[*nr] = page; (*nr)++; @@ -1322,6 +1321,7 @@ static int gup_huge_pmd(pmd_t orig, pmd_ refs++; } while (addr += PAGE_SIZE, addr != end);
+ head = compound_head(pmd_page(orig)); if (!page_cache_add_speculative(head, refs)) { *nr -= refs; return 0; @@ -1347,8 +1347,7 @@ static int gup_huge_pud(pud_t orig, pud_ return 0;
refs = 0; - head = pud_page(orig); - page = head + ((addr & ~PUD_MASK) >> PAGE_SHIFT); + page = pud_page(orig) + ((addr & ~PUD_MASK) >> PAGE_SHIFT); do { pages[*nr] = page; (*nr)++; @@ -1356,6 +1355,7 @@ static int gup_huge_pud(pud_t orig, pud_ refs++; } while (addr += PAGE_SIZE, addr != end);
+ head = compound_head(pud_page(orig)); if (!page_cache_add_speculative(head, refs)) { *nr -= refs; return 0; @@ -1382,8 +1382,7 @@ static int gup_huge_pgd(pgd_t orig, pgd_ return 0;
refs = 0; - head = pgd_page(orig); - page = head + ((addr & ~PGDIR_MASK) >> PAGE_SHIFT); + page = pgd_page(orig) + ((addr & ~PGDIR_MASK) >> PAGE_SHIFT); do { pages[*nr] = page; (*nr)++; @@ -1391,6 +1390,7 @@ static int gup_huge_pgd(pgd_t orig, pgd_ refs++; } while (addr += PAGE_SIZE, addr != end);
+ head = compound_head(pgd_page(orig)); if (!page_cache_add_speculative(head, refs)) { *nr -= refs; return 0;
From: Linus Torvalds torvalds@linux-foundation.org
commit 8fde12ca79aff9b5ba951fce1a2641901b8d8e64 upstream.
If the page refcount wraps around past zero, it will be freed while there are still four billion references to it. One of the possible avenues for an attacker to try to make this happen is by doing direct IO on a page multiple times. This patch makes get_user_pages() refuse to take a new page reference if there are already more than two billion references to the page.
Reported-by: Jann Horn jannh@google.com Acked-by: Matthew Wilcox willy@infradead.org Signed-off-by: Linus Torvalds torvalds@linux-foundation.org [bwh: Backported to 4.9: - Add the "err" variable in follow_hugetlb_page() - Adjust context] Signed-off-by: Ben Hutchings ben.hutchings@codethink.co.uk Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- mm/gup.c | 45 ++++++++++++++++++++++++++++++++++----------- mm/hugetlb.c | 16 +++++++++++++++- 2 files changed, 49 insertions(+), 12 deletions(-)
--- a/mm/gup.c +++ b/mm/gup.c @@ -153,7 +153,10 @@ retry: }
if (flags & FOLL_GET) { - get_page(page); + if (unlikely(!try_get_page(page))) { + page = ERR_PTR(-ENOMEM); + goto out; + }
/* drop the pgmap reference now that we hold the page */ if (pgmap) { @@ -292,7 +295,10 @@ struct page *follow_page_mask(struct vm_ if (pmd_trans_unstable(pmd)) ret = -EBUSY; } else { - get_page(page); + if (unlikely(!try_get_page(page))) { + spin_unlock(ptl); + return ERR_PTR(-ENOMEM); + } spin_unlock(ptl); lock_page(page); ret = split_huge_page(page); @@ -348,7 +354,10 @@ static int get_gate_page(struct mm_struc goto unmap; *page = pte_page(*pte); } - get_page(*page); + if (unlikely(!try_get_page(*page))) { + ret = -ENOMEM; + goto unmap; + } out: ret = 0; unmap: @@ -1231,6 +1240,20 @@ struct page *get_dump_page(unsigned long */ #ifdef CONFIG_HAVE_GENERIC_RCU_GUP
+/* + * Return the compund head page with ref appropriately incremented, + * or NULL if that failed. + */ +static inline struct page *try_get_compound_head(struct page *page, int refs) +{ + struct page *head = compound_head(page); + if (WARN_ON_ONCE(page_ref_count(head) < 0)) + return NULL; + if (unlikely(!page_cache_add_speculative(head, refs))) + return NULL; + return head; +} + #ifdef __HAVE_ARCH_PTE_SPECIAL static int gup_pte_range(pmd_t pmd, unsigned long addr, unsigned long end, int write, struct page **pages, int *nr) @@ -1263,9 +1286,9 @@ static int gup_pte_range(pmd_t pmd, unsi
VM_BUG_ON(!pfn_valid(pte_pfn(pte))); page = pte_page(pte); - head = compound_head(page);
- if (!page_cache_get_speculative(head)) + head = try_get_compound_head(page, 1); + if (!head) goto pte_unmap;
if (unlikely(pte_val(pte) != pte_val(*ptep))) { @@ -1321,8 +1344,8 @@ static int gup_huge_pmd(pmd_t orig, pmd_ refs++; } while (addr += PAGE_SIZE, addr != end);
- head = compound_head(pmd_page(orig)); - if (!page_cache_add_speculative(head, refs)) { + head = try_get_compound_head(pmd_page(orig), refs); + if (!head) { *nr -= refs; return 0; } @@ -1355,8 +1378,8 @@ static int gup_huge_pud(pud_t orig, pud_ refs++; } while (addr += PAGE_SIZE, addr != end);
- head = compound_head(pud_page(orig)); - if (!page_cache_add_speculative(head, refs)) { + head = try_get_compound_head(pud_page(orig), refs); + if (!head) { *nr -= refs; return 0; } @@ -1390,8 +1413,8 @@ static int gup_huge_pgd(pgd_t orig, pgd_ refs++; } while (addr += PAGE_SIZE, addr != end);
- head = compound_head(pgd_page(orig)); - if (!page_cache_add_speculative(head, refs)) { + head = try_get_compound_head(pgd_page(orig), refs); + if (!head) { *nr -= refs; return 0; } --- a/mm/hugetlb.c +++ b/mm/hugetlb.c @@ -3984,6 +3984,7 @@ long follow_hugetlb_page(struct mm_struc unsigned long vaddr = *position; unsigned long remainder = *nr_pages; struct hstate *h = hstate_vma(vma); + int err = -EFAULT;
while (vaddr < vma->vm_end && remainder) { pte_t *pte; @@ -4055,6 +4056,19 @@ long follow_hugetlb_page(struct mm_struc
pfn_offset = (vaddr & ~huge_page_mask(h)) >> PAGE_SHIFT; page = pte_page(huge_ptep_get(pte)); + + /* + * Instead of doing 'try_get_page()' below in the same_page + * loop, just check the count once here. + */ + if (unlikely(page_count(page) <= 0)) { + if (pages) { + spin_unlock(ptl); + remainder = 0; + err = -ENOMEM; + break; + } + } same_page: if (pages) { pages[i] = mem_map_offset(page, pfn_offset); @@ -4081,7 +4095,7 @@ same_page: *nr_pages = remainder; *position = vaddr;
- return i ? i : -EFAULT; + return i ? i : err; }
#ifndef __HAVE_ARCH_FLUSH_HUGETLB_TLB_RANGE
On 6/9/19 6:42 PM, Greg Kroah-Hartman wrote:
From: Linus Torvalds torvalds@linux-foundation.org
commit 8fde12ca79aff9b5ba951fce1a2641901b8d8e64 upstream.
If the page refcount wraps around past zero, it will be freed while there are still four billion references to it. One of the possible avenues for an attacker to try to make this happen is by doing direct IO on a page multiple times. This patch makes get_user_pages() refuse to take a new page reference if there are already more than two billion references to the page.
Reported-by: Jann Horn jannh@google.com Acked-by: Matthew Wilcox willy@infradead.org Signed-off-by: Linus Torvalds torvalds@linux-foundation.org [bwh: Backported to 4.9:
- Add the "err" variable in follow_hugetlb_page()
- Adjust context]
Signed-off-by: Ben Hutchings ben.hutchings@codethink.co.uk Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
mm/gup.c | 45 ++++++++++++++++++++++++++++++++++----------- mm/hugetlb.c | 16 +++++++++++++++- 2 files changed, 49 insertions(+), 12 deletions(-)
...
@@ -1231,6 +1240,20 @@ struct page *get_dump_page(unsigned long */ #ifdef CONFIG_HAVE_GENERIC_RCU_GUP +/*
- Return the compund head page with ref appropriately incremented,
- or NULL if that failed.
- */
+static inline struct page *try_get_compound_head(struct page *page, int refs) +{
- struct page *head = compound_head(page);
- if (WARN_ON_ONCE(page_ref_count(head) < 0))
return NULL;
- if (unlikely(!page_cache_add_speculative(head, refs)))
return NULL;
- return head;
+}
#ifdef __HAVE_ARCH_PTE_SPECIAL static int gup_pte_range(pmd_t pmd, unsigned long addr, unsigned long end, int write, struct page **pages, int *nr) @@ -1263,9 +1286,9 @@ static int gup_pte_range(pmd_t pmd, unsi VM_BUG_ON(!pfn_valid(pte_pfn(pte))); page = pte_page(pte);
head = compound_head(page);
if (!page_cache_get_speculative(head))
head = try_get_compound_head(page, 1);
BTW, several arches in 4.9, including x86, have arch-specific fast gup implementation, which is not touched by this backport. Didn't check if Jann's exploit ends up using the fast on non-fast one, though.
From: Linus Torvalds torvalds@linux-foundation.org
commit f958d7b528b1b40c44cfda5eabe2d82760d868c3 upstream.
We have a VM_BUG_ON() to check that the page reference count doesn't underflow (or get close to overflow) by checking the sign of the count.
That's all fine, but we actually want to allow people to use a "get page ref unless it's already very high" helper function, and we want that one to use the sign of the page ref (without triggering this VM_BUG_ON).
Change the VM_BUG_ON to only check for small underflows (or _very_ close to overflowing), and ignore overflows which have strayed into negative territory.
Acked-by: Matthew Wilcox willy@infradead.org Cc: Jann Horn jannh@google.com Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Ben Hutchings ben.hutchings@codethink.co.uk Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- include/linux/mm.h | 6 +++++- 1 file changed, 5 insertions(+), 1 deletion(-)
--- a/include/linux/mm.h +++ b/include/linux/mm.h @@ -763,6 +763,10 @@ static inline bool is_zone_device_page(c } #endif
+/* 127: arbitrary random number, small enough to assemble well */ +#define page_ref_zero_or_close_to_overflow(page) \ + ((unsigned int) page_ref_count(page) + 127u <= 127u) + static inline void get_page(struct page *page) { page = compound_head(page); @@ -770,7 +774,7 @@ static inline void get_page(struct page * Getting a normal page or the head of a compound page * requires to already have an elevated page->_refcount. */ - VM_BUG_ON_PAGE(page_ref_count(page) <= 0, page); + VM_BUG_ON_PAGE(page_ref_zero_or_close_to_overflow(page), page); page_ref_inc(page);
if (unlikely(is_zone_device_page(page)))
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
This reverts commit 392bef709659abea614abfe53cf228e7a59876a4.
It seems to cause lots of problems when using the gold linker, and no one really needs this at the moment, so just revert it from the stable trees.
Cc: Sami Tolvanen samitolvanen@google.com Reported-by: Kees Cook keescook@chromium.org Cc: Borislav Petkov bp@suse.de Cc: Linus Torvalds torvalds@linux-foundation.org Cc: Peter Zijlstra peterz@infradead.org Cc: Thomas Gleixner tglx@linutronix.de Reported-by: Alec Ari neotheuser@gmail.com Cc: Ingo Molnar mingo@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/x86/kernel/vmlinux.lds.S | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-)
--- a/arch/x86/kernel/vmlinux.lds.S +++ b/arch/x86/kernel/vmlinux.lds.S @@ -111,10 +111,10 @@ SECTIONS *(.text.__x86.indirect_thunk) __indirect_thunk_end = .; #endif - } :text = 0x9090
- /* End of text section */ - _etext = .; + /* End of text section */ + _etext = .; + } :text = 0x9090
NOTES :text :note
On Sun, Jun 09, 2019 at 06:42:29PM +0200, Greg Kroah-Hartman wrote:
This reverts commit 392bef709659abea614abfe53cf228e7a59876a4.
It seems to cause lots of problems when using the gold linker, and no one really needs this at the moment, so just revert it from the stable trees.
Ah great, I just wrote a report after a build failure upgrading to 4.9.180 due to this one. It fails with older binutils (2.22 for me). I'm cancelling my e-mail now seeing it's already known :-)
I can try some patches if original authors want to try another variant of this patch later.
Cheers, Willy
From: Ard Biesheuvel ard.biesheuvel@linaro.org
commit 60f38de7a8d4e816100ceafd1b382df52527bd50 upstream.
Merge the parsing of the command line carried out in arm-stub.c with the handling in efi_parse_options(). Note that this also fixes the missing handling of CONFIG_CMDLINE_FORCE=y, in which case the builtin command line should supersede the one passed by the firmware.
Signed-off-by: Ard Biesheuvel ard.biesheuvel@linaro.org Cc: Linus Torvalds torvalds@linux-foundation.org Cc: Matt Fleming matt@codeblueprint.co.uk Cc: Peter Zijlstra peterz@infradead.org Cc: Thomas Gleixner tglx@linutronix.de Cc: bhe@redhat.com Cc: bhsharma@redhat.com Cc: bp@alien8.de Cc: eugene@hp.com Cc: evgeny.kalugin@intel.com Cc: jhugo@codeaurora.org Cc: leif.lindholm@linaro.org Cc: linux-efi@vger.kernel.org Cc: mark.rutland@arm.com Cc: roy.franz@cavium.com Cc: rruigrok@codeaurora.org Link: http://lkml.kernel.org/r/20170404160910.28115-1-ard.biesheuvel@linaro.org Signed-off-by: Ingo Molnar mingo@kernel.org [ardb: fix up merge conflicts with 4.9.180] Signed-off-by: Ard Biesheuvel ard.biesheuvel@linaro.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/firmware/efi/libstub/arm-stub.c | 23 +++++++---------------- drivers/firmware/efi/libstub/arm64-stub.c | 4 +--- drivers/firmware/efi/libstub/efi-stub-helper.c | 19 +++++++++++-------- drivers/firmware/efi/libstub/efistub.h | 2 ++ include/linux/efi.h | 2 +- 5 files changed, 22 insertions(+), 28 deletions(-)
--- a/drivers/firmware/efi/libstub/arm-stub.c +++ b/drivers/firmware/efi/libstub/arm-stub.c @@ -18,7 +18,6 @@
#include "efistub.h"
-bool __nokaslr;
static int efi_get_secureboot(efi_system_table_t *sys_table_arg) { @@ -268,18 +267,6 @@ unsigned long efi_entry(void *handle, ef goto fail; }
- /* check whether 'nokaslr' was passed on the command line */ - if (IS_ENABLED(CONFIG_RANDOMIZE_BASE)) { - static const u8 default_cmdline[] = CONFIG_CMDLINE; - const u8 *str, *cmdline = cmdline_ptr; - - if (IS_ENABLED(CONFIG_CMDLINE_FORCE)) - cmdline = default_cmdline; - str = strstr(cmdline, "nokaslr"); - if (str == cmdline || (str > cmdline && *(str - 1) == ' ')) - __nokaslr = true; - } - si = setup_graphics(sys_table);
status = handle_kernel_image(sys_table, image_addr, &image_size, @@ -291,9 +278,13 @@ unsigned long efi_entry(void *handle, ef goto fail_free_cmdline; }
- status = efi_parse_options(cmdline_ptr); - if (status != EFI_SUCCESS) - pr_efi_err(sys_table, "Failed to parse EFI cmdline options\n"); + if (IS_ENABLED(CONFIG_CMDLINE_EXTEND) || + IS_ENABLED(CONFIG_CMDLINE_FORCE) || + cmdline_size == 0) + efi_parse_options(CONFIG_CMDLINE); + + if (!IS_ENABLED(CONFIG_CMDLINE_FORCE) && cmdline_size > 0) + efi_parse_options(cmdline_ptr);
secure_boot = efi_get_secureboot(sys_table); if (secure_boot > 0) --- a/drivers/firmware/efi/libstub/arm64-stub.c +++ b/drivers/firmware/efi/libstub/arm64-stub.c @@ -24,8 +24,6 @@
#include "efistub.h"
-extern bool __nokaslr; - efi_status_t check_platform_features(efi_system_table_t *sys_table_arg) { u64 tg; @@ -60,7 +58,7 @@ efi_status_t handle_kernel_image(efi_sys u64 phys_seed = 0;
if (IS_ENABLED(CONFIG_RANDOMIZE_BASE)) { - if (!__nokaslr) { + if (!nokaslr()) { status = efi_get_random_bytes(sys_table_arg, sizeof(phys_seed), (u8 *)&phys_seed); --- a/drivers/firmware/efi/libstub/efi-stub-helper.c +++ b/drivers/firmware/efi/libstub/efi-stub-helper.c @@ -32,6 +32,13 @@
static unsigned long __chunk_size = EFI_READ_CHUNK_SIZE;
+static int __section(.data) __nokaslr; + +int __pure nokaslr(void) +{ + return __nokaslr; +} + /* * Allow the platform to override the allocation granularity: this allows * systems that have the capability to run with a larger page size to deal @@ -351,17 +358,13 @@ void efi_free(efi_system_table_t *sys_ta * environments, first in the early boot environment of the EFI boot * stub, and subsequently during the kernel boot. */ -efi_status_t efi_parse_options(char *cmdline) +efi_status_t efi_parse_options(char const *cmdline) { char *str;
- /* - * Currently, the only efi= option we look for is 'nochunk', which - * is intended to work around known issues on certain x86 UEFI - * versions. So ignore for now on other architectures. - */ - if (!IS_ENABLED(CONFIG_X86)) - return EFI_SUCCESS; + str = strstr(cmdline, "nokaslr"); + if (str == cmdline || (str && str > cmdline && *(str - 1) == ' ')) + __nokaslr = 1;
/* * If no EFI parameters were specified on the cmdline we've got --- a/drivers/firmware/efi/libstub/efistub.h +++ b/drivers/firmware/efi/libstub/efistub.h @@ -15,6 +15,8 @@ */ #undef __init
+extern int __pure nokaslr(void); + void efi_char16_printk(efi_system_table_t *, efi_char16_t *);
efi_status_t efi_open_volume(efi_system_table_t *sys_table_arg, void *__image, --- a/include/linux/efi.h +++ b/include/linux/efi.h @@ -1427,7 +1427,7 @@ efi_status_t handle_cmdline_files(efi_sy unsigned long *load_addr, unsigned long *load_size);
-efi_status_t efi_parse_options(char *cmdline); +efi_status_t efi_parse_options(char const *cmdline);
efi_status_t efi_setup_gop(efi_system_table_t *sys_table_arg, struct screen_info *si, efi_guid_t *proto,
From: Nadav Amit namit@vmware.com
commit 89dd34caf73e28018c58cd193751e41b1f8bdc56 upstream.
The use of ALIGN() in uvc_alloc_entity() is incorrect, since the size of (entity->pads) is not a power of two. As a stop-gap, until a better solution is adapted, use roundup() instead.
Found by a static assertion. Compile-tested only.
Fixes: 4ffc2d89f38a ("uvcvideo: Register subdevices for each entity")
Signed-off-by: Nadav Amit namit@vmware.com Signed-off-by: Laurent Pinchart laurent.pinchart@ideasonboard.com Signed-off-by: Mauro Carvalho Chehab mchehab+samsung@kernel.org Cc: Doug Anderson dianders@chromium.org Cc: Ben Hutchings ben@decadent.org.uk Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/media/usb/uvc/uvc_driver.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/media/usb/uvc/uvc_driver.c +++ b/drivers/media/usb/uvc/uvc_driver.c @@ -868,7 +868,7 @@ static struct uvc_entity *uvc_alloc_enti unsigned int size; unsigned int i;
- extra_size = ALIGN(extra_size, sizeof(*entity->pads)); + extra_size = roundup(extra_size, sizeof(*entity->pads)); num_inputs = (type & UVC_TERM_OUTPUT) ? num_pads : num_pads - 1; size = sizeof(*entity) + extra_size + sizeof(*entity->pads) * num_pads + num_inputs;
From: Vivien Didelot vivien.didelot@gmail.com
[ Upstream commit 0ee4e76937d69128a6a66861ba393ebdc2ffc8a2 ]
ethtool_get_regs() allocates a buffer of size ops->get_regs_len(), and pass it to the kernel driver via ops->get_regs() for filling.
There is no restriction about what the kernel drivers can or cannot do with the open ethtool_regs structure. They usually set regs->version and ignore regs->len or set it to the same size as ops->get_regs_len().
But if userspace allocates a smaller buffer for the registers dump, we would cause a userspace buffer overflow in the final copy_to_user() call, which uses the regs.len value potentially reset by the driver.
To fix this, make this case obvious and store regs.len before calling ops->get_regs(), to only copy as much data as requested by userspace, up to the value returned by ops->get_regs_len().
While at it, remove the redundant check for non-null regbuf.
Signed-off-by: Vivien Didelot vivien.didelot@gmail.com Reviewed-by: Michal Kubecek mkubecek@suse.cz Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/core/ethtool.c | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-)
--- a/net/core/ethtool.c +++ b/net/core/ethtool.c @@ -1390,13 +1390,16 @@ static int ethtool_get_regs(struct net_d return -ENOMEM; }
+ if (regs.len < reglen) + reglen = regs.len; + ops->get_regs(dev, ®s, regbuf);
ret = -EFAULT; if (copy_to_user(useraddr, ®s, sizeof(regs))) goto out; useraddr += offsetof(struct ethtool_regs, data); - if (regbuf && copy_to_user(useraddr, regbuf, regs.len)) + if (copy_to_user(useraddr, regbuf, reglen)) goto out; ret = 0;
From: David Ahern dsahern@gmail.com
[ Upstream commit 4b2a2bfeb3f056461a90bd621e8bd7d03fa47f60 ]
Commit cd9ff4de0107 changed the key for IFF_POINTOPOINT devices to INADDR_ANY but neigh_xmit which is used for MPLS encapsulations was not updated to use the altered key. The result is that every packet Tx does a lookup on the gateway address which does not find an entry, a new one is created only to find the existing one in the table right before the insert since arp_constructor was updated to reset the primary key. This is seen in the allocs and destroys counters: ip -s -4 ntable show | head -10 | grep alloc
which increase for each packet showing the unnecessary overhread.
Fix by having neigh_xmit use __ipv4_neigh_lookup_noref for NEIGH_ARP_TABLE.
Fixes: cd9ff4de0107 ("ipv4: Make neigh lookup keys for loopback/point-to-point devices be INADDR_ANY") Reported-by: Alan Maguire alan.maguire@oracle.com Signed-off-by: David Ahern dsahern@gmail.com Tested-by: Alan Maguire alan.maguire@oracle.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/core/neighbour.c | 9 ++++++++- 1 file changed, 8 insertions(+), 1 deletion(-)
--- a/net/core/neighbour.c +++ b/net/core/neighbour.c @@ -30,6 +30,7 @@ #include <linux/times.h> #include <net/net_namespace.h> #include <net/neighbour.h> +#include <net/arp.h> #include <net/dst.h> #include <net/sock.h> #include <net/netevent.h> @@ -2489,7 +2490,13 @@ int neigh_xmit(int index, struct net_dev if (!tbl) goto out; rcu_read_lock_bh(); - neigh = __neigh_lookup_noref(tbl, addr, dev); + if (index == NEIGH_ARP_TABLE) { + u32 key = *((u32 *)addr); + + neigh = __ipv4_neigh_lookup_noref(dev, key); + } else { + neigh = __neigh_lookup_noref(tbl, addr, dev); + } if (!neigh) neigh = __neigh_create(tbl, addr, dev, false); err = PTR_ERR(neigh);
From: Erez Alfasi ereza@mellanox.com
[ Upstream commit 135dd9594f127c8a82d141c3c8430e9e2143216a ]
Querying EEPROM high pages data for SFP module is currently not supported by our driver but is still tried, resulting in invalid FW queries.
Set the EEPROM ethtool data length to 256 for SFP module to limit the reading for page 0 only and prevent invalid FW queries.
Fixes: 7202da8b7f71 ("ethtool, net/mlx4_en: Cable info, get_module_info/eeprom ethtool support") Signed-off-by: Erez Alfasi ereza@mellanox.com Signed-off-by: Tariq Toukan tariqt@mellanox.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/ethernet/mellanox/mlx4/en_ethtool.c | 4 +++- drivers/net/ethernet/mellanox/mlx4/port.c | 5 ----- 2 files changed, 3 insertions(+), 6 deletions(-)
--- a/drivers/net/ethernet/mellanox/mlx4/en_ethtool.c +++ b/drivers/net/ethernet/mellanox/mlx4/en_ethtool.c @@ -1930,6 +1930,8 @@ static int mlx4_en_set_tunable(struct ne return ret; }
+#define MLX4_EEPROM_PAGE_LEN 256 + static int mlx4_en_get_module_info(struct net_device *dev, struct ethtool_modinfo *modinfo) { @@ -1964,7 +1966,7 @@ static int mlx4_en_get_module_info(struc break; case MLX4_MODULE_ID_SFP: modinfo->type = ETH_MODULE_SFF_8472; - modinfo->eeprom_len = ETH_MODULE_SFF_8472_LEN; + modinfo->eeprom_len = MLX4_EEPROM_PAGE_LEN; break; default: return -ENOSYS; --- a/drivers/net/ethernet/mellanox/mlx4/port.c +++ b/drivers/net/ethernet/mellanox/mlx4/port.c @@ -1960,11 +1960,6 @@ int mlx4_get_module_info(struct mlx4_dev size -= offset + size - I2C_PAGE_SIZE;
i2c_addr = I2C_ADDR_LOW; - if (offset >= I2C_PAGE_SIZE) { - /* Reset offset to high page */ - i2c_addr = I2C_ADDR_HIGH; - offset -= I2C_PAGE_SIZE; - }
cable_info = (struct mlx4_cable_info *)inmad->data; cable_info->dev_mem_address = cpu_to_be16(offset);
From: Zhu Yanjun yanjun.zhu@oracle.com
[ Upstream commit 85cb928787eab6a2f4ca9d2a798b6f3bed53ced1 ]
When the following tests last for several hours, the problem will occur.
Server: rds-stress -r 1.1.1.16 -D 1M Client: rds-stress -r 1.1.1.14 -s 1.1.1.16 -D 1M -T 30
The following will occur.
" Starting up.... tsks tx/s rx/s tx+rx K/s mbi K/s mbo K/s tx us/c rtt us cpu % 1 0 0 0.00 0.00 0.00 0.00 0.00 -1.00 1 0 0 0.00 0.00 0.00 0.00 0.00 -1.00 1 0 0 0.00 0.00 0.00 0.00 0.00 -1.00 1 0 0 0.00 0.00 0.00 0.00 0.00 -1.00 "
From vmcore, we can find that clean_list is NULL.
From the source code, rds_mr_flushd calls rds_ib_mr_pool_flush_worker.
Then rds_ib_mr_pool_flush_worker calls " rds_ib_flush_mr_pool(pool, 0, NULL); " Then in function " int rds_ib_flush_mr_pool(struct rds_ib_mr_pool *pool, int free_all, struct rds_ib_mr **ibmr_ret) " ibmr_ret is NULL.
In the source code, " ... list_to_llist_nodes(pool, &unmap_list, &clean_nodes, &clean_tail); if (ibmr_ret) *ibmr_ret = llist_entry(clean_nodes, struct rds_ib_mr, llnode);
/* more than one entry in llist nodes */ if (clean_nodes->next) llist_add_batch(clean_nodes->next, clean_tail, &pool->clean_list); ... " When ibmr_ret is NULL, llist_entry is not executed. clean_nodes->next instead of clean_nodes is added in clean_list. So clean_nodes is discarded. It can not be used again. The workqueue is executed periodically. So more and more clean_nodes are discarded. Finally the clean_list is NULL. Then this problem will occur.
Fixes: 1bc144b62524 ("net, rds, Replace xlist in net/rds/xlist.h with llist") Signed-off-by: Zhu Yanjun yanjun.zhu@oracle.com Acked-by: Santosh Shilimkar santosh.shilimkar@oracle.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/rds/ib_rdma.c | 10 ++++++---- 1 file changed, 6 insertions(+), 4 deletions(-)
--- a/net/rds/ib_rdma.c +++ b/net/rds/ib_rdma.c @@ -416,12 +416,14 @@ int rds_ib_flush_mr_pool(struct rds_ib_m wait_clean_list_grace();
list_to_llist_nodes(pool, &unmap_list, &clean_nodes, &clean_tail); - if (ibmr_ret) + if (ibmr_ret) { *ibmr_ret = llist_entry(clean_nodes, struct rds_ib_mr, llnode); - + clean_nodes = clean_nodes->next; + } /* more than one entry in llist nodes */ - if (clean_nodes->next) - llist_add_batch(clean_nodes->next, clean_tail, &pool->clean_list); + if (clean_nodes) + llist_add_batch(clean_nodes, clean_tail, + &pool->clean_list);
}
From: Paolo Abeni pabeni@redhat.com
[ Upstream commit 720f1de4021f09898b8c8443f3b3e995991b6e3a ]
Currently, the process issuing a "start" command on the pktgen procfs interface, acquires the pktgen thread lock and never release it, until all pktgen threads are completed. The above can blocks indefinitely any other pktgen command and any (even unrelated) netdevice removal - as the pktgen netdev notifier acquires the same lock.
The issue is demonstrated by the following script, reported by Matteo:
ip -b - <<'EOF' link add type dummy link add type veth link set dummy0 up EOF modprobe pktgen echo reset >/proc/net/pktgen/pgctrl { echo rem_device_all echo add_device dummy0 } >/proc/net/pktgen/kpktgend_0 echo count 0 >/proc/net/pktgen/dummy0 echo start >/proc/net/pktgen/pgctrl & sleep 1 rmmod veth
Fix the above releasing the thread lock around the sleep call.
Additionally we must prevent racing with forcefull rmmod - as the thread lock no more protects from them. Instead, acquire a self-reference before waiting for any thread. As a side effect, running
rmmod pktgen
while some thread is running now fails with "module in use" error, before this patch such command hanged indefinitely.
Note: the issue predates the commit reported in the fixes tag, but this fix can't be applied before the mentioned commit.
v1 -> v2: - no need to check for thread existence after flipping the lock, pktgen threads are freed only at net exit time -
Fixes: 6146e6a43b35 ("[PKTGEN]: Removes thread_{un,}lock() macros.") Reported-and-tested-by: Matteo Croce mcroce@redhat.com Signed-off-by: Paolo Abeni pabeni@redhat.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/core/pktgen.c | 11 +++++++++++ 1 file changed, 11 insertions(+)
--- a/net/core/pktgen.c +++ b/net/core/pktgen.c @@ -3147,7 +3147,13 @@ static int pktgen_wait_thread_run(struct { while (thread_is_running(t)) {
+ /* note: 't' will still be around even after the unlock/lock + * cycle because pktgen_thread threads are only cleared at + * net exit + */ + mutex_unlock(&pktgen_thread_lock); msleep_interruptible(100); + mutex_lock(&pktgen_thread_lock);
if (signal_pending(current)) goto signal; @@ -3162,6 +3168,10 @@ static int pktgen_wait_all_threads_run(s struct pktgen_thread *t; int sig = 1;
+ /* prevent from racing with rmmod */ + if (!try_module_get(THIS_MODULE)) + return sig; + mutex_lock(&pktgen_thread_lock);
list_for_each_entry(t, &pn->pktgen_threads, th_list) { @@ -3175,6 +3185,7 @@ static int pktgen_wait_all_threads_run(s t->control |= (T_STOP);
mutex_unlock(&pktgen_thread_lock); + module_put(THIS_MODULE); return sig; }
From: Olivier Matz olivier.matz@6wind.com
[ Upstream commit b9aa52c4cb457e7416cc0c95f475e72ef4a61336 ]
The following code returns EFAULT (Bad address):
s = socket(AF_INET6, SOCK_RAW, IPPROTO_ICMPV6); setsockopt(s, SOL_IPV6, IPV6_HDRINCL, 1); sendto(ipv6_icmp6_packet, addr); /* returns -1, errno = EFAULT */
The IPv4 equivalent code works. A workaround is to use IPPROTO_RAW instead of IPPROTO_ICMPV6.
The failure happens because 2 bytes are eaten from the msghdr by rawv6_probe_proto_opt() starting from commit 19e3c66b52ca ("ipv6 equivalent of "ipv4: Avoid reading user iov twice after raw_probe_proto_opt""), but at that time it was not a problem because IPV6_HDRINCL was not yet introduced.
Only eat these 2 bytes if hdrincl == 0.
Fixes: 715f504b1189 ("ipv6: add IPV6_HDRINCL option for raw sockets") Signed-off-by: Olivier Matz olivier.matz@6wind.com Acked-by: Nicolas Dichtel nicolas.dichtel@6wind.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/ipv6/raw.c | 13 ++++++++----- 1 file changed, 8 insertions(+), 5 deletions(-)
--- a/net/ipv6/raw.c +++ b/net/ipv6/raw.c @@ -880,11 +880,14 @@ static int rawv6_sendmsg(struct sock *sk opt = ipv6_fixup_options(&opt_space, opt);
fl6.flowi6_proto = proto; - rfv.msg = msg; - rfv.hlen = 0; - err = rawv6_probe_proto_opt(&rfv, &fl6); - if (err) - goto out; + + if (!hdrincl) { + rfv.msg = msg; + rfv.hlen = 0; + err = rawv6_probe_proto_opt(&rfv, &fl6); + if (err) + goto out; + }
if (!ipv6_addr_any(daddr)) fl6.daddr = *daddr;
From: Olivier Matz olivier.matz@6wind.com
[ Upstream commit 59e3e4b52663a9d97efbce7307f62e4bc5c9ce91 ]
As it was done in commit 8f659a03a0ba ("net: ipv4: fix for a race condition in raw_sendmsg") and commit 20b50d79974e ("net: ipv4: emulate READ_ONCE() on ->hdrincl bit-field in raw_sendmsg()") for ipv4, copy the value of inet->hdrincl in a local variable, to avoid introducing a race condition in the next commit.
Signed-off-by: Olivier Matz olivier.matz@6wind.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/ipv6/raw.c | 12 ++++++++++-- 1 file changed, 10 insertions(+), 2 deletions(-)
--- a/net/ipv6/raw.c +++ b/net/ipv6/raw.c @@ -774,6 +774,7 @@ static int rawv6_sendmsg(struct sock *sk struct sockcm_cookie sockc; struct ipcm6_cookie ipc6; int addr_len = msg->msg_namelen; + int hdrincl; u16 proto; int err;
@@ -787,6 +788,13 @@ static int rawv6_sendmsg(struct sock *sk if (msg->msg_flags & MSG_OOB) return -EOPNOTSUPP;
+ /* hdrincl should be READ_ONCE(inet->hdrincl) + * but READ_ONCE() doesn't work with bit fields. + * Doing this indirectly yields the same result. + */ + hdrincl = inet->hdrincl; + hdrincl = READ_ONCE(hdrincl); + /* * Get and verify the address. */ @@ -904,7 +912,7 @@ static int rawv6_sendmsg(struct sock *sk fl6.flowi6_oif = np->ucast_oif; security_sk_classify_flow(sk, flowi6_to_flowi(&fl6));
- if (inet->hdrincl) + if (hdrincl) fl6.flowi6_flags |= FLOWI_FLAG_KNOWN_NH;
if (ipc6.tclass < 0) @@ -927,7 +935,7 @@ static int rawv6_sendmsg(struct sock *sk goto do_confirm;
back_from_confirm: - if (inet->hdrincl) + if (hdrincl) err = rawv6_send_hdrinc(sk, msg, len, &fl6, &dst, msg->msg_flags); else { ipc6.opt = opt;
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
This reverts commit d5c71a7c533e88a9fcc74fe1b5c25743868fa300 as the patch that this "fixes" is about to be reverted...
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/core/fib_rules.c | 1 - 1 file changed, 1 deletion(-)
--- a/net/core/fib_rules.c +++ b/net/core/fib_rules.c @@ -430,7 +430,6 @@ int fib_nl_newrule(struct sk_buff *skb, goto errout_free;
if (rule_exists(ops, frh, tb, rule)) { - err = 0; if (nlh->nlmsg_flags & NLM_F_EXCL) err = -EEXIST; goto errout_free;
From: Hangbin Liu liuhangbin@gmail.com
[ Upstream commit 4970b42d5c362bf873982db7d93245c5281e58f4 ]
This reverts commit e9919a24d3022f72bcadc407e73a6ef17093a849.
Nathan reported the new behaviour breaks Android, as Android just add new rules and delete old ones.
If we return 0 without adding dup rules, Android will remove the new added rules and causing system to soft-reboot.
Fixes: e9919a24d302 ("fib_rules: return 0 directly if an exactly same rule exists when NLM_F_EXCL not supplied") Reported-by: Nathan Chancellor natechancellor@gmail.com Reported-by: Yaro Slav yaro330@gmail.com Reported-by: Maciej Żenczykowski zenczykowski@gmail.com Signed-off-by: Hangbin Liu liuhangbin@gmail.com Reviewed-by: Nathan Chancellor natechancellor@gmail.com Tested-by: Nathan Chancellor natechancellor@gmail.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/core/fib_rules.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-)
--- a/net/core/fib_rules.c +++ b/net/core/fib_rules.c @@ -429,9 +429,9 @@ int fib_nl_newrule(struct sk_buff *skb, if (rule->l3mdev && rule->table) goto errout_free;
- if (rule_exists(ops, frh, tb, rule)) { - if (nlh->nlmsg_flags & NLM_F_EXCL) - err = -EEXIST; + if ((nlh->nlmsg_flags & NLM_F_EXCL) && + rule_exists(ops, frh, tb, rule)) { + err = -EEXIST; goto errout_free; }
From: Linus Torvalds torvalds@linux-foundation.org
commit 66be4e66a7f422128748e3c3ef6ee72b20a6197b upstream.
Herbert Xu pointed out that commit bb73c52bad36 ("rcu: Don't disable preemption for Tiny and Tree RCU readers") was incorrect in making the preempt_disable/enable() be conditional on CONFIG_PREEMPT_COUNT.
If CONFIG_PREEMPT_COUNT isn't enabled, the preemption enable/disable is a no-op, but still is a compiler barrier.
And RCU locking still _needs_ that compiler barrier.
It is simply fundamentally not true that RCU locking would be a complete no-op: we still need to guarantee (for example) that things that can trap and cause preemption cannot migrate into the RCU locked region.
The way we do that is by making it a barrier.
See for example commit 386afc91144b ("spinlocks and preemption points need to be at least compiler barriers") from back in 2013 that had similar issues with spinlocks that become no-ops on UP: they must still constrain the compiler from moving other operations into the critical region.
Now, it is true that a lot of RCU operations already use READ_ONCE() and WRITE_ONCE() (which in practice likely would never be re-ordered wrt anything remotely interesting), but it is also true that that is not globally the case, and that it's not even necessarily always possible (ie bitfields etc).
Reported-by: Herbert Xu herbert@gondor.apana.org.au Fixes: bb73c52bad36 ("rcu: Don't disable preemption for Tiny and Tree RCU readers") Cc: stable@kernel.org Cc: Boqun Feng boqun.feng@gmail.com Cc: Paul E. McKenney paulmck@linux.vnet.ibm.com Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- include/linux/rcupdate.h | 6 ++---- 1 file changed, 2 insertions(+), 4 deletions(-)
--- a/include/linux/rcupdate.h +++ b/include/linux/rcupdate.h @@ -306,14 +306,12 @@ void synchronize_rcu(void);
static inline void __rcu_read_lock(void) { - if (IS_ENABLED(CONFIG_PREEMPT_COUNT)) - preempt_disable(); + preempt_disable(); }
static inline void __rcu_read_unlock(void) { - if (IS_ENABLED(CONFIG_PREEMPT_COUNT)) - preempt_enable(); + preempt_enable(); }
static inline void synchronize_rcu(void)
From: John David Anglin dave.anglin@bell.net
commit 63923d2c3800919774f5c651d503d1dd2adaddd5 upstream.
We only support I/O to kernel space. Using %sr1 to load the coherence index may be racy unless interrupts are disabled. This patch changes the code used to load the coherence index to use implicit space register selection. This saves one instruction and eliminates the race.
Tested on rp3440, c8000 and c3750.
Signed-off-by: John David Anglin dave.anglin@bell.net Cc: stable@vger.kernel.org Signed-off-by: Helge Deller deller@gmx.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/parisc/ccio-dma.c | 4 +--- drivers/parisc/sba_iommu.c | 3 +-- 2 files changed, 2 insertions(+), 5 deletions(-)
--- a/drivers/parisc/ccio-dma.c +++ b/drivers/parisc/ccio-dma.c @@ -563,8 +563,6 @@ ccio_io_pdir_entry(u64 *pdir_ptr, space_ /* We currently only support kernel addresses */ BUG_ON(sid != KERNEL_SPACE);
- mtsp(sid,1); - /* ** WORD 1 - low order word ** "hints" parm includes the VALID bit! @@ -595,7 +593,7 @@ ccio_io_pdir_entry(u64 *pdir_ptr, space_ ** Grab virtual index [0:11] ** Deposit virt_idx bits into I/O PDIR word */ - asm volatile ("lci %%r0(%%sr1, %1), %0" : "=r" (ci) : "r" (vba)); + asm volatile ("lci %%r0(%1), %0" : "=r" (ci) : "r" (vba)); asm volatile ("extru %1,19,12,%0" : "+r" (ci) : "r" (ci)); asm volatile ("depw %1,15,12,%0" : "+r" (pa) : "r" (ci));
--- a/drivers/parisc/sba_iommu.c +++ b/drivers/parisc/sba_iommu.c @@ -573,8 +573,7 @@ sba_io_pdir_entry(u64 *pdir_ptr, space_t pa = virt_to_phys(vba); pa &= IOVP_MASK;
- mtsp(sid,1); - asm("lci 0(%%sr1, %1), %0" : "=r" (ci) : "r" (vba)); + asm("lci 0(%1), %0" : "=r" (ci) : "r" (vba)); pa |= (ci >> PAGE_SHIFT) & 0xff; /* move CI (8 bits) into lowest byte */
pa |= SBA_PDIR_VALID_BIT; /* set "valid" bit */
From: Miklos Szeredi mszeredi@redhat.com
commit 35d6fcbb7c3e296a52136347346a698a35af3fda upstream.
Do the proper cleanup in case the size check fails.
Tested with xfstests:generic/228
Reported-by: kbuild test robot lkp@intel.com Reported-by: Dan Carpenter dan.carpenter@oracle.com Fixes: 0cbade024ba5 ("fuse: honor RLIMIT_FSIZE in fuse_file_fallocate") Cc: Liu Bo bo.liu@linux.alibaba.com Cc: stable@vger.kernel.org # v3.5 Signed-off-by: Miklos Szeredi mszeredi@redhat.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- fs/fuse/file.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/fs/fuse/file.c +++ b/fs/fuse/file.c @@ -2965,7 +2965,7 @@ static long fuse_file_fallocate(struct f offset + length > i_size_read(inode)) { err = inode_newsize_ok(inode, offset + length); if (err) - return err; + goto out; }
if (!(mode & FALLOC_FL_KEEP_SIZE))
From: Jiri Kosina jkosina@suse.cz
commit ec527c318036a65a083ef68d8ba95789d2212246 upstream.
As explained in
0cc3cd21657b ("cpu/hotplug: Boot HT siblings at least once")
we always, no matter what, have to bring up x86 HT siblings during boot at least once in order to avoid first MCE bringing the system to its knees.
That means that whenever 'nosmt' is supplied on the kernel command-line, all the HT siblings are as a result sitting in mwait or cpudile after going through the online-offline cycle at least once.
This causes a serious issue though when a kernel, which saw 'nosmt' on its commandline, is going to perform resume from hibernation: if the resume from the hibernated image is successful, cr3 is flipped in order to point to the address space of the kernel that is being resumed, which in turn means that all the HT siblings are all of a sudden mwaiting on address which is no longer valid.
That results in triple fault shortly after cr3 is switched, and machine reboots.
Fix this by always waking up all the SMT siblings before initiating the 'restore from hibernation' process; this guarantees that all the HT siblings will be properly carried over to the resumed kernel waiting in resume_play_dead(), and acted upon accordingly afterwards, based on the target kernel configuration.
Symmetricaly, the resumed kernel has to push the SMT siblings to mwait again in case it has SMT disabled; this means it has to online all the siblings when resuming (so that they come out of hlt) and offline them again to let them reach mwait.
Cc: 4.19+ stable@vger.kernel.org # v4.19+ Debugged-by: Thomas Gleixner tglx@linutronix.de Fixes: 0cc3cd21657b ("cpu/hotplug: Boot HT siblings at least once") Signed-off-by: Jiri Kosina jkosina@suse.cz Acked-by: Pavel Machek pavel@ucw.cz Reviewed-by: Thomas Gleixner tglx@linutronix.de Reviewed-by: Josh Poimboeuf jpoimboe@redhat.com Signed-off-by: Rafael J. Wysocki rafael.j.wysocki@intel.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- arch/x86/power/cpu.c | 10 ++++++++++ arch/x86/power/hibernate_64.c | 33 +++++++++++++++++++++++++++++++++ include/linux/cpu.h | 4 ++++ kernel/cpu.c | 4 ++-- kernel/power/hibernate.c | 9 +++++++++ 5 files changed, 58 insertions(+), 2 deletions(-)
--- a/arch/x86/power/cpu.c +++ b/arch/x86/power/cpu.c @@ -292,7 +292,17 @@ int hibernate_resume_nonboot_cpu_disable * address in its instruction pointer may not be possible to resolve * any more at that point (the page tables used by it previously may * have been overwritten by hibernate image data). + * + * First, make sure that we wake up all the potentially disabled SMT + * threads which have been initially brought up and then put into + * mwait/cpuidle sleep. + * Those will be put to proper (not interfering with hibernation + * resume) sleep afterwards, and the resumed kernel will decide itself + * what to do with them. */ + ret = cpuhp_smt_enable(); + if (ret) + return ret; smp_ops.play_dead = resume_play_dead; ret = disable_nonboot_cpus(); smp_ops.play_dead = play_dead; --- a/arch/x86/power/hibernate_64.c +++ b/arch/x86/power/hibernate_64.c @@ -11,6 +11,7 @@ #include <linux/gfp.h> #include <linux/smp.h> #include <linux/suspend.h> +#include <linux/cpu.h>
#include <asm/init.h> #include <asm/proto.h> @@ -218,3 +219,35 @@ int arch_hibernation_header_restore(void restore_cr3 = rdr->cr3; return (rdr->magic == RESTORE_MAGIC) ? 0 : -EINVAL; } + +int arch_resume_nosmt(void) +{ + int ret = 0; + /* + * We reached this while coming out of hibernation. This means + * that SMT siblings are sleeping in hlt, as mwait is not safe + * against control transition during resume (see comment in + * hibernate_resume_nonboot_cpu_disable()). + * + * If the resumed kernel has SMT disabled, we have to take all the + * SMT siblings out of hlt, and offline them again so that they + * end up in mwait proper. + * + * Called with hotplug disabled. + */ + cpu_hotplug_enable(); + if (cpu_smt_control == CPU_SMT_DISABLED || + cpu_smt_control == CPU_SMT_FORCE_DISABLED) { + enum cpuhp_smt_control old = cpu_smt_control; + + ret = cpuhp_smt_enable(); + if (ret) + goto out; + ret = cpuhp_smt_disable(old); + if (ret) + goto out; + } +out: + cpu_hotplug_disable(); + return ret; +} --- a/include/linux/cpu.h +++ b/include/linux/cpu.h @@ -271,11 +271,15 @@ extern enum cpuhp_smt_control cpu_smt_co extern void cpu_smt_disable(bool force); extern void cpu_smt_check_topology_early(void); extern void cpu_smt_check_topology(void); +extern int cpuhp_smt_enable(void); +extern int cpuhp_smt_disable(enum cpuhp_smt_control ctrlval); #else # define cpu_smt_control (CPU_SMT_ENABLED) static inline void cpu_smt_disable(bool force) { } static inline void cpu_smt_check_topology_early(void) { } static inline void cpu_smt_check_topology(void) { } +static inline int cpuhp_smt_enable(void) { return 0; } +static inline int cpuhp_smt_disable(enum cpuhp_smt_control ctrlval) { return 0; } #endif
/* --- a/kernel/cpu.c +++ b/kernel/cpu.c @@ -1995,7 +1995,7 @@ static void cpuhp_online_cpu_device(unsi kobject_uevent(&dev->kobj, KOBJ_ONLINE); }
-static int cpuhp_smt_disable(enum cpuhp_smt_control ctrlval) +int cpuhp_smt_disable(enum cpuhp_smt_control ctrlval) { int cpu, ret = 0;
@@ -2029,7 +2029,7 @@ static int cpuhp_smt_disable(enum cpuhp_ return ret; }
-static int cpuhp_smt_enable(void) +int cpuhp_smt_enable(void) { int cpu, ret = 0;
--- a/kernel/power/hibernate.c +++ b/kernel/power/hibernate.c @@ -256,6 +256,11 @@ void swsusp_show_speed(ktime_t start, kt kps / 1000, (kps % 1000) / 10); }
+__weak int arch_resume_nosmt(void) +{ + return 0; +} + /** * create_image - Create a hibernation image. * @platform_mode: Whether or not to use the platform driver. @@ -322,6 +327,10 @@ static int create_image(int platform_mod Enable_cpus: enable_nonboot_cpus();
+ /* Allow architectures to do nosmt-specific post-resume dances */ + if (!in_suspend) + error = arch_resume_nosmt(); + Platform_finish: platform_finish(platform_mode);
From: Paul Burton paul.burton@mips.com
commit e4f2d1af7163becb181419af9dece9206001e0a6 upstream.
The pistachio platform uses the U-Boot bootloader & generally boots a kernel in the uImage format. As such it's useful to build one when building the kernel, but to do so currently requires the user to manually specify a uImage target on the make command line.
Make uImage.gz the pistachio platform's default build target, so that the default is to build a kernel image that we can actually boot on a board such as the MIPS Creator Ci40.
Marked for stable backport as far as v4.1 where pistachio support was introduced. This is primarily useful for CI systems such as kernelci.org which will benefit from us building a suitable image which can then be booted as part of automated testing, extending our test coverage to the affected stable branches.
Signed-off-by: Paul Burton paul.burton@mips.com Reviewed-by: Philippe Mathieu-Daudé f4bug@amsat.org Reviewed-by: Kevin Hilman khilman@baylibre.com Tested-by: Kevin Hilman khilman@baylibre.com URL: https://groups.io/g/kernelci/message/388 Cc: stable@vger.kernel.org # v4.1+ Cc: linux-mips@vger.kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- arch/mips/pistachio/Platform | 1 + 1 file changed, 1 insertion(+)
--- a/arch/mips/pistachio/Platform +++ b/arch/mips/pistachio/Platform @@ -6,3 +6,4 @@ cflags-$(CONFIG_MACH_PISTACHIO) += \ -I$(srctree)/arch/mips/include/asm/mach-pistachio load-$(CONFIG_MACH_PISTACHIO) += 0xffffffff80400000 zload-$(CONFIG_MACH_PISTACHIO) += 0xffffffff81000000 +all-$(CONFIG_MACH_PISTACHIO) := uImage.gz
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
This reverts commit f9b1baac265600a61d36ebaf9ba657119303b5b5 which is commit a1e8783db8e0d58891681bc1e6d9ada66eae8e20 upstream.
Petr writes: Karl has reported to me today, that he's experiencing weird reboot hang on his devices with 4.9.180 kernel and that he has bisected it down to my backported patch.
I would like to kindly ask you for removal of this patch. This patch should be reverted from all stable kernels up to 5.1, because perf counters were not broken on those kernels, and this patch won't work on the ath79 legacy IRQ code anyway, it needs new irqchip driver which was enabled on ath79 with commit 51fa4f8912c0 ("MIPS: ath79: drop legacy IRQ code").
Reported-by: Petr Štetiar ynezz@true.cz Cc: Kevin 'ldir' Darbyshire-Bryant ldir@darbyshire-bryant.me.uk Cc: John Crispin john@phrozen.org Cc: Marc Zyngier marc.zyngier@arm.com Cc: Paul Burton paul.burton@mips.com Cc: linux-mips@vger.kernel.org Cc: Ralf Baechle ralf@linux-mips.org Cc: James Hogan jhogan@kernel.org Cc: Thomas Gleixner tglx@linutronix.de Cc: Jason Cooper jason@lakedaemon.net Cc: Sasha Levin sashal@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/mips/ath79/setup.c | 6 ++++++ drivers/irqchip/irq-ath79-misc.c | 11 ----------- 2 files changed, 6 insertions(+), 11 deletions(-)
--- a/arch/mips/ath79/setup.c +++ b/arch/mips/ath79/setup.c @@ -183,6 +183,12 @@ const char *get_system_type(void) return ath79_sys_type; }
+int get_c0_perfcount_int(void) +{ + return ATH79_MISC_IRQ(5); +} +EXPORT_SYMBOL_GPL(get_c0_perfcount_int); + unsigned int get_c0_compare_int(void) { return CP0_LEGACY_COMPARE_IRQ; --- a/drivers/irqchip/irq-ath79-misc.c +++ b/drivers/irqchip/irq-ath79-misc.c @@ -22,15 +22,6 @@ #define AR71XX_RESET_REG_MISC_INT_ENABLE 4
#define ATH79_MISC_IRQ_COUNT 32 -#define ATH79_MISC_PERF_IRQ 5 - -static int ath79_perfcount_irq; - -int get_c0_perfcount_int(void) -{ - return ath79_perfcount_irq; -} -EXPORT_SYMBOL_GPL(get_c0_perfcount_int);
static void ath79_misc_irq_handler(struct irq_desc *desc) { @@ -122,8 +113,6 @@ static void __init ath79_misc_intc_domai { void __iomem *base = domain->host_data;
- ath79_perfcount_irq = irq_create_mapping(domain, ATH79_MISC_PERF_IRQ); - /* Disable and clear all interrupts */ __raw_writel(0, base + AR71XX_RESET_REG_MISC_INT_ENABLE); __raw_writel(0, base + AR71XX_RESET_REG_MISC_INT_STATUS);
From: Dan Carpenter dan.carpenter@oracle.com
commit 110080cea0d0e4dfdb0b536e7f8a5633ead6a781 upstream.
There are a couple potential integer overflows here.
round_up(m->size + (m->addr & ~PAGE_MASK), PAGE_SIZE);
The first thing is that the "m->size + (...)" addition could overflow, and the second is that round_up() overflows to zero if the result is within PAGE_SIZE of the type max.
In this code, the "m->size" variable is an u64 but we're saving the result in "map_size" which is an unsigned long and genwqe_user_vmap() takes an unsigned long as well. So I have used ULONG_MAX as the upper bound. From a practical perspective unsigned long is fine/better than trying to change all the types to u64.
Fixes: eaf4722d4645 ("GenWQE Character device and DDCB queue") Signed-off-by: Dan Carpenter dan.carpenter@oracle.com Cc: stable stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/misc/genwqe/card_dev.c | 2 ++ drivers/misc/genwqe/card_utils.c | 4 ++++ 2 files changed, 6 insertions(+)
--- a/drivers/misc/genwqe/card_dev.c +++ b/drivers/misc/genwqe/card_dev.c @@ -782,6 +782,8 @@ static int genwqe_pin_mem(struct genwqe_
if ((m->addr == 0x0) || (m->size == 0)) return -EINVAL; + if (m->size > ULONG_MAX - PAGE_SIZE - (m->addr & ~PAGE_MASK)) + return -EINVAL;
map_addr = (m->addr & PAGE_MASK); map_size = round_up(m->size + (m->addr & ~PAGE_MASK), PAGE_SIZE); --- a/drivers/misc/genwqe/card_utils.c +++ b/drivers/misc/genwqe/card_utils.c @@ -582,6 +582,10 @@ int genwqe_user_vmap(struct genwqe_dev * /* determine space needed for page_list. */ data = (unsigned long)uaddr; offs = offset_in_page(data); + if (size > ULONG_MAX - PAGE_SIZE - offs) { + m->size = 0; /* mark unused and not added */ + return -EINVAL; + } m->nr_pages = DIV_ROUND_UP(offs + size, PAGE_SIZE);
m->page_list = kcalloc(m->nr_pages,
From: Patrik Jakobsson patrik.r.jakobsson@gmail.com
commit 7c420636860a719049fae9403e2c87804f53bdde upstream.
Some machines have an lvds child device in vbt even though a panel is not attached. To make detection more reliable we now also check the lvds config bits available in the vbt.
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=1665766 Cc: stable@vger.kernel.org Reviewed-by: Hans de Goede hdegoede@redhat.com Signed-off-by: Patrik Jakobsson patrik.r.jakobsson@gmail.com Link: https://patchwork.freedesktop.org/patch/msgid/20190416114607.1072-1-patrik.r... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/gpu/drm/gma500/cdv_intel_lvds.c | 3 +++ drivers/gpu/drm/gma500/intel_bios.c | 3 +++ drivers/gpu/drm/gma500/psb_drv.h | 1 + 3 files changed, 7 insertions(+)
--- a/drivers/gpu/drm/gma500/cdv_intel_lvds.c +++ b/drivers/gpu/drm/gma500/cdv_intel_lvds.c @@ -609,6 +609,9 @@ void cdv_intel_lvds_init(struct drm_devi int pipe; u8 pin;
+ if (!dev_priv->lvds_enabled_in_vbt) + return; + pin = GMBUS_PORT_PANEL; if (!lvds_is_present_in_vbt(dev, &pin)) { DRM_DEBUG_KMS("LVDS is not present in VBT\n"); --- a/drivers/gpu/drm/gma500/intel_bios.c +++ b/drivers/gpu/drm/gma500/intel_bios.c @@ -436,6 +436,9 @@ parse_driver_features(struct drm_psb_pri if (driver->lvds_config == BDB_DRIVER_FEATURE_EDP) dev_priv->edp.support = 1;
+ dev_priv->lvds_enabled_in_vbt = driver->lvds_config != 0; + DRM_DEBUG_KMS("LVDS VBT config bits: 0x%x\n", driver->lvds_config); + /* This bit means to use 96Mhz for DPLL_A or not */ if (driver->primary_lfp_id) dev_priv->dplla_96mhz = true; --- a/drivers/gpu/drm/gma500/psb_drv.h +++ b/drivers/gpu/drm/gma500/psb_drv.h @@ -538,6 +538,7 @@ struct drm_psb_private { int lvds_ssc_freq; bool is_lvds_on; bool is_mipi_on; + bool lvds_enabled_in_vbt; u32 mipi_ctrl_display;
unsigned int core_freq;
From: Christian König christian.koenig@amd.com
commit 2e26ccb119bde03584be53406bbd22e711b0d6e6 upstream.
Instead of the closest reference divider prefer the lowest, this fixes flickering issues on HP Compaq nx9420.
Bugs: https://bugs.freedesktop.org/show_bug.cgi?id=108514 Suggested-by: Paul Dufresne dufresnep@gmail.com Signed-off-by: Christian König christian.koenig@amd.com Acked-by: Alex Deucher alexander.deucher@amd.com Signed-off-by: Alex Deucher alexander.deucher@amd.com Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/gpu/drm/radeon/radeon_display.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
--- a/drivers/gpu/drm/radeon/radeon_display.c +++ b/drivers/gpu/drm/radeon/radeon_display.c @@ -935,12 +935,12 @@ static void avivo_get_fb_ref_div(unsigne ref_div_max = max(min(100 / post_div, ref_div_max), 1u);
/* get matching reference and feedback divider */ - *ref_div = min(max(DIV_ROUND_CLOSEST(den, post_div), 1u), ref_div_max); + *ref_div = min(max(den/post_div, 1u), ref_div_max); *fb_div = DIV_ROUND_CLOSEST(nom * *ref_div * post_div, den);
/* limit fb divider to its maximum */ if (*fb_div > fb_div_max) { - *ref_div = DIV_ROUND_CLOSEST(*ref_div * fb_div_max, *fb_div); + *ref_div = (*ref_div * fb_div_max)/(*fb_div); *fb_div = fb_div_max; } }
From: Chris Wilson chris@chris-wilson.co.uk
commit d90c06d57027203f73021bb7ddb30b800d65c636 upstream.
This was supposed to be a mask of all known rings, but it is being used by execbuffer to filter out invalid rings, and so is instead mapping high unused values onto valid rings. Instead of a mask of all known rings, we need it to be the mask of all possible rings.
Fixes: 549f7365820a ("drm/i915: Enable SandyBridge blitter ring") Fixes: de1add360522 ("drm/i915: Decouple execbuf uAPI from internal implementation") Signed-off-by: Chris Wilson chris@chris-wilson.co.uk Cc: Tvrtko Ursulin tvrtko.ursulin@intel.com Cc: stable@vger.kernel.org # v4.6+ Reviewed-by: Tvrtko Ursulin tvrtko.ursulin@intel.com Link: https://patchwork.freedesktop.org/patch/msgid/20190301140404.26690-21-chris@... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- include/uapi/drm/i915_drm.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/include/uapi/drm/i915_drm.h +++ b/include/uapi/drm/i915_drm.h @@ -756,7 +756,7 @@ struct drm_i915_gem_execbuffer2 { __u32 num_cliprects; /** This is a struct drm_clip_rect *cliprects */ __u64 cliprects_ptr; -#define I915_EXEC_RING_MASK (7<<0) +#define I915_EXEC_RING_MASK (0x3f) #define I915_EXEC_DEFAULT (0<<0) #define I915_EXEC_RENDER (1<<0) #define I915_EXEC_BSD (2<<0)
From: Jiri Slaby jslaby@suse.cz
commit 4cdd17ba1dff20ffc99fdbd2e6f0201fc7fe67df upstream.
We need to compute the uart state only on the first open. This is usually what is done in the ->install hook. serial_core used to do this in ->open on every open. So move it to ->install.
As a side effect, it ensures the state is set properly in the window after tty_init_dev is called, but before uart_open. This fixes a bunch of races between tty_open and flush_to_ldisc we were dealing with recently.
One of such bugs was attempted to fix in commit fedb5760648a (serial: fix race between flush_to_ldisc and tty_open), but it only took care of a couple of functions (uart_start and uart_unthrottle). I was able to reproduce the crash on a SLE system, but in uart_write_room which is also called from flush_to_ldisc via process_echoes. I was *unable* to reproduce the bug locally. It is due to having this patch in my queue since 2012!
general protection fault: 0000 [#1] SMP KASAN PTI CPU: 1 PID: 5 Comm: kworker/u4:0 Tainted: G L 4.12.14-396-default #1 SLE15-SP1 (unreleased) Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.0-0-ga698c89-prebuilt.qemu.org 04/01/2014 Workqueue: events_unbound flush_to_ldisc task: ffff8800427d8040 task.stack: ffff8800427f0000 RIP: 0010:uart_write_room+0xc4/0x590 RSP: 0018:ffff8800427f7088 EFLAGS: 00010202 RAX: dffffc0000000000 RBX: 0000000000000000 RCX: 0000000000000000 RDX: 000000000000002f RSI: 00000000000000ee RDI: ffff88003888bd90 RBP: ffffffffb9545850 R08: 0000000000000001 R09: 0000000000000400 R10: ffff8800427d825c R11: 000000000000006e R12: 1ffff100084fee12 R13: ffffc900004c5000 R14: ffff88003888bb28 R15: 0000000000000178 FS: 0000000000000000(0000) GS:ffff880043300000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000561da0794148 CR3: 000000000ebf4000 CR4: 00000000000006e0 Call Trace: tty_write_room+0x6d/0xc0 __process_echoes+0x55/0x870 n_tty_receive_buf_common+0x105e/0x26d0 tty_ldisc_receive_buf+0xb7/0x1c0 tty_port_default_receive_buf+0x107/0x180 flush_to_ldisc+0x35d/0x5c0 ...
0 in rbx means tty->driver_data is NULL in uart_write_room. 0x178 is tried to be dereferenced (0x178 >> 3 is 0x2f in rdx) at uart_write_room+0xc4. 0x178 is exactly (struct uart_state *)NULL->refcount used in uart_port_lock from uart_write_room.
So revert the upstream commit here as my local patch should fix the whole family.
Signed-off-by: Jiri Slaby jslaby@suse.cz Cc: Li RongQing lirongqing@baidu.com Cc: Wang Li wangli39@baidu.com Cc: Zhang Yu zhangyu31@baidu.com Cc: Greg Kroah-Hartman gregkh@linuxfoundation.org Cc: stable stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/tty/serial/serial_core.c | 24 +++++++++++++----------- 1 file changed, 13 insertions(+), 11 deletions(-)
--- a/drivers/tty/serial/serial_core.c +++ b/drivers/tty/serial/serial_core.c @@ -141,9 +141,6 @@ static void uart_start(struct tty_struct struct uart_port *port; unsigned long flags;
- if (!state) - return; - port = uart_port_lock(state, flags); __uart_start(tty); uart_port_unlock(port, flags); @@ -1714,11 +1711,8 @@ static void uart_dtr_rts(struct tty_port */ static int uart_open(struct tty_struct *tty, struct file *filp) { - struct uart_driver *drv = tty->driver->driver_state; - int retval, line = tty->index; - struct uart_state *state = drv->state + line; - - tty->driver_data = state; + struct uart_state *state = tty->driver_data; + int retval;
retval = tty_port_open(&state->port, tty, filp); if (retval > 0) @@ -2409,9 +2403,6 @@ static void uart_poll_put_char(struct tt struct uart_state *state = drv->state + line; struct uart_port *port;
- if (!state) - return; - port = uart_port_ref(state); if (!port) return; @@ -2423,7 +2414,18 @@ static void uart_poll_put_char(struct tt } #endif
+static int uart_install(struct tty_driver *driver, struct tty_struct *tty) +{ + struct uart_driver *drv = driver->driver_state; + struct uart_state *state = drv->state + tty->index; + + tty->driver_data = state; + + return tty_standard_install(driver, tty); +} + static const struct tty_operations uart_ops = { + .install = uart_install, .open = uart_open, .close = uart_close, .write = uart_write,
From: Kirill Smelkov kirr@nexedi.com
commit 10dce8af34226d90fa56746a934f8da5dcdba3df upstream.
Commit 9c225f2655e3 ("vfs: atomic f_pos accesses as per POSIX") added locking for file.f_pos access and in particular made concurrent read and write not possible - now both those functions take f_pos lock for the whole run, and so if e.g. a read is blocked waiting for data, write will deadlock waiting for that read to complete.
This caused regression for stream-like files where previously read and write could run simultaneously, but after that patch could not do so anymore. See e.g. commit 581d21a2d02a ("xenbus: fix deadlock on writes to /proc/xen/xenbus") which fixes such regression for particular case of /proc/xen/xenbus.
The patch that added f_pos lock in 2014 did so to guarantee POSIX thread safety for read/write/lseek and added the locking to file descriptors of all regular files. In 2014 that thread-safety problem was not new as it was already discussed earlier in 2006.
However even though 2006'th version of Linus's patch was adding f_pos locking "only for files that are marked seekable with FMODE_LSEEK (thus avoiding the stream-like objects like pipes and sockets)", the 2014 version - the one that actually made it into the tree as 9c225f2655e3 - is doing so irregardless of whether a file is seekable or not.
See
https://lore.kernel.org/lkml/53022DB1.4070805@gmail.com/ https://lwn.net/Articles/180387 https://lwn.net/Articles/180396
for historic context.
The reason that it did so is, probably, that there are many files that are marked non-seekable, but e.g. their read implementation actually depends on knowing current position to correctly handle the read. Some examples:
kernel/power/user.c snapshot_read fs/debugfs/file.c u32_array_read fs/fuse/control.c fuse_conn_waiting_read + ... drivers/hwmon/asus_atk0110.c atk_debugfs_ggrp_read arch/s390/hypfs/inode.c hypfs_read_iter ...
Despite that, many nonseekable_open users implement read and write with pure stream semantics - they don't depend on passed ppos at all. And for those cases where read could wait for something inside, it creates a situation similar to xenbus - the write could be never made to go until read is done, and read is waiting for some, potentially external, event, for potentially unbounded time -> deadlock.
Besides xenbus, there are 14 such places in the kernel that I've found with semantic patch (see below):
drivers/xen/evtchn.c:667:8-24: ERROR: evtchn_fops: .read() can deadlock .write() drivers/isdn/capi/capi.c:963:8-24: ERROR: capi_fops: .read() can deadlock .write() drivers/input/evdev.c:527:1-17: ERROR: evdev_fops: .read() can deadlock .write() drivers/char/pcmcia/cm4000_cs.c:1685:7-23: ERROR: cm4000_fops: .read() can deadlock .write() net/rfkill/core.c:1146:8-24: ERROR: rfkill_fops: .read() can deadlock .write() drivers/s390/char/fs3270.c:488:1-17: ERROR: fs3270_fops: .read() can deadlock .write() drivers/usb/misc/ldusb.c:310:1-17: ERROR: ld_usb_fops: .read() can deadlock .write() drivers/hid/uhid.c:635:1-17: ERROR: uhid_fops: .read() can deadlock .write() net/batman-adv/icmp_socket.c:80:1-17: ERROR: batadv_fops: .read() can deadlock .write() drivers/media/rc/lirc_dev.c:198:1-17: ERROR: lirc_fops: .read() can deadlock .write() drivers/leds/uleds.c:77:1-17: ERROR: uleds_fops: .read() can deadlock .write() drivers/input/misc/uinput.c:400:1-17: ERROR: uinput_fops: .read() can deadlock .write() drivers/infiniband/core/user_mad.c:985:7-23: ERROR: umad_fops: .read() can deadlock .write() drivers/gnss/core.c:45:1-17: ERROR: gnss_fops: .read() can deadlock .write()
In addition to the cases above another regression caused by f_pos locking is that now FUSE filesystems that implement open with FOPEN_NONSEEKABLE flag, can no longer implement bidirectional stream-like files - for the same reason as above e.g. read can deadlock write locking on file.f_pos in the kernel.
FUSE's FOPEN_NONSEEKABLE was added in 2008 in a7c1b990f715 ("fuse: implement nonseekable open") to support OSSPD. OSSPD implements /dev/dsp in userspace with FOPEN_NONSEEKABLE flag, with corresponding read and write routines not depending on current position at all, and with both read and write being potentially blocking operations:
See
https://github.com/libfuse/osspd https://lwn.net/Articles/308445
https://github.com/libfuse/osspd/blob/14a9cff0/osspd.c#L1406 https://github.com/libfuse/osspd/blob/14a9cff0/osspd.c#L1438-L1477 https://github.com/libfuse/osspd/blob/14a9cff0/osspd.c#L1479-L1510
Corresponding libfuse example/test also describes FOPEN_NONSEEKABLE as "somewhat pipe-like files ..." with read handler not using offset. However that test implements only read without write and cannot exercise the deadlock scenario:
https://github.com/libfuse/libfuse/blob/fuse-3.4.2-3-ga1bff7d/example/poll.c... https://github.com/libfuse/libfuse/blob/fuse-3.4.2-3-ga1bff7d/example/poll.c... https://github.com/libfuse/libfuse/blob/fuse-3.4.2-3-ga1bff7d/example/poll.c...
I've actually hit the read vs write deadlock for real while implementing my FUSE filesystem where there is /head/watch file, for which open creates separate bidirectional socket-like stream in between filesystem and its user with both read and write being later performed simultaneously. And there it is semantically not easy to split the stream into two separate read-only and write-only channels:
https://lab.nexedi.com/kirr/wendelin.core/blob/f13aa600/wcfs/wcfs.go#L88-169
Let's fix this regression. The plan is:
1. We can't change nonseekable_open to include &~FMODE_ATOMIC_POS - doing so would break many in-kernel nonseekable_open users which actually use ppos in read/write handlers.
2. Add stream_open() to kernel to open stream-like non-seekable file descriptors. Read and write on such file descriptors would never use nor change ppos. And with that property on stream-like files read and write will be running without taking f_pos lock - i.e. read and write could be running simultaneously.
3. With semantic patch search and convert to stream_open all in-kernel nonseekable_open users for which read and write actually do not depend on ppos and where there is no other methods in file_operations which assume @offset access.
4. Add FOPEN_STREAM to fs/fuse/ and open in-kernel file-descriptors via steam_open if that bit is present in filesystem open reply.
It was tempting to change fs/fuse/ open handler to use stream_open instead of nonseekable_open on just FOPEN_NONSEEKABLE flags, but grepping through Debian codesearch shows users of FOPEN_NONSEEKABLE, and in particular GVFS which actually uses offset in its read and write handlers
https://codesearch.debian.net/search?q=-%3Enonseekable+%3D https://gitlab.gnome.org/GNOME/gvfs/blob/1.40.0-6-gcbc54396/client/gvfsfused... https://gitlab.gnome.org/GNOME/gvfs/blob/1.40.0-6-gcbc54396/client/gvfsfused... https://gitlab.gnome.org/GNOME/gvfs/blob/1.40.0-6-gcbc54396/client/gvfsfused...
so if we would do such a change it will break a real user.
5. Add stream_open and FOPEN_STREAM handling to stable kernels starting from v3.14+ (the kernel where 9c225f2655 first appeared).
This will allow to patch OSSPD and other FUSE filesystems that provide stream-like files to return FOPEN_STREAM | FOPEN_NONSEEKABLE in their open handler and this way avoid the deadlock on all kernel versions. This should work because fs/fuse/ ignores unknown open flags returned from a filesystem and so passing FOPEN_STREAM to a kernel that is not aware of this flag cannot hurt. In turn the kernel that is not aware of FOPEN_STREAM will be < v3.14 where just FOPEN_NONSEEKABLE is sufficient to implement streams without read vs write deadlock.
This patch adds stream_open, converts /proc/xen/xenbus to it and adds semantic patch to automatically locate in-kernel places that are either required to be converted due to read vs write deadlock, or that are just safe to be converted because read and write do not use ppos and there are no other funky methods in file_operations.
Regarding semantic patch I've verified each generated change manually - that it is correct to convert - and each other nonseekable_open instance left - that it is either not correct to convert there, or that it is not converted due to current stream_open.cocci limitations.
The script also does not convert files that should be valid to convert, but that currently have .llseek = noop_llseek or generic_file_llseek for unknown reason despite file being opened with nonseekable_open (e.g. drivers/input/mousedev.c)
Cc: Michael Kerrisk mtk.manpages@gmail.com Cc: Yongzhi Pan panyongzhi@gmail.com Cc: Jonathan Corbet corbet@lwn.net Cc: David Vrabel david.vrabel@citrix.com Cc: Juergen Gross jgross@suse.com Cc: Miklos Szeredi miklos@szeredi.hu Cc: Tejun Heo tj@kernel.org Cc: Kirill Tkhai ktkhai@virtuozzo.com Cc: Arnd Bergmann arnd@arndb.de Cc: Christoph Hellwig hch@lst.de Cc: Greg Kroah-Hartman gregkh@linuxfoundation.org Cc: Julia Lawall Julia.Lawall@lip6.fr Cc: Nikolaus Rath Nikolaus@rath.org Cc: Han-Wen Nienhuys hanwen@google.com Signed-off-by: Kirill Smelkov kirr@nexedi.com Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/xen/xenbus/xenbus_dev_frontend.c | 2 fs/open.c | 18 + fs/read_write.c | 5 include/linux/fs.h | 4 scripts/coccinelle/api/stream_open.cocci | 363 +++++++++++++++++++++++++++++++ 5 files changed, 389 insertions(+), 3 deletions(-)
--- a/drivers/xen/xenbus/xenbus_dev_frontend.c +++ b/drivers/xen/xenbus/xenbus_dev_frontend.c @@ -536,7 +536,7 @@ static int xenbus_file_open(struct inode if (xen_store_evtchn == 0) return -ENOENT;
- nonseekable_open(inode, filp); + stream_open(inode, filp);
u = kzalloc(sizeof(*u), GFP_KERNEL); if (u == NULL) --- a/fs/open.c +++ b/fs/open.c @@ -1192,3 +1192,21 @@ int nonseekable_open(struct inode *inode }
EXPORT_SYMBOL(nonseekable_open); + +/* + * stream_open is used by subsystems that want stream-like file descriptors. + * Such file descriptors are not seekable and don't have notion of position + * (file.f_pos is always 0). Contrary to file descriptors of other regular + * files, .read() and .write() can run simultaneously. + * + * stream_open never fails and is marked to return int so that it could be + * directly used as file_operations.open . + */ +int stream_open(struct inode *inode, struct file *filp) +{ + filp->f_mode &= ~(FMODE_LSEEK | FMODE_PREAD | FMODE_PWRITE | FMODE_ATOMIC_POS); + filp->f_mode |= FMODE_STREAM; + return 0; +} + +EXPORT_SYMBOL(stream_open); --- a/fs/read_write.c +++ b/fs/read_write.c @@ -575,12 +575,13 @@ EXPORT_SYMBOL(vfs_write);
static inline loff_t file_pos_read(struct file *file) { - return file->f_pos; + return file->f_mode & FMODE_STREAM ? 0 : file->f_pos; }
static inline void file_pos_write(struct file *file, loff_t pos) { - file->f_pos = pos; + if ((file->f_mode & FMODE_STREAM) == 0) + file->f_pos = pos; }
SYSCALL_DEFINE3(read, unsigned int, fd, char __user *, buf, size_t, count) --- a/include/linux/fs.h +++ b/include/linux/fs.h @@ -143,6 +143,9 @@ typedef int (dio_iodone_t)(struct kiocb /* Has write method(s) */ #define FMODE_CAN_WRITE ((__force fmode_t)0x40000)
+/* File is stream-like */ +#define FMODE_STREAM ((__force fmode_t)0x200000) + /* File was opened by fanotify and shouldn't generate fanotify events */ #define FMODE_NONOTIFY ((__force fmode_t)0x4000000)
@@ -2843,6 +2846,7 @@ extern loff_t no_seek_end_llseek_size(st extern loff_t no_seek_end_llseek(struct file *, loff_t, int); extern int generic_file_open(struct inode * inode, struct file * filp); extern int nonseekable_open(struct inode * inode, struct file * filp); +extern int stream_open(struct inode * inode, struct file * filp);
#ifdef CONFIG_BLOCK typedef void (dio_submit_t)(struct bio *bio, struct inode *inode, --- /dev/null +++ b/scripts/coccinelle/api/stream_open.cocci @@ -0,0 +1,363 @@ +// SPDX-License-Identifier: GPL-2.0 +// Author: Kirill Smelkov (kirr@nexedi.com) +// +// Search for stream-like files that are using nonseekable_open and convert +// them to stream_open. A stream-like file is a file that does not use ppos in +// its read and write. Rationale for the conversion is to avoid deadlock in +// between read and write. + +virtual report +virtual patch +virtual explain // explain decisions in the patch (SPFLAGS="-D explain") + +// stream-like reader & writer - ones that do not depend on f_pos. +@ stream_reader @ +identifier readstream, ppos; +identifier f, buf, len; +type loff_t; +@@ + ssize_t readstream(struct file *f, char *buf, size_t len, loff_t *ppos) + { + ... when != ppos + } + +@ stream_writer @ +identifier writestream, ppos; +identifier f, buf, len; +type loff_t; +@@ + ssize_t writestream(struct file *f, const char *buf, size_t len, loff_t *ppos) + { + ... when != ppos + } + + +// a function that blocks +@ blocks @ +identifier block_f; +identifier wait_event =~ "^wait_event_.*"; +@@ + block_f(...) { + ... when exists + wait_event(...) + ... when exists + } + +// stream_reader that can block inside. +// +// XXX wait_* can be called not directly from current function (e.g. func -> f -> g -> wait()) +// XXX currently reader_blocks supports only direct and 1-level indirect cases. +@ reader_blocks_direct @ +identifier stream_reader.readstream; +identifier wait_event =~ "^wait_event_.*"; +@@ + readstream(...) + { + ... when exists + wait_event(...) + ... when exists + } + +@ reader_blocks_1 @ +identifier stream_reader.readstream; +identifier blocks.block_f; +@@ + readstream(...) + { + ... when exists + block_f(...) + ... when exists + } + +@ reader_blocks depends on reader_blocks_direct || reader_blocks_1 @ +identifier stream_reader.readstream; +@@ + readstream(...) { + ... + } + + +// file_operations + whether they have _any_ .read, .write, .llseek ... at all. +// +// XXX add support for file_operations xxx[N] = ... (sound/core/pcm_native.c) +@ fops0 @ +identifier fops; +@@ + struct file_operations fops = { + ... + }; + +@ has_read @ +identifier fops0.fops; +identifier read_f; +@@ + struct file_operations fops = { + .read = read_f, + }; + +@ has_read_iter @ +identifier fops0.fops; +identifier read_iter_f; +@@ + struct file_operations fops = { + .read_iter = read_iter_f, + }; + +@ has_write @ +identifier fops0.fops; +identifier write_f; +@@ + struct file_operations fops = { + .write = write_f, + }; + +@ has_write_iter @ +identifier fops0.fops; +identifier write_iter_f; +@@ + struct file_operations fops = { + .write_iter = write_iter_f, + }; + +@ has_llseek @ +identifier fops0.fops; +identifier llseek_f; +@@ + struct file_operations fops = { + .llseek = llseek_f, + }; + +@ has_no_llseek @ +identifier fops0.fops; +@@ + struct file_operations fops = { + .llseek = no_llseek, + }; + +@ has_mmap @ +identifier fops0.fops; +identifier mmap_f; +@@ + struct file_operations fops = { + .mmap = mmap_f, + }; + +@ has_copy_file_range @ +identifier fops0.fops; +identifier copy_file_range_f; +@@ + struct file_operations fops = { + .copy_file_range = copy_file_range_f, + }; + +@ has_remap_file_range @ +identifier fops0.fops; +identifier remap_file_range_f; +@@ + struct file_operations fops = { + .remap_file_range = remap_file_range_f, + }; + +@ has_splice_read @ +identifier fops0.fops; +identifier splice_read_f; +@@ + struct file_operations fops = { + .splice_read = splice_read_f, + }; + +@ has_splice_write @ +identifier fops0.fops; +identifier splice_write_f; +@@ + struct file_operations fops = { + .splice_write = splice_write_f, + }; + + +// file_operations that is candidate for stream_open conversion - it does not +// use mmap and other methods that assume @offset access to file. +// +// XXX for simplicity require no .{read/write}_iter and no .splice_{read/write} for now. +// XXX maybe_steam.fops cannot be used in other rules - it gives "bad rule maybe_stream or bad variable fops". +@ maybe_stream depends on (!has_llseek || has_no_llseek) && !has_mmap && !has_copy_file_range && !has_remap_file_range && !has_read_iter && !has_write_iter && !has_splice_read && !has_splice_write @ +identifier fops0.fops; +@@ + struct file_operations fops = { + }; + + +// ---- conversions ---- + +// XXX .open = nonseekable_open -> .open = stream_open +// XXX .open = func -> openfunc -> nonseekable_open + +// read & write +// +// if both are used in the same file_operations together with an opener - +// under that conditions we can use stream_open instead of nonseekable_open. +@ fops_rw depends on maybe_stream @ +identifier fops0.fops, openfunc; +identifier stream_reader.readstream; +identifier stream_writer.writestream; +@@ + struct file_operations fops = { + .open = openfunc, + .read = readstream, + .write = writestream, + }; + +@ report_rw depends on report @ +identifier fops_rw.openfunc; +position p1; +@@ + openfunc(...) { + <... + nonseekable_open@p1 + ...> + } + +@ script:python depends on report && reader_blocks @ +fops << fops0.fops; +p << report_rw.p1; +@@ +coccilib.report.print_report(p[0], + "ERROR: %s: .read() can deadlock .write(); change nonseekable_open -> stream_open to fix." % (fops,)) + +@ script:python depends on report && !reader_blocks @ +fops << fops0.fops; +p << report_rw.p1; +@@ +coccilib.report.print_report(p[0], + "WARNING: %s: .read() and .write() have stream semantic; safe to change nonseekable_open -> stream_open." % (fops,)) + + +@ explain_rw_deadlocked depends on explain && reader_blocks @ +identifier fops_rw.openfunc; +@@ + openfunc(...) { + <... +- nonseekable_open ++ nonseekable_open /* read & write (was deadlock) */ + ...> + } + + +@ explain_rw_nodeadlock depends on explain && !reader_blocks @ +identifier fops_rw.openfunc; +@@ + openfunc(...) { + <... +- nonseekable_open ++ nonseekable_open /* read & write (no direct deadlock) */ + ...> + } + +@ patch_rw depends on patch @ +identifier fops_rw.openfunc; +@@ + openfunc(...) { + <... +- nonseekable_open ++ stream_open + ...> + } + + +// read, but not write +@ fops_r depends on maybe_stream && !has_write @ +identifier fops0.fops, openfunc; +identifier stream_reader.readstream; +@@ + struct file_operations fops = { + .open = openfunc, + .read = readstream, + }; + +@ report_r depends on report @ +identifier fops_r.openfunc; +position p1; +@@ + openfunc(...) { + <... + nonseekable_open@p1 + ...> + } + +@ script:python depends on report @ +fops << fops0.fops; +p << report_r.p1; +@@ +coccilib.report.print_report(p[0], + "WARNING: %s: .read() has stream semantic; safe to change nonseekable_open -> stream_open." % (fops,)) + +@ explain_r depends on explain @ +identifier fops_r.openfunc; +@@ + openfunc(...) { + <... +- nonseekable_open ++ nonseekable_open /* read only */ + ...> + } + +@ patch_r depends on patch @ +identifier fops_r.openfunc; +@@ + openfunc(...) { + <... +- nonseekable_open ++ stream_open + ...> + } + + +// write, but not read +@ fops_w depends on maybe_stream && !has_read @ +identifier fops0.fops, openfunc; +identifier stream_writer.writestream; +@@ + struct file_operations fops = { + .open = openfunc, + .write = writestream, + }; + +@ report_w depends on report @ +identifier fops_w.openfunc; +position p1; +@@ + openfunc(...) { + <... + nonseekable_open@p1 + ...> + } + +@ script:python depends on report @ +fops << fops0.fops; +p << report_w.p1; +@@ +coccilib.report.print_report(p[0], + "WARNING: %s: .write() has stream semantic; safe to change nonseekable_open -> stream_open." % (fops,)) + +@ explain_w depends on explain @ +identifier fops_w.openfunc; +@@ + openfunc(...) { + <... +- nonseekable_open ++ nonseekable_open /* write only */ + ...> + } + +@ patch_w depends on patch @ +identifier fops_w.openfunc; +@@ + openfunc(...) { + <... +- nonseekable_open ++ stream_open + ...> + } + + +// no read, no write - don't change anything
From: Kirill Smelkov kirr@nexedi.com
commit bbd84f33652f852ce5992d65db4d020aba21f882 upstream.
Starting from commit 9c225f2655e3 ("vfs: atomic f_pos accesses as per POSIX") files opened even via nonseekable_open gate read and write via lock and do not allow them to be run simultaneously. This can create read vs write deadlock if a filesystem is trying to implement a socket-like file which is intended to be simultaneously used for both read and write from filesystem client. See commit 10dce8af3422 ("fs: stream_open - opener for stream-like files so that read and write can run simultaneously without deadlock") for details and e.g. commit 581d21a2d02a ("xenbus: fix deadlock on writes to /proc/xen/xenbus") for a similar deadlock example on /proc/xen/xenbus.
To avoid such deadlock it was tempting to adjust fuse_finish_open to use stream_open instead of nonseekable_open on just FOPEN_NONSEEKABLE flags, but grepping through Debian codesearch shows users of FOPEN_NONSEEKABLE, and in particular GVFS which actually uses offset in its read and write handlers
https://codesearch.debian.net/search?q=-%3Enonseekable+%3D https://gitlab.gnome.org/GNOME/gvfs/blob/1.40.0-6-gcbc54396/client/gvfsfused... https://gitlab.gnome.org/GNOME/gvfs/blob/1.40.0-6-gcbc54396/client/gvfsfused... https://gitlab.gnome.org/GNOME/gvfs/blob/1.40.0-6-gcbc54396/client/gvfsfused...
so if we would do such a change it will break a real user.
Add another flag (FOPEN_STREAM) for filesystem servers to indicate that the opened handler is having stream-like semantics; does not use file position and thus the kernel is free to issue simultaneous read and write request on opened file handle.
This patch together with stream_open() should be added to stable kernels starting from v3.14+. This will allow to patch OSSPD and other FUSE filesystems that provide stream-like files to return FOPEN_STREAM | FOPEN_NONSEEKABLE in open handler and this way avoid the deadlock on all kernel versions. This should work because fuse_finish_open ignores unknown open flags returned from a filesystem and so passing FOPEN_STREAM to a kernel that is not aware of this flag cannot hurt. In turn the kernel that is not aware of FOPEN_STREAM will be < v3.14 where just FOPEN_NONSEEKABLE is sufficient to implement streams without read vs write deadlock.
Cc: stable@vger.kernel.org # v3.14+ Signed-off-by: Kirill Smelkov kirr@nexedi.com Signed-off-by: Miklos Szeredi mszeredi@redhat.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- fs/fuse/file.c | 4 +++- include/uapi/linux/fuse.h | 2 ++ 2 files changed, 5 insertions(+), 1 deletion(-)
--- a/fs/fuse/file.c +++ b/fs/fuse/file.c @@ -178,7 +178,9 @@ void fuse_finish_open(struct inode *inod file->f_op = &fuse_direct_io_file_operations; if (!(ff->open_flags & FOPEN_KEEP_CACHE)) invalidate_inode_pages2(inode->i_mapping); - if (ff->open_flags & FOPEN_NONSEEKABLE) + if (ff->open_flags & FOPEN_STREAM) + stream_open(inode, file); + else if (ff->open_flags & FOPEN_NONSEEKABLE) nonseekable_open(inode, file); if (fc->atomic_o_trunc && (file->f_flags & O_TRUNC)) { struct fuse_inode *fi = get_fuse_inode(inode); --- a/include/uapi/linux/fuse.h +++ b/include/uapi/linux/fuse.h @@ -215,10 +215,12 @@ struct fuse_file_lock { * FOPEN_DIRECT_IO: bypass page cache for this open file * FOPEN_KEEP_CACHE: don't invalidate the data cache on open * FOPEN_NONSEEKABLE: the file is not seekable + * FOPEN_STREAM: the file is stream-like (no file position at all) */ #define FOPEN_DIRECT_IO (1 << 0) #define FOPEN_KEEP_CACHE (1 << 1) #define FOPEN_NONSEEKABLE (1 << 2) +#define FOPEN_STREAM (1 << 4)
/** * INIT request/reply flags
stable-rc/linux-4.9.y boot: 107 boots: 1 failed, 106 passed (v4.9.180-84-g4fcf72df7bc7)
Full Boot Summary: https://kernelci.org/boot/all/job/stable-rc/branch/linux-4.9.y/kernel/v4.9.1... Full Build Summary: https://kernelci.org/build/stable-rc/branch/linux-4.9.y/kernel/v4.9.180-84-g...
Tree: stable-rc Branch: linux-4.9.y Git Describe: v4.9.180-84-g4fcf72df7bc7 Git Commit: 4fcf72df7bc71264d86e616874a0a0cd382f1b12 Git URL: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git Tested: 53 unique boards, 23 SoC families, 15 builds out of 197
Boot Regressions Detected:
arm:
omap2plus_defconfig: gcc-8: omap3-beagle-xm: lab-baylibre: new failure (last pass: v4.9.180-62-gd9b5fd7ab17b)
Boot Failure Detected:
arm: omap2plus_defconfig: gcc-8: omap3-beagle-xm: 1 failed lab
--- For more info write to info@kernelci.org
On Sun, 9 Jun 2019 at 22:22, Greg Kroah-Hartman gregkh@linuxfoundation.org wrote:
This is the start of the stable review cycle for the 4.9.181 release. There are 83 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Tue 11 Jun 2019 04:39:58 PM UTC. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.9.181-rc1... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-4.9.y and the diffstat can be found below.
thanks,
greg k-h
Results from Linaro’s test farm. No regressions on arm64, arm, x86_64, and i386.
NOTE: selftest sources version updated to 5.1 LTP version upgrade to 20190517
Summary ------------------------------------------------------------------------
kernel: 4.9.181-rc1 git repo: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git git branch: linux-4.9.y git commit: 4fcf72df7bc71264d86e616874a0a0cd382f1b12 git describe: v4.9.180-84-g4fcf72df7bc7 Test details: https://qa-reports.linaro.org/lkft/linux-stable-rc-4.9-oe/build/v4.9.180-84-...
No regressions (compared to build v4.9.180)
No fixes (compared to build v4.9.180)
Ran 23615 total tests in the following environments and test suites.
Environments -------------- - dragonboard-410c - arm64 - hi6220-hikey - arm64 - i386 - juno-r2 - arm64 - qemu_arm - qemu_arm64 - qemu_i386 - qemu_x86_64 - x15 - arm - x86_64
Test Suites ----------- * build * install-android-platform-tools-r2600 * kselftest * libhugetlbfs * ltp-cap_bounds-tests * ltp-commands-tests * ltp-containers-tests * ltp-cpuhotplug-tests * ltp-cve-tests * ltp-dio-tests * ltp-fcntl-locktests-tests * ltp-filecaps-tests * ltp-fs-tests * ltp-fs_bind-tests * ltp-fs_perms_simple-tests * ltp-fsx-tests * ltp-hugetlb-tests * ltp-io-tests * ltp-ipc-tests * ltp-math-tests * ltp-mm-tests * ltp-nptl-tests * ltp-pty-tests * ltp-sched-tests * ltp-securebits-tests * ltp-syscalls-tests * ltp-timers-tests * perf * spectre-meltdown-checker-test * v4l2-compliance * network-basic-tests * ltp-open-posix-tests * prep-tmp-disk * kvm-unit-tests * kselftest-vsyscall-mode-native * kselftest-vsyscall-mode-none
On 09/06/2019 17:41, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 4.9.181 release. There are 83 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Tue 11 Jun 2019 04:39:58 PM UTC. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.9.181-rc1... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-4.9.y and the diffstat can be found below.
thanks,
greg k-h
All tests are passing for Tegra ...
Test results for stable-v4.9: 8 builds: 8 pass, 0 fail 16 boots: 16 pass, 0 fail 24 tests: 24 pass, 0 fail
Linux version: 4.9.181-rc1-g4fcf72d Boards tested: tegra124-jetson-tk1, tegra20-ventana, tegra210-p2371-2180, tegra30-cardhu-a04
Cheers Jon
On Sun, Jun 09, 2019 at 06:41:30PM +0200, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 4.9.181 release. There are 83 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Tue 11 Jun 2019 04:39:58 PM UTC. Anything received after that time might be too late.
Build results: total: 172 pass: 172 fail: 0 Qemu test results: total: 322 pass: 322 fail: 0
Guenter
On 6/9/19 10:41 AM, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 4.9.181 release. There are 83 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Tue 11 Jun 2019 04:39:58 PM UTC. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.9.181-rc1... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-4.9.y and the diffstat can be found below.
thanks,
greg k-h
Compiled and booted on my test system. No dmesg regressions.
thanks, -- Shuah
linux-stable-mirror@lists.linaro.org