This is the start of the stable review cycle for the 4.4.224 release. There are 86 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Wed, 20 May 2020 17:32:42 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.4.224-rc1... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-4.4.y and the diffstat can be found below.
thanks,
greg k-h
------------- Pseudo-Shortlog of commits:
Greg Kroah-Hartman gregkh@linuxfoundation.org Linux 4.4.224-rc1
Sergei Trofimovich slyfox@gentoo.org Makefile: disallow data races on gcc-10 as well
Jim Mattson jmattson@google.com KVM: x86: Fix off-by-one error in kvm_vcpu_ioctl_x86_setup_mce
Geert Uytterhoeven geert+renesas@glider.be ARM: dts: r8a7740: Add missing extal2 to CPG node
Kai-Heng Feng kai.heng.feng@canonical.com Revert "ALSA: hda/realtek: Fix pop noise on ALC225"
Wei Yongjun weiyongjun1@huawei.com usb: gadget: legacy: fix error return code in cdc_bind()
Wei Yongjun weiyongjun1@huawei.com usb: gadget: legacy: fix error return code in gncm_bind()
Christophe JAILLET christophe.jaillet@wanadoo.fr usb: gadget: audio: Fix a missing error return value in audio_bind()
Christophe JAILLET christophe.jaillet@wanadoo.fr usb: gadget: net2272: Fix a memory leak in an error handling path in 'net2272_plat_probe()'
Eric W. Biederman ebiederm@xmission.com exec: Move would_dump into flush_old_exec
Borislav Petkov bp@suse.de x86: Fix early boot crash on gcc-10, third try
Fabio Estevam festevam@gmail.com ARM: dts: imx27-phytec-phycard-s-rdk: Fix the I2C1 pinctrl entries
Kyungtae Kim kt0755@gmail.com USB: gadget: fix illegal array access in binding with UDC
Takashi Iwai tiwai@suse.de ALSA: rawmidi: Initialize allocated buffers
Takashi Iwai tiwai@suse.de ALSA: rawmidi: Fix racy buffer resize under concurrent accesses
Takashi Iwai tiwai@suse.de ALSA: hda/realtek - Limit int mic boost for Thinkpad T530
Paolo Abeni pabeni@redhat.com netlabel: cope with NULL catmap
Paolo Abeni pabeni@redhat.com net: ipv4: really enforce backoff for redirects
Cong Wang xiyou.wangcong@gmail.com net: fix a potential recursive NETDEV_FEAT_CHANGE
Linus Torvalds torvalds@linux-foundation.org gcc-10: avoid shadowing standard library 'free()' in crypto
Boris Ostrovsky boris.ostrovsky@oracle.com x86/paravirt: Remove the unused irq_enable_sysexit pv op
Keith Busch keith.busch@intel.com blk-mq: Allow blocking queue tag iter callbacks
Jianchao Wang jianchao.w.wang@oracle.com blk-mq: sync the update nr_hw_queues with blk_mq_queue_tag_busy_iter
Gabriel Krisman Bertazi krisman@linux.vnet.ibm.com blk-mq: Allow timeouts to run while queue is freezing
Christoph Hellwig hch@lst.de block: defer timeouts to a workqueue
Linus Torvalds torvalds@linux-foundation.org gcc-10: disable 'restrict' warning for now
Linus Torvalds torvalds@linux-foundation.org gcc-10: disable 'stringop-overflow' warning for now
Linus Torvalds torvalds@linux-foundation.org gcc-10: disable 'array-bounds' warning for now
Linus Torvalds torvalds@linux-foundation.org gcc-10: disable 'zero-length-bounds' warning for now
Linus Torvalds torvalds@linux-foundation.org Stop the ad-hoc games with -Wno-maybe-initialized
Masahiro Yamada yamada.masahiro@socionext.com kbuild: compute false-positive -Wmaybe-uninitialized cases in Kconfig
Linus Torvalds torvalds@linux-foundation.org gcc-10 warnings: fix low-hanging fruit
Jason Gunthorpe jgg@mellanox.com pnp: Use list_for_each_entry() instead of open coding
Jack Morgenstein jackm@dev.mellanox.co.il IB/mlx4: Test return value of calls to ib_get_cached_pkey
Arnd Bergmann arnd@arndb.de netfilter: conntrack: avoid gcc-10 zero-length-bounds warning
Gal Pressman galp@mellanox.com net/mlx5: Fix driver load error flow when firmware is stuck
Anjali Singhai Jain anjali.singhai@intel.com i40e: avoid NVM acquire deadlock during NVM update
Ben Hutchings ben.hutchings@codethink.co.uk scsi: qla2xxx: Avoid double completion of abort command
zhong jiang zhongjiang@huawei.com mm/memory_hotplug.c: fix overflow in test_pages_in_a_zone()
Jiri Benc jbenc@redhat.com gre: do not keep the GRE header around in collect medata mode
John Hurley john.hurley@netronome.com net: openvswitch: fix csum updates for MPLS actions
Vasily Averin vvs@virtuozzo.com ipc/util.c: sysvipc_find_ipc() incorrectly updates position index
Vasily Averin vvs@virtuozzo.com drm/qxl: lost qxl_bo_kunmap_atomic_page in qxl_image_init_helper()
Lubomir Rintel lkundrak@v3.sk dmaengine: mmp_tdma: Reset channel error on release
Madhuparna Bhowmik madhuparnabhowmik10@gmail.com dmaengine: pch_dma.c: Avoid data race between probe and irq handler
Ronnie Sahlberg lsahlber@redhat.com cifs: Fix a race condition with cifs_echo_request
Samuel Cabrero scabrero@suse.de cifs: Check for timeout on Negotiate stage
wuxu.wu wuxu.wu@huawei.com spi: spi-dw: Add lock protect dw_spi rx/tx to prevent concurrent calls
Wu Bo wubo40@huawei.com scsi: sg: add sg_remove_request in sg_write
Arnd Bergmann arnd@arndb.de drop_monitor: work around gcc-10 stringop-overflow warning
Christophe JAILLET christophe.jaillet@wanadoo.fr net: moxa: Fix a potential double 'free_irq()'
Christophe JAILLET christophe.jaillet@wanadoo.fr net/sonic: Fix a resource leak in an error handling path in 'jazz_sonic_probe()'
David Ahern dsa@cumulusnetworks.com net: handle no dst on skb in icmp6_send
Vladis Dronov vdronov@redhat.com ptp: free ptp device pin descriptors properly
Vladis Dronov vdronov@redhat.com ptp: fix the race between the release of ptp_clock and cdev
YueHaibing yuehaibing@huawei.com ptp: Fix pass zero to ERR_PTR() in ptp_clock_register
Logan Gunthorpe logang@deltatee.com chardev: add helper function to register char devs with a struct device
Dmitry Torokhov dmitry.torokhov@gmail.com ptp: create "pins" together with the rest of attributes
Dmitry Torokhov dmitry.torokhov@gmail.com ptp: use is_visible method to hide unused attributes
Dmitry Torokhov dmitry.torokhov@gmail.com ptp: do not explicitly set drvdata in ptp_clock_register()
Cengiz Can cengiz@kernel.wtf blktrace: fix dereference after null check
Jan Kara jack@suse.cz blktrace: Protect q->blk_trace with RCU
Jens Axboe axboe@kernel.dk blktrace: fix trace mutex deadlock
Jens Axboe axboe@kernel.dk blktrace: fix unlocked access to init/start-stop/teardown
Waiman Long longman@redhat.com blktrace: Fix potential deadlock between delete & sysfs ops
Sabrina Dubroca sd@queasysnail.net net: ipv6_stub: use ip6_dst_lookup_flow instead of ip6_dst_lookup
Sabrina Dubroca sd@queasysnail.net net: ipv6: add net argument to ip6_dst_lookup_flow
Shijie Luo luoshijie1@huawei.com ext4: add cond_resched() to ext4_protect_reserved_inode
Kees Cook keescook@chromium.org binfmt_elf: Do not move brk for INTERP-less ET_EXEC
Alexandre Belloni alexandre.belloni@free-electrons.com phy: micrel: Ensure interrupts are reenabled on resume
Alexandre Belloni alexandre.belloni@free-electrons.com phy: micrel: Disable auto negotiation on startup
Ivan Delalande colona@arista.com scripts/decodecode: fix trapping instruction formatting
George Spelvin lkml@sdf.org batman-adv: fix batadv_nc_random_weight_tq
Oliver Neukum oneukum@suse.com USB: serial: garmin_gps: add sanity checking for data length
Oliver Neukum oneukum@suse.com USB: uas: add quirk for LaCie 2Big Quadra
Alex Estrin alex.estrin@intel.com Revert "IB/ipoib: Update broadcast object if PKey value was changed in index 0"
Ville Syrjälä ville.syrjala@linux.intel.com x86/apm: Don't access __preempt_count with zeroed fs
Kees Cook keescook@chromium.org binfmt_elf: move brk out of mmap when doing direct loader exec
Sabrina Dubroca sd@queasysnail.net ipv6: fix cleanup ordering for ip6_mr failure
Govindarajulu Varadarajan gvaradar@cisco.com enic: do not overwrite error code
Hans de Goede hdegoede@redhat.com Revert "ACPI / video: Add force_native quirk for HP Pavilion dv6"
Eric Dumazet edumazet@google.com sch_choke: avoid potential panic in choke_reset()
Eric Dumazet edumazet@google.com sch_sfq: validate silly quantum values
Tariq Toukan tariqt@mellanox.com net/mlx4_core: Fix use of ENOSPC around mlx4_counter_alloc()
Julia Lawall Julia.Lawall@inria.fr dp83640: reverse arguments to list_add_tail
Greg Kroah-Hartman gregkh@linuxfoundation.org Revert "net: phy: Avoid polling PHY with PHY_IGNORE_INTERRUPTS"
Matt Jolly Kangie@footclan.ninja USB: serial: qcserial: Add DW5816e support
-------------
Diffstat:
Makefile | 17 +- arch/arm/boot/dts/imx27-phytec-phycard-s-rdk.dts | 4 +- arch/arm/boot/dts/r8a7740.dtsi | 2 +- arch/x86/entry/entry_32.S | 8 +- arch/x86/include/asm/apm.h | 6 - arch/x86/include/asm/paravirt.h | 7 - arch/x86/include/asm/paravirt_types.h | 9 -- arch/x86/include/asm/stackprotector.h | 7 +- arch/x86/kernel/apm_32.c | 5 + arch/x86/kernel/asm-offsets.c | 3 - arch/x86/kernel/paravirt.c | 7 - arch/x86/kernel/paravirt_patch_32.c | 2 - arch/x86/kernel/paravirt_patch_64.c | 1 - arch/x86/kernel/smpboot.c | 8 + arch/x86/kvm/x86.c | 2 +- arch/x86/xen/enlighten.c | 3 - arch/x86/xen/smp.c | 1 + arch/x86/xen/xen-asm_32.S | 14 -- arch/x86/xen/xen-ops.h | 3 - block/blk-core.c | 3 + block/blk-mq-tag.c | 7 +- block/blk-mq.c | 17 ++ block/blk-timeout.c | 3 + crypto/lrw.c | 4 +- crypto/xts.c | 4 +- drivers/acpi/video_detect.c | 11 -- drivers/dma/mmp_tdma.c | 2 + drivers/dma/pch_dma.c | 2 +- drivers/gpu/drm/qxl/qxl_image.c | 3 +- drivers/infiniband/core/addr.c | 6 +- drivers/infiniband/hw/mlx4/qp.c | 14 +- drivers/infiniband/ulp/ipoib/ipoib_ib.c | 13 -- drivers/net/ethernet/cisco/enic/enic_main.c | 9 +- drivers/net/ethernet/intel/i40e/i40e_nvm.c | 98 +++++++----- drivers/net/ethernet/intel/i40e/i40e_prototype.h | 2 - drivers/net/ethernet/mellanox/mlx4/main.c | 4 +- drivers/net/ethernet/mellanox/mlx5/core/main.c | 2 +- drivers/net/ethernet/moxa/moxart_ether.c | 2 +- drivers/net/ethernet/natsemi/jazzsonic.c | 6 +- drivers/net/geneve.c | 4 +- drivers/net/phy/dp83640.c | 2 +- drivers/net/phy/micrel.c | 28 +++- drivers/net/phy/phy.c | 15 +- drivers/net/vxlan.c | 10 +- drivers/ptp/ptp_clock.c | 42 ++--- drivers/ptp/ptp_private.h | 9 +- drivers/ptp/ptp_sysfs.c | 162 ++++++++----------- drivers/scsi/qla2xxx/qla_init.c | 4 +- drivers/scsi/sg.c | 4 +- drivers/spi/spi-dw.c | 15 +- drivers/spi/spi-dw.h | 1 + drivers/usb/gadget/configfs.c | 3 + drivers/usb/gadget/legacy/audio.c | 4 +- drivers/usb/gadget/legacy/cdc2.c | 4 +- drivers/usb/gadget/legacy/ncm.c | 4 +- drivers/usb/gadget/udc/net2272.c | 2 + drivers/usb/serial/garmin_gps.c | 4 +- drivers/usb/serial/qcserial.c | 1 + drivers/usb/storage/unusual_uas.h | 7 + fs/binfmt_elf.c | 12 ++ fs/char_dev.c | 86 ++++++++++ fs/cifs/cifssmb.c | 12 ++ fs/cifs/connect.c | 11 +- fs/cifs/smb2pdu.c | 12 ++ fs/exec.c | 4 +- fs/ext4/block_validity.c | 1 + include/linux/blkdev.h | 3 +- include/linux/blktrace_api.h | 6 +- include/linux/cdev.h | 5 + include/linux/compiler.h | 7 + include/linux/fs.h | 2 +- include/linux/pnp.h | 29 ++-- include/linux/posix-clock.h | 19 ++- include/linux/tty.h | 2 +- include/net/addrconf.h | 6 +- include/net/ipv6.h | 2 +- include/net/netfilter/nf_conntrack.h | 2 +- include/sound/rawmidi.h | 1 + init/main.c | 2 + ipc/util.c | 12 +- kernel/time/posix-clock.c | 31 ++-- kernel/trace/blktrace.c | 191 +++++++++++++++++------ mm/memory_hotplug.c | 4 +- net/batman-adv/network-coding.c | 9 +- net/core/dev.c | 4 +- net/core/drop_monitor.c | 11 +- net/dccp/ipv6.c | 6 +- net/ipv4/cipso_ipv4.c | 6 +- net/ipv4/ip_gre.c | 7 +- net/ipv4/route.c | 2 +- net/ipv6/addrconf_core.c | 11 +- net/ipv6/af_inet6.c | 10 +- net/ipv6/datagram.c | 2 +- net/ipv6/icmp.c | 6 +- net/ipv6/inet6_connection_sock.c | 4 +- net/ipv6/ip6_output.c | 8 +- net/ipv6/raw.c | 2 +- net/ipv6/syncookies.c | 2 +- net/ipv6/tcp_ipv6.c | 4 +- net/l2tp/l2tp_ip6.c | 2 +- net/mpls/af_mpls.c | 7 +- net/netfilter/nf_conntrack_core.c | 4 +- net/netlabel/netlabel_kapi.c | 6 + net/openvswitch/actions.c | 6 +- net/sched/sch_choke.c | 3 +- net/sched/sch_sfq.c | 9 ++ net/sctp/ipv6.c | 4 +- net/tipc/udp_media.c | 9 +- scripts/decodecode | 2 +- sound/core/rawmidi.c | 35 ++++- sound/pci/hda/patch_realtek.c | 12 +- 111 files changed, 805 insertions(+), 496 deletions(-)
From: Matt Jolly Kangie@footclan.ninja
commit 78d6de3cfbd342918d31cf68d0d2eda401338aef upstream.
Add support for Dell Wireless 5816e to drivers/usb/serial/qcserial.c
Signed-off-by: Matt Jolly Kangie@footclan.ninja Cc: stable stable@vger.kernel.org Signed-off-by: Johan Hovold johan@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/usb/serial/qcserial.c | 1 + 1 file changed, 1 insertion(+)
--- a/drivers/usb/serial/qcserial.c +++ b/drivers/usb/serial/qcserial.c @@ -177,6 +177,7 @@ static const struct usb_device_id id_tab {DEVICE_SWI(0x413c, 0x81b3)}, /* Dell Wireless 5809e Gobi(TM) 4G LTE Mobile Broadband Card (rev3) */ {DEVICE_SWI(0x413c, 0x81b5)}, /* Dell Wireless 5811e QDL */ {DEVICE_SWI(0x413c, 0x81b6)}, /* Dell Wireless 5811e QDL */ + {DEVICE_SWI(0x413c, 0x81cc)}, /* Dell Wireless 5816e */ {DEVICE_SWI(0x413c, 0x81cf)}, /* Dell Wireless 5819 */ {DEVICE_SWI(0x413c, 0x81d0)}, /* Dell Wireless 5819 */ {DEVICE_SWI(0x413c, 0x81d1)}, /* Dell Wireless 5818 */
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
This reverts commit 0d1951fa23ba0d35a4c5498ff28d1c5206d6fcdd which was commit d5c3d84657db57bd23ecd58b97f1c99dd42a7b80 upstream.
Guillaume reports that this patch breaks booting on at91-sama5d4_xplained, so revert it for now.
Reported-by: "kernelci.org bot" bot@kernelci.org Reported-by: Guillaume Tucker guillaume.tucker@collabora.com Cc: Florian Fainelli f.fainelli@gmail.com Cc: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/phy/phy.c | 15 +++++---------- 1 file changed, 5 insertions(+), 10 deletions(-)
--- a/drivers/net/phy/phy.c +++ b/drivers/net/phy/phy.c @@ -916,10 +916,10 @@ void phy_state_machine(struct work_struc phydev->adjust_link(phydev->attached_dev); break; case PHY_RUNNING: - /* Only register a CHANGE if we are polling and link changed - * since latest checking. + /* Only register a CHANGE if we are polling or ignoring + * interrupts and link changed since latest checking. */ - if (phydev->irq == PHY_POLL) { + if (!phy_interrupt_is_valid(phydev)) { old_link = phydev->link; err = phy_read_status(phydev); if (err) @@ -1019,13 +1019,8 @@ void phy_state_machine(struct work_struc dev_dbg(&phydev->dev, "PHY state change %s -> %s\n", phy_state_to_str(old_state), phy_state_to_str(phydev->state));
- /* Only re-schedule a PHY state machine change if we are polling the - * PHY, if PHY_IGNORE_INTERRUPT is set, then we will be moving - * between states from phy_mac_interrupt() - */ - if (phydev->irq == PHY_POLL) - queue_delayed_work(system_power_efficient_wq, &phydev->state_queue, - PHY_STATE_TIME * HZ); + queue_delayed_work(system_power_efficient_wq, &phydev->state_queue, + PHY_STATE_TIME * HZ); }
void phy_mac_interrupt(struct phy_device *phydev, int new_link)
From: Julia Lawall Julia.Lawall@inria.fr
[ Upstream commit 865308373ed49c9fb05720d14cbf1315349b32a9 ]
In this code, it appears that phyter_clocks is a list head, based on the previous list_for_each, and that clock->list is intended to be a list element, given that it has just been initialized in dp83640_clock_init. Accordingly, switch the arguments to list_add_tail, which takes the list head as the second argument.
Fixes: cb646e2b02b27 ("ptp: Added a clock driver for the National Semiconductor PHYTER.") Signed-off-by: Julia Lawall Julia.Lawall@inria.fr Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/phy/dp83640.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/net/phy/dp83640.c +++ b/drivers/net/phy/dp83640.c @@ -1107,7 +1107,7 @@ static struct dp83640_clock *dp83640_clo goto out; } dp83640_clock_init(clock, bus); - list_add_tail(&phyter_clocks, &clock->list); + list_add_tail(&clock->list, &phyter_clocks); out: mutex_unlock(&phyter_clocks_lock);
From: Tariq Toukan tariqt@mellanox.com
[ Upstream commit 40e473071dbad04316ddc3613c3a3d1c75458299 ]
When ENOSPC is set the idx is still valid and gets set to the global MLX4_SINK_COUNTER_INDEX. However gcc's static analysis cannot tell that ENOSPC is impossible from mlx4_cmd_imm() and gives this warning:
drivers/net/ethernet/mellanox/mlx4/main.c:2552:28: warning: 'idx' may be used uninitialized in this function [-Wmaybe-uninitialized] 2552 | priv->def_counter[port] = idx;
Also, when ENOSPC is returned mlx4_allocate_default_counters should not fail.
Fixes: 6de5f7f6a1fa ("net/mlx4_core: Allocate default counter per port") Signed-off-by: Jason Gunthorpe jgg@mellanox.com Signed-off-by: Tariq Toukan tariqt@mellanox.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/ethernet/mellanox/mlx4/main.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-)
--- a/drivers/net/ethernet/mellanox/mlx4/main.c +++ b/drivers/net/ethernet/mellanox/mlx4/main.c @@ -2295,6 +2295,7 @@ static int mlx4_allocate_default_counter
if (!err || err == -ENOSPC) { priv->def_counter[port] = idx; + err = 0; } else if (err == -ENOENT) { err = 0; continue; @@ -2344,7 +2345,8 @@ int mlx4_counter_alloc(struct mlx4_dev * MLX4_CMD_TIME_CLASS_A, MLX4_CMD_WRAPPED); if (!err) *idx = get_param_l(&out_param); - + if (WARN_ON(err == -ENOSPC)) + err = -EINVAL; return err; } return __mlx4_counter_alloc(dev, idx);
From: Eric Dumazet edumazet@google.com
[ Upstream commit df4953e4e997e273501339f607b77953772e3559 ]
syzbot managed to set up sfq so that q->scaled_quantum was zero, triggering an infinite loop in sfq_dequeue()
More generally, we must only accept quantum between 1 and 2^18 - 7, meaning scaled_quantum must be in [1, 0x7FFF] range.
Otherwise, we also could have a loop in sfq_dequeue() if scaled_quantum happens to be 0x8000, since slot->allot could indefinitely switch between 0 and 0x8000.
Fixes: eeaeb068f139 ("sch_sfq: allow big packets and be fair") Signed-off-by: Eric Dumazet edumazet@google.com Reported-by: syzbot+0251e883fe39e7a0cb0a@syzkaller.appspotmail.com Cc: Jason A. Donenfeld Jason@zx2c4.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/sched/sch_sfq.c | 9 +++++++++ 1 file changed, 9 insertions(+)
--- a/net/sched/sch_sfq.c +++ b/net/sched/sch_sfq.c @@ -635,6 +635,15 @@ static int sfq_change(struct Qdisc *sch, if (ctl->divisor && (!is_power_of_2(ctl->divisor) || ctl->divisor > 65536)) return -EINVAL; + + /* slot->allot is a short, make sure quantum is not too big. */ + if (ctl->quantum) { + unsigned int scaled = SFQ_ALLOT_SIZE(ctl->quantum); + + if (scaled <= 0 || scaled > SHRT_MAX) + return -EINVAL; + } + if (ctl_v1 && !red_check_params(ctl_v1->qth_min, ctl_v1->qth_max, ctl_v1->Wlog)) return -EINVAL;
From: Eric Dumazet edumazet@google.com
[ Upstream commit 8738c85c72b3108c9b9a369a39868ba5f8e10ae0 ]
If choke_init() could not allocate q->tab, we would crash later in choke_reset().
BUG: KASAN: null-ptr-deref in memset include/linux/string.h:366 [inline] BUG: KASAN: null-ptr-deref in choke_reset+0x208/0x340 net/sched/sch_choke.c:326 Write of size 8 at addr 0000000000000000 by task syz-executor822/7022
CPU: 1 PID: 7022 Comm: syz-executor822 Not tainted 5.7.0-rc1-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Call Trace: __dump_stack lib/dump_stack.c:77 [inline] dump_stack+0x188/0x20d lib/dump_stack.c:118 __kasan_report.cold+0x5/0x4d mm/kasan/report.c:515 kasan_report+0x33/0x50 mm/kasan/common.c:625 check_memory_region_inline mm/kasan/generic.c:187 [inline] check_memory_region+0x141/0x190 mm/kasan/generic.c:193 memset+0x20/0x40 mm/kasan/common.c:85 memset include/linux/string.h:366 [inline] choke_reset+0x208/0x340 net/sched/sch_choke.c:326 qdisc_reset+0x6b/0x520 net/sched/sch_generic.c:910 dev_deactivate_queue.constprop.0+0x13c/0x240 net/sched/sch_generic.c:1138 netdev_for_each_tx_queue include/linux/netdevice.h:2197 [inline] dev_deactivate_many+0xe2/0xba0 net/sched/sch_generic.c:1195 dev_deactivate+0xf8/0x1c0 net/sched/sch_generic.c:1233 qdisc_graft+0xd25/0x1120 net/sched/sch_api.c:1051 tc_modify_qdisc+0xbab/0x1a00 net/sched/sch_api.c:1670 rtnetlink_rcv_msg+0x44e/0xad0 net/core/rtnetlink.c:5454 netlink_rcv_skb+0x15a/0x410 net/netlink/af_netlink.c:2469 netlink_unicast_kernel net/netlink/af_netlink.c:1303 [inline] netlink_unicast+0x537/0x740 net/netlink/af_netlink.c:1329 netlink_sendmsg+0x882/0xe10 net/netlink/af_netlink.c:1918 sock_sendmsg_nosec net/socket.c:652 [inline] sock_sendmsg+0xcf/0x120 net/socket.c:672 ____sys_sendmsg+0x6bf/0x7e0 net/socket.c:2362 ___sys_sendmsg+0x100/0x170 net/socket.c:2416 __sys_sendmsg+0xec/0x1b0 net/socket.c:2449 do_syscall_64+0xf6/0x7d0 arch/x86/entry/common.c:295
Fixes: 77e62da6e60c ("sch_choke: drop all packets in queue during reset") Signed-off-by: Eric Dumazet edumazet@google.com Reported-by: syzbot syzkaller@googlegroups.com Cc: Cong Wang xiyou.wangcong@gmail.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/sched/sch_choke.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
--- a/net/sched/sch_choke.c +++ b/net/sched/sch_choke.c @@ -396,7 +396,8 @@ static void choke_reset(struct Qdisc *sc qdisc_drop(skb, sch); }
- memset(q->tab, 0, (q->tab_mask + 1) * sizeof(struct sk_buff *)); + if (q->tab) + memset(q->tab, 0, (q->tab_mask + 1) * sizeof(struct sk_buff *)); q->head = q->tail = 0; red_restart(&q->vars); }
From: Hans de Goede hdegoede@redhat.com
commit fd25ea29093e275195d0ae8b2573021a1c98959f upstream.
Revert commit 6276e53fa8c0 (ACPI / video: Add force_native quirk for HP Pavilion dv6).
In the commit message for the quirk this revert removes I wrote:
"Note that there are quite a few HP Pavilion dv6 variants, some woth ATI and some with NVIDIA hybrid gfx, both seem to need this quirk to have working backlight control. There are also some versions with only Intel integrated gfx, these may not need this quirk, but it should not hurt there."
Unfortunately that seems wrong, I've already received 2 reports of this commit causing regressions on some dv6 variants (at least one of which actually has a nvidia GPU). So it seems that HP has made a mess here by using the same model-name both in marketing and in the DMI data for many different variants. Some of which need acpi_backlight=native for functional backlight control (as the quirk this commit reverts was doing), where as others are broken by it. So lets get back to the old sitation so as to avoid regressing on models which used to work without any kernel cmdline arguments before.
Fixes: 6276e53fa8c0 (ACPI / video: Add force_native quirk for HP Pavilion dv6) Signed-off-by: Hans de Goede hdegoede@redhat.com Signed-off-by: Rafael J. Wysocki rafael.j.wysocki@intel.com Cc: Guenter Roeck linux@roeck-us.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/acpi/video_detect.c | 11 ----------- 1 file changed, 11 deletions(-)
--- a/drivers/acpi/video_detect.c +++ b/drivers/acpi/video_detect.c @@ -289,17 +289,6 @@ static const struct dmi_system_id video_ DMI_MATCH(DMI_PRODUCT_NAME, "Dell System XPS L702X"), }, }, - { - /* https://bugzilla.redhat.com/show_bug.cgi?id=1204476 */ - /* https://bugs.launchpad.net/ubuntu/+source/linux-lts-trusty/+bug/1416940 */ - .callback = video_detect_force_native, - .ident = "HP Pavilion dv6", - .matches = { - DMI_MATCH(DMI_SYS_VENDOR, "Hewlett-Packard"), - DMI_MATCH(DMI_PRODUCT_NAME, "HP Pavilion dv6 Notebook PC"), - }, - }, - { }, };
From: Govindarajulu Varadarajan gvaradar@cisco.com
commit 56f772279a762984f6e9ebbf24a7c829faba5712 upstream.
In failure path, we overwrite err to what vnic_rq_disable() returns. In case it returns 0, enic_open() returns success in case of error.
Reported-by: Ben Hutchings ben.hutchings@codethink.co.uk Fixes: e8588e268509 ("enic: enable rq before updating rq descriptors") Signed-off-by: Govindarajulu Varadarajan gvaradar@cisco.com Signed-off-by: David S. Miller davem@davemloft.net Cc: Guenter Roeck linux@roeck-us.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/net/ethernet/cisco/enic/enic_main.c | 9 ++++----- 1 file changed, 4 insertions(+), 5 deletions(-)
--- a/drivers/net/ethernet/cisco/enic/enic_main.c +++ b/drivers/net/ethernet/cisco/enic/enic_main.c @@ -1708,7 +1708,7 @@ static int enic_open(struct net_device * { struct enic *enic = netdev_priv(netdev); unsigned int i; - int err; + int err, ret;
err = enic_request_intr(enic); if (err) { @@ -1766,10 +1766,9 @@ static int enic_open(struct net_device *
err_out_free_rq: for (i = 0; i < enic->rq_count; i++) { - err = vnic_rq_disable(&enic->rq[i]); - if (err) - return err; - vnic_rq_clean(&enic->rq[i], enic_free_rq_buf); + ret = vnic_rq_disable(&enic->rq[i]); + if (!ret) + vnic_rq_clean(&enic->rq[i], enic_free_rq_buf); } enic_dev_notify_unset(enic); err_out_free_intr:
From: Sabrina Dubroca sd@queasysnail.net
commit afe49de44c27a89e8e9631c44b5ffadf6ace65e2 upstream.
Commit 15e668070a64 ("ipv6: reorder icmpv6_init() and ip6_mr_init()") moved the cleanup label for ipmr_fail, but should have changed the contents of the cleanup labels as well. Now we can end up cleaning up icmpv6 even though it hasn't been initialized (jump to icmp_fail or ipmr_fail).
Simply undo things in the reverse order of their initialization.
Example of panic (triggered by faking a failure of icmpv6_init):
kasan: GPF could be caused by NULL-ptr deref or user memory access general protection fault: 0000 [#1] PREEMPT SMP KASAN PTI [...] RIP: 0010:__list_del_entry_valid+0x79/0x160 [...] Call Trace: ? lock_release+0x8a0/0x8a0 unregister_pernet_operations+0xd4/0x560 ? ops_free_list+0x480/0x480 ? down_write+0x91/0x130 ? unregister_pernet_subsys+0x15/0x30 ? down_read+0x1b0/0x1b0 ? up_read+0x110/0x110 ? kmem_cache_create_usercopy+0x1b4/0x240 unregister_pernet_subsys+0x1d/0x30 icmpv6_cleanup+0x1d/0x30 inet6_init+0x1b5/0x23f
Fixes: 15e668070a64 ("ipv6: reorder icmpv6_init() and ip6_mr_init()") Signed-off-by: Sabrina Dubroca sd@queasysnail.net Signed-off-by: David S. Miller davem@davemloft.net Cc: Guenter Roeck linux@roeck-us.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- net/ipv6/af_inet6.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-)
--- a/net/ipv6/af_inet6.c +++ b/net/ipv6/af_inet6.c @@ -1029,11 +1029,11 @@ netfilter_fail: igmp_fail: ndisc_cleanup(); ndisc_fail: - ip6_mr_cleanup(); + icmpv6_cleanup(); icmp_fail: - unregister_pernet_subsys(&inet6_net_ops); + ip6_mr_cleanup(); ipmr_fail: - icmpv6_cleanup(); + unregister_pernet_subsys(&inet6_net_ops); register_pernet_fail: sock_unregister(PF_INET6); rtnl_unregister_all(PF_INET6);
From: Kees Cook keescook@chromium.org
commit bbdc6076d2e5d07db44e74c11b01a3e27ab90b32 upstream.
Commmit eab09532d400 ("binfmt_elf: use ELF_ET_DYN_BASE only for PIE"), made changes in the rare case when the ELF loader was directly invoked (e.g to set a non-inheritable LD_LIBRARY_PATH, testing new versions of the loader), by moving into the mmap region to avoid both ET_EXEC and PIE binaries. This had the effect of also moving the brk region into mmap, which could lead to the stack and brk being arbitrarily close to each other. An unlucky process wouldn't get its requested stack size and stack allocations could end up scribbling on the heap.
This is illustrated here. In the case of using the loader directly, brk (so helpfully identified as "[heap]") is allocated with the _loader_ not the binary. For example, with ASLR entirely disabled, you can see this more clearly:
$ /bin/cat /proc/self/maps 555555554000-55555555c000 r-xp 00000000 ... /bin/cat 55555575b000-55555575c000 r--p 00007000 ... /bin/cat 55555575c000-55555575d000 rw-p 00008000 ... /bin/cat 55555575d000-55555577e000 rw-p 00000000 ... [heap] ... 7ffff7ff7000-7ffff7ffa000 r--p 00000000 ... [vvar] 7ffff7ffa000-7ffff7ffc000 r-xp 00000000 ... [vdso] 7ffff7ffc000-7ffff7ffd000 r--p 00027000 ... /lib/x86_64-linux-gnu/ld-2.27.so 7ffff7ffd000-7ffff7ffe000 rw-p 00028000 ... /lib/x86_64-linux-gnu/ld-2.27.so 7ffff7ffe000-7ffff7fff000 rw-p 00000000 ... 7ffffffde000-7ffffffff000 rw-p 00000000 ... [stack]
$ /lib/x86_64-linux-gnu/ld-2.27.so /bin/cat /proc/self/maps ... 7ffff7bcc000-7ffff7bd4000 r-xp 00000000 ... /bin/cat 7ffff7bd4000-7ffff7dd3000 ---p 00008000 ... /bin/cat 7ffff7dd3000-7ffff7dd4000 r--p 00007000 ... /bin/cat 7ffff7dd4000-7ffff7dd5000 rw-p 00008000 ... /bin/cat 7ffff7dd5000-7ffff7dfc000 r-xp 00000000 ... /lib/x86_64-linux-gnu/ld-2.27.so 7ffff7fb2000-7ffff7fd6000 rw-p 00000000 ... 7ffff7ff7000-7ffff7ffa000 r--p 00000000 ... [vvar] 7ffff7ffa000-7ffff7ffc000 r-xp 00000000 ... [vdso] 7ffff7ffc000-7ffff7ffd000 r--p 00027000 ... /lib/x86_64-linux-gnu/ld-2.27.so 7ffff7ffd000-7ffff7ffe000 rw-p 00028000 ... /lib/x86_64-linux-gnu/ld-2.27.so 7ffff7ffe000-7ffff8020000 rw-p 00000000 ... [heap] 7ffffffde000-7ffffffff000 rw-p 00000000 ... [stack]
The solution is to move brk out of mmap and into ELF_ET_DYN_BASE since nothing is there in the direct loader case (and ET_EXEC is still far away at 0x400000). Anything that ran before should still work (i.e. the ultimately-launched binary already had the brk very far from its text, so this should be no different from a COMPAT_BRK standpoint). The only risk I see here is that if someone started to suddenly depend on the entire memory space lower than the mmap region being available when launching binaries via a direct loader execs which seems highly unlikely, I'd hope: this would mean a binary would _not_ work when exec()ed normally.
(Note that this is only done under CONFIG_ARCH_HAS_ELF_RANDOMIZATION when randomization is turned on.)
Link: http://lkml.kernel.org/r/20190422225727.GA21011@beast Link: https://lkml.kernel.org/r/CAGXu5jJ5sj3emOT2QPxQkNQk0qbU6zEfu9=Omfhx_p0nCKPSj... Fixes: eab09532d400 ("binfmt_elf: use ELF_ET_DYN_BASE only for PIE") Signed-off-by: Kees Cook keescook@chromium.org Reported-by: Ali Saidi alisaidi@amazon.com Cc: Ali Saidi alisaidi@amazon.com Cc: Guenter Roeck linux@roeck-us.net Cc: Michal Hocko mhocko@suse.com Cc: Matthew Wilcox willy@infradead.org Cc: Thomas Gleixner tglx@linutronix.de Cc: Jann Horn jannh@google.com Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Cc: Guenter Roeck linux@roeck-us.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- fs/binfmt_elf.c | 11 +++++++++++ 1 file changed, 11 insertions(+)
--- a/fs/binfmt_elf.c +++ b/fs/binfmt_elf.c @@ -1097,6 +1097,17 @@ static int load_elf_binary(struct linux_ current->mm->start_stack = bprm->p;
if ((current->flags & PF_RANDOMIZE) && (randomize_va_space > 1)) { + /* + * For architectures with ELF randomization, when executing + * a loader directly (i.e. no interpreter listed in ELF + * headers), move the brk area out of the mmap region + * (since it grows up, and may collide early with the stack + * growing down), and into the unused ELF_ET_DYN_BASE region. + */ + if (IS_ENABLED(CONFIG_ARCH_HAS_ELF_RANDOMIZE) && !interpreter) + current->mm->brk = current->mm->start_brk = + ELF_ET_DYN_BASE; + current->mm->brk = current->mm->start_brk = arch_randomize_brk(current->mm); #ifdef compat_brk_randomized
From: Ville Syrjälä ville.syrjala@linux.intel.com
commit 6f6060a5c9cc76fdbc22748264e6aa3779ec2427 upstream.
APM_DO_POP_SEGS does not restore fs/gs which were zeroed by APM_DO_ZERO_SEGS. Trying to access __preempt_count with zeroed fs doesn't really work.
Move the ibrs call outside the APM_DO_SAVE_SEGS/APM_DO_RESTORE_SEGS invocations so that fs is actually restored before calling preempt_enable().
Fixes the following sort of oopses: [ 0.313581] general protection fault: 0000 [#1] PREEMPT SMP [ 0.313803] Modules linked in: [ 0.314040] CPU: 0 PID: 268 Comm: kapmd Not tainted 4.16.0-rc1-triton-bisect-00090-gdd84441a7971 #19 [ 0.316161] EIP: __apm_bios_call_simple+0xc8/0x170 [ 0.316161] EFLAGS: 00210016 CPU: 0 [ 0.316161] EAX: 00000102 EBX: 00000000 ECX: 00000102 EDX: 00000000 [ 0.316161] ESI: 0000530e EDI: dea95f64 EBP: dea95f18 ESP: dea95ef0 [ 0.316161] DS: 007b ES: 007b FS: 0000 GS: 0000 SS: 0068 [ 0.316161] CR0: 80050033 CR2: 00000000 CR3: 015d3000 CR4: 000006d0 [ 0.316161] Call Trace: [ 0.316161] ? cpumask_weight.constprop.15+0x20/0x20 [ 0.316161] on_cpu0+0x44/0x70 [ 0.316161] apm+0x54e/0x720 [ 0.316161] ? __switch_to_asm+0x26/0x40 [ 0.316161] ? __schedule+0x17d/0x590 [ 0.316161] kthread+0xc0/0xf0 [ 0.316161] ? proc_apm_show+0x150/0x150 [ 0.316161] ? kthread_create_worker_on_cpu+0x20/0x20 [ 0.316161] ret_from_fork+0x2e/0x38 [ 0.316161] Code: da 8e c2 8e e2 8e ea 57 55 2e ff 1d e0 bb 5d b1 0f 92 c3 5d 5f 07 1f 89 47 0c 90 8d b4 26 00 00 00 00 90 8d b4 26 00 00 00 00 90 <64> ff 0d 84 16 5c b1 74 7f 8b 45 dc 8e e0 8b 45 d8 8e e8 8b 45 [ 0.316161] EIP: __apm_bios_call_simple+0xc8/0x170 SS:ESP: 0068:dea95ef0 [ 0.316161] ---[ end trace 656253db2deaa12c ]---
Fixes: dd84441a7971 ("x86/speculation: Use IBRS if available before calling into firmware") Signed-off-by: Ville Syrjälä ville.syrjala@linux.intel.com Signed-off-by: Thomas Gleixner tglx@linutronix.de Cc: stable@vger.kernel.org Cc: David Woodhouse dwmw@amazon.co.uk Cc: "H. Peter Anvin" hpa@zytor.com Cc: x86@kernel.org Cc: David Woodhouse dwmw@amazon.co.uk Cc: "H. Peter Anvin" hpa@zytor.com Link: https://lkml.kernel.org/r/20180709133534.5963-1-ville.syrjala@linux.intel.co... Cc: Guenter Roeck linux@roeck-us.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- arch/x86/include/asm/apm.h | 6 ------ arch/x86/kernel/apm_32.c | 5 +++++ 2 files changed, 5 insertions(+), 6 deletions(-)
--- a/arch/x86/include/asm/apm.h +++ b/arch/x86/include/asm/apm.h @@ -6,8 +6,6 @@ #ifndef _ASM_X86_MACH_DEFAULT_APM_H #define _ASM_X86_MACH_DEFAULT_APM_H
-#include <asm/nospec-branch.h> - #ifdef APM_ZERO_SEGS # define APM_DO_ZERO_SEGS \ "pushl %%ds\n\t" \ @@ -33,7 +31,6 @@ static inline void apm_bios_call_asm(u32 * N.B. We do NOT need a cld after the BIOS call * because we always save and restore the flags. */ - firmware_restrict_branch_speculation_start(); __asm__ __volatile__(APM_DO_ZERO_SEGS "pushl %%edi\n\t" "pushl %%ebp\n\t" @@ -46,7 +43,6 @@ static inline void apm_bios_call_asm(u32 "=S" (*esi) : "a" (func), "b" (ebx_in), "c" (ecx_in) : "memory", "cc"); - firmware_restrict_branch_speculation_end(); }
static inline u8 apm_bios_call_simple_asm(u32 func, u32 ebx_in, @@ -59,7 +55,6 @@ static inline u8 apm_bios_call_simple_as * N.B. We do NOT need a cld after the BIOS call * because we always save and restore the flags. */ - firmware_restrict_branch_speculation_start(); __asm__ __volatile__(APM_DO_ZERO_SEGS "pushl %%edi\n\t" "pushl %%ebp\n\t" @@ -72,7 +67,6 @@ static inline u8 apm_bios_call_simple_as "=S" (si) : "a" (func), "b" (ebx_in), "c" (ecx_in) : "memory", "cc"); - firmware_restrict_branch_speculation_end(); return error; }
--- a/arch/x86/kernel/apm_32.c +++ b/arch/x86/kernel/apm_32.c @@ -239,6 +239,7 @@ #include <asm/olpc.h> #include <asm/paravirt.h> #include <asm/reboot.h> +#include <asm/nospec-branch.h>
#if defined(CONFIG_APM_DISPLAY_BLANK) && defined(CONFIG_VT) extern int (*console_blank_hook)(int); @@ -613,11 +614,13 @@ static long __apm_bios_call(void *_call) gdt[0x40 / 8] = bad_bios_desc;
apm_irq_save(flags); + firmware_restrict_branch_speculation_start(); APM_DO_SAVE_SEGS; apm_bios_call_asm(call->func, call->ebx, call->ecx, &call->eax, &call->ebx, &call->ecx, &call->edx, &call->esi); APM_DO_RESTORE_SEGS; + firmware_restrict_branch_speculation_end(); apm_irq_restore(flags); gdt[0x40 / 8] = save_desc_40; put_cpu(); @@ -689,10 +692,12 @@ static long __apm_bios_call_simple(void gdt[0x40 / 8] = bad_bios_desc;
apm_irq_save(flags); + firmware_restrict_branch_speculation_start(); APM_DO_SAVE_SEGS; error = apm_bios_call_simple_asm(call->func, call->ebx, call->ecx, &call->eax); APM_DO_RESTORE_SEGS; + firmware_restrict_branch_speculation_end(); apm_irq_restore(flags); gdt[0x40 / 8] = save_desc_40; put_cpu();
From: Alex Estrin alex.estrin@intel.com
commit 612601d0013f03de9dc134809f242ba6da9ca252 upstream.
commit 9a9b8112699d will cause core to fail UD QP from being destroyed on ipoib unload, therefore cause resources leakage. On pkey change event above patch modifies mgid before calling underlying driver to detach it from QP. Drivers' detach_mcast() will fail to find modified mgid it was never given to attach in a first place. Core qp->usecnt will never go down, so ib_destroy_qp() will fail.
IPoIB driver actually does take care of new broadcast mgid based on new pkey by destroying an old mcast object in ipoib_mcast_dev_flush()) .... if (priv->broadcast) { rb_erase(&priv->broadcast->rb_node, &priv->multicast_tree); list_add_tail(&priv->broadcast->list, &remove_list); priv->broadcast = NULL; } ...
then in restarted ipoib_macst_join_task() creating a new broadcast mcast object, sending join request and on completion tells the driver to attach to reinitialized QP: ... if (!priv->broadcast) { ... broadcast = ipoib_mcast_alloc(dev, 0); ... memcpy(broadcast->mcmember.mgid.raw, priv->dev->broadcast + 4, sizeof (union ib_gid)); priv->broadcast = broadcast; ...
Fixes: 9a9b8112699d ("IB/ipoib: Update broadcast object if PKey value was changed in index 0") Cc: stable@vger.kernel.org Reviewed-by: Mike Marciniszyn mike.marciniszyn@intel.com Reviewed-by: Dennis Dalessandro dennis.dalessandro@intel.com Signed-off-by: Alex Estrin alex.estrin@intel.com Signed-off-by: Dennis Dalessandro dennis.dalessandro@intel.com Reviewed-by: Feras Daoud ferasda@mellanox.com Signed-off-by: Doug Ledford dledford@redhat.com Cc: Guenter Roeck linux@roeck-us.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/infiniband/ulp/ipoib/ipoib_ib.c | 13 ------------- 1 file changed, 13 deletions(-)
--- a/drivers/infiniband/ulp/ipoib/ipoib_ib.c +++ b/drivers/infiniband/ulp/ipoib/ipoib_ib.c @@ -945,19 +945,6 @@ static inline int update_parent_pkey(str */ priv->dev->broadcast[8] = priv->pkey >> 8; priv->dev->broadcast[9] = priv->pkey & 0xff; - - /* - * Update the broadcast address in the priv->broadcast object, - * in case it already exists, otherwise no one will do that. - */ - if (priv->broadcast) { - spin_lock_irq(&priv->lock); - memcpy(priv->broadcast->mcmember.mgid.raw, - priv->dev->broadcast + 4, - sizeof(union ib_gid)); - spin_unlock_irq(&priv->lock); - } - return 0; }
From: Oliver Neukum oneukum@suse.com
commit 9f04db234af691007bb785342a06abab5fb34474 upstream.
This device needs US_FL_NO_REPORT_OPCODES to avoid going through prolonged error handling on enumeration.
Signed-off-by: Oliver Neukum oneukum@suse.com Reported-by: Julian Groß julian.g@posteo.de Cc: stable stable@vger.kernel.org Link: https://lore.kernel.org/r/20200429155218.7308-1-oneukum@suse.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/usb/storage/unusual_uas.h | 7 +++++++ 1 file changed, 7 insertions(+)
--- a/drivers/usb/storage/unusual_uas.h +++ b/drivers/usb/storage/unusual_uas.h @@ -40,6 +40,13 @@ * and don't forget to CC: the USB development list linux-usb@vger.kernel.org */
+/* Reported-by: Julian Groß julian.g@posteo.de */ +UNUSUAL_DEV(0x059f, 0x105f, 0x0000, 0x9999, + "LaCie", + "2Big Quadra USB3", + USB_SC_DEVICE, USB_PR_DEVICE, NULL, + US_FL_NO_REPORT_OPCODES), + /* * Apricorn USB3 dongle sometimes returns "USBSUSBSUSBS" in response to SCSI * commands in UAS mode. Observed with the 1.28 firmware; are there others?
From: Oliver Neukum oneukum@suse.com
commit e9b3c610a05c1cdf8e959a6d89c38807ff758ee6 upstream.
We must not process packets shorter than a packet ID
Signed-off-by: Oliver Neukum oneukum@suse.com Reported-and-tested-by: syzbot+d29e9263e13ce0b9f4fd@syzkaller.appspotmail.com Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Cc: stable stable@vger.kernel.org Signed-off-by: Johan Hovold johan@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/usb/serial/garmin_gps.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
--- a/drivers/usb/serial/garmin_gps.c +++ b/drivers/usb/serial/garmin_gps.c @@ -1162,8 +1162,8 @@ static void garmin_read_process(struct g send it directly to the tty port */ if (garmin_data_p->flags & FLAGS_QUEUING) { pkt_add(garmin_data_p, data, data_length); - } else if (bulk_data || - getLayerId(data) == GARMIN_LAYERID_APPL) { + } else if (bulk_data || (data_length >= sizeof(u32) && + getLayerId(data) == GARMIN_LAYERID_APPL)) {
spin_lock_irqsave(&garmin_data_p->lock, flags); garmin_data_p->flags |= APP_RESP_SEEN;
From: George Spelvin lkml@sdf.org
commit fd0c42c4dea54335967c5a86f15fc064235a2797 upstream.
and change to pseudorandom numbers, as this is a traffic dithering operation that doesn't need crypto-grade.
The previous code operated in 4 steps:
1. Generate a random byte 0 <= rand_tq <= 255 2. Multiply it by BATADV_TQ_MAX_VALUE - tq 3. Divide by 255 (= BATADV_TQ_MAX_VALUE) 4. Return BATADV_TQ_MAX_VALUE - rand_tq
This would apperar to scale (BATADV_TQ_MAX_VALUE - tq) by a random value between 0/255 and 255/255.
But! The intermediate value between steps 3 and 4 is stored in a u8 variable. So it's truncated, and most of the time, is less than 255, after which the division produces 0. Specifically, if tq is odd, the product is always even, and can never be 255. If tq is even, there's exactly one random byte value that will produce a product byte of 255.
Thus, the return value is 255 (511/512 of the time) or 254 (1/512 of the time).
If we assume that the truncation is a bug, and the code is meant to scale the input, a simpler way of looking at it is that it's returning a random value between tq and BATADV_TQ_MAX_VALUE, inclusive.
Well, we have an optimized function for doing just that.
Fixes: 3c12de9a5c75 ("batman-adv: network coding - code and transmit packets if possible") Signed-off-by: George Spelvin lkml@sdf.org Signed-off-by: Sven Eckelmann sven@narfation.org Signed-off-by: Simon Wunderlich sw@simonwunderlich.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- net/batman-adv/network-coding.c | 9 +-------- 1 file changed, 1 insertion(+), 8 deletions(-)
--- a/net/batman-adv/network-coding.c +++ b/net/batman-adv/network-coding.c @@ -991,15 +991,8 @@ static struct batadv_nc_path *batadv_nc_ */ static u8 batadv_nc_random_weight_tq(u8 tq) { - u8 rand_val, rand_tq; - - get_random_bytes(&rand_val, sizeof(rand_val)); - /* randomize the estimated packet loss (max TQ - estimated TQ) */ - rand_tq = rand_val * (BATADV_TQ_MAX_VALUE - tq); - - /* normalize the randomized packet loss */ - rand_tq /= BATADV_TQ_MAX_VALUE; + u8 rand_tq = prandom_u32_max(BATADV_TQ_MAX_VALUE + 1 - tq);
/* convert to (randomized) estimated tq again */ return BATADV_TQ_MAX_VALUE - rand_tq;
From: Ivan Delalande colona@arista.com
commit e08df079b23e2e982df15aa340bfbaf50f297504 upstream.
If the trapping instruction contains a ':', for a memory access through segment registers for example, the sed substitution will insert the '*' marker in the middle of the instruction instead of the line address:
2b: 65 48 0f c7 0f cmpxchg16b %gs:*(%rdi) <-- trapping instruction
I started to think I had forgotten some quirk of the assembly syntax before noticing that it was actually coming from the script. Fix it to add the address marker at the right place for these instructions:
28: 49 8b 06 mov (%r14),%rax 2b:* 65 48 0f c7 0f cmpxchg16b %gs:(%rdi) <-- trapping instruction 30: 0f 94 c0 sete %al
Fixes: 18ff44b189e2 ("scripts/decodecode: make faulting insn ptr more robust") Signed-off-by: Ivan Delalande colona@arista.com Signed-off-by: Andrew Morton akpm@linux-foundation.org Reviewed-by: Borislav Petkov bp@suse.de Link: http://lkml.kernel.org/r/20200419223653.GA31248@visor Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- scripts/decodecode | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/scripts/decodecode +++ b/scripts/decodecode @@ -98,7 +98,7 @@ faultlinenum=$(( $(wc -l $T.oo | cut -d faultline=`cat $T.dis | head -1 | cut -d":" -f2-` faultline=`echo "$faultline" | sed -e 's/[/\[/g; s/]/\]/g'`
-cat $T.oo | sed -e "${faultlinenum}s/^(.*:)(.*)/\1*\2\t\t<-- trapping instruction/" +cat $T.oo | sed -e "${faultlinenum}s/^([^:]*:)(.*)/\1*\2\t\t<-- trapping instruction/" echo cat $T.aa cleanup
From: Alexandre Belloni alexandre.belloni@free-electrons.com
[ Upstream commit 99f81afc139c6edd14d77a91ee91685a414a1c66 ]
Disable auto negotiation on init to properly detect an already plugged cable at boot.
At boot, when the phy is started, it is in the PHY_UP state. However, if a cable is plugged at boot, because auto negociation is already enabled at the time we get the first interrupt, the phy is already running. But the state machine then switches from PHY_UP to PHY_AN and calls phy_start_aneg(). phy_start_aneg() will not do anything because aneg is already enabled on the phy. It will then wait for a interrupt before going further. This interrupt will never happen unless the cable is unplugged and then replugged.
It was working properly before 321beec5047a (net: phy: Use interrupts when available in NOLINK state) because switching to NOLINK meant starting polling the phy, even if IRQ were enabled.
Fixes: 321beec5047a (net: phy: Use interrupts when available in NOLINK state) Signed-off-by: Alexandre Belloni alexandre.belloni@free-electrons.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/phy/micrel.c | 11 +++++++++++ 1 file changed, 11 insertions(+)
diff --git a/drivers/net/phy/micrel.c b/drivers/net/phy/micrel.c index 4eba646789c30..98166e144f2dd 100644 --- a/drivers/net/phy/micrel.c +++ b/drivers/net/phy/micrel.c @@ -285,6 +285,17 @@ static int kszphy_config_init(struct phy_device *phydev) if (priv->led_mode >= 0) kszphy_setup_led(phydev, type->led_mode_reg, priv->led_mode);
+ if (phy_interrupt_is_valid(phydev)) { + int ctl = phy_read(phydev, MII_BMCR); + + if (ctl < 0) + return ctl; + + ret = phy_write(phydev, MII_BMCR, ctl & ~BMCR_ANENABLE); + if (ret < 0) + return ret; + } + return 0; }
On Mon, May 18, 2020 at 07:35:48PM +0200, Greg Kroah-Hartman wrote:
From: Alexandre Belloni alexandre.belloni@free-electrons.com
[ Upstream commit 99f81afc139c6edd14d77a91ee91685a414a1c66 ]
I notice 99f81afc139c has been reverted in mainline with commit b43bd72835a5. The revert commit points out that:
"It was papering over the real problem, which is fixed by commit f555f34fdc58 ("net: phy: fix auto-negotiation stall due to unavailable interrupt")"
Therefore, consider backporting f555f34fdc58 instead of 99f81afc139c.
Notice if f555f34fdc58 is taken, then I believe 215d08a85b9a should also be backported.
Thanks, -- Henri
Disable auto negotiation on init to properly detect an already plugged cable at boot.
At boot, when the phy is started, it is in the PHY_UP state. However, if a cable is plugged at boot, because auto negociation is already enabled at the time we get the first interrupt, the phy is already running. But the state machine then switches from PHY_UP to PHY_AN and calls phy_start_aneg(). phy_start_aneg() will not do anything because aneg is already enabled on the phy. It will then wait for a interrupt before going further. This interrupt will never happen unless the cable is unplugged and then replugged.
It was working properly before 321beec5047a (net: phy: Use interrupts when available in NOLINK state) because switching to NOLINK meant starting polling the phy, even if IRQ were enabled.
On Tue, May 19, 2020 at 08:45:12AM +0300, Henri Rosten wrote:
On Mon, May 18, 2020 at 07:35:48PM +0200, Greg Kroah-Hartman wrote:
From: Alexandre Belloni alexandre.belloni@free-electrons.com
[ Upstream commit 99f81afc139c6edd14d77a91ee91685a414a1c66 ]
I notice 99f81afc139c has been reverted in mainline with commit b43bd72835a5. The revert commit points out that:
"It was papering over the real problem, which is fixed by commit f555f34fdc58 ("net: phy: fix auto-negotiation stall due to unavailable interrupt")" Therefore, consider backporting f555f34fdc58 instead of 99f81afc139c.
Notice if f555f34fdc58 is taken, then I believe 215d08a85b9a should also be backported.
I'm just going to drop this single patch for now instead, thanks.
greg k-h
From: Alexandre Belloni alexandre.belloni@free-electrons.com
[ Upstream commit f5aba91d7f186cba84af966a741a0346de603cd4 ]
At least on ksz8081, when getting back from power down, interrupts are disabled. ensure they are reenabled if they were previously enabled.
This fixes resuming which is failing on the xplained boards from atmel since 321beec5047a (net: phy: Use interrupts when available in NOLINK state)
Fixes: 321beec5047a (net: phy: Use interrupts when available in NOLINK state) Signed-off-by: Alexandre Belloni alexandre.belloni@free-electrons.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/phy/micrel.c | 17 ++++++++++++++++- 1 file changed, 16 insertions(+), 1 deletion(-)
diff --git a/drivers/net/phy/micrel.c b/drivers/net/phy/micrel.c index 98166e144f2dd..48788ef0ac639 100644 --- a/drivers/net/phy/micrel.c +++ b/drivers/net/phy/micrel.c @@ -603,6 +603,21 @@ ksz9021_wr_mmd_phyreg(struct phy_device *phydev, int ptrad, int devnum, { }
+static int kszphy_resume(struct phy_device *phydev) +{ + int value; + + mutex_lock(&phydev->lock); + + value = phy_read(phydev, MII_BMCR); + phy_write(phydev, MII_BMCR, value & ~BMCR_PDOWN); + + kszphy_config_intr(phydev); + mutex_unlock(&phydev->lock); + + return 0; +} + static int kszphy_probe(struct phy_device *phydev) { const struct kszphy_type *type = phydev->drv->driver_data; @@ -794,7 +809,7 @@ static struct phy_driver ksphy_driver[] = { .ack_interrupt = kszphy_ack_interrupt, .config_intr = kszphy_config_intr, .suspend = genphy_suspend, - .resume = genphy_resume, + .resume = kszphy_resume, .driver = { .owner = THIS_MODULE,}, }, { .phy_id = PHY_ID_KSZ8061,
From: Kees Cook keescook@chromium.org
commit 7be3cb019db1cbd5fd5ffe6d64a23fefa4b6f229 upstream.
When brk was moved for binaries without an interpreter, it should have been limited to ET_DYN only. In other words, the special case was an ET_DYN that lacks an INTERP, not just an executable that lacks INTERP. The bug manifested for giant static executables, where the brk would end up in the middle of the text area on 32-bit architectures.
Reported-and-tested-by: Richard Kojedzinszky richard@kojedz.in Fixes: bbdc6076d2e5 ("binfmt_elf: move brk out of mmap when doing direct loader exec") Cc: stable@vger.kernel.org Signed-off-by: Kees Cook keescook@chromium.org Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Cc: Guenter Roeck linux@roeck-us.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- fs/binfmt_elf.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
--- a/fs/binfmt_elf.c +++ b/fs/binfmt_elf.c @@ -1104,7 +1104,8 @@ static int load_elf_binary(struct linux_ * (since it grows up, and may collide early with the stack * growing down), and into the unused ELF_ET_DYN_BASE region. */ - if (IS_ENABLED(CONFIG_ARCH_HAS_ELF_RANDOMIZE) && !interpreter) + if (IS_ENABLED(CONFIG_ARCH_HAS_ELF_RANDOMIZE) && + loc->elf_ex.e_type == ET_DYN && !interpreter) current->mm->brk = current->mm->start_brk = ELF_ET_DYN_BASE;
From: Shijie Luo luoshijie1@huawei.com
commit af133ade9a40794a37104ecbcc2827c0ea373a3c upstream.
When journal size is set too big by "mkfs.ext4 -J size=", or when we mount a crafted image to make journal inode->i_size too big, the loop, "while (i < num)", holds cpu too long. This could cause soft lockup.
[ 529.357541] Call trace: [ 529.357551] dump_backtrace+0x0/0x198 [ 529.357555] show_stack+0x24/0x30 [ 529.357562] dump_stack+0xa4/0xcc [ 529.357568] watchdog_timer_fn+0x300/0x3e8 [ 529.357574] __hrtimer_run_queues+0x114/0x358 [ 529.357576] hrtimer_interrupt+0x104/0x2d8 [ 529.357580] arch_timer_handler_virt+0x38/0x58 [ 529.357584] handle_percpu_devid_irq+0x90/0x248 [ 529.357588] generic_handle_irq+0x34/0x50 [ 529.357590] __handle_domain_irq+0x68/0xc0 [ 529.357593] gic_handle_irq+0x6c/0x150 [ 529.357595] el1_irq+0xb8/0x140 [ 529.357599] __ll_sc_atomic_add_return_acquire+0x14/0x20 [ 529.357668] ext4_map_blocks+0x64/0x5c0 [ext4] [ 529.357693] ext4_setup_system_zone+0x330/0x458 [ext4] [ 529.357717] ext4_fill_super+0x2170/0x2ba8 [ext4] [ 529.357722] mount_bdev+0x1a8/0x1e8 [ 529.357746] ext4_mount+0x44/0x58 [ext4] [ 529.357748] mount_fs+0x50/0x170 [ 529.357752] vfs_kern_mount.part.9+0x54/0x188 [ 529.357755] do_mount+0x5ac/0xd78 [ 529.357758] ksys_mount+0x9c/0x118 [ 529.357760] __arm64_sys_mount+0x28/0x38 [ 529.357764] el0_svc_common+0x78/0x130 [ 529.357766] el0_svc_handler+0x38/0x78 [ 529.357769] el0_svc+0x8/0xc [ 541.356516] watchdog: BUG: soft lockup - CPU#0 stuck for 23s! [mount:18674]
Link: https://lore.kernel.org/r/20200211011752.29242-1-luoshijie1@huawei.com Reviewed-by: Jan Kara jack@suse.cz Signed-off-by: Shijie Luo luoshijie1@huawei.com Signed-off-by: Theodore Ts'o tytso@mit.edu Cc: stable@kernel.org Signed-off-by: Ben Hutchings ben.hutchings@codethink.co.uk Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/ext4/block_validity.c | 1 + 1 file changed, 1 insertion(+)
--- a/fs/ext4/block_validity.c +++ b/fs/ext4/block_validity.c @@ -152,6 +152,7 @@ static int ext4_protect_reserved_inode(s return PTR_ERR(inode); num = (inode->i_size + sb->s_blocksize - 1) >> sb->s_blocksize_bits; while (i < num) { + cond_resched(); map.m_lblk = i; map.m_len = num - i; n = ext4_map_blocks(NULL, inode, &map, 0);
From: Sabrina Dubroca sd@queasysnail.net
commit c4e85f73afb6384123e5ef1bba3315b2e3ad031e upstream.
This will be used in the conversion of ipv6_stub to ip6_dst_lookup_flow, as some modules currently pass a net argument without a socket to ip6_dst_lookup. This is equivalent to commit 343d60aada5a ("ipv6: change ipv6_stub_impl.ipv6_dst_lookup to take net argument").
Signed-off-by: Sabrina Dubroca sd@queasysnail.net Signed-off-by: David S. Miller davem@davemloft.net [bwh: Backported to 4.4: adjust context] Signed-off-by: Ben Hutchings ben.hutchings@codethink.co.uk Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- include/net/ipv6.h | 2 +- net/dccp/ipv6.c | 6 +++--- net/ipv6/af_inet6.c | 2 +- net/ipv6/datagram.c | 2 +- net/ipv6/inet6_connection_sock.c | 4 ++-- net/ipv6/ip6_output.c | 8 ++++---- net/ipv6/raw.c | 2 +- net/ipv6/syncookies.c | 2 +- net/ipv6/tcp_ipv6.c | 4 ++-- net/l2tp/l2tp_ip6.c | 2 +- net/sctp/ipv6.c | 4 ++-- 11 files changed, 19 insertions(+), 19 deletions(-)
--- a/include/net/ipv6.h +++ b/include/net/ipv6.h @@ -853,7 +853,7 @@ static inline struct sk_buff *ip6_finish
int ip6_dst_lookup(struct net *net, struct sock *sk, struct dst_entry **dst, struct flowi6 *fl6); -struct dst_entry *ip6_dst_lookup_flow(const struct sock *sk, struct flowi6 *fl6, +struct dst_entry *ip6_dst_lookup_flow(struct net *net, const struct sock *sk, struct flowi6 *fl6, const struct in6_addr *final_dst); struct dst_entry *ip6_sk_dst_lookup_flow(struct sock *sk, struct flowi6 *fl6, const struct in6_addr *final_dst); --- a/net/dccp/ipv6.c +++ b/net/dccp/ipv6.c @@ -209,7 +209,7 @@ static int dccp_v6_send_response(const s final_p = fl6_update_dst(&fl6, rcu_dereference(np->opt), &final); rcu_read_unlock();
- dst = ip6_dst_lookup_flow(sk, &fl6, final_p); + dst = ip6_dst_lookup_flow(sock_net(sk), sk, &fl6, final_p); if (IS_ERR(dst)) { err = PTR_ERR(dst); dst = NULL; @@ -276,7 +276,7 @@ static void dccp_v6_ctl_send_reset(const security_skb_classify_flow(rxskb, flowi6_to_flowi(&fl6));
/* sk = NULL, but it is safe for now. RST socket required. */ - dst = ip6_dst_lookup_flow(ctl_sk, &fl6, NULL); + dst = ip6_dst_lookup_flow(sock_net(ctl_sk), ctl_sk, &fl6, NULL); if (!IS_ERR(dst)) { skb_dst_set(skb, dst); ip6_xmit(ctl_sk, skb, &fl6, NULL, 0); @@ -879,7 +879,7 @@ static int dccp_v6_connect(struct sock * opt = rcu_dereference_protected(np->opt, sock_owned_by_user(sk)); final_p = fl6_update_dst(&fl6, opt, &final);
- dst = ip6_dst_lookup_flow(sk, &fl6, final_p); + dst = ip6_dst_lookup_flow(sock_net(sk), sk, &fl6, final_p); if (IS_ERR(dst)) { err = PTR_ERR(dst); goto failure; --- a/net/ipv6/af_inet6.c +++ b/net/ipv6/af_inet6.c @@ -683,7 +683,7 @@ int inet6_sk_rebuild_header(struct sock &final); rcu_read_unlock();
- dst = ip6_dst_lookup_flow(sk, &fl6, final_p); + dst = ip6_dst_lookup_flow(sock_net(sk), sk, &fl6, final_p); if (IS_ERR(dst)) { sk->sk_route_caps = 0; sk->sk_err_soft = -PTR_ERR(dst); --- a/net/ipv6/datagram.c +++ b/net/ipv6/datagram.c @@ -179,7 +179,7 @@ ipv4_connected: final_p = fl6_update_dst(&fl6, opt, &final); rcu_read_unlock();
- dst = ip6_dst_lookup_flow(sk, &fl6, final_p); + dst = ip6_dst_lookup_flow(sock_net(sk), sk, &fl6, final_p); err = 0; if (IS_ERR(dst)) { err = PTR_ERR(dst); --- a/net/ipv6/inet6_connection_sock.c +++ b/net/ipv6/inet6_connection_sock.c @@ -88,7 +88,7 @@ struct dst_entry *inet6_csk_route_req(co fl6->fl6_sport = htons(ireq->ir_num); security_req_classify_flow(req, flowi6_to_flowi(fl6));
- dst = ip6_dst_lookup_flow(sk, fl6, final_p); + dst = ip6_dst_lookup_flow(sock_net(sk), sk, fl6, final_p); if (IS_ERR(dst)) return NULL;
@@ -142,7 +142,7 @@ static struct dst_entry *inet6_csk_route
dst = __inet6_csk_dst_check(sk, np->dst_cookie); if (!dst) { - dst = ip6_dst_lookup_flow(sk, fl6, final_p); + dst = ip6_dst_lookup_flow(sock_net(sk), sk, fl6, final_p);
if (!IS_ERR(dst)) ip6_dst_store(sk, dst, NULL, NULL); --- a/net/ipv6/ip6_output.c +++ b/net/ipv6/ip6_output.c @@ -1057,13 +1057,13 @@ EXPORT_SYMBOL_GPL(ip6_dst_lookup); * It returns a valid dst pointer on success, or a pointer encoded * error code. */ -struct dst_entry *ip6_dst_lookup_flow(const struct sock *sk, struct flowi6 *fl6, +struct dst_entry *ip6_dst_lookup_flow(struct net *net, const struct sock *sk, struct flowi6 *fl6, const struct in6_addr *final_dst) { struct dst_entry *dst = NULL; int err;
- err = ip6_dst_lookup_tail(sock_net(sk), sk, &dst, fl6); + err = ip6_dst_lookup_tail(net, sk, &dst, fl6); if (err) return ERR_PTR(err); if (final_dst) @@ -1071,7 +1071,7 @@ struct dst_entry *ip6_dst_lookup_flow(co if (!fl6->flowi6_oif) fl6->flowi6_oif = l3mdev_fib_oif(dst->dev);
- return xfrm_lookup_route(sock_net(sk), dst, flowi6_to_flowi(fl6), sk, 0); + return xfrm_lookup_route(net, dst, flowi6_to_flowi(fl6), sk, 0); } EXPORT_SYMBOL_GPL(ip6_dst_lookup_flow);
@@ -1096,7 +1096,7 @@ struct dst_entry *ip6_sk_dst_lookup_flow
dst = ip6_sk_dst_check(sk, dst, fl6); if (!dst) - dst = ip6_dst_lookup_flow(sk, fl6, final_dst); + dst = ip6_dst_lookup_flow(sock_net(sk), sk, fl6, final_dst);
return dst; } --- a/net/ipv6/raw.c +++ b/net/ipv6/raw.c @@ -889,7 +889,7 @@ static int rawv6_sendmsg(struct sock *sk if (hdrincl) fl6.flowi6_flags |= FLOWI_FLAG_KNOWN_NH;
- dst = ip6_dst_lookup_flow(sk, &fl6, final_p); + dst = ip6_dst_lookup_flow(sock_net(sk), sk, &fl6, final_p); if (IS_ERR(dst)) { err = PTR_ERR(dst); goto out; --- a/net/ipv6/syncookies.c +++ b/net/ipv6/syncookies.c @@ -231,7 +231,7 @@ struct sock *cookie_v6_check(struct sock fl6.fl6_sport = inet_sk(sk)->inet_sport; security_req_classify_flow(req, flowi6_to_flowi(&fl6));
- dst = ip6_dst_lookup_flow(sk, &fl6, final_p); + dst = ip6_dst_lookup_flow(sock_net(sk), sk, &fl6, final_p); if (IS_ERR(dst)) goto out_free; } --- a/net/ipv6/tcp_ipv6.c +++ b/net/ipv6/tcp_ipv6.c @@ -245,7 +245,7 @@ static int tcp_v6_connect(struct sock *s
security_sk_classify_flow(sk, flowi6_to_flowi(&fl6));
- dst = ip6_dst_lookup_flow(sk, &fl6, final_p); + dst = ip6_dst_lookup_flow(sock_net(sk), sk, &fl6, final_p); if (IS_ERR(dst)) { err = PTR_ERR(dst); goto failure; @@ -831,7 +831,7 @@ static void tcp_v6_send_response(const s * Underlying function will use this to retrieve the network * namespace */ - dst = ip6_dst_lookup_flow(ctl_sk, &fl6, NULL); + dst = ip6_dst_lookup_flow(sock_net(ctl_sk), ctl_sk, &fl6, NULL); if (!IS_ERR(dst)) { skb_dst_set(buff, dst); ip6_xmit(ctl_sk, buff, &fl6, NULL, tclass); --- a/net/l2tp/l2tp_ip6.c +++ b/net/l2tp/l2tp_ip6.c @@ -619,7 +619,7 @@ static int l2tp_ip6_sendmsg(struct sock
security_sk_classify_flow(sk, flowi6_to_flowi(&fl6));
- dst = ip6_dst_lookup_flow(sk, &fl6, final_p); + dst = ip6_dst_lookup_flow(sock_net(sk), sk, &fl6, final_p); if (IS_ERR(dst)) { err = PTR_ERR(dst); goto out; --- a/net/sctp/ipv6.c +++ b/net/sctp/ipv6.c @@ -268,7 +268,7 @@ static void sctp_v6_get_dst(struct sctp_ final_p = fl6_update_dst(fl6, rcu_dereference(np->opt), &final); rcu_read_unlock();
- dst = ip6_dst_lookup_flow(sk, fl6, final_p); + dst = ip6_dst_lookup_flow(sock_net(sk), sk, fl6, final_p); if (!asoc || saddr) { t->dst = dst; memcpy(fl, &_fl, sizeof(_fl)); @@ -326,7 +326,7 @@ static void sctp_v6_get_dst(struct sctp_ fl6->saddr = laddr->a.v6.sin6_addr; fl6->fl6_sport = laddr->a.v6.sin6_port; final_p = fl6_update_dst(fl6, rcu_dereference(np->opt), &final); - bdst = ip6_dst_lookup_flow(sk, fl6, final_p); + bdst = ip6_dst_lookup_flow(sock_net(sk), sk, fl6, final_p);
if (IS_ERR(bdst)) continue;
From: Sabrina Dubroca sd@queasysnail.net
commit 6c8991f41546c3c472503dff1ea9daaddf9331c2 upstream.
ipv6_stub uses the ip6_dst_lookup function to allow other modules to perform IPv6 lookups. However, this function skips the XFRM layer entirely.
All users of ipv6_stub->ip6_dst_lookup use ip_route_output_flow (via the ip_route_output_key and ip_route_output helpers) for their IPv4 lookups, which calls xfrm_lookup_route(). This patch fixes this inconsistent behavior by switching the stub to ip6_dst_lookup_flow, which also calls xfrm_lookup_route().
This requires some changes in all the callers, as these two functions take different arguments and have different return types.
Fixes: 5f81bd2e5d80 ("ipv6: export a stub for IPv6 symbols used by vxlan") Reported-by: Xiumei Mu xmu@redhat.com Signed-off-by: Sabrina Dubroca sd@queasysnail.net Signed-off-by: David S. Miller davem@davemloft.net [bwh: Backported to 4.4: - Drop changes in lwt_bpf.c, mlx5, and rxe - Adjust filename, context, indentation] Signed-off-by: Ben Hutchings ben.hutchings@codethink.co.uk Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/infiniband/core/addr.c | 6 +++--- drivers/net/geneve.c | 4 +++- drivers/net/vxlan.c | 10 ++++------ include/net/addrconf.h | 6 ++++-- net/ipv6/addrconf_core.c | 11 ++++++----- net/ipv6/af_inet6.c | 2 +- net/mpls/af_mpls.c | 7 +++---- net/tipc/udp_media.c | 9 ++++++--- 8 files changed, 30 insertions(+), 25 deletions(-)
--- a/drivers/infiniband/core/addr.c +++ b/drivers/infiniband/core/addr.c @@ -293,9 +293,9 @@ static int addr6_resolve(struct sockaddr fl6.saddr = src_in->sin6_addr; fl6.flowi6_oif = addr->bound_dev_if;
- ret = ipv6_stub->ipv6_dst_lookup(addr->net, NULL, &dst, &fl6); - if (ret < 0) - goto put; + dst = ipv6_stub->ipv6_dst_lookup_flow(addr->net, NULL, &fl6, NULL); + if (IS_ERR(dst)) + return PTR_ERR(dst);
if (ipv6_addr_any(&fl6.saddr)) { ret = ipv6_dev_get_saddr(addr->net, ip6_dst_idev(dst)->dev, --- a/drivers/net/geneve.c +++ b/drivers/net/geneve.c @@ -781,7 +781,9 @@ static struct dst_entry *geneve_get_v6_d fl6->daddr = geneve->remote.sin6.sin6_addr; }
- if (ipv6_stub->ipv6_dst_lookup(geneve->net, gs6->sock->sk, &dst, fl6)) { + dst = ipv6_stub->ipv6_dst_lookup_flow(geneve->net, gs6->sock->sk, fl6, + NULL); + if (IS_ERR(dst)) { netdev_dbg(dev, "no route to %pI6\n", &fl6->daddr); return ERR_PTR(-ENETUNREACH); } --- a/drivers/net/vxlan.c +++ b/drivers/net/vxlan.c @@ -1864,7 +1864,6 @@ static struct dst_entry *vxlan6_get_rout { struct dst_entry *ndst; struct flowi6 fl6; - int err;
memset(&fl6, 0, sizeof(fl6)); fl6.flowi6_oif = oif; @@ -1873,11 +1872,10 @@ static struct dst_entry *vxlan6_get_rout fl6.flowi6_mark = skb->mark; fl6.flowi6_proto = IPPROTO_UDP;
- err = ipv6_stub->ipv6_dst_lookup(vxlan->net, - vxlan->vn6_sock->sock->sk, - &ndst, &fl6); - if (err < 0) - return ERR_PTR(err); + ndst = ipv6_stub->ipv6_dst_lookup_flow(vxlan->net, vxlan->vn6_sock->sock->sk, + &fl6, NULL); + if (unlikely(IS_ERR(ndst))) + return ERR_PTR(-ENETUNREACH);
*saddr = fl6.saddr; return ndst; --- a/include/net/addrconf.h +++ b/include/net/addrconf.h @@ -192,8 +192,10 @@ struct ipv6_stub { const struct in6_addr *addr); int (*ipv6_sock_mc_drop)(struct sock *sk, int ifindex, const struct in6_addr *addr); - int (*ipv6_dst_lookup)(struct net *net, struct sock *sk, - struct dst_entry **dst, struct flowi6 *fl6); + struct dst_entry *(*ipv6_dst_lookup_flow)(struct net *net, + const struct sock *sk, + struct flowi6 *fl6, + const struct in6_addr *final_dst); void (*udpv6_encap_enable)(void); void (*ndisc_send_na)(struct net_device *dev, const struct in6_addr *daddr, const struct in6_addr *solicited_addr, --- a/net/ipv6/addrconf_core.c +++ b/net/ipv6/addrconf_core.c @@ -107,15 +107,16 @@ int inet6addr_notifier_call_chain(unsign } EXPORT_SYMBOL(inet6addr_notifier_call_chain);
-static int eafnosupport_ipv6_dst_lookup(struct net *net, struct sock *u1, - struct dst_entry **u2, - struct flowi6 *u3) +static struct dst_entry *eafnosupport_ipv6_dst_lookup_flow(struct net *net, + const struct sock *sk, + struct flowi6 *fl6, + const struct in6_addr *final_dst) { - return -EAFNOSUPPORT; + return ERR_PTR(-EAFNOSUPPORT); }
const struct ipv6_stub *ipv6_stub __read_mostly = &(struct ipv6_stub) { - .ipv6_dst_lookup = eafnosupport_ipv6_dst_lookup, + .ipv6_dst_lookup_flow = eafnosupport_ipv6_dst_lookup_flow, }; EXPORT_SYMBOL_GPL(ipv6_stub);
--- a/net/ipv6/af_inet6.c +++ b/net/ipv6/af_inet6.c @@ -841,7 +841,7 @@ static struct pernet_operations inet6_ne static const struct ipv6_stub ipv6_stub_impl = { .ipv6_sock_mc_join = ipv6_sock_mc_join, .ipv6_sock_mc_drop = ipv6_sock_mc_drop, - .ipv6_dst_lookup = ip6_dst_lookup, + .ipv6_dst_lookup_flow = ip6_dst_lookup_flow, .udpv6_encap_enable = udpv6_encap_enable, .ndisc_send_na = ndisc_send_na, .nd_tbl = &nd_tbl, --- a/net/mpls/af_mpls.c +++ b/net/mpls/af_mpls.c @@ -470,16 +470,15 @@ static struct net_device *inet6_fib_look struct net_device *dev; struct dst_entry *dst; struct flowi6 fl6; - int err;
if (!ipv6_stub) return ERR_PTR(-EAFNOSUPPORT);
memset(&fl6, 0, sizeof(fl6)); memcpy(&fl6.daddr, addr, sizeof(struct in6_addr)); - err = ipv6_stub->ipv6_dst_lookup(net, NULL, &dst, &fl6); - if (err) - return ERR_PTR(err); + dst = ipv6_stub->ipv6_dst_lookup_flow(net, NULL, &fl6, NULL); + if (IS_ERR(dst)) + return ERR_CAST(dst);
dev = dst->dev; dev_hold(dev); --- a/net/tipc/udp_media.c +++ b/net/tipc/udp_media.c @@ -200,10 +200,13 @@ static int tipc_udp_send_msg(struct net .saddr = src->ipv6, .flowi6_proto = IPPROTO_UDP }; - err = ipv6_stub->ipv6_dst_lookup(net, ub->ubsock->sk, &ndst, - &fl6); - if (err) + ndst = ipv6_stub->ipv6_dst_lookup_flow(net, + ub->ubsock->sk, + &fl6, NULL); + if (IS_ERR(ndst)) { + err = PTR_ERR(ndst); goto tx_error; + } ttl = ip6_dst_hoplimit(ndst); err = udp_tunnel6_xmit_skb(ndst, ub->ubsock->sk, skb, ndst->dev, &src->ipv6,
From: Waiman Long longman@redhat.com
commit 5acb3cc2c2e9d3020a4fee43763c6463767f1572 upstream.
The lockdep code had reported the following unsafe locking scenario:
CPU0 CPU1 ---- ---- lock(s_active#228); lock(&bdev->bd_mutex/1); lock(s_active#228); lock(&bdev->bd_mutex);
*** DEADLOCK ***
The deadlock may happen when one task (CPU1) is trying to delete a partition in a block device and another task (CPU0) is accessing tracing sysfs file (e.g. /sys/block/dm-1/trace/act_mask) in that partition.
The s_active isn't an actual lock. It is a reference count (kn->count) on the sysfs (kernfs) file. Removal of a sysfs file, however, require a wait until all the references are gone. The reference count is treated like a rwsem using lockdep instrumentation code.
The fact that a thread is in the sysfs callback method or in the ioctl call means there is a reference to the opended sysfs or device file. That should prevent the underlying block structure from being removed.
Instead of using bd_mutex in the block_device structure, a new blk_trace_mutex is now added to the request_queue structure to protect access to the blk_trace structure.
Suggested-by: Christoph Hellwig hch@infradead.org Signed-off-by: Waiman Long longman@redhat.com Acked-by: Steven Rostedt (VMware) rostedt@goodmis.org
Fix typo in patch subject line, and prune a comment detailing how the code used to work.
Signed-off-by: Jens Axboe axboe@kernel.dk Signed-off-by: Ben Hutchings ben.hutchings@codethink.co.uk Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- block/blk-core.c | 3 +++ include/linux/blkdev.h | 1 + kernel/trace/blktrace.c | 18 ++++++++++++------ 3 files changed, 16 insertions(+), 6 deletions(-)
--- a/block/blk-core.c +++ b/block/blk-core.c @@ -719,6 +719,9 @@ struct request_queue *blk_alloc_queue_no
kobject_init(&q->kobj, &blk_queue_ktype);
+#ifdef CONFIG_BLK_DEV_IO_TRACE + mutex_init(&q->blk_trace_mutex); +#endif mutex_init(&q->sysfs_lock); spin_lock_init(&q->__queue_lock);
--- a/include/linux/blkdev.h +++ b/include/linux/blkdev.h @@ -432,6 +432,7 @@ struct request_queue { int node; #ifdef CONFIG_BLK_DEV_IO_TRACE struct blk_trace *blk_trace; + struct mutex blk_trace_mutex; #endif /* * for flush operations --- a/kernel/trace/blktrace.c +++ b/kernel/trace/blktrace.c @@ -644,6 +644,12 @@ int blk_trace_startstop(struct request_q } EXPORT_SYMBOL_GPL(blk_trace_startstop);
+/* + * When reading or writing the blktrace sysfs files, the references to the + * opened sysfs or device files should prevent the underlying block device + * from being removed. So no further delete protection is really needed. + */ + /** * blk_trace_ioctl: - handle the ioctls associated with tracing * @bdev: the block device @@ -661,7 +667,7 @@ int blk_trace_ioctl(struct block_device if (!q) return -ENXIO;
- mutex_lock(&bdev->bd_mutex); + mutex_lock(&q->blk_trace_mutex);
switch (cmd) { case BLKTRACESETUP: @@ -687,7 +693,7 @@ int blk_trace_ioctl(struct block_device break; }
- mutex_unlock(&bdev->bd_mutex); + mutex_unlock(&q->blk_trace_mutex); return ret; }
@@ -1652,7 +1658,7 @@ static ssize_t sysfs_blk_trace_attr_show if (q == NULL) goto out_bdput;
- mutex_lock(&bdev->bd_mutex); + mutex_lock(&q->blk_trace_mutex);
if (attr == &dev_attr_enable) { ret = sprintf(buf, "%u\n", !!q->blk_trace); @@ -1671,7 +1677,7 @@ static ssize_t sysfs_blk_trace_attr_show ret = sprintf(buf, "%llu\n", q->blk_trace->end_lba);
out_unlock_bdev: - mutex_unlock(&bdev->bd_mutex); + mutex_unlock(&q->blk_trace_mutex); out_bdput: bdput(bdev); out: @@ -1713,7 +1719,7 @@ static ssize_t sysfs_blk_trace_attr_stor if (q == NULL) goto out_bdput;
- mutex_lock(&bdev->bd_mutex); + mutex_lock(&q->blk_trace_mutex);
if (attr == &dev_attr_enable) { if (!!value == !!q->blk_trace) { @@ -1743,7 +1749,7 @@ static ssize_t sysfs_blk_trace_attr_stor }
out_unlock_bdev: - mutex_unlock(&bdev->bd_mutex); + mutex_unlock(&q->blk_trace_mutex); out_bdput: bdput(bdev); out:
From: Jens Axboe axboe@kernel.dk
commit 1f2cac107c591c24b60b115d6050adc213d10fc0 upstream.
sg.c calls into the blktrace functions without holding the proper queue mutex for doing setup, start/stop, or teardown.
Add internal unlocked variants, and export the ones that do the proper locking.
Fixes: 6da127ad0918 ("blktrace: Add blktrace ioctls to SCSI generic devices") Tested-by: Dmitry Vyukov dvyukov@google.com Signed-off-by: Jens Axboe axboe@kernel.dk Signed-off-by: Ben Hutchings ben.hutchings@codethink.co.uk Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- kernel/trace/blktrace.c | 58 +++++++++++++++++++++++++++++++++++++++--------- 1 file changed, 48 insertions(+), 10 deletions(-)
--- a/kernel/trace/blktrace.c +++ b/kernel/trace/blktrace.c @@ -323,7 +323,7 @@ static void blk_trace_cleanup(struct blk put_probe_ref(); }
-int blk_trace_remove(struct request_queue *q) +static int __blk_trace_remove(struct request_queue *q) { struct blk_trace *bt;
@@ -336,6 +336,17 @@ int blk_trace_remove(struct request_queu
return 0; } + +int blk_trace_remove(struct request_queue *q) +{ + int ret; + + mutex_lock(&q->blk_trace_mutex); + ret = __blk_trace_remove(q); + mutex_unlock(&q->blk_trace_mutex); + + return ret; +} EXPORT_SYMBOL_GPL(blk_trace_remove);
static ssize_t blk_dropped_read(struct file *filp, char __user *buffer, @@ -546,9 +557,8 @@ err: return ret; }
-int blk_trace_setup(struct request_queue *q, char *name, dev_t dev, - struct block_device *bdev, - char __user *arg) +static int __blk_trace_setup(struct request_queue *q, char *name, dev_t dev, + struct block_device *bdev, char __user *arg) { struct blk_user_trace_setup buts; int ret; @@ -567,6 +577,19 @@ int blk_trace_setup(struct request_queue } return 0; } + +int blk_trace_setup(struct request_queue *q, char *name, dev_t dev, + struct block_device *bdev, + char __user *arg) +{ + int ret; + + mutex_lock(&q->blk_trace_mutex); + ret = __blk_trace_setup(q, name, dev, bdev, arg); + mutex_unlock(&q->blk_trace_mutex); + + return ret; +} EXPORT_SYMBOL_GPL(blk_trace_setup);
#if defined(CONFIG_COMPAT) && defined(CONFIG_X86_64) @@ -603,7 +626,7 @@ static int compat_blk_trace_setup(struct } #endif
-int blk_trace_startstop(struct request_queue *q, int start) +static int __blk_trace_startstop(struct request_queue *q, int start) { int ret; struct blk_trace *bt = q->blk_trace; @@ -642,6 +665,17 @@ int blk_trace_startstop(struct request_q
return ret; } + +int blk_trace_startstop(struct request_queue *q, int start) +{ + int ret; + + mutex_lock(&q->blk_trace_mutex); + ret = __blk_trace_startstop(q, start); + mutex_unlock(&q->blk_trace_mutex); + + return ret; +} EXPORT_SYMBOL_GPL(blk_trace_startstop);
/* @@ -672,7 +706,7 @@ int blk_trace_ioctl(struct block_device switch (cmd) { case BLKTRACESETUP: bdevname(bdev, b); - ret = blk_trace_setup(q, b, bdev->bd_dev, bdev, arg); + ret = __blk_trace_setup(q, b, bdev->bd_dev, bdev, arg); break; #if defined(CONFIG_COMPAT) && defined(CONFIG_X86_64) case BLKTRACESETUP32: @@ -683,10 +717,10 @@ int blk_trace_ioctl(struct block_device case BLKTRACESTART: start = 1; case BLKTRACESTOP: - ret = blk_trace_startstop(q, start); + ret = __blk_trace_startstop(q, start); break; case BLKTRACETEARDOWN: - ret = blk_trace_remove(q); + ret = __blk_trace_remove(q); break; default: ret = -ENOTTY; @@ -704,10 +738,14 @@ int blk_trace_ioctl(struct block_device **/ void blk_trace_shutdown(struct request_queue *q) { + mutex_lock(&q->blk_trace_mutex); + if (q->blk_trace) { - blk_trace_startstop(q, 0); - blk_trace_remove(q); + __blk_trace_startstop(q, 0); + __blk_trace_remove(q); } + + mutex_unlock(&q->blk_trace_mutex); }
/*
From: Jens Axboe axboe@kernel.dk
commit 2967acbb257a6a9bf912f4778b727e00972eac9b upstream.
A previous commit changed the locking around registration/cleanup, but direct callers of blk_trace_remove() were missed. This means that if we hit the error path in setup, we will deadlock on attempting to re-acquire the queue trace mutex.
Fixes: 1f2cac107c59 ("blktrace: fix unlocked access to init/start-stop/teardown") Signed-off-by: Jens Axboe axboe@kernel.dk Signed-off-by: Ben Hutchings ben.hutchings@codethink.co.uk Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- kernel/trace/blktrace.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
--- a/kernel/trace/blktrace.c +++ b/kernel/trace/blktrace.c @@ -572,7 +572,7 @@ static int __blk_trace_setup(struct requ return ret;
if (copy_to_user(arg, &buts, sizeof(buts))) { - blk_trace_remove(q); + __blk_trace_remove(q); return -EFAULT; } return 0; @@ -618,7 +618,7 @@ static int compat_blk_trace_setup(struct return ret;
if (copy_to_user(arg, &buts.name, ARRAY_SIZE(buts.name))) { - blk_trace_remove(q); + __blk_trace_remove(q); return -EFAULT; }
From: Jan Kara jack@suse.cz
commit c780e86dd48ef6467a1146cf7d0fe1e05a635039 upstream.
KASAN is reporting that __blk_add_trace() has a use-after-free issue when accessing q->blk_trace. Indeed the switching of block tracing (and thus eventual freeing of q->blk_trace) is completely unsynchronized with the currently running tracing and thus it can happen that the blk_trace structure is being freed just while __blk_add_trace() works on it. Protect accesses to q->blk_trace by RCU during tracing and make sure we wait for the end of RCU grace period when shutting down tracing. Luckily that is rare enough event that we can afford that. Note that postponing the freeing of blk_trace to an RCU callback should better be avoided as it could have unexpected user visible side-effects as debugfs files would be still existing for a short while block tracing has been shut down.
Link: https://bugzilla.kernel.org/show_bug.cgi?id=205711 CC: stable@vger.kernel.org Reviewed-by: Chaitanya Kulkarni chaitanya.kulkarni@wdc.com Reviewed-by: Ming Lei ming.lei@redhat.com Tested-by: Ming Lei ming.lei@redhat.com Reviewed-by: Bart Van Assche bvanassche@acm.org Reported-by: Tristan Madani tristmd@gmail.com Signed-off-by: Jan Kara jack@suse.cz Signed-off-by: Jens Axboe axboe@kernel.dk [bwh: Backported to 4.4: - Drop changes in blk_trace_note_message_enabled(), blk_trace_bio_get_cgid() - Adjust context] Signed-off-by: Ben Hutchings ben.hutchings@codethink.co.uk Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- include/linux/blkdev.h | 2 include/linux/blktrace_api.h | 6 +- kernel/trace/blktrace.c | 110 +++++++++++++++++++++++++++++++------------ 3 files changed, 86 insertions(+), 32 deletions(-)
--- a/include/linux/blkdev.h +++ b/include/linux/blkdev.h @@ -431,7 +431,7 @@ struct request_queue { unsigned int sg_reserved_size; int node; #ifdef CONFIG_BLK_DEV_IO_TRACE - struct blk_trace *blk_trace; + struct blk_trace __rcu *blk_trace; struct mutex blk_trace_mutex; #endif /* --- a/include/linux/blktrace_api.h +++ b/include/linux/blktrace_api.h @@ -51,9 +51,13 @@ void __trace_note_message(struct blk_tra **/ #define blk_add_trace_msg(q, fmt, ...) \ do { \ - struct blk_trace *bt = (q)->blk_trace; \ + struct blk_trace *bt; \ + \ + rcu_read_lock(); \ + bt = rcu_dereference((q)->blk_trace); \ if (unlikely(bt)) \ __trace_note_message(bt, fmt, ##__VA_ARGS__); \ + rcu_read_unlock(); \ } while (0) #define BLK_TN_MAX_MSG 128
--- a/kernel/trace/blktrace.c +++ b/kernel/trace/blktrace.c @@ -319,6 +319,7 @@ static void put_probe_ref(void)
static void blk_trace_cleanup(struct blk_trace *bt) { + synchronize_rcu(); blk_trace_free(bt); put_probe_ref(); } @@ -629,8 +630,10 @@ static int compat_blk_trace_setup(struct static int __blk_trace_startstop(struct request_queue *q, int start) { int ret; - struct blk_trace *bt = q->blk_trace; + struct blk_trace *bt;
+ bt = rcu_dereference_protected(q->blk_trace, + lockdep_is_held(&q->blk_trace_mutex)); if (bt == NULL) return -EINVAL;
@@ -739,8 +742,8 @@ int blk_trace_ioctl(struct block_device void blk_trace_shutdown(struct request_queue *q) { mutex_lock(&q->blk_trace_mutex); - - if (q->blk_trace) { + if (rcu_dereference_protected(q->blk_trace, + lockdep_is_held(&q->blk_trace_mutex))) { __blk_trace_startstop(q, 0); __blk_trace_remove(q); } @@ -766,10 +769,14 @@ void blk_trace_shutdown(struct request_q static void blk_add_trace_rq(struct request_queue *q, struct request *rq, unsigned int nr_bytes, u32 what) { - struct blk_trace *bt = q->blk_trace; + struct blk_trace *bt;
- if (likely(!bt)) + rcu_read_lock(); + bt = rcu_dereference(q->blk_trace); + if (likely(!bt)) { + rcu_read_unlock(); return; + }
if (rq->cmd_type == REQ_TYPE_BLOCK_PC) { what |= BLK_TC_ACT(BLK_TC_PC); @@ -780,6 +787,7 @@ static void blk_add_trace_rq(struct requ __blk_add_trace(bt, blk_rq_pos(rq), nr_bytes, rq->cmd_flags, what, rq->errors, 0, NULL); } + rcu_read_unlock(); }
static void blk_add_trace_rq_abort(void *ignore, @@ -829,13 +837,18 @@ static void blk_add_trace_rq_complete(vo static void blk_add_trace_bio(struct request_queue *q, struct bio *bio, u32 what, int error) { - struct blk_trace *bt = q->blk_trace; + struct blk_trace *bt;
- if (likely(!bt)) + rcu_read_lock(); + bt = rcu_dereference(q->blk_trace); + if (likely(!bt)) { + rcu_read_unlock(); return; + }
__blk_add_trace(bt, bio->bi_iter.bi_sector, bio->bi_iter.bi_size, bio->bi_rw, what, error, 0, NULL); + rcu_read_unlock(); }
static void blk_add_trace_bio_bounce(void *ignore, @@ -880,10 +893,13 @@ static void blk_add_trace_getrq(void *ig if (bio) blk_add_trace_bio(q, bio, BLK_TA_GETRQ, 0); else { - struct blk_trace *bt = q->blk_trace; + struct blk_trace *bt;
+ rcu_read_lock(); + bt = rcu_dereference(q->blk_trace); if (bt) __blk_add_trace(bt, 0, 0, rw, BLK_TA_GETRQ, 0, 0, NULL); + rcu_read_unlock(); } }
@@ -895,27 +911,35 @@ static void blk_add_trace_sleeprq(void * if (bio) blk_add_trace_bio(q, bio, BLK_TA_SLEEPRQ, 0); else { - struct blk_trace *bt = q->blk_trace; + struct blk_trace *bt;
+ rcu_read_lock(); + bt = rcu_dereference(q->blk_trace); if (bt) __blk_add_trace(bt, 0, 0, rw, BLK_TA_SLEEPRQ, 0, 0, NULL); + rcu_read_unlock(); } }
static void blk_add_trace_plug(void *ignore, struct request_queue *q) { - struct blk_trace *bt = q->blk_trace; + struct blk_trace *bt;
+ rcu_read_lock(); + bt = rcu_dereference(q->blk_trace); if (bt) __blk_add_trace(bt, 0, 0, 0, BLK_TA_PLUG, 0, 0, NULL); + rcu_read_unlock(); }
static void blk_add_trace_unplug(void *ignore, struct request_queue *q, unsigned int depth, bool explicit) { - struct blk_trace *bt = q->blk_trace; + struct blk_trace *bt;
+ rcu_read_lock(); + bt = rcu_dereference(q->blk_trace); if (bt) { __be64 rpdu = cpu_to_be64(depth); u32 what; @@ -927,14 +951,17 @@ static void blk_add_trace_unplug(void *i
__blk_add_trace(bt, 0, 0, 0, what, 0, sizeof(rpdu), &rpdu); } + rcu_read_unlock(); }
static void blk_add_trace_split(void *ignore, struct request_queue *q, struct bio *bio, unsigned int pdu) { - struct blk_trace *bt = q->blk_trace; + struct blk_trace *bt;
+ rcu_read_lock(); + bt = rcu_dereference(q->blk_trace); if (bt) { __be64 rpdu = cpu_to_be64(pdu);
@@ -942,6 +969,7 @@ static void blk_add_trace_split(void *ig bio->bi_iter.bi_size, bio->bi_rw, BLK_TA_SPLIT, bio->bi_error, sizeof(rpdu), &rpdu); } + rcu_read_unlock(); }
/** @@ -961,11 +989,15 @@ static void blk_add_trace_bio_remap(void struct request_queue *q, struct bio *bio, dev_t dev, sector_t from) { - struct blk_trace *bt = q->blk_trace; + struct blk_trace *bt; struct blk_io_trace_remap r;
- if (likely(!bt)) + rcu_read_lock(); + bt = rcu_dereference(q->blk_trace); + if (likely(!bt)) { + rcu_read_unlock(); return; + }
r.device_from = cpu_to_be32(dev); r.device_to = cpu_to_be32(bio->bi_bdev->bd_dev); @@ -974,6 +1006,7 @@ static void blk_add_trace_bio_remap(void __blk_add_trace(bt, bio->bi_iter.bi_sector, bio->bi_iter.bi_size, bio->bi_rw, BLK_TA_REMAP, bio->bi_error, sizeof(r), &r); + rcu_read_unlock(); }
/** @@ -994,11 +1027,15 @@ static void blk_add_trace_rq_remap(void struct request *rq, dev_t dev, sector_t from) { - struct blk_trace *bt = q->blk_trace; + struct blk_trace *bt; struct blk_io_trace_remap r;
- if (likely(!bt)) + rcu_read_lock(); + bt = rcu_dereference(q->blk_trace); + if (likely(!bt)) { + rcu_read_unlock(); return; + }
r.device_from = cpu_to_be32(dev); r.device_to = cpu_to_be32(disk_devt(rq->rq_disk)); @@ -1007,6 +1044,7 @@ static void blk_add_trace_rq_remap(void __blk_add_trace(bt, blk_rq_pos(rq), blk_rq_bytes(rq), rq_data_dir(rq), BLK_TA_REMAP, !!rq->errors, sizeof(r), &r); + rcu_read_unlock(); }
/** @@ -1024,10 +1062,14 @@ void blk_add_driver_data(struct request_ struct request *rq, void *data, size_t len) { - struct blk_trace *bt = q->blk_trace; + struct blk_trace *bt;
- if (likely(!bt)) + rcu_read_lock(); + bt = rcu_dereference(q->blk_trace); + if (likely(!bt)) { + rcu_read_unlock(); return; + }
if (rq->cmd_type == REQ_TYPE_BLOCK_PC) __blk_add_trace(bt, 0, blk_rq_bytes(rq), 0, @@ -1035,6 +1077,7 @@ void blk_add_driver_data(struct request_ else __blk_add_trace(bt, blk_rq_pos(rq), blk_rq_bytes(rq), 0, BLK_TA_DRV_DATA, rq->errors, len, data); + rcu_read_unlock(); } EXPORT_SYMBOL_GPL(blk_add_driver_data);
@@ -1526,6 +1569,7 @@ static int blk_trace_remove_queue(struct return -EINVAL;
put_probe_ref(); + synchronize_rcu(); blk_trace_free(bt); return 0; } @@ -1686,6 +1730,7 @@ static ssize_t sysfs_blk_trace_attr_show struct hd_struct *p = dev_to_part(dev); struct request_queue *q; struct block_device *bdev; + struct blk_trace *bt; ssize_t ret = -ENXIO;
bdev = bdget(part_devt(p)); @@ -1698,21 +1743,23 @@ static ssize_t sysfs_blk_trace_attr_show
mutex_lock(&q->blk_trace_mutex);
+ bt = rcu_dereference_protected(q->blk_trace, + lockdep_is_held(&q->blk_trace_mutex)); if (attr == &dev_attr_enable) { - ret = sprintf(buf, "%u\n", !!q->blk_trace); + ret = sprintf(buf, "%u\n", !!bt); goto out_unlock_bdev; }
- if (q->blk_trace == NULL) + if (bt == NULL) ret = sprintf(buf, "disabled\n"); else if (attr == &dev_attr_act_mask) - ret = blk_trace_mask2str(buf, q->blk_trace->act_mask); + ret = blk_trace_mask2str(buf, bt->act_mask); else if (attr == &dev_attr_pid) - ret = sprintf(buf, "%u\n", q->blk_trace->pid); + ret = sprintf(buf, "%u\n", bt->pid); else if (attr == &dev_attr_start_lba) - ret = sprintf(buf, "%llu\n", q->blk_trace->start_lba); + ret = sprintf(buf, "%llu\n", bt->start_lba); else if (attr == &dev_attr_end_lba) - ret = sprintf(buf, "%llu\n", q->blk_trace->end_lba); + ret = sprintf(buf, "%llu\n", bt->end_lba);
out_unlock_bdev: mutex_unlock(&q->blk_trace_mutex); @@ -1729,6 +1776,7 @@ static ssize_t sysfs_blk_trace_attr_stor struct block_device *bdev; struct request_queue *q; struct hd_struct *p; + struct blk_trace *bt; u64 value; ssize_t ret = -EINVAL;
@@ -1759,8 +1807,10 @@ static ssize_t sysfs_blk_trace_attr_stor
mutex_lock(&q->blk_trace_mutex);
+ bt = rcu_dereference_protected(q->blk_trace, + lockdep_is_held(&q->blk_trace_mutex)); if (attr == &dev_attr_enable) { - if (!!value == !!q->blk_trace) { + if (!!value == !!bt) { ret = 0; goto out_unlock_bdev; } @@ -1772,18 +1822,18 @@ static ssize_t sysfs_blk_trace_attr_stor }
ret = 0; - if (q->blk_trace == NULL) + if (bt == NULL) ret = blk_trace_setup_queue(q, bdev);
if (ret == 0) { if (attr == &dev_attr_act_mask) - q->blk_trace->act_mask = value; + bt->act_mask = value; else if (attr == &dev_attr_pid) - q->blk_trace->pid = value; + bt->pid = value; else if (attr == &dev_attr_start_lba) - q->blk_trace->start_lba = value; + bt->start_lba = value; else if (attr == &dev_attr_end_lba) - q->blk_trace->end_lba = value; + bt->end_lba = value; }
out_unlock_bdev:
From: Cengiz Can cengiz@kernel.wtf
commit 153031a301bb07194e9c37466cfce8eacb977621 upstream.
There was a recent change in blktrace.c that added a RCU protection to `q->blk_trace` in order to fix a use-after-free issue during access.
However the change missed an edge case that can lead to dereferencing of `bt` pointer even when it's NULL:
Coverity static analyzer marked this as a FORWARD_NULL issue with CID 1460458.
``` /kernel/trace/blktrace.c: 1904 in sysfs_blk_trace_attr_store() 1898 ret = 0; 1899 if (bt == NULL) 1900 ret = blk_trace_setup_queue(q, bdev); 1901 1902 if (ret == 0) { 1903 if (attr == &dev_attr_act_mask)
CID 1460458: Null pointer dereferences (FORWARD_NULL) Dereferencing null pointer "bt".
1904 bt->act_mask = value; 1905 else if (attr == &dev_attr_pid) 1906 bt->pid = value; 1907 else if (attr == &dev_attr_start_lba) 1908 bt->start_lba = value; 1909 else if (attr == &dev_attr_end_lba) ```
Added a reassignment with RCU annotation to fix the issue.
Fixes: c780e86dd48 ("blktrace: Protect q->blk_trace with RCU") Reviewed-by: Ming Lei ming.lei@redhat.com Reviewed-by: Bob Liu bob.liu@oracle.com Reviewed-by: Steven Rostedt (VMware) rostedt@goodmis.org Signed-off-by: Cengiz Can cengiz@kernel.wtf Signed-off-by: Jens Axboe axboe@kernel.dk Signed-off-by: Ben Hutchings ben.hutchings@codethink.co.uk Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- kernel/trace/blktrace.c | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-)
--- a/kernel/trace/blktrace.c +++ b/kernel/trace/blktrace.c @@ -1822,8 +1822,11 @@ static ssize_t sysfs_blk_trace_attr_stor }
ret = 0; - if (bt == NULL) + if (bt == NULL) { ret = blk_trace_setup_queue(q, bdev); + bt = rcu_dereference_protected(q->blk_trace, + lockdep_is_held(&q->blk_trace_mutex)); + }
if (ret == 0) { if (attr == &dev_attr_act_mask)
From: Dmitry Torokhov dmitry.torokhov@gmail.com
commit 882f312dc0751c973db26478f07f082c584d16aa upstream.
We do not need explicitly call dev_set_drvdata(), as it is done for us by device_create().
Acked-by: Richard Cochran richardcochran@gmail.com Signed-off-by: Dmitry Torokhov dmitry.torokhov@gmail.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Ben Hutchings ben.hutchings@codethink.co.uk Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/ptp/ptp_clock.c | 2 -- 1 file changed, 2 deletions(-)
--- a/drivers/ptp/ptp_clock.c +++ b/drivers/ptp/ptp_clock.c @@ -220,8 +220,6 @@ struct ptp_clock *ptp_clock_register(str if (IS_ERR(ptp->dev)) goto no_device;
- dev_set_drvdata(ptp->dev, ptp); - err = ptp_populate_sysfs(ptp); if (err) goto no_sysfs;
From: Dmitry Torokhov dmitry.torokhov@gmail.com
commit af59e717d5ff9c8dbf9bcc581c0dfb3b2a9c9030 upstream.
Instead of creating selected attributes after the device is created (and after userspace potentially seen uevent), lets use attribute group is_visible() method to control which attributes are shown. This will allow us to create all attributes (except "pins" group, which will be taken care of later) before userspace gets notified about new ptp class device.
Signed-off-by: Dmitry Torokhov dmitry.torokhov@gmail.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Ben Hutchings ben.hutchings@codethink.co.uk Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/ptp/ptp_sysfs.c | 125 +++++++++++++++++++++--------------------------- 1 file changed, 55 insertions(+), 70 deletions(-)
--- a/drivers/ptp/ptp_sysfs.c +++ b/drivers/ptp/ptp_sysfs.c @@ -46,27 +46,6 @@ PTP_SHOW_INT(n_periodic_outputs, n_per_o PTP_SHOW_INT(n_programmable_pins, n_pins); PTP_SHOW_INT(pps_available, pps);
-static struct attribute *ptp_attrs[] = { - &dev_attr_clock_name.attr, - &dev_attr_max_adjustment.attr, - &dev_attr_n_alarms.attr, - &dev_attr_n_external_timestamps.attr, - &dev_attr_n_periodic_outputs.attr, - &dev_attr_n_programmable_pins.attr, - &dev_attr_pps_available.attr, - NULL, -}; - -static const struct attribute_group ptp_group = { - .attrs = ptp_attrs, -}; - -const struct attribute_group *ptp_groups[] = { - &ptp_group, - NULL, -}; - - static ssize_t extts_enable_store(struct device *dev, struct device_attribute *attr, const char *buf, size_t count) @@ -91,6 +70,7 @@ static ssize_t extts_enable_store(struct out: return err; } +static DEVICE_ATTR(extts_enable, 0220, NULL, extts_enable_store);
static ssize_t extts_fifo_show(struct device *dev, struct device_attribute *attr, char *page) @@ -124,6 +104,7 @@ out: mutex_unlock(&ptp->tsevq_mux); return cnt; } +static DEVICE_ATTR(fifo, 0444, extts_fifo_show, NULL);
static ssize_t period_store(struct device *dev, struct device_attribute *attr, @@ -151,6 +132,7 @@ static ssize_t period_store(struct devic out: return err; } +static DEVICE_ATTR(period, 0220, NULL, period_store);
static ssize_t pps_enable_store(struct device *dev, struct device_attribute *attr, @@ -177,6 +159,57 @@ static ssize_t pps_enable_store(struct d out: return err; } +static DEVICE_ATTR(pps_enable, 0220, NULL, pps_enable_store); + +static struct attribute *ptp_attrs[] = { + &dev_attr_clock_name.attr, + + &dev_attr_max_adjustment.attr, + &dev_attr_n_alarms.attr, + &dev_attr_n_external_timestamps.attr, + &dev_attr_n_periodic_outputs.attr, + &dev_attr_n_programmable_pins.attr, + &dev_attr_pps_available.attr, + + &dev_attr_extts_enable.attr, + &dev_attr_fifo.attr, + &dev_attr_period.attr, + &dev_attr_pps_enable.attr, + NULL +}; + +static umode_t ptp_is_attribute_visible(struct kobject *kobj, + struct attribute *attr, int n) +{ + struct device *dev = kobj_to_dev(kobj); + struct ptp_clock *ptp = dev_get_drvdata(dev); + struct ptp_clock_info *info = ptp->info; + umode_t mode = attr->mode; + + if (attr == &dev_attr_extts_enable.attr || + attr == &dev_attr_fifo.attr) { + if (!info->n_ext_ts) + mode = 0; + } else if (attr == &dev_attr_period.attr) { + if (!info->n_per_out) + mode = 0; + } else if (attr == &dev_attr_pps_enable.attr) { + if (!info->pps) + mode = 0; + } + + return mode; +} + +static const struct attribute_group ptp_group = { + .is_visible = ptp_is_attribute_visible, + .attrs = ptp_attrs, +}; + +const struct attribute_group *ptp_groups[] = { + &ptp_group, + NULL +};
static int ptp_pin_name2index(struct ptp_clock *ptp, const char *name) { @@ -235,26 +268,11 @@ static ssize_t ptp_pin_store(struct devi return count; }
-static DEVICE_ATTR(extts_enable, 0220, NULL, extts_enable_store); -static DEVICE_ATTR(fifo, 0444, extts_fifo_show, NULL); -static DEVICE_ATTR(period, 0220, NULL, period_store); -static DEVICE_ATTR(pps_enable, 0220, NULL, pps_enable_store); - int ptp_cleanup_sysfs(struct ptp_clock *ptp) { struct device *dev = ptp->dev; struct ptp_clock_info *info = ptp->info;
- if (info->n_ext_ts) { - device_remove_file(dev, &dev_attr_extts_enable); - device_remove_file(dev, &dev_attr_fifo); - } - if (info->n_per_out) - device_remove_file(dev, &dev_attr_period); - - if (info->pps) - device_remove_file(dev, &dev_attr_pps_enable); - if (info->n_pins) { sysfs_remove_group(&dev->kobj, &ptp->pin_attr_group); kfree(ptp->pin_attr); @@ -307,46 +325,13 @@ no_dev_attr:
int ptp_populate_sysfs(struct ptp_clock *ptp) { - struct device *dev = ptp->dev; struct ptp_clock_info *info = ptp->info; int err;
- if (info->n_ext_ts) { - err = device_create_file(dev, &dev_attr_extts_enable); - if (err) - goto out1; - err = device_create_file(dev, &dev_attr_fifo); - if (err) - goto out2; - } - if (info->n_per_out) { - err = device_create_file(dev, &dev_attr_period); - if (err) - goto out3; - } - if (info->pps) { - err = device_create_file(dev, &dev_attr_pps_enable); - if (err) - goto out4; - } if (info->n_pins) { err = ptp_populate_pins(ptp); if (err) - goto out5; + return err; } return 0; -out5: - if (info->pps) - device_remove_file(dev, &dev_attr_pps_enable); -out4: - if (info->n_per_out) - device_remove_file(dev, &dev_attr_period); -out3: - if (info->n_ext_ts) - device_remove_file(dev, &dev_attr_fifo); -out2: - if (info->n_ext_ts) - device_remove_file(dev, &dev_attr_extts_enable); -out1: - return err; }
From: Dmitry Torokhov dmitry.torokhov@gmail.com
commit 85a66e55019583da1e0f18706b7a8281c9f6de5b upstream.
Let's switch to using device_create_with_groups(), which will allow us to create "pins" attribute group together with the rest of ptp device attributes, and before userspace gets notified about ptp device creation.
Signed-off-by: Dmitry Torokhov dmitry.torokhov@gmail.com Signed-off-by: David S. Miller davem@davemloft.net [bwh: Backported to 4.9: adjust context] Signed-off-by: Ben Hutchings ben.hutchings@codethink.co.uk Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/ptp/ptp_clock.c | 20 +++++++++++--------- drivers/ptp/ptp_private.h | 7 ++++--- drivers/ptp/ptp_sysfs.c | 39 +++++++++------------------------------ 3 files changed, 24 insertions(+), 42 deletions(-)
--- a/drivers/ptp/ptp_clock.c +++ b/drivers/ptp/ptp_clock.c @@ -214,16 +214,17 @@ struct ptp_clock *ptp_clock_register(str mutex_init(&ptp->pincfg_mux); init_waitqueue_head(&ptp->tsev_wq);
+ err = ptp_populate_pin_groups(ptp); + if (err) + goto no_pin_groups; + /* Create a new device in our class. */ - ptp->dev = device_create(ptp_class, parent, ptp->devid, ptp, - "ptp%d", ptp->index); + ptp->dev = device_create_with_groups(ptp_class, parent, ptp->devid, + ptp, ptp->pin_attr_groups, + "ptp%d", ptp->index); if (IS_ERR(ptp->dev)) goto no_device;
- err = ptp_populate_sysfs(ptp); - if (err) - goto no_sysfs; - /* Register a new PPS source. */ if (info->pps) { struct pps_source_info pps; @@ -251,10 +252,10 @@ no_clock: if (ptp->pps_source) pps_unregister_source(ptp->pps_source); no_pps: - ptp_cleanup_sysfs(ptp); -no_sysfs: device_destroy(ptp_class, ptp->devid); no_device: + ptp_cleanup_pin_groups(ptp); +no_pin_groups: mutex_destroy(&ptp->tsevq_mux); mutex_destroy(&ptp->pincfg_mux); no_slot: @@ -272,8 +273,9 @@ int ptp_clock_unregister(struct ptp_cloc /* Release the clock's resources. */ if (ptp->pps_source) pps_unregister_source(ptp->pps_source); - ptp_cleanup_sysfs(ptp); + device_destroy(ptp_class, ptp->devid); + ptp_cleanup_pin_groups(ptp);
posix_clock_unregister(&ptp->clock); return 0; --- a/drivers/ptp/ptp_private.h +++ b/drivers/ptp/ptp_private.h @@ -54,6 +54,8 @@ struct ptp_clock { struct device_attribute *pin_dev_attr; struct attribute **pin_attr; struct attribute_group pin_attr_group; + /* 1st entry is a pointer to the real group, 2nd is NULL terminator */ + const struct attribute_group *pin_attr_groups[2]; };
/* @@ -94,8 +96,7 @@ uint ptp_poll(struct posix_clock *pc,
extern const struct attribute_group *ptp_groups[];
-int ptp_cleanup_sysfs(struct ptp_clock *ptp); - -int ptp_populate_sysfs(struct ptp_clock *ptp); +int ptp_populate_pin_groups(struct ptp_clock *ptp); +void ptp_cleanup_pin_groups(struct ptp_clock *ptp);
#endif --- a/drivers/ptp/ptp_sysfs.c +++ b/drivers/ptp/ptp_sysfs.c @@ -268,25 +268,14 @@ static ssize_t ptp_pin_store(struct devi return count; }
-int ptp_cleanup_sysfs(struct ptp_clock *ptp) +int ptp_populate_pin_groups(struct ptp_clock *ptp) { - struct device *dev = ptp->dev; - struct ptp_clock_info *info = ptp->info; - - if (info->n_pins) { - sysfs_remove_group(&dev->kobj, &ptp->pin_attr_group); - kfree(ptp->pin_attr); - kfree(ptp->pin_dev_attr); - } - return 0; -} - -static int ptp_populate_pins(struct ptp_clock *ptp) -{ - struct device *dev = ptp->dev; struct ptp_clock_info *info = ptp->info; int err = -ENOMEM, i, n_pins = info->n_pins;
+ if (!n_pins) + return 0; + ptp->pin_dev_attr = kzalloc(n_pins * sizeof(*ptp->pin_dev_attr), GFP_KERNEL); if (!ptp->pin_dev_attr) @@ -310,28 +299,18 @@ static int ptp_populate_pins(struct ptp_ ptp->pin_attr_group.name = "pins"; ptp->pin_attr_group.attrs = ptp->pin_attr;
- err = sysfs_create_group(&dev->kobj, &ptp->pin_attr_group); - if (err) - goto no_group; + ptp->pin_attr_groups[0] = &ptp->pin_attr_group; + return 0;
-no_group: - kfree(ptp->pin_attr); no_pin_attr: kfree(ptp->pin_dev_attr); no_dev_attr: return err; }
-int ptp_populate_sysfs(struct ptp_clock *ptp) +void ptp_cleanup_pin_groups(struct ptp_clock *ptp) { - struct ptp_clock_info *info = ptp->info; - int err; - - if (info->n_pins) { - err = ptp_populate_pins(ptp); - if (err) - return err; - } - return 0; + kfree(ptp->pin_attr); + kfree(ptp->pin_dev_attr); }
From: Logan Gunthorpe logang@deltatee.com
commit 233ed09d7fdacf592ee91e6c97ce5f4364fbe7c0 upstream.
Credit for this patch goes is shared with Dan Williams [1]. I've taken things one step further to make the helper function more useful and clean up calling code.
There's a common pattern in the kernel whereby a struct cdev is placed in a structure along side a struct device which manages the life-cycle of both. In the naive approach, the reference counting is broken and the struct device can free everything before the chardev code is entirely released.
Many developers have solved this problem by linking the internal kobjs in this fashion:
cdev.kobj.parent = &parent_dev.kobj;
The cdev code explicitly gets and puts a reference to it's kobj parent. So this seems like it was intended to be used this way. Dmitrty Torokhov first put this in place in 2012 with this commit:
2f0157f char_dev: pin parent kobject
and the first instance of the fix was then done in the input subsystem in the following commit:
4a215aa Input: fix use-after-free introduced with dynamic minor changes
Subsequently over the years, however, this issue seems to have tripped up multiple developers independently. For example, see these commits:
0d5b7da iio: Prevent race between IIO chardev opening and IIO device (by Lars-Peter Clausen in 2013)
ba0ef85 tpm: Fix initialization of the cdev (by Jason Gunthorpe in 2015)
5b28dde [media] media: fix use-after-free in cdev_put() when app exits after driver unbind (by Shauh Khan in 2016)
This technique is similarly done in at least 15 places within the kernel and probably should have been done so in another, at least, 5 places. The kobj line also looks very suspect in that one would not expect drivers to have to mess with kobject internals in this way. Even highly experienced kernel developers can be surprised by this code, as seen in [2].
To help alleviate this situation, and hopefully prevent future wasted effort on this problem, this patch introduces a helper function to register a char device along with its parent struct device. This creates a more regular API for tying a char device to its parent without the developer having to set members in the underlying kobject.
This patch introduce cdev_device_add and cdev_device_del which replaces a common pattern including setting the kobj parent, calling cdev_add and then calling device_add. It also introduces cdev_set_parent for the few cases that set the kobject parent without using device_add.
[1] https://lkml.org/lkml/2017/2/13/700 [2] https://lkml.org/lkml/2017/2/10/370
Signed-off-by: Logan Gunthorpe logang@deltatee.com Signed-off-by: Dan Williams dan.j.williams@intel.com Reviewed-by: Hans Verkuil hans.verkuil@cisco.com Reviewed-by: Alexandre Belloni alexandre.belloni@free-electrons.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Ben Hutchings ben.hutchings@codethink.co.uk Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/char_dev.c | 86 +++++++++++++++++++++++++++++++++++++++++++++++++++ include/linux/cdev.h | 5 ++ 2 files changed, 91 insertions(+)
--- a/fs/char_dev.c +++ b/fs/char_dev.c @@ -472,6 +472,85 @@ int cdev_add(struct cdev *p, dev_t dev, return 0; }
+/** + * cdev_set_parent() - set the parent kobject for a char device + * @p: the cdev structure + * @kobj: the kobject to take a reference to + * + * cdev_set_parent() sets a parent kobject which will be referenced + * appropriately so the parent is not freed before the cdev. This + * should be called before cdev_add. + */ +void cdev_set_parent(struct cdev *p, struct kobject *kobj) +{ + WARN_ON(!kobj->state_initialized); + p->kobj.parent = kobj; +} + +/** + * cdev_device_add() - add a char device and it's corresponding + * struct device, linkink + * @dev: the device structure + * @cdev: the cdev structure + * + * cdev_device_add() adds the char device represented by @cdev to the system, + * just as cdev_add does. It then adds @dev to the system using device_add + * The dev_t for the char device will be taken from the struct device which + * needs to be initialized first. This helper function correctly takes a + * reference to the parent device so the parent will not get released until + * all references to the cdev are released. + * + * This helper uses dev->devt for the device number. If it is not set + * it will not add the cdev and it will be equivalent to device_add. + * + * This function should be used whenever the struct cdev and the + * struct device are members of the same structure whose lifetime is + * managed by the struct device. + * + * NOTE: Callers must assume that userspace was able to open the cdev and + * can call cdev fops callbacks at any time, even if this function fails. + */ +int cdev_device_add(struct cdev *cdev, struct device *dev) +{ + int rc = 0; + + if (dev->devt) { + cdev_set_parent(cdev, &dev->kobj); + + rc = cdev_add(cdev, dev->devt, 1); + if (rc) + return rc; + } + + rc = device_add(dev); + if (rc) + cdev_del(cdev); + + return rc; +} + +/** + * cdev_device_del() - inverse of cdev_device_add + * @dev: the device structure + * @cdev: the cdev structure + * + * cdev_device_del() is a helper function to call cdev_del and device_del. + * It should be used whenever cdev_device_add is used. + * + * If dev->devt is not set it will not remove the cdev and will be equivalent + * to device_del. + * + * NOTE: This guarantees that associated sysfs callbacks are not running + * or runnable, however any cdevs already open will remain and their fops + * will still be callable even after this function returns. + */ +void cdev_device_del(struct cdev *cdev, struct device *dev) +{ + device_del(dev); + if (dev->devt) + cdev_del(cdev); +} + static void cdev_unmap(dev_t dev, unsigned count) { kobj_unmap(cdev_map, dev, count); @@ -483,6 +562,10 @@ static void cdev_unmap(dev_t dev, unsign * * cdev_del() removes @p from the system, possibly freeing the structure * itself. + * + * NOTE: This guarantees that cdev device will no longer be able to be + * opened, however any cdevs already open will remain and their fops will + * still be callable even after cdev_del returns. */ void cdev_del(struct cdev *p) { @@ -571,5 +654,8 @@ EXPORT_SYMBOL(cdev_init); EXPORT_SYMBOL(cdev_alloc); EXPORT_SYMBOL(cdev_del); EXPORT_SYMBOL(cdev_add); +EXPORT_SYMBOL(cdev_set_parent); +EXPORT_SYMBOL(cdev_device_add); +EXPORT_SYMBOL(cdev_device_del); EXPORT_SYMBOL(__register_chrdev); EXPORT_SYMBOL(__unregister_chrdev); --- a/include/linux/cdev.h +++ b/include/linux/cdev.h @@ -4,6 +4,7 @@ #include <linux/kobject.h> #include <linux/kdev_t.h> #include <linux/list.h> +#include <linux/device.h>
struct file_operations; struct inode; @@ -26,6 +27,10 @@ void cdev_put(struct cdev *p);
int cdev_add(struct cdev *, dev_t, unsigned);
+void cdev_set_parent(struct cdev *p, struct kobject *kobj); +int cdev_device_add(struct cdev *cdev, struct device *dev); +void cdev_device_del(struct cdev *cdev, struct device *dev); + void cdev_del(struct cdev *);
void cd_forget(struct inode *);
From: YueHaibing yuehaibing@huawei.com
commit aea0a897af9e44c258e8ab9296fad417f1bc063a upstream.
Fix smatch warning:
drivers/ptp/ptp_clock.c:298 ptp_clock_register() warn: passing zero to 'ERR_PTR'
'err' should be set while device_create_with_groups and pps_register_source fails
Fixes: 85a66e550195 ("ptp: create "pins" together with the rest of attributes") Signed-off-by: YueHaibing yuehaibing@huawei.com Acked-by: Richard Cochran richardcochran@gmail.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Ben Hutchings ben.hutchings@codethink.co.uk Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/ptp/ptp_clock.c | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-)
--- a/drivers/ptp/ptp_clock.c +++ b/drivers/ptp/ptp_clock.c @@ -222,8 +222,10 @@ struct ptp_clock *ptp_clock_register(str ptp->dev = device_create_with_groups(ptp_class, parent, ptp->devid, ptp, ptp->pin_attr_groups, "ptp%d", ptp->index); - if (IS_ERR(ptp->dev)) + if (IS_ERR(ptp->dev)) { + err = PTR_ERR(ptp->dev); goto no_device; + }
/* Register a new PPS source. */ if (info->pps) { @@ -234,6 +236,7 @@ struct ptp_clock *ptp_clock_register(str pps.owner = info->owner; ptp->pps_source = pps_register_source(&pps, PTP_PPS_DEFAULTS); if (!ptp->pps_source) { + err = -EINVAL; pr_err("failed to register pps source\n"); goto no_pps; }
From: Vladis Dronov vdronov@redhat.com
commit a33121e5487b424339636b25c35d3a180eaa5f5e upstream.
In a case when a ptp chardev (like /dev/ptp0) is open but an underlying device is removed, closing this file leads to a race. This reproduces easily in a kvm virtual machine:
ts# cat openptp0.c int main() { ... fp = fopen("/dev/ptp0", "r"); ... sleep(10); } ts# uname -r 5.5.0-rc3-46cf053e ts# cat /proc/cmdline ... slub_debug=FZP ts# modprobe ptp_kvm ts# ./openptp0 & [1] 670 opened /dev/ptp0, sleeping 10s... ts# rmmod ptp_kvm ts# ls /dev/ptp* ls: cannot access '/dev/ptp*': No such file or directory ts# ...woken up [ 48.010809] general protection fault: 0000 [#1] SMP [ 48.012502] CPU: 6 PID: 658 Comm: openptp0 Not tainted 5.5.0-rc3-46cf053e #25 [ 48.014624] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), ... [ 48.016270] RIP: 0010:module_put.part.0+0x7/0x80 [ 48.017939] RSP: 0018:ffffb3850073be00 EFLAGS: 00010202 [ 48.018339] RAX: 000000006b6b6b6b RBX: 6b6b6b6b6b6b6b6b RCX: ffff89a476c00ad0 [ 48.018936] RDX: fffff65a08d3ea08 RSI: 0000000000000247 RDI: 6b6b6b6b6b6b6b6b [ 48.019470] ... ^^^ a slub poison [ 48.023854] Call Trace: [ 48.024050] __fput+0x21f/0x240 [ 48.024288] task_work_run+0x79/0x90 [ 48.024555] do_exit+0x2af/0xab0 [ 48.024799] ? vfs_write+0x16a/0x190 [ 48.025082] do_group_exit+0x35/0x90 [ 48.025387] __x64_sys_exit_group+0xf/0x10 [ 48.025737] do_syscall_64+0x3d/0x130 [ 48.026056] entry_SYSCALL_64_after_hwframe+0x44/0xa9 [ 48.026479] RIP: 0033:0x7f53b12082f6 [ 48.026792] ... [ 48.030945] Modules linked in: ptp i6300esb watchdog [last unloaded: ptp_kvm] [ 48.045001] Fixing recursive fault but reboot is needed!
This happens in:
static void __fput(struct file *file) { ... if (file->f_op->release) file->f_op->release(inode, file); <<< cdev is kfree'd here if (unlikely(S_ISCHR(inode->i_mode) && inode->i_cdev != NULL && !(mode & FMODE_PATH))) { cdev_put(inode->i_cdev); <<< cdev fields are accessed here
Namely:
__fput() posix_clock_release() kref_put(&clk->kref, delete_clock) <<< the last reference delete_clock() delete_ptp_clock() kfree(ptp) <<< cdev is embedded in ptp cdev_put module_put(p->owner) <<< *p is kfree'd, bang!
Here cdev is embedded in posix_clock which is embedded in ptp_clock. The race happens because ptp_clock's lifetime is controlled by two refcounts: kref and cdev.kobj in posix_clock. This is wrong.
Make ptp_clock's sysfs device a parent of cdev with cdev_device_add() created especially for such cases. This way the parent device with its ptp_clock is not released until all references to the cdev are released. This adds a requirement that an initialized but not exposed struct device should be provided to posix_clock_register() by a caller instead of a simple dev_t.
This approach was adopted from the commit 72139dfa2464 ("watchdog: Fix the race between the release of watchdog_core_data and cdev"). See details of the implementation in the commit 233ed09d7fda ("chardev: add helper function to register char devs with a struct device").
Link: https://lore.kernel.org/linux-fsdevel/20191125125342.6189-1-vdronov@redhat.c... Analyzed-by: Stephen Johnston sjohnsto@redhat.com Analyzed-by: Vern Lovejoy vlovejoy@redhat.com Signed-off-by: Vladis Dronov vdronov@redhat.com Acked-by: Richard Cochran richardcochran@gmail.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Ben Hutchings ben.hutchings@codethink.co.uk Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/ptp/ptp_clock.c | 31 ++++++++++++++----------------- drivers/ptp/ptp_private.h | 2 +- include/linux/posix-clock.h | 19 +++++++++++-------- kernel/time/posix-clock.c | 31 +++++++++++++------------------ 4 files changed, 39 insertions(+), 44 deletions(-)
--- a/drivers/ptp/ptp_clock.c +++ b/drivers/ptp/ptp_clock.c @@ -171,9 +171,9 @@ static struct posix_clock_operations ptp .read = ptp_read, };
-static void delete_ptp_clock(struct posix_clock *pc) +static void ptp_clock_release(struct device *dev) { - struct ptp_clock *ptp = container_of(pc, struct ptp_clock, clock); + struct ptp_clock *ptp = container_of(dev, struct ptp_clock, dev);
mutex_destroy(&ptp->tsevq_mux); mutex_destroy(&ptp->pincfg_mux); @@ -205,7 +205,6 @@ struct ptp_clock *ptp_clock_register(str }
ptp->clock.ops = ptp_clock_ops; - ptp->clock.release = delete_ptp_clock; ptp->info = info; ptp->devid = MKDEV(major, index); ptp->index = index; @@ -218,15 +217,6 @@ struct ptp_clock *ptp_clock_register(str if (err) goto no_pin_groups;
- /* Create a new device in our class. */ - ptp->dev = device_create_with_groups(ptp_class, parent, ptp->devid, - ptp, ptp->pin_attr_groups, - "ptp%d", ptp->index); - if (IS_ERR(ptp->dev)) { - err = PTR_ERR(ptp->dev); - goto no_device; - } - /* Register a new PPS source. */ if (info->pps) { struct pps_source_info pps; @@ -242,8 +232,18 @@ struct ptp_clock *ptp_clock_register(str } }
- /* Create a posix clock. */ - err = posix_clock_register(&ptp->clock, ptp->devid); + /* Initialize a new device of our class in our clock structure. */ + device_initialize(&ptp->dev); + ptp->dev.devt = ptp->devid; + ptp->dev.class = ptp_class; + ptp->dev.parent = parent; + ptp->dev.groups = ptp->pin_attr_groups; + ptp->dev.release = ptp_clock_release; + dev_set_drvdata(&ptp->dev, ptp); + dev_set_name(&ptp->dev, "ptp%d", ptp->index); + + /* Create a posix clock and link it to the device. */ + err = posix_clock_register(&ptp->clock, &ptp->dev); if (err) { pr_err("failed to create posix clock\n"); goto no_clock; @@ -255,8 +255,6 @@ no_clock: if (ptp->pps_source) pps_unregister_source(ptp->pps_source); no_pps: - device_destroy(ptp_class, ptp->devid); -no_device: ptp_cleanup_pin_groups(ptp); no_pin_groups: mutex_destroy(&ptp->tsevq_mux); @@ -277,7 +275,6 @@ int ptp_clock_unregister(struct ptp_cloc if (ptp->pps_source) pps_unregister_source(ptp->pps_source);
- device_destroy(ptp_class, ptp->devid); ptp_cleanup_pin_groups(ptp);
posix_clock_unregister(&ptp->clock); --- a/drivers/ptp/ptp_private.h +++ b/drivers/ptp/ptp_private.h @@ -40,7 +40,7 @@ struct timestamp_event_queue {
struct ptp_clock { struct posix_clock clock; - struct device *dev; + struct device dev; struct ptp_clock_info *info; dev_t devid; int index; /* index into clocks.map */ --- a/include/linux/posix-clock.h +++ b/include/linux/posix-clock.h @@ -104,29 +104,32 @@ struct posix_clock_operations { * * @ops: Functional interface to the clock * @cdev: Character device instance for this clock - * @kref: Reference count. + * @dev: Pointer to the clock's device. * @rwsem: Protects the 'zombie' field from concurrent access. * @zombie: If 'zombie' is true, then the hardware has disappeared. - * @release: A function to free the structure when the reference count reaches - * zero. May be NULL if structure is statically allocated. * * Drivers should embed their struct posix_clock within a private * structure, obtaining a reference to it during callbacks using * container_of(). + * + * Drivers should supply an initialized but not exposed struct device + * to posix_clock_register(). It is used to manage lifetime of the + * driver's private structure. It's 'release' field should be set to + * a release function for this private structure. */ struct posix_clock { struct posix_clock_operations ops; struct cdev cdev; - struct kref kref; + struct device *dev; struct rw_semaphore rwsem; bool zombie; - void (*release)(struct posix_clock *clk); };
/** * posix_clock_register() - register a new clock - * @clk: Pointer to the clock. Caller must provide 'ops' and 'release' - * @devid: Allocated device id + * @clk: Pointer to the clock. Caller must provide 'ops' field + * @dev: Pointer to the initialized device. Caller must provide + * 'release' field * * A clock driver calls this function to register itself with the * clock device subsystem. If 'clk' points to dynamically allocated @@ -135,7 +138,7 @@ struct posix_clock { * * Returns zero on success, non-zero otherwise. */ -int posix_clock_register(struct posix_clock *clk, dev_t devid); +int posix_clock_register(struct posix_clock *clk, struct device *dev);
/** * posix_clock_unregister() - unregister a clock --- a/kernel/time/posix-clock.c +++ b/kernel/time/posix-clock.c @@ -25,8 +25,6 @@ #include <linux/syscalls.h> #include <linux/uaccess.h>
-static void delete_clock(struct kref *kref); - /* * Returns NULL if the posix_clock instance attached to 'fp' is old and stale. */ @@ -168,7 +166,7 @@ static int posix_clock_open(struct inode err = 0;
if (!err) { - kref_get(&clk->kref); + get_device(clk->dev); fp->private_data = clk; } out: @@ -184,7 +182,7 @@ static int posix_clock_release(struct in if (clk->ops.release) err = clk->ops.release(clk);
- kref_put(&clk->kref, delete_clock); + put_device(clk->dev);
fp->private_data = NULL;
@@ -206,38 +204,35 @@ static const struct file_operations posi #endif };
-int posix_clock_register(struct posix_clock *clk, dev_t devid) +int posix_clock_register(struct posix_clock *clk, struct device *dev) { int err;
- kref_init(&clk->kref); init_rwsem(&clk->rwsem);
cdev_init(&clk->cdev, &posix_clock_file_operations); + err = cdev_device_add(&clk->cdev, dev); + if (err) { + pr_err("%s unable to add device %d:%d\n", + dev_name(dev), MAJOR(dev->devt), MINOR(dev->devt)); + return err; + } clk->cdev.owner = clk->ops.owner; - err = cdev_add(&clk->cdev, devid, 1); + clk->dev = dev;
- return err; + return 0; } EXPORT_SYMBOL_GPL(posix_clock_register);
-static void delete_clock(struct kref *kref) -{ - struct posix_clock *clk = container_of(kref, struct posix_clock, kref); - - if (clk->release) - clk->release(clk); -} - void posix_clock_unregister(struct posix_clock *clk) { - cdev_del(&clk->cdev); + cdev_device_del(&clk->cdev, clk->dev);
down_write(&clk->rwsem); clk->zombie = true; up_write(&clk->rwsem);
- kref_put(&clk->kref, delete_clock); + put_device(clk->dev); } EXPORT_SYMBOL_GPL(posix_clock_unregister);
From: Vladis Dronov vdronov@redhat.com
commit 75718584cb3c64e6269109d4d54f888ac5a5fd15 upstream.
There is a bug in ptp_clock_unregister(), where ptp_cleanup_pin_groups() first frees ptp->pin_{,dev_}attr, but then posix_clock_unregister() needs them to destroy a related sysfs device.
These functions can not be just swapped, as posix_clock_unregister() frees ptp which is needed in the ptp_cleanup_pin_groups(). Fix this by calling ptp_cleanup_pin_groups() in ptp_clock_release(), right before ptp is freed.
This makes this patch fix an UAF bug in a patch which fixes an UAF bug.
Reported-by: Antti Laakso antti.laakso@intel.com Fixes: a33121e5487b ("ptp: fix the race between the release of ptp_clock and cdev") Link: https://lore.kernel.org/netdev/3d2bd09735dbdaf003585ca376b7c1e5b69a19bd.came... Signed-off-by: Vladis Dronov vdronov@redhat.com Acked-by: Richard Cochran richardcochran@gmail.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Ben Hutchings ben.hutchings@codethink.co.uk Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/ptp/ptp_clock.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
--- a/drivers/ptp/ptp_clock.c +++ b/drivers/ptp/ptp_clock.c @@ -175,6 +175,7 @@ static void ptp_clock_release(struct dev { struct ptp_clock *ptp = container_of(dev, struct ptp_clock, dev);
+ ptp_cleanup_pin_groups(ptp); mutex_destroy(&ptp->tsevq_mux); mutex_destroy(&ptp->pincfg_mux); ida_simple_remove(&ptp_clocks_map, ptp->index); @@ -275,9 +276,8 @@ int ptp_clock_unregister(struct ptp_cloc if (ptp->pps_source) pps_unregister_source(ptp->pps_source);
- ptp_cleanup_pin_groups(ptp); - posix_clock_unregister(&ptp->clock); + return 0; } EXPORT_SYMBOL(ptp_clock_unregister);
From: David Ahern dsa@cumulusnetworks.com
commit 79dc7e3f1cd323be4c81aa1a94faa1b3ed987fb2 upstream.
Andrey reported the following while fuzzing the kernel with syzkaller:
kasan: CONFIG_KASAN_INLINE enabled kasan: GPF could be caused by NULL-ptr deref or user memory access general protection fault: 0000 [#1] SMP KASAN Modules linked in: CPU: 0 PID: 3859 Comm: a.out Not tainted 4.9.0-rc6+ #429 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011 task: ffff8800666d4200 task.stack: ffff880067348000 RIP: 0010:[<ffffffff833617ec>] [<ffffffff833617ec>] icmp6_send+0x5fc/0x1e30 net/ipv6/icmp.c:451 RSP: 0018:ffff88006734f2c0 EFLAGS: 00010206 RAX: ffff8800666d4200 RBX: 0000000000000000 RCX: 0000000000000000 RDX: 0000000000000000 RSI: dffffc0000000000 RDI: 0000000000000018 RBP: ffff88006734f630 R08: ffff880064138418 R09: 0000000000000003 R10: dffffc0000000000 R11: 0000000000000005 R12: 0000000000000000 R13: ffffffff84e7e200 R14: ffff880064138484 R15: ffff8800641383c0 FS: 00007fb3887a07c0(0000) GS:ffff88006cc00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000020000000 CR3: 000000006b040000 CR4: 00000000000006f0 Stack: ffff8800666d4200 ffff8800666d49f8 ffff8800666d4200 ffffffff84c02460 ffff8800666d4a1a 1ffff1000ccdaa2f ffff88006734f498 0000000000000046 ffff88006734f440 ffffffff832f4269 ffff880064ba7456 0000000000000000 Call Trace: [<ffffffff83364ddc>] icmpv6_param_prob+0x2c/0x40 net/ipv6/icmp.c:557 [< inline >] ip6_tlvopt_unknown net/ipv6/exthdrs.c:88 [<ffffffff83394405>] ip6_parse_tlv+0x555/0x670 net/ipv6/exthdrs.c:157 [<ffffffff8339a759>] ipv6_parse_hopopts+0x199/0x460 net/ipv6/exthdrs.c:663 [<ffffffff832ee773>] ipv6_rcv+0xfa3/0x1dc0 net/ipv6/ip6_input.c:191 ...
icmp6_send / icmpv6_send is invoked for both rx and tx paths. In both cases the dst->dev should be preferred for determining the L3 domain if the dst has been set on the skb. Fallback to the skb->dev if it has not. This covers the case reported here where icmp6_send is invoked on Rx before the route lookup.
Fixes: 5d41ce29e ("net: icmp6_send should use dst dev to determine L3 domain") Reported-by: Andrey Konovalov andreyknvl@google.com Signed-off-by: David Ahern dsa@cumulusnetworks.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Nobuhiro Iwamatsu (CIP) nobuhiro1.iwamatsu@toshiba.co.jp Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- net/ipv6/icmp.c | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-)
--- a/net/ipv6/icmp.c +++ b/net/ipv6/icmp.c @@ -445,8 +445,10 @@ static void icmp6_send(struct sk_buff *s
if (__ipv6_addr_needs_scope_id(addr_type)) iif = skb->dev->ifindex; - else - iif = l3mdev_master_ifindex(skb_dst(skb)->dev); + else { + dst = skb_dst(skb); + iif = l3mdev_master_ifindex(dst ? dst->dev : skb->dev); + }
/* * Must not send error if the source does not uniquely
From: Christophe JAILLET christophe.jaillet@wanadoo.fr
[ Upstream commit 10e3cc180e64385edc9890c6855acf5ed9ca1339 ]
A call to 'dma_alloc_coherent()' is hidden in 'sonic_alloc_descriptors()', called from 'sonic_probe1()'.
This is correctly freed in the remove function, but not in the error handling path of the probe function. Fix it and add the missing 'dma_free_coherent()' call.
While at it, rename a label in order to be slightly more informative.
Fixes: efcce839360f ("[PATCH] macsonic/jazzsonic network drivers update") Signed-off-by: Christophe JAILLET christophe.jaillet@wanadoo.fr Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/natsemi/jazzsonic.c | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/drivers/net/ethernet/natsemi/jazzsonic.c b/drivers/net/ethernet/natsemi/jazzsonic.c index acf3f11e38cc1..68d2f31921ff8 100644 --- a/drivers/net/ethernet/natsemi/jazzsonic.c +++ b/drivers/net/ethernet/natsemi/jazzsonic.c @@ -247,13 +247,15 @@ static int jazz_sonic_probe(struct platform_device *pdev) goto out; err = register_netdev(dev); if (err) - goto out1; + goto undo_probe1;
printk("%s: MAC %pM IRQ %d\n", dev->name, dev->dev_addr, dev->irq);
return 0;
-out1: +undo_probe1: + dma_free_coherent(lp->device, SIZEOF_SONIC_DESC * SONIC_BUS_SCALE(lp->dma_bitmode), + lp->descriptors, lp->descriptors_laddr); release_mem_region(dev->base_addr, SONIC_MEM_SIZE); out: free_netdev(dev);
From: Christophe JAILLET christophe.jaillet@wanadoo.fr
[ Upstream commit ee8d2267f0e39a1bfd95532da3a6405004114b27 ]
Should an irq requested with 'devm_request_irq' be released explicitly, it should be done by 'devm_free_irq()', not 'free_irq()'.
Fixes: 6c821bd9edc9 ("net: Add MOXA ART SoCs ethernet driver") Signed-off-by: Christophe JAILLET christophe.jaillet@wanadoo.fr Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/moxa/moxart_ether.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/moxa/moxart_ether.c b/drivers/net/ethernet/moxa/moxart_ether.c index f1dde59c9fa6c..374e691b11da6 100644 --- a/drivers/net/ethernet/moxa/moxart_ether.c +++ b/drivers/net/ethernet/moxa/moxart_ether.c @@ -541,7 +541,7 @@ static int moxart_remove(struct platform_device *pdev) struct net_device *ndev = platform_get_drvdata(pdev);
unregister_netdev(ndev); - free_irq(ndev->irq, ndev); + devm_free_irq(&pdev->dev, ndev->irq, ndev); moxart_mac_free_memory(ndev); free_netdev(ndev);
From: Arnd Bergmann arnd@arndb.de
[ Upstream commit dc30b4059f6e2abf3712ab537c8718562b21c45d ]
The current gcc-10 snapshot produces a false-positive warning:
net/core/drop_monitor.c: In function 'trace_drop_common.constprop': cc1: error: writing 8 bytes into a region of size 0 [-Werror=stringop-overflow=] In file included from net/core/drop_monitor.c:23: include/uapi/linux/net_dropmon.h:36:8: note: at offset 0 to object 'entries' with size 4 declared here 36 | __u32 entries; | ^~~~~~~
I reported this in the gcc bugzilla, but in case it does not get fixed in the release, work around it by using a temporary variable.
Fixes: 9a8afc8d3962 ("Network Drop Monitor: Adding drop monitor implementation & Netlink protocol") Link: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94881 Signed-off-by: Arnd Bergmann arnd@arndb.de Acked-by: Neil Horman nhorman@tuxdriver.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- net/core/drop_monitor.c | 11 +++++++---- 1 file changed, 7 insertions(+), 4 deletions(-)
diff --git a/net/core/drop_monitor.c b/net/core/drop_monitor.c index a2270188b8649..9bcc6fdade3eb 100644 --- a/net/core/drop_monitor.c +++ b/net/core/drop_monitor.c @@ -159,6 +159,7 @@ static void sched_send_work(unsigned long _data) static void trace_drop_common(struct sk_buff *skb, void *location) { struct net_dm_alert_msg *msg; + struct net_dm_drop_point *point; struct nlmsghdr *nlh; struct nlattr *nla; int i; @@ -177,11 +178,13 @@ static void trace_drop_common(struct sk_buff *skb, void *location) nlh = (struct nlmsghdr *)dskb->data; nla = genlmsg_data(nlmsg_data(nlh)); msg = nla_data(nla); + point = msg->points; for (i = 0; i < msg->entries; i++) { - if (!memcmp(&location, msg->points[i].pc, sizeof(void *))) { - msg->points[i].count++; + if (!memcmp(&location, &point->pc, sizeof(void *))) { + point->count++; goto out; } + point++; } if (msg->entries == dm_hit_limit) goto out; @@ -190,8 +193,8 @@ static void trace_drop_common(struct sk_buff *skb, void *location) */ __nla_reserve_nohdr(dskb, sizeof(struct net_dm_drop_point)); nla->nla_len += NLA_ALIGN(sizeof(struct net_dm_drop_point)); - memcpy(msg->points[msg->entries].pc, &location, sizeof(void *)); - msg->points[msg->entries].count = 1; + memcpy(point->pc, &location, sizeof(void *)); + point->count = 1; msg->entries++;
if (!timer_pending(&data->send_timer)) {
From: Wu Bo wubo40@huawei.com
commit 83c6f2390040f188cc25b270b4befeb5628c1aee upstream.
If the __copy_from_user function failed we need to call sg_remove_request in sg_write.
Link: https://lore.kernel.org/r/610618d9-e983-fd56-ed0f-639428343af7@huawei.com Acked-by: Douglas Gilbert dgilbert@interlog.com Signed-off-by: Wu Bo wubo40@huawei.com Signed-off-by: Martin K. Petersen martin.petersen@oracle.com Signed-off-by: Sasha Levin sashal@kernel.org [groeck: Backport to v5.4.y and older kernels] Signed-off-by: Guenter Roeck linux@roeck-us.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/scsi/sg.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-)
--- a/drivers/scsi/sg.c +++ b/drivers/scsi/sg.c @@ -706,8 +706,10 @@ sg_write(struct file *filp, const char _ hp->flags = input_size; /* structure abuse ... */ hp->pack_id = old_hdr.pack_id; hp->usr_ptr = NULL; - if (__copy_from_user(cmnd, buf, cmd_size)) + if (__copy_from_user(cmnd, buf, cmd_size)) { + sg_remove_request(sfp, srp); return -EFAULT; + } /* * SG_DXFER_TO_FROM_DEV is functionally equivalent to SG_DXFER_FROM_DEV, * but is is possible that the app intended SG_DXFER_TO_DEV, because there
From: wuxu.wu wuxu.wu@huawei.com
commit 19b61392c5a852b4e8a0bf35aecb969983c5932d upstream.
dw_spi_irq() and dw_spi_transfer_one concurrent calls.
I find a panic in dw_writer(): txw = *(u8 *)(dws->tx), when dw->tx==null, dw->len==4, and dw->tx_end==1.
When tpm driver's message overtime dw_spi_irq() and dw_spi_transfer_one may concurrent visit dw_spi, so I think dw_spi structure lack of protection.
Otherwise dw_spi_transfer_one set dw rx/tx buffer and then open irq, store dw rx/tx instructions and other cores handle irq load dw rx/tx instructions may out of order.
[ 1025.321302] Call trace: ... [ 1025.321319] __crash_kexec+0x98/0x148 [ 1025.321323] panic+0x17c/0x314 [ 1025.321329] die+0x29c/0x2e8 [ 1025.321334] die_kernel_fault+0x68/0x78 [ 1025.321337] __do_kernel_fault+0x90/0xb0 [ 1025.321346] do_page_fault+0x88/0x500 [ 1025.321347] do_translation_fault+0xa8/0xb8 [ 1025.321349] do_mem_abort+0x68/0x118 [ 1025.321351] el1_da+0x20/0x8c [ 1025.321362] dw_writer+0xc8/0xd0 [ 1025.321364] interrupt_transfer+0x60/0x110 [ 1025.321365] dw_spi_irq+0x48/0x70 ...
Signed-off-by: wuxu.wu wuxu.wu@huawei.com Link: https://lore.kernel.org/r/1577849981-31489-1-git-send-email-wuxu.wu@huawei.c... Signed-off-by: Mark Brown broonie@kernel.org Signed-off-by: Nobuhiro Iwamatsu (CIP) nobuhiro1.iwamatsu@toshiba.co.jp Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/spi/spi-dw.c | 15 ++++++++++++--- drivers/spi/spi-dw.h | 1 + 2 files changed, 13 insertions(+), 3 deletions(-)
--- a/drivers/spi/spi-dw.c +++ b/drivers/spi/spi-dw.c @@ -180,9 +180,11 @@ static inline u32 rx_max(struct dw_spi *
static void dw_writer(struct dw_spi *dws) { - u32 max = tx_max(dws); + u32 max; u16 txw = 0;
+ spin_lock(&dws->buf_lock); + max = tx_max(dws); while (max--) { /* Set the tx word if the transfer's original "tx" is not null */ if (dws->tx_end - dws->len) { @@ -194,13 +196,16 @@ static void dw_writer(struct dw_spi *dws dw_write_io_reg(dws, DW_SPI_DR, txw); dws->tx += dws->n_bytes; } + spin_unlock(&dws->buf_lock); }
static void dw_reader(struct dw_spi *dws) { - u32 max = rx_max(dws); + u32 max; u16 rxw;
+ spin_lock(&dws->buf_lock); + max = rx_max(dws); while (max--) { rxw = dw_read_io_reg(dws, DW_SPI_DR); /* Care rx only if the transfer's original "rx" is not null */ @@ -212,6 +217,7 @@ static void dw_reader(struct dw_spi *dws } dws->rx += dws->n_bytes; } + spin_unlock(&dws->buf_lock); }
static void int_error_stop(struct dw_spi *dws, const char *msg) @@ -284,6 +290,7 @@ static int dw_spi_transfer_one(struct sp { struct dw_spi *dws = spi_master_get_devdata(master); struct chip_data *chip = spi_get_ctldata(spi); + unsigned long flags; u8 imask = 0; u16 txlevel = 0; u16 clk_div; @@ -291,12 +298,13 @@ static int dw_spi_transfer_one(struct sp int ret;
dws->dma_mapped = 0; - + spin_lock_irqsave(&dws->buf_lock, flags); dws->tx = (void *)transfer->tx_buf; dws->tx_end = dws->tx + transfer->len; dws->rx = transfer->rx_buf; dws->rx_end = dws->rx + transfer->len; dws->len = transfer->len; + spin_unlock_irqrestore(&dws->buf_lock, flags);
spi_enable_chip(dws, 0);
@@ -488,6 +496,7 @@ int dw_spi_add_host(struct device *dev, dws->dma_inited = 0; dws->dma_addr = (dma_addr_t)(dws->paddr + DW_SPI_DR); snprintf(dws->name, sizeof(dws->name), "dw_spi%d", dws->bus_num); + spin_lock_init(&dws->buf_lock);
ret = request_irq(dws->irq, dw_spi_irq, IRQF_SHARED, dws->name, master); if (ret < 0) { --- a/drivers/spi/spi-dw.h +++ b/drivers/spi/spi-dw.h @@ -117,6 +117,7 @@ struct dw_spi { size_t len; void *tx; void *tx_end; + spinlock_t buf_lock; void *rx; void *rx_end; int dma_mapped;
From: Samuel Cabrero scabrero@suse.de
[ Upstream commit 76e752701a8af4404bbd9c45723f7cbd6e4a251e ]
Some servers seem to accept connections while booting but never send the SMBNegotiate response neither close the connection, causing all processes accessing the share hang on uninterruptible sleep state.
This happens when the cifs_demultiplex_thread detects the server is unresponsive so releases the socket and start trying to reconnect. At some point, the faulty server will accept the socket and the TCP status will be set to NeedNegotiate. The first issued command accessing the share will start the negotiation (pid 5828 below), but the response will never arrive so other commands will be blocked waiting on the mutex (pid 55352).
This patch checks for unresponsive servers also on the negotiate stage releasing the socket and reconnecting if the response is not received and checking again the tcp state when the mutex is acquired.
PID: 55352 TASK: ffff880fd6cc02c0 CPU: 0 COMMAND: "ls" #0 [ffff880fd9add9f0] schedule at ffffffff81467eb9 #1 [ffff880fd9addb38] __mutex_lock_slowpath at ffffffff81468fe0 #2 [ffff880fd9addba8] mutex_lock at ffffffff81468b1a #3 [ffff880fd9addbc0] cifs_reconnect_tcon at ffffffffa042f905 [cifs] #4 [ffff880fd9addc60] smb_init at ffffffffa042faeb [cifs] #5 [ffff880fd9addca0] CIFSSMBQPathInfo at ffffffffa04360b5 [cifs] ....
Which is waiting a mutex owned by:
PID: 5828 TASK: ffff880fcc55e400 CPU: 0 COMMAND: "xxxx" #0 [ffff880fbfdc19b8] schedule at ffffffff81467eb9 #1 [ffff880fbfdc1b00] wait_for_response at ffffffffa044f96d [cifs] #2 [ffff880fbfdc1b60] SendReceive at ffffffffa04505ce [cifs] #3 [ffff880fbfdc1bb0] CIFSSMBNegotiate at ffffffffa0438d79 [cifs] #4 [ffff880fbfdc1c50] cifs_negotiate_protocol at ffffffffa043b383 [cifs] #5 [ffff880fbfdc1c80] cifs_reconnect_tcon at ffffffffa042f911 [cifs] #6 [ffff880fbfdc1d20] smb_init at ffffffffa042faeb [cifs] #7 [ffff880fbfdc1d60] CIFSSMBQFSInfo at ffffffffa0434eb0 [cifs] ....
Signed-off-by: Samuel Cabrero scabrero@suse.de Reviewed-by: Aurélien Aptel aaptel@suse.de Reviewed-by: Ronnie Sahlberg lsahlber@redhat.com Signed-off-by: Steve French smfrench@gmail.com Signed-off-by: Sasha Levin sashal@kernel.org --- fs/cifs/cifssmb.c | 12 ++++++++++++ fs/cifs/connect.c | 3 ++- fs/cifs/smb2pdu.c | 12 ++++++++++++ 3 files changed, 26 insertions(+), 1 deletion(-)
diff --git a/fs/cifs/cifssmb.c b/fs/cifs/cifssmb.c index b9b8f19dce0e1..fa07f7cb85a51 100644 --- a/fs/cifs/cifssmb.c +++ b/fs/cifs/cifssmb.c @@ -184,6 +184,18 @@ cifs_reconnect_tcon(struct cifs_tcon *tcon, int smb_command) * reconnect the same SMB session */ mutex_lock(&ses->session_mutex); + + /* + * Recheck after acquire mutex. If another thread is negotiating + * and the server never sends an answer the socket will be closed + * and tcpStatus set to reconnect. + */ + if (server->tcpStatus == CifsNeedReconnect) { + rc = -EHOSTDOWN; + mutex_unlock(&ses->session_mutex); + goto out; + } + rc = cifs_negotiate_protocol(0, ses); if (rc == 0 && ses->need_reconnect) rc = cifs_setup_session(0, ses, nls_codepage); diff --git a/fs/cifs/connect.c b/fs/cifs/connect.c index c9793ce0d3368..7022750cae2fd 100644 --- a/fs/cifs/connect.c +++ b/fs/cifs/connect.c @@ -558,7 +558,8 @@ server_unresponsive(struct TCP_Server_Info *server) * 65s kernel_recvmsg times out, and we see that we haven't gotten * a response in >60s. */ - if (server->tcpStatus == CifsGood && + if ((server->tcpStatus == CifsGood || + server->tcpStatus == CifsNeedNegotiate) && time_after(jiffies, server->lstrp + 2 * SMB_ECHO_INTERVAL)) { cifs_dbg(VFS, "Server %s has not responded in %d seconds. Reconnecting...\n", server->hostname, (2 * SMB_ECHO_INTERVAL) / HZ); diff --git a/fs/cifs/smb2pdu.c b/fs/cifs/smb2pdu.c index d4472a4947581..4ffd5e177288e 100644 --- a/fs/cifs/smb2pdu.c +++ b/fs/cifs/smb2pdu.c @@ -249,6 +249,18 @@ smb2_reconnect(__le16 smb2_command, struct cifs_tcon *tcon) * the same SMB session */ mutex_lock(&tcon->ses->session_mutex); + + /* + * Recheck after acquire mutex. If another thread is negotiating + * and the server never sends an answer the socket will be closed + * and tcpStatus set to reconnect. + */ + if (server->tcpStatus == CifsNeedReconnect) { + rc = -EHOSTDOWN; + mutex_unlock(&tcon->ses->session_mutex); + goto out; + } + rc = cifs_negotiate_protocol(0, tcon->ses); if (!rc && tcon->ses->need_reconnect) { rc = cifs_setup_session(0, tcon->ses, nls_codepage);
From: Ronnie Sahlberg lsahlber@redhat.com
[ Upstream commit f2caf901c1b7ce65f9e6aef4217e3241039db768 ]
There is a race condition with how we send (or supress and don't send) smb echos that will cause the client to incorrectly think the server is unresponsive and thus needs to be reconnected.
Summary of the race condition: 1) Daisy chaining scheduling creates a gap. 2) If traffic comes unfortunate shortly after the last echo, the planned echo is suppressed. 3) Due to the gap, the next echo transmission is delayed until after the timeout, which is set hard to twice the echo interval.
This is fixed by changing the timeouts from 2 to three times the echo interval.
Detailed description of the bug: https://lutz.donnerhacke.de/eng/Blog/Groundhog-Day-with-SMB-remount
Signed-off-by: Ronnie Sahlberg lsahlber@redhat.com Reviewed-by: Pavel Shilovsky pshilov@microsoft.com Signed-off-by: Steve French stfrench@microsoft.com Signed-off-by: Sasha Levin sashal@kernel.org --- fs/cifs/connect.c | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/fs/cifs/connect.c b/fs/cifs/connect.c index 7022750cae2fd..21ddfd77966eb 100644 --- a/fs/cifs/connect.c +++ b/fs/cifs/connect.c @@ -548,10 +548,10 @@ static bool server_unresponsive(struct TCP_Server_Info *server) { /* - * We need to wait 2 echo intervals to make sure we handle such + * We need to wait 3 echo intervals to make sure we handle such * situations right: * 1s client sends a normal SMB request - * 2s client gets a response + * 3s client gets a response * 30s echo workqueue job pops, and decides we got a response recently * and don't need to send another * ... @@ -560,9 +560,9 @@ server_unresponsive(struct TCP_Server_Info *server) */ if ((server->tcpStatus == CifsGood || server->tcpStatus == CifsNeedNegotiate) && - time_after(jiffies, server->lstrp + 2 * SMB_ECHO_INTERVAL)) { + time_after(jiffies, server->lstrp + 3 * SMB_ECHO_INTERVAL)) { cifs_dbg(VFS, "Server %s has not responded in %d seconds. Reconnecting...\n", - server->hostname, (2 * SMB_ECHO_INTERVAL) / HZ); + server->hostname, (3 * SMB_ECHO_INTERVAL) / HZ); cifs_reconnect(server); wake_up(&server->response_q); return true;
From: Madhuparna Bhowmik madhuparnabhowmik10@gmail.com
[ Upstream commit 2e45676a4d33af47259fa186ea039122ce263ba9 ]
pd->dma.dev is read in irq handler pd_irq(). However, it is set to pdev->dev after request_irq(). Therefore, set pd->dma.dev to pdev->dev before request_irq() to avoid data race between pch_dma_probe() and pd_irq().
Found by Linux Driver Verification project (linuxtesting.org).
Signed-off-by: Madhuparna Bhowmik madhuparnabhowmik10@gmail.com Link: https://lore.kernel.org/r/20200416062335.29223-1-madhuparnabhowmik10@gmail.c... Signed-off-by: Vinod Koul vkoul@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/dma/pch_dma.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/dma/pch_dma.c b/drivers/dma/pch_dma.c index 113605f6fe208..32517003e118e 100644 --- a/drivers/dma/pch_dma.c +++ b/drivers/dma/pch_dma.c @@ -877,6 +877,7 @@ static int pch_dma_probe(struct pci_dev *pdev, }
pci_set_master(pdev); + pd->dma.dev = &pdev->dev;
err = request_irq(pdev->irq, pd_irq, IRQF_SHARED, DRV_NAME, pd); if (err) { @@ -892,7 +893,6 @@ static int pch_dma_probe(struct pci_dev *pdev, goto err_free_irq; }
- pd->dma.dev = &pdev->dev;
INIT_LIST_HEAD(&pd->dma.channels);
From: Lubomir Rintel lkundrak@v3.sk
[ Upstream commit 0c89446379218698189a47871336cb30286a7197 ]
When a channel configuration fails, the status of the channel is set to DEV_ERROR so that an attempt to submit it fails. However, this status sticks until the heat end of the universe, making it impossible to recover from the error.
Let's reset it when the channel is released so that further use of the channel with correct configuration is not impacted.
Signed-off-by: Lubomir Rintel lkundrak@v3.sk Link: https://lore.kernel.org/r/20200419164912.670973-5-lkundrak@v3.sk Signed-off-by: Vinod Koul vkoul@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/dma/mmp_tdma.c | 2 ++ 1 file changed, 2 insertions(+)
diff --git a/drivers/dma/mmp_tdma.c b/drivers/dma/mmp_tdma.c index 3df0422607d59..ac9aede1bfbec 100644 --- a/drivers/dma/mmp_tdma.c +++ b/drivers/dma/mmp_tdma.c @@ -364,6 +364,8 @@ static void mmp_tdma_free_descriptor(struct mmp_tdma_chan *tdmac) gen_pool_free(gpool, (unsigned long)tdmac->desc_arr, size); tdmac->desc_arr = NULL; + if (tdmac->status == DMA_ERROR) + tdmac->status = DMA_COMPLETE;
return; }
From: Vasily Averin vvs@virtuozzo.com
[ Upstream commit 5b5703dbafae74adfbe298a56a81694172caf5e6 ]
v2: removed TODO reminder
Signed-off-by: Vasily Averin vvs@virtuozzo.com Link: http://patchwork.freedesktop.org/patch/msgid/a4e0ae09-a73c-1c62-04ef-3f990d4... Signed-off-by: Gerd Hoffmann kraxel@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/qxl/qxl_image.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/qxl/qxl_image.c b/drivers/gpu/drm/qxl/qxl_image.c index 7fbcc35e8ad35..c89c10055641e 100644 --- a/drivers/gpu/drm/qxl/qxl_image.c +++ b/drivers/gpu/drm/qxl/qxl_image.c @@ -210,7 +210,8 @@ qxl_image_init_helper(struct qxl_device *qdev, break; default: DRM_ERROR("unsupported image bit depth\n"); - return -EINVAL; /* TODO: cleanup */ + qxl_bo_kunmap_atomic_page(qdev, image_bo, ptr); + return -EINVAL; } image->u.bitmap.flags = QXL_BITMAP_TOP_DOWN; image->u.bitmap.x = width;
From: Vasily Averin vvs@virtuozzo.com
[ Upstream commit 5e698222c70257d13ae0816720dde57c56f81e15 ]
Commit 89163f93c6f9 ("ipc/util.c: sysvipc_find_ipc() should increase position index") is causing this bug (seen on 5.6.8):
# ipcs -q
------ Message Queues -------- key msqid owner perms used-bytes messages
# ipcmk -Q Message queue id: 0 # ipcs -q
------ Message Queues -------- key msqid owner perms used-bytes messages 0x82db8127 0 root 644 0 0
# ipcmk -Q Message queue id: 1 # ipcs -q
------ Message Queues -------- key msqid owner perms used-bytes messages 0x82db8127 0 root 644 0 0 0x76d1fb2a 1 root 644 0 0
# ipcrm -q 0 # ipcs -q
------ Message Queues -------- key msqid owner perms used-bytes messages 0x76d1fb2a 1 root 644 0 0 0x76d1fb2a 1 root 644 0 0
# ipcmk -Q Message queue id: 2 # ipcrm -q 2 # ipcs -q
------ Message Queues -------- key msqid owner perms used-bytes messages 0x76d1fb2a 1 root 644 0 0 0x76d1fb2a 1 root 644 0 0
# ipcmk -Q Message queue id: 3 # ipcrm -q 1 # ipcs -q
------ Message Queues -------- key msqid owner perms used-bytes messages 0x7c982867 3 root 644 0 0 0x7c982867 3 root 644 0 0 0x7c982867 3 root 644 0 0 0x7c982867 3 root 644 0 0
Whenever an IPC item with a low id is deleted, the items with higher ids are duplicated, as if filling a hole.
new_pos should jump through hole of unused ids, pos can be updated inside "for" cycle.
Fixes: 89163f93c6f9 ("ipc/util.c: sysvipc_find_ipc() should increase position index") Reported-by: Andreas Schwab schwab@suse.de Reported-by: Randy Dunlap rdunlap@infradead.org Signed-off-by: Vasily Averin vvs@virtuozzo.com Signed-off-by: Andrew Morton akpm@linux-foundation.org Acked-by: Waiman Long longman@redhat.com Cc: NeilBrown neilb@suse.com Cc: Steven Rostedt rostedt@goodmis.org Cc: Ingo Molnar mingo@redhat.com Cc: Peter Oberparleiter oberpar@linux.ibm.com Cc: Davidlohr Bueso dave@stgolabs.net Cc: Manfred Spraul manfred@colorfullife.com Cc: stable@vger.kernel.org Link: http://lkml.kernel.org/r/4921fe9b-9385-a2b4-1dc4-1099be6d2e39@virtuozzo.com Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- ipc/util.c | 12 ++++++------ 1 file changed, 6 insertions(+), 6 deletions(-)
diff --git a/ipc/util.c b/ipc/util.c index 2724f9071ab39..7af476b6dcdde 100644 --- a/ipc/util.c +++ b/ipc/util.c @@ -756,21 +756,21 @@ static struct kern_ipc_perm *sysvipc_find_ipc(struct ipc_ids *ids, loff_t pos, total++; }
- *new_pos = pos + 1; + ipc = NULL; if (total >= ids->in_use) - return NULL; + goto out;
for (; pos < IPCMNI; pos++) { ipc = idr_find(&ids->ipcs_idr, pos); if (ipc != NULL) { rcu_read_lock(); ipc_lock_object(ipc); - return ipc; + break; } } - - /* Out of range - return NULL to terminate iteration */ - return NULL; +out: + *new_pos = pos + 1; + return ipc; }
static void *sysvipc_proc_next(struct seq_file *s, void *it, loff_t *pos)
From: John Hurley john.hurley@netronome.com
[ Upstream commit 0e3183cd2a64843a95b62f8bd4a83605a4cf0615 ]
Skbs may have their checksum value populated by HW. If this is a checksum calculated over the entire packet then the CHECKSUM_COMPLETE field is marked. Changes to the data pointer on the skb throughout the network stack still try to maintain this complete csum value if it is required through functions such as skb_postpush_rcsum.
The MPLS actions in Open vSwitch modify a CHECKSUM_COMPLETE value when changes are made to packet data without a push or a pull. This occurs when the ethertype of the MAC header is changed or when MPLS lse fields are modified.
The modification is carried out using the csum_partial function to get the csum of a buffer and add it into the larger checksum. The buffer is an inversion of the data to be removed followed by the new data. Because the csum is calculated over 16 bits and these values align with 16 bits, the effect is the removal of the old value from the CHECKSUM_COMPLETE and addition of the new value.
However, the csum fed into the function and the outcome of the calculation are also inverted. This would only make sense if it was the new value rather than the old that was inverted in the input buffer.
Fix the issue by removing the bit inverts in the csum_partial calculation.
The bug was verified and the fix tested by comparing the folded value of the updated CHECKSUM_COMPLETE value with the folded value of a full software checksum calculation (reset skb->csum to 0 and run skb_checksum_complete(skb)). Prior to the fix the outcomes differed but after they produce the same result.
Fixes: 25cd9ba0abc0 ("openvswitch: Add basic MPLS support to kernel") Fixes: bc7cc5999fd3 ("openvswitch: update checksum in {push,pop}_mpls") Signed-off-by: John Hurley john.hurley@netronome.com Reviewed-by: Jakub Kicinski jakub.kicinski@netronome.com Reviewed-by: Simon Horman simon.horman@netronome.com Acked-by: Pravin B Shelar pshelar@ovn.org Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- net/openvswitch/actions.c | 6 ++---- 1 file changed, 2 insertions(+), 4 deletions(-)
diff --git a/net/openvswitch/actions.c b/net/openvswitch/actions.c index fd6c587b6a040..828fdced4ecd8 100644 --- a/net/openvswitch/actions.c +++ b/net/openvswitch/actions.c @@ -143,8 +143,7 @@ static void update_ethertype(struct sk_buff *skb, struct ethhdr *hdr, if (skb->ip_summed == CHECKSUM_COMPLETE) { __be16 diff[] = { ~(hdr->h_proto), ethertype };
- skb->csum = ~csum_partial((char *)diff, sizeof(diff), - ~skb->csum); + skb->csum = csum_partial((char *)diff, sizeof(diff), skb->csum); }
hdr->h_proto = ethertype; @@ -227,8 +226,7 @@ static int set_mpls(struct sk_buff *skb, struct sw_flow_key *flow_key, if (skb->ip_summed == CHECKSUM_COMPLETE) { __be32 diff[] = { ~(*stack), lse };
- skb->csum = ~csum_partial((char *)diff, sizeof(diff), - ~skb->csum); + skb->csum = csum_partial((char *)diff, sizeof(diff), skb->csum); }
*stack = lse;
From: Jiri Benc jbenc@redhat.com
[ Upstream commit e271c7b4420ddbb9fae82a2b31a5ab3edafcf4fe ]
For ipgre interface in collect metadata mode, it doesn't make sense for the interface to be of ARPHRD_IPGRE type. The outer header of received packets is not needed, as all the information from it is present in metadata_dst. We already don't set ipgre_header_ops for collect metadata interfaces, which is the only consumer of mac_header pointing to the outer IP header.
Just set the interface type to ARPHRD_NONE in collect metadata mode for ipgre (not gretap, that still correctly stays ARPHRD_ETHER) and reset mac_header.
Fixes: a64b04d86d14 ("gre: do not assign header_ops in collect metadata mode") Fixes: 2e15ea390e6f4 ("ip_gre: Add support to collect tunnel metadata.") Signed-off-by: Jiri Benc jbenc@redhat.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- net/ipv4/ip_gre.c | 7 ++++++- 1 file changed, 6 insertions(+), 1 deletion(-)
diff --git a/net/ipv4/ip_gre.c b/net/ipv4/ip_gre.c index e5448570d6483..900ee28bda99a 100644 --- a/net/ipv4/ip_gre.c +++ b/net/ipv4/ip_gre.c @@ -399,7 +399,10 @@ static int ipgre_rcv(struct sk_buff *skb, const struct tnl_ptk_info *tpi) iph->saddr, iph->daddr, tpi->key);
if (tunnel) { - skb_pop_mac_header(skb); + if (tunnel->dev->type != ARPHRD_NONE) + skb_pop_mac_header(skb); + else + skb_reset_mac_header(skb); if (tunnel->collect_md) { __be16 flags; __be64 tun_id; @@ -1015,6 +1018,8 @@ static void ipgre_netlink_parms(struct net_device *dev, struct ip_tunnel *t = netdev_priv(dev);
t->collect_md = true; + if (dev->type == ARPHRD_IPGRE) + dev->type = ARPHRD_NONE; } }
From: zhong jiang zhongjiang@huawei.com
[ Upstream commit d6d8c8a48291b929b2e039f220f0b62958cccfea ]
When mainline introduced commit a96dfddbcc04 ("base/memory, hotplug: fix a kernel oops in show_valid_zones()"), it obtained the valid start and end pfn from the given pfn range. The valid start pfn can fix the actual issue, but it introduced another issue. The valid end pfn will may exceed the given end_pfn.
Although the incorrect overflow will not result in actual problem at present, but I think it need to be fixed.
[toshi.kani@hpe.com: remove assumption that end_pfn is aligned by MAX_ORDER_NR_PAGES] Fixes: a96dfddbcc04 ("base/memory, hotplug: fix a kernel oops in show_valid_zones()") Link: http://lkml.kernel.org/r/1486467299-22648-1-git-send-email-zhongjiang@huawei... Signed-off-by: zhong jiang zhongjiang@huawei.com Signed-off-by: Toshi Kani toshi.kani@hpe.com Cc: Vlastimil Babka vbabka@suse.cz Cc: Mel Gorman mgorman@techsingularity.net Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- mm/memory_hotplug.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c index 804cbfe9132dd..5fa8a3606f409 100644 --- a/mm/memory_hotplug.c +++ b/mm/memory_hotplug.c @@ -1397,7 +1397,7 @@ int test_pages_in_a_zone(unsigned long start_pfn, unsigned long end_pfn, while ((i < MAX_ORDER_NR_PAGES) && !pfn_valid_within(pfn + i)) i++; - if (i == MAX_ORDER_NR_PAGES) + if (i == MAX_ORDER_NR_PAGES || pfn + i >= end_pfn) continue; /* Check if we got outside of the zone */ if (zone && !zone_spans_pfn(zone, pfn + i)) @@ -1414,7 +1414,7 @@ int test_pages_in_a_zone(unsigned long start_pfn, unsigned long end_pfn,
if (zone) { *valid_start = start; - *valid_end = end; + *valid_end = min(end, end_pfn); return 1; } else { return 0;
From: Ben Hutchings ben.hutchings@codethink.co.uk
[ Upstream commit 3a9910d7b686546dcc9986e790af17e148f1c888 ]
qla2x00_tmf_sp_done() now deletes the timer that will run qla2x00_tmf_iocb_timeout(), but doesn't check whether the timer already expired. Check the return value from del_timer() to avoid calling complete() a second time.
Fixes: 4440e46d5db7 ("[SCSI] qla2xxx: Add IOCB Abort command asynchronous ...") Fixes: 1514839b3664 ("scsi: qla2xxx: Fix NULL pointer crash due to active ...") Signed-off-by: Ben Hutchings ben.hutchings@codethink.co.uk Acked-by: Himanshu Madhani himanshu.madhani@cavium.com Signed-off-by: Martin K. Petersen martin.petersen@oracle.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/scsi/qla2xxx/qla_init.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/scsi/qla2xxx/qla_init.c b/drivers/scsi/qla2xxx/qla_init.c index 41a646696babb..0772804dbc27e 100644 --- a/drivers/scsi/qla2xxx/qla_init.c +++ b/drivers/scsi/qla2xxx/qla_init.c @@ -364,8 +364,8 @@ qla24xx_abort_sp_done(void *data, void *ptr, int res) srb_t *sp = (srb_t *)ptr; struct srb_iocb *abt = &sp->u.iocb_cmd;
- del_timer(&sp->u.iocb_cmd.timer); - complete(&abt->u.abt.comp); + if (del_timer(&sp->u.iocb_cmd.timer)) + complete(&abt->u.abt.comp); }
static int
From: Anjali Singhai Jain anjali.singhai@intel.com
[ Upstream commit 09f79fd49d94cda5837e9bfd0cb222232b3b6d9f ]
X722 devices use the AdminQ to access the NVM, and this requires taking the AdminQ lock. Because of this, we lock the AdminQ during i40e_read_nvm(), which is also called in places where the lock is already held, such as the firmware update path which wants to lock once and then unlock when finished after performing several tasks.
Although this should have only affected X722 devices, commit 96a39aed25e6 ("i40e: Acquire NVM lock before reads on all devices", 2016-12-02) added locking for all NVM reads, regardless of device family.
This resulted in us accidentally causing NVM acquire timeouts on all devices, causing failed firmware updates which left the eeprom in a corrupt state.
Create unsafe non-locked variants of i40e_read_nvm_word and i40e_read_nvm_buffer, __i40e_read_nvm_word and __i40e_read_nvm_buffer respectively. These variants will not take the NVM lock and are expected to only be called in places where the NVM lock is already held if needed.
Since the only caller of i40e_read_nvm_buffer() was in such a path, remove it entirely in favor of the unsafe version. If necessary we can always add it back in the future.
Additionally, we now need to hold the NVM lock in i40e_validate_checksum because the call to i40e_calc_nvm_checksum now assumes that the NVM lock is held. We can further move the call to read I40E_SR_SW_CHECKSUM_WORD up a bit so that we do not need to acquire the NVM lock twice.
This should resolve firmware updates and also fix potential raise that could have caused the driver to report an invalid NVM checksum upon driver load.
Reported-by: Stefan Assmann sassmann@kpanic.de Fixes: 96a39aed25e6 ("i40e: Acquire NVM lock before reads on all devices", 2016-12-02) Signed-off-by: Anjali Singhai Jain anjali.singhai@intel.com Signed-off-by: Jacob Keller jacob.e.keller@intel.com Tested-by: Andrew Bowers andrewx.bowers@intel.com Signed-off-by: Jeff Kirsher jeffrey.t.kirsher@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/intel/i40e/i40e_nvm.c | 98 ++++++++++++------- .../net/ethernet/intel/i40e/i40e_prototype.h | 2 - 2 files changed, 60 insertions(+), 40 deletions(-)
diff --git a/drivers/net/ethernet/intel/i40e/i40e_nvm.c b/drivers/net/ethernet/intel/i40e/i40e_nvm.c index dd4e6ea9e0e1b..af7f97791320d 100644 --- a/drivers/net/ethernet/intel/i40e/i40e_nvm.c +++ b/drivers/net/ethernet/intel/i40e/i40e_nvm.c @@ -266,7 +266,7 @@ static i40e_status i40e_read_nvm_aq(struct i40e_hw *hw, u8 module_pointer, * @offset: offset of the Shadow RAM word to read (0x000000 - 0x001FFF) * @data: word read from the Shadow RAM * - * Reads one 16 bit word from the Shadow RAM using the GLNVM_SRCTL register. + * Reads one 16 bit word from the Shadow RAM using the AdminQ **/ static i40e_status i40e_read_nvm_word_aq(struct i40e_hw *hw, u16 offset, u16 *data) @@ -280,27 +280,49 @@ static i40e_status i40e_read_nvm_word_aq(struct i40e_hw *hw, u16 offset, }
/** - * i40e_read_nvm_word - Reads Shadow RAM + * __i40e_read_nvm_word - Reads nvm word, assumes called does the locking * @hw: pointer to the HW structure * @offset: offset of the Shadow RAM word to read (0x000000 - 0x001FFF) * @data: word read from the Shadow RAM * - * Reads one 16 bit word from the Shadow RAM using the GLNVM_SRCTL register. + * Reads one 16 bit word from the Shadow RAM. + * + * Do not use this function except in cases where the nvm lock is already + * taken via i40e_acquire_nvm(). + **/ +static i40e_status __i40e_read_nvm_word(struct i40e_hw *hw, + u16 offset, u16 *data) +{ + i40e_status ret_code = 0; + + if (hw->flags & I40E_HW_FLAG_AQ_SRCTL_ACCESS_ENABLE) + ret_code = i40e_read_nvm_word_aq(hw, offset, data); + else + ret_code = i40e_read_nvm_word_srctl(hw, offset, data); + return ret_code; +} + +/** + * i40e_read_nvm_word - Reads nvm word and acquire lock if necessary + * @hw: pointer to the HW structure + * @offset: offset of the Shadow RAM word to read (0x000000 - 0x001FFF) + * @data: word read from the Shadow RAM + * + * Reads one 16 bit word from the Shadow RAM. **/ i40e_status i40e_read_nvm_word(struct i40e_hw *hw, u16 offset, u16 *data) { - enum i40e_status_code ret_code = 0; + i40e_status ret_code = 0;
ret_code = i40e_acquire_nvm(hw, I40E_RESOURCE_READ); - if (!ret_code) { - if (hw->flags & I40E_HW_FLAG_AQ_SRCTL_ACCESS_ENABLE) { - ret_code = i40e_read_nvm_word_aq(hw, offset, data); - } else { - ret_code = i40e_read_nvm_word_srctl(hw, offset, data); - } - i40e_release_nvm(hw); - } + if (ret_code) + return ret_code; + + ret_code = __i40e_read_nvm_word(hw, offset, data); + + i40e_release_nvm(hw); + return ret_code; }
@@ -393,31 +415,25 @@ static i40e_status i40e_read_nvm_buffer_aq(struct i40e_hw *hw, u16 offset, }
/** - * i40e_read_nvm_buffer - Reads Shadow RAM buffer + * __i40e_read_nvm_buffer - Reads nvm buffer, caller must acquire lock * @hw: pointer to the HW structure * @offset: offset of the Shadow RAM word to read (0x000000 - 0x001FFF). * @words: (in) number of words to read; (out) number of words actually read * @data: words read from the Shadow RAM * * Reads 16 bit words (data buffer) from the SR using the i40e_read_nvm_srrd() - * method. The buffer read is preceded by the NVM ownership take - * and followed by the release. + * method. **/ -i40e_status i40e_read_nvm_buffer(struct i40e_hw *hw, u16 offset, - u16 *words, u16 *data) +static i40e_status __i40e_read_nvm_buffer(struct i40e_hw *hw, + u16 offset, u16 *words, + u16 *data) { - enum i40e_status_code ret_code = 0; + i40e_status ret_code = 0;
- if (hw->flags & I40E_HW_FLAG_AQ_SRCTL_ACCESS_ENABLE) { - ret_code = i40e_acquire_nvm(hw, I40E_RESOURCE_READ); - if (!ret_code) { - ret_code = i40e_read_nvm_buffer_aq(hw, offset, words, - data); - i40e_release_nvm(hw); - } - } else { + if (hw->flags & I40E_HW_FLAG_AQ_SRCTL_ACCESS_ENABLE) + ret_code = i40e_read_nvm_buffer_aq(hw, offset, words, data); + else ret_code = i40e_read_nvm_buffer_srctl(hw, offset, words, data); - } return ret_code; }
@@ -499,15 +515,15 @@ static i40e_status i40e_calc_nvm_checksum(struct i40e_hw *hw, data = (u16 *)vmem.va;
/* read pointer to VPD area */ - ret_code = i40e_read_nvm_word(hw, I40E_SR_VPD_PTR, &vpd_module); + ret_code = __i40e_read_nvm_word(hw, I40E_SR_VPD_PTR, &vpd_module); if (ret_code) { ret_code = I40E_ERR_NVM_CHECKSUM; goto i40e_calc_nvm_checksum_exit; }
/* read pointer to PCIe Alt Auto-load module */ - ret_code = i40e_read_nvm_word(hw, I40E_SR_PCIE_ALT_AUTO_LOAD_PTR, - &pcie_alt_module); + ret_code = __i40e_read_nvm_word(hw, I40E_SR_PCIE_ALT_AUTO_LOAD_PTR, + &pcie_alt_module); if (ret_code) { ret_code = I40E_ERR_NVM_CHECKSUM; goto i40e_calc_nvm_checksum_exit; @@ -521,7 +537,7 @@ static i40e_status i40e_calc_nvm_checksum(struct i40e_hw *hw, if ((i % I40E_SR_SECTOR_SIZE_IN_WORDS) == 0) { u16 words = I40E_SR_SECTOR_SIZE_IN_WORDS;
- ret_code = i40e_read_nvm_buffer(hw, i, &words, data); + ret_code = __i40e_read_nvm_buffer(hw, i, &words, data); if (ret_code) { ret_code = I40E_ERR_NVM_CHECKSUM; goto i40e_calc_nvm_checksum_exit; @@ -593,14 +609,19 @@ i40e_status i40e_validate_nvm_checksum(struct i40e_hw *hw, u16 checksum_sr = 0; u16 checksum_local = 0;
+ /* We must acquire the NVM lock in order to correctly synchronize the + * NVM accesses across multiple PFs. Without doing so it is possible + * for one of the PFs to read invalid data potentially indicating that + * the checksum is invalid. + */ + ret_code = i40e_acquire_nvm(hw, I40E_RESOURCE_READ); + if (ret_code) + return ret_code; ret_code = i40e_calc_nvm_checksum(hw, &checksum_local); + __i40e_read_nvm_word(hw, I40E_SR_SW_CHECKSUM_WORD, &checksum_sr); + i40e_release_nvm(hw); if (ret_code) - goto i40e_validate_nvm_checksum_exit; - - /* Do not use i40e_read_nvm_word() because we do not want to take - * the synchronization semaphores twice here. - */ - i40e_read_nvm_word(hw, I40E_SR_SW_CHECKSUM_WORD, &checksum_sr); + return ret_code;
/* Verify read checksum from EEPROM is the same as * calculated checksum @@ -612,7 +633,6 @@ i40e_status i40e_validate_nvm_checksum(struct i40e_hw *hw, if (checksum) *checksum = checksum_local;
-i40e_validate_nvm_checksum_exit: return ret_code; }
@@ -958,6 +978,7 @@ static i40e_status i40e_nvmupd_state_writing(struct i40e_hw *hw, break;
case I40E_NVMUPD_CSUM_CON: + /* Assumes the caller has acquired the nvm */ status = i40e_update_nvm_checksum(hw); if (status) { *perrno = hw->aq.asq_last_status ? @@ -971,6 +992,7 @@ static i40e_status i40e_nvmupd_state_writing(struct i40e_hw *hw, break;
case I40E_NVMUPD_CSUM_LCB: + /* Assumes the caller has acquired the nvm */ status = i40e_update_nvm_checksum(hw); if (status) { *perrno = hw->aq.asq_last_status ? diff --git a/drivers/net/ethernet/intel/i40e/i40e_prototype.h b/drivers/net/ethernet/intel/i40e/i40e_prototype.h index bb9d583e5416f..6caa2ab0ad743 100644 --- a/drivers/net/ethernet/intel/i40e/i40e_prototype.h +++ b/drivers/net/ethernet/intel/i40e/i40e_prototype.h @@ -282,8 +282,6 @@ i40e_status i40e_acquire_nvm(struct i40e_hw *hw, void i40e_release_nvm(struct i40e_hw *hw); i40e_status i40e_read_nvm_word(struct i40e_hw *hw, u16 offset, u16 *data); -i40e_status i40e_read_nvm_buffer(struct i40e_hw *hw, u16 offset, - u16 *words, u16 *data); i40e_status i40e_update_nvm_checksum(struct i40e_hw *hw); i40e_status i40e_validate_nvm_checksum(struct i40e_hw *hw, u16 *checksum);
From: Gal Pressman galp@mellanox.com
[ Upstream commit 8ce59b16b4b6eacedaec1f7b652b4781cdbfe15f ]
When wait for firmware init fails, previous code would mistakenly return success and cause inconsistency in the driver state.
Fixes: 6c780a0267b8 ("net/mlx5: Wait for FW readiness before initializing command interface") Signed-off-by: Gal Pressman galp@mellanox.com Signed-off-by: Saeed Mahameed saeedm@mellanox.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/mellanox/mlx5/core/main.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/main.c b/drivers/net/ethernet/mellanox/mlx5/core/main.c index bf4447581072f..e88605de84cca 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/main.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/main.c @@ -933,7 +933,7 @@ static int mlx5_load_one(struct mlx5_core_dev *dev, struct mlx5_priv *priv) if (err) { dev_err(&dev->pdev->dev, "Firmware over %d MS in pre-initializing state, aborting\n", FW_PRE_INIT_TIMEOUT_MILI); - goto out; + goto out_err; }
err = mlx5_cmd_init(dev);
From: Arnd Bergmann arnd@arndb.de
[ Upstream commit 2c407aca64977ede9b9f35158e919773cae2082f ]
gcc-10 warns around a suspicious access to an empty struct member:
net/netfilter/nf_conntrack_core.c: In function '__nf_conntrack_alloc': net/netfilter/nf_conntrack_core.c:1522:9: warning: array subscript 0 is outside the bounds of an interior zero-length array 'u8[0]' {aka 'unsigned char[0]'} [-Wzero-length-bounds] 1522 | memset(&ct->__nfct_init_offset[0], 0, | ^~~~~~~~~~~~~~~~~~~~~~~~~~ In file included from net/netfilter/nf_conntrack_core.c:37: include/net/netfilter/nf_conntrack.h:90:5: note: while referencing '__nfct_init_offset' 90 | u8 __nfct_init_offset[0]; | ^~~~~~~~~~~~~~~~~~
The code is correct but a bit unusual. Rework it slightly in a way that does not trigger the warning, using an empty struct instead of an empty array. There are probably more elegant ways to do this, but this is the smallest change.
Fixes: c41884ce0562 ("netfilter: conntrack: avoid zeroing timer") Signed-off-by: Arnd Bergmann arnd@arndb.de Signed-off-by: Pablo Neira Ayuso pablo@netfilter.org Signed-off-by: Sasha Levin sashal@kernel.org --- include/net/netfilter/nf_conntrack.h | 2 +- net/netfilter/nf_conntrack_core.c | 4 ++-- 2 files changed, 3 insertions(+), 3 deletions(-)
diff --git a/include/net/netfilter/nf_conntrack.h b/include/net/netfilter/nf_conntrack.h index 636e9e11bd5f6..e3f73fd1d53a9 100644 --- a/include/net/netfilter/nf_conntrack.h +++ b/include/net/netfilter/nf_conntrack.h @@ -98,7 +98,7 @@ struct nf_conn { possible_net_t ct_net;
/* all members below initialized via memset */ - u8 __nfct_init_offset[0]; + struct { } __nfct_init_offset;
/* If we were expected by an expectation, this will be it */ struct nf_conn *master; diff --git a/net/netfilter/nf_conntrack_core.c b/net/netfilter/nf_conntrack_core.c index de0aad12b91d2..e58516274e86a 100644 --- a/net/netfilter/nf_conntrack_core.c +++ b/net/netfilter/nf_conntrack_core.c @@ -898,9 +898,9 @@ __nf_conntrack_alloc(struct net *net, /* Don't set timer yet: wait for confirmation */ setup_timer(&ct->timeout, death_by_timeout, (unsigned long)ct); write_pnet(&ct->ct_net, net); - memset(&ct->__nfct_init_offset[0], 0, + memset(&ct->__nfct_init_offset, 0, offsetof(struct nf_conn, proto) - - offsetof(struct nf_conn, __nfct_init_offset[0])); + offsetof(struct nf_conn, __nfct_init_offset));
if (zone && nf_ct_zone_add(ct, GFP_ATOMIC, zone) < 0) goto out_free;
From: Jack Morgenstein jackm@dev.mellanox.co.il
[ Upstream commit 6693ca95bd4330a0ad7326967e1f9bcedd6b0800 ]
In the mlx4_ib_post_send() flow, some functions call ib_get_cached_pkey() without checking its return value. If ib_get_cached_pkey() returns an error code, these functions should return failure.
Fixes: 1ffeb2eb8be9 ("IB/mlx4: SR-IOV IB context objects and proxy/tunnel SQP support") Fixes: 225c7b1feef1 ("IB/mlx4: Add a driver Mellanox ConnectX InfiniBand adapters") Fixes: e622f2f4ad21 ("IB: split struct ib_send_wr") Link: https://lore.kernel.org/r/20200426075921.130074-1-leon@kernel.org Signed-off-by: Jack Morgenstein jackm@dev.mellanox.co.il Signed-off-by: Leon Romanovsky leonro@mellanox.com Signed-off-by: Jason Gunthorpe jgg@mellanox.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/infiniband/hw/mlx4/qp.c | 14 +++++++++++--- 1 file changed, 11 insertions(+), 3 deletions(-)
diff --git a/drivers/infiniband/hw/mlx4/qp.c b/drivers/infiniband/hw/mlx4/qp.c index 348828271cb07..ecd461ee6dbe2 100644 --- a/drivers/infiniband/hw/mlx4/qp.c +++ b/drivers/infiniband/hw/mlx4/qp.c @@ -2156,6 +2156,7 @@ static int build_sriov_qp0_header(struct mlx4_ib_sqp *sqp, int send_size; int header_size; int spc; + int err; int i;
if (wr->wr.opcode != IB_WR_SEND) @@ -2190,7 +2191,9 @@ static int build_sriov_qp0_header(struct mlx4_ib_sqp *sqp,
sqp->ud_header.lrh.virtual_lane = 0; sqp->ud_header.bth.solicited_event = !!(wr->wr.send_flags & IB_SEND_SOLICITED); - ib_get_cached_pkey(ib_dev, sqp->qp.port, 0, &pkey); + err = ib_get_cached_pkey(ib_dev, sqp->qp.port, 0, &pkey); + if (err) + return err; sqp->ud_header.bth.pkey = cpu_to_be16(pkey); if (sqp->qp.mlx4_ib_qp_type == MLX4_IB_QPT_TUN_SMI_OWNER) sqp->ud_header.bth.destination_qpn = cpu_to_be32(wr->remote_qpn); @@ -2423,9 +2426,14 @@ static int build_mlx_header(struct mlx4_ib_sqp *sqp, struct ib_ud_wr *wr, } sqp->ud_header.bth.solicited_event = !!(wr->wr.send_flags & IB_SEND_SOLICITED); if (!sqp->qp.ibqp.qp_num) - ib_get_cached_pkey(ib_dev, sqp->qp.port, sqp->pkey_index, &pkey); + err = ib_get_cached_pkey(ib_dev, sqp->qp.port, sqp->pkey_index, + &pkey); else - ib_get_cached_pkey(ib_dev, sqp->qp.port, wr->pkey_index, &pkey); + err = ib_get_cached_pkey(ib_dev, sqp->qp.port, wr->pkey_index, + &pkey); + if (err) + return err; + sqp->ud_header.bth.pkey = cpu_to_be16(pkey); sqp->ud_header.bth.destination_qpn = cpu_to_be32(wr->remote_qpn); sqp->ud_header.bth.psn = cpu_to_be32((sqp->send_psn++) & ((1 << 24) - 1));
From: Jason Gunthorpe jgg@mellanox.com
commit 01b2bafe57b19d9119413f138765ef57990921ce upstream.
Aside from good practice, this avoids a warning from gcc 10:
./include/linux/kernel.h:997:3: warning: array subscript -31 is outside array bounds of ‘struct list_head[1]’ [-Warray-bounds] 997 | ((type *)(__mptr - offsetof(type, member))); }) | ~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ./include/linux/list.h:493:2: note: in expansion of macro ‘container_of’ 493 | container_of(ptr, type, member) | ^~~~~~~~~~~~ ./include/linux/pnp.h:275:30: note: in expansion of macro ‘list_entry’ 275 | #define global_to_pnp_dev(n) list_entry(n, struct pnp_dev, global_list) | ^~~~~~~~~~ ./include/linux/pnp.h:281:11: note: in expansion of macro ‘global_to_pnp_dev’ 281 | (dev) != global_to_pnp_dev(&pnp_global); \ | ^~~~~~~~~~~~~~~~~ arch/x86/kernel/rtc.c:189:2: note: in expansion of macro ‘pnp_for_each_dev’ 189 | pnp_for_each_dev(dev) {
Because the common code doesn't cast the starting list_head to the containing struct.
Signed-off-by: Jason Gunthorpe jgg@mellanox.com [ rjw: Whitespace adjustments ] Signed-off-by: Rafael J. Wysocki rafael.j.wysocki@intel.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- include/linux/pnp.h | 29 +++++++++-------------------- 1 file changed, 9 insertions(+), 20 deletions(-)
--- a/include/linux/pnp.h +++ b/include/linux/pnp.h @@ -219,10 +219,8 @@ struct pnp_card { #define global_to_pnp_card(n) list_entry(n, struct pnp_card, global_list) #define protocol_to_pnp_card(n) list_entry(n, struct pnp_card, protocol_list) #define to_pnp_card(n) container_of(n, struct pnp_card, dev) -#define pnp_for_each_card(card) \ - for((card) = global_to_pnp_card(pnp_cards.next); \ - (card) != global_to_pnp_card(&pnp_cards); \ - (card) = global_to_pnp_card((card)->global_list.next)) +#define pnp_for_each_card(card) \ + list_for_each_entry(card, &pnp_cards, global_list)
struct pnp_card_link { struct pnp_card *card; @@ -275,14 +273,9 @@ struct pnp_dev { #define card_to_pnp_dev(n) list_entry(n, struct pnp_dev, card_list) #define protocol_to_pnp_dev(n) list_entry(n, struct pnp_dev, protocol_list) #define to_pnp_dev(n) container_of(n, struct pnp_dev, dev) -#define pnp_for_each_dev(dev) \ - for((dev) = global_to_pnp_dev(pnp_global.next); \ - (dev) != global_to_pnp_dev(&pnp_global); \ - (dev) = global_to_pnp_dev((dev)->global_list.next)) -#define card_for_each_dev(card,dev) \ - for((dev) = card_to_pnp_dev((card)->devices.next); \ - (dev) != card_to_pnp_dev(&(card)->devices); \ - (dev) = card_to_pnp_dev((dev)->card_list.next)) +#define pnp_for_each_dev(dev) list_for_each_entry(dev, &pnp_global, global_list) +#define card_for_each_dev(card, dev) \ + list_for_each_entry(dev, &(card)->devices, card_list) #define pnp_dev_name(dev) (dev)->name
static inline void *pnp_get_drvdata(struct pnp_dev *pdev) @@ -434,14 +427,10 @@ struct pnp_protocol { };
#define to_pnp_protocol(n) list_entry(n, struct pnp_protocol, protocol_list) -#define protocol_for_each_card(protocol,card) \ - for((card) = protocol_to_pnp_card((protocol)->cards.next); \ - (card) != protocol_to_pnp_card(&(protocol)->cards); \ - (card) = protocol_to_pnp_card((card)->protocol_list.next)) -#define protocol_for_each_dev(protocol,dev) \ - for((dev) = protocol_to_pnp_dev((protocol)->devices.next); \ - (dev) != protocol_to_pnp_dev(&(protocol)->devices); \ - (dev) = protocol_to_pnp_dev((dev)->protocol_list.next)) +#define protocol_for_each_card(protocol, card) \ + list_for_each_entry(card, &(protocol)->cards, protocol_list) +#define protocol_for_each_dev(protocol, dev) \ + list_for_each_entry(dev, &(protocol)->devices, protocol_list)
extern struct bus_type pnp_bus_type;
From: Linus Torvalds torvalds@linux-foundation.org
commit 9d82973e032e246ff5663c9805fbb5407ae932e3 upstream.
Due to a bug-report that was compiler-dependent, I updated one of my machines to gcc-10. That shows a lot of new warnings. Happily they seem to be mostly the valid kind, but it's going to cause a round of churn for getting rid of them..
This is the really low-hanging fruit of removing a couple of zero-sized arrays in some core code. We have had a round of these patches before, and we'll have many more coming, and there is nothing special about these except that they were particularly trivial, and triggered more warnings than most.
Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- include/linux/fs.h | 2 +- include/linux/tty.h | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-)
--- a/include/linux/fs.h +++ b/include/linux/fs.h @@ -915,7 +915,7 @@ struct file_handle { __u32 handle_bytes; int handle_type; /* file identifier */ - unsigned char f_handle[0]; + unsigned char f_handle[]; };
static inline struct file *get_file(struct file *f) --- a/include/linux/tty.h +++ b/include/linux/tty.h @@ -64,7 +64,7 @@ struct tty_buffer { int read; int flags; /* Data points here */ - unsigned long data[0]; + unsigned long data[]; };
/* Values for .flags field of tty_buffer */
From: Masahiro Yamada yamada.masahiro@socionext.com
commit b303c6df80c9f8f13785aa83a0471fca7e38b24d upstream.
Since -Wmaybe-uninitialized was introduced by GCC 4.7, we have patched various false positives:
- commit e74fc973b6e5 ("Turn off -Wmaybe-uninitialized when building with -Os") turned off this option for -Os.
- commit 815eb71e7149 ("Kbuild: disable 'maybe-uninitialized' warning for CONFIG_PROFILE_ALL_BRANCHES") turned off this option for CONFIG_PROFILE_ALL_BRANCHES
- commit a76bcf557ef4 ("Kbuild: enable -Wmaybe-uninitialized warning for "make W=1"") turned off this option for GCC < 4.9 Arnd provided more explanation in https://lkml.org/lkml/2017/3/14/903
I think this looks better by shifting the logic from Makefile to Kconfig.
Link: https://github.com/ClangBuiltLinux/linux/issues/350 Signed-off-by: Masahiro Yamada yamada.masahiro@socionext.com Reviewed-by: Nathan Chancellor natechancellor@gmail.com Tested-by: Nick Desaulniers ndesaulniers@google.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- Makefile | 5 ++++- init/Kconfig | 17 +++++++++++++++++ kernel/trace/Kconfig | 1 + 3 files changed, 22 insertions(+), 1 deletion(-)
--- a/Makefile +++ b/Makefile @@ -631,7 +631,6 @@ ARCH_CFLAGS := include arch/$(SRCARCH)/Makefile
KBUILD_CFLAGS += $(call cc-option,-fno-delete-null-pointer-checks,) -KBUILD_CFLAGS += $(call cc-disable-warning,maybe-uninitialized,) KBUILD_CFLAGS += $(call cc-disable-warning,frame-address,) KBUILD_CFLAGS += $(call cc-disable-warning, format-truncation) KBUILD_CFLAGS += $(call cc-disable-warning, format-overflow) @@ -649,6 +648,10 @@ KBUILD_CFLAGS += -O2 endif endif
+ifdef CONFIG_CC_DISABLE_WARN_MAYBE_UNINITIALIZED +KBUILD_CFLAGS += -Wno-maybe-uninitialized +endif + # Tell gcc to never replace conditional load with a non-conditional one KBUILD_CFLAGS += $(call cc-option,--param=allow-store-data-races=0)
--- a/init/Kconfig +++ b/init/Kconfig @@ -16,6 +16,22 @@ config DEFCONFIG_LIST default "$ARCH_DEFCONFIG" default "arch/$ARCH/defconfig"
+config CC_HAS_WARN_MAYBE_UNINITIALIZED + def_bool $(cc-option,-Wmaybe-uninitialized) + help + GCC >= 4.7 supports this option. + +config CC_DISABLE_WARN_MAYBE_UNINITIALIZED + bool + depends on CC_HAS_WARN_MAYBE_UNINITIALIZED + default CC_IS_GCC && GCC_VERSION < 40900 # unreliable for GCC < 4.9 + help + GCC's -Wmaybe-uninitialized is not reliable by definition. + Lots of false positive warnings are produced in some cases. + + If this option is enabled, -Wno-maybe-uninitialzed is passed + to the compiler to suppress maybe-uninitialized warnings. + config CONSTRUCTORS bool depends on !UML @@ -1331,6 +1347,7 @@ config CC_OPTIMIZE_FOR_PERFORMANCE
config CC_OPTIMIZE_FOR_SIZE bool "Optimize for size" + imply CC_DISABLE_WARN_MAYBE_UNINITIALIZED # avoid false positives help Enabling this option will pass "-Os" instead of "-O2" to your compiler resulting in a smaller kernel. --- a/kernel/trace/Kconfig +++ b/kernel/trace/Kconfig @@ -312,6 +312,7 @@ config PROFILE_ANNOTATED_BRANCHES config PROFILE_ALL_BRANCHES bool "Profile all if conditionals" select TRACE_BRANCH_PROFILING + imply CC_DISABLE_WARN_MAYBE_UNINITIALIZED # avoid false positives help This tracer profiles all branch conditions. Every if () taken in the kernel is recorded whether it hit or miss.
From: Linus Torvalds torvalds@linux-foundation.org
commit 78a5255ffb6a1af189a83e493d916ba1c54d8c75 upstream.
We have some rather random rules about when we accept the "maybe-initialized" warnings, and when we don't.
For example, we consider it unreliable for gcc versions < 4.9, but also if -O3 is enabled, or if optimizing for size. And then various kernel config options disabled it, because they know that they trigger that warning by confusing gcc sufficiently (ie PROFILE_ALL_BRANCHES).
And now gcc-10 seems to be introducing a lot of those warnings too, so it falls under the same heading as 4.9 did.
At the same time, we have a very straightforward way to _enable_ that warning when wanted: use "W=2" to enable more warnings.
So stop playing these ad-hoc games, and just disable that warning by default, with the known and straight-forward "if you want to work on the extra compiler warnings, use W=123".
Would it be great to have code that is always so obvious that it never confuses the compiler whether a variable is used initialized or not? Yes, it would. In a perfect world, the compilers would be smarter, and our source code would be simpler.
That's currently not the world we live in, though.
Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- Makefile | 7 +++---- init/Kconfig | 17 ----------------- kernel/trace/Kconfig | 1 - 3 files changed, 3 insertions(+), 22 deletions(-)
--- a/Makefile +++ b/Makefile @@ -648,10 +648,6 @@ KBUILD_CFLAGS += -O2 endif endif
-ifdef CONFIG_CC_DISABLE_WARN_MAYBE_UNINITIALIZED -KBUILD_CFLAGS += -Wno-maybe-uninitialized -endif - # Tell gcc to never replace conditional load with a non-conditional one KBUILD_CFLAGS += $(call cc-option,--param=allow-store-data-races=0)
@@ -799,6 +795,9 @@ KBUILD_CFLAGS += $(call cc-disable-warni # disable stringop warnings in gcc 8+ KBUILD_CFLAGS += $(call cc-disable-warning, stringop-truncation)
+# Enabled with W=2, disabled by default as noisy +KBUILD_CFLAGS += $(call cc-disable-warning, maybe-uninitialized) + # disable invalid "can't wrap" optimizations for signed / pointers KBUILD_CFLAGS += $(call cc-option,-fno-strict-overflow)
--- a/init/Kconfig +++ b/init/Kconfig @@ -16,22 +16,6 @@ config DEFCONFIG_LIST default "$ARCH_DEFCONFIG" default "arch/$ARCH/defconfig"
-config CC_HAS_WARN_MAYBE_UNINITIALIZED - def_bool $(cc-option,-Wmaybe-uninitialized) - help - GCC >= 4.7 supports this option. - -config CC_DISABLE_WARN_MAYBE_UNINITIALIZED - bool - depends on CC_HAS_WARN_MAYBE_UNINITIALIZED - default CC_IS_GCC && GCC_VERSION < 40900 # unreliable for GCC < 4.9 - help - GCC's -Wmaybe-uninitialized is not reliable by definition. - Lots of false positive warnings are produced in some cases. - - If this option is enabled, -Wno-maybe-uninitialzed is passed - to the compiler to suppress maybe-uninitialized warnings. - config CONSTRUCTORS bool depends on !UML @@ -1347,7 +1331,6 @@ config CC_OPTIMIZE_FOR_PERFORMANCE
config CC_OPTIMIZE_FOR_SIZE bool "Optimize for size" - imply CC_DISABLE_WARN_MAYBE_UNINITIALIZED # avoid false positives help Enabling this option will pass "-Os" instead of "-O2" to your compiler resulting in a smaller kernel. --- a/kernel/trace/Kconfig +++ b/kernel/trace/Kconfig @@ -312,7 +312,6 @@ config PROFILE_ANNOTATED_BRANCHES config PROFILE_ALL_BRANCHES bool "Profile all if conditionals" select TRACE_BRANCH_PROFILING - imply CC_DISABLE_WARN_MAYBE_UNINITIALIZED # avoid false positives help This tracer profiles all branch conditions. Every if () taken in the kernel is recorded whether it hit or miss.
From: Linus Torvalds torvalds@linux-foundation.org
commit 5c45de21a2223fe46cf9488c99a7fbcf01527670 upstream.
This is a fine warning, but we still have a number of zero-length arrays in the kernel that come from the traditional gcc extension. Yes, they are getting converted to flexible arrays, but in the meantime the gcc-10 warning about zero-length bounds is very verbose, and is hiding other issues.
I missed one actual build failure because it was hidden among hundreds of lines of warning. Thankfully I caught it on the second go before pushing things out, but it convinced me that I really need to disable the new warnings for now.
We'll hopefully be all done with our conversion to flexible arrays in the not too distant future, and we can then re-enable this warning.
Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- Makefile | 3 +++ 1 file changed, 3 insertions(+)
--- a/Makefile +++ b/Makefile @@ -795,6 +795,9 @@ KBUILD_CFLAGS += $(call cc-disable-warni # disable stringop warnings in gcc 8+ KBUILD_CFLAGS += $(call cc-disable-warning, stringop-truncation)
+# We'll want to enable this eventually, but it's not going away for 5.7 at least +KBUILD_CFLAGS += $(call cc-disable-warning, zero-length-bounds) + # Enabled with W=2, disabled by default as noisy KBUILD_CFLAGS += $(call cc-disable-warning, maybe-uninitialized)
From: Linus Torvalds torvalds@linux-foundation.org
commit 44720996e2d79e47d508b0abe99b931a726a3197 upstream.
This is another fine warning, related to the 'zero-length-bounds' one, but hitting the same historical code in the kernel.
Because C didn't historically support flexible array members, we have code that instead uses a one-sized array, the same way we have cases of zero-sized arrays.
The one-sized arrays come from either not wanting to use the gcc zero-sized array extension, or from a slight convenience-feature, where particularly for strings, the size of the structure now includes the allocation for the final NUL character.
So with a "char name[1];" at the end of a structure, you can do things like
v = my_malloc(sizeof(struct vendor) + strlen(name));
and avoid the "+1" for the terminator.
Yes, the modern way to do that is with a flexible array, and using 'offsetof()' instead of 'sizeof()', and adding the "+1" by hand. That also technically gets the size "more correct" in that it avoids any alignment (and thus padding) issues, but this is another long-term cleanup thing that will not happen for 5.7.
So disable the warning for now, even though it's potentially quite useful. Having a slew of warnings that then hide more urgent new issues is not an improvement.
Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- Makefile | 1 + 1 file changed, 1 insertion(+)
--- a/Makefile +++ b/Makefile @@ -797,6 +797,7 @@ KBUILD_CFLAGS += $(call cc-disable-warni
# We'll want to enable this eventually, but it's not going away for 5.7 at least KBUILD_CFLAGS += $(call cc-disable-warning, zero-length-bounds) +KBUILD_CFLAGS += $(call cc-disable-warning, array-bounds)
# Enabled with W=2, disabled by default as noisy KBUILD_CFLAGS += $(call cc-disable-warning, maybe-uninitialized)
From: Linus Torvalds torvalds@linux-foundation.org
commit 5a76021c2eff7fcf2f0918a08fd8a37ce7922921 upstream.
This is the final array bounds warning removal for gcc-10 for now.
Again, the warning is good, and we should re-enable all these warnings when we have converted all the legacy array declaration cases to flexible arrays. But in the meantime, it's just noise.
Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- Makefile | 1 + 1 file changed, 1 insertion(+)
--- a/Makefile +++ b/Makefile @@ -798,6 +798,7 @@ KBUILD_CFLAGS += $(call cc-disable-warni # We'll want to enable this eventually, but it's not going away for 5.7 at least KBUILD_CFLAGS += $(call cc-disable-warning, zero-length-bounds) KBUILD_CFLAGS += $(call cc-disable-warning, array-bounds) +KBUILD_CFLAGS += $(call cc-disable-warning, stringop-overflow)
# Enabled with W=2, disabled by default as noisy KBUILD_CFLAGS += $(call cc-disable-warning, maybe-uninitialized)
From: Linus Torvalds torvalds@linux-foundation.org
commit adc71920969870dfa54e8f40dac8616284832d02 upstream.
gcc-10 now warns about passing aliasing pointers to functions that take restricted pointers.
That's actually a great warning, and if we ever start using 'restrict' in the kernel, it might be quite useful. But right now we don't, and it turns out that the only thing this warns about is an idiom where we have declared a few functions to be "printf-like" (which seems to make gcc pick up the restricted pointer thing), and then we print to the same buffer that we also use as an input.
And people do that as an odd concatenation pattern, with code like this:
#define sysfs_show_gen_prop(buffer, fmt, ...) \ snprintf(buffer, PAGE_SIZE, "%s"fmt, buffer, __VA_ARGS__)
where we have 'buffer' as both the destination of the final result, and as the initial argument.
Yes, it's a bit questionable. And outside of the kernel, people do have standard declarations like
int snprintf( char *restrict buffer, size_t bufsz, const char *restrict format, ... );
where that output buffer is marked as a restrict pointer that cannot alias with any other arguments.
But in the context of the kernel, that 'use snprintf() to concatenate to the end result' does work, and the pattern shows up in multiple places. And we have not marked our own version of snprintf() as taking restrict pointers, so the warning is incorrect for now, and gcc picks it up on its own.
If we do start using 'restrict' in the kernel (and it might be a good idea if people find places where it matters), we'll need to figure out how to avoid this issue for snprintf and friends. But in the meantime, this warning is not useful.
Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- Makefile | 3 +++ 1 file changed, 3 insertions(+)
--- a/Makefile +++ b/Makefile @@ -800,6 +800,9 @@ KBUILD_CFLAGS += $(call cc-disable-warni KBUILD_CFLAGS += $(call cc-disable-warning, array-bounds) KBUILD_CFLAGS += $(call cc-disable-warning, stringop-overflow)
+# Another good warning that we'll want to enable eventually +KBUILD_CFLAGS += $(call cc-disable-warning, restrict) + # Enabled with W=2, disabled by default as noisy KBUILD_CFLAGS += $(call cc-disable-warning, maybe-uninitialized)
From: Christoph Hellwig hch@lst.de
commit 287922eb0b186e2a5bf54fdd04b734c25c90035c upstream.
Timer context is not very useful for drivers to perform any meaningful abort action from. So instead of calling the driver from this useless context defer it to a workqueue as soon as possible.
Note that while a delayed_work item would seem the right thing here I didn't dare to use it due to the magic in blk_add_timer that pokes deep into timer internals. But maybe this encourages Tejun to add a sensible API for that to the workqueue API and we'll all be fine in the end :)
Contains a major update from Keith Bush:
"This patch removes synchronizing the timeout work so that the timer can start a freeze on its own queue. The timer enters the queue, so timer context can only start a freeze, but not wait for frozen."
------------- NOTE: Back-ported to 4.4.y.
The only parts of the upstream commit that have been kept are various locking changes, none of which were mentioned in the original commit message which therefore describes this change not at all.
Timeout callbacks continue to be run via a timer. Both blk_mq_rq_timer and blk_rq_timed_out_timer will return without without doing any work if they cannot acquire the queue (without waiting). -------------
Signed-off-by: Christoph Hellwig hch@lst.de Acked-by: Keith Busch keith.busch@intel.com Signed-off-by: Jens Axboe axboe@fb.com Signed-off-by: Giuliano Procida gprocida@google.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- block/blk-mq.c | 4 ++++ block/blk-timeout.c | 3 +++ 2 files changed, 7 insertions(+)
--- a/block/blk-mq.c +++ b/block/blk-mq.c @@ -628,6 +628,9 @@ static void blk_mq_rq_timer(unsigned lon }; int i;
+ if (blk_queue_enter(q, GFP_NOWAIT)) + return; + blk_mq_queue_tag_busy_iter(q, blk_mq_check_expired, &data);
if (data.next_set) { @@ -642,6 +645,7 @@ static void blk_mq_rq_timer(unsigned lon blk_mq_tag_idle(hctx); } } + blk_queue_exit(q); }
/* --- a/block/blk-timeout.c +++ b/block/blk-timeout.c @@ -134,6 +134,8 @@ void blk_rq_timed_out_timer(unsigned lon struct request *rq, *tmp; int next_set = 0;
+ if (blk_queue_enter(q, GFP_NOWAIT)) + return; spin_lock_irqsave(q->queue_lock, flags);
list_for_each_entry_safe(rq, tmp, &q->timeout_list, timeout_list) @@ -143,6 +145,7 @@ void blk_rq_timed_out_timer(unsigned lon mod_timer(&q->timeout, round_jiffies_up(next));
spin_unlock_irqrestore(q->queue_lock, flags); + blk_queue_exit(q); }
/**
On Mon, May 18, 2020 at 07:36:34PM +0200, Greg Kroah-Hartman wrote:
From: Christoph Hellwig hch@lst.de
commit 287922eb0b186e2a5bf54fdd04b734c25c90035c upstream.
I notice 287922eb0b18 has been referenced in Fixes-tag in mainline commit 5480e299b5ae ("scsi: iscsi: Fix a potential deadlock in the timeout handler"). Consider if backporting 5480e299b5ae together with this 4.4 version of 287922eb0b18 is also relevant.
Thanks, -- Henri
Timer context is not very useful for drivers to perform any meaningful abort action from. So instead of calling the driver from this useless context defer it to a workqueue as soon as possible.
Note that while a delayed_work item would seem the right thing here I didn't dare to use it due to the magic in blk_add_timer that pokes deep into timer internals. But maybe this encourages Tejun to add a sensible API for that to the workqueue API and we'll all be fine in the end :)
Contains a major update from Keith Bush:
"This patch removes synchronizing the timeout work so that the timer can start a freeze on its own queue. The timer enters the queue, so timer context can only start a freeze, but not wait for frozen."
NOTE: Back-ported to 4.4.y.
The only parts of the upstream commit that have been kept are various locking changes, none of which were mentioned in the original commit message which therefore describes this change not at all.
Timeout callbacks continue to be run via a timer. Both blk_mq_rq_timer and blk_rq_timed_out_timer will return without without doing any work if they cannot acquire the queue (without waiting).
On Tue, May 19, 2020 at 09:00:38AM +0300, Henri Rosten wrote:
On Mon, May 18, 2020 at 07:36:34PM +0200, Greg Kroah-Hartman wrote:
From: Christoph Hellwig hch@lst.de
commit 287922eb0b186e2a5bf54fdd04b734c25c90035c upstream.
I notice 287922eb0b18 has been referenced in Fixes-tag in mainline commit 5480e299b5ae ("scsi: iscsi: Fix a potential deadlock in the timeout handler"). Consider if backporting 5480e299b5ae together with this 4.4 version of 287922eb0b18 is also relevant.
Ugh, there are lots of "fixes for fixes" in these releases that seem to be needed. Let me dig through and find them all as I should have done before starting this -rc round :(
thanks,
greg k-h
On Tue, May 19, 2020 at 09:31:47AM +0200, Greg Kroah-Hartman wrote:
On Tue, May 19, 2020 at 09:00:38AM +0300, Henri Rosten wrote:
On Mon, May 18, 2020 at 07:36:34PM +0200, Greg Kroah-Hartman wrote:
From: Christoph Hellwig hch@lst.de
commit 287922eb0b186e2a5bf54fdd04b734c25c90035c upstream.
I notice 287922eb0b18 has been referenced in Fixes-tag in mainline commit 5480e299b5ae ("scsi: iscsi: Fix a potential deadlock in the timeout handler"). Consider if backporting 5480e299b5ae together with this 4.4 version of 287922eb0b18 is also relevant.
Ugh, there are lots of "fixes for fixes" in these releases that seem to be needed. Let me dig through and find them all as I should have done before starting this -rc round :(
Turns out you found the two issues here, and I've queued up this fix now.
thanks,
greg k-h
From: Gabriel Krisman Bertazi krisman@linux.vnet.ibm.com
commit 71f79fb3179e69b0c1448a2101a866d871c66e7f upstream.
In case a submitted request gets stuck for some reason, the block layer can prevent the request starvation by starting the scheduled timeout work. If this stuck request occurs at the same time another thread has started a queue freeze, the blk_mq_timeout_work will not be able to acquire the queue reference and will return silently, thus not issuing the timeout. But since the request is already holding a q_usage_counter reference and is unable to complete, it will never release its reference, preventing the queue from completing the freeze started by first thread. This puts the request_queue in a hung state, forever waiting for the freeze completion.
This was observed while running IO to a NVMe device at the same time we toggled the CPU hotplug code. Eventually, once a request got stuck requiring a timeout during a queue freeze, we saw the CPU Hotplug notification code get stuck inside blk_mq_freeze_queue_wait, as shown in the trace below.
[c000000deaf13690] [c000000deaf13738] 0xc000000deaf13738 (unreliable) [c000000deaf13860] [c000000000015ce8] __switch_to+0x1f8/0x350 [c000000deaf138b0] [c000000000ade0e4] __schedule+0x314/0x990 [c000000deaf13940] [c000000000ade7a8] schedule+0x48/0xc0 [c000000deaf13970] [c0000000005492a4] blk_mq_freeze_queue_wait+0x74/0x110 [c000000deaf139e0] [c00000000054b6a8] blk_mq_queue_reinit_notify+0x1a8/0x2e0 [c000000deaf13a40] [c0000000000e7878] notifier_call_chain+0x98/0x100 [c000000deaf13a90] [c0000000000b8e08] cpu_notify_nofail+0x48/0xa0 [c000000deaf13ac0] [c0000000000b92f0] _cpu_down+0x2a0/0x400 [c000000deaf13b90] [c0000000000b94a8] cpu_down+0x58/0xa0 [c000000deaf13bc0] [c0000000006d5dcc] cpu_subsys_offline+0x2c/0x50 [c000000deaf13bf0] [c0000000006cd244] device_offline+0x104/0x140 [c000000deaf13c30] [c0000000006cd40c] online_store+0x6c/0xc0 [c000000deaf13c80] [c0000000006c8c78] dev_attr_store+0x68/0xa0 [c000000deaf13cc0] [c0000000003974d0] sysfs_kf_write+0x80/0xb0 [c000000deaf13d00] [c0000000003963e8] kernfs_fop_write+0x188/0x200 [c000000deaf13d50] [c0000000002e0f6c] __vfs_write+0x6c/0xe0 [c000000deaf13d90] [c0000000002e1ca0] vfs_write+0xc0/0x230 [c000000deaf13de0] [c0000000002e2cdc] SyS_write+0x6c/0x110 [c000000deaf13e30] [c000000000009204] system_call+0x38/0xb4
The fix is to allow the timeout work to execute in the window between dropping the initial refcount reference and the release of the last reference, which actually marks the freeze completion. This can be achieved with percpu_refcount_tryget, which does not require the counter to be alive. This way the timeout work can do it's job and terminate a stuck request even during a freeze, returning its reference and avoiding the deadlock.
Allowing the timeout to run is just a part of the fix, since for some devices, we might get stuck again inside the device driver's timeout handler, should it attempt to allocate a new request in that path - which is a quite common action for Abort commands, which need to be sent after a timeout. In NVMe, for instance, we call blk_mq_alloc_request from inside the timeout handler, which will fail during a freeze, since it also tries to acquire a queue reference.
I considered a similar change to blk_mq_alloc_request as a generic solution for further device driver hangs, but we can't do that, since it would allow new requests to disturb the freeze process. I thought about creating a new function in the block layer to support unfreezable requests for these occasions, but after working on it for a while, I feel like this should be handled in a per-driver basis. I'm now experimenting with changes to the NVMe timeout path, but I'm open to suggestions of ways to make this generic.
Signed-off-by: Gabriel Krisman Bertazi krisman@linux.vnet.ibm.com Cc: Brian King brking@linux.vnet.ibm.com Cc: Keith Busch keith.busch@intel.com Cc: linux-nvme@lists.infradead.org Cc: linux-block@vger.kernel.org Reviewed-by: Christoph Hellwig hch@lst.de Signed-off-by: Jens Axboe axboe@fb.com Signed-off-by: Giuliano Procida gprocida@google.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- block/blk-mq.c | 15 ++++++++++++++- 1 file changed, 14 insertions(+), 1 deletion(-)
--- a/block/blk-mq.c +++ b/block/blk-mq.c @@ -628,7 +628,20 @@ static void blk_mq_rq_timer(unsigned lon }; int i;
- if (blk_queue_enter(q, GFP_NOWAIT)) + /* A deadlock might occur if a request is stuck requiring a + * timeout at the same time a queue freeze is waiting + * completion, since the timeout code would not be able to + * acquire the queue reference here. + * + * That's why we don't use blk_queue_enter here; instead, we use + * percpu_ref_tryget directly, because we need to be able to + * obtain a reference even in the short window between the queue + * starting to freeze, by dropping the first reference in + * blk_mq_freeze_queue_start, and the moment the last request is + * consumed, marked by the instant q_usage_counter reaches + * zero. + */ + if (!percpu_ref_tryget(&q->q_usage_counter)) return;
blk_mq_queue_tag_busy_iter(q, blk_mq_check_expired, &data);
From: Jianchao Wang jianchao.w.wang@oracle.com
commit f5bbbbe4d63577026f908a809f22f5fd5a90ea1f upstream.
For blk-mq, part_in_flight/rw will invoke blk_mq_in_flight/rw to account the inflight requests. It will access the queue_hw_ctx and nr_hw_queues w/o any protection. When updating nr_hw_queues and blk_mq_in_flight/rw occur concurrently, panic comes up.
Before update nr_hw_queues, the q will be frozen. So we could use q_usage_counter to avoid the race. percpu_ref_is_zero is used here so that we will not miss any in-flight request. The access to nr_hw_queues and queue_hw_ctx in blk_mq_queue_tag_busy_iter are under rcu critical section, __blk_mq_update_nr_hw_queues could use synchronize_rcu to ensure the zeroed q_usage_counter to be globally visible.
-------------- NOTE: Back-ported to 4.4.y.
The upstream commit was intended to prevent concurrent manipulation of nr_hw_queues and iteration over queues. The former doesn't happen in this in 4.4.7 (as __blk_mq_update_nr_hw_queues doesn't exist). The extra locking is also buggy in this commit but fixed in a follow-up.
It may protect against other concurrent accesses such as queue removal by synchronising RCU locking around q_usage_counter. --------------
Signed-off-by: Jianchao Wang jianchao.w.wang@oracle.com Reviewed-by: Ming Lei ming.lei@redhat.com Signed-off-by: Jens Axboe axboe@kernel.dk Signed-off-by: Giuliano Procida gprocida@google.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- block/blk-mq-tag.c | 10 +++++++++- 1 file changed, 9 insertions(+), 1 deletion(-)
--- a/block/blk-mq-tag.c +++ b/block/blk-mq-tag.c @@ -481,6 +481,14 @@ void blk_mq_queue_tag_busy_iter(struct r struct blk_mq_hw_ctx *hctx; int i;
+ /* + * Avoid potential races with things like queue removal. + */ + rcu_read_lock(); + if (percpu_ref_is_zero(&q->q_usage_counter)) { + rcu_read_unlock(); + return; + }
queue_for_each_hw_ctx(q, hctx, i) { struct blk_mq_tags *tags = hctx->tags; @@ -497,7 +505,7 @@ void blk_mq_queue_tag_busy_iter(struct r bt_for_each(hctx, &tags->bitmap_tags, tags->nr_reserved_tags, fn, priv, false); } - + rcu_read_unlock(); }
static unsigned int bt_unused_tags(struct blk_mq_bitmap_tags *bt)
From: Keith Busch keith.busch@intel.com
commit 530ca2c9bd6949c72c9b5cfc330cb3dbccaa3f5b upstream.
A recent commit runs tag iterator callbacks under the rcu read lock, but existing callbacks do not satisfy the non-blocking requirement. The commit intended to prevent an iterator from accessing a queue that's being modified. This patch fixes the original issue by taking a queue reference instead of reading it, which allows callbacks to make blocking calls.
Fixes: f5bbbbe4d6357 ("blk-mq: sync the update nr_hw_queues with blk_mq_queue_tag_busy_iter") Acked-by: Jianchao Wang jianchao.w.wang@oracle.com Signed-off-by: Keith Busch keith.busch@intel.com Signed-off-by: Jens Axboe axboe@kernel.dk Signed-off-by: Giuliano Procida gprocida@google.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- block/blk-mq-tag.c | 7 ++----- 1 file changed, 2 insertions(+), 5 deletions(-)
--- a/block/blk-mq-tag.c +++ b/block/blk-mq-tag.c @@ -484,11 +484,8 @@ void blk_mq_queue_tag_busy_iter(struct r /* * Avoid potential races with things like queue removal. */ - rcu_read_lock(); - if (percpu_ref_is_zero(&q->q_usage_counter)) { - rcu_read_unlock(); + if (!percpu_ref_tryget(&q->q_usage_counter)) return; - }
queue_for_each_hw_ctx(q, hctx, i) { struct blk_mq_tags *tags = hctx->tags; @@ -505,7 +502,7 @@ void blk_mq_queue_tag_busy_iter(struct r bt_for_each(hctx, &tags->bitmap_tags, tags->nr_reserved_tags, fn, priv, false); } - rcu_read_unlock(); + blk_queue_exit(q); }
static unsigned int bt_unused_tags(struct blk_mq_bitmap_tags *bt)
From: Boris Ostrovsky boris.ostrovsky@oracle.com
commit 88c15ec90ff16880efab92b519436ee17b198477 upstream.
As result of commit "x86/xen: Avoid fast syscall path for Xen PV guests", the irq_enable_sysexit pv op is not called by Xen PV guests anymore and since they were the only ones who used it we can safely remove it.
Signed-off-by: Boris Ostrovsky boris.ostrovsky@oracle.com Reviewed-by: Borislav Petkov bp@suse.de Acked-by: Andy Lutomirski luto@kernel.org Cc: Andrew Morton akpm@linux-foundation.org Cc: Andy Lutomirski luto@amacapital.net Cc: Borislav Petkov bp@alien8.de Cc: Brian Gerst brgerst@gmail.com Cc: Denys Vlasenko dvlasenk@redhat.com Cc: H. Peter Anvin hpa@zytor.com Cc: Linus Torvalds torvalds@linux-foundation.org Cc: Peter Zijlstra peterz@infradead.org Cc: Thomas Gleixner tglx@linutronix.de Cc: david.vrabel@citrix.com Cc: konrad.wilk@oracle.com Cc: virtualization@lists.linux-foundation.org Cc: xen-devel@lists.xenproject.org Link: http://lkml.kernel.org/r/1447970147-1733-3-git-send-email-boris.ostrovsky@or... Signed-off-by: Ingo Molnar mingo@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- arch/x86/entry/entry_32.S | 8 ++------ arch/x86/include/asm/paravirt.h | 7 ------- arch/x86/include/asm/paravirt_types.h | 9 --------- arch/x86/kernel/asm-offsets.c | 3 --- arch/x86/kernel/paravirt.c | 7 ------- arch/x86/kernel/paravirt_patch_32.c | 2 -- arch/x86/kernel/paravirt_patch_64.c | 1 - arch/x86/xen/enlighten.c | 3 --- arch/x86/xen/xen-asm_32.S | 14 -------------- arch/x86/xen/xen-ops.h | 3 --- 10 files changed, 2 insertions(+), 55 deletions(-)
--- a/arch/x86/entry/entry_32.S +++ b/arch/x86/entry/entry_32.S @@ -331,7 +331,8 @@ sysenter_past_esp: * Return back to the vDSO, which will pop ecx and edx. * Don't bother with DS and ES (they already contain __USER_DS). */ - ENABLE_INTERRUPTS_SYSEXIT + sti + sysexit
.pushsection .fixup, "ax" 2: movl $0, PT_FS(%esp) @@ -554,11 +555,6 @@ ENTRY(native_iret) iret _ASM_EXTABLE(native_iret, iret_exc) END(native_iret) - -ENTRY(native_irq_enable_sysexit) - sti - sysexit -END(native_irq_enable_sysexit) #endif
ENTRY(overflow) --- a/arch/x86/include/asm/paravirt.h +++ b/arch/x86/include/asm/paravirt.h @@ -938,13 +938,6 @@ extern void default_banner(void); push %ecx; push %edx; \ call PARA_INDIRECT(pv_cpu_ops+PV_CPU_read_cr0); \ pop %edx; pop %ecx - -#define ENABLE_INTERRUPTS_SYSEXIT \ - PARA_SITE(PARA_PATCH(pv_cpu_ops, PV_CPU_irq_enable_sysexit), \ - CLBR_NONE, \ - jmp PARA_INDIRECT(pv_cpu_ops+PV_CPU_irq_enable_sysexit)) - - #else /* !CONFIG_X86_32 */
/* --- a/arch/x86/include/asm/paravirt_types.h +++ b/arch/x86/include/asm/paravirt_types.h @@ -162,15 +162,6 @@ struct pv_cpu_ops {
u64 (*read_pmc)(int counter);
-#ifdef CONFIG_X86_32 - /* - * Atomically enable interrupts and return to userspace. This - * is only used in 32-bit kernels. 64-bit kernels use - * usergs_sysret32 instead. - */ - void (*irq_enable_sysexit)(void); -#endif - /* * Switch to usermode gs and return to 64-bit usermode using * sysret. Only used in 64-bit kernels to return to 64-bit --- a/arch/x86/kernel/asm-offsets.c +++ b/arch/x86/kernel/asm-offsets.c @@ -65,9 +65,6 @@ void common(void) { OFFSET(PV_IRQ_irq_disable, pv_irq_ops, irq_disable); OFFSET(PV_IRQ_irq_enable, pv_irq_ops, irq_enable); OFFSET(PV_CPU_iret, pv_cpu_ops, iret); -#ifdef CONFIG_X86_32 - OFFSET(PV_CPU_irq_enable_sysexit, pv_cpu_ops, irq_enable_sysexit); -#endif OFFSET(PV_CPU_read_cr0, pv_cpu_ops, read_cr0); OFFSET(PV_MMU_read_cr2, pv_mmu_ops, read_cr2); #endif --- a/arch/x86/kernel/paravirt.c +++ b/arch/x86/kernel/paravirt.c @@ -168,9 +168,6 @@ unsigned paravirt_patch_default(u8 type, ret = paravirt_patch_ident_64(insnbuf, len);
else if (type == PARAVIRT_PATCH(pv_cpu_ops.iret) || -#ifdef CONFIG_X86_32 - type == PARAVIRT_PATCH(pv_cpu_ops.irq_enable_sysexit) || -#endif type == PARAVIRT_PATCH(pv_cpu_ops.usergs_sysret32) || type == PARAVIRT_PATCH(pv_cpu_ops.usergs_sysret64)) /* If operation requires a jmp, then jmp */ @@ -226,7 +223,6 @@ static u64 native_steal_clock(int cpu)
/* These are in entry.S */ extern void native_iret(void); -extern void native_irq_enable_sysexit(void); extern void native_usergs_sysret32(void); extern void native_usergs_sysret64(void);
@@ -385,9 +381,6 @@ __visible struct pv_cpu_ops pv_cpu_ops =
.load_sp0 = native_load_sp0,
-#if defined(CONFIG_X86_32) - .irq_enable_sysexit = native_irq_enable_sysexit, -#endif #ifdef CONFIG_X86_64 #ifdef CONFIG_IA32_EMULATION .usergs_sysret32 = native_usergs_sysret32, --- a/arch/x86/kernel/paravirt_patch_32.c +++ b/arch/x86/kernel/paravirt_patch_32.c @@ -5,7 +5,6 @@ DEF_NATIVE(pv_irq_ops, irq_enable, "sti" DEF_NATIVE(pv_irq_ops, restore_fl, "push %eax; popf"); DEF_NATIVE(pv_irq_ops, save_fl, "pushf; pop %eax"); DEF_NATIVE(pv_cpu_ops, iret, "iret"); -DEF_NATIVE(pv_cpu_ops, irq_enable_sysexit, "sti; sysexit"); DEF_NATIVE(pv_mmu_ops, read_cr2, "mov %cr2, %eax"); DEF_NATIVE(pv_mmu_ops, write_cr3, "mov %eax, %cr3"); DEF_NATIVE(pv_mmu_ops, read_cr3, "mov %cr3, %eax"); @@ -46,7 +45,6 @@ unsigned native_patch(u8 type, u16 clobb PATCH_SITE(pv_irq_ops, restore_fl); PATCH_SITE(pv_irq_ops, save_fl); PATCH_SITE(pv_cpu_ops, iret); - PATCH_SITE(pv_cpu_ops, irq_enable_sysexit); PATCH_SITE(pv_mmu_ops, read_cr2); PATCH_SITE(pv_mmu_ops, read_cr3); PATCH_SITE(pv_mmu_ops, write_cr3); --- a/arch/x86/kernel/paravirt_patch_64.c +++ b/arch/x86/kernel/paravirt_patch_64.c @@ -12,7 +12,6 @@ DEF_NATIVE(pv_mmu_ops, write_cr3, "movq DEF_NATIVE(pv_cpu_ops, clts, "clts"); DEF_NATIVE(pv_cpu_ops, wbinvd, "wbinvd");
-DEF_NATIVE(pv_cpu_ops, irq_enable_sysexit, "swapgs; sti; sysexit"); DEF_NATIVE(pv_cpu_ops, usergs_sysret64, "swapgs; sysretq"); DEF_NATIVE(pv_cpu_ops, usergs_sysret32, "swapgs; sysretl"); DEF_NATIVE(pv_cpu_ops, swapgs, "swapgs"); --- a/arch/x86/xen/enlighten.c +++ b/arch/x86/xen/enlighten.c @@ -1240,10 +1240,7 @@ static const struct pv_cpu_ops xen_cpu_o
.iret = xen_iret, #ifdef CONFIG_X86_64 - .usergs_sysret32 = xen_sysret32, .usergs_sysret64 = xen_sysret64, -#else - .irq_enable_sysexit = xen_sysexit, #endif
.load_tr_desc = paravirt_nop, --- a/arch/x86/xen/xen-asm_32.S +++ b/arch/x86/xen/xen-asm_32.S @@ -35,20 +35,6 @@ check_events: ret
/* - * We can't use sysexit directly, because we're not running in ring0. - * But we can easily fake it up using iret. Assuming xen_sysexit is - * jumped to with a standard stack frame, we can just strip it back to - * a standard iret frame and use iret. - */ -ENTRY(xen_sysexit) - movl PT_EAX(%esp), %eax /* Shouldn't be necessary? */ - orl $X86_EFLAGS_IF, PT_EFLAGS(%esp) - lea PT_EIP(%esp), %esp - - jmp xen_iret -ENDPROC(xen_sysexit) - -/* * This is run where a normal iret would be run, with the same stack setup: * 8: eflags * 4: cs --- a/arch/x86/xen/xen-ops.h +++ b/arch/x86/xen/xen-ops.h @@ -139,9 +139,6 @@ DECL_ASM(void, xen_restore_fl_direct, un
/* These are not functions, and cannot be called normally */ __visible void xen_iret(void); -#ifdef CONFIG_X86_32 -__visible void xen_sysexit(void); -#endif __visible void xen_sysret32(void); __visible void xen_sysret64(void); __visible void xen_adjust_exception_frame(void);
From: Linus Torvalds torvalds@linux-foundation.org
commit 1a263ae60b04de959d9ce9caea4889385eefcc7b upstream.
gcc-10 has started warning about conflicting types for a few new built-in functions, particularly 'free()'.
This results in warnings like:
crypto/xts.c:325:13: warning: conflicting types for built-in function ‘free’; expected ‘void(void *)’ [-Wbuiltin-declaration-mismatch]
because the crypto layer had its local freeing functions called 'free()'.
Gcc-10 is in the wrong here, since that function is marked 'static', and thus there is no chance of confusion with any standard library function namespace.
But the simplest thing to do is to just use a different name here, and avoid this gcc mis-feature.
[ Side note: gcc knowing about 'free()' is in itself not the mis-feature: the semantics of 'free()' are special enough that a compiler can validly do special things when seeing it.
So the mis-feature here is that gcc thinks that 'free()' is some restricted name, and you can't shadow it as a local static function.
Making the special 'free()' semantics be a function attribute rather than tied to the name would be the much better model ]
Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- crypto/lrw.c | 4 ++-- crypto/xts.c | 4 ++-- 2 files changed, 4 insertions(+), 4 deletions(-)
--- a/crypto/lrw.c +++ b/crypto/lrw.c @@ -377,7 +377,7 @@ out_put_alg: return inst; }
-static void free(struct crypto_instance *inst) +static void free_inst(struct crypto_instance *inst) { crypto_drop_spawn(crypto_instance_ctx(inst)); kfree(inst); @@ -386,7 +386,7 @@ static void free(struct crypto_instance static struct crypto_template crypto_tmpl = { .name = "lrw", .alloc = alloc, - .free = free, + .free = free_inst, .module = THIS_MODULE, };
--- a/crypto/xts.c +++ b/crypto/xts.c @@ -334,7 +334,7 @@ out_put_alg: return inst; }
-static void free(struct crypto_instance *inst) +static void free_inst(struct crypto_instance *inst) { crypto_drop_spawn(crypto_instance_ctx(inst)); kfree(inst); @@ -343,7 +343,7 @@ static void free(struct crypto_instance static struct crypto_template crypto_tmpl = { .name = "xts", .alloc = alloc, - .free = free, + .free = free_inst, .module = THIS_MODULE, };
From: Cong Wang xiyou.wangcong@gmail.com
[ Upstream commit dd912306ff008891c82cd9f63e8181e47a9cb2fb ]
syzbot managed to trigger a recursive NETDEV_FEAT_CHANGE event between bonding master and slave. I managed to find a reproducer for this:
ip li set bond0 up ifenslave bond0 eth0 brctl addbr br0 ethtool -K eth0 lro off brctl addif br0 bond0 ip li set br0 up
When a NETDEV_FEAT_CHANGE event is triggered on a bonding slave, it captures this and calls bond_compute_features() to fixup its master's and other slaves' features. However, when syncing with its lower devices by netdev_sync_lower_features() this event is triggered again on slaves when the LRO feature fails to change, so it goes back and forth recursively until the kernel stack is exhausted.
Commit 17b85d29e82c intentionally lets __netdev_update_features() return -1 for such a failure case, so we have to just rely on the existing check inside netdev_sync_lower_features() and skip NETDEV_FEAT_CHANGE event only for this specific failure case.
Fixes: fd867d51f889 ("net/core: generic support for disabling netdev features down stack") Reported-by: syzbot+e73ceacfd8560cc8a3ca@syzkaller.appspotmail.com Reported-by: syzbot+c2fb6f9ddcea95ba49b5@syzkaller.appspotmail.com Cc: Jarod Wilson jarod@redhat.com Cc: Nikolay Aleksandrov nikolay@cumulusnetworks.com Cc: Josh Poimboeuf jpoimboe@redhat.com Cc: Jann Horn jannh@google.com Reviewed-by: Jay Vosburgh jay.vosburgh@canonical.com Signed-off-by: Cong Wang xiyou.wangcong@gmail.com Acked-by: Nikolay Aleksandrov nikolay@cumulusnetworks.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/core/dev.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-)
--- a/net/core/dev.c +++ b/net/core/dev.c @@ -6449,11 +6449,13 @@ static void netdev_sync_lower_features(s netdev_dbg(upper, "Disabling feature %pNF on lower dev %s.\n", &feature, lower->name); lower->wanted_features &= ~feature; - netdev_update_features(lower); + __netdev_update_features(lower);
if (unlikely(lower->features & feature)) netdev_WARN(upper, "failed to disable %pNF on %s!\n", &feature, lower->name); + else + netdev_features_change(lower); } } }
From: Paolo Abeni pabeni@redhat.com
[ Upstream commit 57644431a6c2faac5d754ebd35780cf43a531b1a ]
In commit b406472b5ad7 ("net: ipv4: avoid mixed n_redirects and rate_tokens usage") I missed the fact that a 0 'rate_tokens' will bypass the backoff algorithm.
Since rate_tokens is cleared after a redirect silence, and never incremented on redirects, if the host keeps receiving packets requiring redirect it will reply ignoring the backoff.
Additionally, the 'rate_last' field will be updated with the cadence of the ingress packet requiring redirect. If that rate is high enough, that will prevent the host from generating any other kind of ICMP messages
The check for a zero 'rate_tokens' value was likely a shortcut to avoid the more complex backoff algorithm after a redirect silence period. Address the issue checking for 'n_redirects' instead, which is incremented on successful redirect, and does not interfere with other ICMP replies.
Fixes: b406472b5ad7 ("net: ipv4: avoid mixed n_redirects and rate_tokens usage") Reported-and-tested-by: Colin Walters walters@redhat.com Signed-off-by: Paolo Abeni pabeni@redhat.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/ipv4/route.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/net/ipv4/route.c +++ b/net/ipv4/route.c @@ -898,7 +898,7 @@ void ip_rt_send_redirect(struct sk_buff /* Check for load limit; set rate_last to the latest sent * redirect. */ - if (peer->rate_tokens == 0 || + if (peer->n_redirects == 0 || time_after(jiffies, (peer->rate_last + (ip_rt_redirect_load << peer->n_redirects)))) {
From: Paolo Abeni pabeni@redhat.com
[ Upstream commit eead1c2ea2509fd754c6da893a94f0e69e83ebe4 ]
The cipso and calipso code can set the MLS_CAT attribute on successful parsing, even if the corresponding catmap has not been allocated, as per current configuration and external input.
Later, selinux code tries to access the catmap if the MLS_CAT flag is present via netlbl_catmap_getlong(). That may cause null ptr dereference while processing incoming network traffic.
Address the issue setting the MLS_CAT flag only if the catmap is really allocated. Additionally let netlbl_catmap_getlong() cope with NULL catmap.
Reported-by: Matthew Sheets matthew.sheets@gd-ms.com Fixes: 4b8feff251da ("netlabel: fix the horribly broken catmap functions") Fixes: ceba1832b1b2 ("calipso: Set the calipso socket label to match the secattr.") Signed-off-by: Paolo Abeni pabeni@redhat.com Acked-by: Paul Moore paul@paul-moore.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/ipv4/cipso_ipv4.c | 6 ++++-- net/netlabel/netlabel_kapi.c | 6 ++++++ 2 files changed, 10 insertions(+), 2 deletions(-)
--- a/net/ipv4/cipso_ipv4.c +++ b/net/ipv4/cipso_ipv4.c @@ -1343,7 +1343,8 @@ static int cipso_v4_parsetag_rbm(const s return ret_val; }
- secattr->flags |= NETLBL_SECATTR_MLS_CAT; + if (secattr->attr.mls.cat) + secattr->flags |= NETLBL_SECATTR_MLS_CAT; }
return 0; @@ -1524,7 +1525,8 @@ static int cipso_v4_parsetag_rng(const s return ret_val; }
- secattr->flags |= NETLBL_SECATTR_MLS_CAT; + if (secattr->attr.mls.cat) + secattr->flags |= NETLBL_SECATTR_MLS_CAT; }
return 0; --- a/net/netlabel/netlabel_kapi.c +++ b/net/netlabel/netlabel_kapi.c @@ -605,6 +605,12 @@ int netlbl_catmap_getlong(struct netlbl_ if ((off & (BITS_PER_LONG - 1)) != 0) return -EINVAL;
+ /* a null catmap is equivalent to an empty one */ + if (!catmap) { + *offset = (u32)-1; + return 0; + } + if (off < catmap->startbit) { off = catmap->startbit; *offset = off;
From: Takashi Iwai tiwai@suse.de
commit b590b38ca305d6d7902ec7c4f7e273e0069f3bcc upstream.
Lenovo Thinkpad T530 seems to have a sensitive internal mic capture that needs to limit the mic boost like a few other Thinkpad models. Although we may change the quirk for ALC269_FIXUP_LENOVO_DOCK, this hits way too many other laptop models, so let's add a new fixup model that limits the internal mic boost on top of the existing quirk and apply to only T530.
BugLink: https://bugzilla.suse.com/show_bug.cgi?id=1171293 Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20200514160533.10337-1-tiwai@suse.de Signed-off-by: Takashi Iwai tiwai@suse.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- sound/pci/hda/patch_realtek.c | 10 +++++++++- 1 file changed, 9 insertions(+), 1 deletion(-)
--- a/sound/pci/hda/patch_realtek.c +++ b/sound/pci/hda/patch_realtek.c @@ -4840,6 +4840,7 @@ enum { ALC269_FIXUP_HP_LINE1_MIC1_LED, ALC269_FIXUP_INV_DMIC, ALC269_FIXUP_LENOVO_DOCK, + ALC269_FIXUP_LENOVO_DOCK_LIMIT_BOOST, ALC269_FIXUP_NO_SHUTUP, ALC286_FIXUP_SONY_MIC_NO_PRESENCE, ALC269_FIXUP_PINCFG_NO_HP_TO_LINEOUT, @@ -5106,6 +5107,12 @@ static const struct hda_fixup alc269_fix .chained = true, .chain_id = ALC269_FIXUP_PINCFG_NO_HP_TO_LINEOUT }, + [ALC269_FIXUP_LENOVO_DOCK_LIMIT_BOOST] = { + .type = HDA_FIXUP_FUNC, + .v.func = alc269_fixup_limit_int_mic_boost, + .chained = true, + .chain_id = ALC269_FIXUP_LENOVO_DOCK, + }, [ALC269_FIXUP_PINCFG_NO_HP_TO_LINEOUT] = { .type = HDA_FIXUP_FUNC, .v.func = alc269_fixup_pincfg_no_hp_to_lineout, @@ -5760,7 +5767,7 @@ static const struct snd_pci_quirk alc269 SND_PCI_QUIRK(0x17aa, 0x21b8, "Thinkpad Edge 14", ALC269_FIXUP_SKU_IGNORE), SND_PCI_QUIRK(0x17aa, 0x21ca, "Thinkpad L412", ALC269_FIXUP_SKU_IGNORE), SND_PCI_QUIRK(0x17aa, 0x21e9, "Thinkpad Edge 15", ALC269_FIXUP_SKU_IGNORE), - SND_PCI_QUIRK(0x17aa, 0x21f6, "Thinkpad T530", ALC269_FIXUP_LENOVO_DOCK), + SND_PCI_QUIRK(0x17aa, 0x21f6, "Thinkpad T530", ALC269_FIXUP_LENOVO_DOCK_LIMIT_BOOST), SND_PCI_QUIRK(0x17aa, 0x21fa, "Thinkpad X230", ALC269_FIXUP_LENOVO_DOCK), SND_PCI_QUIRK(0x17aa, 0x21f3, "Thinkpad T430", ALC269_FIXUP_LENOVO_DOCK), SND_PCI_QUIRK(0x17aa, 0x21fb, "Thinkpad T430s", ALC269_FIXUP_LENOVO_DOCK), @@ -5870,6 +5877,7 @@ static const struct hda_model_fixup alc2 {.id = ALC269_FIXUP_HEADSET_MODE, .name = "headset-mode"}, {.id = ALC269_FIXUP_HEADSET_MODE_NO_HP_MIC, .name = "headset-mode-no-hp-mic"}, {.id = ALC269_FIXUP_LENOVO_DOCK, .name = "lenovo-dock"}, + {.id = ALC269_FIXUP_LENOVO_DOCK_LIMIT_BOOST, .name = "lenovo-dock-limit-boost"}, {.id = ALC269_FIXUP_HP_GPIO_LED, .name = "hp-gpio-led"}, {.id = ALC269_FIXUP_HP_DOCK_GPIO_MIC1_LED, .name = "hp-dock-gpio-mic1-led"}, {.id = ALC269_FIXUP_DELL1_MIC_NO_PRESENCE, .name = "dell-headset-multi"},
From: Takashi Iwai tiwai@suse.de
commit c1f6e3c818dd734c30f6a7eeebf232ba2cf3181d upstream.
The rawmidi core allows user to resize the runtime buffer via ioctl, and this may lead to UAF when performed during concurrent reads or writes: the read/write functions unlock the runtime lock temporarily during copying form/to user-space, and that's the race window.
This patch fixes the hole by introducing a reference counter for the runtime buffer read/write access and returns -EBUSY error when the resize is performed concurrently against read/write.
Note that the ref count field is a simple integer instead of refcount_t here, since the all contexts accessing the buffer is basically protected with a spinlock, hence we need no expensive atomic ops. Also, note that this busy check is needed only against read / write functions, and not in receive/transmit callbacks; the race can happen only at the spinlock hole mentioned in the above, while the whole function is protected for receive / transmit callbacks.
Reported-by: butt3rflyh4ck butterflyhuangxx@gmail.com Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/CAFcO6XMWpUVK_yzzCpp8_XP7+=oUpQvuBeCbMffEDkpe8jWrf... Link: https://lore.kernel.org/r/s5heerw3r5z.wl-tiwai@suse.de Signed-off-by: Takashi Iwai tiwai@suse.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- include/sound/rawmidi.h | 1 + sound/core/rawmidi.c | 31 +++++++++++++++++++++++++++---- 2 files changed, 28 insertions(+), 4 deletions(-)
--- a/include/sound/rawmidi.h +++ b/include/sound/rawmidi.h @@ -76,6 +76,7 @@ struct snd_rawmidi_runtime { size_t avail_min; /* min avail for wakeup */ size_t avail; /* max used buffer for wakeup */ size_t xruns; /* over/underruns counter */ + int buffer_ref; /* buffer reference count */ /* misc */ spinlock_t lock; wait_queue_head_t sleep; --- a/sound/core/rawmidi.c +++ b/sound/core/rawmidi.c @@ -108,6 +108,17 @@ static void snd_rawmidi_input_event_work runtime->event(runtime->substream); }
+/* buffer refcount management: call with runtime->lock held */ +static inline void snd_rawmidi_buffer_ref(struct snd_rawmidi_runtime *runtime) +{ + runtime->buffer_ref++; +} + +static inline void snd_rawmidi_buffer_unref(struct snd_rawmidi_runtime *runtime) +{ + runtime->buffer_ref--; +} + static int snd_rawmidi_runtime_create(struct snd_rawmidi_substream *substream) { struct snd_rawmidi_runtime *runtime; @@ -654,6 +665,11 @@ int snd_rawmidi_output_params(struct snd if (!newbuf) return -ENOMEM; spin_lock_irq(&runtime->lock); + if (runtime->buffer_ref) { + spin_unlock_irq(&runtime->lock); + kfree(newbuf); + return -EBUSY; + } oldbuf = runtime->buffer; runtime->buffer = newbuf; runtime->buffer_size = params->buffer_size; @@ -962,8 +978,10 @@ static long snd_rawmidi_kernel_read1(str long result = 0, count1; struct snd_rawmidi_runtime *runtime = substream->runtime; unsigned long appl_ptr; + int err = 0;
spin_lock_irqsave(&runtime->lock, flags); + snd_rawmidi_buffer_ref(runtime); while (count > 0 && runtime->avail) { count1 = runtime->buffer_size - runtime->appl_ptr; if (count1 > count) @@ -982,16 +1000,19 @@ static long snd_rawmidi_kernel_read1(str if (userbuf) { spin_unlock_irqrestore(&runtime->lock, flags); if (copy_to_user(userbuf + result, - runtime->buffer + appl_ptr, count1)) { - return result > 0 ? result : -EFAULT; - } + runtime->buffer + appl_ptr, count1)) + err = -EFAULT; spin_lock_irqsave(&runtime->lock, flags); + if (err) + goto out; } result += count1; count -= count1; } + out: + snd_rawmidi_buffer_unref(runtime); spin_unlock_irqrestore(&runtime->lock, flags); - return result; + return result > 0 ? result : err; }
long snd_rawmidi_kernel_read(struct snd_rawmidi_substream *substream, @@ -1262,6 +1283,7 @@ static long snd_rawmidi_kernel_write1(st return -EAGAIN; } } + snd_rawmidi_buffer_ref(runtime); while (count > 0 && runtime->avail > 0) { count1 = runtime->buffer_size - runtime->appl_ptr; if (count1 > count) @@ -1293,6 +1315,7 @@ static long snd_rawmidi_kernel_write1(st } __end: count1 = runtime->avail < runtime->buffer_size; + snd_rawmidi_buffer_unref(runtime); spin_unlock_irqrestore(&runtime->lock, flags); if (count1) snd_rawmidi_output_trigger(substream, 1);
From: Takashi Iwai tiwai@suse.de
commit 5a7b44a8df822e0667fc76ed7130252523993bda upstream.
syzbot reported the uninitialized value exposure in certain situations using virmidi loop. It's likely a very small race at writing and reading, and the influence is almost negligible. But it's safer to paper over this just by replacing the existing kvmalloc() with kvzalloc().
Reported-by: syzbot+194dffdb8b22fc5d207a@syzkaller.appspotmail.com Signed-off-by: Takashi Iwai tiwai@suse.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- sound/core/rawmidi.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
--- a/sound/core/rawmidi.c +++ b/sound/core/rawmidi.c @@ -136,7 +136,7 @@ static int snd_rawmidi_runtime_create(st runtime->avail = 0; else runtime->avail = runtime->buffer_size; - if ((runtime->buffer = kmalloc(runtime->buffer_size, GFP_KERNEL)) == NULL) { + if ((runtime->buffer = kzalloc(runtime->buffer_size, GFP_KERNEL)) == NULL) { kfree(runtime); return -ENOMEM; } @@ -661,7 +661,7 @@ int snd_rawmidi_output_params(struct snd return -EINVAL; } if (params->buffer_size != runtime->buffer_size) { - newbuf = kmalloc(params->buffer_size, GFP_KERNEL); + newbuf = kzalloc(params->buffer_size, GFP_KERNEL); if (!newbuf) return -ENOMEM; spin_lock_irq(&runtime->lock);
From: Kyungtae Kim kt0755@gmail.com
commit 15753588bcd4bbffae1cca33c8ced5722477fe1f upstream.
FuzzUSB (a variant of syzkaller) found an illegal array access using an incorrect index while binding a gadget with UDC.
Reference: https://www.spinics.net/lists/linux-usb/msg194331.html
This bug occurs when a size variable used for a buffer is misused to access its strcpy-ed buffer. Given a buffer along with its size variable (taken from user input), from which, a new buffer is created using kstrdup(). Due to the original buffer containing 0 value in the middle, the size of the kstrdup-ed buffer becomes smaller than that of the original. So accessing the kstrdup-ed buffer with the same size variable triggers memory access violation.
The fix makes sure no zero value in the buffer, by comparing the strlen() of the orignal buffer with the size variable, so that the access to the kstrdup-ed buffer is safe.
BUG: KASAN: slab-out-of-bounds in gadget_dev_desc_UDC_store+0x1ba/0x200 drivers/usb/gadget/configfs.c:266 Read of size 1 at addr ffff88806a55dd7e by task syz-executor.0/17208
CPU: 2 PID: 17208 Comm: syz-executor.0 Not tainted 5.6.8 #1 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011 Call Trace: __dump_stack lib/dump_stack.c:77 [inline] dump_stack+0xce/0x128 lib/dump_stack.c:118 print_address_description.constprop.4+0x21/0x3c0 mm/kasan/report.c:374 __kasan_report+0x131/0x1b0 mm/kasan/report.c:506 kasan_report+0x12/0x20 mm/kasan/common.c:641 __asan_report_load1_noabort+0x14/0x20 mm/kasan/generic_report.c:132 gadget_dev_desc_UDC_store+0x1ba/0x200 drivers/usb/gadget/configfs.c:266 flush_write_buffer fs/configfs/file.c:251 [inline] configfs_write_file+0x2f1/0x4c0 fs/configfs/file.c:283 __vfs_write+0x85/0x110 fs/read_write.c:494 vfs_write+0x1cd/0x510 fs/read_write.c:558 ksys_write+0x18a/0x220 fs/read_write.c:611 __do_sys_write fs/read_write.c:623 [inline] __se_sys_write fs/read_write.c:620 [inline] __x64_sys_write+0x73/0xb0 fs/read_write.c:620 do_syscall_64+0x9e/0x510 arch/x86/entry/common.c:294 entry_SYSCALL_64_after_hwframe+0x49/0xbe
Signed-off-by: Kyungtae Kim kt0755@gmail.com Reported-and-tested-by: Kyungtae Kim kt0755@gmail.com Cc: Felipe Balbi balbi@kernel.org Cc: stable stable@vger.kernel.org Link: https://lore.kernel.org/r/20200510054326.GA19198@pizza01 Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/usb/gadget/configfs.c | 3 +++ 1 file changed, 3 insertions(+)
--- a/drivers/usb/gadget/configfs.c +++ b/drivers/usb/gadget/configfs.c @@ -260,6 +260,9 @@ static ssize_t gadget_dev_desc_UDC_store char *name; int ret;
+ if (strlen(page) < len) + return -EOVERFLOW; + name = kstrdup(page, GFP_KERNEL); if (!name) return -ENOMEM;
From: Fabio Estevam festevam@gmail.com
commit 0caf34350a25907515d929a9c77b9b206aac6d1e upstream.
The I2C2 pins are already used and the following errors are seen:
imx27-pinctrl 10015000.iomuxc: pin MX27_PAD_I2C2_SDA already requested by 10012000.i2c; cannot claim for 1001d000.i2c imx27-pinctrl 10015000.iomuxc: pin-69 (1001d000.i2c) status -22 imx27-pinctrl 10015000.iomuxc: could not request pin 69 (MX27_PAD_I2C2_SDA) from group i2c2grp on device 10015000.iomuxc imx-i2c 1001d000.i2c: Error applying setting, reverse things back imx-i2c: probe of 1001d000.i2c failed with error -22
Fix it by adding the correct I2C1 IOMUX entries for the pinctrl_i2c1 group.
Cc: stable@vger.kernel.org Fixes: 61664d0b432a ("ARM: dts: imx27 phyCARD-S pinctrl") Signed-off-by: Fabio Estevam festevam@gmail.com Reviewed-by: Stefan Riedmueller s.riedmueller@phytec.de Signed-off-by: Shawn Guo shawnguo@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- arch/arm/boot/dts/imx27-phytec-phycard-s-rdk.dts | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
--- a/arch/arm/boot/dts/imx27-phytec-phycard-s-rdk.dts +++ b/arch/arm/boot/dts/imx27-phytec-phycard-s-rdk.dts @@ -81,8 +81,8 @@ imx27-phycard-s-rdk { pinctrl_i2c1: i2c1grp { fsl,pins = < - MX27_PAD_I2C2_SDA__I2C2_SDA 0x0 - MX27_PAD_I2C2_SCL__I2C2_SCL 0x0 + MX27_PAD_I2C_DATA__I2C_DATA 0x0 + MX27_PAD_I2C_CLK__I2C_CLK 0x0 >; };
From: Borislav Petkov bp@suse.de
commit a9a3ed1eff3601b63aea4fb462d8b3b92c7c1e7e upstream.
... or the odyssey of trying to disable the stack protector for the function which generates the stack canary value.
The whole story started with Sergei reporting a boot crash with a kernel built with gcc-10:
Kernel panic — not syncing: stack-protector: Kernel stack is corrupted in: start_secondary CPU: 1 PID: 0 Comm: swapper/1 Not tainted 5.6.0-rc5—00235—gfffb08b37df9 #139 Hardware name: Gigabyte Technology Co., Ltd. To be filled by O.E.M./H77M—D3H, BIOS F12 11/14/2013 Call Trace: dump_stack panic ? start_secondary __stack_chk_fail start_secondary secondary_startup_64 -—-[ end Kernel panic — not syncing: stack—protector: Kernel stack is corrupted in: start_secondary
This happens because gcc-10 tail-call optimizes the last function call in start_secondary() - cpu_startup_entry() - and thus emits a stack canary check which fails because the canary value changes after the boot_init_stack_canary() call.
To fix that, the initial attempt was to mark the one function which generates the stack canary with:
__attribute__((optimize("-fno-stack-protector"))) ... start_secondary(void *unused)
however, using the optimize attribute doesn't work cumulatively as the attribute does not add to but rather replaces previously supplied optimization options - roughly all -fxxx options.
The key one among them being -fno-omit-frame-pointer and thus leading to not present frame pointer - frame pointer which the kernel needs.
The next attempt to prevent compilers from tail-call optimizing the last function call cpu_startup_entry(), shy of carving out start_secondary() into a separate compilation unit and building it with -fno-stack-protector, was to add an empty asm("").
This current solution was short and sweet, and reportedly, is supported by both compilers but we didn't get very far this time: future (LTO?) optimization passes could potentially eliminate this, which leads us to the third attempt: having an actual memory barrier there which the compiler cannot ignore or move around etc.
That should hold for a long time, but hey we said that about the other two solutions too so...
Reported-by: Sergei Trofimovich slyfox@gentoo.org Signed-off-by: Borislav Petkov bp@suse.de Tested-by: Kalle Valo kvalo@codeaurora.org Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/20200314164451.346497-1-slyfox@gentoo.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- arch/x86/include/asm/stackprotector.h | 7 ++++++- arch/x86/kernel/smpboot.c | 8 ++++++++ arch/x86/xen/smp.c | 1 + include/linux/compiler.h | 7 +++++++ init/main.c | 2 ++ 5 files changed, 24 insertions(+), 1 deletion(-)
--- a/arch/x86/include/asm/stackprotector.h +++ b/arch/x86/include/asm/stackprotector.h @@ -54,8 +54,13 @@ /* * Initialize the stackprotector canary value. * - * NOTE: this must only be called from functions that never return, + * NOTE: this must only be called from functions that never return * and it must always be inlined. + * + * In addition, it should be called from a compilation unit for which + * stack protector is disabled. Alternatively, the caller should not end + * with a function call which gets tail-call optimized as that would + * lead to checking a modified canary value. */ static __always_inline void boot_init_stack_canary(void) { --- a/arch/x86/kernel/smpboot.c +++ b/arch/x86/kernel/smpboot.c @@ -243,6 +243,14 @@ static void notrace start_secondary(void
wmb(); cpu_startup_entry(CPUHP_ONLINE); + + /* + * Prevent tail call to cpu_startup_entry() because the stack protector + * guard has been changed a couple of function calls up, in + * boot_init_stack_canary() and must not be checked before tail calling + * another function. + */ + prevent_tail_call_optimization(); }
void __init smp_store_boot_cpu_info(void) --- a/arch/x86/xen/smp.c +++ b/arch/x86/xen/smp.c @@ -116,6 +116,7 @@ asmlinkage __visible void cpu_bringup_an #endif cpu_bringup(); cpu_startup_entry(CPUHP_ONLINE); + prevent_tail_call_optimization(); }
static void xen_smp_intr_free(unsigned int cpu) --- a/include/linux/compiler.h +++ b/include/linux/compiler.h @@ -556,4 +556,11 @@ static __always_inline void __write_once # define __kprobes # define nokprobe_inline inline #endif + +/* + * This is needed in functions which generate the stack canary, see + * arch/x86/kernel/smpboot.c::start_secondary() for an example. + */ +#define prevent_tail_call_optimization() mb() + #endif /* __LINUX_COMPILER_H */ --- a/init/main.c +++ b/init/main.c @@ -683,6 +683,8 @@ asmlinkage __visible void __init start_k
/* Do the rest non-__init'ed, we're now alive */ rest_init(); + + prevent_tail_call_optimization(); }
/* Call all constructor functions linked into the kernel. */
From: Eric W. Biederman ebiederm@xmission.com
commit f87d1c9559164294040e58f5e3b74a162bf7c6e8 upstream.
I goofed when I added mm->user_ns support to would_dump. I missed the fact that in the case of binfmt_loader, binfmt_em86, binfmt_misc, and binfmt_script bprm->file is reassigned. Which made the move of would_dump from setup_new_exec to __do_execve_file before exec_binprm incorrect as it can result in would_dump running on the script instead of the interpreter of the script.
The net result is that the code stopped making unreadable interpreters undumpable. Which allows them to be ptraced and written to disk without special permissions. Oops.
The move was necessary because the call in set_new_exec was after bprm->mm was no longer valid.
To correct this mistake move the misplaced would_dump from __do_execve_file into flos_old_exec, before exec_mmap is called.
I tested and confirmed that without this fix I can attach with gdb to a script with an unreadable interpreter, and with this fix I can not.
Cc: stable@vger.kernel.org Fixes: f84df2a6f268 ("exec: Ensure mm->user_ns contains the execed files") Signed-off-by: "Eric W. Biederman" ebiederm@xmission.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- fs/exec.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
--- a/fs/exec.c +++ b/fs/exec.c @@ -1124,6 +1124,8 @@ int flush_old_exec(struct linux_binprm * */ set_mm_exe_file(bprm->mm, bprm->file);
+ would_dump(bprm, bprm->file); + /* * Release all of the old mmap stuff */ @@ -1632,8 +1634,6 @@ static int do_execveat_common(int fd, st if (retval < 0) goto out;
- would_dump(bprm, bprm->file); - retval = exec_binprm(bprm); if (retval < 0) goto out;
From: Christophe JAILLET christophe.jaillet@wanadoo.fr
commit ccaef7e6e354fb65758eaddd3eae8065a8b3e295 upstream.
'dev' is allocated in 'net2272_probe_init()'. It must be freed in the error handling path, as already done in the remove function (i.e. 'net2272_plat_remove()')
Fixes: 90fccb529d24 ("usb: gadget: Gadget directory cleanup - group UDC drivers") Signed-off-by: Christophe JAILLET christophe.jaillet@wanadoo.fr Signed-off-by: Felipe Balbi balbi@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/usb/gadget/udc/net2272.c | 2 ++ 1 file changed, 2 insertions(+)
--- a/drivers/usb/gadget/udc/net2272.c +++ b/drivers/usb/gadget/udc/net2272.c @@ -2670,6 +2670,8 @@ net2272_plat_probe(struct platform_devic err_req: release_mem_region(base, len); err: + kfree(dev); + return ret; }
From: Christophe JAILLET christophe.jaillet@wanadoo.fr
commit 19b94c1f9c9a16d41a8de3ccbdb8536cf1aecdbf upstream.
If 'usb_otg_descriptor_alloc()' fails, we must return an error code, not 0.
Fixes: 56023ce0fd70 ("usb: gadget: audio: allocate and init otg descriptor by otg capabilities") Reviewed-by: Peter Chen peter.chen@nxp.com Signed-off-by: Christophe JAILLET christophe.jaillet@wanadoo.fr Signed-off-by: Felipe Balbi balbi@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/usb/gadget/legacy/audio.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-)
--- a/drivers/usb/gadget/legacy/audio.c +++ b/drivers/usb/gadget/legacy/audio.c @@ -249,8 +249,10 @@ static int audio_bind(struct usb_composi struct usb_descriptor_header *usb_desc;
usb_desc = usb_otg_descriptor_alloc(cdev->gadget); - if (!usb_desc) + if (!usb_desc) { + status = -ENOMEM; goto fail; + } usb_otg_descriptor_init(cdev->gadget, usb_desc); otg_desc[0] = usb_desc; otg_desc[1] = NULL;
From: Wei Yongjun weiyongjun1@huawei.com
commit e27d4b30b71c66986196d8a1eb93cba9f602904a upstream.
If 'usb_otg_descriptor_alloc()' fails, we must return a negative error code -ENOMEM, not 0.
Fixes: 1156e91dd7cc ("usb: gadget: ncm: allocate and init otg descriptor by otg capabilities") Reported-by: Hulk Robot hulkci@huawei.com Signed-off-by: Wei Yongjun weiyongjun1@huawei.com Signed-off-by: Felipe Balbi balbi@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/usb/gadget/legacy/ncm.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-)
--- a/drivers/usb/gadget/legacy/ncm.c +++ b/drivers/usb/gadget/legacy/ncm.c @@ -162,8 +162,10 @@ static int gncm_bind(struct usb_composit struct usb_descriptor_header *usb_desc;
usb_desc = usb_otg_descriptor_alloc(gadget); - if (!usb_desc) + if (!usb_desc) { + status = -ENOMEM; goto fail; + } usb_otg_descriptor_init(gadget, usb_desc); otg_desc[0] = usb_desc; otg_desc[1] = NULL;
From: Wei Yongjun weiyongjun1@huawei.com
commit e8f7f9e3499a6d96f7f63a4818dc7d0f45a7783b upstream.
If 'usb_otg_descriptor_alloc()' fails, we must return a negative error code -ENOMEM, not 0.
Fixes: ab6796ae9833 ("usb: gadget: cdc2: allocate and init otg descriptor by otg capabilities") Reported-by: Hulk Robot hulkci@huawei.com Signed-off-by: Wei Yongjun weiyongjun1@huawei.com Signed-off-by: Felipe Balbi balbi@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/usb/gadget/legacy/cdc2.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-)
--- a/drivers/usb/gadget/legacy/cdc2.c +++ b/drivers/usb/gadget/legacy/cdc2.c @@ -183,8 +183,10 @@ static int cdc_bind(struct usb_composite struct usb_descriptor_header *usb_desc;
usb_desc = usb_otg_descriptor_alloc(gadget); - if (!usb_desc) + if (!usb_desc) { + status = -ENOMEM; goto fail1; + } usb_otg_descriptor_init(gadget, usb_desc); otg_desc[0] = usb_desc; otg_desc[1] = NULL;
From: Kai-Heng Feng kai.heng.feng@canonical.com
commit f41224efcf8aafe80ea47ac870c5e32f3209ffc8 upstream.
This reverts commit 3b36b13d5e69d6f51ff1c55d1b404a74646c9757.
Enable power save node breaks some systems with ACL225. Revert the patch and use a platform specific quirk for the original issue isntead.
Fixes: 3b36b13d5e69 ("ALSA: hda/realtek: Fix pop noise on ALC225") BugLink: https://bugs.launchpad.net/bugs/1875916 Signed-off-by: Kai-Heng Feng kai.heng.feng@canonical.com Link: https://lore.kernel.org/r/20200503152449.22761-1-kai.heng.feng@canonical.com Signed-off-by: Takashi Iwai tiwai@suse.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- sound/pci/hda/patch_realtek.c | 2 -- 1 file changed, 2 deletions(-)
--- a/sound/pci/hda/patch_realtek.c +++ b/sound/pci/hda/patch_realtek.c @@ -6341,8 +6341,6 @@ static int patch_alc269(struct hda_codec alc_update_coef_idx(codec, 0x36, 1 << 13, 1 << 5); /* Switch pcbeep path to Line in path*/ break; case 0x10ec0225: - codec->power_save_node = 1; - /* fall through */ case 0x10ec0295: case 0x10ec0299: spec->codec_variant = ALC269_TYPE_ALC225;
From: Geert Uytterhoeven geert+renesas@glider.be
commit e47cb97f153193d4b41ca8d48127da14513d54c7 upstream.
The Clock Pulse Generator (CPG) device node lacks the extal2 clock. This may lead to a failure registering the "r" clock, or to a wrong parent for the "usb24s" clock, depending on MD_CK2 pin configuration and boot loader CPG_USBCKCR register configuration.
This went unnoticed, as this does not affect the single upstream board configuration, which relies on the first clock input only.
Fixes: d9ffd583bf345e2e ("ARM: shmobile: r8a7740: add SoC clocks to DTS") Signed-off-by: Geert Uytterhoeven geert+renesas@glider.be Reviewed-by: Ulrich Hecht uli+renesas@fpond.eu Link: https://lore.kernel.org/r/20200508095918.6061-1-geert+renesas@glider.be Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- arch/arm/boot/dts/r8a7740.dtsi | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/arch/arm/boot/dts/r8a7740.dtsi +++ b/arch/arm/boot/dts/r8a7740.dtsi @@ -461,7 +461,7 @@ cpg_clocks: cpg_clocks@e6150000 { compatible = "renesas,r8a7740-cpg-clocks"; reg = <0xe6150000 0x10000>; - clocks = <&extal1_clk>, <&extalr_clk>; + clocks = <&extal1_clk>, <&extal2_clk>, <&extalr_clk>; #clock-cells = <1>; clock-output-names = "system", "pllc0", "pllc1", "pllc2", "r",
From: Jim Mattson jmattson@google.com
commit c4e0e4ab4cf3ec2b3f0b628ead108d677644ebd9 upstream.
Bank_num is a one-based count of banks, not a zero-based index. It overflows the allocated space only when strictly greater than KVM_MAX_MCE_BANKS.
Fixes: a9e38c3e01ad ("KVM: x86: Catch potential overrun in MCE setup") Signed-off-by: Jue Wang juew@google.com Signed-off-by: Jim Mattson jmattson@google.com Reviewed-by: Peter Shier pshier@google.com Message-Id: 20200511225616.19557-1-jmattson@google.com Reviewed-by: Vitaly Kuznetsov vkuznets@redhat.com Signed-off-by: Paolo Bonzini pbonzini@redhat.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- arch/x86/kvm/x86.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -2941,7 +2941,7 @@ static int kvm_vcpu_ioctl_x86_setup_mce( unsigned bank_num = mcg_cap & 0xff, bank;
r = -EINVAL; - if (!bank_num || bank_num >= KVM_MAX_MCE_BANKS) + if (!bank_num || bank_num > KVM_MAX_MCE_BANKS) goto out; if (mcg_cap & ~(KVM_MCE_CAP_SUPPORTED | 0xff | 0xff0000)) goto out;
From: Sergei Trofimovich slyfox@gentoo.org
commit b1112139a103b4b1101d0d2d72931f2d33d8c978 upstream.
gcc-10 will rename --param=allow-store-data-races=0 to -fno-allow-store-data-races.
The flag change happened at https://gcc.gnu.org/PR92046.
Signed-off-by: Sergei Trofimovich slyfox@gentoo.org Acked-by: Jiri Kosina jkosina@suse.cz Signed-off-by: Masahiro Yamada masahiroy@kernel.org Cc: Thomas Backlund tmb@mageia.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- Makefile | 1 + 1 file changed, 1 insertion(+)
--- a/Makefile +++ b/Makefile @@ -650,6 +650,7 @@ endif
# Tell gcc to never replace conditional load with a non-conditional one KBUILD_CFLAGS += $(call cc-option,--param=allow-store-data-races=0) +KBUILD_CFLAGS += $(call cc-option,-fno-allow-store-data-races)
# check for 'asm goto' ifeq ($(shell $(CONFIG_SHELL) $(srctree)/scripts/gcc-goto.sh $(CC) $(KBUILD_CFLAGS)), y)
On Mon, 18 May 2020 at 23:08, Greg Kroah-Hartman gregkh@linuxfoundation.org wrote:
This is the start of the stable review cycle for the 4.4.224 release. There are 86 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Wed, 20 May 2020 17:32:42 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.4.224-rc1... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-4.4.y and the diffstat can be found below.
thanks,
greg k-h
Results from Linaro’s test farm. No regressions on arm64, arm, x86_64, and i386.
Summary ------------------------------------------------------------------------
kernel: 4.4.224-rc1 git repo: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git git branch: linux-4.4.y git commit: 5614224b8432edc87094945490727479494da465 git describe: v4.4.223-87-g5614224b8432 Test details: https://qa-reports.linaro.org/lkft/linux-stable-rc-4.4-oe/build/v4.4.223-87-...
No regressions (compared to build v4.4.223)
No fixes (compared to build v4.4.223)
Ran 18965 total tests in the following environments and test suites.
Environments -------------- - i386 - juno-r2 - arm64 - juno-r2-compat - juno-r2-kasan - x15 - arm - x86_64 - x86-kasan
Test Suites ----------- * build * kselftest * kselftest/drivers * kselftest/filesystems * libhugetlbfs * linux-log-parser * ltp-cap_bounds-tests * ltp-commands-tests * ltp-containers-tests * ltp-cpuhotplug-tests * ltp-crypto-tests * ltp-cve-tests * ltp-dio-tests * ltp-fcntl-locktests-tests * ltp-filecaps-tests * ltp-fs-tests * ltp-fs_bind-tests * ltp-fs_perms_simple-tests * ltp-fsx-tests * ltp-hugetlb-tests * ltp-io-tests * ltp-ipc-tests * ltp-math-tests * ltp-mm-tests * ltp-nptl-tests * ltp-open-posix-tests * ltp-pty-tests * ltp-sched-tests * ltp-securebits-tests * ltp-syscalls-tests * network-basic-tests * perf * v4l2-compliance * kvm-unit-tests * install-android-platform-tools-r2600 * install-android-platform-tools-r2800 * kselftest/net * kselftest-vsyscall-mode-native * kselftest-vsyscall-mode-native/drivers * kselftest-vsyscall-mode-native/filesystems * kselftest-vsyscall-mode-none * kselftest-vsyscall-mode-none/drivers * kselftest-vsyscall-mode-none/filesystems
Summary ------------------------------------------------------------------------
kernel: 4.4.224-rc1 git repo: https://git.linaro.org/lkft/arm64-stable-rc.git git branch: 4.4.224-rc1-hikey-20200518-727 git commit: 59657db1accd5fa3cc599da31d1501113075794c git describe: 4.4.224-rc1-hikey-20200518-727 Test details: https://qa-reports.linaro.org/lkft/linaro-hikey-stable-rc-4.4-oe/build/4.4.2...
No regressions (compared to build 4.4.224-rc1-hikey-20200518-726)
No fixes (compared to build 4.4.224-rc1-hikey-20200518-726)
Ran 1813 total tests in the following environments and test suites.
Environments -------------- - hi6220-hikey - arm64
Test Suites ----------- * build * install-android-platform-tools-r2600 * kselftest * kselftest/drivers * kselftest/filesystems * libhugetlbfs * linux-log-parser * ltp-cap_bounds-tests * ltp-commands-tests * ltp-containers-tests * ltp-cpuhotplug-tests * ltp-cve-tests * ltp-dio-tests * ltp-fcntl-locktests-tests * ltp-filecaps-tests * ltp-fs-tests * ltp-fs_bind-tests * ltp-fs_perms_simple-tests * ltp-fsx-tests * ltp-hugetlb-tests * ltp-io-tests * ltp-ipc-tests * ltp-math-tests * ltp-mm-tests * ltp-nptl-tests * ltp-pty-tests * ltp-sched-tests * ltp-securebits-tests * ltp-syscalls-tests * perf * spectre-meltdown-checker-test * v4l2-compliance
On 18/05/2020 18:35, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 4.4.224 release. There are 86 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Wed, 20 May 2020 17:32:42 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.4.224-rc1... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-4.4.y and the diffstat can be found below.
thanks,
greg k-h
All tests are passing for Tegra ...
Test results for stable-v4.4: 6 builds: 6 pass, 0 fail 12 boots: 12 pass, 0 fail 19 tests: 19 pass, 0 fail
Linux version: 4.4.224-rc1-g4935dd6adfe2 Boards tested: tegra124-jetson-tk1, tegra20-ventana, tegra30-cardhu-a04
Cheers Jon
Hello Greg,
From: stable-owner@vger.kernel.org stable-owner@vger.kernel.org On Behalf Of Greg Kroah-Hartman Sent: 18 May 2020 18:36
This is the start of the stable review cycle for the 4.4.224 release. There are 86 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
No build/boot issues seen for CIP configs for Linux Linux 4.4.224-rc1 (5614224b8432).
Build/test pipeline/logs: https://gitlab.com/cip-project/cip-testing/linux-stable-rc-ci/pipelines/1472... GitLab CI pipeline: https://gitlab.com/cip-project/cip-testing/linux-cip-pipelines/-/blob/master... Relevant LAVA jobs: https://lava.ciplatform.org/scheduler/alljobs?length=25&search=561422#ta...
Kind regards, Chris
Responses should be made by Wed, 20 May 2020 17:32:42 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v4.x/stable-review/patch- 4.4.224-rc1.gz or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-4.4.y and the diffstat can be found below.
thanks,
greg k-h
Pseudo-Shortlog of commits:
Greg Kroah-Hartman gregkh@linuxfoundation.org Linux 4.4.224-rc1
Sergei Trofimovich slyfox@gentoo.org Makefile: disallow data races on gcc-10 as well
Jim Mattson jmattson@google.com KVM: x86: Fix off-by-one error in kvm_vcpu_ioctl_x86_setup_mce
Geert Uytterhoeven geert+renesas@glider.be ARM: dts: r8a7740: Add missing extal2 to CPG node
Kai-Heng Feng kai.heng.feng@canonical.com Revert "ALSA: hda/realtek: Fix pop noise on ALC225"
Wei Yongjun weiyongjun1@huawei.com usb: gadget: legacy: fix error return code in cdc_bind()
Wei Yongjun weiyongjun1@huawei.com usb: gadget: legacy: fix error return code in gncm_bind()
Christophe JAILLET christophe.jaillet@wanadoo.fr usb: gadget: audio: Fix a missing error return value in audio_bind()
Christophe JAILLET christophe.jaillet@wanadoo.fr usb: gadget: net2272: Fix a memory leak in an error handling path in 'net2272_plat_probe()'
Eric W. Biederman ebiederm@xmission.com exec: Move would_dump into flush_old_exec
Borislav Petkov bp@suse.de x86: Fix early boot crash on gcc-10, third try
Fabio Estevam festevam@gmail.com ARM: dts: imx27-phytec-phycard-s-rdk: Fix the I2C1 pinctrl entries
Kyungtae Kim kt0755@gmail.com USB: gadget: fix illegal array access in binding with UDC
Takashi Iwai tiwai@suse.de ALSA: rawmidi: Initialize allocated buffers
Takashi Iwai tiwai@suse.de ALSA: rawmidi: Fix racy buffer resize under concurrent accesses
Takashi Iwai tiwai@suse.de ALSA: hda/realtek - Limit int mic boost for Thinkpad T530
Paolo Abeni pabeni@redhat.com netlabel: cope with NULL catmap
Paolo Abeni pabeni@redhat.com net: ipv4: really enforce backoff for redirects
Cong Wang xiyou.wangcong@gmail.com net: fix a potential recursive NETDEV_FEAT_CHANGE
Linus Torvalds torvalds@linux-foundation.org gcc-10: avoid shadowing standard library 'free()' in crypto
Boris Ostrovsky boris.ostrovsky@oracle.com x86/paravirt: Remove the unused irq_enable_sysexit pv op
Keith Busch keith.busch@intel.com blk-mq: Allow blocking queue tag iter callbacks
Jianchao Wang jianchao.w.wang@oracle.com blk-mq: sync the update nr_hw_queues with blk_mq_queue_tag_busy_iter
Gabriel Krisman Bertazi krisman@linux.vnet.ibm.com blk-mq: Allow timeouts to run while queue is freezing
Christoph Hellwig hch@lst.de block: defer timeouts to a workqueue
Linus Torvalds torvalds@linux-foundation.org gcc-10: disable 'restrict' warning for now
Linus Torvalds torvalds@linux-foundation.org gcc-10: disable 'stringop-overflow' warning for now
Linus Torvalds torvalds@linux-foundation.org gcc-10: disable 'array-bounds' warning for now
Linus Torvalds torvalds@linux-foundation.org gcc-10: disable 'zero-length-bounds' warning for now
Linus Torvalds torvalds@linux-foundation.org Stop the ad-hoc games with -Wno-maybe-initialized
Masahiro Yamada yamada.masahiro@socionext.com kbuild: compute false-positive -Wmaybe-uninitialized cases in Kconfig
Linus Torvalds torvalds@linux-foundation.org gcc-10 warnings: fix low-hanging fruit
Jason Gunthorpe jgg@mellanox.com pnp: Use list_for_each_entry() instead of open coding
Jack Morgenstein jackm@dev.mellanox.co.il IB/mlx4: Test return value of calls to ib_get_cached_pkey
Arnd Bergmann arnd@arndb.de netfilter: conntrack: avoid gcc-10 zero-length-bounds warning
Gal Pressman galp@mellanox.com net/mlx5: Fix driver load error flow when firmware is stuck
Anjali Singhai Jain anjali.singhai@intel.com i40e: avoid NVM acquire deadlock during NVM update
Ben Hutchings ben.hutchings@codethink.co.uk scsi: qla2xxx: Avoid double completion of abort command
zhong jiang zhongjiang@huawei.com mm/memory_hotplug.c: fix overflow in test_pages_in_a_zone()
Jiri Benc jbenc@redhat.com gre: do not keep the GRE header around in collect medata mode
John Hurley john.hurley@netronome.com net: openvswitch: fix csum updates for MPLS actions
Vasily Averin vvs@virtuozzo.com ipc/util.c: sysvipc_find_ipc() incorrectly updates position index
Vasily Averin vvs@virtuozzo.com drm/qxl: lost qxl_bo_kunmap_atomic_page in qxl_image_init_helper()
Lubomir Rintel lkundrak@v3.sk dmaengine: mmp_tdma: Reset channel error on release
Madhuparna Bhowmik madhuparnabhowmik10@gmail.com dmaengine: pch_dma.c: Avoid data race between probe and irq handler
Ronnie Sahlberg lsahlber@redhat.com cifs: Fix a race condition with cifs_echo_request
Samuel Cabrero scabrero@suse.de cifs: Check for timeout on Negotiate stage
wuxu.wu wuxu.wu@huawei.com spi: spi-dw: Add lock protect dw_spi rx/tx to prevent concurrent calls
Wu Bo wubo40@huawei.com scsi: sg: add sg_remove_request in sg_write
Arnd Bergmann arnd@arndb.de drop_monitor: work around gcc-10 stringop-overflow warning
Christophe JAILLET christophe.jaillet@wanadoo.fr net: moxa: Fix a potential double 'free_irq()'
Christophe JAILLET christophe.jaillet@wanadoo.fr net/sonic: Fix a resource leak in an error handling path in 'jazz_sonic_probe()'
David Ahern dsa@cumulusnetworks.com net: handle no dst on skb in icmp6_send
Vladis Dronov vdronov@redhat.com ptp: free ptp device pin descriptors properly
Vladis Dronov vdronov@redhat.com ptp: fix the race between the release of ptp_clock and cdev
YueHaibing yuehaibing@huawei.com ptp: Fix pass zero to ERR_PTR() in ptp_clock_register
Logan Gunthorpe logang@deltatee.com chardev: add helper function to register char devs with a struct device
Dmitry Torokhov dmitry.torokhov@gmail.com ptp: create "pins" together with the rest of attributes
Dmitry Torokhov dmitry.torokhov@gmail.com ptp: use is_visible method to hide unused attributes
Dmitry Torokhov dmitry.torokhov@gmail.com ptp: do not explicitly set drvdata in ptp_clock_register()
Cengiz Can cengiz@kernel.wtf blktrace: fix dereference after null check
Jan Kara jack@suse.cz blktrace: Protect q->blk_trace with RCU
Jens Axboe axboe@kernel.dk blktrace: fix trace mutex deadlock
Jens Axboe axboe@kernel.dk blktrace: fix unlocked access to init/start-stop/teardown
Waiman Long longman@redhat.com blktrace: Fix potential deadlock between delete & sysfs ops
Sabrina Dubroca sd@queasysnail.net net: ipv6_stub: use ip6_dst_lookup_flow instead of ip6_dst_lookup
Sabrina Dubroca sd@queasysnail.net net: ipv6: add net argument to ip6_dst_lookup_flow
Shijie Luo luoshijie1@huawei.com ext4: add cond_resched() to ext4_protect_reserved_inode
Kees Cook keescook@chromium.org binfmt_elf: Do not move brk for INTERP-less ET_EXEC
Alexandre Belloni alexandre.belloni@free-electrons.com phy: micrel: Ensure interrupts are reenabled on resume
Alexandre Belloni alexandre.belloni@free-electrons.com phy: micrel: Disable auto negotiation on startup
Ivan Delalande colona@arista.com scripts/decodecode: fix trapping instruction formatting
George Spelvin lkml@sdf.org batman-adv: fix batadv_nc_random_weight_tq
Oliver Neukum oneukum@suse.com USB: serial: garmin_gps: add sanity checking for data length
Oliver Neukum oneukum@suse.com USB: uas: add quirk for LaCie 2Big Quadra
Alex Estrin alex.estrin@intel.com Revert "IB/ipoib: Update broadcast object if PKey value was changed in index 0"
Ville Syrjälä ville.syrjala@linux.intel.com x86/apm: Don't access __preempt_count with zeroed fs
Kees Cook keescook@chromium.org binfmt_elf: move brk out of mmap when doing direct loader exec
Sabrina Dubroca sd@queasysnail.net ipv6: fix cleanup ordering for ip6_mr failure
Govindarajulu Varadarajan gvaradar@cisco.com enic: do not overwrite error code
Hans de Goede hdegoede@redhat.com Revert "ACPI / video: Add force_native quirk for HP Pavilion dv6"
Eric Dumazet edumazet@google.com sch_choke: avoid potential panic in choke_reset()
Eric Dumazet edumazet@google.com sch_sfq: validate silly quantum values
Tariq Toukan tariqt@mellanox.com net/mlx4_core: Fix use of ENOSPC around mlx4_counter_alloc()
Julia Lawall Julia.Lawall@inria.fr dp83640: reverse arguments to list_add_tail
Greg Kroah-Hartman gregkh@linuxfoundation.org Revert "net: phy: Avoid polling PHY with PHY_IGNORE_INTERRUPTS"
Matt Jolly Kangie@footclan.ninja USB: serial: qcserial: Add DW5816e support
Diffstat:
Makefile | 17 +- arch/arm/boot/dts/imx27-phytec-phycard-s-rdk.dts | 4 +- arch/arm/boot/dts/r8a7740.dtsi | 2 +- arch/x86/entry/entry_32.S | 8 +- arch/x86/include/asm/apm.h | 6 - arch/x86/include/asm/paravirt.h | 7 - arch/x86/include/asm/paravirt_types.h | 9 -- arch/x86/include/asm/stackprotector.h | 7 +- arch/x86/kernel/apm_32.c | 5 + arch/x86/kernel/asm-offsets.c | 3 - arch/x86/kernel/paravirt.c | 7 - arch/x86/kernel/paravirt_patch_32.c | 2 - arch/x86/kernel/paravirt_patch_64.c | 1 - arch/x86/kernel/smpboot.c | 8 + arch/x86/kvm/x86.c | 2 +- arch/x86/xen/enlighten.c | 3 - arch/x86/xen/smp.c | 1 + arch/x86/xen/xen-asm_32.S | 14 -- arch/x86/xen/xen-ops.h | 3 - block/blk-core.c | 3 + block/blk-mq-tag.c | 7 +- block/blk-mq.c | 17 ++ block/blk-timeout.c | 3 + crypto/lrw.c | 4 +- crypto/xts.c | 4 +- drivers/acpi/video_detect.c | 11 -- drivers/dma/mmp_tdma.c | 2 + drivers/dma/pch_dma.c | 2 +- drivers/gpu/drm/qxl/qxl_image.c | 3 +- drivers/infiniband/core/addr.c | 6 +- drivers/infiniband/hw/mlx4/qp.c | 14 +- drivers/infiniband/ulp/ipoib/ipoib_ib.c | 13 -- drivers/net/ethernet/cisco/enic/enic_main.c | 9 +- drivers/net/ethernet/intel/i40e/i40e_nvm.c | 98 +++++++----- drivers/net/ethernet/intel/i40e/i40e_prototype.h | 2 - drivers/net/ethernet/mellanox/mlx4/main.c | 4 +- drivers/net/ethernet/mellanox/mlx5/core/main.c | 2 +- drivers/net/ethernet/moxa/moxart_ether.c | 2 +- drivers/net/ethernet/natsemi/jazzsonic.c | 6 +- drivers/net/geneve.c | 4 +- drivers/net/phy/dp83640.c | 2 +- drivers/net/phy/micrel.c | 28 +++- drivers/net/phy/phy.c | 15 +- drivers/net/vxlan.c | 10 +- drivers/ptp/ptp_clock.c | 42 ++--- drivers/ptp/ptp_private.h | 9 +- drivers/ptp/ptp_sysfs.c | 162 ++++++++----------- drivers/scsi/qla2xxx/qla_init.c | 4 +- drivers/scsi/sg.c | 4 +- drivers/spi/spi-dw.c | 15 +- drivers/spi/spi-dw.h | 1 + drivers/usb/gadget/configfs.c | 3 + drivers/usb/gadget/legacy/audio.c | 4 +- drivers/usb/gadget/legacy/cdc2.c | 4 +- drivers/usb/gadget/legacy/ncm.c | 4 +- drivers/usb/gadget/udc/net2272.c | 2 + drivers/usb/serial/garmin_gps.c | 4 +- drivers/usb/serial/qcserial.c | 1 + drivers/usb/storage/unusual_uas.h | 7 + fs/binfmt_elf.c | 12 ++ fs/char_dev.c | 86 ++++++++++ fs/cifs/cifssmb.c | 12 ++ fs/cifs/connect.c | 11 +- fs/cifs/smb2pdu.c | 12 ++ fs/exec.c | 4 +- fs/ext4/block_validity.c | 1 + include/linux/blkdev.h | 3 +- include/linux/blktrace_api.h | 6 +- include/linux/cdev.h | 5 + include/linux/compiler.h | 7 + include/linux/fs.h | 2 +- include/linux/pnp.h | 29 ++-- include/linux/posix-clock.h | 19 ++- include/linux/tty.h | 2 +- include/net/addrconf.h | 6 +- include/net/ipv6.h | 2 +- include/net/netfilter/nf_conntrack.h | 2 +- include/sound/rawmidi.h | 1 + init/main.c | 2 + ipc/util.c | 12 +- kernel/time/posix-clock.c | 31 ++-- kernel/trace/blktrace.c | 191 +++++++++++++++++------ mm/memory_hotplug.c | 4 +- net/batman-adv/network-coding.c | 9 +- net/core/dev.c | 4 +- net/core/drop_monitor.c | 11 +- net/dccp/ipv6.c | 6 +- net/ipv4/cipso_ipv4.c | 6 +- net/ipv4/ip_gre.c | 7 +- net/ipv4/route.c | 2 +- net/ipv6/addrconf_core.c | 11 +- net/ipv6/af_inet6.c | 10 +- net/ipv6/datagram.c | 2 +- net/ipv6/icmp.c | 6 +- net/ipv6/inet6_connection_sock.c | 4 +- net/ipv6/ip6_output.c | 8 +- net/ipv6/raw.c | 2 +- net/ipv6/syncookies.c | 2 +- net/ipv6/tcp_ipv6.c | 4 +- net/l2tp/l2tp_ip6.c | 2 +- net/mpls/af_mpls.c | 7 +- net/netfilter/nf_conntrack_core.c | 4 +- net/netlabel/netlabel_kapi.c | 6 + net/openvswitch/actions.c | 6 +- net/sched/sch_choke.c | 3 +- net/sched/sch_sfq.c | 9 ++ net/sctp/ipv6.c | 4 +- net/tipc/udp_media.c | 9 +- scripts/decodecode | 2 +- sound/core/rawmidi.c | 35 ++++- sound/pci/hda/patch_realtek.c | 12 +- 111 files changed, 805 insertions(+), 496 deletions(-)
linux-stable-mirror@lists.linaro.org