This is the start of the stable review cycle for the 4.9.281 release. There are 43 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Thu 26 Aug 2021 05:06:11 PM UTC. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git/p... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-4.9.y and the diffstat can be found below.
Thanks, Sasha
------------- Pseudo-Shortlog of commits:
Colin Ian King (1): iio: adc: Fix incorrect exit of for-loop
Dan Williams (1): ACPI: NFIT: Fix support for virtual SPA ranges
Dave Gerlach (1): ARM: dts: am43x-epos-evm: Reduce i2c0 bus speed for tps65218
Dinghao Liu (1): net: qlcnic: add missed unlock in qlcnic_83xx_flash_read32
Dongliang Mu (1): ipack: tpci200: fix many double free issues in tpci200_pci_probe
Greg Kroah-Hartman (1): i2c: dev: zero out array used for i2c reads from userspace
Harshvardhan Jha (1): scsi: megaraid_mm: Fix end of loop tests for list_for_each_entry()
Jaehoon Chung (1): mmc: dw_mmc: call the dw_mci_prep_stop_abort() by default
Jaroslav Kysela (1): ALSA: hda - fix the 'Capture Switch' value change notifications
Jeff Layton (2): locks: print a warning when mount fails due to lack of "mand" support fs: warn about impending deprecation of mandatory locks
Johannes Berg (1): mac80211: drop data frames without key on encrypted links
Maxim Levitsky (1): KVM: nSVM: avoid picking up unsupported bits from L2 in int_ctl (CVE-2021-3653)
Maximilian Heyne (1): xen/events: Fix race in set_evtchn_to_irq
Nathan Chancellor (1): vmlinux.lds.h: Handle clang's module.{c,d}tor sections
Neal Cardwell (1): tcp_bbr: fix u32 wrap bug in round logic if bbr_init() called after 2B packets
NeilBrown (1): btrfs: prevent rename2 from exchanging a subvol with a directory from different parents
Ole Bjørn Midtbø (1): Bluetooth: hidp: use correct wait queue when removing ctrl_wait
Pali Rohár (1): ppp: Fix generating ifname when empty IFLA_IFNAME is specified
Pavel Skripkin (1): net: 6pack: fix slab-out-of-bounds in decode_data
Peter Ujfalusi (1): dmaengine: of-dma: router_xlate to return -EPROBE_DEFER if controller is not yet available
Randy Dunlap (2): x86/tools: Fix objdump version check again dccp: add do-while-0 stubs for dccp_pr_debug macros
Sasha Levin (1): Linux 4.9.281-rc1
Sreekanth Reddy (1): scsi: core: Avoid printing an error if target_alloc() returns -ENXIO
Sudeep Holla (1): ARM: dts: nomadik: Fix up interrupt controller node names
Takashi Iwai (2): ASoC: intel: atom: Fix reference to PCM buffer address ASoC: intel: atom: Fix breakage for PCM buffer address setup
Takeshi Misawa (1): net: Fix memory leak in ieee802154_raw_deliver
Thomas Gleixner (9): PCI/MSI: Enable and mask MSI-X early PCI/MSI: Do not set invalid bits in MSI mask PCI/MSI: Correct misleading comments PCI/MSI: Use msi_mask_irq() in pci_msi_shutdown() PCI/MSI: Protect msi_desc::masked for multi-MSI PCI/MSI: Mask all unused MSI-X entries PCI/MSI: Enforce that MSI-X table entry is masked for update PCI/MSI: Enforce MSI[X] entry updates to be visible x86/fpu: Make init_fpstate correct with optimized XSAVE
Vincent Whitchurch (1): mmc: dw_mmc: Fix hang on data CRC error
Xie Yongji (1): vhost: Fix the calculation in vhost_overflow()
Yang Yingliang (1): net: bridge: fix memleak in br_add_if()
Ye Bin (1): scsi: scsi_dh_rdac: Avoid crash during rdac_bus_attach()
Yu Kuai (1): dmaengine: usb-dmac: Fix PM reference leak in usb_dmac_probe()
.../filesystems/mandatory-locking.txt | 10 ++ Makefile | 4 +- arch/arm/boot/dts/am43x-epos-evm.dts | 2 +- arch/arm/boot/dts/ste-nomadik-stn8815.dtsi | 4 +- arch/x86/include/asm/fpu/internal.h | 30 ++--- arch/x86/include/asm/svm.h | 2 + arch/x86/kernel/fpu/xstate.c | 38 +++++- arch/x86/kvm/svm.c | 6 +- arch/x86/tools/chkobjdump.awk | 1 + drivers/acpi/nfit/core.c | 3 + drivers/base/core.c | 1 + drivers/dma/of-dma.c | 9 +- drivers/dma/sh/usb-dmac.c | 2 +- drivers/i2c/i2c-dev.c | 5 +- drivers/iio/adc/palmas_gpadc.c | 4 +- drivers/ipack/carriers/tpci200.c | 36 +++--- drivers/mmc/host/dw_mmc.c | 21 ++-- .../ethernet/qlogic/qlcnic/qlcnic_83xx_hw.c | 4 +- drivers/net/hamradio/6pack.c | 6 + drivers/net/ppp/ppp_generic.c | 2 +- drivers/pci/msi.c | 119 ++++++++++++------ drivers/scsi/device_handler/scsi_dh_rdac.c | 4 +- drivers/scsi/megaraid/megaraid_mm.c | 21 +++- drivers/scsi/scsi_scan.c | 3 +- drivers/vhost/vhost.c | 10 +- drivers/xen/events/events_base.c | 20 ++- fs/btrfs/inode.c | 10 +- fs/namespace.c | 15 ++- include/asm-generic/vmlinux.lds.h | 1 + include/linux/device.h | 1 + include/linux/msi.h | 2 +- net/bluetooth/hidp/core.c | 2 +- net/bridge/br_if.c | 2 + net/dccp/dccp.h | 6 +- net/ieee802154/socket.c | 7 +- net/ipv4/tcp_bbr.c | 2 +- net/mac80211/debugfs_sta.c | 1 + net/mac80211/key.c | 1 + net/mac80211/sta_info.h | 1 + net/mac80211/tx.c | 12 +- sound/pci/hda/hda_generic.c | 10 +- sound/soc/intel/atom/sst-mfld-platform-pcm.c | 3 +- 42 files changed, 294 insertions(+), 149 deletions(-)
From: Colin Ian King colin.king@canonical.com
commit 5afc1540f13804a31bb704b763308e17688369c5 upstream.
Currently the for-loop that scans for the optimial adc_period iterates through all the possible adc_period levels because the exit logic in the loop is inverted. I believe the comparison should be swapped and the continue replaced with a break to exit the loop at the correct point.
Addresses-Coverity: ("Continue has no effect") Fixes: e08e19c331fb ("iio:adc: add iio driver for Palmas (twl6035/7) gpadc") Signed-off-by: Colin Ian King colin.king@canonical.com Link: https://lore.kernel.org/r/20210730071651.17394-1-colin.king@canonical.com Cc: stable@vger.kernel.org Signed-off-by: Jonathan Cameron Jonathan.Cameron@huawei.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/iio/adc/palmas_gpadc.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/iio/adc/palmas_gpadc.c b/drivers/iio/adc/palmas_gpadc.c index 7d61b566e148..f5218461ae25 100644 --- a/drivers/iio/adc/palmas_gpadc.c +++ b/drivers/iio/adc/palmas_gpadc.c @@ -660,8 +660,8 @@ static int palmas_adc_wakeup_configure(struct palmas_gpadc *adc)
adc_period = adc->auto_conversion_period; for (i = 0; i < 16; ++i) { - if (((1000 * (1 << i)) / 32) < adc_period) - continue; + if (((1000 * (1 << i)) / 32) >= adc_period) + break; } if (i > 0) i--;
From: Takashi Iwai tiwai@suse.de
commit 2e6b836312a477d647a7920b56810a5a25f6c856 upstream.
PCM buffers might be allocated dynamically when the buffer preallocation failed or a larger buffer is requested, and it's not guaranteed that substream->dma_buffer points to the actually used buffer. The address should be retrieved from runtime->dma_addr, instead of substream->dma_buffer (and shouldn't use virt_to_phys).
Also, remove the line overriding runtime->dma_area superfluously, which was already set up at the PCM buffer allocation.
Cc: Cezary Rojewski cezary.rojewski@intel.com Cc: Pierre-Louis Bossart pierre-louis.bossart@linux.intel.com Cc: stable@vger.kernel.org Signed-off-by: Takashi Iwai tiwai@suse.de Link: https://lore.kernel.org/r/20210728112353.6675-3-tiwai@suse.de Signed-off-by: Mark Brown broonie@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- sound/soc/intel/atom/sst-mfld-platform-pcm.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-)
diff --git a/sound/soc/intel/atom/sst-mfld-platform-pcm.c b/sound/soc/intel/atom/sst-mfld-platform-pcm.c index d812cbf41b94..6303b2d3982d 100644 --- a/sound/soc/intel/atom/sst-mfld-platform-pcm.c +++ b/sound/soc/intel/atom/sst-mfld-platform-pcm.c @@ -135,7 +135,7 @@ static void sst_fill_alloc_params(struct snd_pcm_substream *substream, snd_pcm_uframes_t period_size; ssize_t periodbytes; ssize_t buffer_bytes = snd_pcm_lib_buffer_bytes(substream); - u32 buffer_addr = virt_to_phys(substream->dma_buffer.area); + u32 buffer_addr = substream->runtime->dma_addr;
channels = substream->runtime->channels; period_size = substream->runtime->period_size; @@ -241,7 +241,6 @@ static int sst_platform_alloc_stream(struct snd_pcm_substream *substream, /* set codec params and inform SST driver the same */ sst_fill_pcm_params(substream, ¶m); sst_fill_alloc_params(substream, &alloc_params); - substream->runtime->dma_area = substream->dma_buffer.area; str_params.sparams = param; str_params.aparams = alloc_params; str_params.codec = SST_CODEC_TYPE_PCM;
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
commit 86ff25ed6cd8240d18df58930bd8848b19fce308 upstream.
If an i2c driver happens to not provide the full amount of data that a user asks for, it is possible that some uninitialized data could be sent to userspace. While all in-kernel drivers look to be safe, just be sure by initializing the buffer to zero before it is passed to the i2c driver so that any future drivers will not have this issue.
Also properly copy the amount of data recvieved to the userspace buffer, as pointed out by Dan Carpenter.
Reported-by: Eric Dumazet edumazet@google.com Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Wolfram Sang wsa@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/i2c/i2c-dev.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-)
diff --git a/drivers/i2c/i2c-dev.c b/drivers/i2c/i2c-dev.c index c4066276eb7b..b7f9fb00f695 100644 --- a/drivers/i2c/i2c-dev.c +++ b/drivers/i2c/i2c-dev.c @@ -148,7 +148,7 @@ static ssize_t i2cdev_read(struct file *file, char __user *buf, size_t count, if (count > 8192) count = 8192;
- tmp = kmalloc(count, GFP_KERNEL); + tmp = kzalloc(count, GFP_KERNEL); if (tmp == NULL) return -ENOMEM;
@@ -157,7 +157,8 @@ static ssize_t i2cdev_read(struct file *file, char __user *buf, size_t count,
ret = i2c_master_recv(client, tmp, count); if (ret >= 0) - ret = copy_to_user(buf, tmp, count) ? -EFAULT : ret; + if (copy_to_user(buf, tmp, ret)) + ret = -EFAULT; kfree(tmp); return ret; }
From: Dan Williams dan.j.williams@intel.com
commit b93dfa6bda4d4e88e5386490f2b277a26958f9d3 upstream.
Fix the NFIT parsing code to treat a 0 index in a SPA Range Structure as a special case and not match Region Mapping Structures that use 0 to indicate that they are not mapped. Without this fix some platform BIOS descriptions of "virtual disk" ranges do not result in the pmem driver attaching to the range.
Details: In addition to typical persistent memory ranges, the ACPI NFIT may also convey "virtual" ranges. These ranges are indicated by a UUID in the SPA Range Structure of UUID_VOLATILE_VIRTUAL_DISK, UUID_VOLATILE_VIRTUAL_CD, UUID_PERSISTENT_VIRTUAL_DISK, or UUID_PERSISTENT_VIRTUAL_CD. The critical difference between virtual ranges and UUID_PERSISTENT_MEMORY, is that virtual do not support associations with Region Mapping Structures. For this reason the "index" value of virtual SPA Range Structures is allowed to be 0. If a platform BIOS decides to represent NVDIMMs with disconnected "Region Mapping Structures" (range-index == 0), the kernel may falsely associate them with standalone ranges where the "SPA Range Structure Index" is also zero. When this happens the driver may falsely require labels where "virtual disks" are expected to be label-less. I.e. "label-less" is where the namespace-range == region-range and the pmem driver attaches with no user action to create a namespace.
Cc: Jacek Zloch jacek.zloch@intel.com Cc: Lukasz Sobieraj lukasz.sobieraj@intel.com Cc: "Lee, Chun-Yi" jlee@suse.com Cc: stable@vger.kernel.org Fixes: c2f32acdf848 ("acpi, nfit: treat virtual ramdisk SPA as pmem region") Reported-by: Krzysztof Rusocki krzysztof.rusocki@intel.com Reported-by: Damian Bassa damian.bassa@intel.com Reviewed-by: Jeff Moyer jmoyer@redhat.com Link: https://lore.kernel.org/r/162870796589.2521182.1240403310175570220.stgit@dwi... Signed-off-by: Dan Williams dan.j.williams@intel.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/acpi/nfit/core.c | 3 +++ 1 file changed, 3 insertions(+)
diff --git a/drivers/acpi/nfit/core.c b/drivers/acpi/nfit/core.c index b7fd8e00b346..4dddf579560f 100644 --- a/drivers/acpi/nfit/core.c +++ b/drivers/acpi/nfit/core.c @@ -2258,6 +2258,9 @@ static int acpi_nfit_register_region(struct acpi_nfit_desc *acpi_desc, struct acpi_nfit_memory_map *memdev = nfit_memdev->memdev; struct nd_mapping_desc *mapping;
+ /* range index 0 == unmapped in SPA or invalid-SPA */ + if (memdev->range_index == 0 || spa->range_index == 0) + continue; if (memdev->range_index != spa->range_index) continue; if (count >= ND_MAX_MAPPINGS) {
From: Pali Rohár pali@kernel.org
[ Upstream commit 2459dcb96bcba94c08d6861f8a050185ff301672 ]
IFLA_IFNAME is nul-term string which means that IFLA_IFNAME buffer can be larger than length of string which contains.
Function __rtnl_newlink() generates new own ifname if either IFLA_IFNAME was not specified at all or userspace passed empty nul-term string.
It is expected that if userspace does not specify ifname for new ppp netdev then kernel generates one in format "ppp<id>" where id matches to the ppp unit id which can be later obtained by PPPIOCGUNIT ioctl.
And it works in this way if IFLA_IFNAME is not specified at all. But it does not work when IFLA_IFNAME is specified with empty string.
So fix this logic also for empty IFLA_IFNAME in ppp_nl_newlink() function and correctly generates ifname based on ppp unit identifier if userspace did not provided preferred ifname.
Without this patch when IFLA_IFNAME was specified with empty string then kernel created a new ppp interface in format "ppp<id>" but id did not match ppp unit id returned by PPPIOCGUNIT ioctl. In this case id was some number generated by __rtnl_newlink() function.
Signed-off-by: Pali Rohár pali@kernel.org Fixes: bb8082f69138 ("ppp: build ifname using unit identifier for rtnl based devices") Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ppp/ppp_generic.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/net/ppp/ppp_generic.c b/drivers/net/ppp/ppp_generic.c index 5ba472691546..0a29844676f9 100644 --- a/drivers/net/ppp/ppp_generic.c +++ b/drivers/net/ppp/ppp_generic.c @@ -1136,7 +1136,7 @@ static int ppp_nl_newlink(struct net *src_net, struct net_device *dev, * the PPP unit identifer as suffix (i.e. ppp<unit_id>). This allows * userspace to infer the device name using to the PPPIOCGUNIT ioctl. */ - if (!tb[IFLA_IFNAME]) + if (!tb[IFLA_IFNAME] || !nla_len(tb[IFLA_IFNAME]) || !*(char *)nla_data(tb[IFLA_IFNAME])) conf.ifname_is_set = false;
err = ppp_dev_configure(src_net, dev, &conf);
From: Takeshi Misawa jeliantsurux@gmail.com
[ Upstream commit 1090340f7ee53e824fd4eef66a4855d548110c5b ]
If IEEE-802.15.4-RAW is closed before receive skb, skb is leaked. Fix this, by freeing sk_receive_queue in sk->sk_destruct().
syzbot report: BUG: memory leak unreferenced object 0xffff88810f644600 (size 232): comm "softirq", pid 0, jiffies 4294967032 (age 81.270s) hex dump (first 32 bytes): 10 7d 4b 12 81 88 ff ff 10 7d 4b 12 81 88 ff ff .}K......}K..... 00 00 00 00 00 00 00 00 40 7c 4b 12 81 88 ff ff ........@|K..... backtrace: [<ffffffff83651d4a>] skb_clone+0xaa/0x2b0 net/core/skbuff.c:1496 [<ffffffff83fe1b80>] ieee802154_raw_deliver net/ieee802154/socket.c:369 [inline] [<ffffffff83fe1b80>] ieee802154_rcv+0x100/0x340 net/ieee802154/socket.c:1070 [<ffffffff8367cc7a>] __netif_receive_skb_one_core+0x6a/0xa0 net/core/dev.c:5384 [<ffffffff8367cd07>] __netif_receive_skb+0x27/0xa0 net/core/dev.c:5498 [<ffffffff8367cdd9>] netif_receive_skb_internal net/core/dev.c:5603 [inline] [<ffffffff8367cdd9>] netif_receive_skb+0x59/0x260 net/core/dev.c:5662 [<ffffffff83fe6302>] ieee802154_deliver_skb net/mac802154/rx.c:29 [inline] [<ffffffff83fe6302>] ieee802154_subif_frame net/mac802154/rx.c:102 [inline] [<ffffffff83fe6302>] __ieee802154_rx_handle_packet net/mac802154/rx.c:212 [inline] [<ffffffff83fe6302>] ieee802154_rx+0x612/0x620 net/mac802154/rx.c:284 [<ffffffff83fe59a6>] ieee802154_tasklet_handler+0x86/0xa0 net/mac802154/main.c:35 [<ffffffff81232aab>] tasklet_action_common.constprop.0+0x5b/0x100 kernel/softirq.c:557 [<ffffffff846000bf>] __do_softirq+0xbf/0x2ab kernel/softirq.c:345 [<ffffffff81232f4c>] do_softirq kernel/softirq.c:248 [inline] [<ffffffff81232f4c>] do_softirq+0x5c/0x80 kernel/softirq.c:235 [<ffffffff81232fc1>] __local_bh_enable_ip+0x51/0x60 kernel/softirq.c:198 [<ffffffff8367a9a4>] local_bh_enable include/linux/bottom_half.h:32 [inline] [<ffffffff8367a9a4>] rcu_read_unlock_bh include/linux/rcupdate.h:745 [inline] [<ffffffff8367a9a4>] __dev_queue_xmit+0x7f4/0xf60 net/core/dev.c:4221 [<ffffffff83fe2db4>] raw_sendmsg+0x1f4/0x2b0 net/ieee802154/socket.c:295 [<ffffffff8363af16>] sock_sendmsg_nosec net/socket.c:654 [inline] [<ffffffff8363af16>] sock_sendmsg+0x56/0x80 net/socket.c:674 [<ffffffff8363deec>] __sys_sendto+0x15c/0x200 net/socket.c:1977 [<ffffffff8363dfb6>] __do_sys_sendto net/socket.c:1989 [inline] [<ffffffff8363dfb6>] __se_sys_sendto net/socket.c:1985 [inline] [<ffffffff8363dfb6>] __x64_sys_sendto+0x26/0x30 net/socket.c:1985
Fixes: 9ec767160357 ("net: add IEEE 802.15.4 socket family implementation") Reported-and-tested-by: syzbot+1f68113fa907bf0695a8@syzkaller.appspotmail.com Signed-off-by: Takeshi Misawa jeliantsurux@gmail.com Acked-by: Alexander Aring aahringo@redhat.com Link: https://lore.kernel.org/r/20210805075414.GA15796@DESKTOP Signed-off-by: Stefan Schmidt stefan@datenfreihafen.org Signed-off-by: Sasha Levin sashal@kernel.org --- net/ieee802154/socket.c | 7 ++++++- 1 file changed, 6 insertions(+), 1 deletion(-)
diff --git a/net/ieee802154/socket.c b/net/ieee802154/socket.c index f66e4afb978a..6383627b783e 100644 --- a/net/ieee802154/socket.c +++ b/net/ieee802154/socket.c @@ -987,6 +987,11 @@ static const struct proto_ops ieee802154_dgram_ops = { #endif };
+static void ieee802154_sock_destruct(struct sock *sk) +{ + skb_queue_purge(&sk->sk_receive_queue); +} + /* Create a socket. Initialise the socket, blank the addresses * set the state. */ @@ -1027,7 +1032,7 @@ static int ieee802154_create(struct net *net, struct socket *sock, sock->ops = ops;
sock_init_data(sock, sk); - /* FIXME: sk->sk_destruct */ + sk->sk_destruct = ieee802154_sock_destruct; sk->sk_family = PF_IEEE802154;
/* Checksums on by default */
From: Yang Yingliang yangyingliang@huawei.com
[ Upstream commit 519133debcc19f5c834e7e28480b60bdc234fe02 ]
I got a memleak report:
BUG: memory leak unreferenced object 0x607ee521a658 (size 240): comm "syz-executor.0", pid 955, jiffies 4294780569 (age 16.449s) hex dump (first 32 bytes, cpu 1): 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ backtrace: [<00000000d830ea5a>] br_multicast_add_port+0x1c2/0x300 net/bridge/br_multicast.c:1693 [<00000000274d9a71>] new_nbp net/bridge/br_if.c:435 [inline] [<00000000274d9a71>] br_add_if+0x670/0x1740 net/bridge/br_if.c:611 [<0000000012ce888e>] do_set_master net/core/rtnetlink.c:2513 [inline] [<0000000012ce888e>] do_set_master+0x1aa/0x210 net/core/rtnetlink.c:2487 [<0000000099d1cafc>] __rtnl_newlink+0x1095/0x13e0 net/core/rtnetlink.c:3457 [<00000000a01facc0>] rtnl_newlink+0x64/0xa0 net/core/rtnetlink.c:3488 [<00000000acc9186c>] rtnetlink_rcv_msg+0x369/0xa10 net/core/rtnetlink.c:5550 [<00000000d4aabb9c>] netlink_rcv_skb+0x134/0x3d0 net/netlink/af_netlink.c:2504 [<00000000bc2e12a3>] netlink_unicast_kernel net/netlink/af_netlink.c:1314 [inline] [<00000000bc2e12a3>] netlink_unicast+0x4a0/0x6a0 net/netlink/af_netlink.c:1340 [<00000000e4dc2d0e>] netlink_sendmsg+0x789/0xc70 net/netlink/af_netlink.c:1929 [<000000000d22c8b3>] sock_sendmsg_nosec net/socket.c:654 [inline] [<000000000d22c8b3>] sock_sendmsg+0x139/0x170 net/socket.c:674 [<00000000e281417a>] ____sys_sendmsg+0x658/0x7d0 net/socket.c:2350 [<00000000237aa2ab>] ___sys_sendmsg+0xf8/0x170 net/socket.c:2404 [<000000004f2dc381>] __sys_sendmsg+0xd3/0x190 net/socket.c:2433 [<0000000005feca6c>] do_syscall_64+0x37/0x90 arch/x86/entry/common.c:47 [<000000007304477d>] entry_SYSCALL_64_after_hwframe+0x44/0xae
On error path of br_add_if(), p->mcast_stats allocated in new_nbp() need be freed, or it will be leaked.
Fixes: 1080ab95e3c7 ("net: bridge: add support for IGMP/MLD stats and export them via netlink") Reported-by: Hulk Robot hulkci@huawei.com Signed-off-by: Yang Yingliang yangyingliang@huawei.com Acked-by: Nikolay Aleksandrov nikolay@nvidia.com Link: https://lore.kernel.org/r/20210809132023.978546-1-yangyingliang@huawei.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- net/bridge/br_if.c | 2 ++ 1 file changed, 2 insertions(+)
diff --git a/net/bridge/br_if.c b/net/bridge/br_if.c index 4718c528e100..794fba20afbc 100644 --- a/net/bridge/br_if.c +++ b/net/bridge/br_if.c @@ -520,6 +520,7 @@ int br_add_if(struct net_bridge *br, struct net_device *dev)
err = dev_set_allmulti(dev, 1); if (err) { + br_multicast_del_port(p); kfree(p); /* kobject not yet init'd, manually free */ goto err1; } @@ -624,6 +625,7 @@ err4: err3: sysfs_remove_link(br->ifobj, p->dev->name); err2: + br_multicast_del_port(p); kobject_put(&p->kobj); dev_set_allmulti(dev, -1); err1:
From: Neal Cardwell ncardwell@google.com
[ Upstream commit 6de035fec045f8ae5ee5f3a02373a18b939e91fb ]
Currently if BBR congestion control is initialized after more than 2B packets have been delivered, depending on the phase of the tp->delivered counter the tracking of BBR round trips can get stuck.
The bug arises because if tp->delivered is between 2^31 and 2^32 at the time the BBR congestion control module is initialized, then the initialization of bbr->next_rtt_delivered to 0 will cause the logic to believe that the end of the round trip is still billions of packets in the future. More specifically, the following check will fail repeatedly:
!before(rs->prior_delivered, bbr->next_rtt_delivered)
and thus the connection will take up to 2B packets delivered before that check will pass and the connection will set:
bbr->round_start = 1;
This could cause many mechanisms in BBR to fail to trigger, for example bbr_check_full_bw_reached() would likely never exit STARTUP.
This bug is 5 years old and has not been observed, and as a practical matter this would likely rarely trigger, since it would require transferring at least 2B packets, or likely more than 3 terabytes of data, before switching congestion control algorithms to BBR.
This patch is a stable candidate for kernels as far back as v4.9, when tcp_bbr.c was added.
Fixes: 0f8782ea1497 ("tcp_bbr: add BBR congestion control") Signed-off-by: Neal Cardwell ncardwell@google.com Reviewed-by: Yuchung Cheng ycheng@google.com Reviewed-by: Kevin Yang yyd@google.com Reviewed-by: Eric Dumazet edumazet@google.com Link: https://lore.kernel.org/r/20210811024056.235161-1-ncardwell@google.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- net/ipv4/tcp_bbr.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/net/ipv4/tcp_bbr.c b/net/ipv4/tcp_bbr.c index c22da42376fe..47f40e105044 100644 --- a/net/ipv4/tcp_bbr.c +++ b/net/ipv4/tcp_bbr.c @@ -811,7 +811,7 @@ static void bbr_init(struct sock *sk) bbr->prior_cwnd = 0; bbr->tso_segs_goal = 0; /* default segs per skb until first ACK */ bbr->rtt_cnt = 0; - bbr->next_rtt_delivered = 0; + bbr->next_rtt_delivered = tp->delivered; bbr->prev_ca_state = TCP_CA_Open; bbr->packet_conservation = 0;
From: Maximilian Heyne mheyne@amazon.de
[ Upstream commit 88ca2521bd5b4e8b83743c01a2d4cb09325b51e9 ]
There is a TOCTOU issue in set_evtchn_to_irq. Rows in the evtchn_to_irq mapping are lazily allocated in this function. The check whether the row is already present and the row initialization is not synchronized. Two threads can at the same time allocate a new row for evtchn_to_irq and add the irq mapping to the their newly allocated row. One thread will overwrite what the other has set for evtchn_to_irq[row] and therefore the irq mapping is lost. This will trigger a BUG_ON later in bind_evtchn_to_cpu:
INFO: pci 0000:1a:15.4: [1d0f:8061] type 00 class 0x010802 INFO: nvme 0000:1a:12.1: enabling device (0000 -> 0002) INFO: nvme nvme77: 1/0/0 default/read/poll queues CRIT: kernel BUG at drivers/xen/events/events_base.c:427! WARN: invalid opcode: 0000 [#1] SMP NOPTI WARN: Workqueue: nvme-reset-wq nvme_reset_work [nvme] WARN: RIP: e030:bind_evtchn_to_cpu+0xc2/0xd0 WARN: Call Trace: WARN: set_affinity_irq+0x121/0x150 WARN: irq_do_set_affinity+0x37/0xe0 WARN: irq_setup_affinity+0xf6/0x170 WARN: irq_startup+0x64/0xe0 WARN: __setup_irq+0x69e/0x740 WARN: ? request_threaded_irq+0xad/0x160 WARN: request_threaded_irq+0xf5/0x160 WARN: ? nvme_timeout+0x2f0/0x2f0 [nvme] WARN: pci_request_irq+0xa9/0xf0 WARN: ? pci_alloc_irq_vectors_affinity+0xbb/0x130 WARN: queue_request_irq+0x4c/0x70 [nvme] WARN: nvme_reset_work+0x82d/0x1550 [nvme] WARN: ? check_preempt_wakeup+0x14f/0x230 WARN: ? check_preempt_curr+0x29/0x80 WARN: ? nvme_irq_check+0x30/0x30 [nvme] WARN: process_one_work+0x18e/0x3c0 WARN: worker_thread+0x30/0x3a0 WARN: ? process_one_work+0x3c0/0x3c0 WARN: kthread+0x113/0x130 WARN: ? kthread_park+0x90/0x90 WARN: ret_from_fork+0x3a/0x50
This patch sets evtchn_to_irq rows via a cmpxchg operation so that they will be set only once. The row is now cleared before writing it to evtchn_to_irq in order to not create a race once the row is visible for other threads.
While at it, do not require the page to be zeroed, because it will be overwritten with -1's in clear_evtchn_to_irq_row anyway.
Signed-off-by: Maximilian Heyne mheyne@amazon.de Fixes: d0b075ffeede ("xen/events: Refactor evtchn_to_irq array to be dynamically allocated") Link: https://lore.kernel.org/r/20210812130930.127134-1-mheyne@amazon.de Reviewed-by: Boris Ostrovsky boris.ostrovsky@oracle.com Signed-off-by: Boris Ostrovsky boris.ostrovsky@oracle.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/xen/events/events_base.c | 20 ++++++++++++++------ 1 file changed, 14 insertions(+), 6 deletions(-)
diff --git a/drivers/xen/events/events_base.c b/drivers/xen/events/events_base.c index c6e6b7470cbf..fbb6a4701ea3 100644 --- a/drivers/xen/events/events_base.c +++ b/drivers/xen/events/events_base.c @@ -134,12 +134,12 @@ static void disable_dynirq(struct irq_data *data);
static DEFINE_PER_CPU(unsigned int, irq_epoch);
-static void clear_evtchn_to_irq_row(unsigned row) +static void clear_evtchn_to_irq_row(int *evtchn_row) { unsigned col;
for (col = 0; col < EVTCHN_PER_ROW; col++) - WRITE_ONCE(evtchn_to_irq[row][col], -1); + WRITE_ONCE(evtchn_row[col], -1); }
static void clear_evtchn_to_irq_all(void) @@ -149,7 +149,7 @@ static void clear_evtchn_to_irq_all(void) for (row = 0; row < EVTCHN_ROW(xen_evtchn_max_channels()); row++) { if (evtchn_to_irq[row] == NULL) continue; - clear_evtchn_to_irq_row(row); + clear_evtchn_to_irq_row(evtchn_to_irq[row]); } }
@@ -157,6 +157,7 @@ static int set_evtchn_to_irq(unsigned evtchn, unsigned irq) { unsigned row; unsigned col; + int *evtchn_row;
if (evtchn >= xen_evtchn_max_channels()) return -EINVAL; @@ -169,11 +170,18 @@ static int set_evtchn_to_irq(unsigned evtchn, unsigned irq) if (irq == -1) return 0;
- evtchn_to_irq[row] = (int *)get_zeroed_page(GFP_KERNEL); - if (evtchn_to_irq[row] == NULL) + evtchn_row = (int *) __get_free_pages(GFP_KERNEL, 0); + if (evtchn_row == NULL) return -ENOMEM;
- clear_evtchn_to_irq_row(row); + clear_evtchn_to_irq_row(evtchn_row); + + /* + * We've prepared an empty row for the mapping. If a different + * thread was faster inserting it, we can drop ours. + */ + if (cmpxchg(&evtchn_to_irq[row], NULL, evtchn_row) != NULL) + free_page((unsigned long) evtchn_row); }
WRITE_ONCE(evtchn_to_irq[row][col], irq);
From: Randy Dunlap rdunlap@infradead.org
[ Upstream commit 839ad22f755132838f406751439363c07272ad87 ]
Skip (omit) any version string info that is parenthesized.
Warning: objdump version 15) is older than 2.19 Warning: Skipping posttest.
where 'objdump -v' says: GNU objdump (GNU Binutils; SUSE Linux Enterprise 15) 2.35.1.20201123-7.18
Fixes: 8bee738bb1979 ("x86: Fix objdump version check in chkobjdump.awk for different formats.") Signed-off-by: Randy Dunlap rdunlap@infradead.org Signed-off-by: Thomas Gleixner tglx@linutronix.de Reviewed-by: Masami Hiramatsu mhiramat@kernel.org Link: https://lore.kernel.org/r/20210731000146.2720-1-rdunlap@infradead.org Signed-off-by: Sasha Levin sashal@kernel.org --- arch/x86/tools/chkobjdump.awk | 1 + 1 file changed, 1 insertion(+)
diff --git a/arch/x86/tools/chkobjdump.awk b/arch/x86/tools/chkobjdump.awk index fd1ab80be0de..a4cf678cf5c8 100644 --- a/arch/x86/tools/chkobjdump.awk +++ b/arch/x86/tools/chkobjdump.awk @@ -10,6 +10,7 @@ BEGIN {
/^GNU objdump/ { verstr = "" + gsub(/(.*)/, ""); for (i = 3; i <= NF; i++) if (match($(i), "^[0-9]")) { verstr = $(i);
From: Thomas Gleixner tglx@linutronix.de
commit 438553958ba19296663c6d6583d208dfb6792830 upstream.
The ordering of MSI-X enable in hardware is dysfunctional:
1) MSI-X is disabled in the control register 2) Various setup functions 3) pci_msi_setup_msi_irqs() is invoked which ends up accessing the MSI-X table entries 4) MSI-X is enabled and masked in the control register with the comment that enabling is required for some hardware to access the MSI-X table
Step #4 obviously contradicts #3. The history of this is an issue with the NIU hardware. When #4 was introduced the table access actually happened in msix_program_entries() which was invoked after enabling and masking MSI-X.
This was changed in commit d71d6432e105 ("PCI/MSI: Kill redundant call of irq_set_msi_desc() for MSI-X interrupts") which removed the table write from msix_program_entries().
Interestingly enough nobody noticed and either NIU still works or it did not get any testing with a kernel 3.19 or later.
Nevertheless this is inconsistent and there is no reason why MSI-X can't be enabled and masked in the control register early on, i.e. move step #4 above to step #1. This preserves the NIU workaround and has no side effects on other hardware.
Fixes: d71d6432e105 ("PCI/MSI: Kill redundant call of irq_set_msi_desc() for MSI-X interrupts") Signed-off-by: Thomas Gleixner tglx@linutronix.de Tested-by: Marc Zyngier maz@kernel.org Reviewed-by: Ashok Raj ashok.raj@intel.com Reviewed-by: Marc Zyngier maz@kernel.org Acked-by: Bjorn Helgaas bhelgaas@google.com Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20210729222542.344136412@linutronix.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/pci/msi.c | 28 +++++++++++++++------------- 1 file changed, 15 insertions(+), 13 deletions(-)
diff --git a/drivers/pci/msi.c b/drivers/pci/msi.c index 55ca14fbdd2a..aae994163cb9 100644 --- a/drivers/pci/msi.c +++ b/drivers/pci/msi.c @@ -766,18 +766,25 @@ static int msix_capability_init(struct pci_dev *dev, struct msix_entry *entries, u16 control; void __iomem *base;
- /* Ensure MSI-X is disabled while it is set up */ - pci_msix_clear_and_set_ctrl(dev, PCI_MSIX_FLAGS_ENABLE, 0); + /* + * Some devices require MSI-X to be enabled before the MSI-X + * registers can be accessed. Mask all the vectors to prevent + * interrupts coming in before they're fully set up. + */ + pci_msix_clear_and_set_ctrl(dev, 0, PCI_MSIX_FLAGS_MASKALL | + PCI_MSIX_FLAGS_ENABLE);
pci_read_config_word(dev, dev->msix_cap + PCI_MSIX_FLAGS, &control); /* Request & Map MSI-X table region */ base = msix_map_region(dev, msix_table_size(control)); - if (!base) - return -ENOMEM; + if (!base) { + ret = -ENOMEM; + goto out_disable; + }
ret = msix_setup_entries(dev, base, entries, nvec, affinity); if (ret) - return ret; + goto out_disable;
ret = pci_msi_setup_msi_irqs(dev, nvec, PCI_CAP_ID_MSIX); if (ret) @@ -788,14 +795,6 @@ static int msix_capability_init(struct pci_dev *dev, struct msix_entry *entries, if (ret) goto out_free;
- /* - * Some devices require MSI-X to be enabled before we can touch the - * MSI-X registers. We need to mask all the vectors to prevent - * interrupts coming in before they're fully set up. - */ - pci_msix_clear_and_set_ctrl(dev, 0, - PCI_MSIX_FLAGS_MASKALL | PCI_MSIX_FLAGS_ENABLE); - msix_program_entries(dev, entries);
ret = populate_msi_sysfs(dev); @@ -830,6 +829,9 @@ out_avail: out_free: free_msi_irqs(dev);
+out_disable: + pci_msix_clear_and_set_ctrl(dev, PCI_MSIX_FLAGS_ENABLE, 0); + return ret; }
From: Thomas Gleixner tglx@linutronix.de
commit 361fd37397f77578735907341579397d5bed0a2d upstream.
msi_mask_irq() takes a mask and a flags argument. The mask argument is used to mask out bits from the cached mask and the flags argument to set bits.
Some places invoke it with a flags argument which sets bits which are not used by the device, i.e. when the device supports up to 8 vectors a full unmask in some places sets the mask to 0xFFFFFF00. While devices probably do not care, it's still bad practice.
Fixes: 7ba1930db02f ("PCI MSI: Unmask MSI if setup failed") Signed-off-by: Thomas Gleixner tglx@linutronix.de Tested-by: Marc Zyngier maz@kernel.org Reviewed-by: Marc Zyngier maz@kernel.org Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20210729222542.568173099@linutronix.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/pci/msi.c | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/drivers/pci/msi.c b/drivers/pci/msi.c index aae994163cb9..ebbd72f53e45 100644 --- a/drivers/pci/msi.c +++ b/drivers/pci/msi.c @@ -639,21 +639,21 @@ static int msi_capability_init(struct pci_dev *dev, int nvec, bool affinity) /* Configure MSI capability structure */ ret = pci_msi_setup_msi_irqs(dev, nvec, PCI_CAP_ID_MSI); if (ret) { - msi_mask_irq(entry, mask, ~mask); + msi_mask_irq(entry, mask, 0); free_msi_irqs(dev); return ret; }
ret = msi_verify_entries(dev); if (ret) { - msi_mask_irq(entry, mask, ~mask); + msi_mask_irq(entry, mask, 0); free_msi_irqs(dev); return ret; }
ret = populate_msi_sysfs(dev); if (ret) { - msi_mask_irq(entry, mask, ~mask); + msi_mask_irq(entry, mask, 0); free_msi_irqs(dev); return ret; } @@ -920,7 +920,7 @@ void pci_msi_shutdown(struct pci_dev *dev) /* Return the device with MSI unmasked as initial states */ mask = msi_mask(desc->msi_attrib.multi_cap); /* Keep cached state to be restored */ - __pci_msi_desc_mask_irq(desc, mask, ~mask); + __pci_msi_desc_mask_irq(desc, mask, 0);
/* Restore dev->irq to its default pin-assertion irq */ dev->irq = desc->msi_attrib.default_irq;
From: Thomas Gleixner tglx@linutronix.de
commit 689e6b5351573c38ccf92a0dd8b3e2c2241e4aff upstream.
The comments about preserving the cached state in pci_msi[x]_shutdown() are misleading as the MSI descriptors are freed right after those functions return. So there is nothing to restore. Preparatory change.
Signed-off-by: Thomas Gleixner tglx@linutronix.de Tested-by: Marc Zyngier maz@kernel.org Reviewed-by: Marc Zyngier maz@kernel.org Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20210729222542.621609423@linutronix.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/pci/msi.c | 5 +---- 1 file changed, 1 insertion(+), 4 deletions(-)
diff --git a/drivers/pci/msi.c b/drivers/pci/msi.c index ebbd72f53e45..481f1a1884e6 100644 --- a/drivers/pci/msi.c +++ b/drivers/pci/msi.c @@ -919,7 +919,6 @@ void pci_msi_shutdown(struct pci_dev *dev)
/* Return the device with MSI unmasked as initial states */ mask = msi_mask(desc->msi_attrib.multi_cap); - /* Keep cached state to be restored */ __pci_msi_desc_mask_irq(desc, mask, 0);
/* Restore dev->irq to its default pin-assertion irq */ @@ -1021,10 +1020,8 @@ void pci_msix_shutdown(struct pci_dev *dev) return;
/* Return the device with MSI-X masked as initial states */ - for_each_pci_msi_entry(entry, dev) { - /* Keep cached states to be restored */ + for_each_pci_msi_entry(entry, dev) __pci_msix_desc_mask_irq(entry, 1); - }
pci_msix_clear_and_set_ctrl(dev, PCI_MSIX_FLAGS_ENABLE, 0); pci_intx_for_msi(dev, 1);
From: Thomas Gleixner tglx@linutronix.de
commit d28d4ad2a1aef27458b3383725bb179beb8d015c upstream.
No point in using the raw write function from shutdown. Preparatory change to introduce proper serialization for the msi_desc::masked cache.
Signed-off-by: Thomas Gleixner tglx@linutronix.de Tested-by: Marc Zyngier maz@kernel.org Reviewed-by: Marc Zyngier maz@kernel.org Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20210729222542.674391354@linutronix.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/pci/msi.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/pci/msi.c b/drivers/pci/msi.c index 481f1a1884e6..b3977b4c51b6 100644 --- a/drivers/pci/msi.c +++ b/drivers/pci/msi.c @@ -919,7 +919,7 @@ void pci_msi_shutdown(struct pci_dev *dev)
/* Return the device with MSI unmasked as initial states */ mask = msi_mask(desc->msi_attrib.multi_cap); - __pci_msi_desc_mask_irq(desc, mask, 0); + msi_mask_irq(desc, mask, 0);
/* Restore dev->irq to its default pin-assertion irq */ dev->irq = desc->msi_attrib.default_irq;
From: Thomas Gleixner tglx@linutronix.de
commit 77e89afc25f30abd56e76a809ee2884d7c1b63ce upstream.
Multi-MSI uses a single MSI descriptor and there is a single mask register when the device supports per vector masking. To avoid reading back the mask register the value is cached in the MSI descriptor and updates are done by clearing and setting bits in the cache and writing it to the device.
But nothing protects msi_desc::masked and the mask register from being modified concurrently on two different CPUs for two different Linux interrupts which belong to the same multi-MSI descriptor.
Add a lock to struct device and protect any operation on the mask and the mask register with it.
This makes the update of msi_desc::masked unconditional, but there is no place which requires a modification of the hardware register without updating the masked cache.
msi_mask_irq() is now an empty wrapper which will be cleaned up in follow up changes.
The problem goes way back to the initial support of multi-MSI, but picking the commit which introduced the mask cache is a valid cut off point (2.6.30).
Fixes: f2440d9acbe8 ("PCI MSI: Refactor interrupt masking code") Signed-off-by: Thomas Gleixner tglx@linutronix.de Tested-by: Marc Zyngier maz@kernel.org Reviewed-by: Marc Zyngier maz@kernel.org Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20210729222542.726833414@linutronix.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/base/core.c | 1 + drivers/pci/msi.c | 19 ++++++++++--------- include/linux/device.h | 1 + include/linux/msi.h | 2 +- 4 files changed, 13 insertions(+), 10 deletions(-)
diff --git a/drivers/base/core.c b/drivers/base/core.c index 3b8487e28c84..e82a89325f3d 100644 --- a/drivers/base/core.c +++ b/drivers/base/core.c @@ -710,6 +710,7 @@ void device_initialize(struct device *dev) device_pm_init(dev); set_dev_node(dev, -1); #ifdef CONFIG_GENERIC_MSI_IRQ + raw_spin_lock_init(&dev->msi_lock); INIT_LIST_HEAD(&dev->msi_list); #endif } diff --git a/drivers/pci/msi.c b/drivers/pci/msi.c index b3977b4c51b6..79b36f1bde0d 100644 --- a/drivers/pci/msi.c +++ b/drivers/pci/msi.c @@ -189,24 +189,25 @@ static inline __attribute_const__ u32 msi_mask(unsigned x) * reliably as devices without an INTx disable bit will then generate a * level IRQ which will never be cleared. */ -u32 __pci_msi_desc_mask_irq(struct msi_desc *desc, u32 mask, u32 flag) +void __pci_msi_desc_mask_irq(struct msi_desc *desc, u32 mask, u32 flag) { - u32 mask_bits = desc->masked; + raw_spinlock_t *lock = &desc->dev->msi_lock; + unsigned long flags;
if (pci_msi_ignore_mask || !desc->msi_attrib.maskbit) - return 0; + return;
- mask_bits &= ~mask; - mask_bits |= flag; + raw_spin_lock_irqsave(lock, flags); + desc->masked &= ~mask; + desc->masked |= flag; pci_write_config_dword(msi_desc_to_pci_dev(desc), desc->mask_pos, - mask_bits); - - return mask_bits; + desc->masked); + raw_spin_unlock_irqrestore(lock, flags); }
static void msi_mask_irq(struct msi_desc *desc, u32 mask, u32 flag) { - desc->masked = __pci_msi_desc_mask_irq(desc, mask, flag); + __pci_msi_desc_mask_irq(desc, mask, flag); }
static void __iomem *pci_msix_desc_addr(struct msi_desc *desc) diff --git a/include/linux/device.h b/include/linux/device.h index eb865b461acc..ca765188a981 100644 --- a/include/linux/device.h +++ b/include/linux/device.h @@ -812,6 +812,7 @@ struct device { struct dev_pin_info *pins; #endif #ifdef CONFIG_GENERIC_MSI_IRQ + raw_spinlock_t msi_lock; struct list_head msi_list; #endif
diff --git a/include/linux/msi.h b/include/linux/msi.h index debc8aa4ec19..601bff9fbbec 100644 --- a/include/linux/msi.h +++ b/include/linux/msi.h @@ -133,7 +133,7 @@ void __pci_write_msi_msg(struct msi_desc *entry, struct msi_msg *msg); void pci_write_msi_msg(unsigned int irq, struct msi_msg *msg);
u32 __pci_msix_desc_mask_irq(struct msi_desc *desc, u32 flag); -u32 __pci_msi_desc_mask_irq(struct msi_desc *desc, u32 mask, u32 flag); +void __pci_msi_desc_mask_irq(struct msi_desc *desc, u32 mask, u32 flag); void pci_msi_mask_irq(struct irq_data *data); void pci_msi_unmask_irq(struct irq_data *data);
From: Thomas Gleixner tglx@linutronix.de
commit 7d5ec3d3612396dc6d4b76366d20ab9fc06f399f upstream.
When MSI-X is enabled the ordering of calls is:
msix_map_region(); msix_setup_entries(); pci_msi_setup_msi_irqs(); msix_program_entries();
This has a few interesting issues:
1) msix_setup_entries() allocates the MSI descriptors and initializes them except for the msi_desc:masked member which is left zero initialized.
2) pci_msi_setup_msi_irqs() allocates the interrupt descriptors and sets up the MSI interrupts which ends up in pci_write_msi_msg() unless the interrupt chip provides its own irq_write_msi_msg() function.
3) msix_program_entries() does not do what the name suggests. It solely updates the entries array (if not NULL) and initializes the masked member for each MSI descriptor by reading the hardware state and then masks the entry.
Obviously this has some issues:
1) The uninitialized masked member of msi_desc prevents the enforcement of masking the entry in pci_write_msi_msg() depending on the cached masked bit. Aside of that half initialized data is a NONO in general
2) msix_program_entries() only ensures that the actually allocated entries are masked. This is wrong as experimentation with crash testing and crash kernel kexec has shown.
This limited testing unearthed that when the production kernel had more entries in use and unmasked when it crashed and the crash kernel allocated a smaller amount of entries, then a full scan of all entries found unmasked entries which were in use in the production kernel.
This is obviously a device or emulation issue as the device reset should mask all MSI-X table entries, but obviously that's just part of the paper specification.
Cure this by:
1) Masking all table entries in hardware 2) Initializing msi_desc::masked in msix_setup_entries() 3) Removing the mask dance in msix_program_entries() 4) Renaming msix_program_entries() to msix_update_entries() to reflect the purpose of that function.
As the masking of unused entries has never been done the Fixes tag refers to a commit in: git://git.kernel.org/pub/scm/linux/kernel/git/tglx/history.git
Fixes: f036d4ea5fa7 ("[PATCH] ia32 Message Signalled Interrupt support") Signed-off-by: Thomas Gleixner tglx@linutronix.de Tested-by: Marc Zyngier maz@kernel.org Reviewed-by: Marc Zyngier maz@kernel.org Acked-by: Bjorn Helgaas bhelgaas@google.com Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20210729222542.403833459@linutronix.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/pci/msi.c | 39 ++++++++++++++++++++++++++++----------- 1 file changed, 28 insertions(+), 11 deletions(-)
diff --git a/drivers/pci/msi.c b/drivers/pci/msi.c index 79b36f1bde0d..a4873c7fea72 100644 --- a/drivers/pci/msi.c +++ b/drivers/pci/msi.c @@ -695,6 +695,7 @@ static int msix_setup_entries(struct pci_dev *dev, void __iomem *base, { struct cpumask *curmsk, *masks = NULL; struct msi_desc *entry; + void __iomem *addr; int ret, i;
if (affinity) { @@ -717,6 +718,7 @@ static int msix_setup_entries(struct pci_dev *dev, void __iomem *base,
entry->msi_attrib.is_msix = 1; entry->msi_attrib.is_64 = 1; + if (entries) entry->msi_attrib.entry_nr = entries[i].entry; else @@ -724,6 +726,10 @@ static int msix_setup_entries(struct pci_dev *dev, void __iomem *base, entry->msi_attrib.default_irq = dev->irq; entry->mask_base = base;
+ addr = pci_msix_desc_addr(entry); + if (addr) + entry->masked = readl(addr + PCI_MSIX_ENTRY_VECTOR_CTRL); + list_add_tail(&entry->list, dev_to_msi_list(&dev->dev)); if (masks) curmsk++; @@ -734,21 +740,27 @@ out: return ret; }
-static void msix_program_entries(struct pci_dev *dev, - struct msix_entry *entries) +static void msix_update_entries(struct pci_dev *dev, struct msix_entry *entries) { struct msi_desc *entry; - int i = 0;
for_each_pci_msi_entry(entry, dev) { - if (entries) - entries[i++].vector = entry->irq; - entry->masked = readl(pci_msix_desc_addr(entry) + - PCI_MSIX_ENTRY_VECTOR_CTRL); - msix_mask_irq(entry, 1); + if (entries) { + entries->vector = entry->irq; + entries++; + } } }
+static void msix_mask_all(void __iomem *base, int tsize) +{ + u32 ctrl = PCI_MSIX_ENTRY_CTRL_MASKBIT; + int i; + + for (i = 0; i < tsize; i++, base += PCI_MSIX_ENTRY_SIZE) + writel(ctrl, base + PCI_MSIX_ENTRY_VECTOR_CTRL); +} + /** * msix_capability_init - configure device's MSI-X capability * @dev: pointer to the pci_dev data structure of MSI-X device function @@ -763,9 +775,9 @@ static void msix_program_entries(struct pci_dev *dev, static int msix_capability_init(struct pci_dev *dev, struct msix_entry *entries, int nvec, bool affinity) { - int ret; - u16 control; void __iomem *base; + int ret, tsize; + u16 control;
/* * Some devices require MSI-X to be enabled before the MSI-X @@ -777,12 +789,17 @@ static int msix_capability_init(struct pci_dev *dev, struct msix_entry *entries,
pci_read_config_word(dev, dev->msix_cap + PCI_MSIX_FLAGS, &control); /* Request & Map MSI-X table region */ + tsize = msix_table_size(control); + base = msix_map_region(dev, tsize); base = msix_map_region(dev, msix_table_size(control)); if (!base) { ret = -ENOMEM; goto out_disable; }
+ /* Ensure that all table entries are masked. */ + msix_mask_all(base, tsize); + ret = msix_setup_entries(dev, base, entries, nvec, affinity); if (ret) goto out_disable; @@ -796,7 +813,7 @@ static int msix_capability_init(struct pci_dev *dev, struct msix_entry *entries, if (ret) goto out_free;
- msix_program_entries(dev, entries); + msix_update_entries(dev, entries);
ret = populate_msi_sysfs(dev); if (ret)
From: Thomas Gleixner tglx@linutronix.de
commit da181dc974ad667579baece33c2c8d2d1e4558d5 upstream.
The specification (PCIe r5.0, sec 6.1.4.5) states:
For MSI-X, a function is permitted to cache Address and Data values from unmasked MSI-X Table entries. However, anytime software unmasks a currently masked MSI-X Table entry either by clearing its Mask bit or by clearing the Function Mask bit, the function must update any Address or Data values that it cached from that entry. If software changes the Address or Data value of an entry while the entry is unmasked, the result is undefined.
The Linux kernel's MSI-X support never enforced that the entry is masked before the entry is modified hence the Fixes tag refers to a commit in: git://git.kernel.org/pub/scm/linux/kernel/git/tglx/history.git
Enforce the entry to be masked across the update.
There is no point in enforcing this to be handled at all possible call sites as this is just pointless code duplication and the common update function is the obvious place to enforce this.
Fixes: f036d4ea5fa7 ("[PATCH] ia32 Message Signalled Interrupt support") Reported-by: Kevin Tian kevin.tian@intel.com Signed-off-by: Thomas Gleixner tglx@linutronix.de Tested-by: Marc Zyngier maz@kernel.org Reviewed-by: Marc Zyngier maz@kernel.org Acked-by: Bjorn Helgaas bhelgaas@google.com Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20210729222542.462096385@linutronix.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/pci/msi.c | 15 +++++++++++++++ 1 file changed, 15 insertions(+)
diff --git a/drivers/pci/msi.c b/drivers/pci/msi.c index a4873c7fea72..3be9c0ceb4e9 100644 --- a/drivers/pci/msi.c +++ b/drivers/pci/msi.c @@ -322,10 +322,25 @@ void __pci_write_msi_msg(struct msi_desc *entry, struct msi_msg *msg) /* Don't touch the hardware now */ } else if (entry->msi_attrib.is_msix) { void __iomem *base = pci_msix_desc_addr(entry); + bool unmasked = !(entry->masked & PCI_MSIX_ENTRY_CTRL_MASKBIT); + + /* + * The specification mandates that the entry is masked + * when the message is modified: + * + * "If software changes the Address or Data value of an + * entry while the entry is unmasked, the result is + * undefined." + */ + if (unmasked) + __pci_msix_desc_mask_irq(entry, PCI_MSIX_ENTRY_CTRL_MASKBIT);
writel(msg->address_lo, base + PCI_MSIX_ENTRY_LOWER_ADDR); writel(msg->address_hi, base + PCI_MSIX_ENTRY_UPPER_ADDR); writel(msg->data, base + PCI_MSIX_ENTRY_DATA); + + if (unmasked) + __pci_msix_desc_mask_irq(entry, 0); } else { int pos = dev->msi_cap; u16 msgctl;
From: Thomas Gleixner tglx@linutronix.de
commit b9255a7cb51754e8d2645b65dd31805e282b4f3e upstream.
Nothing enforces the posted writes to be visible when the function returns. Flush them even if the flush might be redundant when the entry is masked already as the unmask will flush as well. This is either setup or a rare affinity change event so the extra flush is not the end of the world.
While this is more a theoretical issue especially the logic in the X86 specific msi_set_affinity() function relies on the assumption that the update has reached the hardware when the function returns.
Again, as this never has been enforced the Fixes tag refers to a commit in: git://git.kernel.org/pub/scm/linux/kernel/git/tglx/history.git
Fixes: f036d4ea5fa7 ("[PATCH] ia32 Message Signalled Interrupt support") Signed-off-by: Thomas Gleixner tglx@linutronix.de Tested-by: Marc Zyngier maz@kernel.org Reviewed-by: Marc Zyngier maz@kernel.org Acked-by: Bjorn Helgaas bhelgaas@google.com Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20210729222542.515188147@linutronix.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/pci/msi.c | 5 +++++ 1 file changed, 5 insertions(+)
diff --git a/drivers/pci/msi.c b/drivers/pci/msi.c index 3be9c0ceb4e9..77810f424049 100644 --- a/drivers/pci/msi.c +++ b/drivers/pci/msi.c @@ -341,6 +341,9 @@ void __pci_write_msi_msg(struct msi_desc *entry, struct msi_msg *msg)
if (unmasked) __pci_msix_desc_mask_irq(entry, 0); + + /* Ensure that the writes are visible in the device */ + readl(base + PCI_MSIX_ENTRY_DATA); } else { int pos = dev->msi_cap; u16 msgctl; @@ -361,6 +364,8 @@ void __pci_write_msi_msg(struct msi_desc *entry, struct msi_msg *msg) pci_write_config_word(dev, pos + PCI_MSI_DATA_32, msg->data); } + /* Ensure that the writes are visible in the device */ + pci_read_config_word(dev, pos + PCI_MSI_FLAGS, &msgctl); } entry->msg = *msg; }
From: Nathan Chancellor nathan@kernel.org
commit 848378812e40152abe9b9baf58ce2004f76fb988 upstream.
A recent change in LLVM causes module_{c,d}tor sections to appear when CONFIG_K{A,C}SAN are enabled, which results in orphan section warnings because these are not handled anywhere:
ld.lld: warning: arch/x86/pci/built-in.a(legacy.o):(.text.asan.module_ctor) is being placed in '.text.asan.module_ctor' ld.lld: warning: arch/x86/pci/built-in.a(legacy.o):(.text.asan.module_dtor) is being placed in '.text.asan.module_dtor' ld.lld: warning: arch/x86/pci/built-in.a(legacy.o):(.text.tsan.module_ctor) is being placed in '.text.tsan.module_ctor'
Fangrui explains: "the function asan.module_ctor has the SHF_GNU_RETAIN flag, so it is in a separate section even with -fno-function-sections (default)".
Place them in the TEXT_TEXT section so that these technologies continue to work with the newer compiler versions. All of the KASAN and KCSAN KUnit tests continue to pass after this change.
Cc: stable@vger.kernel.org Link: https://github.com/ClangBuiltLinux/linux/issues/1432 Link: https://github.com/llvm/llvm-project/commit/7b789562244ee941b7bf2cefeb3fc08a... Signed-off-by: Nathan Chancellor nathan@kernel.org Reviewed-by: Nick Desaulniers ndesaulniers@google.com Reviewed-by: Fangrui Song maskray@google.com Acked-by: Marco Elver elver@google.com Signed-off-by: Kees Cook keescook@chromium.org Link: https://lore.kernel.org/r/20210731023107.1932981-1-nathan@kernel.org [nc: Fix conflicts due to lack of cf68fffb66d60 and 266ff2a8f51f0] Signed-off-by: Nathan Chancellor nathan@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- include/asm-generic/vmlinux.lds.h | 1 + 1 file changed, 1 insertion(+)
diff --git a/include/asm-generic/vmlinux.lds.h b/include/asm-generic/vmlinux.lds.h index 36198563fb8b..8cff6d157e56 100644 --- a/include/asm-generic/vmlinux.lds.h +++ b/include/asm-generic/vmlinux.lds.h @@ -465,6 +465,7 @@ *(.text.unlikely .text.unlikely.*) \ *(.text.unknown .text.unknown.*) \ *(.ref.text) \ + *(.text.asan.* .text.tsan.*) \ MEM_KEEP(init.text) \ MEM_KEEP(exit.text) \
From: Johannes Berg johannes.berg@intel.com
commit a0761a301746ec2d92d7fcb82af69c0a6a4339aa upstream.
If we know that we have an encrypted link (based on having had a key configured for TX in the past) then drop all data frames in the key selection handler if there's no key anymore.
This fixes an issue with mac80211 internal TXQs - there we can buffer frames for an encrypted link, but then if the key is no longer there when they're dequeued, the frames are sent without encryption. This happens if a station is disconnected while the frames are still on the TXQ.
Detecting that a link should be encrypted based on a first key having been configured for TX is fine as there are no use cases for a connection going from with encryption to no encryption. With extended key IDs, however, there is a case of having a key configured for only decryption, so we can't just trigger this behaviour on a key being configured.
Cc: stable@vger.kernel.org Reported-by: Jouni Malinen j@w1.fi Signed-off-by: Johannes Berg johannes.berg@intel.com Signed-off-by: Luca Coelho luciano.coelho@intel.com Link: https://lore.kernel.org/r/iwlwifi.20200326150855.6865c7f28a14.I9fb1d911b0642... Signed-off-by: Johannes Berg johannes.berg@intel.com [pali: Backported to 4.19 and older versions] Signed-off-by: Pali Rohár pali@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/mac80211/debugfs_sta.c | 1 + net/mac80211/key.c | 1 + net/mac80211/sta_info.h | 1 + net/mac80211/tx.c | 12 +++++++++--- 4 files changed, 12 insertions(+), 3 deletions(-)
diff --git a/net/mac80211/debugfs_sta.c b/net/mac80211/debugfs_sta.c index 14ec63a02669..91b94ac9a88a 100644 --- a/net/mac80211/debugfs_sta.c +++ b/net/mac80211/debugfs_sta.c @@ -80,6 +80,7 @@ static const char * const sta_flag_names[] = { FLAG(MPSP_OWNER), FLAG(MPSP_RECIPIENT), FLAG(PS_DELIVER), + FLAG(USES_ENCRYPTION), #undef FLAG };
diff --git a/net/mac80211/key.c b/net/mac80211/key.c index 4e23f240f599..a0d9507cb6a7 100644 --- a/net/mac80211/key.c +++ b/net/mac80211/key.c @@ -334,6 +334,7 @@ static void ieee80211_key_replace(struct ieee80211_sub_if_data *sdata, if (sta) { if (pairwise) { rcu_assign_pointer(sta->ptk[idx], new); + set_sta_flag(sta, WLAN_STA_USES_ENCRYPTION); sta->ptk_idx = idx; ieee80211_check_fast_xmit(sta); } else { diff --git a/net/mac80211/sta_info.h b/net/mac80211/sta_info.h index fd31c4db1282..0909332965bc 100644 --- a/net/mac80211/sta_info.h +++ b/net/mac80211/sta_info.h @@ -100,6 +100,7 @@ enum ieee80211_sta_info_flags { WLAN_STA_MPSP_OWNER, WLAN_STA_MPSP_RECIPIENT, WLAN_STA_PS_DELIVER, + WLAN_STA_USES_ENCRYPTION,
NUM_WLAN_STA_FLAGS, }; diff --git a/net/mac80211/tx.c b/net/mac80211/tx.c index eebbddccb47b..48d0dd0beaa5 100644 --- a/net/mac80211/tx.c +++ b/net/mac80211/tx.c @@ -588,10 +588,13 @@ ieee80211_tx_h_select_key(struct ieee80211_tx_data *tx) struct ieee80211_tx_info *info = IEEE80211_SKB_CB(tx->skb); struct ieee80211_hdr *hdr = (struct ieee80211_hdr *)tx->skb->data;
- if (unlikely(info->flags & IEEE80211_TX_INTFL_DONT_ENCRYPT)) + if (unlikely(info->flags & IEEE80211_TX_INTFL_DONT_ENCRYPT)) { tx->key = NULL; - else if (tx->sta && - (key = rcu_dereference(tx->sta->ptk[tx->sta->ptk_idx]))) + return TX_CONTINUE; + } + + if (tx->sta && + (key = rcu_dereference(tx->sta->ptk[tx->sta->ptk_idx]))) tx->key = key; else if (ieee80211_is_group_privacy_action(tx->skb) && (key = rcu_dereference(tx->sdata->default_multicast_key))) @@ -652,6 +655,9 @@ ieee80211_tx_h_select_key(struct ieee80211_tx_data *tx) if (!skip_hw && tx->key && tx->key->flags & KEY_FLAG_UPLOADED_TO_HARDWARE) info->control.hw_key = &tx->key->conf; + } else if (!ieee80211_is_mgmt(hdr->frame_control) && tx->sta && + test_sta_flag(tx->sta, WLAN_STA_USES_ENCRYPTION)) { + return TX_DROP; }
return TX_CONTINUE;
From: Maxim Levitsky mlevitsk@redhat.com
[ upstream commit 0f923e07124df069ba68d8bb12324398f4b6b709 ]
* Invert the mask of bits that we pick from L2 in nested_vmcb02_prepare_control
* Invert and explicitly use VIRQ related bits bitmask in svm_clear_vintr
This fixes a security issue that allowed a malicious L1 to run L2 with AVIC enabled, which allowed the L2 to exploit the uninitialized and enabled AVIC to read/write the host physical memory at some offsets.
Fixes: 3d6368ef580a ("KVM: SVM: Add VMRUN handler") Signed-off-by: Maxim Levitsky mlevitsk@redhat.com Signed-off-by: Paolo Bonzini pbonzini@redhat.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/x86/include/asm/svm.h | 2 ++ arch/x86/kvm/svm.c | 6 +++++- 2 files changed, 7 insertions(+), 1 deletion(-)
diff --git a/arch/x86/include/asm/svm.h b/arch/x86/include/asm/svm.h index 14824fc78f7e..509b9f3307e4 100644 --- a/arch/x86/include/asm/svm.h +++ b/arch/x86/include/asm/svm.h @@ -113,6 +113,8 @@ struct __attribute__ ((__packed__)) vmcb_control_area { #define V_IGN_TPR_SHIFT 20 #define V_IGN_TPR_MASK (1 << V_IGN_TPR_SHIFT)
+#define V_IRQ_INJECTION_BITS_MASK (V_IRQ_MASK | V_INTR_PRIO_MASK | V_IGN_TPR_MASK) + #define V_INTR_MASKING_SHIFT 24 #define V_INTR_MASKING_MASK (1 << V_INTR_MASKING_SHIFT)
diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c index cbc7f177bbd8..03fdeab057d2 100644 --- a/arch/x86/kvm/svm.c +++ b/arch/x86/kvm/svm.c @@ -3048,7 +3048,11 @@ static bool nested_svm_vmrun(struct vcpu_svm *svm) svm->nested.intercept = nested_vmcb->control.intercept;
svm_flush_tlb(&svm->vcpu); - svm->vmcb->control.int_ctl = nested_vmcb->control.int_ctl | V_INTR_MASKING_MASK; + svm->vmcb->control.int_ctl = nested_vmcb->control.int_ctl & + (V_TPR_MASK | V_IRQ_INJECTION_BITS_MASK); + + svm->vmcb->control.int_ctl |= V_INTR_MASKING_MASK; + if (nested_vmcb->control.int_ctl & V_INTR_MASKING_MASK) svm->vcpu.arch.hflags |= HF_VINTR_MASK; else
From: Thomas Gleixner tglx@linutronix.de
commit f9dfb5e390fab2df9f7944bb91e7705aba14cd26 upstream.
The XSAVE init code initializes all enabled and supported components with XRSTOR(S) to init state. Then it XSAVEs the state of the components back into init_fpstate which is used in several places to fill in the init state of components.
This works correctly with XSAVE, but not with XSAVEOPT and XSAVES because those use the init optimization and skip writing state of components which are in init state. So init_fpstate.xsave still contains all zeroes after this operation.
There are two ways to solve that:
1) Use XSAVE unconditionally, but that requires to reshuffle the buffer when XSAVES is enabled because XSAVES uses compacted format.
2) Save the components which are known to have a non-zero init state by other means.
Looking deeper, #2 is the right thing to do because all components the kernel supports have all-zeroes init state except the legacy features (FP, SSE). Those cannot be hard coded because the states are not identical on all CPUs, but they can be saved with FXSAVE which avoids all conditionals.
Use FXSAVE to save the legacy FP/SSE components in init_fpstate along with a BUILD_BUG_ON() which reminds developers to validate that a newly added component has all zeroes init state. As a bonus remove the now unused copy_xregs_to_kernel_booting() crutch.
The XSAVE and reshuffle method can still be implemented in the unlikely case that components are added which have a non-zero init state and no other means to save them. For now, FXSAVE is just simple and good enough.
[ bp: Fix a typo or two in the text. ]
Fixes: 6bad06b76892 ("x86, xsave: Use xsaveopt in context-switch path when supported") Signed-off-by: Thomas Gleixner tglx@linutronix.de Signed-off-by: Borislav Petkov bp@suse.de Reviewed-by: Borislav Petkov bp@suse.de Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/20210618143444.587311343@linutronix.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/x86/include/asm/fpu/internal.h | 30 ++++++----------------- arch/x86/kernel/fpu/xstate.c | 38 ++++++++++++++++++++++++++--- 2 files changed, 43 insertions(+), 25 deletions(-)
diff --git a/arch/x86/include/asm/fpu/internal.h b/arch/x86/include/asm/fpu/internal.h index ebda4718eb8f..793c04cba0de 100644 --- a/arch/x86/include/asm/fpu/internal.h +++ b/arch/x86/include/asm/fpu/internal.h @@ -221,6 +221,14 @@ static inline void copy_fxregs_to_kernel(struct fpu *fpu) } }
+static inline void fxsave(struct fxregs_state *fx) +{ + if (IS_ENABLED(CONFIG_X86_32)) + asm volatile( "fxsave %[fx]" : [fx] "=m" (*fx)); + else + asm volatile("fxsaveq %[fx]" : [fx] "=m" (*fx)); +} + /* These macros all use (%edi)/(%rdi) as the single memory argument. */ #define XSAVE ".byte " REX_PREFIX "0x0f,0xae,0x27" #define XSAVEOPT ".byte " REX_PREFIX "0x0f,0xae,0x37" @@ -294,28 +302,6 @@ static inline void copy_fxregs_to_kernel(struct fpu *fpu) : "D" (st), "m" (*st), "a" (lmask), "d" (hmask) \ : "memory")
-/* - * This function is called only during boot time when x86 caps are not set - * up and alternative can not be used yet. - */ -static inline void copy_xregs_to_kernel_booting(struct xregs_state *xstate) -{ - u64 mask = -1; - u32 lmask = mask; - u32 hmask = mask >> 32; - int err; - - WARN_ON(system_state != SYSTEM_BOOTING); - - if (static_cpu_has(X86_FEATURE_XSAVES)) - XSTATE_OP(XSAVES, xstate, lmask, hmask, err); - else - XSTATE_OP(XSAVE, xstate, lmask, hmask, err); - - /* We should never fault when copying to a kernel buffer: */ - WARN_ON_FPU(err); -} - /* * This function is called only during boot time when x86 caps are not set * up and alternative can not be used yet. diff --git a/arch/x86/kernel/fpu/xstate.c b/arch/x86/kernel/fpu/xstate.c index dbd396c91348..02ad98ec5149 100644 --- a/arch/x86/kernel/fpu/xstate.c +++ b/arch/x86/kernel/fpu/xstate.c @@ -407,6 +407,24 @@ static void __init print_xstate_offset_size(void) } }
+/* + * All supported features have either init state all zeros or are + * handled in setup_init_fpu() individually. This is an explicit + * feature list and does not use XFEATURE_MASK*SUPPORTED to catch + * newly added supported features at build time and make people + * actually look at the init state for the new feature. + */ +#define XFEATURES_INIT_FPSTATE_HANDLED \ + (XFEATURE_MASK_FP | \ + XFEATURE_MASK_SSE | \ + XFEATURE_MASK_YMM | \ + XFEATURE_MASK_OPMASK | \ + XFEATURE_MASK_ZMM_Hi256 | \ + XFEATURE_MASK_Hi16_ZMM | \ + XFEATURE_MASK_PKRU | \ + XFEATURE_MASK_BNDREGS | \ + XFEATURE_MASK_BNDCSR) + /* * setup the xstate image representing the init state */ @@ -414,6 +432,8 @@ static void __init setup_init_fpu_buf(void) { static int on_boot_cpu __initdata = 1;
+ BUILD_BUG_ON(XCNTXT_MASK != XFEATURES_INIT_FPSTATE_HANDLED); + WARN_ON_FPU(!on_boot_cpu); on_boot_cpu = 0;
@@ -432,10 +452,22 @@ static void __init setup_init_fpu_buf(void) copy_kernel_to_xregs_booting(&init_fpstate.xsave);
/* - * Dump the init state again. This is to identify the init state - * of any feature which is not represented by all zero's. + * All components are now in init state. Read the state back so + * that init_fpstate contains all non-zero init state. This only + * works with XSAVE, but not with XSAVEOPT and XSAVES because + * those use the init optimization which skips writing data for + * components in init state. + * + * XSAVE could be used, but that would require to reshuffle the + * data when XSAVES is available because XSAVES uses xstate + * compaction. But doing so is a pointless exercise because most + * components have an all zeros init state except for the legacy + * ones (FP and SSE). Those can be saved with FXSAVE into the + * legacy area. Adding new features requires to ensure that init + * state is all zeroes or if not to add the necessary handling + * here. */ - copy_xregs_to_kernel_booting(&init_fpstate.xsave); + fxsave(&init_fpstate.fxsave); }
static int xfeature_uncompacted_offset(int xfeature_nr)
From: Yu Kuai yukuai3@huawei.com
[ Upstream commit 1da569fa7ec8cb0591c74aa3050d4ea1397778b4 ]
pm_runtime_get_sync will increment pm usage counter even it failed. Forgetting to putting operation will result in reference leak here. Fix it by moving the error_pm label above the pm_runtime_put() in the error path.
Reported-by: Hulk Robot hulkci@huawei.com Signed-off-by: Yu Kuai yukuai3@huawei.com Link: https://lore.kernel.org/r/20210706124521.1371901-1-yukuai3@huawei.com Signed-off-by: Vinod Koul vkoul@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/dma/sh/usb-dmac.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/dma/sh/usb-dmac.c b/drivers/dma/sh/usb-dmac.c index 6682b3eec2b6..ec15ded640f6 100644 --- a/drivers/dma/sh/usb-dmac.c +++ b/drivers/dma/sh/usb-dmac.c @@ -861,8 +861,8 @@ static int usb_dmac_probe(struct platform_device *pdev)
error: of_dma_controller_free(pdev->dev.of_node); - pm_runtime_put(&pdev->dev); error_pm: + pm_runtime_put(&pdev->dev); pm_runtime_disable(&pdev->dev); return ret; }
From: Dave Gerlach d-gerlach@ti.com
[ Upstream commit 20a6b3fd8e2e2c063b25fbf2ee74d86b898e5087 ]
Based on the latest timing specifications for the TPS65218 from the data sheet, http://www.ti.com/lit/ds/symlink/tps65218.pdf, document SLDS206 from November 2014, we must change the i2c bus speed to better fit within the minimum high SCL time required for proper i2c transfer.
When running at 400khz, measurements show that SCL spends 0.8125 uS/1.666 uS high/low which violates the requirement for minimum high period of SCL provided in datasheet Table 7.6 which is 1 uS. Switching to 100khz gives us 5 uS/5 uS high/low which both fall above the minimum given values for 100 khz, 4.0 uS/4.7 uS high/low.
Without this patch occasionally a voltage set operation from the kernel will appear to have worked but the actual voltage reflected on the PMIC will not have updated, causing problems especially with cpufreq that may update to a higher OPP without actually raising the voltage on DCDC2, leading to a hang.
Signed-off-by: Dave Gerlach d-gerlach@ti.com Signed-off-by: Kevin Hilman khilman@baylibre.com Signed-off-by: Tony Lindgren tony@atomide.com Signed-off-by: Sasha Levin sashal@kernel.org --- arch/arm/boot/dts/am43x-epos-evm.dts | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/arm/boot/dts/am43x-epos-evm.dts b/arch/arm/boot/dts/am43x-epos-evm.dts index 21918807c9f6..f42a92391289 100644 --- a/arch/arm/boot/dts/am43x-epos-evm.dts +++ b/arch/arm/boot/dts/am43x-epos-evm.dts @@ -411,7 +411,7 @@ status = "okay"; pinctrl-names = "default"; pinctrl-0 = <&i2c0_pins>; - clock-frequency = <400000>; + clock-frequency = <100000>;
tps65218: tps65218@24 { reg = <0x24>;
From: Peter Ujfalusi peter.ujfalusi@gmail.com
[ Upstream commit eda97cb095f2958bbad55684a6ca3e7d7af0176a ]
If the router_xlate can not find the controller in the available DMA devices then it should return with -EPORBE_DEFER in a same way as the of_dma_request_slave_channel() does.
The issue can be reproduced if the event router is registered before the DMA controller itself and a driver would request for a channel before the controller is registered. In of_dma_request_slave_channel(): 1. of_dma_find_controller() would find the dma_router 2. ofdma->of_dma_xlate() would fail and returned NULL 3. -ENODEV is returned as error code
with this patch we would return in this case the correct -EPROBE_DEFER and the client can try to request the channel later.
Signed-off-by: Peter Ujfalusi peter.ujfalusi@gmail.com Link: https://lore.kernel.org/r/20210717190021.21897-1-peter.ujfalusi@gmail.com Signed-off-by: Vinod Koul vkoul@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/dma/of-dma.c | 9 +++++++-- 1 file changed, 7 insertions(+), 2 deletions(-)
diff --git a/drivers/dma/of-dma.c b/drivers/dma/of-dma.c index 757cf48c1c5e..441f37b41abd 100644 --- a/drivers/dma/of-dma.c +++ b/drivers/dma/of-dma.c @@ -68,8 +68,12 @@ static struct dma_chan *of_dma_router_xlate(struct of_phandle_args *dma_spec, return NULL;
ofdma_target = of_dma_find_controller(&dma_spec_target); - if (!ofdma_target) - return NULL; + if (!ofdma_target) { + ofdma->dma_router->route_free(ofdma->dma_router->dev, + route_data); + chan = ERR_PTR(-EPROBE_DEFER); + goto err; + }
chan = ofdma_target->of_dma_xlate(&dma_spec_target, ofdma_target); if (IS_ERR_OR_NULL(chan)) { @@ -80,6 +84,7 @@ static struct dma_chan *of_dma_router_xlate(struct of_phandle_args *dma_spec, chan->route_data = route_data; }
+err: /* * Need to put the node back since the ofdma->of_dma_route_allocate * has taken it for generating the new, translated dma_spec
From: Harshvardhan Jha harshvardhan.jha@oracle.com
[ Upstream commit 77541f78eadfe9fdb018a7b8b69f0f2af2cf4b82 ]
The list_for_each_entry() iterator, "adapter" in this code, can never be NULL. If we exit the loop without finding the correct adapter then "adapter" points invalid memory that is an offset from the list head. This will eventually lead to memory corruption and presumably a kernel crash.
Link: https://lore.kernel.org/r/20210708074642.23599-1-harshvardhan.jha@oracle.com Acked-by: Sumit Saxena sumit.saxena@broadcom.com Signed-off-by: Harshvardhan Jha harshvardhan.jha@oracle.com Signed-off-by: Martin K. Petersen martin.petersen@oracle.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/scsi/megaraid/megaraid_mm.c | 21 +++++++++++++++------ 1 file changed, 15 insertions(+), 6 deletions(-)
diff --git a/drivers/scsi/megaraid/megaraid_mm.c b/drivers/scsi/megaraid/megaraid_mm.c index 4cf9ed96414f..d61df49e4e1b 100644 --- a/drivers/scsi/megaraid/megaraid_mm.c +++ b/drivers/scsi/megaraid/megaraid_mm.c @@ -250,7 +250,7 @@ mraid_mm_get_adapter(mimd_t __user *umimd, int *rval) mimd_t mimd; uint32_t adapno; int iterator; - + bool is_found;
if (copy_from_user(&mimd, umimd, sizeof(mimd_t))) { *rval = -EFAULT; @@ -266,12 +266,16 @@ mraid_mm_get_adapter(mimd_t __user *umimd, int *rval)
adapter = NULL; iterator = 0; + is_found = false;
list_for_each_entry(adapter, &adapters_list_g, list) { - if (iterator++ == adapno) break; + if (iterator++ == adapno) { + is_found = true; + break; + } }
- if (!adapter) { + if (!is_found) { *rval = -ENODEV; return NULL; } @@ -739,6 +743,7 @@ ioctl_done(uioc_t *kioc) uint32_t adapno; int iterator; mraid_mmadp_t* adapter; + bool is_found;
/* * When the kioc returns from driver, make sure it still doesn't @@ -761,19 +766,23 @@ ioctl_done(uioc_t *kioc) iterator = 0; adapter = NULL; adapno = kioc->adapno; + is_found = false;
con_log(CL_ANN, ( KERN_WARNING "megaraid cmm: completed " "ioctl that was timedout before\n"));
list_for_each_entry(adapter, &adapters_list_g, list) { - if (iterator++ == adapno) break; + if (iterator++ == adapno) { + is_found = true; + break; + } }
kioc->timedout = 0;
- if (adapter) { + if (is_found) mraid_mm_dealloc_kioc( adapter, kioc ); - } + } else { wake_up(&wait_q);
From: Ye Bin yebin10@huawei.com
[ Upstream commit bc546c0c9abb3bb2fb46866b3d1e6ade9695a5f6 ]
The following BUG_ON() was observed during RDAC scan:
[595952.944297] kernel BUG at drivers/scsi/device_handler/scsi_dh_rdac.c:427! [595952.951143] Internal error: Oops - BUG: 0 [#1] SMP ...... [595953.251065] Call trace: [595953.259054] check_ownership+0xb0/0x118 [595953.269794] rdac_bus_attach+0x1f0/0x4b0 [595953.273787] scsi_dh_handler_attach+0x3c/0xe8 [595953.278211] scsi_dh_add_device+0xc4/0xe8 [595953.282291] scsi_sysfs_add_sdev+0x8c/0x2a8 [595953.286544] scsi_probe_and_add_lun+0x9fc/0xd00 [595953.291142] __scsi_scan_target+0x598/0x630 [595953.295395] scsi_scan_target+0x120/0x130 [595953.299481] fc_user_scan+0x1a0/0x1c0 [scsi_transport_fc] [595953.304944] store_scan+0xb0/0x108 [595953.308420] dev_attr_store+0x44/0x60 [595953.312160] sysfs_kf_write+0x58/0x80 [595953.315893] kernfs_fop_write+0xe8/0x1f0 [595953.319888] __vfs_write+0x60/0x190 [595953.323448] vfs_write+0xac/0x1c0 [595953.326836] ksys_write+0x74/0xf0 [595953.330221] __arm64_sys_write+0x24/0x30
Code is in check_ownership:
list_for_each_entry_rcu(tmp, &h->ctlr->dh_list, node) { /* h->sdev should always be valid */ BUG_ON(!tmp->sdev); tmp->sdev->access_state = access_state; }
rdac_bus_attach initialize_controller list_add_rcu(&h->node, &h->ctlr->dh_list); h->sdev = sdev;
rdac_bus_detach list_del_rcu(&h->node); h->sdev = NULL;
Fix the race between rdac_bus_attach() and rdac_bus_detach() where h->sdev is NULL when processing the RDAC attach.
Link: https://lore.kernel.org/r/20210113063103.2698953-1-yebin10@huawei.com Reviewed-by: Bart Van Assche bvanassche@acm.org Signed-off-by: Ye Bin yebin10@huawei.com Signed-off-by: Martin K. Petersen martin.petersen@oracle.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/scsi/device_handler/scsi_dh_rdac.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/scsi/device_handler/scsi_dh_rdac.c b/drivers/scsi/device_handler/scsi_dh_rdac.c index 06fbd0b0c68a..6ddb3e9f21ba 100644 --- a/drivers/scsi/device_handler/scsi_dh_rdac.c +++ b/drivers/scsi/device_handler/scsi_dh_rdac.c @@ -526,8 +526,8 @@ static int initialize_controller(struct scsi_device *sdev, if (!h->ctlr) err = SCSI_DH_RES_TEMP_UNAVAIL; else { - list_add_rcu(&h->node, &h->ctlr->dh_list); h->sdev = sdev; + list_add_rcu(&h->node, &h->ctlr->dh_list); } spin_unlock(&list_lock); } @@ -852,11 +852,11 @@ static void rdac_bus_detach( struct scsi_device *sdev ) spin_lock(&list_lock); if (h->ctlr) { list_del_rcu(&h->node); - h->sdev = NULL; kref_put(&h->ctlr->kref, release_controller); } spin_unlock(&list_lock); sdev->handler_data = NULL; + synchronize_rcu(); kfree(h); }
From: Sreekanth Reddy sreekanth.reddy@broadcom.com
[ Upstream commit 70edd2e6f652f67d854981fd67f9ad0f1deaea92 ]
Avoid printing a 'target allocation failed' error if the driver target_alloc() callback function returns -ENXIO. This return value indicates that the corresponding H:C:T:L entry is empty.
Removing this error reduces the scan time if the user issues SCAN_WILD_CARD scan operation through sysfs parameter on a host with a lot of empty H:C:T:L entries.
Avoiding the printk on -ENXIO matches the behavior of the other callback functions during scanning.
Link: https://lore.kernel.org/r/20210726115402.1936-1-sreekanth.reddy@broadcom.com Signed-off-by: Sreekanth Reddy sreekanth.reddy@broadcom.com Signed-off-by: Martin K. Petersen martin.petersen@oracle.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/scsi/scsi_scan.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/drivers/scsi/scsi_scan.c b/drivers/scsi/scsi_scan.c index 397deb69c659..e51819e3a508 100644 --- a/drivers/scsi/scsi_scan.c +++ b/drivers/scsi/scsi_scan.c @@ -460,7 +460,8 @@ static struct scsi_target *scsi_alloc_target(struct device *parent, error = shost->hostt->target_alloc(starget);
if(error) { - dev_printk(KERN_ERR, dev, "target allocation failed, error %d\n", error); + if (error != -ENXIO) + dev_err(dev, "target allocation failed, error %d\n", error); /* don't want scsi_target_reap to do the final * put because it will be under the host lock */ scsi_target_destroy(starget);
From: Sudeep Holla sudeep.holla@arm.com
[ Upstream commit 47091f473b364c98207c4def197a0ae386fc9af1 ]
Once the new schema interrupt-controller/arm,vic.yaml is added, we get the below warnings:
arch/arm/boot/dts/ste-nomadik-nhk15.dt.yaml: intc@10140000: $nodename:0: 'intc@10140000' does not match '^interrupt-controller(@[0-9a-f,]+)*$'
Fix the node names for the interrupt controller to conform to the standard node name interrupt-controller@..
Signed-off-by: Sudeep Holla sudeep.holla@arm.com Signed-off-by: Linus Walleij linus.walleij@linaro.org Cc: Linus Walleij linus.walleij@linaro.org Link: https://lore.kernel.org/r/20210617210825.3064367-2-sudeep.holla@arm.com Link: https://lore.kernel.org/r/20210626000103.830184-1-linus.walleij@linaro.org' Signed-off-by: Arnd Bergmann arnd@arndb.de Signed-off-by: Sasha Levin sashal@kernel.org --- arch/arm/boot/dts/ste-nomadik-stn8815.dtsi | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/arch/arm/boot/dts/ste-nomadik-stn8815.dtsi b/arch/arm/boot/dts/ste-nomadik-stn8815.dtsi index 1077ceebb2d6..87494773f409 100644 --- a/arch/arm/boot/dts/ste-nomadik-stn8815.dtsi +++ b/arch/arm/boot/dts/ste-nomadik-stn8815.dtsi @@ -755,14 +755,14 @@ status = "disabled"; };
- vica: intc@10140000 { + vica: interrupt-controller@10140000 { compatible = "arm,versatile-vic"; interrupt-controller; #interrupt-cells = <1>; reg = <0x10140000 0x20>; };
- vicb: intc@10140020 { + vicb: interrupt-controller@10140020 { compatible = "arm,versatile-vic"; interrupt-controller; #interrupt-cells = <1>;
From: Ole Bjørn Midtbø omidtbo@cisco.com
[ Upstream commit cca342d98bef68151a80b024f7bf5f388d1fbdea ]
A different wait queue was used when removing ctrl_wait than when adding it. This effectively made the remove operation without locking compared to other operations on the wait queue ctrl_wait was part of. This caused issues like below where dead000000000100 is LIST_POISON1 and dead000000000200 is LIST_POISON2.
list_add corruption. next->prev should be prev (ffffffc1b0a33a08), \ but was dead000000000200. (next=ffffffc03ac77de0). ------------[ cut here ]------------ CPU: 3 PID: 2138 Comm: bluetoothd Tainted: G O 4.4.238+ #9 ... ---[ end trace 0adc2158f0646eac ]--- Call trace: [<ffffffc000443f78>] __list_add+0x38/0xb0 [<ffffffc0000f0d04>] add_wait_queue+0x4c/0x68 [<ffffffc00020eecc>] __pollwait+0xec/0x100 [<ffffffc000d1556c>] bt_sock_poll+0x74/0x200 [<ffffffc000bdb8a8>] sock_poll+0x110/0x128 [<ffffffc000210378>] do_sys_poll+0x220/0x480 [<ffffffc0002106f0>] SyS_poll+0x80/0x138 [<ffffffc00008510c>] __sys_trace_return+0x0/0x4
Unable to handle kernel paging request at virtual address dead000000000100 ... CPU: 4 PID: 5387 Comm: kworker/u15:3 Tainted: G W O 4.4.238+ #9 ... Call trace: [<ffffffc0000f079c>] __wake_up_common+0x7c/0xa8 [<ffffffc0000f0818>] __wake_up+0x50/0x70 [<ffffffc000be11b0>] sock_def_wakeup+0x58/0x60 [<ffffffc000de5e10>] l2cap_sock_teardown_cb+0x200/0x224 [<ffffffc000d3f2ac>] l2cap_chan_del+0xa4/0x298 [<ffffffc000d45ea0>] l2cap_conn_del+0x118/0x198 [<ffffffc000d45f8c>] l2cap_disconn_cfm+0x6c/0x78 [<ffffffc000d29934>] hci_event_packet+0x564/0x2e30 [<ffffffc000d19b0c>] hci_rx_work+0x10c/0x360 [<ffffffc0000c2218>] process_one_work+0x268/0x460 [<ffffffc0000c2678>] worker_thread+0x268/0x480 [<ffffffc0000c94e0>] kthread+0x118/0x128 [<ffffffc000085070>] ret_from_fork+0x10/0x20 ---[ end trace 0adc2158f0646ead ]---
Signed-off-by: Ole Bjørn Midtbø omidtbo@cisco.com Signed-off-by: Marcel Holtmann marcel@holtmann.org Signed-off-by: Sasha Levin sashal@kernel.org --- net/bluetooth/hidp/core.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/net/bluetooth/hidp/core.c b/net/bluetooth/hidp/core.c index 552e00b07196..9ec37c6c8c4a 100644 --- a/net/bluetooth/hidp/core.c +++ b/net/bluetooth/hidp/core.c @@ -1282,7 +1282,7 @@ static int hidp_session_thread(void *arg)
/* cleanup runtime environment */ remove_wait_queue(sk_sleep(session->intr_sock->sk), &intr_wait); - remove_wait_queue(sk_sleep(session->intr_sock->sk), &ctrl_wait); + remove_wait_queue(sk_sleep(session->ctrl_sock->sk), &ctrl_wait); wake_up_interruptible(&session->report_queue); hidp_del_timer(session);
From: Randy Dunlap rdunlap@infradead.org
[ Upstream commit 86aab09a4870bb8346c9579864588c3d7f555299 ]
GCC complains about empty macros in an 'if' statement, so convert them to 'do {} while (0)' macros.
Fixes these build warnings:
net/dccp/output.c: In function 'dccp_xmit_packet': ../net/dccp/output.c:283:71: warning: suggest braces around empty body in an 'if' statement [-Wempty-body] 283 | dccp_pr_debug("transmit_skb() returned err=%d\n", err); net/dccp/ackvec.c: In function 'dccp_ackvec_update_old': ../net/dccp/ackvec.c:163:80: warning: suggest braces around empty body in an 'else' statement [-Wempty-body] 163 | (unsigned long long)seqno, state);
Fixes: dc841e30eaea ("dccp: Extend CCID packet dequeueing interface") Fixes: 380240864451 ("dccp ccid-2: Update code for the Ack Vector input/registration routine") Signed-off-by: Randy Dunlap rdunlap@infradead.org Cc: dccp@vger.kernel.org Cc: "David S. Miller" davem@davemloft.net Cc: Jakub Kicinski kuba@kernel.org Cc: Gerrit Renker gerrit@erg.abdn.ac.uk Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- net/dccp/dccp.h | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/net/dccp/dccp.h b/net/dccp/dccp.h index 0c55ffb859bf..121aa71fcb5c 100644 --- a/net/dccp/dccp.h +++ b/net/dccp/dccp.h @@ -44,9 +44,9 @@ extern bool dccp_debug; #define dccp_pr_debug_cat(format, a...) DCCP_PRINTK(dccp_debug, format, ##a) #define dccp_debug(fmt, a...) dccp_pr_debug_cat(KERN_DEBUG fmt, ##a) #else -#define dccp_pr_debug(format, a...) -#define dccp_pr_debug_cat(format, a...) -#define dccp_debug(format, a...) +#define dccp_pr_debug(format, a...) do {} while (0) +#define dccp_pr_debug_cat(format, a...) do {} while (0) +#define dccp_debug(format, a...) do {} while (0) #endif
extern struct inet_hashinfo dccp_hashinfo;
From: Xie Yongji xieyongji@bytedance.com
[ Upstream commit f7ad318ea0ad58ebe0e595e59aed270bb643b29b ]
This fixes the incorrect calculation for integer overflow when the last address of iova range is 0xffffffff.
Fixes: ec33d031a14b ("vhost: detect 32 bit integer wrap around") Reported-by: Jason Wang jasowang@redhat.com Signed-off-by: Xie Yongji xieyongji@bytedance.com Acked-by: Jason Wang jasowang@redhat.com Link: https://lore.kernel.org/r/20210728130756.97-2-xieyongji@bytedance.com Signed-off-by: Michael S. Tsirkin mst@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/vhost/vhost.c | 10 ++++++++-- 1 file changed, 8 insertions(+), 2 deletions(-)
diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c index d2431afeda84..62c61a283b35 100644 --- a/drivers/vhost/vhost.c +++ b/drivers/vhost/vhost.c @@ -675,10 +675,16 @@ static int log_access_ok(void __user *log_base, u64 addr, unsigned long sz) (sz + VHOST_PAGE_SIZE * 8 - 1) / VHOST_PAGE_SIZE / 8); }
+/* Make sure 64 bit math will not overflow. */ static bool vhost_overflow(u64 uaddr, u64 size) { - /* Make sure 64 bit math will not overflow. */ - return uaddr > ULONG_MAX || size > ULONG_MAX || uaddr > ULONG_MAX - size; + if (uaddr > ULONG_MAX || size > ULONG_MAX) + return true; + + if (!size) + return false; + + return uaddr > ULONG_MAX - size + 1; }
/* Caller should have vq mutex and device mutex. */
From: Pavel Skripkin paskripkin@gmail.com
[ Upstream commit 19d1532a187669ce86d5a2696eb7275310070793 ]
Syzbot reported slab-out-of bounds write in decode_data(). The problem was in missing validation checks.
Syzbot's reproducer generated malicious input, which caused decode_data() to be called a lot in sixpack_decode(). Since rx_count_cooked is only 400 bytes and noone reported before, that 400 bytes is not enough, let's just check if input is malicious and complain about buffer overrun.
Fail log: ================================================================== BUG: KASAN: slab-out-of-bounds in drivers/net/hamradio/6pack.c:843 Write of size 1 at addr ffff888087c5544e by task kworker/u4:0/7
CPU: 0 PID: 7 Comm: kworker/u4:0 Not tainted 5.6.0-rc3-syzkaller #0 ... Workqueue: events_unbound flush_to_ldisc Call Trace: __dump_stack lib/dump_stack.c:77 [inline] dump_stack+0x197/0x210 lib/dump_stack.c:118 print_address_description.constprop.0.cold+0xd4/0x30b mm/kasan/report.c:374 __kasan_report.cold+0x1b/0x32 mm/kasan/report.c:506 kasan_report+0x12/0x20 mm/kasan/common.c:641 __asan_report_store1_noabort+0x17/0x20 mm/kasan/generic_report.c:137 decode_data.part.0+0x23b/0x270 drivers/net/hamradio/6pack.c:843 decode_data drivers/net/hamradio/6pack.c:965 [inline] sixpack_decode drivers/net/hamradio/6pack.c:968 [inline]
Reported-and-tested-by: syzbot+fc8cd9a673d4577fb2e4@syzkaller.appspotmail.com Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Pavel Skripkin paskripkin@gmail.com Reviewed-by: Dan Carpenter dan.carpenter@oracle.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/hamradio/6pack.c | 6 ++++++ 1 file changed, 6 insertions(+)
diff --git a/drivers/net/hamradio/6pack.c b/drivers/net/hamradio/6pack.c index 03c96a6cbafd..e510dbda77e5 100644 --- a/drivers/net/hamradio/6pack.c +++ b/drivers/net/hamradio/6pack.c @@ -870,6 +870,12 @@ static void decode_data(struct sixpack *sp, unsigned char inbyte) return; }
+ if (sp->rx_count_cooked + 2 >= sizeof(sp->cooked_buf)) { + pr_err("6pack: cooked buffer overrun, data loss\n"); + sp->rx_count = 0; + return; + } + buf = sp->raw_buf; sp->cooked_buf[sp->rx_count_cooked++] = buf[0] | ((buf[1] << 2) & 0xc0);
From: Dinghao Liu dinghao.liu@zju.edu.cn
[ Upstream commit 0a298d133893c72c96e2156ed7cb0f0c4a306a3e ]
qlcnic_83xx_unlock_flash() is called on all paths after we call qlcnic_83xx_lock_flash(), except for one error path on failure of QLCRD32(), which may cause a deadlock. This bug is suggested by a static analysis tool, please advise.
Fixes: 81d0aeb0a4fff ("qlcnic: flash template based firmware reset recovery") Signed-off-by: Dinghao Liu dinghao.liu@zju.edu.cn Link: https://lore.kernel.org/r/20210816131405.24024-1-dinghao.liu@zju.edu.cn Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/qlogic/qlcnic/qlcnic_83xx_hw.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/qlogic/qlcnic/qlcnic_83xx_hw.c b/drivers/net/ethernet/qlogic/qlcnic/qlcnic_83xx_hw.c index 5d2de48b77a0..dce36e9e1879 100644 --- a/drivers/net/ethernet/qlogic/qlcnic/qlcnic_83xx_hw.c +++ b/drivers/net/ethernet/qlogic/qlcnic/qlcnic_83xx_hw.c @@ -3157,8 +3157,10 @@ int qlcnic_83xx_flash_read32(struct qlcnic_adapter *adapter, u32 flash_addr,
indirect_addr = QLC_83XX_FLASH_DIRECT_DATA(addr); ret = QLCRD32(adapter, indirect_addr, &err); - if (err == -EIO) + if (err == -EIO) { + qlcnic_83xx_unlock_flash(adapter); return err; + }
word = ret; *(u32 *)p_data = word;
From: Jaehoon Chung jh80.chung@samsung.com
[ Upstream commit e13c3c081845b51e8ba71a90e91c52679cfdbf89 ]
stop_cmdr should be set to values relevant to stop command. It migth be assigned to values whatever there is mrq->stop or not. Then it doesn't need to use dw_mci_prepare_command(). It's enough to use the prep_stop_abort for preparing stop command.
Signed-off-by: Jaehoon Chung jh80.chung@samsung.com Tested-by: Heiko Stuebner heiko@sntech.de Reviewed-by: Shawn Lin shawn.lin@rock-chips.com Signed-off-by: Ulf Hansson ulf.hansson@linaro.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/mmc/host/dw_mmc.c | 15 +++++---------- 1 file changed, 5 insertions(+), 10 deletions(-)
diff --git a/drivers/mmc/host/dw_mmc.c b/drivers/mmc/host/dw_mmc.c index d9c7fd0cabaf..4b3e1079c39f 100644 --- a/drivers/mmc/host/dw_mmc.c +++ b/drivers/mmc/host/dw_mmc.c @@ -380,7 +380,7 @@ static void dw_mci_start_command(struct dw_mci *host,
static inline void send_stop_abort(struct dw_mci *host, struct mmc_data *data) { - struct mmc_command *stop = data->stop ? data->stop : &host->stop_abort; + struct mmc_command *stop = &host->stop_abort;
dw_mci_start_command(host, stop, host->stop_cmdr); } @@ -1280,10 +1280,7 @@ static void __dw_mci_start_request(struct dw_mci *host, spin_unlock_irqrestore(&host->irq_lock, irqflags); }
- if (mrq->stop) - host->stop_cmdr = dw_mci_prepare_command(slot->mmc, mrq->stop); - else - host->stop_cmdr = dw_mci_prep_stop_abort(host, cmd); + host->stop_cmdr = dw_mci_prep_stop_abort(host, cmd); }
static void dw_mci_start_request(struct dw_mci *host, @@ -1895,8 +1892,7 @@ static void dw_mci_tasklet_func(unsigned long priv) if (test_and_clear_bit(EVENT_DATA_ERROR, &host->pending_events)) { dw_mci_stop_dma(host); - if (data->stop || - !(host->data_status & (SDMMC_INT_DRTO | + if (!(host->data_status & (SDMMC_INT_DRTO | SDMMC_INT_EBE))) send_stop_abort(host, data); state = STATE_DATA_ERROR; @@ -1932,8 +1928,7 @@ static void dw_mci_tasklet_func(unsigned long priv) if (test_and_clear_bit(EVENT_DATA_ERROR, &host->pending_events)) { dw_mci_stop_dma(host); - if (data->stop || - !(host->data_status & (SDMMC_INT_DRTO | + if (!(host->data_status & (SDMMC_INT_DRTO | SDMMC_INT_EBE))) send_stop_abort(host, data); state = STATE_DATA_ERROR; @@ -2009,7 +2004,7 @@ static void dw_mci_tasklet_func(unsigned long priv) host->cmd = NULL; host->data = NULL;
- if (mrq->stop) + if (!mrq->sbc && mrq->stop) dw_mci_command_complete(host, mrq->stop); else host->cmd_status = 0;
From: Vincent Whitchurch vincent.whitchurch@axis.com
[ Upstream commit 25f8203b4be1937c4939bb98623e67dcfd7da4d1 ]
When a Data CRC interrupt is received, the driver disables the DMA, then sends the stop/abort command and then waits for Data Transfer Over.
However, sometimes, when a data CRC error is received in the middle of a multi-block write transfer, the Data Transfer Over interrupt is never received, and the driver hangs and never completes the request.
The driver sets the BMOD.SWR bit (SDMMC_IDMAC_SWRESET) when stopping the DMA, but according to the manual CMD.STOP_ABORT_CMD should be programmed "before assertion of SWR". Do these operations in the recommended order. With this change the Data Transfer Over is always received correctly in my tests.
Signed-off-by: Vincent Whitchurch vincent.whitchurch@axis.com Reviewed-by: Jaehoon Chung jh80.chung@samsung.com Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20210630102232.16011-1-vincent.whitchurch@axis.com Signed-off-by: Ulf Hansson ulf.hansson@linaro.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/mmc/host/dw_mmc.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/drivers/mmc/host/dw_mmc.c b/drivers/mmc/host/dw_mmc.c index 4b3e1079c39f..c6b91efaa956 100644 --- a/drivers/mmc/host/dw_mmc.c +++ b/drivers/mmc/host/dw_mmc.c @@ -1866,8 +1866,8 @@ static void dw_mci_tasklet_func(unsigned long priv) continue; }
- dw_mci_stop_dma(host); send_stop_abort(host, data); + dw_mci_stop_dma(host); state = STATE_SENDING_STOP; break; } @@ -1891,10 +1891,10 @@ static void dw_mci_tasklet_func(unsigned long priv) */ if (test_and_clear_bit(EVENT_DATA_ERROR, &host->pending_events)) { - dw_mci_stop_dma(host); if (!(host->data_status & (SDMMC_INT_DRTO | SDMMC_INT_EBE))) send_stop_abort(host, data); + dw_mci_stop_dma(host); state = STATE_DATA_ERROR; break; } @@ -1927,10 +1927,10 @@ static void dw_mci_tasklet_func(unsigned long priv) */ if (test_and_clear_bit(EVENT_DATA_ERROR, &host->pending_events)) { - dw_mci_stop_dma(host); if (!(host->data_status & (SDMMC_INT_DRTO | SDMMC_INT_EBE))) send_stop_abort(host, data); + dw_mci_stop_dma(host); state = STATE_DATA_ERROR; break; }
From: Jaroslav Kysela perex@perex.cz
[ Upstream commit a2befe9380dd04ee76c871568deca00eedf89134 ]
The original code in the cap_put_caller() function does not handle correctly the positive values returned from the passed function for multiple iterations. It means that the change notifications may be lost.
Fixes: 352f7f914ebb ("ALSA: hda - Merge Realtek parser code to generic parser") BugLink: https://bugzilla.kernel.org/show_bug.cgi?id=213851 Cc: stable@kernel.org Signed-off-by: Jaroslav Kysela perex@perex.cz Link: https://lore.kernel.org/r/20210811161441.1325250-1-perex@perex.cz Signed-off-by: Takashi Iwai tiwai@suse.de Signed-off-by: Sasha Levin sashal@kernel.org --- sound/pci/hda/hda_generic.c | 10 +++++++--- 1 file changed, 7 insertions(+), 3 deletions(-)
diff --git a/sound/pci/hda/hda_generic.c b/sound/pci/hda/hda_generic.c index 8d99ac931ff6..c29f7ff5ccd2 100644 --- a/sound/pci/hda/hda_generic.c +++ b/sound/pci/hda/hda_generic.c @@ -3421,7 +3421,7 @@ static int cap_put_caller(struct snd_kcontrol *kcontrol, struct hda_gen_spec *spec = codec->spec; const struct hda_input_mux *imux; struct nid_path *path; - int i, adc_idx, err = 0; + int i, adc_idx, ret, err = 0;
imux = &spec->input_mux; adc_idx = kcontrol->id.index; @@ -3431,9 +3431,13 @@ static int cap_put_caller(struct snd_kcontrol *kcontrol, if (!path || !path->ctls[type]) continue; kcontrol->private_value = path->ctls[type]; - err = func(kcontrol, ucontrol); - if (err < 0) + ret = func(kcontrol, ucontrol); + if (ret < 0) { + err = ret; break; + } + if (ret > 0) + err = 1; } mutex_unlock(&codec->control_mutex); if (err >= 0 && spec->cap_sync_hook)
From: Dongliang Mu mudongliangabcd@gmail.com
[ Upstream commit 57a1681095f912239c7fb4d66683ab0425973838 ]
The function tpci200_register called by tpci200_install and tpci200_unregister called by tpci200_uninstall are in pair. However, tpci200_unregister has some cleanup operations not in the tpci200_register. So the error handling code of tpci200_pci_probe has many different double free issues.
Fix this problem by moving those cleanup operations out of tpci200_unregister, into tpci200_pci_remove and reverting the previous commit 9272e5d0028d ("ipack/carriers/tpci200: Fix a double free in tpci200_pci_probe").
Fixes: 9272e5d0028d ("ipack/carriers/tpci200: Fix a double free in tpci200_pci_probe") Cc: stable@vger.kernel.org Reported-by: Dongliang Mu mudongliangabcd@gmail.com Signed-off-by: Dongliang Mu mudongliangabcd@gmail.com Link: https://lore.kernel.org/r/20210810100323.3938492-1-mudongliangabcd@gmail.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/ipack/carriers/tpci200.c | 36 ++++++++++++++++---------------- 1 file changed, 18 insertions(+), 18 deletions(-)
diff --git a/drivers/ipack/carriers/tpci200.c b/drivers/ipack/carriers/tpci200.c index 7ba1a94497f5..4294523bede5 100644 --- a/drivers/ipack/carriers/tpci200.c +++ b/drivers/ipack/carriers/tpci200.c @@ -94,16 +94,13 @@ static void tpci200_unregister(struct tpci200_board *tpci200) free_irq(tpci200->info->pdev->irq, (void *) tpci200);
pci_iounmap(tpci200->info->pdev, tpci200->info->interface_regs); - pci_iounmap(tpci200->info->pdev, tpci200->info->cfg_regs);
pci_release_region(tpci200->info->pdev, TPCI200_IP_INTERFACE_BAR); pci_release_region(tpci200->info->pdev, TPCI200_IO_ID_INT_SPACES_BAR); pci_release_region(tpci200->info->pdev, TPCI200_MEM16_SPACE_BAR); pci_release_region(tpci200->info->pdev, TPCI200_MEM8_SPACE_BAR); - pci_release_region(tpci200->info->pdev, TPCI200_CFG_MEM_BAR);
pci_disable_device(tpci200->info->pdev); - pci_dev_put(tpci200->info->pdev); }
static void tpci200_enable_irq(struct tpci200_board *tpci200, @@ -524,7 +521,7 @@ static int tpci200_pci_probe(struct pci_dev *pdev, tpci200->info = kzalloc(sizeof(struct tpci200_infos), GFP_KERNEL); if (!tpci200->info) { ret = -ENOMEM; - goto out_err_info; + goto err_tpci200; }
pci_dev_get(pdev); @@ -535,7 +532,7 @@ static int tpci200_pci_probe(struct pci_dev *pdev, if (ret) { dev_err(&pdev->dev, "Failed to allocate PCI Configuration Memory"); ret = -EBUSY; - goto out_err_pci_request; + goto err_tpci200_info; } tpci200->info->cfg_regs = ioremap_nocache( pci_resource_start(pdev, TPCI200_CFG_MEM_BAR), @@ -543,7 +540,7 @@ static int tpci200_pci_probe(struct pci_dev *pdev, if (!tpci200->info->cfg_regs) { dev_err(&pdev->dev, "Failed to map PCI Configuration Memory"); ret = -EFAULT; - goto out_err_ioremap; + goto err_request_region; }
/* Disable byte swapping for 16 bit IP module access. This will ensure @@ -566,7 +563,7 @@ static int tpci200_pci_probe(struct pci_dev *pdev, if (ret) { dev_err(&pdev->dev, "error during tpci200 install\n"); ret = -ENODEV; - goto out_err_install; + goto err_cfg_regs; }
/* Register the carrier in the industry pack bus driver */ @@ -578,7 +575,7 @@ static int tpci200_pci_probe(struct pci_dev *pdev, dev_err(&pdev->dev, "error registering the carrier on ipack driver\n"); ret = -EFAULT; - goto out_err_bus_register; + goto err_tpci200_install; }
/* save the bus number given by ipack to logging purpose */ @@ -589,19 +586,16 @@ static int tpci200_pci_probe(struct pci_dev *pdev, tpci200_create_device(tpci200, i); return 0;
-out_err_bus_register: +err_tpci200_install: tpci200_uninstall(tpci200); - /* tpci200->info->cfg_regs is unmapped in tpci200_uninstall */ - tpci200->info->cfg_regs = NULL; -out_err_install: - if (tpci200->info->cfg_regs) - iounmap(tpci200->info->cfg_regs); -out_err_ioremap: +err_cfg_regs: + pci_iounmap(tpci200->info->pdev, tpci200->info->cfg_regs); +err_request_region: pci_release_region(pdev, TPCI200_CFG_MEM_BAR); -out_err_pci_request: - pci_dev_put(pdev); +err_tpci200_info: kfree(tpci200->info); -out_err_info: + pci_dev_put(pdev); +err_tpci200: kfree(tpci200); return ret; } @@ -611,6 +605,12 @@ static void __tpci200_pci_remove(struct tpci200_board *tpci200) ipack_bus_unregister(tpci200->info->ipack_bus); tpci200_uninstall(tpci200);
+ pci_iounmap(tpci200->info->pdev, tpci200->info->cfg_regs); + + pci_release_region(tpci200->info->pdev, TPCI200_CFG_MEM_BAR); + + pci_dev_put(tpci200->info->pdev); + kfree(tpci200->info); kfree(tpci200); }
From: NeilBrown neilb@suse.de
[ Upstream commit 3f79f6f6247c83f448c8026c3ee16d4636ef8d4f ]
Cross-rename lacks a check when that would prevent exchanging a directory and subvolume from different parent subvolume. This causes data inconsistencies and is caught before commit by tree-checker, turning the filesystem to read-only.
Calling the renameat2 with RENAME_EXCHANGE flags like
renameat2(AT_FDCWD, namesrc, AT_FDCWD, namedest, (1 << 1))
on two paths:
namesrc = dir1/subvol1/dir2 namedest = subvol2/subvol3
will cause key order problem with following write time tree-checker report:
[1194842.307890] BTRFS critical (device loop1): corrupt leaf: root=5 block=27574272 slot=10 ino=258, invalid previous key objectid, have 257 expect 258 [1194842.322221] BTRFS info (device loop1): leaf 27574272 gen 8 total ptrs 11 free space 15444 owner 5 [1194842.331562] BTRFS info (device loop1): refs 2 lock_owner 0 current 26561 [1194842.338772] item 0 key (256 1 0) itemoff 16123 itemsize 160 [1194842.338793] inode generation 3 size 16 mode 40755 [1194842.338801] item 1 key (256 12 256) itemoff 16111 itemsize 12 [1194842.338809] item 2 key (256 84 2248503653) itemoff 16077 itemsize 34 [1194842.338817] dir oid 258 type 2 [1194842.338823] item 3 key (256 84 2363071922) itemoff 16043 itemsize 34 [1194842.338830] dir oid 257 type 2 [1194842.338836] item 4 key (256 96 2) itemoff 16009 itemsize 34 [1194842.338843] item 5 key (256 96 3) itemoff 15975 itemsize 34 [1194842.338852] item 6 key (257 1 0) itemoff 15815 itemsize 160 [1194842.338863] inode generation 6 size 8 mode 40755 [1194842.338869] item 7 key (257 12 256) itemoff 15801 itemsize 14 [1194842.338876] item 8 key (257 84 2505409169) itemoff 15767 itemsize 34 [1194842.338883] dir oid 256 type 2 [1194842.338888] item 9 key (257 96 2) itemoff 15733 itemsize 34 [1194842.338895] item 10 key (258 12 256) itemoff 15719 itemsize 14 [1194842.339163] BTRFS error (device loop1): block=27574272 write time tree block corruption detected [1194842.339245] ------------[ cut here ]------------ [1194842.443422] WARNING: CPU: 6 PID: 26561 at fs/btrfs/disk-io.c:449 csum_one_extent_buffer+0xed/0x100 [btrfs] [1194842.511863] CPU: 6 PID: 26561 Comm: kworker/u17:2 Not tainted 5.14.0-rc3-git+ #793 [1194842.511870] Hardware name: empty empty/S3993, BIOS PAQEX0-3 02/24/2008 [1194842.511876] Workqueue: btrfs-worker-high btrfs_work_helper [btrfs] [1194842.511976] RIP: 0010:csum_one_extent_buffer+0xed/0x100 [btrfs] [1194842.512068] RSP: 0018:ffffa2c284d77da0 EFLAGS: 00010282 [1194842.512074] RAX: 0000000000000000 RBX: 0000000000001000 RCX: ffff928867bd9978 [1194842.512078] RDX: 0000000000000000 RSI: 0000000000000027 RDI: ffff928867bd9970 [1194842.512081] RBP: ffff92876b958000 R08: 0000000000000001 R09: 00000000000c0003 [1194842.512085] R10: 0000000000000000 R11: 0000000000000001 R12: 0000000000000000 [1194842.512088] R13: ffff92875f989f98 R14: 0000000000000000 R15: 0000000000000000 [1194842.512092] FS: 0000000000000000(0000) GS:ffff928867a00000(0000) knlGS:0000000000000000 [1194842.512095] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [1194842.512099] CR2: 000055f5384da1f0 CR3: 0000000102fe4000 CR4: 00000000000006e0 [1194842.512103] Call Trace: [1194842.512128] ? run_one_async_free+0x10/0x10 [btrfs] [1194842.631729] btree_csum_one_bio+0x1ac/0x1d0 [btrfs] [1194842.631837] run_one_async_start+0x18/0x30 [btrfs] [1194842.631938] btrfs_work_helper+0xd5/0x1d0 [btrfs] [1194842.647482] process_one_work+0x262/0x5e0 [1194842.647520] worker_thread+0x4c/0x320 [1194842.655935] ? process_one_work+0x5e0/0x5e0 [1194842.655946] kthread+0x135/0x160 [1194842.655953] ? set_kthread_struct+0x40/0x40 [1194842.655965] ret_from_fork+0x1f/0x30 [1194842.672465] irq event stamp: 1729 [1194842.672469] hardirqs last enabled at (1735): [<ffffffffbd1104f5>] console_trylock_spinning+0x185/0x1a0 [1194842.672477] hardirqs last disabled at (1740): [<ffffffffbd1104cc>] console_trylock_spinning+0x15c/0x1a0 [1194842.672482] softirqs last enabled at (1666): [<ffffffffbdc002e1>] __do_softirq+0x2e1/0x50a [1194842.672491] softirqs last disabled at (1651): [<ffffffffbd08aab7>] __irq_exit_rcu+0xa7/0xd0
The corrupted data will not be written, and filesystem can be unmounted and mounted again (all changes since the last commit will be lost).
Add the missing check for new_ino so that all non-subvolumes must reside under the same parent subvolume. There's an exception allowing to exchange two subvolumes from any parents as the directory representing a subvolume is only a logical link and does not have any other structures related to the parent subvolume, unlike files, directories etc, that are always in the inode namespace of the parent subvolume.
Fixes: cdd1fedf8261 ("btrfs: add support for RENAME_EXCHANGE and RENAME_WHITEOUT") CC: stable@vger.kernel.org # 4.7+ Reviewed-by: Nikolay Borisov nborisov@suse.com Signed-off-by: NeilBrown neilb@suse.de Reviewed-by: David Sterba dsterba@suse.com Signed-off-by: David Sterba dsterba@suse.com Signed-off-by: Sasha Levin sashal@kernel.org --- fs/btrfs/inode.c | 10 ++++++++-- 1 file changed, 8 insertions(+), 2 deletions(-)
diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c index a55d23a73cdb..b744e7d33d87 100644 --- a/fs/btrfs/inode.c +++ b/fs/btrfs/inode.c @@ -9632,8 +9632,14 @@ static int btrfs_rename_exchange(struct inode *old_dir, bool root_log_pinned = false; bool dest_log_pinned = false;
- /* we only allow rename subvolume link between subvolumes */ - if (old_ino != BTRFS_FIRST_FREE_OBJECTID && root != dest) + /* + * For non-subvolumes allow exchange only within one subvolume, in the + * same inode namespace. Two subvolumes (represented as directory) can + * be exchanged as they're a logical link and have a fixed inode number. + */ + if (root != dest && + (old_ino != BTRFS_FIRST_FREE_OBJECTID || + new_ino != BTRFS_FIRST_FREE_OBJECTID)) return -EXDEV;
/* close the race window with snapshot create/destroy ioctl */
From: Takashi Iwai tiwai@suse.de
[ Upstream commit 65ca89c2b12cca0d473f3dd54267568ad3af55cc ]
The commit 2e6b836312a4 ("ASoC: intel: atom: Fix reference to PCM buffer address") changed the reference of PCM buffer address to substream->runtime->dma_addr as the buffer address may change dynamically. However, I forgot that the dma_addr field is still not set up for the CONTINUOUS buffer type (that this driver uses) yet in 5.14 and earlier kernels, and it resulted in garbage I/O. The problem will be fixed in 5.15, but we need to address it quickly for now.
The fix is to deduce the address again from the DMA pointer with virt_to_phys(), but from the right one, substream->runtime->dma_area.
Fixes: 2e6b836312a4 ("ASoC: intel: atom: Fix reference to PCM buffer address") Reported-and-tested-by: Hans de Goede hdegoede@redhat.com Cc: stable@vger.kernel.org Acked-by: Mark Brown broonie@kernel.org Link: https://lore.kernel.org/r/2048c6aa-2187-46bd-6772-36a4fb3c5aeb@redhat.com Link: https://lore.kernel.org/r/20210819152945.8510-1-tiwai@suse.de Signed-off-by: Takashi Iwai tiwai@suse.de Signed-off-by: Sasha Levin sashal@kernel.org --- sound/soc/intel/atom/sst-mfld-platform-pcm.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/sound/soc/intel/atom/sst-mfld-platform-pcm.c b/sound/soc/intel/atom/sst-mfld-platform-pcm.c index 6303b2d3982d..1b6dedfc33e3 100644 --- a/sound/soc/intel/atom/sst-mfld-platform-pcm.c +++ b/sound/soc/intel/atom/sst-mfld-platform-pcm.c @@ -135,7 +135,7 @@ static void sst_fill_alloc_params(struct snd_pcm_substream *substream, snd_pcm_uframes_t period_size; ssize_t periodbytes; ssize_t buffer_bytes = snd_pcm_lib_buffer_bytes(substream); - u32 buffer_addr = substream->runtime->dma_addr; + u32 buffer_addr = virt_to_phys(substream->runtime->dma_area);
channels = substream->runtime->channels; period_size = substream->runtime->period_size;
From: Jeff Layton jlayton@kernel.org
[ Upstream commit df2474a22c42ce419b67067c52d71da06c385501 ]
Since 9e8925b67a ("locks: Allow disabling mandatory locking at compile time"), attempts to mount filesystems with "-o mand" will fail. Unfortunately, there is no other indiciation of the reason for the failure.
Change how the function is defined for better readability. When CONFIG_MANDATORY_FILE_LOCKING is disabled, printk a warning when someone attempts to mount with -o mand.
Also, add a blurb to the mandatory-locking.txt file to explain about the "mand" option, and the behavior one should expect when it is disabled.
Reported-by: Jan Kara jack@suse.cz Reviewed-by: Jan Kara jack@suse.cz Signed-off-by: Jeff Layton jlayton@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- Documentation/filesystems/mandatory-locking.txt | 10 ++++++++++ fs/namespace.c | 11 ++++++++--- 2 files changed, 18 insertions(+), 3 deletions(-)
diff --git a/Documentation/filesystems/mandatory-locking.txt b/Documentation/filesystems/mandatory-locking.txt index 0979d1d2ca8b..a251ca33164a 100644 --- a/Documentation/filesystems/mandatory-locking.txt +++ b/Documentation/filesystems/mandatory-locking.txt @@ -169,3 +169,13 @@ havoc if they lock crucial files. The way around it is to change the file permissions (remove the setgid bit) before trying to read or write to it. Of course, that might be a bit tricky if the system is hung :-(
+7. The "mand" mount option +-------------------------- +Mandatory locking is disabled on all filesystems by default, and must be +administratively enabled by mounting with "-o mand". That mount option +is only allowed if the mounting task has the CAP_SYS_ADMIN capability. + +Since kernel v4.5, it is possible to disable mandatory locking +altogether by setting CONFIG_MANDATORY_FILE_LOCKING to "n". A kernel +with this disabled will reject attempts to mount filesystems with the +"mand" mount option with the error status EPERM. diff --git a/fs/namespace.c b/fs/namespace.c index 9f2390c89b63..68457fa2f981 100644 --- a/fs/namespace.c +++ b/fs/namespace.c @@ -1669,13 +1669,18 @@ static inline bool may_mount(void) return ns_capable(current->nsproxy->mnt_ns->user_ns, CAP_SYS_ADMIN); }
+#ifdef CONFIG_MANDATORY_FILE_LOCKING static inline bool may_mandlock(void) { -#ifndef CONFIG_MANDATORY_FILE_LOCKING - return false; -#endif return capable(CAP_SYS_ADMIN); } +#else +static inline bool may_mandlock(void) +{ + pr_warn("VFS: "mand" mount option not supported"); + return false; +} +#endif
/* * Now umount can handle mount points as well as block devices.
From: Jeff Layton jlayton@kernel.org
[ Upstream commit fdd92b64d15bc4aec973caa25899afd782402e68 ]
We've had CONFIG_MANDATORY_FILE_LOCKING since 2015 and a lot of distros have disabled it. Warn the stragglers that still use "-o mand" that we'll be dropping support for that mount option.
Cc: stable@vger.kernel.org Signed-off-by: Jeff Layton jlayton@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- fs/namespace.c | 6 +++++- 1 file changed, 5 insertions(+), 1 deletion(-)
diff --git a/fs/namespace.c b/fs/namespace.c index 68457fa2f981..b9e30a385c01 100644 --- a/fs/namespace.c +++ b/fs/namespace.c @@ -1670,8 +1670,12 @@ static inline bool may_mount(void) }
#ifdef CONFIG_MANDATORY_FILE_LOCKING -static inline bool may_mandlock(void) +static bool may_mandlock(void) { + pr_warn_once("======================================================\n" + "WARNING: the mand mount option is being deprecated and\n" + " will be removed in v5.15!\n" + "======================================================\n"); return capable(CAP_SYS_ADMIN); } #else
Signed-off-by: Sasha Levin sashal@kernel.org --- Makefile | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/Makefile b/Makefile index 7cd5634469b1..332713b5a28f 100644 --- a/Makefile +++ b/Makefile @@ -1,7 +1,7 @@ VERSION = 4 PATCHLEVEL = 9 -SUBLEVEL = 280 -EXTRAVERSION = +SUBLEVEL = 281 +EXTRAVERSION = -rc1 NAME = Roaring Lionus
# *DOCUMENTATION*
On Tue, Aug 24, 2021 at 01:05:31PM -0400, Sasha Levin wrote:
This is the start of the stable review cycle for the 4.9.281 release. There are 43 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Thu 26 Aug 2021 05:06:11 PM UTC. Anything received after that time might be too late.
Build results: total: 163 pass: 163 fail: 0 Qemu test results: total: 393 pass: 393 fail: 0
Tested-by: Guenter Roeck linux@roeck-us.net
Guenter
Hello!
On 8/24/21 12:05 PM, Sasha Levin wrote:
This is the start of the stable review cycle for the 4.9.281 release. There are 43 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Thu 26 Aug 2021 05:06:11 PM UTC. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git/p... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-4.9.y and the diffstat can be found below.
Thanks, Sasha
Results from Linaro's test farm. No regressions on arm64, arm, x86_64, and i386.
Tested-by: Linux Kernel Functional Testing lkft@linaro.org
## Build * kernel: 4.9.281-rc1 * git: ['https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git', 'https://gitlab.com/Linaro/lkft/mirrors/stable/linux-stable-rc'] * git branch: linux-4.9.y * git commit: 3d204357a2ed1b927c75e0166f31aa67a5d99c4e * git describe: v4.9.280-43-g3d204357a2ed * test details: https://qa-reports.linaro.org/lkft/linux-stable-rc-linux-4.9.y/build/v4.9.28...
## No regressions (compared to v4.9.280)
## No fixes (compared to v4.9.280)
## Test result summary total: 63783, pass: 51053, fail: 476, skip: 10349, xfail: 1905
## Build Summary * arm: 98 total, 98 passed, 0 failed * arm64: 28 total, 28 passed, 0 failed * dragonboard-410c: 1 total, 1 passed, 0 failed * hi6220-hikey: 1 total, 1 passed, 0 failed * i386: 15 total, 15 passed, 0 failed * juno-r2: 1 total, 1 passed, 0 failed * mips: 36 total, 36 passed, 0 failed * sparc: 9 total, 9 passed, 0 failed * x15: 1 total, 1 passed, 0 failed * x86: 1 total, 1 passed, 0 failed * x86_64: 15 total, 15 passed, 0 failed
## Test suites summary * fwts * igt-gpu-tools * install-android-platform-tools-r2600 * kselftest-android * kselftest-arm64 * kselftest-bpf * kselftest-breakpoints * kselftest-capabilities * kselftest-cgroup * kselftest-clone3 * kselftest-core * kselftest-cpu-hotplug * kselftest-cpufreq * kselftest-drivers * kselftest-efivarfs * kselftest-filesystems * kselftest-firmware * kselftest-fpu * kselftest-futex * kselftest-gpio * kselftest-intel_pstate * kselftest-ipc * kselftest-ir * kselftest-kcmp * kselftest-kexec * kselftest-kvm * kselftest-lib * kselftest-livepatch * kselftest-membarrier * kselftest-memfd * kselftest-memory-hotplug * kselftest-mincore * kselftest-mount * kselftest-mqueue * kselftest-openat2 * kselftest-pid_namespace * kselftest-pidfd * kselftest-proc * kselftest-pstore * kselftest-ptrace * kselftest-rseq * kselftest-rtc * kselftest-seccomp * kselftest-sigaltstack * kselftest-size * kselftest-splice * kselftest-static_keys * kselftest-sync * kselftest-sysctl * kselftest-timens * kselftest-timers * kselftest-tmpfs * kselftest-tpm2 * kselftest-user * kselftest-vm * kselftest-x86 * kselftest-zram * kvm-unit-tests * libhugetlbfs * linux-log-parser * ltp-cap_bounds-tests * ltp-commands-tests * ltp-containers-tests * ltp-controllers-tests * ltp-cpuhotplug-tests * ltp-crypto-tests * ltp-cve-tests * ltp-dio-tests * ltp-fcntl-locktests-tests * ltp-filecaps-tests * ltp-fs-tests * ltp-fs_bind-tests * ltp-fs_perms_simple-tests * ltp-fsx-tests * ltp-hugetlb-tests * ltp-io-tests * ltp-ipc-tests * ltp-math-tests * ltp-mm-tests * ltp-nptl-tests * ltp-open-posix-tests * ltp-pty-tests * ltp-sched-tests * ltp-securebits-tests * ltp-syscalls-tests * ltp-tracing-tests * network-basic-tests * packetdrill * perf * ssuite * v4l2-compliance
Greetings!
Daniel Díaz daniel.diaz@linaro.org
On 8/24/21 11:05 AM, Sasha Levin wrote:
This is the start of the stable review cycle for the 4.9.281 release. There are 43 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Thu 26 Aug 2021 05:06:11 PM UTC. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git/p... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-4.9.y and the diffstat can be found below.
Thanks, Sasha
Compiled and booted on my test system. No dmesg regressions.
Tested-by: Shuah Khan skhan@linuxfoundation.org
thanks, -- Shuah
linux-stable-mirror@lists.linaro.org