From: Johannes Berg <johannes.berg(a)intel.com>
Ben Greear further reports deadlocks during concurrent debugfs
remove while files are being accessed, even though the code in
question now uses debugfs cancellations. Turns out that despite
all the review on the locking, we missed completely that the
logic is wrong: if the refcount hits zero we can finish (and
need not wait for the completion), but if it doesn't we have
to trigger all the cancellations. As written, we can _never_
get into the loop triggering the cancellations. Fix this, and
explain it better while at it.
Cc: stable(a)vger.kernel.org
Fixes: 8c88a474357e ("debugfs: add API to allow debugfs operations cancellation")
Reported-by: Ben Greear <greearb(a)candelatech.com>
Closes: https://lore.kernel.org/r/1c9fa9e5-09f1-0522-fdbc-dbcef4d255ca@candelatech.…
Tested-by: Madhan Sai <madhan.singaraju(a)candelatech.com>
Signed-off-by: Johannes Berg <johannes.berg(a)intel.com>
---
fs/debugfs/inode.c | 25 ++++++++++++++++++++-----
1 file changed, 20 insertions(+), 5 deletions(-)
diff --git a/fs/debugfs/inode.c b/fs/debugfs/inode.c
index 034a617cb1a5..a40da0065433 100644
--- a/fs/debugfs/inode.c
+++ b/fs/debugfs/inode.c
@@ -751,13 +751,28 @@ static void __debugfs_file_removed(struct dentry *dentry)
if ((unsigned long)fsd & DEBUGFS_FSDATA_IS_REAL_FOPS_BIT)
return;
- /* if we hit zero, just wait for all to finish */
- if (!refcount_dec_and_test(&fsd->active_users)) {
- wait_for_completion(&fsd->active_users_drained);
+ /* if this was the last reference, we're done */
+ if (refcount_dec_and_test(&fsd->active_users))
return;
- }
- /* if we didn't hit zero, try to cancel any we can */
+ /*
+ * If there's still a reference, the code that obtained it can
+ * be in different states:
+ * - The common case of not using cancellations, or already
+ * after debugfs_leave_cancellation(), where we just need
+ * to wait for debugfs_file_put() which signals the completion;
+ * - inside a cancellation section, i.e. between
+ * debugfs_enter_cancellation() and debugfs_leave_cancellation(),
+ * in which case we need to trigger the ->cancel() function,
+ * and then wait for debugfs_file_put() just like in the
+ * previous case;
+ * - before debugfs_enter_cancellation() (but obviously after
+ * debugfs_file_get()), in which case we may not see the
+ * cancellation in the list on the first round of the loop,
+ * but debugfs_enter_cancellation() signals the completion
+ * after adding it, so this code gets woken up to call the
+ * ->cancel() function.
+ */
while (refcount_read(&fsd->active_users)) {
struct debugfs_cancellation *c;
--
2.43.2
Currently xhci_map_urb_for_dma() creates a temporary buffer
and copies the SG list to the new linear buffer. But if the
kzalloc_node() fails, then the following sg_pcopy_to_buffer()
can lead to crash since it tries to memcpy to NULL pointer.
So return -ENOMEM if kzalloc returns null pointer.
Cc: <stable(a)vger.kernel.org> # 5.11
Fixes: 2017a1e58472 ("usb: xhci: Use temporary buffer to consolidate SG")
Signed-off-by: Prashanth K <quic_prashk(a)quicinc.com>
---
v2: Updated -EAGAIN to -ENOMEM
drivers/usb/host/xhci.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/drivers/usb/host/xhci.c b/drivers/usb/host/xhci.c
index c057c42c36f4..35e9efdee3b2 100644
--- a/drivers/usb/host/xhci.c
+++ b/drivers/usb/host/xhci.c
@@ -1217,6 +1217,8 @@ static int xhci_map_temp_buffer(struct usb_hcd *hcd, struct urb *urb)
temp = kzalloc_node(buf_len, GFP_ATOMIC,
dev_to_node(hcd->self.sysdev));
+ if (!temp)
+ return -ENOMEM;
if (usb_urb_dir_out(urb))
sg_pcopy_to_buffer(urb->sg, urb->num_sgs,
--
2.25.1
From: Prashanth K <quic_prashk(a)quicinc.com>
Currently xhci_map_urb_for_dma() creates a temporary buffer and copies
the SG list to the new linear buffer. But if the kzalloc_node() fails,
then the following sg_pcopy_to_buffer() can lead to crash since it
tries to memcpy to NULL pointer.
So return -ENOMEM if kzalloc returns null pointer.
Cc: <stable(a)vger.kernel.org> # 5.11
Fixes: 2017a1e58472 ("usb: xhci: Use temporary buffer to consolidate SG")
Signed-off-by: Prashanth K <quic_prashk(a)quicinc.com>
Signed-off-by: Mathias Nyman <mathias.nyman(a)linux.intel.com>
---
drivers/usb/host/xhci.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/drivers/usb/host/xhci.c b/drivers/usb/host/xhci.c
index 5d70e0176527..8579603edaff 100644
--- a/drivers/usb/host/xhci.c
+++ b/drivers/usb/host/xhci.c
@@ -1219,6 +1219,8 @@ static int xhci_map_temp_buffer(struct usb_hcd *hcd, struct urb *urb)
temp = kzalloc_node(buf_len, GFP_ATOMIC,
dev_to_node(hcd->self.sysdev));
+ if (!temp)
+ return -ENOMEM;
if (usb_urb_dir_out(urb))
sg_pcopy_to_buffer(urb->sg, urb->num_sgs,
--
2.25.1
The commit 80dd33cf72d1 ("drivers: base: Fix device link removal")
introduces a workqueue to release the consumer and supplier devices used
in the devlink.
In the job queued, devices are release and in turn, when all the
references to these devices are dropped, the release function of the
device itself is called.
Nothing is present to provide some synchronisation with this workqueue
in order to ensure that all ongoing releasing operations are done and
so, some other operations can be started safely.
For instance, in the following sequence:
1) of_platform_depopulate()
2) of_overlay_remove()
During the step 1, devices are released and related devlinks are removed
(jobs pushed in the workqueue).
During the step 2, OF nodes are destroyed but, without any
synchronisation with devlink removal jobs, of_overlay_remove() can raise
warnings related to missing of_node_put():
ERROR: memory leak, expected refcount 1 instead of 2
Indeed, the missing of_node_put() call is going to be done, too late,
from the workqueue job execution.
Introduce device_link_wait_removal() to offer a way to synchronize
operations waiting for the end of devlink removals (i.e. end of
workqueue jobs).
Also, as a flushing operation is done on the workqueue, the workqueue
used is moved from a system-wide workqueue to a local one.
Fixes: 80dd33cf72d1 ("drivers: base: Fix device link removal")
Cc: stable(a)vger.kernel.org
Signed-off-by: Herve Codina <herve.codina(a)bootlin.com>
---
drivers/base/core.c | 26 +++++++++++++++++++++++---
include/linux/device.h | 1 +
2 files changed, 24 insertions(+), 3 deletions(-)
diff --git a/drivers/base/core.c b/drivers/base/core.c
index d5f4e4aac09b..80d9430856a8 100644
--- a/drivers/base/core.c
+++ b/drivers/base/core.c
@@ -44,6 +44,7 @@ static bool fw_devlink_is_permissive(void);
static void __fw_devlink_link_to_consumers(struct device *dev);
static bool fw_devlink_drv_reg_done;
static bool fw_devlink_best_effort;
+static struct workqueue_struct *device_link_wq;
/**
* __fwnode_link_add - Create a link between two fwnode_handles.
@@ -532,12 +533,26 @@ static void devlink_dev_release(struct device *dev)
/*
* It may take a while to complete this work because of the SRCU
* synchronization in device_link_release_fn() and if the consumer or
- * supplier devices get deleted when it runs, so put it into the "long"
- * workqueue.
+ * supplier devices get deleted when it runs, so put it into the
+ * dedicated workqueue.
*/
- queue_work(system_long_wq, &link->rm_work);
+ queue_work(device_link_wq, &link->rm_work);
}
+/**
+ * device_link_wait_removal - Wait for ongoing devlink removal jobs to terminate
+ */
+void device_link_wait_removal(void)
+{
+ /*
+ * devlink removal jobs are queued in the dedicated work queue.
+ * To be sure that all removal jobs are terminated, ensure that any
+ * scheduled work has run to completion.
+ */
+ drain_workqueue(device_link_wq);
+}
+EXPORT_SYMBOL_GPL(device_link_wait_removal);
+
static struct class devlink_class = {
.name = "devlink",
.dev_groups = devlink_groups,
@@ -4099,9 +4114,14 @@ int __init devices_init(void)
sysfs_dev_char_kobj = kobject_create_and_add("char", dev_kobj);
if (!sysfs_dev_char_kobj)
goto char_kobj_err;
+ device_link_wq = alloc_workqueue("device_link_wq", 0, 0);
+ if (!device_link_wq)
+ goto wq_err;
return 0;
+ wq_err:
+ kobject_put(sysfs_dev_char_kobj);
char_kobj_err:
kobject_put(sysfs_dev_block_kobj);
block_kobj_err:
diff --git a/include/linux/device.h b/include/linux/device.h
index 1795121dee9a..d7d8305a72e8 100644
--- a/include/linux/device.h
+++ b/include/linux/device.h
@@ -1249,6 +1249,7 @@ void device_link_del(struct device_link *link);
void device_link_remove(void *consumer, struct device *supplier);
void device_links_supplier_sync_state_pause(void);
void device_links_supplier_sync_state_resume(void);
+void device_link_wait_removal(void);
/* Create alias, so I can be autoloaded. */
#define MODULE_ALIAS_CHARDEV(major,minor) \
--
2.43.0
This is the start of the stable review cycle for the 5.4.270 release.
There are 84 patches in this series, all will be posted as a response
to this one. If anyone has any issues with these being applied, please
let me know.
Responses should be made by Thu, 29 Feb 2024 13:15:36 +0000.
Anything received after that time might be too late.
The whole patch series can be found in one patch at:
https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.4.270-rc…
or in the git tree and branch at:
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.4.y
and the diffstat can be found below.
thanks,
greg k-h
-------------
Pseudo-Shortlog of commits:
Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Linux 5.4.270-rc1
Andrii Nakryiko <andriin(a)fb.com>
scripts/bpf: Fix xdp_md forward declaration typo
Bart Van Assche <bvanassche(a)acm.org>
fs/aio: Restrict kiocb_set_cancel_fn() to I/O submitted via libaio
Erik Kurzinger <ekurzinger(a)nvidia.com>
drm/syncobj: call drm_syncobj_fence_add_wait when WAIT_AVAILABLE flag is set
Christian König <christian.koenig(a)amd.com>
drm/syncobj: make lockdep complain on WAIT_FOR_SUBMIT v3
Florian Westphal <fw(a)strlen.de>
netfilter: nf_tables: set dormant flag on hook register failure
Sabrina Dubroca <sd(a)queasysnail.net>
tls: stop recv() if initial process_rx_list gave us non-DATA
Jakub Kicinski <kuba(a)kernel.org>
tls: rx: drop pointless else after goto
Jakub Kicinski <kuba(a)kernel.org>
tls: rx: jump to a more appropriate label
Jason Gunthorpe <jgg(a)nvidia.com>
s390: use the correct count for __iowrite64_copy()
Wolfram Sang <wsa+renesas(a)sang-engineering.com>
packet: move from strlcpy with unused retval to strscpy
Vasiliy Kovalev <kovalev(a)altlinux.org>
ipv6: sr: fix possible use-after-free and null-ptr-deref
Daniil Dulov <d.dulov(a)aladdin.ru>
afs: Increase buffer size in afs_update_volume_status()
Eric Dumazet <edumazet(a)google.com>
ipv6: properly combine dev_base_seq and ipv6.dev_addr_genid
Eric Dumazet <edumazet(a)google.com>
ipv4: properly combine dev_base_seq and ipv4.dev_addr_genid
Arnd Bergmann <arnd(a)arndb.de>
nouveau: fix function cast warnings
Randy Dunlap <rdunlap(a)infradead.org>
scsi: jazz_esp: Only build if SCSI core is builtin
Gianmarco Lusvardi <glusvardi(a)posteo.net>
bpf, scripts: Correct GPL license name
Andrii Nakryiko <andriin(a)fb.com>
scripts/bpf: teach bpf_helpers_doc.py to dump BPF helper definitions
Arnd Bergmann <arnd(a)arndb.de>
RDMA/srpt: fix function pointer cast warnings
Bart Van Assche <bvanassche(a)acm.org>
RDMA/srpt: Make debug output more detailed
Kalesh AP <kalesh-anakkur.purayil(a)broadcom.com>
RDMA/bnxt_re: Return error for SRQ resize
Zhipeng Lu <alexious(a)zju.edu.cn>
IB/hfi1: Fix a memleak in init_credit_return
Xu Yang <xu.yang_2(a)nxp.com>
usb: roles: don't get/set_role() when usb_role_switch is unregistered
Krishna Kurapati <quic_kriskura(a)quicinc.com>
usb: gadget: ncm: Avoid dropping datagrams of properly parsed NTBs
Frank Li <Frank.Li(a)nxp.com>
usb: cdns3: fix memory double free when handle zero packet
Frank Li <Frank.Li(a)nxp.com>
usb: cdns3: fixed memory use after free at cdns3_gadget_ep_disable()
Nikita Shubin <nikita.shubin(a)maquefel.me>
ARM: ep93xx: Add terminator to gpiod_lookup_table
Tom Parkin <tparkin(a)katalix.com>
l2tp: pass correct message length to ip6_append_data
Vidya Sagar <vidyas(a)nvidia.com>
PCI/MSI: Prevent MSI hardware interrupt number truncation
Vasiliy Kovalev <kovalev(a)altlinux.org>
gtp: fix use-after-free and null-ptr-deref in gtp_genl_dump_pdp()
Mikulas Patocka <mpatocka(a)redhat.com>
dm-crypt: don't modify the data when using authenticated encryption
Daniel Vacek <neelx(a)redhat.com>
IB/hfi1: Fix sdma.h tx->num_descs off-by-one error
Christophe JAILLET <christophe.jaillet(a)wanadoo.fr>
PCI: tegra: Fix OF node reference leak
Pali Rohár <pali(a)kernel.org>
PCI: tegra: Fix reporting GPIO error value
Sireesh Kodali <sireeshkodali1(a)gmail.com>
arm64: dts: qcom: msm8916: Fix typo in pronto remoteproc node
Nathan Chancellor <nathan(a)kernel.org>
drm/amdgpu: Fix type of second parameter in trans_msg() callback
Matthew Wilcox (Oracle) <willy(a)infradead.org>
iomap: Set all uptodate bits for an Uptodate page
Mikulas Patocka <mpatocka(a)redhat.com>
dm-integrity: don't modify bio's immutable bio_vec in integrity_metadata()
Kirill A. Shutemov <kirill.shutemov(a)linux.intel.com>
x86/alternatives: Disable KASAN in apply_alternatives()
Trek <trek00(a)inbox.ru>
drm/amdgpu: Check for valid number of registers to read
Icenowy Zheng <icenowy(a)aosc.io>
Revert "drm/sun4i: dsi: Change the start delay calculation"
Kai-Heng Feng <kai.heng.feng(a)canonical.com>
ALSA: hda/realtek - Enable micmute LED on and HP system
Björn Töpel <bjorn.topel(a)gmail.com>
selftests/bpf: Avoid running unprivileged tests with alignment requirements
Nikolay Aleksandrov <nikolay(a)cumulusnetworks.com>
net: bridge: clear bridge's private skb space on xmit
Christophe JAILLET <christophe.jaillet(a)wanadoo.fr>
spi: mt7621: Fix an error message in mt7621_spi_probe()
John Stultz <john.stultz(a)linaro.org>
driver core: Set deferred_probe_timeout to a longer default if CONFIG_MODULES is set
Miaoqian Lin <linmq006(a)gmail.com>
pinctrl: rockchip: Fix refcount leak in rockchip_pinctrl_parse_groups
Lee Jones <lee.jones(a)linaro.org>
pinctrl: pinctrl-rockchip: Fix a bunch of kerneldoc misdemeanours
Eric Dumazet <edumazet(a)google.com>
tcp: add annotations around sk->sk_shutdown accesses
Soheil Hassas Yeganeh <soheil(a)google.com>
tcp: return EPOLLOUT from tcp_poll only when notsent_bytes is half the limit
Paolo Abeni <pabeni(a)redhat.com>
tcp: factor out __tcp_close() helper
Geert Uytterhoeven <geert+renesas(a)glider.be>
pmdomain: renesas: r8a77980-sysc: CR7 must be always on
Alexandra Winter <wintera(a)linux.ibm.com>
s390/qeth: Fix potential loss of L3-IP@ in case of network issues
Yi Sun <yi.sun(a)unisoc.com>
virtio-blk: Ensure no requests in virtqueues before deleting vqs.
Takashi Sakamoto <o-takashi(a)sakamocchi.jp>
firewire: core: send bus reset promptly on gap count error
Hannes Reinecke <hare(a)suse.de>
scsi: lpfc: Use unsigned type for num_sge
Zhang Rui <rui.zhang(a)intel.com>
hwmon: (coretemp) Enlarge per package core count limit
Daniel Wagner <dwagner(a)suse.de>
nvmet-fc: abort command when there is no binding
Xin Long <lucien.xin(a)gmail.com>
netfilter: conntrack: check SCTP_CID_SHUTDOWN_ACK for vtag setting in sctp_new
Chen-Yu Tsai <wens(a)csie.org>
ASoC: sunxi: sun4i-spdif: Add support for Allwinner H616
Guixin Liu <kanie(a)linux.alibaba.com>
nvmet-tcp: fix nvme tcp ida memory leak
Martin Blumenstingl <martin.blumenstingl(a)googlemail.com>
regulator: pwm-regulator: Add validity checks in continuous .get_voltage
Baokun Li <libaokun1(a)huawei.com>
ext4: avoid allocating blocks from corrupted group in ext4_mb_find_by_goal()
Baokun Li <libaokun1(a)huawei.com>
ext4: avoid allocating blocks from corrupted group in ext4_mb_try_best_found()
Lennert Buytenhek <kernel(a)wantstofly.org>
ahci: add 43-bit DMA address quirk for ASMedia ASM1061 controllers
Conrad Kostecki <conikost(a)gentoo.org>
ahci: asm1166: correct count of reported ports
Fullway Wang <fullwaywang(a)outlook.com>
fbdev: sis: Error out if pixclock equals zero
Fullway Wang <fullwaywang(a)outlook.com>
fbdev: savage: Error out if pixclock equals zero
Felix Fietkau <nbd(a)nbd.name>
wifi: mac80211: fix race condition on enabling fast-xmit
Michal Kazior <michal(a)plume.com>
wifi: cfg80211: fix missing interfaces when dumping
Vinod Koul <vkoul(a)kernel.org>
dmaengine: fsl-qdma: increase size of 'irq_name'
Vinod Koul <vkoul(a)kernel.org>
dmaengine: shdma: increase size of 'dev_id'
Dmitry Bogdanov <d.bogdanov(a)yadro.com>
scsi: target: core: Add TMF to tmr_list handling
Cyril Hrubis <chrubis(a)suse.cz>
sched/rt: Disallow writing invalid values to sched_rt_period_us
Cyril Hrubis <chrubis(a)suse.cz>
sched/rt: Fix sysctl_sched_rr_timeslice intial value
Lokesh Gidra <lokeshgidra(a)google.com>
userfaultfd: fix mmap_changing checking in mfill_atomic_hugetlb
Ryusuke Konishi <konishi.ryusuke(a)gmail.com>
nilfs2: replace WARN_ONs for invalid DAT metadata block requests
GONG, Ruiqi <gongruiqi1(a)huawei.com>
memcg: add refcnt for pcpu stock to avoid UAF problem in drain_all_stock()
Cyril Hrubis <chrubis(a)suse.cz>
sched/rt: sysctl_sched_rr_timeslice show default timeslice after reset
Jamal Hadi Salim <jhs(a)mojatatu.com>
net/sched: Retire dsmark qdisc
Jamal Hadi Salim <jhs(a)mojatatu.com>
net/sched: Retire ATM qdisc
Jamal Hadi Salim <jhs(a)mojatatu.com>
net/sched: Retire CBQ qdisc
Oliver Upton <oliver.upton(a)linux.dev>
KVM: arm64: vgic-its: Test for valid IRQ in MOVALL handler
Oliver Upton <oliver.upton(a)linux.dev>
KVM: arm64: vgic-its: Test for valid IRQ in its_sync_lpi_pending_table()
-------------
Diffstat:
Makefile | 4 +-
arch/arm/mach-ep93xx/core.c | 1 +
arch/arm64/boot/dts/qcom/msm8916.dtsi | 4 +-
arch/s390/pci/pci.c | 2 +-
arch/x86/kernel/alternative.c | 13 +
drivers/ata/ahci.c | 34 +-
drivers/ata/ahci.h | 1 +
drivers/base/dd.c | 9 +
drivers/block/virtio_blk.c | 7 +-
drivers/dma/fsl-qdma.c | 2 +-
drivers/dma/sh/shdma.h | 2 +-
drivers/firewire/core-card.c | 18 +-
drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c | 3 +
drivers/gpu/drm/amd/amdgpu/amdgpu_virt.h | 5 +-
drivers/gpu/drm/drm_syncobj.c | 16 +-
drivers/gpu/drm/nouveau/nvkm/subdev/bios/shadow.c | 8 +-
drivers/gpu/drm/sun4i/sun6i_mipi_dsi.c | 3 +-
drivers/hwmon/coretemp.c | 2 +-
drivers/infiniband/hw/bnxt_re/ib_verbs.c | 5 +-
drivers/infiniband/hw/hfi1/pio.c | 6 +-
drivers/infiniband/hw/hfi1/sdma.c | 2 +-
drivers/infiniband/ulp/srpt/ib_srpt.c | 18 +-
drivers/md/dm-crypt.c | 6 +
drivers/md/dm-integrity.c | 11 +-
drivers/net/gtp.c | 10 +-
drivers/nvme/target/fc.c | 8 +-
drivers/nvme/target/tcp.c | 1 +
drivers/pci/controller/pci-tegra.c | 17 +-
drivers/pci/msi.c | 2 +-
drivers/pinctrl/pinctrl-rockchip.c | 23 +-
drivers/regulator/pwm-regulator.c | 3 +
drivers/s390/net/qeth_l3_main.c | 9 +-
drivers/scsi/Kconfig | 2 +-
drivers/scsi/lpfc/lpfc_scsi.c | 12 +-
drivers/soc/renesas/r8a77980-sysc.c | 3 +-
drivers/spi/spi-mt7621.c | 8 +-
drivers/target/target_core_device.c | 5 -
drivers/target/target_core_transport.c | 4 +
drivers/usb/cdns3/gadget.c | 8 +-
drivers/usb/gadget/function/f_ncm.c | 10 +-
drivers/usb/roles/class.c | 12 +-
drivers/video/fbdev/savage/savagefb_driver.c | 3 +
drivers/video/fbdev/sis/sis_main.c | 2 +
fs/afs/volume.c | 4 +-
fs/aio.c | 9 +-
fs/ext4/mballoc.c | 13 +-
fs/iomap/buffered-io.c | 3 +
fs/nilfs2/dat.c | 27 +-
include/linux/fs.h | 2 +
include/linux/lockdep.h | 5 +
include/net/tcp.h | 1 +
kernel/sched/rt.c | 10 +-
kernel/sysctl.c | 4 +
mm/memcontrol.c | 6 +
mm/userfaultfd.c | 14 +-
net/bridge/br_device.c | 2 +
net/ipv4/af_inet.c | 2 +-
net/ipv4/devinet.c | 21 +-
net/ipv4/tcp.c | 27 +-
net/ipv4/tcp_input.c | 4 +-
net/ipv6/addrconf.c | 21 +-
net/ipv6/seg6.c | 20 +-
net/l2tp/l2tp_ip6.c | 2 +-
net/mac80211/sta_info.c | 2 +
net/mac80211/tx.c | 2 +-
net/netfilter/nf_conntrack_proto_sctp.c | 2 +-
net/netfilter/nf_tables_api.c | 1 +
net/packet/af_packet.c | 4 +-
net/sched/Kconfig | 42 -
net/sched/Makefile | 3 -
net/sched/sch_atm.c | 710 --------
net/sched/sch_cbq.c | 1818 ---------------------
net/sched/sch_dsmark.c | 523 ------
net/tls/tls_sw.c | 12 +-
net/wireless/nl80211.c | 1 +
scripts/bpf_helpers_doc.py | 157 +-
sound/pci/hda/patch_realtek.c | 6 +-
sound/soc/sunxi/sun4i-spdif.c | 5 +
tools/testing/selftests/bpf/test_verifier.c | 13 +
virt/kvm/arm/vgic/vgic-its.c | 5 +
80 files changed, 572 insertions(+), 3255 deletions(-)
The AMD USB host controller (1022:43f7) does not enter PCI D3 by default
when nothing is connected. This is due to the policy introduced by
'commit a611bf473d1f ("xhci-pci: Set runtime PM as default policy on all
xHC 1.2 or later devices")', which only covers 1.2 or later devices.
Therefore, by default, allow RPM on the AMD USB controller [1022:43f7].
Fixes: 4baf12181509 ("xhci: Loosen RPM as default policy to cover for AMD xHC 1.1")
Link: https://lore.kernel.org/all/12335218.O9o76ZdvQC@natalenko.name/
Cc: Mario Limonciello <mario.limonciello(a)amd.com>
Cc: stable(a)vger.kernel.org
Tested-by: Oleksandr Natalenko <oleksandr(a)natalenko.name>
Signed-off-by: Basavaraj Natikar <Basavaraj.Natikar(a)amd.com>
---
Changes in v2:
- Added Cc: stable(a)vger.kernel.org
drivers/usb/host/xhci-pci.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/drivers/usb/host/xhci-pci.c b/drivers/usb/host/xhci-pci.c
index b534ca9752be..1eb7a41a75d7 100644
--- a/drivers/usb/host/xhci-pci.c
+++ b/drivers/usb/host/xhci-pci.c
@@ -473,6 +473,8 @@ static void xhci_pci_quirks(struct device *dev, struct xhci_hcd *xhci)
/* xHC spec requires PCI devices to support D3hot and D3cold */
if (xhci->hci_version >= 0x120)
xhci->quirks |= XHCI_DEFAULT_PM_RUNTIME_ALLOW;
+ else if (pdev->vendor == PCI_VENDOR_ID_AMD && pdev->device == 0x43f7)
+ xhci->quirks |= XHCI_DEFAULT_PM_RUNTIME_ALLOW;
if (xhci->quirks & XHCI_RESET_ON_RESUME)
xhci_dbg_trace(xhci, trace_xhci_dbg_quirks,
--
2.25.1
In the following sequence:
1) of_platform_depopulate()
2) of_overlay_remove()
During the step 1, devices are destroyed and devlinks are removed.
During the step 2, OF nodes are destroyed but
__of_changeset_entry_destroy() can raise warnings related to missing
of_node_put():
ERROR: memory leak, expected refcount 1 instead of 2 ...
Indeed, during the devlink removals performed at step 1, the removal
itself releasing the device (and the attached of_node) is done by a job
queued in a workqueue and so, it is done asynchronously with respect to
function calls.
When the warning is present, of_node_put() will be called but wrongly
too late from the workqueue job.
In order to be sure that any ongoing devlink removals are done before
the of_node destruction, synchronize the of_overlay_remove() with the
devlink removals.
Fixes: 80dd33cf72d1 ("drivers: base: Fix device link removal")
Cc: stable(a)vger.kernel.org
Signed-off-by: Herve Codina <herve.codina(a)bootlin.com>
---
drivers/of/overlay.c | 9 ++++++++-
1 file changed, 8 insertions(+), 1 deletion(-)
diff --git a/drivers/of/overlay.c b/drivers/of/overlay.c
index 2ae7e9d24a64..99659ae9fb28 100644
--- a/drivers/of/overlay.c
+++ b/drivers/of/overlay.c
@@ -853,6 +853,14 @@ static void free_overlay_changeset(struct overlay_changeset *ovcs)
{
int i;
+ /*
+ * Wait for any ongoing device link removals before removing some of
+ * nodes. Drop the global lock while waiting
+ */
+ mutex_unlock(&of_mutex);
+ device_link_wait_removal();
+ mutex_lock(&of_mutex);
+
if (ovcs->cset.entries.next)
of_changeset_destroy(&ovcs->cset);
@@ -862,7 +870,6 @@ static void free_overlay_changeset(struct overlay_changeset *ovcs)
ovcs->id = 0;
}
-
for (i = 0; i < ovcs->count; i++) {
of_node_put(ovcs->fragments[i].target);
of_node_put(ovcs->fragments[i].overlay);
--
2.43.0
In the scenario of entering hibernation with udisk in the system, if the
udisk was gone or resume fail in the thaw phase of hibernation. Its state
will be set to NOTATTACHED. At this point, usb_hub_wq was already freezed
and can't not handle disconnect event. Next, in the poweroff phase of
hibernation, SYNCHRONIZE_CACHE SCSI command will be sent to this udisk
when poweroff this scsi device, which will cause uas_submit_urbs to be
called to submit URB for sense/data/cmd pipe. However, these URBs will
submit fail as device was set to NOTATTACHED state. Then, uas_submit_urbs
will return a value SCSI_MLQUEUE_DEVICE_BUSY to the caller. That will lead
the SCSI layer go into an ugly loop and system fail to go into hibernation.
On the other hand, when we specially check for -ENODEV in function
uas_queuecommand_lck, returning DID_ERROR to SCSI layer will cause device
poweroff fail and system shutdown instead of entering hibernation.
To fix this issue, let uas_submit_urbs function to return a value -ENODEV
when submit URB fail with device in NOTATTACHED state. At the same time,
we need to translate -ENODEV to DID_NOT_CONNECT for the SCSI layer.
Cc: stable(a)vger.kernel.org
Signed-off-by: Weitao Wang <WeitaoWang-oc(a)zhaoxin.com>
---
v1->v2
- Modify the description of this patch.
drivers/usb/storage/uas.c | 21 ++++++++++-----------
1 file changed, 10 insertions(+), 11 deletions(-)
diff --git a/drivers/usb/storage/uas.c b/drivers/usb/storage/uas.c
index 9707f53cfda9..967f18db525a 100644
--- a/drivers/usb/storage/uas.c
+++ b/drivers/usb/storage/uas.c
@@ -533,7 +533,7 @@ static struct urb *uas_alloc_cmd_urb(struct uas_dev_info *devinfo, gfp_t gfp,
* daft to me.
*/
-static struct urb *uas_submit_sense_urb(struct scsi_cmnd *cmnd, gfp_t gfp)
+static int uas_submit_sense_urb(struct scsi_cmnd *cmnd, gfp_t gfp)
{
struct uas_dev_info *devinfo = cmnd->device->hostdata;
struct urb *urb;
@@ -541,16 +541,15 @@ static struct urb *uas_submit_sense_urb(struct scsi_cmnd *cmnd, gfp_t gfp)
urb = uas_alloc_sense_urb(devinfo, gfp, cmnd);
if (!urb)
- return NULL;
+ return -ENOMEM;
usb_anchor_urb(urb, &devinfo->sense_urbs);
err = usb_submit_urb(urb, gfp);
if (err) {
usb_unanchor_urb(urb);
uas_log_cmd_state(cmnd, "sense submit err", err);
usb_free_urb(urb);
- return NULL;
}
- return urb;
+ return err;
}
static int uas_submit_urbs(struct scsi_cmnd *cmnd,
@@ -562,9 +561,9 @@ static int uas_submit_urbs(struct scsi_cmnd *cmnd,
lockdep_assert_held(&devinfo->lock);
if (cmdinfo->state & SUBMIT_STATUS_URB) {
- urb = uas_submit_sense_urb(cmnd, GFP_ATOMIC);
- if (!urb)
- return SCSI_MLQUEUE_DEVICE_BUSY;
+ err = uas_submit_sense_urb(cmnd, GFP_ATOMIC);
+ if (err)
+ return (err == -ENODEV) ? -ENODEV : SCSI_MLQUEUE_DEVICE_BUSY;
cmdinfo->state &= ~SUBMIT_STATUS_URB;
}
@@ -582,7 +581,7 @@ static int uas_submit_urbs(struct scsi_cmnd *cmnd,
if (err) {
usb_unanchor_urb(cmdinfo->data_in_urb);
uas_log_cmd_state(cmnd, "data in submit err", err);
- return SCSI_MLQUEUE_DEVICE_BUSY;
+ return (err == -ENODEV) ? -ENODEV : SCSI_MLQUEUE_DEVICE_BUSY;
}
cmdinfo->state &= ~SUBMIT_DATA_IN_URB;
cmdinfo->state |= DATA_IN_URB_INFLIGHT;
@@ -602,7 +601,7 @@ static int uas_submit_urbs(struct scsi_cmnd *cmnd,
if (err) {
usb_unanchor_urb(cmdinfo->data_out_urb);
uas_log_cmd_state(cmnd, "data out submit err", err);
- return SCSI_MLQUEUE_DEVICE_BUSY;
+ return (err == -ENODEV) ? -ENODEV : SCSI_MLQUEUE_DEVICE_BUSY;
}
cmdinfo->state &= ~SUBMIT_DATA_OUT_URB;
cmdinfo->state |= DATA_OUT_URB_INFLIGHT;
@@ -621,7 +620,7 @@ static int uas_submit_urbs(struct scsi_cmnd *cmnd,
if (err) {
usb_unanchor_urb(cmdinfo->cmd_urb);
uas_log_cmd_state(cmnd, "cmd submit err", err);
- return SCSI_MLQUEUE_DEVICE_BUSY;
+ return (err == -ENODEV) ? -ENODEV : SCSI_MLQUEUE_DEVICE_BUSY;
}
cmdinfo->cmd_urb = NULL;
cmdinfo->state &= ~SUBMIT_CMD_URB;
@@ -698,7 +697,7 @@ static int uas_queuecommand_lck(struct scsi_cmnd *cmnd)
* of queueing, no matter how fatal the error
*/
if (err == -ENODEV) {
- set_host_byte(cmnd, DID_ERROR);
+ set_host_byte(cmnd, DID_NO_CONNECT);
scsi_done(cmnd);
goto zombie;
}
--
2.32.0
This is the backport of recently upstreamed series that moves VERW
execution to a later point in exit-to-user path. This is needed because
in some cases it may be possible for data accessed after VERW executions
may end into MDS affected CPU buffers. Moving VERW closer to ring
transition reduces the attack surface.
Patch 1/6 includes a minor fix that is queued for upstream:
https://lore.kernel.org/lkml/170899674562.398.6398007479766564897.tip-bot2@…
Patch 1,2,5 and 6 needed conflict resolution.
I saw a few new warnings:
arch/x86/entry/entry.o: warning: objtool: mds_verw_sel+0x0: unreachable instruction
I tried using REACHABLE, but that did not fix the warning.
For the below warning:
vmlinux.o: warning: objtool: .altinstr_replacement+0x17: unsupported relocation in alternatives section
not sure if this is related to this series or a pre-existing warning, I
will check later without this series.
I am not too concerned because the alternative did substitute verw
correctly:
entry_SYSCALL_64:
...
0xffffffff8200013d <+253>: swapgs
0xffffffff82000140 <+256>: verw 0xffffffff82000000
0xffffffff82000148 <+264>: sysretq
0xffffffff8200014b <+267>: int3
---
Pawan Gupta (5):
x86/bugs: Add asm helpers for executing VERW
x86/entry_64: Add VERW just before userspace transition
x86/entry_32: Add VERW just before userspace transition
x86/bugs: Use ALTERNATIVE() instead of mds_user_clear static key
KVM/VMX: Move VERW closer to VMentry for MDS mitigation
Sean Christopherson (1):
KVM/VMX: Use BT+JNC, i.e. EFLAGS.CF to select VMRESUME vs. VMLAUNCH
Documentation/x86/mds.rst | 38 +++++++++++++++++++++++++-----------
arch/x86/entry/entry.S | 22 +++++++++++++++++++++
arch/x86/entry/entry_32.S | 3 +++
arch/x86/entry/entry_64.S | 11 +++++++++++
arch/x86/entry/entry_64_compat.S | 1 +
arch/x86/include/asm/cpufeatures.h | 2 +-
arch/x86/include/asm/entry-common.h | 1 -
arch/x86/include/asm/nospec-branch.h | 25 ++++++++++++------------
arch/x86/kernel/cpu/bugs.c | 15 ++++++--------
arch/x86/kernel/nmi.c | 3 ---
arch/x86/kvm/vmx/run_flags.h | 7 +++++--
arch/x86/kvm/vmx/vmenter.S | 9 ++++++---
arch/x86/kvm/vmx/vmx.c | 12 ++++++++----
13 files changed, 103 insertions(+), 46 deletions(-)
---
base-commit: 81e1dc2f70014b9523dd02ca763788e4f81e5bac
change-id: 20240226-delay-verw-backport-6-1-y-4b0cec84087c
The commit 80dd33cf72d1 ("drivers: base: Fix device link removal")
introduces a workqueue to release the consumer and supplier devices used
in the devlink.
In the job queued, devices are release and in turn, when all the
references to these devices are dropped, the release function of the
device itself is called.
Nothing is present to provide some synchronisation with this workqueue
in order to ensure that all ongoing releasing operations are done and
so, some other operations can be started safely.
For instance, in the following sequence:
1) of_platform_depopulate()
2) of_overlay_remove()
During the step 1, devices are released and related devlinks are removed
(jobs pushed in the workqueue).
During the step 2, OF nodes are destroyed but, without any
synchronisation with devlink removal jobs, of_overlay_remove() can raise
warnings related to missing of_node_put():
ERROR: memory leak, expected refcount 1 instead of 2
Indeed, the missing of_node_put() call is going to be done, too late,
from the workqueue job execution.
Introduce device_link_wait_removal() to offer a way to synchronize
operations waiting for the end of devlink removals (i.e. end of
workqueue jobs).
Also, as a flushing operation is done on the workqueue, the workqueue
used is moved from a system-wide workqueue to a local one.
Fixes: 80dd33cf72d1 ("drivers: base: Fix device link removal")
Cc: stable(a)vger.kernel.org
Signed-off-by: Herve Codina <herve.codina(a)bootlin.com>
---
drivers/base/core.c | 26 +++++++++++++++++++++++---
include/linux/device.h | 1 +
2 files changed, 24 insertions(+), 3 deletions(-)
diff --git a/drivers/base/core.c b/drivers/base/core.c
index d5f4e4aac09b..80d9430856a8 100644
--- a/drivers/base/core.c
+++ b/drivers/base/core.c
@@ -44,6 +44,7 @@ static bool fw_devlink_is_permissive(void);
static void __fw_devlink_link_to_consumers(struct device *dev);
static bool fw_devlink_drv_reg_done;
static bool fw_devlink_best_effort;
+static struct workqueue_struct *device_link_wq;
/**
* __fwnode_link_add - Create a link between two fwnode_handles.
@@ -532,12 +533,26 @@ static void devlink_dev_release(struct device *dev)
/*
* It may take a while to complete this work because of the SRCU
* synchronization in device_link_release_fn() and if the consumer or
- * supplier devices get deleted when it runs, so put it into the "long"
- * workqueue.
+ * supplier devices get deleted when it runs, so put it into the
+ * dedicated workqueue.
*/
- queue_work(system_long_wq, &link->rm_work);
+ queue_work(device_link_wq, &link->rm_work);
}
+/**
+ * device_link_wait_removal - Wait for ongoing devlink removal jobs to terminate
+ */
+void device_link_wait_removal(void)
+{
+ /*
+ * devlink removal jobs are queued in the dedicated work queue.
+ * To be sure that all removal jobs are terminated, ensure that any
+ * scheduled work has run to completion.
+ */
+ drain_workqueue(device_link_wq);
+}
+EXPORT_SYMBOL_GPL(device_link_wait_removal);
+
static struct class devlink_class = {
.name = "devlink",
.dev_groups = devlink_groups,
@@ -4099,9 +4114,14 @@ int __init devices_init(void)
sysfs_dev_char_kobj = kobject_create_and_add("char", dev_kobj);
if (!sysfs_dev_char_kobj)
goto char_kobj_err;
+ device_link_wq = alloc_workqueue("device_link_wq", 0, 0);
+ if (!device_link_wq)
+ goto wq_err;
return 0;
+ wq_err:
+ kobject_put(sysfs_dev_char_kobj);
char_kobj_err:
kobject_put(sysfs_dev_block_kobj);
block_kobj_err:
diff --git a/include/linux/device.h b/include/linux/device.h
index 1795121dee9a..d7d8305a72e8 100644
--- a/include/linux/device.h
+++ b/include/linux/device.h
@@ -1249,6 +1249,7 @@ void device_link_del(struct device_link *link);
void device_link_remove(void *consumer, struct device *supplier);
void device_links_supplier_sync_state_pause(void);
void device_links_supplier_sync_state_resume(void);
+void device_link_wait_removal(void);
/* Create alias, so I can be autoloaded. */
#define MODULE_ALIAS_CHARDEV(major,minor) \
--
2.43.0
On Fri, Feb 23, 2024 at 9:24 PM Saravana Kannan <saravanak(a)google.com> wrote:
>
> Introduced a stupid bug in commit 782bfd03c3ae ("of: property: Improve
> finding the supplier of a remote-endpoint property") due to a last minute
> incorrect edit of "index !=0" into "!index". This patch fixes it to be
> "index > 0" to match the comment right next to it.
Greg, this needs to land in the stable branches once Rob picks it up
for the next 6.8-rc.
-Saravana
>
> Reported-by: Luca Ceresoli <luca.ceresoli(a)bootlin.com>
> Link: https://lore.kernel.org/lkml/20240223171849.10f9901d@booty/
> Fixes: 782bfd03c3ae ("of: property: Improve finding the supplier of a remote-endpoint property")
> Signed-off-by: Saravana Kannan <saravanak(a)google.com>
> ---
> Using Link: instead of Closes: because Luca reported two separate issues.
>
> Sorry about introducing a stupid bug in an -rcX Rob.
>
> -Saravana
>
> drivers/of/property.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/of/property.c b/drivers/of/property.c
> index b71267c6667c..fa8cd33be131 100644
> --- a/drivers/of/property.c
> +++ b/drivers/of/property.c
> @@ -1304,7 +1304,7 @@ static struct device_node *parse_remote_endpoint(struct device_node *np,
> int index)
> {
> /* Return NULL for index > 0 to signify end of remote-endpoints. */
> - if (!index || strcmp(prop_name, "remote-endpoint"))
> + if (index > 0 || strcmp(prop_name, "remote-endpoint"))
> return NULL;
>
> return of_graph_get_remote_port_parent(np);
> --
> 2.44.0.rc0.258.g7320e95886-goog
>
Commit eaae75754d81 ("docs: turn off "smart quotes" in the HTML build")
disabled conversion of quote marks along with that of dashes.
Despite the short summary, the change affects not only HTML build
but also other build targets including PDF.
However, as "smart quotes" had been enabled for more than half a
decade already, quite a few readers of HTML pages are likely expecting
conversions of "foo" -> “foo” and 'bar' -> ‘bar’.
Furthermore, in LaTeX typesetting convention, it is common to use
distinct marks for opening and closing quote marks.
To satisfy such readers' expectation, restore conversion of quotes
only by setting smartquotes_action [1].
Link: [1] https://www.sphinx-doc.org/en/master/usage/configuration.html#confval-smart…
Cc: stable(a)vger.kernel.org # v6.4
Signed-off-by: Akira Yokosawa <akiyks(a)gmail.com>
---
Documentation/conf.py | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/Documentation/conf.py b/Documentation/conf.py
index da64c9fb7e07..d148f3e8dd57 100644
--- a/Documentation/conf.py
+++ b/Documentation/conf.py
@@ -346,9 +346,9 @@ sys.stderr.write("Using %s theme\n" % html_theme)
html_static_path = ['sphinx-static']
# If true, Docutils "smart quotes" will be used to convert quotes and dashes
-# to typographically correct entities. This will convert "--" to "—",
-# which is not always what we want, so disable it.
-smartquotes = False
+# to typographically correct entities. However, conversion of "--" to "—"
+# is not always what we want, so enable only quotes.
+smartquotes_action = 'q'
# Custom sidebar templates, maps document names to template names.
# Note that the RTD theme ignores this
base-commit: 32ed7930304ca7600a54c2e702098bd2ce7086af
--
2.34.1
Commit 9f4f3dfad8cf ("PCI: qcom: Enable ASPM for platforms supporting
1.9.0 ops") started enabling ASPM unconditionally when the hardware
claims to support it. This triggers Correctable Errors for some PCIe
devices on machines like the Lenovo ThinkPad X13s, which could indicate
an incomplete driver ASPM implementation or that the hardware does in
fact not support L0s.
Add support for disabling ASPM L0s in the devicetree when it is not
supported on a particular machine and controller.
Note that only the 1.9.0 ops enable ASPM currently.
Fixes: 9f4f3dfad8cf ("PCI: qcom: Enable ASPM for platforms supporting 1.9.0 ops")
Cc: stable(a)vger.kernel.org # 6.7
Signed-off-by: Johan Hovold <johan+linaro(a)kernel.org>
---
drivers/pci/controller/dwc/pcie-qcom.c | 20 ++++++++++++++++++++
1 file changed, 20 insertions(+)
diff --git a/drivers/pci/controller/dwc/pcie-qcom.c b/drivers/pci/controller/dwc/pcie-qcom.c
index 09d485df34b9..0fb5dc06d2ef 100644
--- a/drivers/pci/controller/dwc/pcie-qcom.c
+++ b/drivers/pci/controller/dwc/pcie-qcom.c
@@ -273,6 +273,25 @@ static int qcom_pcie_start_link(struct dw_pcie *pci)
return 0;
}
+static void qcom_pcie_clear_aspm_l0s(struct dw_pcie *pci)
+{
+ u16 offset;
+ u32 val;
+
+ if (!of_property_read_bool(pci->dev->of_node, "aspm-no-l0s"))
+ return;
+
+ offset = dw_pcie_find_capability(pci, PCI_CAP_ID_EXP);
+
+ dw_pcie_dbi_ro_wr_en(pci);
+
+ val = readl(pci->dbi_base + offset + PCI_EXP_LNKCAP);
+ val &= ~PCI_EXP_LNKCAP_ASPM_L0S;
+ writel(val, pci->dbi_base + offset + PCI_EXP_LNKCAP);
+
+ dw_pcie_dbi_ro_wr_dis(pci);
+}
+
static void qcom_pcie_clear_hpc(struct dw_pcie *pci)
{
u16 offset = dw_pcie_find_capability(pci, PCI_CAP_ID_EXP);
@@ -962,6 +981,7 @@ static int qcom_pcie_init_2_7_0(struct qcom_pcie *pcie)
static int qcom_pcie_post_init_2_7_0(struct qcom_pcie *pcie)
{
+ qcom_pcie_clear_aspm_l0s(pcie->pci);
qcom_pcie_clear_hpc(pcie->pci);
return 0;
--
2.43.0
Before 9d8e536 ("ALSA: memalloc: Try dma_alloc_noncontiguous() at first")
the alsa non-contiguous allocator always called the alsa fallback
allocator in the non-iommu case. This allocated non-contig memory
consisting of progressively smaller contiguous chunks. Allocation was
fast due to the OR-ing in of __GFP_NORETRY.
After 9d8e536 ("ALSA: memalloc: Try dma_alloc_noncontiguous() at first")
the code tries the dma non-contig allocator first, then falls back to
the alsa fallback allocator. In the non-iommu case, the former supports
only a single contiguous chunk.
We have observed experimentally that under heavy memory fragmentation,
allocating a large-ish contiguous chunk with __GFP_RETRY_MAYFAIL
triggers an indefinite hang in the dma non-contig allocator. This has
high-impact, as an occurrence will trigger a device reboot, resulting in
loss of user state.
Fix the non-iommu path by letting dma_alloc_noncontiguous() fail quickly
so it does not get stuck looking for that elusive large contiguous chunk,
in which case we will fall back to the alsa fallback allocator.
Note that the iommu dma non-contiguous allocator is not affected. While
assembling an array of pages, it tries consecutively smaller contiguous
allocations, and lets higher-order chunk allocations fail quickly.
Suggested-by: Sven van Ashbrook <svenva(a)chromium.org>
Suggested-by: Brian Geffon <bgeffon(a)google.com>
Fixes: 9d8e536d36e7 ("ALSA: memalloc: Try dma_alloc_noncontiguous() at first")
Cc: stable(a)vger.kernel.org
Cc: Sven van Ashbrook <svenva(a)chromium.org>
Cc: Brian Geffon <bgeffon(a)google.com>
Cc: Curtis Malainey <cujomalainey(a)chromium.org>
Signed-off-by: Karthikeyan Ramasubramanian <kramasub(a)chromium.org>
---
sound/core/memalloc.c | 7 ++++++-
1 file changed, 6 insertions(+), 1 deletion(-)
diff --git a/sound/core/memalloc.c b/sound/core/memalloc.c
index f901504b5afc1..5f6526a0d731c 100644
--- a/sound/core/memalloc.c
+++ b/sound/core/memalloc.c
@@ -540,13 +540,18 @@ static void *snd_dma_noncontig_alloc(struct snd_dma_buffer *dmab, size_t size)
{
struct sg_table *sgt;
void *p;
+ gfp_t gfp_flags = DEFAULT_GFP;
#ifdef CONFIG_SND_DMA_SGBUF
if (cpu_feature_enabled(X86_FEATURE_XENPV))
return snd_dma_sg_fallback_alloc(dmab, size);
+
+ /* Non-IOMMU case: prevent allocator from searching forever */
+ if (!get_dma_ops(dmab->dev.dev))
+ gfp_flags |= __GFP_NORETRY;
#endif
sgt = dma_alloc_noncontiguous(dmab->dev.dev, size, dmab->dev.dir,
- DEFAULT_GFP, 0);
+ gfp_flags, 0);
#ifdef CONFIG_SND_DMA_SGBUF
if (!sgt && !get_dma_ops(dmab->dev.dev))
return snd_dma_sg_fallback_alloc(dmab, size);
--
2.43.0.687.g38aa6559b0-goog
From: Ville Syrjälä <ville.syrjala(a)linux.intel.com>
The icl+ power well code currently assumes that every AUX power
well maps to an encoder which is using said power well. That is
by no menas guaranteed as we:
- only register encoders for ports declared in the VBT
- combo PHY HDMI-only encoder no longer get an AUX CH since
commit 9856308c94ca ("drm/i915: Only populate aux_ch if really needed")
However we have places such as intel_power_domains_sanitize_state()
that blindly traverse all the possible power wells. So these bits
of code may very well encounbter an aux power well with no associated
encoder.
In this particular case the BIOS seems to have left one AUX power
well enabled even though we're dealing with a HDMI only encoder
on a combo PHY. We then proceed to turn off said power well and
explode when we can't find a matching encoder. As a short term fix
we should be able to just skip the PHY related parts of the power
well programming since we know this situation can only happen with
combo PHYs.
Another option might be to go back to always picking an AUX CH for
all encoders. However I'm a bit wary about that since we might in
theory end up conflicting with the VBT AUX CH assignment. Also
that wouldn't help with encoders not declared in the VBT, should
we ever need to poke the corresponding power wells.
Longer term we need to figure out what the actual relationship
is between the PHY vs. AUX CH vs. AUX power well. Currently this
is entirely unclear.
Cc: stable(a)vger.kernel.org
Fixes: 9856308c94ca ("drm/i915: Only populate aux_ch if really needed")
Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/10184
Signed-off-by: Ville Syrjälä <ville.syrjala(a)linux.intel.com>
---
.../drm/i915/display/intel_display_power_well.c | 17 ++++++++++++++---
1 file changed, 14 insertions(+), 3 deletions(-)
diff --git a/drivers/gpu/drm/i915/display/intel_display_power_well.c b/drivers/gpu/drm/i915/display/intel_display_power_well.c
index 47cd6bb04366..06900ff307b2 100644
--- a/drivers/gpu/drm/i915/display/intel_display_power_well.c
+++ b/drivers/gpu/drm/i915/display/intel_display_power_well.c
@@ -246,7 +246,14 @@ static enum phy icl_aux_pw_to_phy(struct drm_i915_private *i915,
enum aux_ch aux_ch = icl_aux_pw_to_ch(power_well);
struct intel_digital_port *dig_port = aux_ch_to_digital_port(i915, aux_ch);
- return intel_port_to_phy(i915, dig_port->base.port);
+ /*
+ * FIXME should we care about the (VBT defined) dig_port->aux_ch
+ * relationship or should this be purely defined by the hardware layout?
+ * Currently if the port doesn't appear in the VBT, or if it's declared
+ * as HDMI-only and routed to a combo PHY, the encoder either won't be
+ * present at all or it will not have an aux_ch assigned.
+ */
+ return dig_port ? intel_port_to_phy(i915, dig_port->base.port) : PHY_NONE;
}
static void hsw_wait_for_power_well_enable(struct drm_i915_private *dev_priv,
@@ -414,7 +421,8 @@ icl_combo_phy_aux_power_well_enable(struct drm_i915_private *dev_priv,
intel_de_rmw(dev_priv, regs->driver, 0, HSW_PWR_WELL_CTL_REQ(pw_idx));
- if (DISPLAY_VER(dev_priv) < 12)
+ /* FIXME this is a mess */
+ if (phy != PHY_NONE)
intel_de_rmw(dev_priv, ICL_PORT_CL_DW12(phy),
0, ICL_LANE_ENABLE_AUX);
@@ -437,7 +445,10 @@ icl_combo_phy_aux_power_well_disable(struct drm_i915_private *dev_priv,
drm_WARN_ON(&dev_priv->drm, !IS_ICELAKE(dev_priv));
- intel_de_rmw(dev_priv, ICL_PORT_CL_DW12(phy), ICL_LANE_ENABLE_AUX, 0);
+ /* FIXME this is a mess */
+ if (phy != PHY_NONE)
+ intel_de_rmw(dev_priv, ICL_PORT_CL_DW12(phy),
+ ICL_LANE_ENABLE_AUX, 0);
intel_de_rmw(dev_priv, regs->driver, HSW_PWR_WELL_CTL_REQ(pw_idx), 0);
--
2.43.0
From: Nicholas Kazlauskas <nicholas.kazlauskas(a)amd.com>
[WHY]
We still had an instance of get_idle_state checking the PMFW scratch
register instead of the actual idle allow signal.
[HOW]
Replace it with the SW state check for whether we had allowed idle
through notify_idle.
Cc: Mario Limonciello <mario.limonciello(a)amd.com>
Cc: Alex Deucher <alexander.deucher(a)amd.com>
Cc: stable(a)vger.kernel.org
Reviewed-by: Duncan Ma <duncan.ma(a)amd.com>
Acked-by: Alex Hung <alex.hung(a)amd.com>
Signed-off-by: Nicholas Kazlauskas <nicholas.kazlauskas(a)amd.com>
---
drivers/gpu/drm/amd/display/dc/core/dc.c | 12 +++---------
1 file changed, 3 insertions(+), 9 deletions(-)
diff --git a/drivers/gpu/drm/amd/display/dc/core/dc.c b/drivers/gpu/drm/amd/display/dc/core/dc.c
index daf6c7fe0906..acd8f1257ade 100644
--- a/drivers/gpu/drm/amd/display/dc/core/dc.c
+++ b/drivers/gpu/drm/amd/display/dc/core/dc.c
@@ -4884,22 +4884,16 @@ void dc_exit_ips_for_hw_access_internal(struct dc *dc, const char *caller_name)
bool dc_dmub_is_ips_idle_state(struct dc *dc)
{
- uint32_t idle_state = 0;
-
if (dc->debug.disable_idle_power_optimizations)
return false;
if (!dc->caps.ips_support || (dc->config.disable_ips == DMUB_IPS_DISABLE_ALL))
return false;
- if (dc->hwss.get_idle_state)
- idle_state = dc->hwss.get_idle_state(dc);
-
- if (!(idle_state & DMUB_IPS1_ALLOW_MASK) ||
- !(idle_state & DMUB_IPS2_ALLOW_MASK))
- return true;
+ if (!dc->ctx->dmub_srv)
+ return false;
- return false;
+ return dc->ctx->dmub_srv->idle_allowed;
}
/* set min and max memory clock to lowest and highest DPM level, respectively */
--
2.34.1
From: Wenjing Liu <wenjing.liu(a)amd.com>
[WHY]
When committing an update with ODM combine change when the plane is
removing or already removed, we fail to detect odm change in pipe
update flags. This has caused mismatch between new dc state and the
actual hardware state, because we missed odm programming.
[HOW]
- Detect odm change even for otg master pipe without a plane.
- Update odm config before calling program pipes for pipe with planes.
The commit also updates blank pattern programming when odm is changed
without plane. This is because number of OPP is changed when ODM
combine is changed. Blank pattern is per OPP so we will need to
reprogram OPP based on the new pipe topology.
Cc: Mario Limonciello <mario.limonciello(a)amd.com>
Cc: Alex Deucher <alexander.deucher(a)amd.com>
Cc: stable(a)vger.kernel.org
Reviewed-by: Dillon Varone <dillon.varone(a)amd.com>
Acked-by: Alex Hung <alex.hung(a)amd.com>
Signed-off-by: Wenjing Liu <wenjing.liu(a)amd.com>
---
.../amd/display/dc/hwss/dcn20/dcn20_hwseq.c | 41 ++++++++++---------
.../amd/display/dc/hwss/dcn32/dcn32_hwseq.c | 7 ++++
2 files changed, 28 insertions(+), 20 deletions(-)
diff --git a/drivers/gpu/drm/amd/display/dc/hwss/dcn20/dcn20_hwseq.c b/drivers/gpu/drm/amd/display/dc/hwss/dcn20/dcn20_hwseq.c
index c55d5155ecb9..40098d9f70cb 100644
--- a/drivers/gpu/drm/amd/display/dc/hwss/dcn20/dcn20_hwseq.c
+++ b/drivers/gpu/drm/amd/display/dc/hwss/dcn20/dcn20_hwseq.c
@@ -1498,6 +1498,11 @@ static void dcn20_detect_pipe_changes(struct dc_state *old_state,
return;
}
+ if (resource_is_pipe_type(new_pipe, OTG_MASTER) &&
+ resource_is_odm_topology_changed(new_pipe, old_pipe))
+ /* Detect odm changes */
+ new_pipe->update_flags.bits.odm = 1;
+
/* Exit on unchanged, unused pipe */
if (!old_pipe->plane_state && !new_pipe->plane_state)
return;
@@ -1551,10 +1556,6 @@ static void dcn20_detect_pipe_changes(struct dc_state *old_state,
/* Detect top pipe only changes */
if (resource_is_pipe_type(new_pipe, OTG_MASTER)) {
- /* Detect odm changes */
- if (resource_is_odm_topology_changed(new_pipe, old_pipe))
- new_pipe->update_flags.bits.odm = 1;
-
/* Detect global sync changes */
if (old_pipe->pipe_dlg_param.vready_offset != new_pipe->pipe_dlg_param.vready_offset
|| old_pipe->pipe_dlg_param.vstartup_start != new_pipe->pipe_dlg_param.vstartup_start
@@ -1999,19 +2000,20 @@ void dcn20_program_front_end_for_ctx(
DC_LOGGER_INIT(dc->ctx->logger);
unsigned int prev_hubp_count = 0;
unsigned int hubp_count = 0;
+ struct pipe_ctx *pipe;
if (resource_is_pipe_topology_changed(dc->current_state, context))
resource_log_pipe_topology_update(dc, context);
if (dc->hwss.program_triplebuffer != NULL && dc->debug.enable_tri_buf) {
for (i = 0; i < dc->res_pool->pipe_count; i++) {
- struct pipe_ctx *pipe_ctx = &context->res_ctx.pipe_ctx[i];
+ pipe = &context->res_ctx.pipe_ctx[i];
- if (!pipe_ctx->top_pipe && !pipe_ctx->prev_odm_pipe && pipe_ctx->plane_state) {
- ASSERT(!pipe_ctx->plane_state->triplebuffer_flips);
+ if (!pipe->top_pipe && !pipe->prev_odm_pipe && pipe->plane_state) {
+ ASSERT(!pipe->plane_state->triplebuffer_flips);
/*turn off triple buffer for full update*/
dc->hwss.program_triplebuffer(
- dc, pipe_ctx, pipe_ctx->plane_state->triplebuffer_flips);
+ dc, pipe, pipe->plane_state->triplebuffer_flips);
}
}
}
@@ -2085,12 +2087,22 @@ void dcn20_program_front_end_for_ctx(
DC_LOG_DC("Reset mpcc for pipe %d\n", dc->current_state->res_ctx.pipe_ctx[i].pipe_idx);
}
+ /* update ODM for blanked OTG master pipes */
+ for (i = 0; i < dc->res_pool->pipe_count; i++) {
+ pipe = &context->res_ctx.pipe_ctx[i];
+ if (resource_is_pipe_type(pipe, OTG_MASTER) &&
+ !resource_is_pipe_type(pipe, DPP_PIPE) &&
+ pipe->update_flags.bits.odm &&
+ hws->funcs.update_odm)
+ hws->funcs.update_odm(dc, context, pipe);
+ }
+
/*
* Program all updated pipes, order matters for mpcc setup. Start with
* top pipe and program all pipes that follow in order
*/
for (i = 0; i < dc->res_pool->pipe_count; i++) {
- struct pipe_ctx *pipe = &context->res_ctx.pipe_ctx[i];
+ pipe = &context->res_ctx.pipe_ctx[i];
if (pipe->plane_state && !pipe->top_pipe) {
while (pipe) {
@@ -2129,17 +2141,6 @@ void dcn20_program_front_end_for_ctx(
context->stream_status[0].plane_count > 1) {
pipe->plane_res.hubp->funcs->hubp_wait_pipe_read_start(pipe->plane_res.hubp);
}
-
- /* when dynamic ODM is active, pipes must be reconfigured when all planes are
- * disabled, as some transitions will leave software and hardware state
- * mismatched.
- */
- if (dc->debug.enable_single_display_2to1_odm_policy &&
- pipe->stream &&
- pipe->update_flags.bits.disable &&
- !pipe->prev_odm_pipe &&
- hws->funcs.update_odm)
- hws->funcs.update_odm(dc, context, pipe);
}
}
diff --git a/drivers/gpu/drm/amd/display/dc/hwss/dcn32/dcn32_hwseq.c b/drivers/gpu/drm/amd/display/dc/hwss/dcn32/dcn32_hwseq.c
index aa36d7a56ca8..b890db0bfc46 100644
--- a/drivers/gpu/drm/amd/display/dc/hwss/dcn32/dcn32_hwseq.c
+++ b/drivers/gpu/drm/amd/display/dc/hwss/dcn32/dcn32_hwseq.c
@@ -1156,6 +1156,13 @@ void dcn32_update_odm(struct dc *dc, struct dc_state *context, struct pipe_ctx *
dsc->funcs->dsc_disconnect(dsc);
}
}
+
+ if (!resource_is_pipe_type(pipe_ctx, DPP_PIPE))
+ /*
+ * blank pattern is generated by OPP, reprogram blank pattern
+ * due to OPP count change
+ */
+ dc->hwseq->funcs.blank_pixel_data(dc, pipe_ctx, true);
}
unsigned int dcn32_calculate_dccg_k1_k2_values(struct pipe_ctx *pipe_ctx, unsigned int *k1_div, unsigned int *k2_div)
--
2.34.1
From: Ryan Lin <tsung-hua.lin(a)amd.com>
[WHY]
Some eDP panels' ext caps don't write initial values. The value of
dpcd_addr (0x317) can be random and the backlight control interface
will be incorrect.
[HOW]
Add new panel patches to remove sink ext caps.
Cc: Mario Limonciello <mario.limonciello(a)amd.com>
Cc: Alex Deucher <alexander.deucher(a)amd.com>
Cc: stable(a)vger.kernel.org # 6.5.x
Cc: Tsung-hua Lin <tsung-hua.lin(a)amd.com>
Cc: Chris Chi <moukong.chi(a)amd.com>
Reviewed-by: Wayne Lin <wayne.lin(a)amd.com>
Acked-by: Alex Hung <alex.hung(a)amd.com>
Signed-off-by: Ryan Lin <tsung-hua.lin(a)amd.com>
---
drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_helpers.c | 6 ++++--
1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_helpers.c b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_helpers.c
index 85b7f58a7f35..c27063305a13 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_helpers.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_helpers.c
@@ -67,6 +67,8 @@ static void apply_edid_quirks(struct edid *edid, struct dc_edid_caps *edid_caps)
/* Workaround for some monitors that do not clear DPCD 0x317 if FreeSync is unsupported */
case drm_edid_encode_panel_id('A', 'U', 'O', 0xA7AB):
case drm_edid_encode_panel_id('A', 'U', 'O', 0xE69B):
+ case drm_edid_encode_panel_id('B', 'O', 'E', 0x092A):
+ case drm_edid_encode_panel_id('L', 'G', 'D', 0x06D1):
DRM_DEBUG_DRIVER("Clearing DPCD 0x317 on monitor with panel id %X\n", panel_id);
edid_caps->panel_patch.remove_sink_ext_caps = true;
break;
@@ -120,6 +122,8 @@ enum dc_edid_status dm_helpers_parse_edid_caps(
edid_caps->edid_hdmi = connector->display_info.is_hdmi;
+ apply_edid_quirks(edid_buf, edid_caps);
+
sad_count = drm_edid_to_sad((struct edid *) edid->raw_edid, &sads);
if (sad_count <= 0)
return result;
@@ -146,8 +150,6 @@ enum dc_edid_status dm_helpers_parse_edid_caps(
else
edid_caps->speaker_flags = DEFAULT_SPEAKER_LOCATION;
- apply_edid_quirks(edid_buf, edid_caps);
-
kfree(sads);
kfree(sadb);
--
2.34.1
From: Josip Pavic <josip.pavic(a)amd.com>
[WHY]
It's beneficial for ABM to know when new frame data are available.
[HOW]
Add new condition to allow dirty rects to be sent to DMUB when ABM is
active. ABM will use this as a signal that a new frame has arrived.
Cc: Mario Limonciello <mario.limonciello(a)amd.com>
Cc: Alex Deucher <alexander.deucher(a)amd.com>
Cc: stable(a)vger.kernel.org
Reviewed-by: Anthony Koo <anthony.koo(a)amd.com>
Acked-by: Alex Hung <alex.hung(a)amd.com>
Signed-off-by: Josip Pavic <josip.pavic(a)amd.com>
---
drivers/gpu/drm/amd/display/dc/core/dc.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/drivers/gpu/drm/amd/display/dc/core/dc.c b/drivers/gpu/drm/amd/display/dc/core/dc.c
index 5211c1c0f3c0..613d09c42f3b 100644
--- a/drivers/gpu/drm/amd/display/dc/core/dc.c
+++ b/drivers/gpu/drm/amd/display/dc/core/dc.c
@@ -3270,6 +3270,9 @@ static bool dc_dmub_should_send_dirty_rect_cmd(struct dc *dc, struct dc_stream_s
if (stream->link->replay_settings.config.replay_supported)
return true;
+ if (stream->ctx->dce_version >= DCN_VERSION_3_5 && stream->abm_level)
+ return true;
+
return false;
}
--
2.34.1
This is the start of the stable review cycle for the 4.19.308 release.
There are 52 patches in this series, all will be posted as a response
to this one. If anyone has any issues with these being applied, please
let me know.
Responses should be made by Thu, 29 Feb 2024 13:15:36 +0000.
Anything received after that time might be too late.
The whole patch series can be found in one patch at:
https://www.kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.19.308-r…
or in the git tree and branch at:
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-4.19.y
and the diffstat can be found below.
thanks,
greg k-h
-------------
Pseudo-Shortlog of commits:
Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Linux 4.19.308-rc1
Andrii Nakryiko <andriin(a)fb.com>
scripts/bpf: Fix xdp_md forward declaration typo
Bart Van Assche <bvanassche(a)acm.org>
fs/aio: Restrict kiocb_set_cancel_fn() to I/O submitted via libaio
Oliver Upton <oliver.upton(a)linux.dev>
KVM: arm64: vgic-its: Test for valid IRQ in MOVALL handler
Oliver Upton <oliver.upton(a)linux.dev>
KVM: arm64: vgic-its: Test for valid IRQ in its_sync_lpi_pending_table()
Vidya Sagar <vidyas(a)nvidia.com>
PCI/MSI: Prevent MSI hardware interrupt number truncation
Jason Gunthorpe <jgg(a)nvidia.com>
s390: use the correct count for __iowrite64_copy()
Wolfram Sang <wsa+renesas(a)sang-engineering.com>
packet: move from strlcpy with unused retval to strscpy
Vasiliy Kovalev <kovalev(a)altlinux.org>
ipv6: sr: fix possible use-after-free and null-ptr-deref
Arnd Bergmann <arnd(a)arndb.de>
nouveau: fix function cast warnings
Randy Dunlap <rdunlap(a)infradead.org>
scsi: jazz_esp: Only build if SCSI core is builtin
Gianmarco Lusvardi <glusvardi(a)posteo.net>
bpf, scripts: Correct GPL license name
Andrii Nakryiko <andriin(a)fb.com>
scripts/bpf: teach bpf_helpers_doc.py to dump BPF helper definitions
Arnd Bergmann <arnd(a)arndb.de>
RDMA/srpt: fix function pointer cast warnings
Bart Van Assche <bvanassche(a)acm.org>
RDMA/srpt: Make debug output more detailed
Jason Gunthorpe <jgg(a)ziepe.ca>
RDMA/ulp: Use dev_name instead of ibdev->name
Bart Van Assche <bvanassche(a)acm.org>
RDMA/srpt: Support specifying the srpt_service_guid parameter
Kalesh AP <kalesh-anakkur.purayil(a)broadcom.com>
RDMA/bnxt_re: Return error for SRQ resize
Zhipeng Lu <alexious(a)zju.edu.cn>
IB/hfi1: Fix a memleak in init_credit_return
Xu Yang <xu.yang_2(a)nxp.com>
usb: roles: don't get/set_role() when usb_role_switch is unregistered
Krishna Kurapati <quic_kriskura(a)quicinc.com>
usb: gadget: ncm: Avoid dropping datagrams of properly parsed NTBs
Nikita Shubin <nikita.shubin(a)maquefel.me>
ARM: ep93xx: Add terminator to gpiod_lookup_table
Tom Parkin <tparkin(a)katalix.com>
l2tp: pass correct message length to ip6_append_data
Vasiliy Kovalev <kovalev(a)altlinux.org>
gtp: fix use-after-free and null-ptr-deref in gtp_genl_dump_pdp()
Mikulas Patocka <mpatocka(a)redhat.com>
dm-crypt: don't modify the data when using authenticated encryption
Roman Gushchin <guro(a)fb.com>
mm: memcontrol: switch to rcu protection in drain_all_stock()
Daniel Vacek <neelx(a)redhat.com>
IB/hfi1: Fix sdma.h tx->num_descs off-by-one error
Geert Uytterhoeven <geert+renesas(a)glider.be>
pmdomain: renesas: r8a77980-sysc: CR7 must be always on
Alexandra Winter <wintera(a)linux.ibm.com>
s390/qeth: Fix potential loss of L3-IP@ in case of network issues
Yi Sun <yi.sun(a)unisoc.com>
virtio-blk: Ensure no requests in virtqueues before deleting vqs.
Takashi Sakamoto <o-takashi(a)sakamocchi.jp>
firewire: core: send bus reset promptly on gap count error
Zhang Rui <rui.zhang(a)intel.com>
hwmon: (coretemp) Enlarge per package core count limit
Martin Blumenstingl <martin.blumenstingl(a)googlemail.com>
regulator: pwm-regulator: Add validity checks in continuous .get_voltage
Baokun Li <libaokun1(a)huawei.com>
ext4: avoid allocating blocks from corrupted group in ext4_mb_find_by_goal()
Baokun Li <libaokun1(a)huawei.com>
ext4: avoid allocating blocks from corrupted group in ext4_mb_try_best_found()
Conrad Kostecki <conikost(a)gentoo.org>
ahci: asm1166: correct count of reported ports
Fullway Wang <fullwaywang(a)outlook.com>
fbdev: sis: Error out if pixclock equals zero
Fullway Wang <fullwaywang(a)outlook.com>
fbdev: savage: Error out if pixclock equals zero
Felix Fietkau <nbd(a)nbd.name>
wifi: mac80211: fix race condition on enabling fast-xmit
Michal Kazior <michal(a)plume.com>
wifi: cfg80211: fix missing interfaces when dumping
Vinod Koul <vkoul(a)kernel.org>
dmaengine: shdma: increase size of 'dev_id'
Dmitry Bogdanov <d.bogdanov(a)yadro.com>
scsi: target: core: Add TMF to tmr_list handling
Cyril Hrubis <chrubis(a)suse.cz>
sched/rt: Disallow writing invalid values to sched_rt_period_us
Cyril Hrubis <chrubis(a)suse.cz>
sched/rt: sysctl_sched_rr_timeslice show default timeslice after reset
Cyril Hrubis <chrubis(a)suse.cz>
sched/rt: Fix sysctl_sched_rr_timeslice intial value
Lokesh Gidra <lokeshgidra(a)google.com>
userfaultfd: fix mmap_changing checking in mfill_atomic_hugetlb
Ryusuke Konishi <konishi.ryusuke(a)gmail.com>
nilfs2: replace WARN_ONs for invalid DAT metadata block requests
GONG, Ruiqi <gongruiqi1(a)huawei.com>
memcg: add refcnt for pcpu stock to avoid UAF problem in drain_all_stock()
Aaro Koskinen <aaro.koskinen(a)nokia.com>
net: stmmac: fix notifier registration
Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
stmmac: no need to check return value of debugfs_create functions
Jamal Hadi Salim <jhs(a)mojatatu.com>
net/sched: Retire dsmark qdisc
Jamal Hadi Salim <jhs(a)mojatatu.com>
net/sched: Retire ATM qdisc
Jamal Hadi Salim <jhs(a)mojatatu.com>
net/sched: Retire CBQ qdisc
-------------
Diffstat:
Makefile | 4 +-
arch/arm/mach-ep93xx/core.c | 1 +
arch/s390/pci/pci.c | 2 +-
drivers/ata/ahci.c | 5 +
drivers/block/virtio_blk.c | 7 +-
drivers/dma/sh/shdma.h | 2 +-
drivers/firewire/core-card.c | 18 +-
drivers/gpu/drm/nouveau/nvkm/subdev/bios/shadow.c | 8 +-
drivers/hwmon/coretemp.c | 2 +-
drivers/infiniband/hw/bnxt_re/ib_verbs.c | 5 +-
drivers/infiniband/hw/hfi1/pio.c | 6 +-
drivers/infiniband/hw/hfi1/sdma.c | 2 +-
drivers/infiniband/ulp/ipoib/ipoib_verbs.c | 2 +-
drivers/infiniband/ulp/iser/iser_verbs.c | 9 +-
drivers/infiniband/ulp/isert/ib_isert.c | 2 +-
drivers/infiniband/ulp/opa_vnic/opa_vnic_vema.c | 3 +-
drivers/infiniband/ulp/srp/ib_srp.c | 10 +-
drivers/infiniband/ulp/srpt/ib_srpt.c | 52 +-
drivers/md/dm-crypt.c | 6 +
drivers/net/ethernet/stmicro/stmmac/stmmac.h | 2 -
drivers/net/ethernet/stmicro/stmmac/stmmac_main.c | 65 +-
drivers/net/gtp.c | 10 +-
drivers/pci/msi.c | 2 +-
drivers/regulator/pwm-regulator.c | 3 +
drivers/s390/net/qeth_l3_main.c | 9 +-
drivers/scsi/Kconfig | 2 +-
drivers/soc/renesas/r8a77980-sysc.c | 3 +-
drivers/target/target_core_device.c | 5 -
drivers/target/target_core_transport.c | 4 +
drivers/usb/gadget/function/f_ncm.c | 10 +-
drivers/usb/roles/class.c | 12 +-
drivers/video/fbdev/savage/savagefb_driver.c | 3 +
drivers/video/fbdev/sis/sis_main.c | 2 +
fs/aio.c | 9 +-
fs/ext4/mballoc.c | 13 +-
fs/nilfs2/dat.c | 27 +-
include/linux/fs.h | 2 +
include/rdma/rdma_vt.h | 2 +-
kernel/sched/rt.c | 10 +-
kernel/sysctl.c | 5 +
mm/memcontrol.c | 23 +-
mm/userfaultfd.c | 14 +-
net/ipv6/seg6.c | 20 +-
net/l2tp/l2tp_ip6.c | 2 +-
net/mac80211/sta_info.c | 2 +
net/mac80211/tx.c | 2 +-
net/packet/af_packet.c | 4 +-
net/sched/Kconfig | 42 -
net/sched/Makefile | 3 -
net/sched/sch_atm.c | 708 --------
net/sched/sch_cbq.c | 1823 ---------------------
net/sched/sch_dsmark.c | 519 ------
net/wireless/nl80211.c | 1 +
scripts/bpf_helpers_doc.py | 157 +-
virt/kvm/arm/vgic/vgic-its.c | 5 +
55 files changed, 412 insertions(+), 3259 deletions(-)
The patch below does not apply to the 5.15-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.15.y
git checkout FETCH_HEAD
git cherry-pick -x 045e9d812868a2d80b7a57b224ce8009444b7bbc
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024022602-extended-buffer-5cf7@gregkh' --subject-prefix 'PATCH 5.15.y' HEAD^..
Possible dependencies:
045e9d812868 ("mptcp: fix duplicate subflow creation")
b9d69db87fb7 ("mptcp: let the in-kernel PM use mixed IPv4 and IPv6 addresses")
bedee0b56113 ("mptcp: address lookup improvements")
4638de5aefe5 ("mptcp: handle local addrs announced by userspace PMs")
c682bf536cf4 ("mptcp: add pm_nl_pernet helpers")
4cf86ae84c71 ("mptcp: strict local address ID selection")
d045b9eb95a9 ("mptcp: introduce implicit endpoints")
aaa25a2fa796 ("Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 045e9d812868a2d80b7a57b224ce8009444b7bbc Mon Sep 17 00:00:00 2001
From: Paolo Abeni <pabeni(a)redhat.com>
Date: Thu, 15 Feb 2024 19:25:33 +0100
Subject: [PATCH] mptcp: fix duplicate subflow creation
Fullmesh endpoints could end-up unexpectedly generating duplicate
subflows - same local and remote addresses - when multiple incoming
ADD_ADDR are processed before the PM creates the subflow for the local
endpoints.
Address the issue explicitly checking for duplicates at subflow
creation time.
To avoid a quadratic computational complexity, track the unavailable
remote address ids in a temporary bitmap and initialize such bitmap
with the remote ids of all the existing subflows matching the local
address currently processed.
The above allows additionally replacing the existing code checking
for duplicate entry in the current set with a simple bit test
operation.
Fixes: 2843ff6f36db ("mptcp: remote addresses fullmesh")
Cc: stable(a)vger.kernel.org
Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/435
Signed-off-by: Paolo Abeni <pabeni(a)redhat.com>
Reviewed-by: Mat Martineau <martineau(a)kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe(a)kernel.org>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
diff --git a/net/mptcp/pm_netlink.c b/net/mptcp/pm_netlink.c
index ed6983af1ab2..58d17d9604e7 100644
--- a/net/mptcp/pm_netlink.c
+++ b/net/mptcp/pm_netlink.c
@@ -396,19 +396,6 @@ void mptcp_pm_free_anno_list(struct mptcp_sock *msk)
}
}
-static bool lookup_address_in_vec(const struct mptcp_addr_info *addrs, unsigned int nr,
- const struct mptcp_addr_info *addr)
-{
- int i;
-
- for (i = 0; i < nr; i++) {
- if (addrs[i].id == addr->id)
- return true;
- }
-
- return false;
-}
-
/* Fill all the remote addresses into the array addrs[],
* and return the array size.
*/
@@ -440,6 +427,16 @@ static unsigned int fill_remote_addresses_vec(struct mptcp_sock *msk,
msk->pm.subflows++;
addrs[i++] = remote;
} else {
+ DECLARE_BITMAP(unavail_id, MPTCP_PM_MAX_ADDR_ID + 1);
+
+ /* Forbid creation of new subflows matching existing
+ * ones, possibly already created by incoming ADD_ADDR
+ */
+ bitmap_zero(unavail_id, MPTCP_PM_MAX_ADDR_ID + 1);
+ mptcp_for_each_subflow(msk, subflow)
+ if (READ_ONCE(subflow->local_id) == local->id)
+ __set_bit(subflow->remote_id, unavail_id);
+
mptcp_for_each_subflow(msk, subflow) {
ssk = mptcp_subflow_tcp_sock(subflow);
remote_address((struct sock_common *)ssk, &addrs[i]);
@@ -447,11 +444,17 @@ static unsigned int fill_remote_addresses_vec(struct mptcp_sock *msk,
if (deny_id0 && !addrs[i].id)
continue;
+ if (test_bit(addrs[i].id, unavail_id))
+ continue;
+
if (!mptcp_pm_addr_families_match(sk, local, &addrs[i]))
continue;
- if (!lookup_address_in_vec(addrs, i, &addrs[i]) &&
- msk->pm.subflows < subflows_max) {
+ if (msk->pm.subflows < subflows_max) {
+ /* forbid creating multiple address towards
+ * this id
+ */
+ __set_bit(addrs[i].id, unavail_id);
msk->pm.subflows++;
i++;
}
The patch below does not apply to the 5.15-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.15.y
git checkout FETCH_HEAD
git cherry-pick -x 8c86fad2cecdc6bf7283ecd298b4d0555bd8b8aa
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024021943-wriggle-repair-49dd@gregkh' --subject-prefix 'PATCH 5.15.y' HEAD^..
Possible dependencies:
8c86fad2cecd ("selftests: mptcp: add missing kconfig for NF Filter in v6")
b6e074e171bc ("selftests: mptcp: add infinite map testcase")
39aab88242a8 ("selftests: mptcp: join: list failure at the end")
c7d49c033de0 ("selftests: mptcp: join: alt. to exec specific tests")
ae7bd9ccecc3 ("selftests: mptcp: join: option to execute specific tests")
e59300ce3ff8 ("selftests: mptcp: join: reset failing links")
3afd0280e7d3 ("selftests: mptcp: join: define tests groups once")
3c082695e78b ("selftests: mptcp: drop msg argument of chk_csum_nr")
69c6ce7b6eca ("selftests: mptcp: add implicit endpoint test case")
d045b9eb95a9 ("mptcp: introduce implicit endpoints")
6fa0174a7c86 ("mptcp: more careful RM_ADDR generation")
f98c2bca7b2b ("selftests: mptcp: Rename wait function")
826d7bdca833 ("selftests: mptcp: join: allow running -cCi")
7d9bf018f907 ("selftests: mptcp: update output info of chk_rm_nr")
26516e10c433 ("selftests: mptcp: add more arguments for chk_join_nr")
8117dac3e7c3 ("selftests: mptcp: add invert check in check_transfer")
01542c9bf9ab ("selftests: mptcp: add fastclose testcase")
cbfafac4cf8f ("selftests: mptcp: add extra_args in do_transfer")
922fd2b39e5a ("selftests: mptcp: add the MP_RST mibs check")
e8e947ef50f6 ("selftests: mptcp: add the MP_FASTCLOSE mibs check")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 8c86fad2cecdc6bf7283ecd298b4d0555bd8b8aa Mon Sep 17 00:00:00 2001
From: "Matthieu Baerts (NGI0)" <matttbe(a)kernel.org>
Date: Wed, 31 Jan 2024 22:49:48 +0100
Subject: [PATCH] selftests: mptcp: add missing kconfig for NF Filter in v6
Since the commit mentioned below, 'mptcp_join' selftests is using
IPTables to add rules to the Filter table for IPv6.
It is then required to have IP6_NF_FILTER KConfig.
This KConfig is usually enabled by default in many defconfig, but we
recently noticed that some CI were running our selftests without them
enabled.
Fixes: 523514ed0a99 ("selftests: mptcp: add ADD_ADDR IPv6 test cases")
Cc: stable(a)vger.kernel.org
Reviewed-by: Geliang Tang <geliang(a)kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe(a)kernel.org>
Link: https://lore.kernel.org/r/20240131-upstream-net-20240131-mptcp-ci-issues-v1…
Signed-off-by: Jakub Kicinski <kuba(a)kernel.org>
diff --git a/tools/testing/selftests/net/mptcp/config b/tools/testing/selftests/net/mptcp/config
index 2a00bf4acdfa..26fe466f803d 100644
--- a/tools/testing/selftests/net/mptcp/config
+++ b/tools/testing/selftests/net/mptcp/config
@@ -25,6 +25,7 @@ CONFIG_IP_MULTIPLE_TABLES=y
CONFIG_IP_NF_FILTER=m
CONFIG_IP_NF_TARGET_REJECT=m
CONFIG_IPV6_MULTIPLE_TABLES=y
+CONFIG_IP6_NF_FILTER=m
CONFIG_NET_ACT_CSUM=m
CONFIG_NET_ACT_PEDIT=m
CONFIG_NET_CLS_ACT=y
The patch below does not apply to the 5.15-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.15.y
git checkout FETCH_HEAD
git cherry-pick -x 3645c844902bd4e173d6704fc2a37e8746904d67
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024021925-lyricism-unshackle-b3ca@gregkh' --subject-prefix 'PATCH 5.15.y' HEAD^..
Possible dependencies:
3645c844902b ("selftests: mptcp: add missing kconfig for NF Filter")
46e967d187ed ("selftests: mptcp: add tests for subflow creation failure")
5fb62e9cd3ad ("selftests: mptcp: add tproxy test case")
b6ab64b074f2 ("selftests: mptcp: more stable simult_flows tests")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 3645c844902bd4e173d6704fc2a37e8746904d67 Mon Sep 17 00:00:00 2001
From: "Matthieu Baerts (NGI0)" <matttbe(a)kernel.org>
Date: Wed, 31 Jan 2024 22:49:47 +0100
Subject: [PATCH] selftests: mptcp: add missing kconfig for NF Filter
Since the commit mentioned below, 'mptcp_join' selftests is using
IPTables to add rules to the Filter table.
It is then required to have IP_NF_FILTER KConfig.
This KConfig is usually enabled by default in many defconfig, but we
recently noticed that some CI were running our selftests without them
enabled.
Fixes: 8d014eaa9254 ("selftests: mptcp: add ADD_ADDR timeout test case")
Cc: stable(a)vger.kernel.org
Reviewed-by: Geliang Tang <geliang(a)kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe(a)kernel.org>
Signed-off-by: Jakub Kicinski <kuba(a)kernel.org>
diff --git a/tools/testing/selftests/net/mptcp/config b/tools/testing/selftests/net/mptcp/config
index e317c2e44dae..2a00bf4acdfa 100644
--- a/tools/testing/selftests/net/mptcp/config
+++ b/tools/testing/selftests/net/mptcp/config
@@ -22,6 +22,7 @@ CONFIG_NFT_TPROXY=m
CONFIG_NFT_SOCKET=m
CONFIG_IP_ADVANCED_ROUTER=y
CONFIG_IP_MULTIPLE_TABLES=y
+CONFIG_IP_NF_FILTER=m
CONFIG_IP_NF_TARGET_REJECT=m
CONFIG_IPV6_MULTIPLE_TABLES=y
CONFIG_NET_ACT_CSUM=m
The patch below does not apply to the 5.15-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.15.y
git checkout FETCH_HEAD
git cherry-pick -x 9f1a98813b4b686482e5ef3c9d998581cace0ba6
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2023100447-durable-snowiness-8b36@gregkh' --subject-prefix 'PATCH 5.15.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 9f1a98813b4b686482e5ef3c9d998581cace0ba6 Mon Sep 17 00:00:00 2001
From: Paolo Abeni <pabeni(a)redhat.com>
Date: Sat, 16 Sep 2023 12:52:47 +0200
Subject: [PATCH] mptcp: process pending subflow error on close
On incoming TCP reset, subflow closing could happen before error
propagation. That in turn could cause the socket error being ignored,
and a missing socket state transition, as reported by Daire-Byrne.
Address the issues explicitly checking for subflow socket error at
close time. To avoid code duplication, factor-out of __mptcp_error_report()
a new helper implementing the relevant bits.
Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/429
Fixes: 15cc10453398 ("mptcp: deliver ssk errors to msk")
Cc: stable(a)vger.kernel.org
Signed-off-by: Paolo Abeni <pabeni(a)redhat.com>
Reviewed-by: Mat Martineau <martineau(a)kernel.org>
Signed-off-by: Matthieu Baerts <matthieu.baerts(a)tessares.net>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c
index 915860027b1a..1c96b8da71df 100644
--- a/net/mptcp/protocol.c
+++ b/net/mptcp/protocol.c
@@ -770,40 +770,44 @@ static bool __mptcp_ofo_queue(struct mptcp_sock *msk)
return moved;
}
+static bool __mptcp_subflow_error_report(struct sock *sk, struct sock *ssk)
+{
+ int err = sock_error(ssk);
+ int ssk_state;
+
+ if (!err)
+ return false;
+
+ /* only propagate errors on fallen-back sockets or
+ * on MPC connect
+ */
+ if (sk->sk_state != TCP_SYN_SENT && !__mptcp_check_fallback(mptcp_sk(sk)))
+ return false;
+
+ /* We need to propagate only transition to CLOSE state.
+ * Orphaned socket will see such state change via
+ * subflow_sched_work_if_closed() and that path will properly
+ * destroy the msk as needed.
+ */
+ ssk_state = inet_sk_state_load(ssk);
+ if (ssk_state == TCP_CLOSE && !sock_flag(sk, SOCK_DEAD))
+ inet_sk_state_store(sk, ssk_state);
+ WRITE_ONCE(sk->sk_err, -err);
+
+ /* This barrier is coupled with smp_rmb() in mptcp_poll() */
+ smp_wmb();
+ sk_error_report(sk);
+ return true;
+}
+
void __mptcp_error_report(struct sock *sk)
{
struct mptcp_subflow_context *subflow;
struct mptcp_sock *msk = mptcp_sk(sk);
- mptcp_for_each_subflow(msk, subflow) {
- struct sock *ssk = mptcp_subflow_tcp_sock(subflow);
- int err = sock_error(ssk);
- int ssk_state;
-
- if (!err)
- continue;
-
- /* only propagate errors on fallen-back sockets or
- * on MPC connect
- */
- if (sk->sk_state != TCP_SYN_SENT && !__mptcp_check_fallback(msk))
- continue;
-
- /* We need to propagate only transition to CLOSE state.
- * Orphaned socket will see such state change via
- * subflow_sched_work_if_closed() and that path will properly
- * destroy the msk as needed.
- */
- ssk_state = inet_sk_state_load(ssk);
- if (ssk_state == TCP_CLOSE && !sock_flag(sk, SOCK_DEAD))
- inet_sk_state_store(sk, ssk_state);
- WRITE_ONCE(sk->sk_err, -err);
-
- /* This barrier is coupled with smp_rmb() in mptcp_poll() */
- smp_wmb();
- sk_error_report(sk);
- break;
- }
+ mptcp_for_each_subflow(msk, subflow)
+ if (__mptcp_subflow_error_report(sk, mptcp_subflow_tcp_sock(subflow)))
+ break;
}
/* In most cases we will be able to lock the mptcp socket. If its already
@@ -2428,6 +2432,7 @@ static void __mptcp_close_ssk(struct sock *sk, struct sock *ssk,
}
out_release:
+ __mptcp_subflow_error_report(sk, ssk);
release_sock(ssk);
sock_put(ssk);
Commit 8238b4579866 ("wait_on_bit: add an acquire memory barrier") added
a new bitop, test_bit_acquire(), with proper wrapping in order to try to
optimize it at compile-time, but missed the list of bitops used for
checking their prototypes a bit below.
The functions added have consistent prototypes, so that no more changes
are required and no functional changes take place.
Fixes: 8238b4579866 ("wait_on_bit: add an acquire memory barrier")
Cc: stable(a)vger.kernel.org # 6.0+
Reviewed-by: Przemek Kitszel <przemyslaw.kitszel(a)intel.com>
Signed-off-by: Alexander Lobakin <aleksander.lobakin(a)intel.com>
---
include/linux/bitops.h | 1 +
1 file changed, 1 insertion(+)
diff --git a/include/linux/bitops.h b/include/linux/bitops.h
index 2ba557e067fe..f7f5a783da2a 100644
--- a/include/linux/bitops.h
+++ b/include/linux/bitops.h
@@ -80,6 +80,7 @@ __check_bitop_pr(__test_and_set_bit);
__check_bitop_pr(__test_and_clear_bit);
__check_bitop_pr(__test_and_change_bit);
__check_bitop_pr(test_bit);
+__check_bitop_pr(test_bit_acquire);
#undef __check_bitop_pr
--
2.43.0
From: Ivan Lipski <ivlipski(a)amd.com>
[WHY]
Some eDP panels's ext caps don't write initial value cause the value of
dpcd_addr(0x317) is random. It means that sometimes the eDP will
clarify it is OLED, miniLED...etc cause the backlight control interface
is incorrect.
[HOW]
Add a new panel patch to remove sink ext caps(HDR,OLED...etc)
Cc: stable(a)vger.kernel.org # 6.5.x
Cc: Hamza Mahfooz <hamza.mahfooz(a)amd.com>
Cc: Tsung-hua Lin <tsung-hua.lin(a)amd.com>
Cc: Chris Chi <moukong.chi(a)amd.com>
Cc: Harry Wentland <Harry.Wentland(a)amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler(a)amd.com>
Reviewed-by: Sun peng Li <sunpeng.li(a)amd.com>
Acked-by: Rodrigo Siqueira <rodrigo.siqueira(a)amd.com>
Signed-off-by: Ivan Lipski <ivlipski(a)amd.com>
---
drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_helpers.c | 6 ++++++
1 file changed, 6 insertions(+)
diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_helpers.c b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_helpers.c
index d9a482908380..764dc3ffd91b 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_helpers.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_helpers.c
@@ -63,6 +63,12 @@ static void apply_edid_quirks(struct edid *edid, struct dc_edid_caps *edid_caps)
DRM_DEBUG_DRIVER("Disabling FAMS on monitor with panel id %X\n", panel_id);
edid_caps->panel_patch.disable_fams = true;
break;
+ /* Workaround for some monitors that do not clear DPCD 0x317 if FreeSync is unsupported */
+ case drm_edid_encode_panel_id('A', 'U', 'O', 0xA7AB):
+ case drm_edid_encode_panel_id('A', 'U', 'O', 0xE69B):
+ DRM_DEBUG_DRIVER("Clearing DPCD 0x317 on monitor with panel id %X\n", panel_id);
+ edid_caps->panel_patch.remove_sink_ext_caps = true;
+ break;
default:
return;
}
--
2.43.0
From: Bjorn Andersson <quic_bjorande(a)quicinc.com>
Commit 'e3e56c050ab6 ("soc: qcom: rpmhpd: Make power_on actually enable
the domain")' aimed to make sure that a power-domain that is being
enabled without any particular performance-state requested will at least
turn the rail on, to avoid filling DeviceTree with otherwise unnecessary
required-opps properties.
But in the event that aggregation happens on a disabled power-domain, with
an enabled peer without performance-state, both the local and peer
corner are 0. The peer's enabled_corner is not considered, with the
result that the underlying (shared) resource is disabled.
One case where this can be observed is when the display stack keeps mmcx
enabled (but without a particular performance-state vote) in order to
access registers and sync_state happens in the rpmhpd driver. As mmcx_ao
is flushed the state of the peer (mmcx) is not considered and mmcx_ao
ends up turning off "mmcx.lvl" underneath mmcx. This has been observed
several times, but has been painted over in DeviceTree by adding an
explicit vote for the lowest non-disabled performance-state.
Fixes: e3e56c050ab6 ("soc: qcom: rpmhpd: Make power_on actually enable the domain")
Reported-by: Johan Hovold <johan(a)kernel.org>
Closes: https://lore.kernel.org/linux-arm-msm/ZdMwZa98L23mu3u6@hovoldconsulting.com/
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Bjorn Andersson <quic_bjorande(a)quicinc.com>
---
This issue is the root cause of a display regression on SC8280XP boards,
resulting in the system often resetting during boot. It was exposed by
the refactoring of the DisplayPort driver in v6.8-rc1.
---
drivers/pmdomain/qcom/rpmhpd.c | 7 +++++--
1 file changed, 5 insertions(+), 2 deletions(-)
diff --git a/drivers/pmdomain/qcom/rpmhpd.c b/drivers/pmdomain/qcom/rpmhpd.c
index 3078896b1300..47df910645f6 100644
--- a/drivers/pmdomain/qcom/rpmhpd.c
+++ b/drivers/pmdomain/qcom/rpmhpd.c
@@ -692,6 +692,7 @@ static int rpmhpd_aggregate_corner(struct rpmhpd *pd, unsigned int corner)
unsigned int active_corner, sleep_corner;
unsigned int this_active_corner = 0, this_sleep_corner = 0;
unsigned int peer_active_corner = 0, peer_sleep_corner = 0;
+ unsigned int peer_enabled_corner;
if (pd->state_synced) {
to_active_sleep(pd, corner, &this_active_corner, &this_sleep_corner);
@@ -701,9 +702,11 @@ static int rpmhpd_aggregate_corner(struct rpmhpd *pd, unsigned int corner)
this_sleep_corner = pd->level_count - 1;
}
- if (peer && peer->enabled)
- to_active_sleep(peer, peer->corner, &peer_active_corner,
+ if (peer && peer->enabled) {
+ peer_enabled_corner = max(peer->corner, peer->enable_corner);
+ to_active_sleep(peer, peer_enabled_corner, &peer_active_corner,
&peer_sleep_corner);
+ }
active_corner = max(this_active_corner, peer_active_corner);
---
base-commit: b401b621758e46812da61fa58a67c3fd8d91de0d
change-id: 20240226-rpmhpd-enable-corner-fix-c5e07fe7b986
Best regards,
--
Bjorn Andersson <quic_bjorande(a)quicinc.com>
The value of the [ms]envcfg CSR is lost when entering a nonretentive
idle state, so the CSR must be rewritten when resuming the CPU.
Cc: <stable(a)vger.kernel.org> # v6.7+
Fixes: 43c16d51a19b ("RISC-V: Enable cbo.zero in usermode")
Signed-off-by: Samuel Holland <samuel.holland(a)sifive.com>
---
Changes in v4:
- Check for Xlinuxenvcfg instead of Zicboz
Changes in v3:
- Check for Zicboz instead of the privileged ISA version
Changes in v2:
- Check for privileged ISA v1.12 instead of the specific CSR
- Use riscv_has_extension_likely() instead of new ALTERNATIVE()s
arch/riscv/include/asm/suspend.h | 1 +
arch/riscv/kernel/suspend.c | 4 ++++
2 files changed, 5 insertions(+)
diff --git a/arch/riscv/include/asm/suspend.h b/arch/riscv/include/asm/suspend.h
index 02f87867389a..491296a335d0 100644
--- a/arch/riscv/include/asm/suspend.h
+++ b/arch/riscv/include/asm/suspend.h
@@ -14,6 +14,7 @@ struct suspend_context {
struct pt_regs regs;
/* Saved and restored by high-level functions */
unsigned long scratch;
+ unsigned long envcfg;
unsigned long tvec;
unsigned long ie;
#ifdef CONFIG_MMU
diff --git a/arch/riscv/kernel/suspend.c b/arch/riscv/kernel/suspend.c
index 239509367e42..299795341e8a 100644
--- a/arch/riscv/kernel/suspend.c
+++ b/arch/riscv/kernel/suspend.c
@@ -15,6 +15,8 @@
void suspend_save_csrs(struct suspend_context *context)
{
context->scratch = csr_read(CSR_SCRATCH);
+ if (riscv_cpu_has_extension_unlikely(smp_processor_id(), RISCV_ISA_EXT_XLINUXENVCFG))
+ context->envcfg = csr_read(CSR_ENVCFG);
context->tvec = csr_read(CSR_TVEC);
context->ie = csr_read(CSR_IE);
@@ -36,6 +38,8 @@ void suspend_save_csrs(struct suspend_context *context)
void suspend_restore_csrs(struct suspend_context *context)
{
csr_write(CSR_SCRATCH, context->scratch);
+ if (riscv_cpu_has_extension_unlikely(smp_processor_id(), RISCV_ISA_EXT_XLINUXENVCFG))
+ csr_write(CSR_ENVCFG, context->envcfg);
csr_write(CSR_TVEC, context->tvec);
csr_write(CSR_IE, context->ie);
--
2.43.1
First several patches target fixing the UFS support on the Qualcomm
MSM8996 / APQ8096 platforms, broken by the commit b4e13e1ae95e ("scsi:
ufs: qcom: Add multiple frequency support for MAX_CORE_CLK_1US_CYCLES").
Last two patches clean up the UFS DT device on that platform to follow
the bindings on other MSM8969 platforms. If such breaking change is
unacceptable, they can be simply ignored, merging fixes only.
Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov(a)linaro.org>
---
Changes in v3:
- dropped the patch conflicting with Yassine's patch that got accepted
- Cc stable on the UFS change (Manivannan)
- Fixed typos in the commit message (Manivannan)
- Link to v2: https://lore.kernel.org/r/20240213-msm8996-fix-ufs-v2-0-650758c26458@linaro…
Changes in v2:
- Dropped patches adding RX_SYMBOL_1_CLK, MSM8996 uses single lane
(Krzysztof).
- Link to v1: https://lore.kernel.org/r/20240209-msm8996-fix-ufs-v1-0-107b52e57420@linaro…
---
Dmitry Baryshkov (5):
scsi: ufs: qcom: provide default cycles_in_1us value
arm64: dts: qcom: msm8996: specify UFS core_clk frequencies
arm64: dts: qcom: msm8996: set GCC_UFS_ICE_CORE_CLK freq directly
dt-bindings: ufs: qcom,ufs: drop source clock entries
arm64: dts: qcom: msm8996: drop source clock entries from the UFS node
Documentation/devicetree/bindings/ufs/qcom,ufs.yaml | 12 +++++-------
arch/arm64/boot/dts/qcom/msm8996.dtsi | 8 +-------
drivers/ufs/host/ufs-qcom.c | 6 ++++--
3 files changed, 10 insertions(+), 16 deletions(-)
---
base-commit: 0035c3918a74a83f94158fbbd667e163bfd4a0d0
change-id: 20240209-msm8996-fix-ufs-f80ae6d4d8cf
Best regards,
--
Dmitry Baryshkov <dmitry.baryshkov(a)linaro.org>
While connecting to a Linux host with CDC_NCM_NTB_DEF_SIZE_TX
set to 65536, it has been observed that we receive short packets,
which come at interval of 5-10 seconds sometimes and have block
length zero but still contain 1-2 valid datagrams present.
According to the NCM spec:
"If wBlockLength = 0x0000, the block is terminated by a
short packet. In this case, the USB transfer must still
be shorter than dwNtbInMaxSize or dwNtbOutMaxSize. If
exactly dwNtbInMaxSize or dwNtbOutMaxSize bytes are sent,
and the size is a multiple of wMaxPacketSize for the
given pipe, then no ZLP shall be sent.
wBlockLength= 0x0000 must be used with extreme care, because
of the possibility that the host and device may get out of
sync, and because of test issues.
wBlockLength = 0x0000 allows the sender to reduce latency by
starting to send a very large NTB, and then shortening it when
the sender discovers that there’s not sufficient data to justify
sending a large NTB"
However, there is a potential issue with the current implementation,
as it checks for the occurrence of multiple NTBs in a single
giveback by verifying if the leftover bytes to be processed is zero
or not. If the block length reads zero, we would process the same
NTB infintely because the leftover bytes is never zero and it leads
to a crash. Fix this by bailing out if block length reads zero.
Cc: <stable(a)vger.kernel.org>
Fixes: 427694cfaafa ("usb: gadget: ncm: Handle decoding of multiple NTB's in unwrap call")
Signed-off-by: Krishna Kurapati <quic_kriskura(a)quicinc.com>
---
Changes in v2:
Removed goto label
Link to v1:
https://lore.kernel.org/all/20240226112815.2616719-1-quic_kriskura@quicinc.…
drivers/usb/gadget/function/f_ncm.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/usb/gadget/function/f_ncm.c b/drivers/usb/gadget/function/f_ncm.c
index e2a059cfda2c..28f4e6552e84 100644
--- a/drivers/usb/gadget/function/f_ncm.c
+++ b/drivers/usb/gadget/function/f_ncm.c
@@ -1346,7 +1346,7 @@ static int ncm_unwrap_ntb(struct gether *port,
if (to_process == 1 &&
(*(unsigned char *)(ntb_ptr + block_len) == 0x00)) {
to_process--;
- } else if (to_process > 0) {
+ } else if ((to_process > 0) && (block_len != 0)) {
ntb_ptr = (unsigned char *)(ntb_ptr + block_len);
goto parse_ntb;
}
--
2.34.1
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.4.y
git checkout FETCH_HEAD
git cherry-pick -x f1796544a0ca0f14386a679d3d05fbc69235015e
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024022759-crave-busily-bef7@gregkh' --subject-prefix 'PATCH 5.4.y' HEAD^..
Possible dependencies:
f1796544a0ca ("memcg: fix use-after-free in uncharge_batch")
1a3e1f40962c ("mm: memcontrol: decouple reference counting from page accounting")
8d22a9351035 ("mm/memcg: fix refcount error while moving and swapping")
d9eb1ea2bf87 ("mm: memcontrol: delete unused lrucare handling")
4c6355b25e8b ("mm: memcontrol: charge swapin pages on instantiation")
f0e45fb4da29 ("mm: memcontrol: drop unused try/commit/cancel charge API")
9d82c69438d0 ("mm: memcontrol: convert anon and file-thp to new mem_cgroup_charge() API")
468c398233da ("mm: memcontrol: switch to native NR_ANON_THPS counter")
be5d0a74c62d ("mm: memcontrol: switch to native NR_ANON_MAPPED counter")
0d1c20722ab3 ("mm: memcontrol: switch to native NR_FILE_PAGES and NR_SHMEM counters")
49e50d277ba2 ("mm: memcontrol: prepare move_account for removal of private page type counters")
9f762dbe19b9 ("mm: memcontrol: prepare uncharging for removal of private page type counters")
3fea5a499d57 ("mm: memcontrol: convert page cache to a new mem_cgroup_charge() API")
6caa6a0703e0 ("mm: memcontrol: move out cgroup swaprate throttling")
14235ab36019 ("mm: shmem: remove rare optimization when swapin races with hole punching")
3fba69a56e16 ("mm: memcontrol: drop @compound parameter from memcg charging API")
abb242f57196 ("mm: memcontrol: fix stat-corrupting race in charge moving")
f4129ea3591a ("mm: fix NUMA node file count error in replace_page_cache()")
ffe945e633b5 ("khugepaged: do not stop collapse if less than half PTEs are referenced")
396bcc5299c2 ("mm: remove CONFIG_TRANSPARENT_HUGE_PAGECACHE")
85b9f46e8ea4 ("mm, thp: track fallbacks due to failed memcg charges separately")
dcdf11ee1441 ("mm, shmem: add vmstat for hugepage fallback")
9c315e4d7d8c ("mm: memcg/slab: cache page number in memcg_(un)charge_slab()")
92d0510c3585 ("mm: kmem: switch to nr_pages in (__)memcg_kmem_charge_memcg()")
f4b00eab5004 ("mm: kmem: rename memcg_kmem_(un)charge() into memcg_kmem_(un)charge_page()")
50591183fa86 ("mm: kmem: cleanup memcg_kmem_uncharge_memcg() arguments")
10eaec2f63b6 ("mm: kmem: cleanup (__)memcg_kmem_charge_memcg() arguments")
47e29d32afba ("mm/gup: page->hpage_pinned_refcount: exact pin counts for huge pages")
3faa52c03f44 ("mm/gup: track FOLL_PIN pages")
3b78d8347d31 ("mm/gup: pass gup flags to two more routines")
c23a0c99793f ("mm/migrate: clean up some minor coding style")
92855270ff08 ("mm/memcontrol.c: cleanup some useless code")
f1f6a7dd9b53 ("mm, tree-wide: rename put_user_page*() to unpin_user_page*()")
aa4b87fe9ea3 ("powerpc: book3s64: convert to pin_user_pages() and put_user_page()")
19fed0dae94d ("vfio, mm: pin_user_pages (FOLL_PIN) and put_user_page() conversion")
1f815afcfca7 ("media/v4l2-core: pin_user_pages (FOLL_PIN) and put_user_page() conversion")
803e4572d7c5 ("mm/process_vm_access: set FOLL_PIN via pin_user_pages_remote()")
57459435cff5 ("goldish_pipe: convert to pin_user_pages() and put_user_page()")
eddb1c228f79 ("mm/gup: introduce pin_user_pages*() and FOLL_PIN")
3c7470b6f684 ("media/v4l2-core: set pages dirty upon releasing DMA buffers")
f4000fdf435b ("mm/gup: allow FOLL_FORCE for get_user_pages_fast()")
3567813eae5e ("vfio: fix FOLL_LONGTERM use, simplify get_user_pages_remote() call")
c4237f8b1f4f ("mm: fix get_user_pages_remote()'s handling of FOLL_LONGTERM")
a707cdd55f0f ("mm/gup: move try_get_compound_head() to top, fix minor issues")
a43e982082c2 ("mm/gup: factor out duplicate code from four routines")
fac0516b5534 ("mm: thp: don't need care deferred split queue in memcg charge move path")
f1fe80d4ae33 ("mm, thp: do not queue fully unmapped pages for deferred split")
acbfb087e3b1 ("mm/hugetlb: avoid looping to the same hugepage if !pages and !vmas")
867e5e1de14b ("mm: clean up and clarify lruvec lookup procedure")
242c37b459ce ("include/linux/memcontrol.h: fix comments based on per-node memcg")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From f1796544a0ca0f14386a679d3d05fbc69235015e Mon Sep 17 00:00:00 2001
From: Michal Hocko <mhocko(a)suse.com>
Date: Fri, 4 Sep 2020 16:35:24 -0700
Subject: [PATCH] memcg: fix use-after-free in uncharge_batch
syzbot has reported an use-after-free in the uncharge_batch path
BUG: KASAN: use-after-free in instrument_atomic_write include/linux/instrumented.h:71 [inline]
BUG: KASAN: use-after-free in atomic64_sub_return include/asm-generic/atomic-instrumented.h:970 [inline]
BUG: KASAN: use-after-free in atomic_long_sub_return include/asm-generic/atomic-long.h:113 [inline]
BUG: KASAN: use-after-free in page_counter_cancel mm/page_counter.c:54 [inline]
BUG: KASAN: use-after-free in page_counter_uncharge+0x3d/0xc0 mm/page_counter.c:155
Write of size 8 at addr ffff8880371c0148 by task syz-executor.0/9304
CPU: 0 PID: 9304 Comm: syz-executor.0 Not tainted 5.8.0-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
__dump_stack lib/dump_stack.c:77 [inline]
dump_stack+0x1f0/0x31e lib/dump_stack.c:118
print_address_description+0x66/0x620 mm/kasan/report.c:383
__kasan_report mm/kasan/report.c:513 [inline]
kasan_report+0x132/0x1d0 mm/kasan/report.c:530
check_memory_region_inline mm/kasan/generic.c:183 [inline]
check_memory_region+0x2b5/0x2f0 mm/kasan/generic.c:192
instrument_atomic_write include/linux/instrumented.h:71 [inline]
atomic64_sub_return include/asm-generic/atomic-instrumented.h:970 [inline]
atomic_long_sub_return include/asm-generic/atomic-long.h:113 [inline]
page_counter_cancel mm/page_counter.c:54 [inline]
page_counter_uncharge+0x3d/0xc0 mm/page_counter.c:155
uncharge_batch+0x6c/0x350 mm/memcontrol.c:6764
uncharge_page+0x115/0x430 mm/memcontrol.c:6796
uncharge_list mm/memcontrol.c:6835 [inline]
mem_cgroup_uncharge_list+0x70/0xe0 mm/memcontrol.c:6877
release_pages+0x13a2/0x1550 mm/swap.c:911
tlb_batch_pages_flush mm/mmu_gather.c:49 [inline]
tlb_flush_mmu_free mm/mmu_gather.c:242 [inline]
tlb_flush_mmu+0x780/0x910 mm/mmu_gather.c:249
tlb_finish_mmu+0xcb/0x200 mm/mmu_gather.c:328
exit_mmap+0x296/0x550 mm/mmap.c:3185
__mmput+0x113/0x370 kernel/fork.c:1076
exit_mm+0x4cd/0x550 kernel/exit.c:483
do_exit+0x576/0x1f20 kernel/exit.c:793
do_group_exit+0x161/0x2d0 kernel/exit.c:903
get_signal+0x139b/0x1d30 kernel/signal.c:2743
arch_do_signal+0x33/0x610 arch/x86/kernel/signal.c:811
exit_to_user_mode_loop kernel/entry/common.c:135 [inline]
exit_to_user_mode_prepare+0x8d/0x1b0 kernel/entry/common.c:166
syscall_exit_to_user_mode+0x5e/0x1a0 kernel/entry/common.c:241
entry_SYSCALL_64_after_hwframe+0x44/0xa9
Commit 1a3e1f40962c ("mm: memcontrol: decouple reference counting from
page accounting") reworked the memcg lifetime to be bound the the struct
page rather than charges. It also removed the css_put_many from
uncharge_batch and that is causing the above splat.
uncharge_batch() is supposed to uncharge accumulated charges for all
pages freed from the same memcg. The queuing is done by uncharge_page
which however drops the memcg reference after it adds charges to the
batch. If the current page happens to be the last one holding the
reference for its memcg then the memcg is OK to go and the next page to
be freed will trigger batched uncharge which needs to access the memcg
which is gone already.
Fix the issue by taking a reference for the memcg in the current batch.
Fixes: 1a3e1f40962c ("mm: memcontrol: decouple reference counting from page accounting")
Reported-by: syzbot+b305848212deec86eabe(a)syzkaller.appspotmail.com
Reported-by: syzbot+b5ea6fb6f139c8b9482b(a)syzkaller.appspotmail.com
Signed-off-by: Michal Hocko <mhocko(a)suse.com>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
Reviewed-by: Shakeel Butt <shakeelb(a)google.com>
Acked-by: Johannes Weiner <hannes(a)cmpxchg.org>
Cc: Roman Gushchin <guro(a)fb.com>
Cc: Hugh Dickins <hughd(a)google.com>
Link: https://lkml.kernel.org/r/20200820090341.GC5033@dhcp22.suse.cz
Signed-off-by: Linus Torvalds <torvalds(a)linux-foundation.org>
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index b807952b4d43..cfa6cbad21d5 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -6774,6 +6774,9 @@ static void uncharge_batch(const struct uncharge_gather *ug)
__this_cpu_add(ug->memcg->vmstats_percpu->nr_page_events, ug->nr_pages);
memcg_check_events(ug->memcg, ug->dummy_page);
local_irq_restore(flags);
+
+ /* drop reference from uncharge_page */
+ css_put(&ug->memcg->css);
}
static void uncharge_page(struct page *page, struct uncharge_gather *ug)
@@ -6797,6 +6800,9 @@ static void uncharge_page(struct page *page, struct uncharge_gather *ug)
uncharge_gather_clear(ug);
}
ug->memcg = page->mem_cgroup;
+
+ /* pairs with css_put in uncharge_batch */
+ css_get(&ug->memcg->css);
}
nr_pages = compound_nr(page);
ODI DFP-34X-2C2 is capable of 2500base-X, but incorrectly report its
capabilities in the EEPROM.
So use sfp_quirk_2500basex for this module to allow 2500Base-X mode.
Signed-off-by: Shengyu Qu <wiagn233(a)outlook.com>
---
drivers/net/phy/sfp.c | 7 +++++++
1 file changed, 7 insertions(+)
diff --git a/drivers/net/phy/sfp.c b/drivers/net/phy/sfp.c
index f75c9eb3958e..2021cb4ff2f6 100644
--- a/drivers/net/phy/sfp.c
+++ b/drivers/net/phy/sfp.c
@@ -495,6 +495,13 @@ static const struct sfp_quirk sfp_quirks[] = {
// 2500MBd NRZ in their EEPROM
SFP_QUIRK_M("Lantech", "8330-262D-E", sfp_quirk_2500basex),
+ // ODI DFP-34X-2C2 can operate at 2500base-X, but incorrectly report 1300MBd
+ // NRZ in the EEPROM.
+ // Besides, In early batches, vendor id is set to OEM, but that is fixed in
+ // newer batches.
+ SFP_QUIRK_M("ODI", "DFP-34X-2C2", sfp_quirk_2500basex),
+ SFP_QUIRK_M("OEM", "DFP-34X-2C2", sfp_quirk_2500basex),
+
SFP_QUIRK_M("UBNT", "UF-INSTANT", sfp_quirk_ubnt_uf_instant),
// Walsun HXSX-ATR[CI]-1 don't identify as copper, and use the
--
2.39.2
Currently xhci_map_urb_for_dma() creates a temporary buffer
and copies the SG list to the new linear buffer. But if the
kzalloc_node() fails, then the following sg_pcopy_to_buffer()
can lead to crash since it tries to memcpy to NULL pointer.
So return -EAGAIN if kzalloc returns null pointer.
Cc: <stable(a)vger.kernel.org> # 5.11
Fixes: 2017a1e58472 ("usb: xhci: Use temporary buffer to consolidate SG")
Signed-off-by: Prashanth K <quic_prashk(a)quicinc.com>
---
drivers/usb/host/xhci.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/drivers/usb/host/xhci.c b/drivers/usb/host/xhci.c
index c057c42c36f4..0597a60bec34 100644
--- a/drivers/usb/host/xhci.c
+++ b/drivers/usb/host/xhci.c
@@ -1218,6 +1218,9 @@ static int xhci_map_temp_buffer(struct usb_hcd *hcd, struct urb *urb)
temp = kzalloc_node(buf_len, GFP_ATOMIC,
dev_to_node(hcd->self.sysdev));
+ if (!temp)
+ return -EAGAIN;
+
if (usb_urb_dir_out(urb))
sg_pcopy_to_buffer(urb->sg, urb->num_sgs,
temp, buf_len, 0);
--
2.25.1
After reset the VFIO device state will always be put in
VFIO_DEVICE_STATE_RUNNING, but the save/restore files will only be
cleared if the previous state was VFIO_DEVICE_STATE_ERROR. This
can/will cause the restore/save files to be leaked if/when the
migration state machine transitions through the states that
re-allocates these files. Fix this by always clearing the
restore/save files for resets.
Fixes: 7dabb1bcd177 ("vfio/pds: Add support for firmware recovery")
Cc: stable(a)vger.kernel.org
Signed-off-by: Brett Creeley <brett.creeley(a)amd.com>
Reviewed-by: Shannon Nelson <shannon.nelson(a)amd.com>
---
drivers/vfio/pci/pds/vfio_dev.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/vfio/pci/pds/vfio_dev.c b/drivers/vfio/pci/pds/vfio_dev.c
index 4c351c59d05a..a286ebcc7112 100644
--- a/drivers/vfio/pci/pds/vfio_dev.c
+++ b/drivers/vfio/pci/pds/vfio_dev.c
@@ -32,9 +32,9 @@ void pds_vfio_state_mutex_unlock(struct pds_vfio_pci_device *pds_vfio)
mutex_lock(&pds_vfio->reset_mutex);
if (pds_vfio->deferred_reset) {
pds_vfio->deferred_reset = false;
+ pds_vfio_put_restore_file(pds_vfio);
+ pds_vfio_put_save_file(pds_vfio);
if (pds_vfio->state == VFIO_DEVICE_STATE_ERROR) {
- pds_vfio_put_restore_file(pds_vfio);
- pds_vfio_put_save_file(pds_vfio);
pds_vfio_dirty_disable(pds_vfio, false);
}
pds_vfio->state = pds_vfio->deferred_reset_state;
--
2.17.1
Since kernel version 5.4.250 LTS, there has been an issue with the kernel live patching feature becoming unavailable. When compiling the sample code for kernel live patching, the following message is displayed when enabled:
livepatch: klp_check_stack: kworker/u256:6:23490 has an unreliable stack
After investigation, it was found that this is due to objtool not supporting intra-function calls, resulting in incorrect orc entry generation.
This patchset adds support for intra-function calls, allowing the kernel live patching feature to work correctly.
Alexandre Chartre (2):
objtool: is_fentry_call() crashes if call has no destination
objtool: Add support for intra-function calls
Rui Qi (1):
x86/speculation: Support intra-function call validation
arch/x86/include/asm/nospec-branch.h | 7 ++
include/linux/frame.h | 11 ++++
.../Documentation/stack-validation.txt | 8 +++
tools/objtool/arch/x86/decode.c | 6 ++
tools/objtool/check.c | 64 +++++++++++++++++--
5 files changed, 91 insertions(+), 5 deletions(-)
--
2.39.2 (Apple Git-143)
Supersedes: <20240130180400.1698136-1-pbonzini(a)redhat.com>
MKTME repurposes the high bit of physical address to key id for encryption
key and, even though MAXPHYADDR in CPUID[0x80000008] remains the same,
the valid bits in the MTRR mask register are based on the reduced number
of physical address bits. This breaks boot on machines that have TME enabled
and do something to cleanup MTRRs, unless "disable_mtrr_cleanup" is
passed on the command line. The fix is to move the check to early CPU
initialization, which runs before Linux sets up MTRRs.
However, as noticed by Kirill, the patch I sent as v1 actually works only
until Linux 6.6. In Linux 6.7, commit fbf6449f84bf ("x86/sev-es: Set
x86_virt_bits to the correct value straight away, instead of a two-phase
approach") reorganized the initialization of c->x86_phys_bits in a way
that broke the patch. But even in 6.7 AMD processors, which did try to
reduce it in this_cpu->c_early_init(c), had their x86_phys_bits value
overwritten by get_cpu_address_sizes(), so that early_identify_cpu()
left the wrong value in x86_phys_bits. This probably went unnoticed
because on AMD processors you need not apply the reduced MAXPHYADDR to
MTRR masks.
Therefore, this v2 prepends the fix for this issue in commit fbf6449f84bf.
Apologies for the oversight.
Tested on an AMD Epyc machine (where I resorted to dumping mtrr_state) and
on the problematic Intel Emerald Rapids machine.
Thanks,
Paolo
Paolo Bonzini (2):
x86/cpu: allow reducing x86_phys_bits during early_identify_cpu()
x86/cpu/intel: Detect TME keyid bits before setting MTRR mask
registers
arch/x86/kernel/cpu/common.c | 4 +-
arch/x86/kernel/cpu/intel.c | 178 ++++++++++++++++++-----------------
2 files changed, 93 insertions(+), 89 deletions(-)
--
2.43.0
While commit 69f89168b310 ("usb: typec: tpcm: Fix issues with power being
removed during reset") fixes the boot issues for bus powered devices such
as LibreTech Renegade Elite/Firefly, it trades off the CC pins NOT being
Hi-Zed during errory recovery (i.e PORT_RESET) for devices which are NOT
bus powered(a.k.a self powered). This change Hi-Zs the CC pins only for
self powered devices, thus preventing brown out for bus powered devices
Adhering to spec is gaining more importance due to the Common charger
initiative enforced by the European Union.
Quoting from the spec:
4.5.2.2.2.1 ErrorRecovery State Requirements
The port shall not drive VBUS or VCONN, and shall present a
high-impedance to ground (above zOPEN) on its CC1 and CC2 pins.
Hi-Zing the CC pins is the inteded behavior for PORT_RESET.
CC pins are set to default state after tErrorRecovery in
PORT_RESET_WAIT_OFF.
4.5.2.2.2.2 Exiting From ErrorRecovery State
A Sink shall transition to Unattached.SNK after tErrorRecovery.
A Source shall transition to Unattached.SRC after tErrorRecovery.
Fixes: 69f89168b310 ("usb: typec: tpcm: Fix issues with power being removed during reset")
Cc: stable(a)kernel.org
Cc: Mark Brown <broonie(a)kernel.org>
Signed-off-by: Badhri Jagan Sridharan <badhri(a)google.com>
---
drivers/usb/typec/tcpm/tcpm.c | 7 +++++--
1 file changed, 5 insertions(+), 2 deletions(-)
diff --git a/drivers/usb/typec/tcpm/tcpm.c b/drivers/usb/typec/tcpm/tcpm.c
index c9a78f55ca48..bbe1381232eb 100644
--- a/drivers/usb/typec/tcpm/tcpm.c
+++ b/drivers/usb/typec/tcpm/tcpm.c
@@ -5593,8 +5593,11 @@ static void run_state_machine(struct tcpm_port *port)
break;
case PORT_RESET:
tcpm_reset_port(port);
- tcpm_set_cc(port, tcpm_default_state(port) == SNK_UNATTACHED ?
- TYPEC_CC_RD : tcpm_rp_cc(port));
+ if (port->self_powered)
+ tcpm_set_cc(port, TYPEC_CC_OPEN);
+ else
+ tcpm_set_cc(port, tcpm_default_state(port) == SNK_UNATTACHED ?
+ TYPEC_CC_RD : tcpm_rp_cc(port));
tcpm_set_state(port, PORT_RESET_WAIT_OFF,
PD_T_ERROR_RECOVERY);
break;
base-commit: a560a5672826fc1e057068bda93b3d4c98d037a2
--
2.44.0.rc1.240.g4c46232300-goog
On Tue, Feb 27, 2024 at 10:14 AM Kairui Song <ryncsn(a)gmail.com> wrote:
>
> On Wed, Feb 21, 2024 at 12:32 AM Chris Li <chrisl(a)kernel.org> wrote:
> >
> > On Mon, Feb 19, 2024 at 8:56 PM Kairui Song <ryncsn(a)gmail.com> wrote:
> >
> > >
> > > Hi Barry,
> > >
> > > > it might not be a problem for throughput. but for real-time and tail latency,
> > > > this hurts. For example, this might increase dropping frames of UI which
> > > > is an important parameter to evaluate performance :-)
> > > >
> > >
> > > That's a true issue, as Chris mentioned before I think we need to
> > > think of some clever data struct to solve this more naturally in the
> > > future, similar issue exists for cached swapin as well and it has been
> > > there for a while. On the other hand I think maybe applications that
> > > are extremely latency sensitive should try to avoid swap on fault? A
> > > swapin could cause other issues like reclaim, throttled or contention
> > > with many other things, these seem to have a higher chance than this
> > > race.
> >
> >
> > Yes, I do think the best long term solution is to have some clever
> > data structure to solve the synchronization issue and allow racing
> > threads to make forward progress at the same time.
> >
> > I have also explored some (failed) synchronization ideas, for example
> > having the run time swap entry refcount separate from swap_map count.
> > BTW, zswap entry->refcount behaves like that, it is separate from swap
> > entry and manages the temporary run time usage count held by the
> > function. However that idea has its own problem as well, it needs to
> > have an xarray to track the swap entry run time refcount (only stored
> > in the xarray when CPU fails to get SWAP_HAS_CACHE bit.) When we are
> > done with page faults, we still need to look up the xarray to make
> > sure there is no racing CPU and put the refcount into the xarray. That
> > kind of defeats the purpose of avoiding the swap cache in the first
> > place. We still need to do the xarray lookup in the normal path.
> >
> > I came to realize that, while this current fix is not perfect, (I
> > still wish we had a better solution not pausing the racing CPU). This
> > patch stands better than not fixing this data corruption issue and the
> > patch remains relatively simple. Yes it has latency issues but still
> > better than data corruption. It also doesn't stop us from coming up
> > with better solutions later on. If we want to address the
> > synchronization in a way not blocking other CPUs, it will likely
> > require a much bigger change.
> >
> > Unless we have a better suggestion. It seems the better one among the
> > alternatives so far.
> >
>
> Hi,
>
> Thanks for the comments. I've been trying some ideas locally, I think a simple and straight solution exists: We just don't skip the swap cache xarray.
Yes, I have been pondering about that as well.
Notice in __read_swap_cache_async(), it has a similar
"schedule_timeout_uninterruptible(1)" when swapcache_prepare(entry)
fails to grab the SWAP_HAS_CACHE bit. So falling back to use the swap
cache does not automatically solve the latency issue. Similar delay
exists in the swap cache case as well.
> The current reason we are skipping it is for performance, but with some optimization, the performance should be as good as skipping it (in current behavior). Notice even in the swap cache bypass path, we need to do one lookup, and one modify (delete the shadow). That can't be skipped. So the usage of swap cache can be better organized and optimized.
> After all swapin makes use of swap cache, swapin can insert the folio in swap cache xarray first, then set swap map cache bit. I'm thinking about reusing the folio lock, or having an intermediate value in xarray, so raced swapins can wait properly. There are some tricky parts syncing with swap maps though.
Inserting the swap cache xarray first and setting SWAP_HAS_CACHE bit
later will need more audit on the race. I assume you take the swap
device/cluster lock before folio insert into swap cache xarray?
Chris
>
> Currently working on a series, will send in a few weeks if it works.
In raid5_cache_count():
if (conf->max_nr_stripes < conf->min_nr_stripes)
return 0;
return conf->max_nr_stripes - conf->min_nr_stripes;
The current check is ineffective, as the values could change immediately
after being checked.
In raid5_set_cache_size():
...
conf->min_nr_stripes = size;
...
while (size > conf->max_nr_stripes)
conf->min_nr_stripes = conf->max_nr_stripes;
...
Due to intermediate value updates in raid5_set_cache_size(), concurrent
execution of raid5_cache_count() and raid5_set_cache_size() may lead to
inconsistent reads of conf->max_nr_stripes and conf->min_nr_stripes.
The current checks are ineffective as values could change immediately
after being checked, raising the risk of conf->min_nr_stripes exceeding
conf->max_nr_stripes and potentially causing an integer overflow.
This possible bug is found by an experimental static analysis tool
developed by our team. This tool analyzes the locking APIs to extract
function pairs that can be concurrently executed, and then analyzes the
instructions in the paired functions to identify possible concurrency bugs
including data races and atomicity violations. The above possible bug is
reported when our tool analyzes the source code of Linux 6.2.
To resolve this issue, it is suggested to introduce local variables
'min_stripes' and 'max_stripes' in raid5_cache_count() to ensure the
values remain stable throughout the check. Adding locks in
raid5_cache_count() fails to resolve atomicity violations, as
raid5_set_cache_size() may hold intermediate values of
conf->min_nr_stripes while unlocked. With this patch applied, our tool no
longer reports the bug, with the kernel configuration allyesconfig for
x86_64. Due to the lack of associated hardware, we cannot test the patch
in runtime testing, and just verify it according to the code logic.
Fixes: edbe83ab4c27 ("md/raid5: allow the stripe_cache to grow and shrink.")
Cc: stable(a)vger.kernel.org
Signed-off-by: Gui-Dong Han <2045gemini(a)gmail.com>
---
v2:
* In this patch v2, we've updated to use READ_ONCE() instead of direct
reads for accessing max_nr_stripes and min_nr_stripes, since read and
write can concurrent.
Thank Yu Kuai for helpful advice.
---
v3:
* In this patch v3, we've updated to use WRITE_ONCE() in
raid5_set_cache_size(), grow_one_stripe() and drop_one_stripe(), in order
to pair READ_ONCE() with WRITE_ONCE().
Thank Yu Kuai for helpful advice.
---
v4:
* In this patch v4, we've addressed several code style issues.
Thank Yu Kuai for helpful advice.
---
drivers/md/raid5.c | 14 ++++++++------
1 file changed, 8 insertions(+), 6 deletions(-)
diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
index 8497880135ee..30e118d10c0b 100644
--- a/drivers/md/raid5.c
+++ b/drivers/md/raid5.c
@@ -2412,7 +2412,7 @@ static int grow_one_stripe(struct r5conf *conf, gfp_t gfp)
atomic_inc(&conf->active_stripes);
raid5_release_stripe(sh);
- conf->max_nr_stripes++;
+ WRITE_ONCE(conf->max_nr_stripes, conf->max_nr_stripes + 1);
return 1;
}
@@ -2707,7 +2707,7 @@ static int drop_one_stripe(struct r5conf *conf)
shrink_buffers(sh);
free_stripe(conf->slab_cache, sh);
atomic_dec(&conf->active_stripes);
- conf->max_nr_stripes--;
+ WRITE_ONCE(conf->max_nr_stripes, conf->max_nr_stripes - 1);
return 1;
}
@@ -6820,7 +6820,7 @@ raid5_set_cache_size(struct mddev *mddev, int size)
if (size <= 16 || size > 32768)
return -EINVAL;
- conf->min_nr_stripes = size;
+ WRITE_ONCE(conf->min_nr_stripes, size);
mutex_lock(&conf->cache_size_mutex);
while (size < conf->max_nr_stripes &&
drop_one_stripe(conf))
@@ -6832,7 +6832,7 @@ raid5_set_cache_size(struct mddev *mddev, int size)
mutex_lock(&conf->cache_size_mutex);
while (size > conf->max_nr_stripes)
if (!grow_one_stripe(conf, GFP_KERNEL)) {
- conf->min_nr_stripes = conf->max_nr_stripes;
+ WRITE_ONCE(conf->min_nr_stripes, conf->max_nr_stripes);
result = -ENOMEM;
break;
}
@@ -7390,11 +7390,13 @@ static unsigned long raid5_cache_count(struct shrinker *shrink,
struct shrink_control *sc)
{
struct r5conf *conf = shrink->private_data;
+ int max_stripes = READ_ONCE(conf->max_nr_stripes);
+ int min_stripes = READ_ONCE(conf->min_nr_stripes);
- if (conf->max_nr_stripes < conf->min_nr_stripes)
+ if (max_stripes < min_stripes)
/* unlikely, but not impossible */
return 0;
- return conf->max_nr_stripes - conf->min_nr_stripes;
+ return max_stripes - min_stripes;
}
static struct r5conf *setup_conf(struct mddev *mddev)
--
2.34.1
The patch below does not apply to the 6.1-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.1.y
git checkout FETCH_HEAD
git cherry-pick -x 045e9d812868a2d80b7a57b224ce8009444b7bbc
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024022601-footwork-fastness-bcab@gregkh' --subject-prefix 'PATCH 6.1.y' HEAD^..
Possible dependencies:
045e9d812868 ("mptcp: fix duplicate subflow creation")
b9d69db87fb7 ("mptcp: let the in-kernel PM use mixed IPv4 and IPv6 addresses")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 045e9d812868a2d80b7a57b224ce8009444b7bbc Mon Sep 17 00:00:00 2001
From: Paolo Abeni <pabeni(a)redhat.com>
Date: Thu, 15 Feb 2024 19:25:33 +0100
Subject: [PATCH] mptcp: fix duplicate subflow creation
Fullmesh endpoints could end-up unexpectedly generating duplicate
subflows - same local and remote addresses - when multiple incoming
ADD_ADDR are processed before the PM creates the subflow for the local
endpoints.
Address the issue explicitly checking for duplicates at subflow
creation time.
To avoid a quadratic computational complexity, track the unavailable
remote address ids in a temporary bitmap and initialize such bitmap
with the remote ids of all the existing subflows matching the local
address currently processed.
The above allows additionally replacing the existing code checking
for duplicate entry in the current set with a simple bit test
operation.
Fixes: 2843ff6f36db ("mptcp: remote addresses fullmesh")
Cc: stable(a)vger.kernel.org
Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/435
Signed-off-by: Paolo Abeni <pabeni(a)redhat.com>
Reviewed-by: Mat Martineau <martineau(a)kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe(a)kernel.org>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
diff --git a/net/mptcp/pm_netlink.c b/net/mptcp/pm_netlink.c
index ed6983af1ab2..58d17d9604e7 100644
--- a/net/mptcp/pm_netlink.c
+++ b/net/mptcp/pm_netlink.c
@@ -396,19 +396,6 @@ void mptcp_pm_free_anno_list(struct mptcp_sock *msk)
}
}
-static bool lookup_address_in_vec(const struct mptcp_addr_info *addrs, unsigned int nr,
- const struct mptcp_addr_info *addr)
-{
- int i;
-
- for (i = 0; i < nr; i++) {
- if (addrs[i].id == addr->id)
- return true;
- }
-
- return false;
-}
-
/* Fill all the remote addresses into the array addrs[],
* and return the array size.
*/
@@ -440,6 +427,16 @@ static unsigned int fill_remote_addresses_vec(struct mptcp_sock *msk,
msk->pm.subflows++;
addrs[i++] = remote;
} else {
+ DECLARE_BITMAP(unavail_id, MPTCP_PM_MAX_ADDR_ID + 1);
+
+ /* Forbid creation of new subflows matching existing
+ * ones, possibly already created by incoming ADD_ADDR
+ */
+ bitmap_zero(unavail_id, MPTCP_PM_MAX_ADDR_ID + 1);
+ mptcp_for_each_subflow(msk, subflow)
+ if (READ_ONCE(subflow->local_id) == local->id)
+ __set_bit(subflow->remote_id, unavail_id);
+
mptcp_for_each_subflow(msk, subflow) {
ssk = mptcp_subflow_tcp_sock(subflow);
remote_address((struct sock_common *)ssk, &addrs[i]);
@@ -447,11 +444,17 @@ static unsigned int fill_remote_addresses_vec(struct mptcp_sock *msk,
if (deny_id0 && !addrs[i].id)
continue;
+ if (test_bit(addrs[i].id, unavail_id))
+ continue;
+
if (!mptcp_pm_addr_families_match(sk, local, &addrs[i]))
continue;
- if (!lookup_address_in_vec(addrs, i, &addrs[i]) &&
- msk->pm.subflows < subflows_max) {
+ if (msk->pm.subflows < subflows_max) {
+ /* forbid creating multiple address towards
+ * this id
+ */
+ __set_bit(addrs[i].id, unavail_id);
msk->pm.subflows++;
i++;
}
The patch below does not apply to the 6.1-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.1.y
git checkout FETCH_HEAD
git cherry-pick -x 3f83d8a77eeeb47011b990fd766a421ee64f1d73
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024021911-fragment-yearly-5b45@gregkh' --subject-prefix 'PATCH 6.1.y' HEAD^..
Possible dependencies:
3f83d8a77eee ("mptcp: fix more tx path fields initialization")
013e3179dbd2 ("mptcp: fix rcv space initialization")
c693a8516429 ("mptcp: use mptcp_set_state")
4fd19a307016 ("mptcp: fix inconsistent state on fastopen race")
d109a7767273 ("mptcp: fix possible NULL pointer dereference on close")
8005184fd1ca ("mptcp: refactor sndbuf auto-tuning")
a5efdbcece83 ("mptcp: fix delegated action races")
27e5ccc2d5a5 ("mptcp: fix dangling connection hang-up")
f6909dc1c1f4 ("mptcp: rename timer related helper to less confusing names")
9f1a98813b4b ("mptcp: process pending subflow error on close")
d5fbeff1ab81 ("mptcp: move __mptcp_error_report in protocol.c")
ebc1e08f01eb ("mptcp: drop last_snd and MPTCP_RESET_SCHEDULER")
e263691773cd ("mptcp: Remove unnecessary test for __mptcp_init_sock()")
39880bd808ad ("mptcp: get rid of msk->subflow")
3f326a821b99 ("mptcp: change the mpc check helper to return a sk")
3aa362494170 ("mptcp: avoid ssock usage in mptcp_pm_nl_create_listen_socket()")
f0bc514bd5c1 ("mptcp: avoid additional indirection in sockopt")
40f56d0c7043 ("mptcp: avoid additional indirection in mptcp_listen()")
8cf2ebdc0078 ("mptcp: mptcp: avoid additional indirection in mptcp_bind()")
ccae357c1c6a ("mptcp: avoid additional __inet_stream_connect() call")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 3f83d8a77eeeb47011b990fd766a421ee64f1d73 Mon Sep 17 00:00:00 2001
From: Paolo Abeni <pabeni(a)redhat.com>
Date: Thu, 8 Feb 2024 19:03:51 +0100
Subject: [PATCH] mptcp: fix more tx path fields initialization
The 'msk->write_seq' and 'msk->snd_nxt' are always updated under
the msk socket lock, except at MPC handshake completiont time.
Builds-up on the previous commit to move such init under the relevant
lock.
There are no known problems caused by the potential race, the
primary goal is consistency.
Fixes: 6d0060f600ad ("mptcp: Write MPTCP DSS headers to outgoing data packets")
Cc: stable(a)vger.kernel.org
Signed-off-by: Paolo Abeni <pabeni(a)redhat.com>
Reviewed-by: Mat Martineau <martineau(a)kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe(a)kernel.org>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c
index 7632eafb683b..8cb6a873dae9 100644
--- a/net/mptcp/protocol.c
+++ b/net/mptcp/protocol.c
@@ -3478,10 +3478,8 @@ void mptcp_finish_connect(struct sock *ssk)
* accessing the field below
*/
WRITE_ONCE(msk->local_key, subflow->local_key);
- WRITE_ONCE(msk->write_seq, subflow->idsn + 1);
- WRITE_ONCE(msk->snd_nxt, msk->write_seq);
- WRITE_ONCE(msk->snd_una, msk->write_seq);
- WRITE_ONCE(msk->wnd_end, msk->snd_nxt + tcp_sk(ssk)->snd_wnd);
+ WRITE_ONCE(msk->snd_una, subflow->idsn + 1);
+ WRITE_ONCE(msk->wnd_end, subflow->idsn + 1 + tcp_sk(ssk)->snd_wnd);
mptcp_pm_new_connection(msk, ssk, 0);
}
diff --git a/net/mptcp/subflow.c b/net/mptcp/subflow.c
index 56b2ac2f2f22..c2df34ebcf28 100644
--- a/net/mptcp/subflow.c
+++ b/net/mptcp/subflow.c
@@ -421,12 +421,21 @@ static bool subflow_use_different_dport(struct mptcp_sock *msk, const struct soc
void __mptcp_sync_state(struct sock *sk, int state)
{
+ struct mptcp_subflow_context *subflow;
struct mptcp_sock *msk = mptcp_sk(sk);
+ struct sock *ssk = msk->first;
- __mptcp_propagate_sndbuf(sk, msk->first);
+ subflow = mptcp_subflow_ctx(ssk);
+ __mptcp_propagate_sndbuf(sk, ssk);
if (!msk->rcvspace_init)
- mptcp_rcv_space_init(msk, msk->first);
+ mptcp_rcv_space_init(msk, ssk);
+
if (sk->sk_state == TCP_SYN_SENT) {
+ /* subflow->idsn is always available is TCP_SYN_SENT state,
+ * even for the FASTOPEN scenarios
+ */
+ WRITE_ONCE(msk->write_seq, subflow->idsn + 1);
+ WRITE_ONCE(msk->snd_nxt, msk->write_seq);
mptcp_set_state(sk, state);
sk->sk_state_change(sk);
}
Hi Larry,
> -----Original Message-----
> From: Larry Finger <Larry.Finger(a)gmail.com>
> Sent: Tuesday, February 27, 2024 10:35 AM
> To: Kalle Valo <kvalo(a)kernel.org>
> Cc: Johannes Berg <johannes(a)sipsolutions.net>; linux-wireless(a)vger.kernel.org; Nick Morrow
> <morrownr(a)gmail.com>; Larry Finger <Larry.Finger(a)lwfinger.net>; Ping-Ke Shih <pkshih(a)realtek.com>;
> stable(a)vger.kernel.org
> Subject: [PATCHi V2] wifi: rtw88: Add missing VID/PIDs doe 8811CU and 8821CU
Not sure if "doe" is typo?
>
> From: Nick Morrow <morrownr(a)gmail.com>
>
> Purpose: Add VID/PIDs that are known to be missing for this driver.
> - removed /* 8811CU */ and /* 8821CU */ as they are redundant
> since the file is specific to those chips.
> - removed /* TOTOLINK A650UA v3 */ as the manufacturer. It has a REALTEK
> VID so it may not be specific to this adapter.
>
> Source is
> https://1EHFQ.trk.elasticemail.com/tracking/click?d=I82H0YR_W_h175Lb3Nkb0D8…
> 0SPxd1Olp3PNJEJTqsu4kyqBXayE0BVd_k7uLFvlTe65Syx2uqLUB-UQSfsKKLkuyE-frMZXSCL7q824UG3Oer614GGEeEz-DNEWHh
> 43p_e8oz7OouS6gRBEng0
> Verified and tested.
>
> Signed-off-by: Nick Morrow <morrownr(a)gmail.com>
> Signed-off-by: Larry Finger <Larry.Finger(a)lwfinger.net>
> Acked-by: Ping-Ke Shih <pkshih(a)realtek.com>
>
Did you keep a blank line intentionally?
> Cc: stable(a)vger.kernel.org
commit c9b528c35795b711331ed36dc3dbee90d5812d4e upstream.
This mostly reverts commit 6bd97bf273bd ("ext4: remove redundant
mb_regenerate_buddy()") and reintroduces mb_regenerate_buddy(). Based on
code in mb_free_blocks(), fast commit replay can end up marking as free
blocks that are already marked as such. This causes corruption of the
buddy bitmap so we need to regenerate it in that case.
Reported-by: Jan Kara <jack(a)suse.cz>
Fixes: 6bd97bf273bd ("ext4: remove redundant mb_regenerate_buddy()")
CVE: CVE-2024-26601
Signed-off-by: Baokun Li <libaokun1(a)huawei.com>
Reviewed-by: Jan Kara <jack(a)suse.cz>
Link: https://lore.kernel.org/r/20240104142040.2835097-4-libaokun1@huawei.com
Signed-off-by: Theodore Ts'o <tytso(a)mit.edu>
Signed-off-by: Baokun Li <libaokun1(a)huawei.com>
---
fs/ext4/mballoc.c | 20 ++++++++++++++++++++
1 file changed, 20 insertions(+)
diff --git a/fs/ext4/mballoc.c b/fs/ext4/mballoc.c
index 762c2f8b5b2a..63e4c3b9e608 100644
--- a/fs/ext4/mballoc.c
+++ b/fs/ext4/mballoc.c
@@ -1168,6 +1168,24 @@ void ext4_mb_generate_buddy(struct super_block *sb,
mb_update_avg_fragment_size(sb, grp);
}
+static void mb_regenerate_buddy(struct ext4_buddy *e4b)
+{
+ int count;
+ int order = 1;
+ void *buddy;
+
+ while ((buddy = mb_find_buddy(e4b, order++, &count)))
+ ext4_set_bits(buddy, 0, count);
+
+ e4b->bd_info->bb_fragments = 0;
+ memset(e4b->bd_info->bb_counters, 0,
+ sizeof(*e4b->bd_info->bb_counters) *
+ (e4b->bd_sb->s_blocksize_bits + 2));
+
+ ext4_mb_generate_buddy(e4b->bd_sb, e4b->bd_buddy,
+ e4b->bd_bitmap, e4b->bd_group, e4b->bd_info);
+}
+
/* The buddy information is attached the buddy cache inode
* for convenience. The information regarding each group
* is loaded via ext4_mb_load_buddy. The information involve
@@ -1846,6 +1864,8 @@ static void mb_free_blocks(struct inode *inode, struct ext4_buddy *e4b,
ext4_mark_group_bitmap_corrupted(
sb, e4b->bd_group,
EXT4_GROUP_INFO_BBITMAP_CORRUPT);
+ } else {
+ mb_regenerate_buddy(e4b);
}
goto done;
}
--
2.31.1
From: Conor Dooley <conor.dooley(a)microchip.com>
On RISC-V, and presumably x86/arm64, if CFI_CLANG is enabled loading a
rust module will trigger a kernel panic. Support for sanitisers,
including kcfi (CFI_CLANG), is in the works, but for now they're
nightly-only options in rustc. Make RUST depend on !CFI_CLANG to prevent
configuring a kernel without symmetrical support for kfi.
Fixes: 2f7ab1267dc9 ("Kbuild: add Rust support")
cc: stable(a)vger.kernel.org
Signed-off-by: Conor Dooley <conor.dooley(a)microchip.com>
---
This probably needs to go to stable. The correct fixes tag for that I am
not sure of however, but since CFI_CLANG predates RUST, I blamed the
commit adding rust support.
---
init/Kconfig | 1 +
1 file changed, 1 insertion(+)
diff --git a/init/Kconfig b/init/Kconfig
index 8d4e836e1b6b..6cf05824859e 100644
--- a/init/Kconfig
+++ b/init/Kconfig
@@ -1895,6 +1895,7 @@ config RUST
bool "Rust support"
depends on HAVE_RUST
depends on RUST_IS_AVAILABLE
+ depends on !CFI_CLANG
depends on !MODVERSIONS
depends on !GCC_PLUGINS
depends on !RANDSTRUCT
--
2.43.0
commit c9b528c35795b711331ed36dc3dbee90d5812d4e upstream.
This mostly reverts commit 6bd97bf273bd ("ext4: remove redundant
mb_regenerate_buddy()") and reintroduces mb_regenerate_buddy(). Based on
code in mb_free_blocks(), fast commit replay can end up marking as free
blocks that are already marked as such. This causes corruption of the
buddy bitmap so we need to regenerate it in that case.
Reported-by: Jan Kara <jack(a)suse.cz>
Fixes: 6bd97bf273bd ("ext4: remove redundant mb_regenerate_buddy()")
CVE: CVE-2024-26601
Signed-off-by: Baokun Li <libaokun1(a)huawei.com>
Reviewed-by: Jan Kara <jack(a)suse.cz>
Link: https://lore.kernel.org/r/20240104142040.2835097-4-libaokun1@huawei.com
Signed-off-by: Theodore Ts'o <tytso(a)mit.edu>
Signed-off-by: Baokun Li <libaokun1(a)huawei.com>
---
fs/ext4/mballoc.c | 20 ++++++++++++++++++++
1 file changed, 20 insertions(+)
diff --git a/fs/ext4/mballoc.c b/fs/ext4/mballoc.c
index 9bec75847b85..5799706e20cc 100644
--- a/fs/ext4/mballoc.c
+++ b/fs/ext4/mballoc.c
@@ -823,6 +823,24 @@ void ext4_mb_generate_buddy(struct super_block *sb,
atomic64_add(period, &sbi->s_mb_generation_time);
}
+static void mb_regenerate_buddy(struct ext4_buddy *e4b)
+{
+ int count;
+ int order = 1;
+ void *buddy;
+
+ while ((buddy = mb_find_buddy(e4b, order++, &count)))
+ ext4_set_bits(buddy, 0, count);
+
+ e4b->bd_info->bb_fragments = 0;
+ memset(e4b->bd_info->bb_counters, 0,
+ sizeof(*e4b->bd_info->bb_counters) *
+ (e4b->bd_sb->s_blocksize_bits + 2));
+
+ ext4_mb_generate_buddy(e4b->bd_sb, e4b->bd_buddy,
+ e4b->bd_bitmap, e4b->bd_group, e4b->bd_info);
+}
+
/* The buddy information is attached the buddy cache inode
* for convenience. The information regarding each group
* is loaded via ext4_mb_load_buddy. The information involve
@@ -1505,6 +1523,8 @@ static void mb_free_blocks(struct inode *inode, struct ext4_buddy *e4b,
ext4_mark_group_bitmap_corrupted(
sb, e4b->bd_group,
EXT4_GROUP_INFO_BBITMAP_CORRUPT);
+ } else {
+ mb_regenerate_buddy(e4b);
}
goto done;
}
--
2.31.1
The patch below does not apply to the 4.19-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-4.19.y
git checkout FETCH_HEAD
git cherry-pick -x f1796544a0ca0f14386a679d3d05fbc69235015e
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024022702-ignition-astonish-a4f1@gregkh' --subject-prefix 'PATCH 4.19.y' HEAD^..
Possible dependencies:
f1796544a0ca ("memcg: fix use-after-free in uncharge_batch")
1a3e1f40962c ("mm: memcontrol: decouple reference counting from page accounting")
8d22a9351035 ("mm/memcg: fix refcount error while moving and swapping")
d9eb1ea2bf87 ("mm: memcontrol: delete unused lrucare handling")
4c6355b25e8b ("mm: memcontrol: charge swapin pages on instantiation")
f0e45fb4da29 ("mm: memcontrol: drop unused try/commit/cancel charge API")
9d82c69438d0 ("mm: memcontrol: convert anon and file-thp to new mem_cgroup_charge() API")
468c398233da ("mm: memcontrol: switch to native NR_ANON_THPS counter")
be5d0a74c62d ("mm: memcontrol: switch to native NR_ANON_MAPPED counter")
0d1c20722ab3 ("mm: memcontrol: switch to native NR_FILE_PAGES and NR_SHMEM counters")
49e50d277ba2 ("mm: memcontrol: prepare move_account for removal of private page type counters")
9f762dbe19b9 ("mm: memcontrol: prepare uncharging for removal of private page type counters")
3fea5a499d57 ("mm: memcontrol: convert page cache to a new mem_cgroup_charge() API")
6caa6a0703e0 ("mm: memcontrol: move out cgroup swaprate throttling")
14235ab36019 ("mm: shmem: remove rare optimization when swapin races with hole punching")
3fba69a56e16 ("mm: memcontrol: drop @compound parameter from memcg charging API")
abb242f57196 ("mm: memcontrol: fix stat-corrupting race in charge moving")
f4129ea3591a ("mm: fix NUMA node file count error in replace_page_cache()")
ffe945e633b5 ("khugepaged: do not stop collapse if less than half PTEs are referenced")
396bcc5299c2 ("mm: remove CONFIG_TRANSPARENT_HUGE_PAGECACHE")
85b9f46e8ea4 ("mm, thp: track fallbacks due to failed memcg charges separately")
dcdf11ee1441 ("mm, shmem: add vmstat for hugepage fallback")
9c315e4d7d8c ("mm: memcg/slab: cache page number in memcg_(un)charge_slab()")
92d0510c3585 ("mm: kmem: switch to nr_pages in (__)memcg_kmem_charge_memcg()")
f4b00eab5004 ("mm: kmem: rename memcg_kmem_(un)charge() into memcg_kmem_(un)charge_page()")
50591183fa86 ("mm: kmem: cleanup memcg_kmem_uncharge_memcg() arguments")
10eaec2f63b6 ("mm: kmem: cleanup (__)memcg_kmem_charge_memcg() arguments")
47e29d32afba ("mm/gup: page->hpage_pinned_refcount: exact pin counts for huge pages")
3faa52c03f44 ("mm/gup: track FOLL_PIN pages")
3b78d8347d31 ("mm/gup: pass gup flags to two more routines")
c23a0c99793f ("mm/migrate: clean up some minor coding style")
92855270ff08 ("mm/memcontrol.c: cleanup some useless code")
f1f6a7dd9b53 ("mm, tree-wide: rename put_user_page*() to unpin_user_page*()")
aa4b87fe9ea3 ("powerpc: book3s64: convert to pin_user_pages() and put_user_page()")
19fed0dae94d ("vfio, mm: pin_user_pages (FOLL_PIN) and put_user_page() conversion")
1f815afcfca7 ("media/v4l2-core: pin_user_pages (FOLL_PIN) and put_user_page() conversion")
803e4572d7c5 ("mm/process_vm_access: set FOLL_PIN via pin_user_pages_remote()")
57459435cff5 ("goldish_pipe: convert to pin_user_pages() and put_user_page()")
eddb1c228f79 ("mm/gup: introduce pin_user_pages*() and FOLL_PIN")
3c7470b6f684 ("media/v4l2-core: set pages dirty upon releasing DMA buffers")
f4000fdf435b ("mm/gup: allow FOLL_FORCE for get_user_pages_fast()")
3567813eae5e ("vfio: fix FOLL_LONGTERM use, simplify get_user_pages_remote() call")
c4237f8b1f4f ("mm: fix get_user_pages_remote()'s handling of FOLL_LONGTERM")
a707cdd55f0f ("mm/gup: move try_get_compound_head() to top, fix minor issues")
a43e982082c2 ("mm/gup: factor out duplicate code from four routines")
fac0516b5534 ("mm: thp: don't need care deferred split queue in memcg charge move path")
f1fe80d4ae33 ("mm, thp: do not queue fully unmapped pages for deferred split")
acbfb087e3b1 ("mm/hugetlb: avoid looping to the same hugepage if !pages and !vmas")
867e5e1de14b ("mm: clean up and clarify lruvec lookup procedure")
242c37b459ce ("include/linux/memcontrol.h: fix comments based on per-node memcg")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From f1796544a0ca0f14386a679d3d05fbc69235015e Mon Sep 17 00:00:00 2001
From: Michal Hocko <mhocko(a)suse.com>
Date: Fri, 4 Sep 2020 16:35:24 -0700
Subject: [PATCH] memcg: fix use-after-free in uncharge_batch
syzbot has reported an use-after-free in the uncharge_batch path
BUG: KASAN: use-after-free in instrument_atomic_write include/linux/instrumented.h:71 [inline]
BUG: KASAN: use-after-free in atomic64_sub_return include/asm-generic/atomic-instrumented.h:970 [inline]
BUG: KASAN: use-after-free in atomic_long_sub_return include/asm-generic/atomic-long.h:113 [inline]
BUG: KASAN: use-after-free in page_counter_cancel mm/page_counter.c:54 [inline]
BUG: KASAN: use-after-free in page_counter_uncharge+0x3d/0xc0 mm/page_counter.c:155
Write of size 8 at addr ffff8880371c0148 by task syz-executor.0/9304
CPU: 0 PID: 9304 Comm: syz-executor.0 Not tainted 5.8.0-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
__dump_stack lib/dump_stack.c:77 [inline]
dump_stack+0x1f0/0x31e lib/dump_stack.c:118
print_address_description+0x66/0x620 mm/kasan/report.c:383
__kasan_report mm/kasan/report.c:513 [inline]
kasan_report+0x132/0x1d0 mm/kasan/report.c:530
check_memory_region_inline mm/kasan/generic.c:183 [inline]
check_memory_region+0x2b5/0x2f0 mm/kasan/generic.c:192
instrument_atomic_write include/linux/instrumented.h:71 [inline]
atomic64_sub_return include/asm-generic/atomic-instrumented.h:970 [inline]
atomic_long_sub_return include/asm-generic/atomic-long.h:113 [inline]
page_counter_cancel mm/page_counter.c:54 [inline]
page_counter_uncharge+0x3d/0xc0 mm/page_counter.c:155
uncharge_batch+0x6c/0x350 mm/memcontrol.c:6764
uncharge_page+0x115/0x430 mm/memcontrol.c:6796
uncharge_list mm/memcontrol.c:6835 [inline]
mem_cgroup_uncharge_list+0x70/0xe0 mm/memcontrol.c:6877
release_pages+0x13a2/0x1550 mm/swap.c:911
tlb_batch_pages_flush mm/mmu_gather.c:49 [inline]
tlb_flush_mmu_free mm/mmu_gather.c:242 [inline]
tlb_flush_mmu+0x780/0x910 mm/mmu_gather.c:249
tlb_finish_mmu+0xcb/0x200 mm/mmu_gather.c:328
exit_mmap+0x296/0x550 mm/mmap.c:3185
__mmput+0x113/0x370 kernel/fork.c:1076
exit_mm+0x4cd/0x550 kernel/exit.c:483
do_exit+0x576/0x1f20 kernel/exit.c:793
do_group_exit+0x161/0x2d0 kernel/exit.c:903
get_signal+0x139b/0x1d30 kernel/signal.c:2743
arch_do_signal+0x33/0x610 arch/x86/kernel/signal.c:811
exit_to_user_mode_loop kernel/entry/common.c:135 [inline]
exit_to_user_mode_prepare+0x8d/0x1b0 kernel/entry/common.c:166
syscall_exit_to_user_mode+0x5e/0x1a0 kernel/entry/common.c:241
entry_SYSCALL_64_after_hwframe+0x44/0xa9
Commit 1a3e1f40962c ("mm: memcontrol: decouple reference counting from
page accounting") reworked the memcg lifetime to be bound the the struct
page rather than charges. It also removed the css_put_many from
uncharge_batch and that is causing the above splat.
uncharge_batch() is supposed to uncharge accumulated charges for all
pages freed from the same memcg. The queuing is done by uncharge_page
which however drops the memcg reference after it adds charges to the
batch. If the current page happens to be the last one holding the
reference for its memcg then the memcg is OK to go and the next page to
be freed will trigger batched uncharge which needs to access the memcg
which is gone already.
Fix the issue by taking a reference for the memcg in the current batch.
Fixes: 1a3e1f40962c ("mm: memcontrol: decouple reference counting from page accounting")
Reported-by: syzbot+b305848212deec86eabe(a)syzkaller.appspotmail.com
Reported-by: syzbot+b5ea6fb6f139c8b9482b(a)syzkaller.appspotmail.com
Signed-off-by: Michal Hocko <mhocko(a)suse.com>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
Reviewed-by: Shakeel Butt <shakeelb(a)google.com>
Acked-by: Johannes Weiner <hannes(a)cmpxchg.org>
Cc: Roman Gushchin <guro(a)fb.com>
Cc: Hugh Dickins <hughd(a)google.com>
Link: https://lkml.kernel.org/r/20200820090341.GC5033@dhcp22.suse.cz
Signed-off-by: Linus Torvalds <torvalds(a)linux-foundation.org>
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index b807952b4d43..cfa6cbad21d5 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -6774,6 +6774,9 @@ static void uncharge_batch(const struct uncharge_gather *ug)
__this_cpu_add(ug->memcg->vmstats_percpu->nr_page_events, ug->nr_pages);
memcg_check_events(ug->memcg, ug->dummy_page);
local_irq_restore(flags);
+
+ /* drop reference from uncharge_page */
+ css_put(&ug->memcg->css);
}
static void uncharge_page(struct page *page, struct uncharge_gather *ug)
@@ -6797,6 +6800,9 @@ static void uncharge_page(struct page *page, struct uncharge_gather *ug)
uncharge_gather_clear(ug);
}
ug->memcg = page->mem_cgroup;
+
+ /* pairs with css_put in uncharge_batch */
+ css_get(&ug->memcg->css);
}
nr_pages = compound_nr(page);
Hi,
this series does basically two things:
1. Disables automatic load balancing as adviced by the hardware
workaround.
2. Forces the sharing of the load submitted to CCS among all the
CCS available (as of now only DG2 has more than one CCS). This
way the user, when sending a query, will see only one CCS
available.
Andi
Andi Shyti (2):
drm/i915/gt: Disable HW load balancing for CCS
drm/i915/gt: Set default CCS mode '1'
drivers/gpu/drm/i915/gt/intel_gt.c | 11 +++++++++++
drivers/gpu/drm/i915/gt/intel_gt_regs.h | 3 +++
drivers/gpu/drm/i915/gt/intel_workarounds.c | 6 ++++++
drivers/gpu/drm/i915/i915_drv.h | 17 +++++++++++++++++
drivers/gpu/drm/i915/i915_query.c | 5 +++--
5 files changed, 40 insertions(+), 2 deletions(-)
--
2.43.0
From: David Woodhouse <dwmw(a)amazon.co.uk>
Linux guests since commit b1c3497e604d ("x86/xen: Add support for
HVMOP_set_evtchn_upcall_vector") in v6.0 onwards will use the per-vCPU
upcall vector when it's advertised in the Xen CPUID leaves.
This upcall is injected through the guest's local APIC as an MSI, unlike
the older system vector which was merely injected by the hypervisor any
time the CPU was able to receive an interrupt and the upcall_pending
flags is set in its vcpu_info.
Effectively, that makes the per-CPU upcall edge triggered instead of
level triggered, which results in the upcall being lost if the MSI is
delivered when the local APIC is *disabled*.
Xen checks the vcpu_info->evtchn_upcall_pending flag when the local APIC
for a vCPU is software enabled (in fact, on any write to the SPIV
register which doesn't disable the APIC). Do the same in KVM since KVM
doesn't provide a way for userspace to intervene and trap accesses to
the SPIV register of a local APIC emulated by KVM.
Fixes: fde0451be8fb3 ("KVM: x86/xen: Support per-vCPU event channel upcall via local APIC")
Signed-off-by: David Woodhouse <dwmw(a)amazon.co.uk>
Reviewed-by: Paul Durrant <paul(a)xen.org>
Cc: stable(a)vger.kernel.org
---
arch/x86/kvm/lapic.c | 5 ++++-
arch/x86/kvm/xen.c | 2 +-
arch/x86/kvm/xen.h | 18 ++++++++++++++++++
3 files changed, 23 insertions(+), 2 deletions(-)
diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
index 3242f3da2457..75bc7d3f0022 100644
--- a/arch/x86/kvm/lapic.c
+++ b/arch/x86/kvm/lapic.c
@@ -41,6 +41,7 @@
#include "ioapic.h"
#include "trace.h"
#include "x86.h"
+#include "xen.h"
#include "cpuid.h"
#include "hyperv.h"
#include "smm.h"
@@ -499,8 +500,10 @@ static inline void apic_set_spiv(struct kvm_lapic *apic, u32 val)
}
/* Check if there are APF page ready requests pending */
- if (enabled)
+ if (enabled) {
kvm_make_request(KVM_REQ_APF_READY, apic->vcpu);
+ kvm_xen_sw_enable_lapic(apic->vcpu);
+ }
}
static inline void kvm_apic_set_xapic_id(struct kvm_lapic *apic, u8 id)
diff --git a/arch/x86/kvm/xen.c b/arch/x86/kvm/xen.c
index ccd2dc753fd6..06904696759c 100644
--- a/arch/x86/kvm/xen.c
+++ b/arch/x86/kvm/xen.c
@@ -568,7 +568,7 @@ void kvm_xen_update_runstate(struct kvm_vcpu *v, int state)
kvm_xen_update_runstate_guest(v, state == RUNSTATE_runnable);
}
-static void kvm_xen_inject_vcpu_vector(struct kvm_vcpu *v)
+void kvm_xen_inject_vcpu_vector(struct kvm_vcpu *v)
{
struct kvm_lapic_irq irq = { };
int r;
diff --git a/arch/x86/kvm/xen.h b/arch/x86/kvm/xen.h
index f8f1fe22d090..f5841d9000ae 100644
--- a/arch/x86/kvm/xen.h
+++ b/arch/x86/kvm/xen.h
@@ -18,6 +18,7 @@ extern struct static_key_false_deferred kvm_xen_enabled;
int __kvm_xen_has_interrupt(struct kvm_vcpu *vcpu);
void kvm_xen_inject_pending_events(struct kvm_vcpu *vcpu);
+void kvm_xen_inject_vcpu_vector(struct kvm_vcpu *vcpu);
int kvm_xen_vcpu_set_attr(struct kvm_vcpu *vcpu, struct kvm_xen_vcpu_attr *data);
int kvm_xen_vcpu_get_attr(struct kvm_vcpu *vcpu, struct kvm_xen_vcpu_attr *data);
int kvm_xen_hvm_set_attr(struct kvm *kvm, struct kvm_xen_hvm_attr *data);
@@ -36,6 +37,19 @@ int kvm_xen_setup_evtchn(struct kvm *kvm,
const struct kvm_irq_routing_entry *ue);
void kvm_xen_update_tsc_info(struct kvm_vcpu *vcpu);
+static inline void kvm_xen_sw_enable_lapic(struct kvm_vcpu *vcpu)
+{
+ /*
+ * The local APIC is being enabled. If the per-vCPU upcall vector is
+ * set and the vCPU's evtchn_upcall_pending flag is set, inject the
+ * interrupt.
+ */
+ if (static_branch_unlikely(&kvm_xen_enabled.key) &&
+ vcpu->arch.xen.vcpu_info_cache.active &&
+ vcpu->arch.xen.upcall_vector && __kvm_xen_has_interrupt(vcpu))
+ kvm_xen_inject_vcpu_vector(vcpu);
+}
+
static inline bool kvm_xen_msr_enabled(struct kvm *kvm)
{
return static_branch_unlikely(&kvm_xen_enabled.key) &&
@@ -101,6 +115,10 @@ static inline void kvm_xen_destroy_vcpu(struct kvm_vcpu *vcpu)
{
}
+static inline void kvm_xen_sw_enable_lapic(struct kvm_vcpu *vcpu)
+{
+}
+
static inline bool kvm_xen_msr_enabled(struct kvm *kvm)
{
return false;
--
2.43.0
Because sandboxing can be used as an opportunistic security measure,
user space may not log unsupported features. Let the system
administrator know if an application tries to use Landlock but failed
because it isn't enabled at boot time. This may be caused by bootloader
configurations with outdated "lsm" kernel's command-line parameter.
Cc: stable(a)vger.kernel.org
Fixes: 265885daf3e5 ("landlock: Add syscall implementations")
Reviewed-by: Kees Cook <keescook(a)chromium.org>
Reviewed-by: Günther Noack <gnoack3000(a)gmail.com>
Signed-off-by: Mickaël Salaün <mic(a)digikod.net>
---
Changes since v1:
* Add Kees's and Günther's Reviewed-by.
* Rename is_not_initialized() to not_initialized() and invert the logic,
as suggested by Günther. This is a cosmetic change without global
behavioral changed.
* Update link to point to a new subsection.
---
security/landlock/syscalls.c | 18 +++++++++++++++---
1 file changed, 15 insertions(+), 3 deletions(-)
diff --git a/security/landlock/syscalls.c b/security/landlock/syscalls.c
index 898358f57fa0..6788e73b6681 100644
--- a/security/landlock/syscalls.c
+++ b/security/landlock/syscalls.c
@@ -33,6 +33,18 @@
#include "ruleset.h"
#include "setup.h"
+static bool is_initialized(void)
+{
+ if (likely(landlock_initialized))
+ return true;
+
+ pr_warn_once(
+ "Disabled but requested by user space. "
+ "You should enable Landlock at boot time: "
+ "https://docs.kernel.org/userspace-api/landlock.html#boot-time-configuration…");
+ return false;
+}
+
/**
* copy_min_struct_from_user - Safe future-proof argument copying
*
@@ -173,7 +185,7 @@ SYSCALL_DEFINE3(landlock_create_ruleset,
/* Build-time checks. */
build_check_abi();
- if (!landlock_initialized)
+ if (!is_initialized())
return -EOPNOTSUPP;
if (flags) {
@@ -398,7 +410,7 @@ SYSCALL_DEFINE4(landlock_add_rule, const int, ruleset_fd,
struct landlock_ruleset *ruleset;
int err;
- if (!landlock_initialized)
+ if (!is_initialized())
return -EOPNOTSUPP;
/* No flag for now. */
@@ -458,7 +470,7 @@ SYSCALL_DEFINE2(landlock_restrict_self, const int, ruleset_fd, const __u32,
struct landlock_cred_security *new_llcred;
int err;
- if (!landlock_initialized)
+ if (!is_initialized())
return -EOPNOTSUPP;
/*
--
2.44.0
After a couple recent changes in LLVM, there is a warning (or error with
CONFIG_WERROR=y or W=e) from the compile time fortify source routines,
specifically the memset() in copy_to_user_tmpl().
In file included from net/xfrm/xfrm_user.c:14:
...
include/linux/fortify-string.h:438:4: error: call to '__write_overflow_field' declared with 'warning' attribute: detected write beyond size of field (1st parameter); maybe use struct_group()? [-Werror,-Wattribute-warning]
438 | __write_overflow_field(p_size_field, size);
| ^
1 error generated.
While ->xfrm_nr has been validated against XFRM_MAX_DEPTH when its value
is first assigned in copy_templates() by calling validate_tmpl() first
(so there should not be any issue in practice), LLVM/clang cannot really
deduce that across the boundaries of these functions. Without that
knowledge, it cannot assume that the loop stops before i is greater than
XFRM_MAX_DEPTH, which would indeed result a stack buffer overflow in the
memset().
To make the bounds of ->xfrm_nr clear to the compiler and add additional
defense in case copy_to_user_tmpl() is ever used in a path where
->xfrm_nr has not been properly validated against XFRM_MAX_DEPTH first,
add an explicit bound check and early return, which clears up the
warning.
Cc: stable(a)vger.kernel.org
Link: https://github.com/ClangBuiltLinux/linux/issues/1985
Signed-off-by: Nathan Chancellor <nathan(a)kernel.org>
---
net/xfrm/xfrm_user.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/net/xfrm/xfrm_user.c b/net/xfrm/xfrm_user.c
index f037be190bae..912c1189ba41 100644
--- a/net/xfrm/xfrm_user.c
+++ b/net/xfrm/xfrm_user.c
@@ -2017,6 +2017,9 @@ static int copy_to_user_tmpl(struct xfrm_policy *xp, struct sk_buff *skb)
if (xp->xfrm_nr == 0)
return 0;
+ if (xp->xfrm_nr > XFRM_MAX_DEPTH)
+ return -ENOBUFS;
+
for (i = 0; i < xp->xfrm_nr; i++) {
struct xfrm_user_tmpl *up = &vec[i];
struct xfrm_tmpl *kp = &xp->xfrm_vec[i];
---
base-commit: 14dec56fdd4c70a0ebe40077368e367421ea6fef
change-id: 20240221-xfrm-avoid-clang-fortify-warning-copy_to_user_tmpl-40cb10b003e3
Best regards,
--
Nathan Chancellor <nathan(a)kernel.org>
Svacer reports a potential division by zero at rcu_torture_writer() in
5.15 stable release. The problem has been fixed by the following patch
that can be cleanly applied to 5.15 branch.
The patch below does not apply to the 6.1-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.1.y
git checkout FETCH_HEAD
git cherry-pick -x e3b63e966cac0bf78aaa1efede1827a252815a1d
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024022612-uncloak-pretext-f4a2@gregkh' --subject-prefix 'PATCH 6.1.y' HEAD^..
Possible dependencies:
e3b63e966cac ("mm: zswap: fix missing folio cleanup in writeback race path")
96c7b0b42239 ("mm: return the folio from __read_swap_cache_async()")
e947ba0bbf47 ("mm/zswap: cleanup zswap_writeback_entry()")
32acba4c0483 ("mm/zswap: refactor out __zswap_load()")
c75f5c1e0f1d ("mm/zswap: reuse dstmem when decompress")
b5ba474f3f51 ("zswap: shrink zswap pool based on memory pressure")
a65b0e7607cc ("zswap: make shrinking memcg-aware")
ddc1a5cbc05d ("mempolicy: alloc_pages_mpol() for NUMA policy without vma")
23e4883248f0 ("mm: add page_rmappable_folio() wrapper")
c36f6e6dff4d ("mempolicy trivia: slightly more consistent naming")
7f1ee4e20708 ("mempolicy trivia: delete those ancient pr_debug()s")
1cb5d11a370f ("mempolicy: fix migrate_pages(2) syscall return nr_failed")
3657fdc2451a ("mm: move vma_policy() and anon_vma_name() decls to mm_types.h")
3022fd7af960 ("shmem: _add_to_page_cache() before shmem_inode_acct_blocks()")
054a9f7ccd0a ("shmem: move memcg charge out of shmem_add_to_page_cache()")
4199f51a7eb2 ("shmem: shmem_acct_blocks() and shmem_inode_acct_blocks()")
e3e1a5067fd2 ("shmem: remove vma arg from shmem_get_folio_gfp()")
75c70128a673 ("mm: mempolicy: make mpol_misplaced() to take a folio")
cda6d93672ac ("mm: memory: make numa_migrate_prep() to take a folio")
6695cf68b15c ("mm: memory: use a folio in do_numa_page()")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From e3b63e966cac0bf78aaa1efede1827a252815a1d Mon Sep 17 00:00:00 2001
From: Yosry Ahmed <yosryahmed(a)google.com>
Date: Thu, 25 Jan 2024 08:51:27 +0000
Subject: [PATCH] mm: zswap: fix missing folio cleanup in writeback race path
In zswap_writeback_entry(), after we get a folio from
__read_swap_cache_async(), we grab the tree lock again to check that the
swap entry was not invalidated and recycled. If it was, we delete the
folio we just added to the swap cache and exit.
However, __read_swap_cache_async() returns the folio locked when it is
newly allocated, which is always true for this path, and the folio is
ref'd. Make sure to unlock and put the folio before returning.
This was discovered by code inspection, probably because this path handles
a race condition that should not happen often, and the bug would not crash
the system, it will only strand the folio indefinitely.
Link: https://lkml.kernel.org/r/20240125085127.1327013-1-yosryahmed@google.com
Fixes: 04fc7816089c ("mm: fix zswap writeback race condition")
Signed-off-by: Yosry Ahmed <yosryahmed(a)google.com>
Reviewed-by: Chengming Zhou <zhouchengming(a)bytedance.com>
Acked-by: Johannes Weiner <hannes(a)cmpxchg.org>
Reviewed-by: Nhat Pham <nphamcs(a)gmail.com>
Cc: Domenico Cerasuolo <cerasuolodomenico(a)gmail.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
diff --git a/mm/zswap.c b/mm/zswap.c
index 350dd2fc8159..d2423247acfd 100644
--- a/mm/zswap.c
+++ b/mm/zswap.c
@@ -1440,6 +1440,8 @@ static int zswap_writeback_entry(struct zswap_entry *entry,
if (zswap_rb_search(&tree->rbroot, swp_offset(entry->swpentry)) != entry) {
spin_unlock(&tree->lock);
delete_from_swap_cache(folio);
+ folio_unlock(folio);
+ folio_put(folio);
return -ENOMEM;
}
spin_unlock(&tree->lock);
The patch below does not apply to the 6.6-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.6.y
git checkout FETCH_HEAD
git cherry-pick -x e3b63e966cac0bf78aaa1efede1827a252815a1d
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024022611-tropics-deferred-2483@gregkh' --subject-prefix 'PATCH 6.6.y' HEAD^..
Possible dependencies:
e3b63e966cac ("mm: zswap: fix missing folio cleanup in writeback race path")
96c7b0b42239 ("mm: return the folio from __read_swap_cache_async()")
e947ba0bbf47 ("mm/zswap: cleanup zswap_writeback_entry()")
32acba4c0483 ("mm/zswap: refactor out __zswap_load()")
c75f5c1e0f1d ("mm/zswap: reuse dstmem when decompress")
b5ba474f3f51 ("zswap: shrink zswap pool based on memory pressure")
a65b0e7607cc ("zswap: make shrinking memcg-aware")
ddc1a5cbc05d ("mempolicy: alloc_pages_mpol() for NUMA policy without vma")
23e4883248f0 ("mm: add page_rmappable_folio() wrapper")
c36f6e6dff4d ("mempolicy trivia: slightly more consistent naming")
7f1ee4e20708 ("mempolicy trivia: delete those ancient pr_debug()s")
1cb5d11a370f ("mempolicy: fix migrate_pages(2) syscall return nr_failed")
3657fdc2451a ("mm: move vma_policy() and anon_vma_name() decls to mm_types.h")
3022fd7af960 ("shmem: _add_to_page_cache() before shmem_inode_acct_blocks()")
054a9f7ccd0a ("shmem: move memcg charge out of shmem_add_to_page_cache()")
4199f51a7eb2 ("shmem: shmem_acct_blocks() and shmem_inode_acct_blocks()")
e3e1a5067fd2 ("shmem: remove vma arg from shmem_get_folio_gfp()")
75c70128a673 ("mm: mempolicy: make mpol_misplaced() to take a folio")
cda6d93672ac ("mm: memory: make numa_migrate_prep() to take a folio")
6695cf68b15c ("mm: memory: use a folio in do_numa_page()")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From e3b63e966cac0bf78aaa1efede1827a252815a1d Mon Sep 17 00:00:00 2001
From: Yosry Ahmed <yosryahmed(a)google.com>
Date: Thu, 25 Jan 2024 08:51:27 +0000
Subject: [PATCH] mm: zswap: fix missing folio cleanup in writeback race path
In zswap_writeback_entry(), after we get a folio from
__read_swap_cache_async(), we grab the tree lock again to check that the
swap entry was not invalidated and recycled. If it was, we delete the
folio we just added to the swap cache and exit.
However, __read_swap_cache_async() returns the folio locked when it is
newly allocated, which is always true for this path, and the folio is
ref'd. Make sure to unlock and put the folio before returning.
This was discovered by code inspection, probably because this path handles
a race condition that should not happen often, and the bug would not crash
the system, it will only strand the folio indefinitely.
Link: https://lkml.kernel.org/r/20240125085127.1327013-1-yosryahmed@google.com
Fixes: 04fc7816089c ("mm: fix zswap writeback race condition")
Signed-off-by: Yosry Ahmed <yosryahmed(a)google.com>
Reviewed-by: Chengming Zhou <zhouchengming(a)bytedance.com>
Acked-by: Johannes Weiner <hannes(a)cmpxchg.org>
Reviewed-by: Nhat Pham <nphamcs(a)gmail.com>
Cc: Domenico Cerasuolo <cerasuolodomenico(a)gmail.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
diff --git a/mm/zswap.c b/mm/zswap.c
index 350dd2fc8159..d2423247acfd 100644
--- a/mm/zswap.c
+++ b/mm/zswap.c
@@ -1440,6 +1440,8 @@ static int zswap_writeback_entry(struct zswap_entry *entry,
if (zswap_rb_search(&tree->rbroot, swp_offset(entry->swpentry)) != entry) {
spin_unlock(&tree->lock);
delete_from_swap_cache(folio);
+ folio_unlock(folio);
+ folio_put(folio);
return -ENOMEM;
}
spin_unlock(&tree->lock);
The patch below does not apply to the 6.7-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.7.y
git checkout FETCH_HEAD
git cherry-pick -x 678e54d4bb9a4822f8ae99690ac131c5d490cdb1
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024022622-agony-salvaging-5082@gregkh' --subject-prefix 'PATCH 6.7.y' HEAD^..
Possible dependencies:
678e54d4bb9a ("mm/zswap: invalidate duplicate entry when !zswap_enabled")
a65b0e7607cc ("zswap: make shrinking memcg-aware")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 678e54d4bb9a4822f8ae99690ac131c5d490cdb1 Mon Sep 17 00:00:00 2001
From: Chengming Zhou <zhouchengming(a)bytedance.com>
Date: Thu, 8 Feb 2024 02:32:54 +0000
Subject: [PATCH] mm/zswap: invalidate duplicate entry when !zswap_enabled
We have to invalidate any duplicate entry even when !zswap_enabled since
zswap can be disabled anytime. If the folio store success before, then
got dirtied again but zswap disabled, we won't invalidate the old
duplicate entry in the zswap_store(). So later lru writeback may
overwrite the new data in swapfile.
Link: https://lkml.kernel.org/r/20240208023254.3873823-1-chengming.zhou@linux.dev
Fixes: 42c06a0e8ebe ("mm: kill frontswap")
Signed-off-by: Chengming Zhou <zhouchengming(a)bytedance.com>
Acked-by: Johannes Weiner <hannes(a)cmpxchg.org>
Cc: Nhat Pham <nphamcs(a)gmail.com>
Cc: Yosry Ahmed <yosryahmed(a)google.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
diff --git a/mm/zswap.c b/mm/zswap.c
index 36903d938c15..db4625af65fb 100644
--- a/mm/zswap.c
+++ b/mm/zswap.c
@@ -1518,7 +1518,7 @@ bool zswap_store(struct folio *folio)
if (folio_test_large(folio))
return false;
- if (!zswap_enabled || !tree)
+ if (!tree)
return false;
/*
@@ -1533,6 +1533,10 @@ bool zswap_store(struct folio *folio)
zswap_invalidate_entry(tree, dupentry);
}
spin_unlock(&tree->lock);
+
+ if (!zswap_enabled)
+ return false;
+
objcg = get_obj_cgroup_from_folio(folio);
if (objcg && !obj_cgroup_may_zswap(objcg)) {
memcg = get_mem_cgroup_from_objcg(objcg);
This is the backport of recently upstreamed series that moves VERW
execution to a later point in exit-to-user path. This is needed because
in some cases it may be possible for data accessed after VERW executions
may end into MDS affected CPU buffers. Moving VERW closer to ring
transition reduces the attack surface.
Patch 1/6 includes a minor fix that is queued for upstream:
https://lore.kernel.org/lkml/170899674562.398.6398007479766564897.tip-bot2@…
Patch 2/6 needed a conflict to be resolved for the hunk
swapgs_restore_regs_and_return_to_usermode.
Signed-off-by: Pawan Gupta <pawan.kumar.gupta(a)linux.intel.com>
---
Pawan Gupta (5):
x86/bugs: Add asm helpers for executing VERW
x86/entry_64: Add VERW just before userspace transition
x86/entry_32: Add VERW just before userspace transition
x86/bugs: Use ALTERNATIVE() instead of mds_user_clear static key
KVM/VMX: Move VERW closer to VMentry for MDS mitigation
Sean Christopherson (1):
KVM/VMX: Use BT+JNC, i.e. EFLAGS.CF to select VMRESUME vs. VMLAUNCH
Documentation/arch/x86/mds.rst | 38 +++++++++++++++++++++++++-----------
arch/x86/entry/entry.S | 23 ++++++++++++++++++++++
arch/x86/entry/entry_32.S | 3 +++
arch/x86/entry/entry_64.S | 11 +++++++++++
arch/x86/entry/entry_64_compat.S | 1 +
arch/x86/include/asm/cpufeatures.h | 2 +-
arch/x86/include/asm/entry-common.h | 1 -
arch/x86/include/asm/nospec-branch.h | 25 ++++++++++++------------
arch/x86/kernel/cpu/bugs.c | 15 ++++++--------
arch/x86/kernel/nmi.c | 3 ---
arch/x86/kvm/vmx/run_flags.h | 7 +++++--
arch/x86/kvm/vmx/vmenter.S | 9 ++++++---
arch/x86/kvm/vmx/vmx.c | 20 +++++++++++++++----
13 files changed, 112 insertions(+), 46 deletions(-)
---
base-commit: b631f5b445dc3379f67ff63a2e4c58f22d4975dc
change-id: 20240226-delay-verw-backport-6-7-y-a2cb3f26bb90
Best regards,
--
Thanks,
Pawan
Hi,
This series fixes errors during module removal. It also
implements PHY core voltage selection as per TI recommendation
and workaround for Errata i2409 [1].
The workaround needs PHY2 region to be present in device node.
The device tree patch will be sent later after the DT binding doc
is merged.
[1] - https://www.ti.com/lit/er/sprz487d/sprz487d.pdf
Signed-off-by: Roger Quadros <rogerq(a)kernel.org>
---
Changes in v4:
- re-arranged patches into first 2 bug-fixes and added Cc stable for them
- Added Acked-by
- Link to v3: https://lore.kernel.org/r/20240214-for-v6-9-am62-usb-errata-3-0-v3-0-147ec5…
---
Roger Quadros (4):
usb: dwc3-am62: fix module unload/reload behavior
usb: dwc3-am62: Disable wakeup at remove
usb: dwc3-am62: Fix PHY core voltage selection
usb: dwc3-am62: add workaround for Errata i2409
drivers/usb/dwc3/dwc3-am62.c | 42 ++++++++++++++++++++++++++++++------------
1 file changed, 30 insertions(+), 12 deletions(-)
---
base-commit: 6613476e225e090cc9aad49be7fa504e290dd33d
change-id: 20240206-for-v6-9-am62-usb-errata-3-0-233024ea8e9d
Best regards,
--
Roger Quadros <rogerq(a)kernel.org>
The patch below does not apply to the 4.19-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-4.19.y
git checkout FETCH_HEAD
git cherry-pick -x b820de741ae48ccf50dd95e297889c286ff4f760
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024022602-unwrapped-haggler-daae@gregkh' --subject-prefix 'PATCH 4.19.y' HEAD^..
Possible dependencies:
b820de741ae4 ("fs/aio: Restrict kiocb_set_cancel_fn() to I/O submitted via libaio")
9cf3516c29e6 ("fs: add IOCB flags related to passing back dio completions")
f6c73a11133e ("fs.h: Add TRACE_IOCB_STRINGS for use in trace points")
1da8cf961bb1 ("Merge tag 'io_uring-6.0-2022-08-13' of git://git.kernel.dk/linux-block")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From b820de741ae48ccf50dd95e297889c286ff4f760 Mon Sep 17 00:00:00 2001
From: Bart Van Assche <bvanassche(a)acm.org>
Date: Thu, 15 Feb 2024 12:47:38 -0800
Subject: [PATCH] fs/aio: Restrict kiocb_set_cancel_fn() to I/O submitted via
libaio
If kiocb_set_cancel_fn() is called for I/O submitted via io_uring, the
following kernel warning appears:
WARNING: CPU: 3 PID: 368 at fs/aio.c:598 kiocb_set_cancel_fn+0x9c/0xa8
Call trace:
kiocb_set_cancel_fn+0x9c/0xa8
ffs_epfile_read_iter+0x144/0x1d0
io_read+0x19c/0x498
io_issue_sqe+0x118/0x27c
io_submit_sqes+0x25c/0x5fc
__arm64_sys_io_uring_enter+0x104/0xab0
invoke_syscall+0x58/0x11c
el0_svc_common+0xb4/0xf4
do_el0_svc+0x2c/0xb0
el0_svc+0x2c/0xa4
el0t_64_sync_handler+0x68/0xb4
el0t_64_sync+0x1a4/0x1a8
Fix this by setting the IOCB_AIO_RW flag for read and write I/O that is
submitted by libaio.
Suggested-by: Jens Axboe <axboe(a)kernel.dk>
Cc: Christoph Hellwig <hch(a)lst.de>
Cc: Avi Kivity <avi(a)scylladb.com>
Cc: Sandeep Dhavale <dhavale(a)google.com>
Cc: Jens Axboe <axboe(a)kernel.dk>
Cc: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Cc: Kent Overstreet <kent.overstreet(a)linux.dev>
Cc: stable(a)vger.kernel.org
Signed-off-by: Bart Van Assche <bvanassche(a)acm.org>
Link: https://lore.kernel.org/r/20240215204739.2677806-2-bvanassche@acm.org
Signed-off-by: Christian Brauner <brauner(a)kernel.org>
diff --git a/fs/aio.c b/fs/aio.c
index bb2ff48991f3..da18dbcfcb22 100644
--- a/fs/aio.c
+++ b/fs/aio.c
@@ -593,6 +593,13 @@ void kiocb_set_cancel_fn(struct kiocb *iocb, kiocb_cancel_fn *cancel)
struct kioctx *ctx = req->ki_ctx;
unsigned long flags;
+ /*
+ * kiocb didn't come from aio or is neither a read nor a write, hence
+ * ignore it.
+ */
+ if (!(iocb->ki_flags & IOCB_AIO_RW))
+ return;
+
if (WARN_ON_ONCE(!list_empty(&req->ki_list)))
return;
@@ -1509,7 +1516,7 @@ static int aio_prep_rw(struct kiocb *req, const struct iocb *iocb)
req->ki_complete = aio_complete_rw;
req->private = NULL;
req->ki_pos = iocb->aio_offset;
- req->ki_flags = req->ki_filp->f_iocb_flags;
+ req->ki_flags = req->ki_filp->f_iocb_flags | IOCB_AIO_RW;
if (iocb->aio_flags & IOCB_FLAG_RESFD)
req->ki_flags |= IOCB_EVENTFD;
if (iocb->aio_flags & IOCB_FLAG_IOPRIO) {
diff --git a/include/linux/fs.h b/include/linux/fs.h
index ed5966a70495..c2dcc98cb4c8 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -352,6 +352,8 @@ enum rw_hint {
* unrelated IO (like cache flushing, new IO generation, etc).
*/
#define IOCB_DIO_CALLER_COMP (1 << 22)
+/* kiocb is a read or write operation submitted by fs/aio.c. */
+#define IOCB_AIO_RW (1 << 23)
/* for use in trace events */
#define TRACE_IOCB_STRINGS \
The patch below does not apply to the 6.1-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.1.y
git checkout FETCH_HEAD
git cherry-pick -x 1b0ca4e4ff10a2c8402e2cf70132c683e1c772e4
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024022643-scorn-filtrate-8677@gregkh' --subject-prefix 'PATCH 6.1.y' HEAD^..
Possible dependencies:
1b0ca4e4ff10 ("mm/damon/reclaim: fix quota stauts loss due to online tunings")
66d9faec0745 ("mm/damon/reclaim: add a parameter called skip_anon for avoiding anonymous pages reclamation")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 1b0ca4e4ff10a2c8402e2cf70132c683e1c772e4 Mon Sep 17 00:00:00 2001
From: SeongJae Park <sj(a)kernel.org>
Date: Fri, 16 Feb 2024 11:40:24 -0800
Subject: [PATCH] mm/damon/reclaim: fix quota stauts loss due to online tunings
Patch series "mm/damon: fix quota status loss due to online tunings".
DAMON_RECLAIM and DAMON_LRU_SORT is not preserving internal quota status
when applying new user parameters, and hence could cause temporal quota
accuracy degradation. Fix it by preserving the status.
This patch (of 2):
For online parameters change, DAMON_RECLAIM creates new scheme based on
latest values of the parameters and replaces the old scheme with the new
one. When creating it, the internal status of the quota of the old
scheme is not preserved. As a result, charging of the quota starts from
zero after the online tuning. The data that collected to estimate the
throughput of the scheme's action is also reset, and therefore the
estimation should start from the scratch again. Because the throughput
estimation is being used to convert the time quota to the effective size
quota, this could result in temporal time quota inaccuracy. It would be
recovered over time, though. In short, the quota accuracy could be
temporarily degraded after online parameters update.
Fix the problem by checking the case and copying the internal fields for
the status.
Link: https://lkml.kernel.org/r/20240216194025.9207-1-sj@kernel.org
Link: https://lkml.kernel.org/r/20240216194025.9207-2-sj@kernel.org
Fixes: e035c280f6df ("mm/damon/reclaim: support online inputs update")
Signed-off-by: SeongJae Park <sj(a)kernel.org>
Cc: <stable(a)vger.kernel.org> [5.19+]
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
diff --git a/mm/damon/reclaim.c b/mm/damon/reclaim.c
index ab974e477d2f..66e190f0374a 100644
--- a/mm/damon/reclaim.c
+++ b/mm/damon/reclaim.c
@@ -150,9 +150,20 @@ static struct damos *damon_reclaim_new_scheme(void)
&damon_reclaim_wmarks);
}
+static void damon_reclaim_copy_quota_status(struct damos_quota *dst,
+ struct damos_quota *src)
+{
+ dst->total_charged_sz = src->total_charged_sz;
+ dst->total_charged_ns = src->total_charged_ns;
+ dst->charged_sz = src->charged_sz;
+ dst->charged_from = src->charged_from;
+ dst->charge_target_from = src->charge_target_from;
+ dst->charge_addr_from = src->charge_addr_from;
+}
+
static int damon_reclaim_apply_parameters(void)
{
- struct damos *scheme;
+ struct damos *scheme, *old_scheme;
struct damos_filter *filter;
int err = 0;
@@ -164,6 +175,11 @@ static int damon_reclaim_apply_parameters(void)
scheme = damon_reclaim_new_scheme();
if (!scheme)
return -ENOMEM;
+ if (!list_empty(&ctx->schemes)) {
+ damon_for_each_scheme(old_scheme, ctx)
+ damon_reclaim_copy_quota_status(&scheme->quota,
+ &old_scheme->quota);
+ }
if (skip_anon) {
filter = damos_new_filter(DAMOS_FILTER_TYPE_ANON, true);
if (!filter) {
Hi Greg,
the issue might be due to this patch:
https://git.kernel.org/pub/scm/linux/kernel/git/stable/stable-queue.git/tre…
2024-02-23T15:39:05.6297767Z CC kernel/sys_ni.o
2024-02-23T15:39:05.7583048Z security/apparmor/af_unix.c: In function
‘unix_state_double_lock’:
2024-02-23T15:39:05.7586076Z security/apparmor/af_unix.c:583:17: error:
too few arguments to function ‘unix_state_lock_nested’
2024-02-23T15:39:05.7588374Z 583 |
unix_state_lock_nested(sk2);
2024-02-23T15:39:05.7589913Z | ^~~~~~~~~~~~~~~~~~~~~~
2024-02-23T15:39:05.7591564Z In file included from
security/apparmor/include/af_unix.h:15,
2024-02-23T15:39:05.7593341Z from
security/apparmor/af_unix.c:17:
2024-02-23T15:39:05.7594989Z ./include/net/af_unix.h:77:20: note:
declared here
2024-02-23T15:39:05.7596733Z 77 | static inline void
unix_state_lock_nested(struct sock *sk,
2024-02-23T15:39:05.7598516Z |
^~~~~~~~~~~~~~~~~~~~~~
2024-02-23T15:39:05.7600862Z security/apparmor/af_unix.c:586:17: error:
too few arguments to function ‘unix_state_lock_nested’
2024-02-23T15:39:05.7603177Z 586 |
unix_state_lock_nested(sk1);
2024-02-23T15:39:05.7605189Z | ^~~~~~~~~~~~~~~~~~~~~~
2024-02-23T15:39:05.7606765Z ./include/net/af_unix.h:77:20: note:
declared here
2024-02-23T15:39:05.7608497Z 77 | static inline void
unix_state_lock_nested(struct sock *sk,
2024-02-23T15:39:05.7610208Z |
^~~~~~~~~~~~~~~~~~~~~~
2024-02-23T15:39:05.8002385Z make[2]: *** [scripts/Makefile.build:262:
security/apparmor/af_unix.o] Error 1
2024-02-23T15:39:05.8005077Z make[2]: *** Waiting for unfinished jobs....
2024-02-23T15:39:05.8094726Z CC crypto/scatterwalk.o
2024-02-23T15:39:05.9082621Z CC [M] fs/btrfs/sysfs.o
2024-02-23T15:39:06.2502316Z CC kernel/nsproxy.o
2024-02-23T15:39:06.4094246Z make[1]: *** [scripts/Makefile.build:497:
security/apparmor] Error 2
2024-02-23T15:39:06.4207119Z make: *** [Makefile:1750: security] Error 2
2024-02-23T15:39:06.4208636Z CC kernel/notifier.o
2024-02-23T15:39:06.4210296Z make: *** Waiting for unfinished jobs....
2024-02-23T15:39:06.8604827Z CC crypto/proc.o
--
Best, Philip
This is the backport of recently upstreamed series that moves VERW
execution to a later point in exit-to-user path. This is needed because
in some cases it may be possible for data accessed after VERW executions
may end into MDS affected CPU buffers. Moving VERW closer to ring
transition reduces the attack surface.
Patch 1/6 includes a minor fix that is queued for upstream:
https://lore.kernel.org/lkml/170899674562.398.6398007479766564897.tip-bot2@…
Patch 2/6 needed a conflict to be resolved for the hunk
swapgs_restore_regs_and_return_to_usermode.
This is only compile and boot tested on qemu.
Cc: Dave Hansen <dave.hansen(a)linux.intel.com>
To: stable(a)vger.kernel.org
Signed-off-by: Pawan Gupta <pawan.kumar.gupta(a)linux.intel.com>
---
Pawan Gupta (5):
x86/bugs: Add asm helpers for executing VERW
x86/entry_64: Add VERW just before userspace transition
x86/entry_32: Add VERW just before userspace transition
x86/bugs: Use ALTERNATIVE() instead of mds_user_clear static key
KVM/VMX: Move VERW closer to VMentry for MDS mitigation
Sean Christopherson (1):
KVM/VMX: Use BT+JNC, i.e. EFLAGS.CF to select VMRESUME vs. VMLAUNCH
Documentation/arch/x86/mds.rst | 38 +++++++++++++++++++++++++-----------
arch/x86/entry/entry.S | 23 ++++++++++++++++++++++
arch/x86/entry/entry_32.S | 3 +++
arch/x86/entry/entry_64.S | 11 +++++++++++
arch/x86/entry/entry_64_compat.S | 1 +
arch/x86/include/asm/cpufeatures.h | 2 +-
arch/x86/include/asm/entry-common.h | 1 -
arch/x86/include/asm/nospec-branch.h | 25 ++++++++++++------------
arch/x86/kernel/cpu/bugs.c | 15 ++++++--------
arch/x86/kernel/nmi.c | 3 ---
arch/x86/kvm/vmx/run_flags.h | 7 +++++--
arch/x86/kvm/vmx/vmenter.S | 9 ++++++---
arch/x86/kvm/vmx/vmx.c | 20 +++++++++++++++----
13 files changed, 112 insertions(+), 46 deletions(-)
---
base-commit: d8a27ea2c98685cdaa5fa66c809c7069a4ff394b
change-id: 20240226-delay-verw-backport-6-6-y-2cda3298e600
Oliver Upton (2):
KVM: arm64: vgic-its: Test for valid IRQ in
its_sync_lpi_pending_table()
KVM: arm64: vgic-its: Test for valid IRQ in MOVALL handler
virt/kvm/arm/vgic/vgic-its.c | 5 +++++
1 file changed, 5 insertions(+)
base-commit: ab219d38aef198d26083cc800954d352acd5137b
--
2.44.0.rc1.240.g4c46232300-goog
The patch below does not apply to the 6.6-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.6.y
git checkout FETCH_HEAD
git cherry-pick -x 678e54d4bb9a4822f8ae99690ac131c5d490cdb1
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024022622-resent-ripeness-43f1@gregkh' --subject-prefix 'PATCH 6.6.y' HEAD^..
Possible dependencies:
678e54d4bb9a ("mm/zswap: invalidate duplicate entry when !zswap_enabled")
a65b0e7607cc ("zswap: make shrinking memcg-aware")
ddc1a5cbc05d ("mempolicy: alloc_pages_mpol() for NUMA policy without vma")
23e4883248f0 ("mm: add page_rmappable_folio() wrapper")
c36f6e6dff4d ("mempolicy trivia: slightly more consistent naming")
7f1ee4e20708 ("mempolicy trivia: delete those ancient pr_debug()s")
1cb5d11a370f ("mempolicy: fix migrate_pages(2) syscall return nr_failed")
3657fdc2451a ("mm: move vma_policy() and anon_vma_name() decls to mm_types.h")
3022fd7af960 ("shmem: _add_to_page_cache() before shmem_inode_acct_blocks()")
054a9f7ccd0a ("shmem: move memcg charge out of shmem_add_to_page_cache()")
4199f51a7eb2 ("shmem: shmem_acct_blocks() and shmem_inode_acct_blocks()")
e3e1a5067fd2 ("shmem: remove vma arg from shmem_get_folio_gfp()")
75c70128a673 ("mm: mempolicy: make mpol_misplaced() to take a folio")
cda6d93672ac ("mm: memory: make numa_migrate_prep() to take a folio")
6695cf68b15c ("mm: memory: use a folio in do_numa_page()")
667ffc31aa95 ("mm: huge_memory: use a folio in do_huge_pmd_numa_page()")
73eab3ca481e ("mm: migrate: convert migrate_misplaced_page() to migrate_misplaced_folio()")
2ac9e99f3b21 ("mm: migrate: convert numamigrate_isolate_page() to numamigrate_isolate_folio()")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 678e54d4bb9a4822f8ae99690ac131c5d490cdb1 Mon Sep 17 00:00:00 2001
From: Chengming Zhou <zhouchengming(a)bytedance.com>
Date: Thu, 8 Feb 2024 02:32:54 +0000
Subject: [PATCH] mm/zswap: invalidate duplicate entry when !zswap_enabled
We have to invalidate any duplicate entry even when !zswap_enabled since
zswap can be disabled anytime. If the folio store success before, then
got dirtied again but zswap disabled, we won't invalidate the old
duplicate entry in the zswap_store(). So later lru writeback may
overwrite the new data in swapfile.
Link: https://lkml.kernel.org/r/20240208023254.3873823-1-chengming.zhou@linux.dev
Fixes: 42c06a0e8ebe ("mm: kill frontswap")
Signed-off-by: Chengming Zhou <zhouchengming(a)bytedance.com>
Acked-by: Johannes Weiner <hannes(a)cmpxchg.org>
Cc: Nhat Pham <nphamcs(a)gmail.com>
Cc: Yosry Ahmed <yosryahmed(a)google.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
diff --git a/mm/zswap.c b/mm/zswap.c
index 36903d938c15..db4625af65fb 100644
--- a/mm/zswap.c
+++ b/mm/zswap.c
@@ -1518,7 +1518,7 @@ bool zswap_store(struct folio *folio)
if (folio_test_large(folio))
return false;
- if (!zswap_enabled || !tree)
+ if (!tree)
return false;
/*
@@ -1533,6 +1533,10 @@ bool zswap_store(struct folio *folio)
zswap_invalidate_entry(tree, dupentry);
}
spin_unlock(&tree->lock);
+
+ if (!zswap_enabled)
+ return false;
+
objcg = get_obj_cgroup_from_folio(folio);
if (objcg && !obj_cgroup_may_zswap(objcg)) {
memcg = get_mem_cgroup_from_objcg(objcg);
The patch below does not apply to the 4.19-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-4.19.y
git checkout FETCH_HEAD
git cherry-pick -x db744ddd59be798c2627efbfc71f707f5a935a40
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024022609-womanless-imprison-678c@gregkh' --subject-prefix 'PATCH 4.19.y' HEAD^..
Possible dependencies:
db744ddd59be ("PCI/MSI: Prevent MSI hardware interrupt number truncation")
aa423ac4221a ("PCI/MSI: Split out irqdomain code")
a01e09ef1237 ("PCI/MSI: Split out !IRQDOMAIN code")
54324c2f3d72 ("PCI/MSI: Split out CONFIG_PCI_MSI independent part")
288c81ce4be7 ("PCI/MSI: Move code into a separate directory")
29a03ada4a00 ("PCI/MSI: Cleanup include zoo")
ae72f3156729 ("PCI/MSI: Make arch_restore_msi_irqs() less horrible.")
e58f2259b91c ("genirq/msi, treewide: Use a named struct for PCI/MSI attributes")
ade044a3d0f0 ("PCI/MSI: Remove msi_desc_to_pci_sysdata()")
9e8688c5f299 ("PCI/MSI: Make pci_msi_domain_write_msg() static")
29bbc35e29d9 ("PCI/MSI: Fix pci_irq_vector()/pci_irq_get_affinity()")
c36e33e2f477 ("Merge tag 'irq-urgent-2021-11-14' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From db744ddd59be798c2627efbfc71f707f5a935a40 Mon Sep 17 00:00:00 2001
From: Vidya Sagar <vidyas(a)nvidia.com>
Date: Mon, 15 Jan 2024 19:26:49 +0530
Subject: [PATCH] PCI/MSI: Prevent MSI hardware interrupt number truncation
While calculating the hardware interrupt number for a MSI interrupt, the
higher bits (i.e. from bit-5 onwards a.k.a domain_nr >= 32) of the PCI
domain number gets truncated because of the shifted value casting to return
type of pci_domain_nr() which is 'int'. This for example is resulting in
same hardware interrupt number for devices 0019:00:00.0 and 0039:00:00.0.
To address this cast the PCI domain number to 'irq_hw_number_t' before left
shifting it to calculate the hardware interrupt number.
Please note that this fixes the issue only on 64-bit systems and doesn't
change the behavior for 32-bit systems i.e. the 32-bit systems continue to
have the issue. Since the issue surfaces only if there are too many PCIe
controllers in the system which usually is the case in modern server
systems and they don't tend to run 32-bit kernels.
Fixes: 3878eaefb89a ("PCI/MSI: Enhance core to support hierarchy irqdomain")
Signed-off-by: Vidya Sagar <vidyas(a)nvidia.com>
Signed-off-by: Thomas Gleixner <tglx(a)linutronix.de>
Tested-by: Shanker Donthineni <sdonthineni(a)nvidia.com>
Cc: stable(a)vger.kernel.org
Link: https://lore.kernel.org/r/20240115135649.708536-1-vidyas@nvidia.com
diff --git a/drivers/pci/msi/irqdomain.c b/drivers/pci/msi/irqdomain.c
index c8be056c248d..cfd84a899c82 100644
--- a/drivers/pci/msi/irqdomain.c
+++ b/drivers/pci/msi/irqdomain.c
@@ -61,7 +61,7 @@ static irq_hw_number_t pci_msi_domain_calc_hwirq(struct msi_desc *desc)
return (irq_hw_number_t)desc->msi_index |
pci_dev_id(dev) << 11 |
- (pci_domain_nr(dev->bus) & 0xFFFFFFFF) << 27;
+ ((irq_hw_number_t)(pci_domain_nr(dev->bus) & 0xFFFFFFFF)) << 27;
}
static void pci_msi_domain_set_desc(msi_alloc_info_t *arg,
From: Shyam Prasad N <nspmangalore(a)gmail.com>
commit 69cba9d3c1284e0838ae408830a02c4a063104bc upstream.
When the number of responses with status of STATUS_IO_TIMEOUT
exceeds a specified threshold (NUM_STATUS_IO_TIMEOUT), we reconnect
the connection. But we do not return the mid, or the credits
returned for the mid, or reduce the number of in-flight requests.
This bug could result in the server->in_flight count to go bad,
and also cause a leak in the mids.
This change moves the check to a few lines below where the
response is decrypted, even of the response is read from the
transform header. This way, the code for returning the mids
can be reused.
Also, the cifs_reconnect was reconnecting just the transport
connection before. In case of multi-channel, this may not be
what we want to do after several timeouts. Changed that to
reconnect the session and the tree too.
Also renamed NUM_STATUS_IO_TIMEOUT to a more appropriate name
MAX_STATUS_IO_TIMEOUT.
Fixes: 8e670f77c4a5 ("Handle STATUS_IO_TIMEOUT gracefully")
Signed-off-by: Shyam Prasad N <sprasad(a)microsoft.com>
Signed-off-by: Steve French <stfrench(a)microsoft.com>
[Harshit: Backport to 5.15.y]
Conflicts:
fs/cifs/connect.c -- 5.15.y doesn't have commit 183eea2ee5ba
("cifs: reconnect only the connection and not smb session where
possible") -- User cifs_reconnect(server) instead of
cifs_reconnect(server, true)
Signed-off-by: Harshit Mogalapalli <harshit.m.mogalapalli(a)oracle.com>
---
Would be nice to get a review from author/maintainer of the upstream patch.
A backport request was made previously but the patch didnot apply
cleanly then:
https://lore.kernel.org/all/CANT5p=oPGnCd4H5ppMbAiHsAKMor3LT_aQRqU7tKu=q6q1…
xfstests with cifs done: before and after patching with this patch on 5.15.149.
There is no change in test results before and after the patch.
Ran: cifs/001 generic/001 generic/002 generic/005 generic/006 generic/007
generic/010 generic/011 generic/013 generic/014 generic/023 generic/024
generic/028 generic/029 generic/030 generic/036 generic/069 generic/074
generic/075 generic/084 generic/091 generic/095 generic/098 generic/100
generic/109 generic/112 generic/113 generic/124 generic/127 generic/129
generic/130 generic/132 generic/133 generic/135 generic/141 generic/169
generic/198 generic/207 generic/208 generic/210 generic/211 generic/212
generic/221 generic/239 generic/241 generic/245 generic/246 generic/247
generic/248 generic/249 generic/257 generic/263 generic/285 generic/286
generic/308 generic/309 generic/310 generic/315 generic/323 generic/339
generic/340 generic/344 generic/345 generic/346 generic/354 generic/360
generic/393 generic/394
Not run: generic/010 generic/286 generic/315
Failures: generic/075 generic/112 generic/127 generic/285
Failed 4 of 68 tests
SECTION -- smb3
=========================
Ran: cifs/001 generic/001 generic/002 generic/005 generic/006 generic/007
generic/010 generic/011 generic/013 generic/014 generic/023 generic/024
generic/028 generic/029 generic/030 generic/036 generic/069 generic/074
generic/075 generic/084 generic/091 generic/095 generic/098 generic/100
generic/109 generic/112 generic/113 generic/124 generic/127 generic/129
generic/130 generic/132 generic/133 generic/135 generic/141 generic/169
generic/198 generic/207 generic/208 generic/210 generic/211 generic/212
generic/221 generic/239 generic/241 generic/245 generic/246 generic/247
generic/248 generic/249 generic/257 generic/263 generic/285 generic/286
generic/308 generic/309 generic/310 generic/315 generic/323 generic/339
generic/340 generic/344 generic/345 generic/346 generic/354 generic/360
generic/393 generic/394
Not run: generic/010 generic/014 generic/129 generic/130 generic/239
Failures: generic/075 generic/091 generic/112 generic/127 generic/263 generic/285 generic/286
Failed 7 of 68 tests
SECTION -- smb21
=========================
Ran: cifs/001 generic/001 generic/002 generic/005 generic/006 generic/007
generic/010 generic/011 generic/013 generic/014 generic/023 generic/024
generic/028 generic/029 generic/030 generic/036 generic/069 generic/074
generic/075 generic/084 generic/091 generic/095 generic/098 generic/100
generic/109 generic/112 generic/113 generic/124 generic/127 generic/129
generic/130 generic/132 generic/133 generic/135 generic/141 generic/169
generic/198 generic/207 generic/208 generic/210 generic/211 generic/212
generic/221 generic/239 generic/241 generic/245 generic/246 generic/247
generic/248 generic/249 generic/257 generic/263 generic/285 generic/286
generic/308 generic/309 generic/310 generic/315 generic/323 generic/339
generic/340 generic/344 generic/345 generic/346 generic/354 generic/360
generic/393 generic/394
Not run: generic/010 generic/014 generic/129 generic/130 generic/239 generic/286 generic/315
Failures: generic/075 generic/112 generic/127 generic/285
Failed 4 of 68 tests
SECTION -- smb2
=========================
Ran: cifs/001 generic/001 generic/002 generic/005 generic/006 generic/007
generic/010 generic/011 generic/013 generic/014 generic/023 generic/024
generic/028 generic/029 generic/030 generic/036 generic/069 generic/074
generic/075 generic/084 generic/091 generic/095 generic/098 generic/100
generic/109 generic/112 generic/113 generic/124 generic/127 generic/129
generic/130 generic/132 generic/133 generic/135 generic/141 generic/169
generic/198 generic/207 generic/208 generic/210 generic/211 generic/212
generic/221 generic/239 generic/241 generic/245 generic/246 generic/247
generic/248 generic/249 generic/257 generic/263 generic/285 generic/286
generic/308 generic/309 generic/310 generic/315 generic/323 generic/339
generic/340 generic/344 generic/345 generic/346 generic/354 generic/360
generic/393 generic/394
Not run: generic/010 generic/286 generic/315
Failures: generic/075 generic/112 generic/127 generic/285
Failed 4 of 68 tests
---
fs/cifs/connect.c | 19 +++++++++++++++----
1 file changed, 15 insertions(+), 4 deletions(-)
diff --git a/fs/cifs/connect.c b/fs/cifs/connect.c
index a521c705b0d7..a3e4811b7871 100644
--- a/fs/cifs/connect.c
+++ b/fs/cifs/connect.c
@@ -59,7 +59,7 @@ extern bool disable_legacy_dialects;
#define TLINK_IDLE_EXPIRE (600 * HZ)
/* Drop the connection to not overload the server */
-#define NUM_STATUS_IO_TIMEOUT 5
+#define MAX_STATUS_IO_TIMEOUT 5
struct mount_ctx {
struct cifs_sb_info *cifs_sb;
@@ -965,6 +965,7 @@ cifs_demultiplex_thread(void *p)
struct mid_q_entry *mids[MAX_COMPOUND];
char *bufs[MAX_COMPOUND];
unsigned int noreclaim_flag, num_io_timeout = 0;
+ bool pending_reconnect = false;
noreclaim_flag = memalloc_noreclaim_save();
cifs_dbg(FYI, "Demultiplex PID: %d\n", task_pid_nr(current));
@@ -1004,6 +1005,8 @@ cifs_demultiplex_thread(void *p)
cifs_dbg(FYI, "RFC1002 header 0x%x\n", pdu_length);
if (!is_smb_response(server, buf[0]))
continue;
+
+ pending_reconnect = false;
next_pdu:
server->pdu_size = pdu_length;
@@ -1063,10 +1066,13 @@ cifs_demultiplex_thread(void *p)
if (server->ops->is_status_io_timeout &&
server->ops->is_status_io_timeout(buf)) {
num_io_timeout++;
- if (num_io_timeout > NUM_STATUS_IO_TIMEOUT) {
- cifs_reconnect(server);
+ if (num_io_timeout > MAX_STATUS_IO_TIMEOUT) {
+ cifs_server_dbg(VFS,
+ "Number of request timeouts exceeded %d. Reconnecting",
+ MAX_STATUS_IO_TIMEOUT);
+
+ pending_reconnect = true;
num_io_timeout = 0;
- continue;
}
}
@@ -1113,6 +1119,11 @@ cifs_demultiplex_thread(void *p)
buf = server->smallbuf;
goto next_pdu;
}
+
+ /* do this reconnect at the very end after processing all MIDs */
+ if (pending_reconnect)
+ cifs_reconnect(server);
+
} /* end while !EXITING */
/* buffer usually freed in free_mid - need to free it here on exit */
--
2.43.0
The patch below does not apply to the 6.7-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.7.y
git checkout FETCH_HEAD
git cherry-pick -x e3b63e966cac0bf78aaa1efede1827a252815a1d
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024022610-amino-basically-add3@gregkh' --subject-prefix 'PATCH 6.7.y' HEAD^..
Possible dependencies:
e3b63e966cac ("mm: zswap: fix missing folio cleanup in writeback race path")
96c7b0b42239 ("mm: return the folio from __read_swap_cache_async()")
e947ba0bbf47 ("mm/zswap: cleanup zswap_writeback_entry()")
32acba4c0483 ("mm/zswap: refactor out __zswap_load()")
c75f5c1e0f1d ("mm/zswap: reuse dstmem when decompress")
b5ba474f3f51 ("zswap: shrink zswap pool based on memory pressure")
a65b0e7607cc ("zswap: make shrinking memcg-aware")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From e3b63e966cac0bf78aaa1efede1827a252815a1d Mon Sep 17 00:00:00 2001
From: Yosry Ahmed <yosryahmed(a)google.com>
Date: Thu, 25 Jan 2024 08:51:27 +0000
Subject: [PATCH] mm: zswap: fix missing folio cleanup in writeback race path
In zswap_writeback_entry(), after we get a folio from
__read_swap_cache_async(), we grab the tree lock again to check that the
swap entry was not invalidated and recycled. If it was, we delete the
folio we just added to the swap cache and exit.
However, __read_swap_cache_async() returns the folio locked when it is
newly allocated, which is always true for this path, and the folio is
ref'd. Make sure to unlock and put the folio before returning.
This was discovered by code inspection, probably because this path handles
a race condition that should not happen often, and the bug would not crash
the system, it will only strand the folio indefinitely.
Link: https://lkml.kernel.org/r/20240125085127.1327013-1-yosryahmed@google.com
Fixes: 04fc7816089c ("mm: fix zswap writeback race condition")
Signed-off-by: Yosry Ahmed <yosryahmed(a)google.com>
Reviewed-by: Chengming Zhou <zhouchengming(a)bytedance.com>
Acked-by: Johannes Weiner <hannes(a)cmpxchg.org>
Reviewed-by: Nhat Pham <nphamcs(a)gmail.com>
Cc: Domenico Cerasuolo <cerasuolodomenico(a)gmail.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
diff --git a/mm/zswap.c b/mm/zswap.c
index 350dd2fc8159..d2423247acfd 100644
--- a/mm/zswap.c
+++ b/mm/zswap.c
@@ -1440,6 +1440,8 @@ static int zswap_writeback_entry(struct zswap_entry *entry,
if (zswap_rb_search(&tree->rbroot, swp_offset(entry->swpentry)) != entry) {
spin_unlock(&tree->lock);
delete_from_swap_cache(folio);
+ folio_unlock(folio);
+ folio_put(folio);
return -ENOMEM;
}
spin_unlock(&tree->lock);
The following patch accidentally removed the code for delivering
completions for cancelled reads and writes to user space: "[PATCH 04/33]
aio: remove retry-based AIO"
(https://lore.kernel.org/all/1363883754-27966-5-git-send-email-koverstreet@g…)
From that patch:
- if (kiocbIsCancelled(iocb)) {
- ret = -EINTR;
- aio_complete(iocb, ret, 0);
- /* must not access the iocb after this */
- goto out;
- }
This leads to a leak in user space of a struct iocb. Hence this patch
that restores the code that reports to user space that a read or write
has been cancelled successfully.
Fixes: 41003a7bcfed ("aio: remove retry-based AIO")
Cc: Christoph Hellwig <hch(a)lst.de>
Cc: Avi Kivity <avi(a)scylladb.com>
Cc: Sandeep Dhavale <dhavale(a)google.com>
Cc: Jens Axboe <axboe(a)kernel.dk>
Cc: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Cc: Kent Overstreet <kent.overstreet(a)linux.dev>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Bart Van Assche <bvanassche(a)acm.org>
---
fs/aio.c | 27 +++++++++++----------------
1 file changed, 11 insertions(+), 16 deletions(-)
diff --git a/fs/aio.c b/fs/aio.c
index da18dbcfcb22..28223f511931 100644
--- a/fs/aio.c
+++ b/fs/aio.c
@@ -2165,14 +2165,11 @@ COMPAT_SYSCALL_DEFINE3(io_submit, compat_aio_context_t, ctx_id,
#endif
/* sys_io_cancel:
- * Attempts to cancel an iocb previously passed to io_submit. If
- * the operation is successfully cancelled, the resulting event is
- * copied into the memory pointed to by result without being placed
- * into the completion queue and 0 is returned. May fail with
- * -EFAULT if any of the data structures pointed to are invalid.
- * May fail with -EINVAL if aio_context specified by ctx_id is
- * invalid. May fail with -EAGAIN if the iocb specified was not
- * cancelled. Will fail with -ENOSYS if not implemented.
+ * Attempts to cancel an iocb previously passed to io_submit(). If the
+ * operation is successfully cancelled 0 is returned. May fail with
+ * -EFAULT if any of the data structures pointed to are invalid. May
+ * fail with -EINVAL if aio_context specified by ctx_id is invalid. Will
+ * fail with -ENOSYS if not implemented.
*/
SYSCALL_DEFINE3(io_cancel, aio_context_t, ctx_id, struct iocb __user *, iocb,
struct io_event __user *, result)
@@ -2203,14 +2200,12 @@ SYSCALL_DEFINE3(io_cancel, aio_context_t, ctx_id, struct iocb __user *, iocb,
}
spin_unlock_irq(&ctx->ctx_lock);
- if (!ret) {
- /*
- * The result argument is no longer used - the io_event is
- * always delivered via the ring buffer. -EINPROGRESS indicates
- * cancellation is progress:
- */
- ret = -EINPROGRESS;
- }
+ /*
+ * The result argument is no longer used - the io_event is always
+ * delivered via the ring buffer.
+ */
+ if (ret == 0 && kiocb->rw.ki_flags & IOCB_AIO_RW)
+ aio_complete_rw(&kiocb->rw, -EINTR);
percpu_ref_put(&ctx->users);
reg_read() callback registered with nvmem core expects an integer error
as a return value but rmem_read() returns the number of bytes read, as a
result error checks in nvmem core fail even when they shouldn't.
Return 0 on success where number of bytes read match the number of bytes
requested and a negative error -EINVAL on all other cases.
Fixes: 5a3fa75a4d9c ("nvmem: Add driver to expose reserved memory as nvmem")
Cc: stable(a)vger.kernel.org
Signed-off-by: Joy Chakraborty <joychakr(a)google.com>
---
drivers/nvmem/rmem.c | 7 ++++++-
1 file changed, 6 insertions(+), 1 deletion(-)
diff --git a/drivers/nvmem/rmem.c b/drivers/nvmem/rmem.c
index 752d0bf4445e..a74dfa279ff4 100644
--- a/drivers/nvmem/rmem.c
+++ b/drivers/nvmem/rmem.c
@@ -46,7 +46,12 @@ static int rmem_read(void *context, unsigned int offset,
memunmap(addr);
- return count;
+ if (count != bytes) {
+ dev_err(priv->dev, "Failed read memory (%d)\n", count);
+ return -EINVAL;
+ }
+
+ return 0;
}
static int rmem_probe(struct platform_device *pdev)
--
2.43.0.594.gd9cf4e227d-goog
Hi,
I wanted to check with you if you had a time to go through my previous
email,
Let me know your thoughts about acquiring this email list
Regards,
*Olivia*
______________________________________________________________________________________________
Hi,
I hope you are the right person to discuss about *Healthcare Leads*, which
includes complete contact details, and tele-verified email addresses.
Please find the Leads Breakdown Chart below:
*Criteria*
*Counts*
*Criteria*
*Counts*
*Criteria*
*Counts*
Allergy immunology
5,064
Healthcare Technology
20,540
Plastic surgery
8,371
Anesthesiology
30,155
Nephrology
6,606
Preventive medicine
6,642
Cardiology
24,577
Neurological surgery
7,066
Psychiatry
4,315
Dermatology
8,467
Neurology
13,354
Radiology
32,763
Emergency medicine
22,300
Obgyn
35,163
Surgery
39,517
Endocrinology diabetes metabolism
3,756
Oncology
17,881
Urology
10,135
Family practice1
62,544
Ophthalmology
15,237
Physician
100,000
Gastroenterology
11,913
Orthopedics
22,145
Doctors
128,000
General practice
12,957
Other
15,559
Dentists
150,200
Geriatrics Doctors
9,634
Otolaryngology
9,539
Osteopathic
25,000
Infectious disease
5,677
Pathology
15,467
Acupuncture
5,000
Internal medicine1
120,029
Pediatrics
55,684
Chiropractors
11,000
Haematology Doctors
12,850
Physical medicine
8,437
Rheumatology
5,000
*Data Fields:* Every lead includes Name, Company, Job Title, Website,
Physical Address, Industry, *Phone Number and Verified/Opt-In Email
Address.* Please let me know if you have any queries about our custom
opt-in list and I would love to answer them.
Kindly share your thoughts.
Warm Regards,
*Olivia Stewart*
*Marketing Executive *
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
We respect your privacy, if you want to remove it from this list. Please
reply with the subject line as “Leave Out”.
This series includes 6 types of fixes:
- Patch 1 fixes v4 mapped in v6 addresses support for the userspace PM,
when asking to delete a subflow. It was done everywhere else, but not
there. Patch 2 validates the modification, thanks to a subtest in
mptcp_join.sh. These patches can be backported up to v5.19.
- Patch 3 is a small fix for a recent bug-fix patch, just to avoid
printing an irrelevant warning (pr_warn()) once. It can be backported
up to v5.6, alongside the bug-fix that has been introduced in the
v6.8-rc5.
- Patches 4 to 6 are fixes for bugs found by Paolo while working on
TCP_NOTSENT_LOWAT support for MPTCP. These fixes can improve the
performances in some cases. Patches can be backported up to v5.6,
v5.11 and v6.7 respectively.
- Patch 7 makes sure 'ss -M' is available when starting MPTCP Join
selftest as it is required for some subtests since v5.18.
- Patch 8 fixes a possible double-free on socket dismantle. The issue
always existed, but was unnoticed because it was not causing any
problem so far. This fix can be backported up to v5.6.
- Patch 9 is a fix for a very recent patch causing lockdep warnings in
subflow diag. The patch causing the regression -- which fixes another
issue present since v5.7 -- should be part of the future v6.8-rc6.
Patch 10 validates the modification, thanks to a new subtest in
diag.sh.
Signed-off-by: Matthieu Baerts (NGI0) <matttbe(a)kernel.org>
---
Davide Caratti (1):
mptcp: fix double-free on socket dismantle
Geliang Tang (3):
mptcp: map v4 address to v6 when destroying subflow
selftests: mptcp: rm subflow with v4/v4mapped addr
selftests: mptcp: join: add ss mptcp support check
Matthieu Baerts (NGI0) (1):
mptcp: avoid printing warning once on client side
Paolo Abeni (5):
mptcp: push at DSS boundaries
mptcp: fix snd_wnd initialization for passive socket
mptcp: fix potential wake-up event loss
mptcp: fix possible deadlock in subflow diag
selftests: mptcp: explicitly trigger the listener diag code-path
net/mptcp/diag.c | 3 ++
net/mptcp/options.c | 2 +-
net/mptcp/pm_userspace.c | 10 +++++
net/mptcp/protocol.c | 52 ++++++++++++++++++++++++-
net/mptcp/protocol.h | 21 +++++-----
tools/testing/selftests/net/mptcp/diag.sh | 30 +++++++++++++-
tools/testing/selftests/net/mptcp/mptcp_join.sh | 33 ++++++++++------
tools/testing/selftests/net/mptcp/mptcp_lib.sh | 4 +-
8 files changed, 128 insertions(+), 27 deletions(-)
---
base-commit: b0b1210bc150fbd741b4b9fce8a24541306b40fc
change-id: 20240223-upstream-net-20240223-misc-fixes-1630cd6b3b0a
Best regards,
--
Matthieu Baerts (NGI0) <matttbe(a)kernel.org>
> -----Original Message-----
> From: Larry Finger <Larry.Finger(a)gmail.com>
> Sent: Tuesday, February 27, 2024 9:41 AM
> To: Kalle Valo <kvalo(a)kernel.org>
> Cc: Johannes Berg <johannes(a)sipsolutions.net>; linux-wireless(a)vger.kernel.org; Nick Morrow
> <morrownr(a)gmail.com>; Larry Finger <Larry.Finger(a)lwfinger.net>; stable(a)vger.kernel.org
> Subject: [PATCH] wifi:rtw88: Add missing VID/PIDs
Missing a space between "wifi:" and "rtw88:", and suggest to mention 8811CU
and 8821CU in subject. Others look good to me.
>
> From: Nick Morrow <morrownr(a)gmail.com>
>
> Purpose: Add VID/PIDs that are known to be missing for this driver.
> - removed /* 8811CU */ and /* 8821CU */ as they are redundant
> since the file is specific to those chips.
> - removed /* TOTOLINK A650UA v3 */ as the manufacturer. It has a REALTEK
> VID so it may not be specific to this adapter.
>
> Source is
> https://1EHFQ.trk.elasticemail.com/tracking/click?d=I82H0YR_W_h175Lb3Nkb0D8…
> 0SPxd1Olp3PNJEm7h1Gft8lKFiXqYf1jEjniUnBHTdCi0Ypi2Y9ugy88eGHqb5MB9U0M7ZbBBaOwoaG0eHpd73dxUfRcicgS3TFBvw
> 066sdoIh1JxdrADO_ro60
> Verified and tested.
>
> Signed-off-by: Nick Morrow <morrownr(a)gmail.com>
> Signed-off-by: Larry Finger <Larry.Finger(a)lwfinger.net>
> Cc: stable(a)vger.kernel.org
Acked-by: Ping-Ke Shih <pkshih(a)realtek.com>
It looks like both 5.15.146 and 5.10.206 are impacted by this regression as they both have the
bad commit 33eae65c6f (smb: client: fix OOB in SMB2_query_info_init()). We tried to
apply the proposed fix eb3e28c1e89b ("smb3: Replace smb2pdu 1-element
arrays with flex-arrays”) but there are a lot of dependencies required to do the backport.
Is it possible to consider the simple fix that Paulo proposed as a solution for 5.10 and 5.15.
We were lucky with 5.4 as it doesn’t have the bad commit because of merge conflict reported
in https://lore.kernel.org/all/2023122857-doubling-crazed-27f4@gregkh/T/#m3aa0…
diff --git a/fs/smb/client/smb2pdu.c b/fs/smb/client/smb2pdu.c
index 05ff8a457a3d..aed5067661de 100644
--- a/fs/smb/client/smb2pdu.c
+++ b/fs/smb/client/smb2pdu.c
@@ -3556,7 +3556,7 @@ SMB2_query_info_init(struct cifs_tcon *tcon, struct TCP_Server_Info *server,
iov[0].iov_base = (char *)req;
/* 1 for Buffer */
- iov[0].iov_len = len;
+ iov[0].iov_len = len - 1;
return 0;
}
Hazem
The following commit has been merged into the x86/misc branch of tip:
Commit-ID: d54e56f31a34fa38fcb5e91df609f9633419a79a
Gitweb: https://git.kernel.org/tip/d54e56f31a34fa38fcb5e91df609f9633419a79a
Author: Breno Leitao <leitao(a)debian.org>
AuthorDate: Wed, 07 Feb 2024 08:52:35 -08:00
Committer: Thomas Gleixner <tglx(a)linutronix.de>
CommitterDate: Mon, 26 Feb 2024 23:41:30 +01:00
x86/nmi: Fix the inverse "in NMI handler" check
Commit 344da544f177 ("x86/nmi: Print reasons why backtrace NMIs are
ignored") creates a super nice framework to diagnose NMIs.
Every time nmi_exc() is called, it increments a per_cpu counter
(nsp->idt_nmi_seq). At its exit, it also increments the same counter. By
reading this counter it can be seen how many times that function was called
(dividing by 2), and, if the function is still being executed, by checking
the idt_nmi_seq's least significant bit.
On the check side (nmi_backtrace_stall_check()), that variable is queried
to check if the NMI is still being executed, but, there is a mistake in the
bitwise operation. That code wants to check if the least significant bit of
the idt_nmi_seq is set or not, but does the opposite, and checks for all
the other bits, which will always be true after the first exc_nmi()
executed successfully.
This appends the misleading string to the dump "(CPU currently in NMI
handler function)"
Fix it by checking the least significant bit, and if it is set, append the
string.
Fixes: 344da544f177 ("x86/nmi: Print reasons why backtrace NMIs are ignored")
Signed-off-by: Breno Leitao <leitao(a)debian.org>
Signed-off-by: Thomas Gleixner <tglx(a)linutronix.de>
Reviewed-by: Paul E. McKenney <paulmck(a)kernel.org>
Cc: stable(a)vger.kernel.org
Link: https://lore.kernel.org/r/20240207165237.1048837-1-leitao@debian.org
---
arch/x86/kernel/nmi.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/x86/kernel/nmi.c b/arch/x86/kernel/nmi.c
index d238679..c95dc1b 100644
--- a/arch/x86/kernel/nmi.c
+++ b/arch/x86/kernel/nmi.c
@@ -639,7 +639,7 @@ void nmi_backtrace_stall_check(const struct cpumask *btp)
msgp = nmi_check_stall_msg[idx];
if (nsp->idt_ignored_snap != READ_ONCE(nsp->idt_ignored) && (idx & 0x1))
modp = ", but OK because ignore_nmis was set";
- if (nmi_seq & ~0x1)
+ if (nmi_seq & 0x1)
msghp = " (CPU currently in NMI handler function)";
else if (nsp->idt_nmi_seq_snap + 1 == nmi_seq)
msghp = " (CPU exited one NMI handler function)";
On Mon, Feb 26, 2024 at 05:27:50PM +0200, Радослав Ненчовски wrote:
> Hi. IDK how more clear to write it in the title, so let me explain what the
> problem is.
I'm sending your message to stable instead, because helpdesk is only for
requesting help with kernel.org infrastructure.
Stable folks, please see below.
-K
> In the past 4 or 5 years I've been using this script (with an alias) to
> compress a single folder:
> 7z a "$1.7z" "$1"/ -mx=0 -mmt=8
>
> I know it doesn't look like much but essentially it creates a 7z archive
> (with "store" level of compression) with a name I've entered right after the
> alias. For instance: 7z0 "my dir" will create "my dir.7z".
> And in the past 4 or 5 years this script was working just fine because it
> was recognizing the slash as an indication that the target to compress is a
> directory.
> However, ever since 6.6.17-LTS arrived (altough I've heard the same
> complaints from people who use the regular rolling kernel, but they didn't
> tell me which version) bash stopped recognizing the slash as an indication
> for directory and thinks of it as the entire root directory, thus it
> attempts to compress not only "my dir" but also the whole root (/)
> directory. And it doesn't matter whether I'll put the slash between the
> quotes or outside of them - the result is the same. And, naturally, it
> throws out an unlimited number of errors about "access denied" to everything
> in root. I can't even begin to comprehend why on Earth you or whoever writes
> the kernel would make this change. Forget about me but ALL linux sysadmins I
> know use all kinds of scripts and changing the slash at the end of a word to
> mean "root" instead of a sign for directory is a rude way to ruin their
> work. Since this change occurred, I can no longer put a directory in an
> archive through CLI and I have to do it through GUI, which is about 10 times
> slower. I have a DE and I can do that but what about the sysadmins who
> usually use linux without a DE or directly SSH into the distro they're
> admins of? With this change you're literally hindering their job!
>
> I downgraded the kernel to 6.6.15-LTS and the problem disappeared - now the
> slash is properly recognized as a sign for directory.
>
> The point is: *it is urgent that you undo this change back to the way it
> was! I'm pretty sure sysadmins will begin to email you about this, if they
> haven't already.
> *
Oliver Upton (2):
KVM: arm64: vgic-its: Test for valid IRQ in
its_sync_lpi_pending_table()
KVM: arm64: vgic-its: Test for valid IRQ in MOVALL handler
virt/kvm/arm/vgic/vgic-its.c | 5 +++++
1 file changed, 5 insertions(+)
base-commit: 6e1f54a4985b63bc1b55a09e5e75a974c5d6719b
--
2.44.0.rc1.240.g4c46232300-goog
The patch below does not apply to the 6.6-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.6.y
git checkout FETCH_HEAD
git cherry-pick -x 04b57c9e096a9479fe0ad31e3956e336fa589cb2
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024021924-setback-disinfect-0bd6@gregkh' --subject-prefix 'PATCH 6.6.y' HEAD^..
Possible dependencies:
04b57c9e096a ("selftests: mptcp: join: stop transfer when check is done (part 2)")
b9fb176081fb ("selftests: mptcp: userspace pm send RM_ADDR for ID 0")
e3b47e460b4b ("selftests: mptcp: userspace pm remove initial subflow")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 04b57c9e096a9479fe0ad31e3956e336fa589cb2 Mon Sep 17 00:00:00 2001
From: "Matthieu Baerts (NGI0)" <matttbe(a)kernel.org>
Date: Wed, 31 Jan 2024 22:49:54 +0100
Subject: [PATCH] selftests: mptcp: join: stop transfer when check is done
(part 2)
Since the "Fixes" commits mentioned below, the newly added "userspace
pm" subtests of mptcp_join selftests are launching the whole transfer in
the background, do the required checks, then wait for the end of
transfer.
There is no need to wait longer, especially because the checks at the
end of the transfer are ignored (which is fine). This saves quite a few
seconds on slow environments.
While at it, use 'mptcp_lib_kill_wait()' helper everywhere, instead of
on a specific one with 'kill_tests_wait()'.
Fixes: b2e2248f365a ("selftests: mptcp: userspace pm create id 0 subflow")
Fixes: e3b47e460b4b ("selftests: mptcp: userspace pm remove initial subflow")
Fixes: b9fb176081fb ("selftests: mptcp: userspace pm send RM_ADDR for ID 0")
Cc: stable(a)vger.kernel.org
Reviewed-and-tested-by: Geliang Tang <geliang(a)kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe(a)kernel.org>
Link: https://lore.kernel.org/r/20240131-upstream-net-20240131-mptcp-ci-issues-v1…
Signed-off-by: Jakub Kicinski <kuba(a)kernel.org>
diff --git a/tools/testing/selftests/net/mptcp/mptcp_join.sh b/tools/testing/selftests/net/mptcp/mptcp_join.sh
index 85bcc95f4ede..c07386e21e0a 100755
--- a/tools/testing/selftests/net/mptcp/mptcp_join.sh
+++ b/tools/testing/selftests/net/mptcp/mptcp_join.sh
@@ -643,13 +643,6 @@ kill_events_pids()
mptcp_lib_kill_wait $evts_ns2_pid
}
-kill_tests_wait()
-{
- #shellcheck disable=SC2046
- kill -SIGUSR1 $(ip netns pids $ns2) $(ip netns pids $ns1)
- wait
-}
-
pm_nl_set_limits()
{
local ns=$1
@@ -3494,7 +3487,7 @@ userspace_tests()
chk_mptcp_info subflows 1 subflows 1
chk_subflows_total 2 2
kill_events_pids
- wait $tests_pid
+ mptcp_lib_kill_wait $tests_pid
fi
# userspace pm remove initial subflow
@@ -3518,7 +3511,7 @@ userspace_tests()
chk_mptcp_info subflows 1 subflows 1
chk_subflows_total 1 1
kill_events_pids
- wait $tests_pid
+ mptcp_lib_kill_wait $tests_pid
fi
# userspace pm send RM_ADDR for ID 0
@@ -3544,7 +3537,7 @@ userspace_tests()
chk_mptcp_info subflows 1 subflows 1
chk_subflows_total 1 1
kill_events_pids
- wait $tests_pid
+ mptcp_lib_kill_wait $tests_pid
fi
}
@@ -3558,7 +3551,8 @@ endpoint_tests()
pm_nl_set_limits $ns2 2 2
pm_nl_add_endpoint $ns1 10.0.2.1 flags signal
speed=slow \
- run_tests $ns1 $ns2 10.0.1.1 2>/dev/null &
+ run_tests $ns1 $ns2 10.0.1.1 &
+ local tests_pid=$!
wait_mpj $ns1
pm_nl_check_endpoint "creation" \
@@ -3573,7 +3567,7 @@ endpoint_tests()
pm_nl_add_endpoint $ns2 10.0.2.2 flags signal
pm_nl_check_endpoint "modif is allowed" \
$ns2 10.0.2.2 id 1 flags signal
- kill_tests_wait
+ mptcp_lib_kill_wait $tests_pid
fi
if reset "delete and re-add" &&
@@ -3582,7 +3576,8 @@ endpoint_tests()
pm_nl_set_limits $ns2 1 1
pm_nl_add_endpoint $ns2 10.0.2.2 id 2 dev ns2eth2 flags subflow
test_linkfail=4 speed=20 \
- run_tests $ns1 $ns2 10.0.1.1 2>/dev/null &
+ run_tests $ns1 $ns2 10.0.1.1 &
+ local tests_pid=$!
wait_mpj $ns2
chk_subflow_nr "before delete" 2
@@ -3597,7 +3592,7 @@ endpoint_tests()
wait_mpj $ns2
chk_subflow_nr "after re-add" 2
chk_mptcp_info subflows 1 subflows 1
- kill_tests_wait
+ mptcp_lib_kill_wait $tests_pid
fi
}
The patch below does not apply to the 6.6-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.6.y
git checkout FETCH_HEAD
git cherry-pick -x 31ee4ad86afd6ed6f4bb1b38c43011216080c42a
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024021917-nuzzle-magenta-7de4@gregkh' --subject-prefix 'PATCH 6.6.y' HEAD^..
Possible dependencies:
31ee4ad86afd ("selftests: mptcp: join: stop transfer when check is done (part 1)")
80775412882e ("selftests: mptcp: add chk_subflows_total helper")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 31ee4ad86afd6ed6f4bb1b38c43011216080c42a Mon Sep 17 00:00:00 2001
From: "Matthieu Baerts (NGI0)" <matttbe(a)kernel.org>
Date: Wed, 31 Jan 2024 22:49:53 +0100
Subject: [PATCH] selftests: mptcp: join: stop transfer when check is done
(part 1)
Since the "Fixes" commit mentioned below, "userspace pm" subtests of
mptcp_join selftests introduced in v6.5 are launching the whole transfer
in the background, do the required checks, then wait for the end of
transfer.
There is no need to wait longer, especially because the checks at the
end of the transfer are ignored (which is fine). This saves quite a few
seconds in slow environments.
Note that old versions will need commit bdbef0a6ff10 ("selftests: mptcp:
add mptcp_lib_kill_wait") as well to get 'mptcp_lib_kill_wait()' helper.
Fixes: 4369c198e599 ("selftests: mptcp: test userspace pm out of transfer")
Cc: stable(a)vger.kernel.org # 6.5.x: bdbef0a6ff10: selftests: mptcp: add mptcp_lib_kill_wait
Cc: stable(a)vger.kernel.org # 6.5.x
Reviewed-and-tested-by: Geliang Tang <geliang(a)kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe(a)kernel.org>
Link: https://lore.kernel.org/r/20240131-upstream-net-20240131-mptcp-ci-issues-v1…
Signed-off-by: Jakub Kicinski <kuba(a)kernel.org>
diff --git a/tools/testing/selftests/net/mptcp/mptcp_join.sh b/tools/testing/selftests/net/mptcp/mptcp_join.sh
index 3a5b63026191..85bcc95f4ede 100755
--- a/tools/testing/selftests/net/mptcp/mptcp_join.sh
+++ b/tools/testing/selftests/net/mptcp/mptcp_join.sh
@@ -3453,7 +3453,7 @@ userspace_tests()
chk_mptcp_info subflows 0 subflows 0
chk_subflows_total 1 1
kill_events_pids
- wait $tests_pid
+ mptcp_lib_kill_wait $tests_pid
fi
# userspace pm create destroy subflow
@@ -3475,7 +3475,7 @@ userspace_tests()
chk_mptcp_info subflows 0 subflows 0
chk_subflows_total 1 1
kill_events_pids
- wait $tests_pid
+ mptcp_lib_kill_wait $tests_pid
fi
# userspace pm create id 0 subflow
From: Geliang Tang <geliang.tang(a)suse.com>
This patch adds the ability to send RM_ADDR for local ID 0. Check
whether id 0 address is removed, if not, put id 0 into a removing
list, pass it to mptcp_pm_remove_addr() to remove id 0 address.
There is no reason not to allow the userspace to remove the initial
address (ID 0). This special case was not taken into account not
letting the userspace to delete all addresses as announced.
Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/379
Reviewed-by: Matthieu Baerts <matttbe(a)kernel.org>
Signed-off-by: Geliang Tang <geliang.tang(a)suse.com>
Signed-off-by: Mat Martineau <martineau(a)kernel.org>
Link: https://lore.kernel.org/r/20231025-send-net-next-20231025-v1-3-db8f25f798eb…
Signed-off-by: Jakub Kicinski <kuba(a)kernel.org>
(cherry picked from commit 84c531f54ad9a124a924c9505d74e33d16965146)
Fixes: d9a4594edabf ("mptcp: netlink: Add MPTCP_PM_CMD_REMOVE")
Signed-off-by: Matthieu Baerts (NGI0) <matttbe(a)kernel.org>
---
Notes:
- As mentioned in [1], the 'Fixes' tag has been accidentally dropped:
[1] https://lore.kernel.org/stable/a7a3675a-4531-4559-bea2-c7689317764a@kernel.…
- Conflict in pm_userspace.c because the new helper function expected
to be on top of mptcp_pm_nl_remove_doit() which has been recently
renamed in commit 1e07938e29c5 ("net: mptcp: rename netlink handlers
to mptcp_pm_nl_<blah>_{doit,dumpit}").
---
net/mptcp/pm_userspace.c | 39 +++++++++++++++++++++++++++++++++++++++
1 file changed, 39 insertions(+)
diff --git a/net/mptcp/pm_userspace.c b/net/mptcp/pm_userspace.c
index 3b34b7cf56c9..ecd166ce047d 100644
--- a/net/mptcp/pm_userspace.c
+++ b/net/mptcp/pm_userspace.c
@@ -220,6 +220,40 @@ int mptcp_nl_cmd_announce(struct sk_buff *skb, struct genl_info *info)
return err;
}
+static int mptcp_userspace_pm_remove_id_zero_address(struct mptcp_sock *msk,
+ struct genl_info *info)
+{
+ struct mptcp_rm_list list = { .nr = 0 };
+ struct mptcp_subflow_context *subflow;
+ struct sock *sk = (struct sock *)msk;
+ bool has_id_0 = false;
+ int err = -EINVAL;
+
+ lock_sock(sk);
+ mptcp_for_each_subflow(msk, subflow) {
+ if (subflow->local_id == 0) {
+ has_id_0 = true;
+ break;
+ }
+ }
+ if (!has_id_0) {
+ GENL_SET_ERR_MSG(info, "address with id 0 not found");
+ goto remove_err;
+ }
+
+ list.ids[list.nr++] = 0;
+
+ spin_lock_bh(&msk->pm.lock);
+ mptcp_pm_remove_addr(msk, &list);
+ spin_unlock_bh(&msk->pm.lock);
+
+ err = 0;
+
+remove_err:
+ release_sock(sk);
+ return err;
+}
+
int mptcp_nl_cmd_remove(struct sk_buff *skb, struct genl_info *info)
{
struct nlattr *token = info->attrs[MPTCP_PM_ATTR_TOKEN];
@@ -251,6 +285,11 @@ int mptcp_nl_cmd_remove(struct sk_buff *skb, struct genl_info *info)
goto remove_err;
}
+ if (id_val == 0) {
+ err = mptcp_userspace_pm_remove_id_zero_address(msk, info);
+ goto remove_err;
+ }
+
lock_sock((struct sock *)msk);
list_for_each_entry(entry, &msk->pm.userspace_pm_local_addr_list, list) {
--
2.43.0
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.4.y
git checkout FETCH_HEAD
git cherry-pick -x b820de741ae48ccf50dd95e297889c286ff4f760
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024022601-stem-comfort-1bb5@gregkh' --subject-prefix 'PATCH 5.4.y' HEAD^..
Possible dependencies:
b820de741ae4 ("fs/aio: Restrict kiocb_set_cancel_fn() to I/O submitted via libaio")
9cf3516c29e6 ("fs: add IOCB flags related to passing back dio completions")
f6c73a11133e ("fs.h: Add TRACE_IOCB_STRINGS for use in trace points")
1da8cf961bb1 ("Merge tag 'io_uring-6.0-2022-08-13' of git://git.kernel.dk/linux-block")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From b820de741ae48ccf50dd95e297889c286ff4f760 Mon Sep 17 00:00:00 2001
From: Bart Van Assche <bvanassche(a)acm.org>
Date: Thu, 15 Feb 2024 12:47:38 -0800
Subject: [PATCH] fs/aio: Restrict kiocb_set_cancel_fn() to I/O submitted via
libaio
If kiocb_set_cancel_fn() is called for I/O submitted via io_uring, the
following kernel warning appears:
WARNING: CPU: 3 PID: 368 at fs/aio.c:598 kiocb_set_cancel_fn+0x9c/0xa8
Call trace:
kiocb_set_cancel_fn+0x9c/0xa8
ffs_epfile_read_iter+0x144/0x1d0
io_read+0x19c/0x498
io_issue_sqe+0x118/0x27c
io_submit_sqes+0x25c/0x5fc
__arm64_sys_io_uring_enter+0x104/0xab0
invoke_syscall+0x58/0x11c
el0_svc_common+0xb4/0xf4
do_el0_svc+0x2c/0xb0
el0_svc+0x2c/0xa4
el0t_64_sync_handler+0x68/0xb4
el0t_64_sync+0x1a4/0x1a8
Fix this by setting the IOCB_AIO_RW flag for read and write I/O that is
submitted by libaio.
Suggested-by: Jens Axboe <axboe(a)kernel.dk>
Cc: Christoph Hellwig <hch(a)lst.de>
Cc: Avi Kivity <avi(a)scylladb.com>
Cc: Sandeep Dhavale <dhavale(a)google.com>
Cc: Jens Axboe <axboe(a)kernel.dk>
Cc: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Cc: Kent Overstreet <kent.overstreet(a)linux.dev>
Cc: stable(a)vger.kernel.org
Signed-off-by: Bart Van Assche <bvanassche(a)acm.org>
Link: https://lore.kernel.org/r/20240215204739.2677806-2-bvanassche@acm.org
Signed-off-by: Christian Brauner <brauner(a)kernel.org>
diff --git a/fs/aio.c b/fs/aio.c
index bb2ff48991f3..da18dbcfcb22 100644
--- a/fs/aio.c
+++ b/fs/aio.c
@@ -593,6 +593,13 @@ void kiocb_set_cancel_fn(struct kiocb *iocb, kiocb_cancel_fn *cancel)
struct kioctx *ctx = req->ki_ctx;
unsigned long flags;
+ /*
+ * kiocb didn't come from aio or is neither a read nor a write, hence
+ * ignore it.
+ */
+ if (!(iocb->ki_flags & IOCB_AIO_RW))
+ return;
+
if (WARN_ON_ONCE(!list_empty(&req->ki_list)))
return;
@@ -1509,7 +1516,7 @@ static int aio_prep_rw(struct kiocb *req, const struct iocb *iocb)
req->ki_complete = aio_complete_rw;
req->private = NULL;
req->ki_pos = iocb->aio_offset;
- req->ki_flags = req->ki_filp->f_iocb_flags;
+ req->ki_flags = req->ki_filp->f_iocb_flags | IOCB_AIO_RW;
if (iocb->aio_flags & IOCB_FLAG_RESFD)
req->ki_flags |= IOCB_EVENTFD;
if (iocb->aio_flags & IOCB_FLAG_IOPRIO) {
diff --git a/include/linux/fs.h b/include/linux/fs.h
index ed5966a70495..c2dcc98cb4c8 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -352,6 +352,8 @@ enum rw_hint {
* unrelated IO (like cache flushing, new IO generation, etc).
*/
#define IOCB_DIO_CALLER_COMP (1 << 22)
+/* kiocb is a read or write operation submitted by fs/aio.c. */
+#define IOCB_AIO_RW (1 << 23)
/* for use in trace events */
#define TRACE_IOCB_STRINGS \
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.10.y
git checkout FETCH_HEAD
git cherry-pick -x b820de741ae48ccf50dd95e297889c286ff4f760
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024022654-stainless-aground-196f@gregkh' --subject-prefix 'PATCH 5.10.y' HEAD^..
Possible dependencies:
b820de741ae4 ("fs/aio: Restrict kiocb_set_cancel_fn() to I/O submitted via libaio")
9cf3516c29e6 ("fs: add IOCB flags related to passing back dio completions")
f6c73a11133e ("fs.h: Add TRACE_IOCB_STRINGS for use in trace points")
1da8cf961bb1 ("Merge tag 'io_uring-6.0-2022-08-13' of git://git.kernel.dk/linux-block")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From b820de741ae48ccf50dd95e297889c286ff4f760 Mon Sep 17 00:00:00 2001
From: Bart Van Assche <bvanassche(a)acm.org>
Date: Thu, 15 Feb 2024 12:47:38 -0800
Subject: [PATCH] fs/aio: Restrict kiocb_set_cancel_fn() to I/O submitted via
libaio
If kiocb_set_cancel_fn() is called for I/O submitted via io_uring, the
following kernel warning appears:
WARNING: CPU: 3 PID: 368 at fs/aio.c:598 kiocb_set_cancel_fn+0x9c/0xa8
Call trace:
kiocb_set_cancel_fn+0x9c/0xa8
ffs_epfile_read_iter+0x144/0x1d0
io_read+0x19c/0x498
io_issue_sqe+0x118/0x27c
io_submit_sqes+0x25c/0x5fc
__arm64_sys_io_uring_enter+0x104/0xab0
invoke_syscall+0x58/0x11c
el0_svc_common+0xb4/0xf4
do_el0_svc+0x2c/0xb0
el0_svc+0x2c/0xa4
el0t_64_sync_handler+0x68/0xb4
el0t_64_sync+0x1a4/0x1a8
Fix this by setting the IOCB_AIO_RW flag for read and write I/O that is
submitted by libaio.
Suggested-by: Jens Axboe <axboe(a)kernel.dk>
Cc: Christoph Hellwig <hch(a)lst.de>
Cc: Avi Kivity <avi(a)scylladb.com>
Cc: Sandeep Dhavale <dhavale(a)google.com>
Cc: Jens Axboe <axboe(a)kernel.dk>
Cc: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Cc: Kent Overstreet <kent.overstreet(a)linux.dev>
Cc: stable(a)vger.kernel.org
Signed-off-by: Bart Van Assche <bvanassche(a)acm.org>
Link: https://lore.kernel.org/r/20240215204739.2677806-2-bvanassche@acm.org
Signed-off-by: Christian Brauner <brauner(a)kernel.org>
diff --git a/fs/aio.c b/fs/aio.c
index bb2ff48991f3..da18dbcfcb22 100644
--- a/fs/aio.c
+++ b/fs/aio.c
@@ -593,6 +593,13 @@ void kiocb_set_cancel_fn(struct kiocb *iocb, kiocb_cancel_fn *cancel)
struct kioctx *ctx = req->ki_ctx;
unsigned long flags;
+ /*
+ * kiocb didn't come from aio or is neither a read nor a write, hence
+ * ignore it.
+ */
+ if (!(iocb->ki_flags & IOCB_AIO_RW))
+ return;
+
if (WARN_ON_ONCE(!list_empty(&req->ki_list)))
return;
@@ -1509,7 +1516,7 @@ static int aio_prep_rw(struct kiocb *req, const struct iocb *iocb)
req->ki_complete = aio_complete_rw;
req->private = NULL;
req->ki_pos = iocb->aio_offset;
- req->ki_flags = req->ki_filp->f_iocb_flags;
+ req->ki_flags = req->ki_filp->f_iocb_flags | IOCB_AIO_RW;
if (iocb->aio_flags & IOCB_FLAG_RESFD)
req->ki_flags |= IOCB_EVENTFD;
if (iocb->aio_flags & IOCB_FLAG_IOPRIO) {
diff --git a/include/linux/fs.h b/include/linux/fs.h
index ed5966a70495..c2dcc98cb4c8 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -352,6 +352,8 @@ enum rw_hint {
* unrelated IO (like cache flushing, new IO generation, etc).
*/
#define IOCB_DIO_CALLER_COMP (1 << 22)
+/* kiocb is a read or write operation submitted by fs/aio.c. */
+#define IOCB_AIO_RW (1 << 23)
/* for use in trace events */
#define TRACE_IOCB_STRINGS \
The patch below does not apply to the 5.15-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.15.y
git checkout FETCH_HEAD
git cherry-pick -x b820de741ae48ccf50dd95e297889c286ff4f760
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024022653-schedule-unloaded-e4ed@gregkh' --subject-prefix 'PATCH 5.15.y' HEAD^..
Possible dependencies:
b820de741ae4 ("fs/aio: Restrict kiocb_set_cancel_fn() to I/O submitted via libaio")
9cf3516c29e6 ("fs: add IOCB flags related to passing back dio completions")
f6c73a11133e ("fs.h: Add TRACE_IOCB_STRINGS for use in trace points")
1da8cf961bb1 ("Merge tag 'io_uring-6.0-2022-08-13' of git://git.kernel.dk/linux-block")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From b820de741ae48ccf50dd95e297889c286ff4f760 Mon Sep 17 00:00:00 2001
From: Bart Van Assche <bvanassche(a)acm.org>
Date: Thu, 15 Feb 2024 12:47:38 -0800
Subject: [PATCH] fs/aio: Restrict kiocb_set_cancel_fn() to I/O submitted via
libaio
If kiocb_set_cancel_fn() is called for I/O submitted via io_uring, the
following kernel warning appears:
WARNING: CPU: 3 PID: 368 at fs/aio.c:598 kiocb_set_cancel_fn+0x9c/0xa8
Call trace:
kiocb_set_cancel_fn+0x9c/0xa8
ffs_epfile_read_iter+0x144/0x1d0
io_read+0x19c/0x498
io_issue_sqe+0x118/0x27c
io_submit_sqes+0x25c/0x5fc
__arm64_sys_io_uring_enter+0x104/0xab0
invoke_syscall+0x58/0x11c
el0_svc_common+0xb4/0xf4
do_el0_svc+0x2c/0xb0
el0_svc+0x2c/0xa4
el0t_64_sync_handler+0x68/0xb4
el0t_64_sync+0x1a4/0x1a8
Fix this by setting the IOCB_AIO_RW flag for read and write I/O that is
submitted by libaio.
Suggested-by: Jens Axboe <axboe(a)kernel.dk>
Cc: Christoph Hellwig <hch(a)lst.de>
Cc: Avi Kivity <avi(a)scylladb.com>
Cc: Sandeep Dhavale <dhavale(a)google.com>
Cc: Jens Axboe <axboe(a)kernel.dk>
Cc: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Cc: Kent Overstreet <kent.overstreet(a)linux.dev>
Cc: stable(a)vger.kernel.org
Signed-off-by: Bart Van Assche <bvanassche(a)acm.org>
Link: https://lore.kernel.org/r/20240215204739.2677806-2-bvanassche@acm.org
Signed-off-by: Christian Brauner <brauner(a)kernel.org>
diff --git a/fs/aio.c b/fs/aio.c
index bb2ff48991f3..da18dbcfcb22 100644
--- a/fs/aio.c
+++ b/fs/aio.c
@@ -593,6 +593,13 @@ void kiocb_set_cancel_fn(struct kiocb *iocb, kiocb_cancel_fn *cancel)
struct kioctx *ctx = req->ki_ctx;
unsigned long flags;
+ /*
+ * kiocb didn't come from aio or is neither a read nor a write, hence
+ * ignore it.
+ */
+ if (!(iocb->ki_flags & IOCB_AIO_RW))
+ return;
+
if (WARN_ON_ONCE(!list_empty(&req->ki_list)))
return;
@@ -1509,7 +1516,7 @@ static int aio_prep_rw(struct kiocb *req, const struct iocb *iocb)
req->ki_complete = aio_complete_rw;
req->private = NULL;
req->ki_pos = iocb->aio_offset;
- req->ki_flags = req->ki_filp->f_iocb_flags;
+ req->ki_flags = req->ki_filp->f_iocb_flags | IOCB_AIO_RW;
if (iocb->aio_flags & IOCB_FLAG_RESFD)
req->ki_flags |= IOCB_EVENTFD;
if (iocb->aio_flags & IOCB_FLAG_IOPRIO) {
diff --git a/include/linux/fs.h b/include/linux/fs.h
index ed5966a70495..c2dcc98cb4c8 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -352,6 +352,8 @@ enum rw_hint {
* unrelated IO (like cache flushing, new IO generation, etc).
*/
#define IOCB_DIO_CALLER_COMP (1 << 22)
+/* kiocb is a read or write operation submitted by fs/aio.c. */
+#define IOCB_AIO_RW (1 << 23)
/* for use in trace events */
#define TRACE_IOCB_STRINGS \
The patch below does not apply to the 6.1-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.1.y
git checkout FETCH_HEAD
git cherry-pick -x b820de741ae48ccf50dd95e297889c286ff4f760
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024022651-shrimp-freezing-6b17@gregkh' --subject-prefix 'PATCH 6.1.y' HEAD^..
Possible dependencies:
b820de741ae4 ("fs/aio: Restrict kiocb_set_cancel_fn() to I/O submitted via libaio")
9cf3516c29e6 ("fs: add IOCB flags related to passing back dio completions")
f6c73a11133e ("fs.h: Add TRACE_IOCB_STRINGS for use in trace points")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From b820de741ae48ccf50dd95e297889c286ff4f760 Mon Sep 17 00:00:00 2001
From: Bart Van Assche <bvanassche(a)acm.org>
Date: Thu, 15 Feb 2024 12:47:38 -0800
Subject: [PATCH] fs/aio: Restrict kiocb_set_cancel_fn() to I/O submitted via
libaio
If kiocb_set_cancel_fn() is called for I/O submitted via io_uring, the
following kernel warning appears:
WARNING: CPU: 3 PID: 368 at fs/aio.c:598 kiocb_set_cancel_fn+0x9c/0xa8
Call trace:
kiocb_set_cancel_fn+0x9c/0xa8
ffs_epfile_read_iter+0x144/0x1d0
io_read+0x19c/0x498
io_issue_sqe+0x118/0x27c
io_submit_sqes+0x25c/0x5fc
__arm64_sys_io_uring_enter+0x104/0xab0
invoke_syscall+0x58/0x11c
el0_svc_common+0xb4/0xf4
do_el0_svc+0x2c/0xb0
el0_svc+0x2c/0xa4
el0t_64_sync_handler+0x68/0xb4
el0t_64_sync+0x1a4/0x1a8
Fix this by setting the IOCB_AIO_RW flag for read and write I/O that is
submitted by libaio.
Suggested-by: Jens Axboe <axboe(a)kernel.dk>
Cc: Christoph Hellwig <hch(a)lst.de>
Cc: Avi Kivity <avi(a)scylladb.com>
Cc: Sandeep Dhavale <dhavale(a)google.com>
Cc: Jens Axboe <axboe(a)kernel.dk>
Cc: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Cc: Kent Overstreet <kent.overstreet(a)linux.dev>
Cc: stable(a)vger.kernel.org
Signed-off-by: Bart Van Assche <bvanassche(a)acm.org>
Link: https://lore.kernel.org/r/20240215204739.2677806-2-bvanassche@acm.org
Signed-off-by: Christian Brauner <brauner(a)kernel.org>
diff --git a/fs/aio.c b/fs/aio.c
index bb2ff48991f3..da18dbcfcb22 100644
--- a/fs/aio.c
+++ b/fs/aio.c
@@ -593,6 +593,13 @@ void kiocb_set_cancel_fn(struct kiocb *iocb, kiocb_cancel_fn *cancel)
struct kioctx *ctx = req->ki_ctx;
unsigned long flags;
+ /*
+ * kiocb didn't come from aio or is neither a read nor a write, hence
+ * ignore it.
+ */
+ if (!(iocb->ki_flags & IOCB_AIO_RW))
+ return;
+
if (WARN_ON_ONCE(!list_empty(&req->ki_list)))
return;
@@ -1509,7 +1516,7 @@ static int aio_prep_rw(struct kiocb *req, const struct iocb *iocb)
req->ki_complete = aio_complete_rw;
req->private = NULL;
req->ki_pos = iocb->aio_offset;
- req->ki_flags = req->ki_filp->f_iocb_flags;
+ req->ki_flags = req->ki_filp->f_iocb_flags | IOCB_AIO_RW;
if (iocb->aio_flags & IOCB_FLAG_RESFD)
req->ki_flags |= IOCB_EVENTFD;
if (iocb->aio_flags & IOCB_FLAG_IOPRIO) {
diff --git a/include/linux/fs.h b/include/linux/fs.h
index ed5966a70495..c2dcc98cb4c8 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -352,6 +352,8 @@ enum rw_hint {
* unrelated IO (like cache flushing, new IO generation, etc).
*/
#define IOCB_DIO_CALLER_COMP (1 << 22)
+/* kiocb is a read or write operation submitted by fs/aio.c. */
+#define IOCB_AIO_RW (1 << 23)
/* for use in trace events */
#define TRACE_IOCB_STRINGS \
The patch below does not apply to the 6.7-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.7.y
git checkout FETCH_HEAD
git cherry-pick -x 04b57c9e096a9479fe0ad31e3956e336fa589cb2
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024021925-saloon-pursuit-2736@gregkh' --subject-prefix 'PATCH 6.7.y' HEAD^..
Possible dependencies:
04b57c9e096a ("selftests: mptcp: join: stop transfer when check is done (part 2)")
b9fb176081fb ("selftests: mptcp: userspace pm send RM_ADDR for ID 0")
e3b47e460b4b ("selftests: mptcp: userspace pm remove initial subflow")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 04b57c9e096a9479fe0ad31e3956e336fa589cb2 Mon Sep 17 00:00:00 2001
From: "Matthieu Baerts (NGI0)" <matttbe(a)kernel.org>
Date: Wed, 31 Jan 2024 22:49:54 +0100
Subject: [PATCH] selftests: mptcp: join: stop transfer when check is done
(part 2)
Since the "Fixes" commits mentioned below, the newly added "userspace
pm" subtests of mptcp_join selftests are launching the whole transfer in
the background, do the required checks, then wait for the end of
transfer.
There is no need to wait longer, especially because the checks at the
end of the transfer are ignored (which is fine). This saves quite a few
seconds on slow environments.
While at it, use 'mptcp_lib_kill_wait()' helper everywhere, instead of
on a specific one with 'kill_tests_wait()'.
Fixes: b2e2248f365a ("selftests: mptcp: userspace pm create id 0 subflow")
Fixes: e3b47e460b4b ("selftests: mptcp: userspace pm remove initial subflow")
Fixes: b9fb176081fb ("selftests: mptcp: userspace pm send RM_ADDR for ID 0")
Cc: stable(a)vger.kernel.org
Reviewed-and-tested-by: Geliang Tang <geliang(a)kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe(a)kernel.org>
Link: https://lore.kernel.org/r/20240131-upstream-net-20240131-mptcp-ci-issues-v1…
Signed-off-by: Jakub Kicinski <kuba(a)kernel.org>
diff --git a/tools/testing/selftests/net/mptcp/mptcp_join.sh b/tools/testing/selftests/net/mptcp/mptcp_join.sh
index 85bcc95f4ede..c07386e21e0a 100755
--- a/tools/testing/selftests/net/mptcp/mptcp_join.sh
+++ b/tools/testing/selftests/net/mptcp/mptcp_join.sh
@@ -643,13 +643,6 @@ kill_events_pids()
mptcp_lib_kill_wait $evts_ns2_pid
}
-kill_tests_wait()
-{
- #shellcheck disable=SC2046
- kill -SIGUSR1 $(ip netns pids $ns2) $(ip netns pids $ns1)
- wait
-}
-
pm_nl_set_limits()
{
local ns=$1
@@ -3494,7 +3487,7 @@ userspace_tests()
chk_mptcp_info subflows 1 subflows 1
chk_subflows_total 2 2
kill_events_pids
- wait $tests_pid
+ mptcp_lib_kill_wait $tests_pid
fi
# userspace pm remove initial subflow
@@ -3518,7 +3511,7 @@ userspace_tests()
chk_mptcp_info subflows 1 subflows 1
chk_subflows_total 1 1
kill_events_pids
- wait $tests_pid
+ mptcp_lib_kill_wait $tests_pid
fi
# userspace pm send RM_ADDR for ID 0
@@ -3544,7 +3537,7 @@ userspace_tests()
chk_mptcp_info subflows 1 subflows 1
chk_subflows_total 1 1
kill_events_pids
- wait $tests_pid
+ mptcp_lib_kill_wait $tests_pid
fi
}
@@ -3558,7 +3551,8 @@ endpoint_tests()
pm_nl_set_limits $ns2 2 2
pm_nl_add_endpoint $ns1 10.0.2.1 flags signal
speed=slow \
- run_tests $ns1 $ns2 10.0.1.1 2>/dev/null &
+ run_tests $ns1 $ns2 10.0.1.1 &
+ local tests_pid=$!
wait_mpj $ns1
pm_nl_check_endpoint "creation" \
@@ -3573,7 +3567,7 @@ endpoint_tests()
pm_nl_add_endpoint $ns2 10.0.2.2 flags signal
pm_nl_check_endpoint "modif is allowed" \
$ns2 10.0.2.2 id 1 flags signal
- kill_tests_wait
+ mptcp_lib_kill_wait $tests_pid
fi
if reset "delete and re-add" &&
@@ -3582,7 +3576,8 @@ endpoint_tests()
pm_nl_set_limits $ns2 1 1
pm_nl_add_endpoint $ns2 10.0.2.2 id 2 dev ns2eth2 flags subflow
test_linkfail=4 speed=20 \
- run_tests $ns1 $ns2 10.0.1.1 2>/dev/null &
+ run_tests $ns1 $ns2 10.0.1.1 &
+ local tests_pid=$!
wait_mpj $ns2
chk_subflow_nr "before delete" 2
@@ -3597,7 +3592,7 @@ endpoint_tests()
wait_mpj $ns2
chk_subflow_nr "after re-add" 2
chk_mptcp_info subflows 1 subflows 1
- kill_tests_wait
+ mptcp_lib_kill_wait $tests_pid
fi
}
The patch below does not apply to the 6.7-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.7.y
git checkout FETCH_HEAD
git cherry-pick -x 31ee4ad86afd6ed6f4bb1b38c43011216080c42a
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024021916-striking-evoke-4847@gregkh' --subject-prefix 'PATCH 6.7.y' HEAD^..
Possible dependencies:
31ee4ad86afd ("selftests: mptcp: join: stop transfer when check is done (part 1)")
80775412882e ("selftests: mptcp: add chk_subflows_total helper")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 31ee4ad86afd6ed6f4bb1b38c43011216080c42a Mon Sep 17 00:00:00 2001
From: "Matthieu Baerts (NGI0)" <matttbe(a)kernel.org>
Date: Wed, 31 Jan 2024 22:49:53 +0100
Subject: [PATCH] selftests: mptcp: join: stop transfer when check is done
(part 1)
Since the "Fixes" commit mentioned below, "userspace pm" subtests of
mptcp_join selftests introduced in v6.5 are launching the whole transfer
in the background, do the required checks, then wait for the end of
transfer.
There is no need to wait longer, especially because the checks at the
end of the transfer are ignored (which is fine). This saves quite a few
seconds in slow environments.
Note that old versions will need commit bdbef0a6ff10 ("selftests: mptcp:
add mptcp_lib_kill_wait") as well to get 'mptcp_lib_kill_wait()' helper.
Fixes: 4369c198e599 ("selftests: mptcp: test userspace pm out of transfer")
Cc: stable(a)vger.kernel.org # 6.5.x: bdbef0a6ff10: selftests: mptcp: add mptcp_lib_kill_wait
Cc: stable(a)vger.kernel.org # 6.5.x
Reviewed-and-tested-by: Geliang Tang <geliang(a)kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe(a)kernel.org>
Link: https://lore.kernel.org/r/20240131-upstream-net-20240131-mptcp-ci-issues-v1…
Signed-off-by: Jakub Kicinski <kuba(a)kernel.org>
diff --git a/tools/testing/selftests/net/mptcp/mptcp_join.sh b/tools/testing/selftests/net/mptcp/mptcp_join.sh
index 3a5b63026191..85bcc95f4ede 100755
--- a/tools/testing/selftests/net/mptcp/mptcp_join.sh
+++ b/tools/testing/selftests/net/mptcp/mptcp_join.sh
@@ -3453,7 +3453,7 @@ userspace_tests()
chk_mptcp_info subflows 0 subflows 0
chk_subflows_total 1 1
kill_events_pids
- wait $tests_pid
+ mptcp_lib_kill_wait $tests_pid
fi
# userspace pm create destroy subflow
@@ -3475,7 +3475,7 @@ userspace_tests()
chk_mptcp_info subflows 0 subflows 0
chk_subflows_total 1 1
kill_events_pids
- wait $tests_pid
+ mptcp_lib_kill_wait $tests_pid
fi
# userspace pm create id 0 subflow
The following commit has been merged into the x86/urgent branch of tip:
Commit-ID: 9a458198eba98b7207669a166e64d04b04cb651b
Gitweb: https://git.kernel.org/tip/9a458198eba98b7207669a166e64d04b04cb651b
Author: Paolo Bonzini <pbonzini(a)redhat.com>
AuthorDate: Thu, 01 Feb 2024 00:09:01 +01:00
Committer: Dave Hansen <dave.hansen(a)linux.intel.com>
CommitterDate: Mon, 26 Feb 2024 08:16:15 -08:00
x86/cpu: Allow reducing x86_phys_bits during early_identify_cpu()
In commit fbf6449f84bf ("x86/sev-es: Set x86_virt_bits to the correct
value straight away, instead of a two-phase approach"), the initialization
of c->x86_phys_bits was moved after this_cpu->c_early_init(c). This is
incorrect because early_init_amd() expected to be able to reduce the
value according to the contents of CPUID leaf 0x8000001f.
Fortunately, the bug was negated by init_amd()'s call to early_init_amd(),
which does reduce x86_phys_bits in the end. However, this is very
late in the boot process and, most notably, the wrong value is used for
x86_phys_bits when setting up MTRRs.
To fix this, call get_cpu_address_sizes() as soon as X86_FEATURE_CPUID is
set/cleared, and c->extended_cpuid_level is retrieved.
Fixes: fbf6449f84bf ("x86/sev-es: Set x86_virt_bits to the correct value straight away, instead of a two-phase approach")
Signed-off-by: Paolo Bonzini <pbonzini(a)redhat.com>
Signed-off-by: Dave Hansen <dave.hansen(a)linux.intel.com>
Cc:stable@vger.kernel.org
Link: https://lore.kernel.org/all/20240131230902.1867092-2-pbonzini%40redhat.com
---
arch/x86/kernel/cpu/common.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
index 0b97bcd..fbc4e60 100644
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -1589,6 +1589,7 @@ static void __init early_identify_cpu(struct cpuinfo_x86 *c)
get_cpu_vendor(c);
get_cpu_cap(c);
setup_force_cpu_cap(X86_FEATURE_CPUID);
+ get_cpu_address_sizes(c);
cpu_parse_early_param();
if (this_cpu->c_early_init)
@@ -1601,10 +1602,9 @@ static void __init early_identify_cpu(struct cpuinfo_x86 *c)
this_cpu->c_bsp_init(c);
} else {
setup_clear_cpu_cap(X86_FEATURE_CPUID);
+ get_cpu_address_sizes(c);
}
- get_cpu_address_sizes(c);
-
setup_force_cpu_cap(X86_FEATURE_ALWAYS);
cpu_set_bug_bits(c);
Because sandboxing can be used as an opportunistic security measure,
user space may not log unsupported features. Let the system
administrator know if an application tries to use Landlock but failed
because it isn't enabled at boot time. This may be caused by bootloader
configurations with outdated "lsm" kernel's command-line parameter.
Cc: Günther Noack <gnoack(a)google.com>
Cc: stable(a)vger.kernel.org
Fixes: 265885daf3e5 ("landlock: Add syscall implementations")
Signed-off-by: Mickaël Salaün <mic(a)digikod.net>
---
security/landlock/syscalls.c | 18 +++++++++++++++---
1 file changed, 15 insertions(+), 3 deletions(-)
diff --git a/security/landlock/syscalls.c b/security/landlock/syscalls.c
index f0bc50003b46..b5b424819dee 100644
--- a/security/landlock/syscalls.c
+++ b/security/landlock/syscalls.c
@@ -33,6 +33,18 @@
#include "ruleset.h"
#include "setup.h"
+static bool is_not_initialized(void)
+{
+ if (likely(landlock_initialized))
+ return false;
+
+ pr_warn_once(
+ "Disabled but requested by user space. "
+ "You should enable Landlock at boot time: "
+ "https://docs.kernel.org/userspace-api/landlock.html#kernel-support\n");
+ return true;
+}
+
/**
* copy_min_struct_from_user - Safe future-proof argument copying
*
@@ -173,7 +185,7 @@ SYSCALL_DEFINE3(landlock_create_ruleset,
/* Build-time checks. */
build_check_abi();
- if (!landlock_initialized)
+ if (is_not_initialized())
return -EOPNOTSUPP;
if (flags) {
@@ -407,7 +419,7 @@ SYSCALL_DEFINE4(landlock_add_rule, const int, ruleset_fd,
struct landlock_ruleset *ruleset;
int err;
- if (!landlock_initialized)
+ if (is_not_initialized())
return -EOPNOTSUPP;
/* No flag for now. */
@@ -467,7 +479,7 @@ SYSCALL_DEFINE2(landlock_restrict_self, const int, ruleset_fd, const __u32,
struct landlock_cred_security *new_llcred;
int err;
- if (!landlock_initialized)
+ if (is_not_initialized())
return -EOPNOTSUPP;
/*
--
2.43.0
Svacer reports a potential division by zero at rcu_torture_writer() in
5.10 stable release. The problem has been fixed by the following patch
that can be cleanly applied to 5.10 branches.
From: Bjorn Helgaas <bhelgaas(a)google.com>
When booting with "pci=noaer", we don't request control of AER, but we
previously *did* request control of DPC, as in the dmesg log attached at
the bugzilla below:
Command line: ... pci=noaer
acpi PNP0A08:00: _OSC: OS supports [ExtendedConfig ASPM ClockPM Segments MSI EDR HPX-Type3]
acpi PNP0A08:00: _OSC: OS now controls [PCIeHotplug SHPCHotplug PME PCIeCapability LTR DPC]
That's illegal per PCI Firmware Spec, r3.3, sec 4.5.1, table 4-5, which
says:
If the operating system sets this bit [OSC_PCI_EXPRESS_DPC_CONTROL], it
must also set bit 7 of the Support field (indicating support for Error
Disconnect Recover notifications) and bits 3 and 4 of the Control field
(requesting control of PCI Express Advanced Error Reporting and the PCI
Express Capability Structure).
Request DPC control only if we have also requested AER control.
Fixes: ac1c8e35a326 ("PCI/DPC: Add Error Disconnect Recover (EDR) support")
Link: https://bugzilla.kernel.org/show_bug.cgi?id=218491#c12
Signed-off-by: Bjorn Helgaas <bhelgaas(a)google.com>
Cc: <stable(a)vger.kernel.org> # v5.7+
Cc: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy(a)linux.intel.com>
Cc: Matthew W Carlis <mattc(a)purestorage.com>
Cc: Keith Busch <kbusch(a)kernel.org>
Cc: Lukas Wunner <lukas(a)wunner.de>
Cc: Mika Westerberg <mika.westerberg(a)linux.intel.com>
Cc: Jesse Brandeburg <jesse.brandeburg(a)intel.com>
---
drivers/acpi/pci_root.c | 20 +++++++++++---------
1 file changed, 11 insertions(+), 9 deletions(-)
diff --git a/drivers/acpi/pci_root.c b/drivers/acpi/pci_root.c
index 58b89b8d950e..efc292b6214e 100644
--- a/drivers/acpi/pci_root.c
+++ b/drivers/acpi/pci_root.c
@@ -518,17 +518,19 @@ static u32 calculate_control(void)
if (IS_ENABLED(CONFIG_HOTPLUG_PCI_SHPC))
control |= OSC_PCI_SHPC_NATIVE_HP_CONTROL;
- if (pci_aer_available())
+ if (pci_aer_available()) {
control |= OSC_PCI_EXPRESS_AER_CONTROL;
- /*
- * Per the Downstream Port Containment Related Enhancements ECN to
- * the PCI Firmware Spec, r3.2, sec 4.5.1, table 4-5,
- * OSC_PCI_EXPRESS_DPC_CONTROL indicates the OS supports both DPC
- * and EDR.
- */
- if (IS_ENABLED(CONFIG_PCIE_DPC) && IS_ENABLED(CONFIG_PCIE_EDR))
- control |= OSC_PCI_EXPRESS_DPC_CONTROL;
+ /*
+ * Per PCI Firmware Spec, r3.3, sec 4.5.1, table 4-5, the
+ * OS can request DPC control only if it has advertised
+ * OSC_PCI_EDR_SUPPORT and requested both
+ * OSC_PCI_EXPRESS_CAPABILITY_CONTROL and
+ * OSC_PCI_EXPRESS_AER_CONTROL.
+ */
+ if (IS_ENABLED(CONFIG_PCIE_DPC))
+ control |= OSC_PCI_EXPRESS_DPC_CONTROL;
+ }
return control;
}
--
2.34.1
This bug was found by syzkaller. This series of patches
is fix for this particular bug. Both of these patches were taken
from upstream and applied clearly without any conflicts.
First one is the fix for the problem
and another one is for fix first patch.
Luiz Augusto von Dentz (1):
Bluetooth: SCO: Fix possible circular locking dependency on
sco_connect_cfm
Pauli Virtanen (1):
Bluetooth: SCO: fix sco_conn related locking and validity issues
net/bluetooth/sco.c | 76 ++++++++++++++++++++++++++-------------------
1 file changed, 44 insertions(+), 32 deletions(-)
--
2.42.1
The patch below does not apply to the 4.19-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-4.19.y
git checkout FETCH_HEAD
git cherry-pick -x 1c9be13846c0b2abc2480602f8ef421360e1ad9e
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024022632-wise-dose-46ed@gregkh' --subject-prefix 'PATCH 4.19.y' HEAD^..
Possible dependencies:
1c9be13846c0 ("usb: roles: fix NULL pointer issue when put module's reference")
044a61158b9e ("USB: roles: make role_class a static const structure")
1aaba11da9aa ("driver core: class: remove module * from class_create()")
6e30a66433af ("driver core: class: remove struct module owner out of struct class")
0b2a1a3938aa ("driver core: class: Clear private pointer on registration failures")
71a7507afbc3 ("Merge tag 'driver-core-6.2-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 1c9be13846c0b2abc2480602f8ef421360e1ad9e Mon Sep 17 00:00:00 2001
From: Xu Yang <xu.yang_2(a)nxp.com>
Date: Mon, 29 Jan 2024 17:37:38 +0800
Subject: [PATCH] usb: roles: fix NULL pointer issue when put module's
reference
In current design, usb role class driver will get usb_role_switch parent's
module reference after the user get usb_role_switch device and put the
reference after the user put the usb_role_switch device. However, the
parent device of usb_role_switch may be removed before the user put the
usb_role_switch. If so, then, NULL pointer issue will be met when the user
put the parent module's reference.
This will save the module pointer in structure of usb_role_switch. Then,
we don't need to find module by iterating long relations.
Fixes: 5c54fcac9a9d ("usb: roles: Take care of driver module reference counting")
cc: stable(a)vger.kernel.org
Signed-off-by: Xu Yang <xu.yang_2(a)nxp.com>
Acked-by: Heikki Krogerus <heikki.krogerus(a)linux.intel.com>
Link: https://lore.kernel.org/r/20240129093739.2371530-1-xu.yang_2@nxp.com
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
diff --git a/drivers/usb/roles/class.c b/drivers/usb/roles/class.c
index ae41578bd014..2bad038fb9ad 100644
--- a/drivers/usb/roles/class.c
+++ b/drivers/usb/roles/class.c
@@ -21,6 +21,7 @@ static const struct class role_class = {
struct usb_role_switch {
struct device dev;
struct mutex lock; /* device lock*/
+ struct module *module; /* the module this device depends on */
enum usb_role role;
/* From descriptor */
@@ -135,7 +136,7 @@ struct usb_role_switch *usb_role_switch_get(struct device *dev)
usb_role_switch_match);
if (!IS_ERR_OR_NULL(sw))
- WARN_ON(!try_module_get(sw->dev.parent->driver->owner));
+ WARN_ON(!try_module_get(sw->module));
return sw;
}
@@ -157,7 +158,7 @@ struct usb_role_switch *fwnode_usb_role_switch_get(struct fwnode_handle *fwnode)
sw = fwnode_connection_find_match(fwnode, "usb-role-switch",
NULL, usb_role_switch_match);
if (!IS_ERR_OR_NULL(sw))
- WARN_ON(!try_module_get(sw->dev.parent->driver->owner));
+ WARN_ON(!try_module_get(sw->module));
return sw;
}
@@ -172,7 +173,7 @@ EXPORT_SYMBOL_GPL(fwnode_usb_role_switch_get);
void usb_role_switch_put(struct usb_role_switch *sw)
{
if (!IS_ERR_OR_NULL(sw)) {
- module_put(sw->dev.parent->driver->owner);
+ module_put(sw->module);
put_device(&sw->dev);
}
}
@@ -189,15 +190,18 @@ struct usb_role_switch *
usb_role_switch_find_by_fwnode(const struct fwnode_handle *fwnode)
{
struct device *dev;
+ struct usb_role_switch *sw = NULL;
if (!fwnode)
return NULL;
dev = class_find_device_by_fwnode(&role_class, fwnode);
- if (dev)
- WARN_ON(!try_module_get(dev->parent->driver->owner));
+ if (dev) {
+ sw = to_role_switch(dev);
+ WARN_ON(!try_module_get(sw->module));
+ }
- return dev ? to_role_switch(dev) : NULL;
+ return sw;
}
EXPORT_SYMBOL_GPL(usb_role_switch_find_by_fwnode);
@@ -338,6 +342,7 @@ usb_role_switch_register(struct device *parent,
sw->set = desc->set;
sw->get = desc->get;
+ sw->module = parent->driver->owner;
sw->dev.parent = parent;
sw->dev.fwnode = desc->fwnode;
sw->dev.class = &role_class;
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.4.y
git checkout FETCH_HEAD
git cherry-pick -x 1c9be13846c0b2abc2480602f8ef421360e1ad9e
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024022630-streak-bleep-1f75@gregkh' --subject-prefix 'PATCH 5.4.y' HEAD^..
Possible dependencies:
1c9be13846c0 ("usb: roles: fix NULL pointer issue when put module's reference")
044a61158b9e ("USB: roles: make role_class a static const structure")
1aaba11da9aa ("driver core: class: remove module * from class_create()")
6e30a66433af ("driver core: class: remove struct module owner out of struct class")
0b2a1a3938aa ("driver core: class: Clear private pointer on registration failures")
71a7507afbc3 ("Merge tag 'driver-core-6.2-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 1c9be13846c0b2abc2480602f8ef421360e1ad9e Mon Sep 17 00:00:00 2001
From: Xu Yang <xu.yang_2(a)nxp.com>
Date: Mon, 29 Jan 2024 17:37:38 +0800
Subject: [PATCH] usb: roles: fix NULL pointer issue when put module's
reference
In current design, usb role class driver will get usb_role_switch parent's
module reference after the user get usb_role_switch device and put the
reference after the user put the usb_role_switch device. However, the
parent device of usb_role_switch may be removed before the user put the
usb_role_switch. If so, then, NULL pointer issue will be met when the user
put the parent module's reference.
This will save the module pointer in structure of usb_role_switch. Then,
we don't need to find module by iterating long relations.
Fixes: 5c54fcac9a9d ("usb: roles: Take care of driver module reference counting")
cc: stable(a)vger.kernel.org
Signed-off-by: Xu Yang <xu.yang_2(a)nxp.com>
Acked-by: Heikki Krogerus <heikki.krogerus(a)linux.intel.com>
Link: https://lore.kernel.org/r/20240129093739.2371530-1-xu.yang_2@nxp.com
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
diff --git a/drivers/usb/roles/class.c b/drivers/usb/roles/class.c
index ae41578bd014..2bad038fb9ad 100644
--- a/drivers/usb/roles/class.c
+++ b/drivers/usb/roles/class.c
@@ -21,6 +21,7 @@ static const struct class role_class = {
struct usb_role_switch {
struct device dev;
struct mutex lock; /* device lock*/
+ struct module *module; /* the module this device depends on */
enum usb_role role;
/* From descriptor */
@@ -135,7 +136,7 @@ struct usb_role_switch *usb_role_switch_get(struct device *dev)
usb_role_switch_match);
if (!IS_ERR_OR_NULL(sw))
- WARN_ON(!try_module_get(sw->dev.parent->driver->owner));
+ WARN_ON(!try_module_get(sw->module));
return sw;
}
@@ -157,7 +158,7 @@ struct usb_role_switch *fwnode_usb_role_switch_get(struct fwnode_handle *fwnode)
sw = fwnode_connection_find_match(fwnode, "usb-role-switch",
NULL, usb_role_switch_match);
if (!IS_ERR_OR_NULL(sw))
- WARN_ON(!try_module_get(sw->dev.parent->driver->owner));
+ WARN_ON(!try_module_get(sw->module));
return sw;
}
@@ -172,7 +173,7 @@ EXPORT_SYMBOL_GPL(fwnode_usb_role_switch_get);
void usb_role_switch_put(struct usb_role_switch *sw)
{
if (!IS_ERR_OR_NULL(sw)) {
- module_put(sw->dev.parent->driver->owner);
+ module_put(sw->module);
put_device(&sw->dev);
}
}
@@ -189,15 +190,18 @@ struct usb_role_switch *
usb_role_switch_find_by_fwnode(const struct fwnode_handle *fwnode)
{
struct device *dev;
+ struct usb_role_switch *sw = NULL;
if (!fwnode)
return NULL;
dev = class_find_device_by_fwnode(&role_class, fwnode);
- if (dev)
- WARN_ON(!try_module_get(dev->parent->driver->owner));
+ if (dev) {
+ sw = to_role_switch(dev);
+ WARN_ON(!try_module_get(sw->module));
+ }
- return dev ? to_role_switch(dev) : NULL;
+ return sw;
}
EXPORT_SYMBOL_GPL(usb_role_switch_find_by_fwnode);
@@ -338,6 +342,7 @@ usb_role_switch_register(struct device *parent,
sw->set = desc->set;
sw->get = desc->get;
+ sw->module = parent->driver->owner;
sw->dev.parent = parent;
sw->dev.fwnode = desc->fwnode;
sw->dev.class = &role_class;
The patch below does not apply to the 4.19-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-4.19.y
git checkout FETCH_HEAD
git cherry-pick -x ec4308ecfc887128a468f03fb66b767559c57c23
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024022602-daunting-dreamland-882c@gregkh' --subject-prefix 'PATCH 4.19.y' HEAD^..
Possible dependencies:
ec4308ecfc88 ("irqchip/gic-v3-its: Do not assume vPE tables are preallocated")
c0cdc89072a3 ("irqchip/gic-v3-its: Give the percpu rdist struct its own flags field")
5e5168461c22 ("irqchip/gic-v4.1: VPE table (aka GICR_VPROPBASER) allocation")
b25319d279b6 ("irqchip/gic-v3: Detect GICv4.1 supporting RVPEID")
576a83429757 ("irqchip/gic-v3-its: Kill its->device_ids and use TYPER copy instead")
ffedbf0cba15 ("irqchip/gic-v3-its: Kill its->ite_size and use TYPER copy instead")
0dd57fed6b46 ("irqchip/gic-v3-its: Make is_v4 use a TYPER copy")
8424312516e5 ("irqchip/gic-v3-its: Use the exact ITSList for VMOVP")
5f51f803826e ("irqchip/gic-v3: Add EPPI range support")
81a43273045b ("irqchip/gic-v3: Dynamically allocate PPI NMI refcounts")
1a60e1e64391 ("irqchip/gic: Prepare for more than 16 PPIs")
211bddd210a6 ("irqchip/gic-v3: Add ESPI range support")
e91b036e1c20 ("irqchip/gic-v3: Add INTID range and convertion primitives")
13d22e2e1f35 ("irqchip/gic: Rework gic_configure_irq to take the full ICFGR base")
3d8dfe75ef69 ("Merge tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From ec4308ecfc887128a468f03fb66b767559c57c23 Mon Sep 17 00:00:00 2001
From: Oliver Upton <oliver.upton(a)linux.dev>
Date: Mon, 19 Feb 2024 18:58:06 +0000
Subject: [PATCH] irqchip/gic-v3-its: Do not assume vPE tables are preallocated
The GIC/ITS code is designed to ensure to pick up any preallocated LPI
tables on the redistributors, as enabling LPIs is a one-way switch. There
is no such restriction for vLPIs, and for GICv4.1 it is expected to
allocate a new vPE table at boot.
This works as intended when initializing an ITS, however when setting up a
redistributor in cpu_init_lpis() the early return for preallocated RD
tables skips straight past the GICv4 setup. This all comes to a head when
trying to kexec() into a new kernel, as the new kernel silently fails to
set up GICv4, leading to a complete loss of SGIs and LPIs for KVM VMs.
Slap a band-aid on the problem by ensuring its_cpu_init_lpis() always
initializes GICv4 on the way out, even if the other RD tables were
preallocated.
Fixes: 6479450f72c1 ("irqchip/gic-v4: Fix occasional VLPI drop")
Reported-by: George Cherian <gcherian(a)marvell.com>
Co-developed-by: Marc Zyngier <maz(a)kernel.org>
Signed-off-by: Marc Zyngier <maz(a)kernel.org>
Signed-off-by: Oliver Upton <oliver.upton(a)linux.dev>
Signed-off-by: Thomas Gleixner <tglx(a)linutronix.de>
Cc: stable(a)vger.kernel.org
Link: https://lore.kernel.org/r/20240219185809.286724-2-oliver.upton@linux.dev
diff --git a/drivers/irqchip/irq-gic-v3-its.c b/drivers/irqchip/irq-gic-v3-its.c
index 53abd4779914..b822752c4261 100644
--- a/drivers/irqchip/irq-gic-v3-its.c
+++ b/drivers/irqchip/irq-gic-v3-its.c
@@ -3181,6 +3181,7 @@ static void its_cpu_init_lpis(void)
val |= GICR_CTLR_ENABLE_LPIS;
writel_relaxed(val, rbase + GICR_CTLR);
+out:
if (gic_rdists->has_vlpis && !gic_rdists->has_rvpeid) {
void __iomem *vlpi_base = gic_data_rdist_vlpi_base();
@@ -3216,7 +3217,6 @@ static void its_cpu_init_lpis(void)
/* Make sure the GIC has seen the above */
dsb(sy);
-out:
gic_data_rdist()->flags |= RD_LOCAL_LPI_ENABLED;
pr_info("GICv3: CPU%d: using %s LPI pending table @%pa\n",
smp_processor_id(),
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.4.y
git checkout FETCH_HEAD
git cherry-pick -x ec4308ecfc887128a468f03fb66b767559c57c23
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024022658-brethren-stopper-8b5e@gregkh' --subject-prefix 'PATCH 5.4.y' HEAD^..
Possible dependencies:
ec4308ecfc88 ("irqchip/gic-v3-its: Do not assume vPE tables are preallocated")
c0cdc89072a3 ("irqchip/gic-v3-its: Give the percpu rdist struct its own flags field")
5e5168461c22 ("irqchip/gic-v4.1: VPE table (aka GICR_VPROPBASER) allocation")
b25319d279b6 ("irqchip/gic-v3: Detect GICv4.1 supporting RVPEID")
576a83429757 ("irqchip/gic-v3-its: Kill its->device_ids and use TYPER copy instead")
ffedbf0cba15 ("irqchip/gic-v3-its: Kill its->ite_size and use TYPER copy instead")
0dd57fed6b46 ("irqchip/gic-v3-its: Make is_v4 use a TYPER copy")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From ec4308ecfc887128a468f03fb66b767559c57c23 Mon Sep 17 00:00:00 2001
From: Oliver Upton <oliver.upton(a)linux.dev>
Date: Mon, 19 Feb 2024 18:58:06 +0000
Subject: [PATCH] irqchip/gic-v3-its: Do not assume vPE tables are preallocated
The GIC/ITS code is designed to ensure to pick up any preallocated LPI
tables on the redistributors, as enabling LPIs is a one-way switch. There
is no such restriction for vLPIs, and for GICv4.1 it is expected to
allocate a new vPE table at boot.
This works as intended when initializing an ITS, however when setting up a
redistributor in cpu_init_lpis() the early return for preallocated RD
tables skips straight past the GICv4 setup. This all comes to a head when
trying to kexec() into a new kernel, as the new kernel silently fails to
set up GICv4, leading to a complete loss of SGIs and LPIs for KVM VMs.
Slap a band-aid on the problem by ensuring its_cpu_init_lpis() always
initializes GICv4 on the way out, even if the other RD tables were
preallocated.
Fixes: 6479450f72c1 ("irqchip/gic-v4: Fix occasional VLPI drop")
Reported-by: George Cherian <gcherian(a)marvell.com>
Co-developed-by: Marc Zyngier <maz(a)kernel.org>
Signed-off-by: Marc Zyngier <maz(a)kernel.org>
Signed-off-by: Oliver Upton <oliver.upton(a)linux.dev>
Signed-off-by: Thomas Gleixner <tglx(a)linutronix.de>
Cc: stable(a)vger.kernel.org
Link: https://lore.kernel.org/r/20240219185809.286724-2-oliver.upton@linux.dev
diff --git a/drivers/irqchip/irq-gic-v3-its.c b/drivers/irqchip/irq-gic-v3-its.c
index 53abd4779914..b822752c4261 100644
--- a/drivers/irqchip/irq-gic-v3-its.c
+++ b/drivers/irqchip/irq-gic-v3-its.c
@@ -3181,6 +3181,7 @@ static void its_cpu_init_lpis(void)
val |= GICR_CTLR_ENABLE_LPIS;
writel_relaxed(val, rbase + GICR_CTLR);
+out:
if (gic_rdists->has_vlpis && !gic_rdists->has_rvpeid) {
void __iomem *vlpi_base = gic_data_rdist_vlpi_base();
@@ -3216,7 +3217,6 @@ static void its_cpu_init_lpis(void)
/* Make sure the GIC has seen the above */
dsb(sy);
-out:
gic_data_rdist()->flags |= RD_LOCAL_LPI_ENABLED;
pr_info("GICv3: CPU%d: using %s LPI pending table @%pa\n",
smp_processor_id(),
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.10.y
git checkout FETCH_HEAD
git cherry-pick -x ec4308ecfc887128a468f03fb66b767559c57c23
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024022654-drained-afterglow-ddb6@gregkh' --subject-prefix 'PATCH 5.10.y' HEAD^..
Possible dependencies:
ec4308ecfc88 ("irqchip/gic-v3-its: Do not assume vPE tables are preallocated")
c0cdc89072a3 ("irqchip/gic-v3-its: Give the percpu rdist struct its own flags field")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From ec4308ecfc887128a468f03fb66b767559c57c23 Mon Sep 17 00:00:00 2001
From: Oliver Upton <oliver.upton(a)linux.dev>
Date: Mon, 19 Feb 2024 18:58:06 +0000
Subject: [PATCH] irqchip/gic-v3-its: Do not assume vPE tables are preallocated
The GIC/ITS code is designed to ensure to pick up any preallocated LPI
tables on the redistributors, as enabling LPIs is a one-way switch. There
is no such restriction for vLPIs, and for GICv4.1 it is expected to
allocate a new vPE table at boot.
This works as intended when initializing an ITS, however when setting up a
redistributor in cpu_init_lpis() the early return for preallocated RD
tables skips straight past the GICv4 setup. This all comes to a head when
trying to kexec() into a new kernel, as the new kernel silently fails to
set up GICv4, leading to a complete loss of SGIs and LPIs for KVM VMs.
Slap a band-aid on the problem by ensuring its_cpu_init_lpis() always
initializes GICv4 on the way out, even if the other RD tables were
preallocated.
Fixes: 6479450f72c1 ("irqchip/gic-v4: Fix occasional VLPI drop")
Reported-by: George Cherian <gcherian(a)marvell.com>
Co-developed-by: Marc Zyngier <maz(a)kernel.org>
Signed-off-by: Marc Zyngier <maz(a)kernel.org>
Signed-off-by: Oliver Upton <oliver.upton(a)linux.dev>
Signed-off-by: Thomas Gleixner <tglx(a)linutronix.de>
Cc: stable(a)vger.kernel.org
Link: https://lore.kernel.org/r/20240219185809.286724-2-oliver.upton@linux.dev
diff --git a/drivers/irqchip/irq-gic-v3-its.c b/drivers/irqchip/irq-gic-v3-its.c
index 53abd4779914..b822752c4261 100644
--- a/drivers/irqchip/irq-gic-v3-its.c
+++ b/drivers/irqchip/irq-gic-v3-its.c
@@ -3181,6 +3181,7 @@ static void its_cpu_init_lpis(void)
val |= GICR_CTLR_ENABLE_LPIS;
writel_relaxed(val, rbase + GICR_CTLR);
+out:
if (gic_rdists->has_vlpis && !gic_rdists->has_rvpeid) {
void __iomem *vlpi_base = gic_data_rdist_vlpi_base();
@@ -3216,7 +3217,6 @@ static void its_cpu_init_lpis(void)
/* Make sure the GIC has seen the above */
dsb(sy);
-out:
gic_data_rdist()->flags |= RD_LOCAL_LPI_ENABLED;
pr_info("GICv3: CPU%d: using %s LPI pending table @%pa\n",
smp_processor_id(),
The patch below does not apply to the 5.15-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.15.y
git checkout FETCH_HEAD
git cherry-pick -x ec4308ecfc887128a468f03fb66b767559c57c23
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024022650-washroom-undusted-2aff@gregkh' --subject-prefix 'PATCH 5.15.y' HEAD^..
Possible dependencies:
ec4308ecfc88 ("irqchip/gic-v3-its: Do not assume vPE tables are preallocated")
c0cdc89072a3 ("irqchip/gic-v3-its: Give the percpu rdist struct its own flags field")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From ec4308ecfc887128a468f03fb66b767559c57c23 Mon Sep 17 00:00:00 2001
From: Oliver Upton <oliver.upton(a)linux.dev>
Date: Mon, 19 Feb 2024 18:58:06 +0000
Subject: [PATCH] irqchip/gic-v3-its: Do not assume vPE tables are preallocated
The GIC/ITS code is designed to ensure to pick up any preallocated LPI
tables on the redistributors, as enabling LPIs is a one-way switch. There
is no such restriction for vLPIs, and for GICv4.1 it is expected to
allocate a new vPE table at boot.
This works as intended when initializing an ITS, however when setting up a
redistributor in cpu_init_lpis() the early return for preallocated RD
tables skips straight past the GICv4 setup. This all comes to a head when
trying to kexec() into a new kernel, as the new kernel silently fails to
set up GICv4, leading to a complete loss of SGIs and LPIs for KVM VMs.
Slap a band-aid on the problem by ensuring its_cpu_init_lpis() always
initializes GICv4 on the way out, even if the other RD tables were
preallocated.
Fixes: 6479450f72c1 ("irqchip/gic-v4: Fix occasional VLPI drop")
Reported-by: George Cherian <gcherian(a)marvell.com>
Co-developed-by: Marc Zyngier <maz(a)kernel.org>
Signed-off-by: Marc Zyngier <maz(a)kernel.org>
Signed-off-by: Oliver Upton <oliver.upton(a)linux.dev>
Signed-off-by: Thomas Gleixner <tglx(a)linutronix.de>
Cc: stable(a)vger.kernel.org
Link: https://lore.kernel.org/r/20240219185809.286724-2-oliver.upton@linux.dev
diff --git a/drivers/irqchip/irq-gic-v3-its.c b/drivers/irqchip/irq-gic-v3-its.c
index 53abd4779914..b822752c4261 100644
--- a/drivers/irqchip/irq-gic-v3-its.c
+++ b/drivers/irqchip/irq-gic-v3-its.c
@@ -3181,6 +3181,7 @@ static void its_cpu_init_lpis(void)
val |= GICR_CTLR_ENABLE_LPIS;
writel_relaxed(val, rbase + GICR_CTLR);
+out:
if (gic_rdists->has_vlpis && !gic_rdists->has_rvpeid) {
void __iomem *vlpi_base = gic_data_rdist_vlpi_base();
@@ -3216,7 +3217,6 @@ static void its_cpu_init_lpis(void)
/* Make sure the GIC has seen the above */
dsb(sy);
-out:
gic_data_rdist()->flags |= RD_LOCAL_LPI_ENABLED;
pr_info("GICv3: CPU%d: using %s LPI pending table @%pa\n",
smp_processor_id(),
The patch below does not apply to the 6.1-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.1.y
git checkout FETCH_HEAD
git cherry-pick -x 321da3dc1f3c92a12e3c5da934090d2992a8814c
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024022657-skintight-fetal-ec4b@gregkh' --subject-prefix 'PATCH 6.1.y' HEAD^..
Possible dependencies:
321da3dc1f3c ("scsi: sd: usb_storage: uas: Access media prior to querying device properties")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 321da3dc1f3c92a12e3c5da934090d2992a8814c Mon Sep 17 00:00:00 2001
From: "Martin K. Petersen" <martin.petersen(a)oracle.com>
Date: Tue, 13 Feb 2024 09:33:06 -0500
Subject: [PATCH] scsi: sd: usb_storage: uas: Access media prior to querying
device properties
It has been observed that some USB/UAS devices return generic properties
hardcoded in firmware for mode pages for a period of time after a device
has been discovered. The reported properties are either garbage or they do
not accurately reflect the characteristics of the physical storage device
attached in the case of a bridge.
Prior to commit 1e029397d12f ("scsi: sd: Reorganize DIF/DIX code to
avoid calling revalidate twice") we would call revalidate several
times during device discovery. As a result, incorrect values would
eventually get replaced with ones accurately describing the attached
storage. When we did away with the redundant revalidate pass, several
cases were reported where devices reported nonsensical values or would
end up in write-protected state.
An initial attempt at addressing this issue involved introducing a
delayed second revalidate invocation. However, this approach still
left some devices reporting incorrect characteristics.
Tasos Sahanidis debugged the problem further and identified that
introducing a READ operation prior to MODE SENSE fixed the problem and that
it wasn't a timing issue. Issuing a READ appears to cause the devices to
update their state to reflect the actual properties of the storage
media. Device properties like vendor, model, and storage capacity appear to
be correctly reported from the get-go. It is unclear why these devices
defer populating the remaining characteristics.
Match the behavior of a well known commercial operating system and
trigger a READ operation prior to querying device characteristics to
force the device to populate the mode pages.
The additional READ is triggered by a flag set in the USB storage and
UAS drivers. We avoid issuing the READ for other transport classes
since some storage devices identify Linux through our particular
discovery command sequence.
Link: https://lore.kernel.org/r/20240213143306.2194237-1-martin.petersen@oracle.c…
Fixes: 1e029397d12f ("scsi: sd: Reorganize DIF/DIX code to avoid calling revalidate twice")
Cc: stable(a)vger.kernel.org
Reported-by: Tasos Sahanidis <tasos(a)tasossah.com>
Reviewed-by: Ewan D. Milne <emilne(a)redhat.com>
Reviewed-by: Bart Van Assche <bvanassche(a)acm.org>
Tested-by: Tasos Sahanidis <tasos(a)tasossah.com>
Signed-off-by: Martin K. Petersen <martin.petersen(a)oracle.com>
diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c
index 0833b3e6aa6e..bdd0acf7fa3c 100644
--- a/drivers/scsi/sd.c
+++ b/drivers/scsi/sd.c
@@ -3407,6 +3407,24 @@ static bool sd_validate_opt_xfer_size(struct scsi_disk *sdkp,
return true;
}
+static void sd_read_block_zero(struct scsi_disk *sdkp)
+{
+ unsigned int buf_len = sdkp->device->sector_size;
+ char *buffer, cmd[10] = { };
+
+ buffer = kmalloc(buf_len, GFP_KERNEL);
+ if (!buffer)
+ return;
+
+ cmd[0] = READ_10;
+ put_unaligned_be32(0, &cmd[2]); /* Logical block address 0 */
+ put_unaligned_be16(1, &cmd[7]); /* Transfer 1 logical block */
+
+ scsi_execute_cmd(sdkp->device, cmd, REQ_OP_DRV_IN, buffer, buf_len,
+ SD_TIMEOUT, sdkp->max_retries, NULL);
+ kfree(buffer);
+}
+
/**
* sd_revalidate_disk - called the first time a new disk is seen,
* performs disk spin up, read_capacity, etc.
@@ -3446,7 +3464,13 @@ static int sd_revalidate_disk(struct gendisk *disk)
*/
if (sdkp->media_present) {
sd_read_capacity(sdkp, buffer);
-
+ /*
+ * Some USB/UAS devices return generic values for mode pages
+ * until the media has been accessed. Trigger a READ operation
+ * to force the device to populate mode pages.
+ */
+ if (sdp->read_before_ms)
+ sd_read_block_zero(sdkp);
/*
* set the default to rotational. All non-rotational devices
* support the block characteristics VPD page, which will
diff --git a/drivers/usb/storage/scsiglue.c b/drivers/usb/storage/scsiglue.c
index c54e9805da53..12cf9940e5b6 100644
--- a/drivers/usb/storage/scsiglue.c
+++ b/drivers/usb/storage/scsiglue.c
@@ -179,6 +179,13 @@ static int slave_configure(struct scsi_device *sdev)
*/
sdev->use_192_bytes_for_3f = 1;
+ /*
+ * Some devices report generic values until the media has been
+ * accessed. Force a READ(10) prior to querying device
+ * characteristics.
+ */
+ sdev->read_before_ms = 1;
+
/*
* Some devices don't like MODE SENSE with page=0x3f,
* which is the command used for checking if a device
diff --git a/drivers/usb/storage/uas.c b/drivers/usb/storage/uas.c
index 9707f53cfda9..71ace274761f 100644
--- a/drivers/usb/storage/uas.c
+++ b/drivers/usb/storage/uas.c
@@ -878,6 +878,13 @@ static int uas_slave_configure(struct scsi_device *sdev)
if (devinfo->flags & US_FL_CAPACITY_HEURISTICS)
sdev->guess_capacity = 1;
+ /*
+ * Some devices report generic values until the media has been
+ * accessed. Force a READ(10) prior to querying device
+ * characteristics.
+ */
+ sdev->read_before_ms = 1;
+
/*
* Some devices don't like MODE SENSE with page=0x3f,
* which is the command used for checking if a device
diff --git a/include/scsi/scsi_device.h b/include/scsi/scsi_device.h
index 5ec1e71a09de..01c02cb76ea6 100644
--- a/include/scsi/scsi_device.h
+++ b/include/scsi/scsi_device.h
@@ -208,6 +208,7 @@ struct scsi_device {
unsigned use_10_for_rw:1; /* first try 10-byte read / write */
unsigned use_10_for_ms:1; /* first try 10-byte mode sense/select */
unsigned set_dbd_for_ms:1; /* Set "DBD" field in mode sense */
+ unsigned read_before_ms:1; /* perform a READ before MODE SENSE */
unsigned no_report_opcodes:1; /* no REPORT SUPPORTED OPERATION CODES */
unsigned no_write_same:1; /* no WRITE SAME command */
unsigned use_16_for_rw:1; /* Use read/write(16) over read/write(10) */
During the handoff from earlycon to the real console driver, we have
two separate drivers operating on the same device concurrently. In the
case of the 8250 driver these concurrent accesses cause problems due
to the driver's use of banked registers, controlled by LCR.DLAB. It is
possible for the setup(), config_port(), pm() and set_mctrl() callbacks
to set DLAB, which can cause the earlycon code that intends to access
TX to instead access DLL, leading to missed output and corruption on
the serial line due to unintended modifications to the baud rate.
In particular, for setup() we have:
univ8250_console_setup()
-> serial8250_console_setup()
-> uart_set_options()
-> serial8250_set_termios()
-> serial8250_do_set_termios()
-> serial8250_do_set_divisor()
For config_port() we have:
serial8250_config_port()
-> autoconfig()
For pm() we have:
serial8250_pm()
-> serial8250_do_pm()
-> serial8250_set_sleep()
For set_mctrl() we have (for some devices):
serial8250_set_mctrl()
-> omap8250_set_mctrl()
-> __omap8250_set_mctrl()
To avoid such problems, let's make it so that the console is locked
during pre-registration calls to these callbacks, which will prevent
the earlycon driver from running concurrently.
Remove the partial solution to this problem in the 8250 driver
that locked the console only during autoconfig_irq(), as this would
result in a deadlock with the new approach. The console continues
to be locked during autoconfig_irq() because it can only be called
through uart_configure_port().
Although this patch introduces more locking than strictly necessary
(and in particular it also locks during the call to rs485_config()
which is not affected by this issue as far as I can tell), it follows
the principle that it is the responsibility of the generic console
code to manage the earlycon handoff by ensuring that earlycon and real
console driver code cannot run concurrently, and not the individual
drivers.
Signed-off-by: Peter Collingbourne <pcc(a)google.com>
Link: https://linux-review.googlesource.com/id/I7cf8124dcebf8618e6b2ee543fa5b2553…
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Cc: stable(a)vger.kernel.org
---
drivers/tty/serial/8250/8250_port.c | 6 ------
drivers/tty/serial/serial_core.c | 10 ++++++++++
kernel/printk/printk.c | 20 +++++++++++++++++---
3 files changed, 27 insertions(+), 9 deletions(-)
diff --git a/drivers/tty/serial/8250/8250_port.c b/drivers/tty/serial/8250/8250_port.c
index 8ca061d3bbb9..1d65055dde27 100644
--- a/drivers/tty/serial/8250/8250_port.c
+++ b/drivers/tty/serial/8250/8250_port.c
@@ -1329,9 +1329,6 @@ static void autoconfig_irq(struct uart_8250_port *up)
inb_p(ICP);
}
- if (uart_console(port))
- console_lock();
-
/* forget possible initially masked and pending IRQ */
probe_irq_off(probe_irq_on());
save_mcr = serial8250_in_MCR(up);
@@ -1371,9 +1368,6 @@ static void autoconfig_irq(struct uart_8250_port *up)
if (port->flags & UPF_FOURPORT)
outb_p(save_ICP, ICP);
- if (uart_console(port))
- console_unlock();
-
port->irq = (irq > 0) ? irq : 0;
}
diff --git a/drivers/tty/serial/serial_core.c b/drivers/tty/serial/serial_core.c
index d6a58a9e072a..128aa0e0ae24 100644
--- a/drivers/tty/serial/serial_core.c
+++ b/drivers/tty/serial/serial_core.c
@@ -2608,7 +2608,11 @@ uart_configure_port(struct uart_driver *drv, struct uart_state *state,
port->type = PORT_UNKNOWN;
flags |= UART_CONFIG_TYPE;
}
+ if (uart_console(port))
+ console_lock();
port->ops->config_port(port, flags);
+ if (uart_console(port))
+ console_unlock();
}
if (port->type != PORT_UNKNOWN) {
@@ -2616,6 +2620,9 @@ uart_configure_port(struct uart_driver *drv, struct uart_state *state,
uart_report_port(drv, port);
+ if (uart_console(port))
+ console_lock();
+
/* Power up port for set_mctrl() */
uart_change_pm(state, UART_PM_STATE_ON);
@@ -2632,6 +2639,9 @@ uart_configure_port(struct uart_driver *drv, struct uart_state *state,
uart_rs485_config(port);
+ if (uart_console(port))
+ console_unlock();
+
/*
* If this driver supports console, and it hasn't been
* successfully registered yet, try to re-register it.
diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
index f2444b581e16..db69545e6250 100644
--- a/kernel/printk/printk.c
+++ b/kernel/printk/printk.c
@@ -3263,6 +3263,20 @@ static int __init keep_bootcon_setup(char *str)
early_param("keep_bootcon", keep_bootcon_setup);
+static int console_call_setup(struct console *newcon, char *options)
+{
+ int err;
+
+ if (!newcon->setup)
+ return 0;
+
+ console_lock();
+ err = newcon->setup(newcon, options);
+ console_unlock();
+
+ return err;
+}
+
/*
* This is called by register_console() to try to match
* the newly registered console with any of the ones selected
@@ -3298,8 +3312,8 @@ static int try_enable_preferred_console(struct console *newcon,
if (_braille_register_console(newcon, c))
return 0;
- if (newcon->setup &&
- (err = newcon->setup(newcon, c->options)) != 0)
+ err = console_call_setup(newcon, c->options);
+ if (err != 0)
return err;
}
newcon->flags |= CON_ENABLED;
@@ -3325,7 +3339,7 @@ static void try_enable_default_console(struct console *newcon)
if (newcon->index < 0)
newcon->index = 0;
- if (newcon->setup && newcon->setup(newcon, NULL) != 0)
+ if (console_call_setup(newcon, NULL) != 0)
return;
newcon->flags |= CON_ENABLED;
--
2.44.0.rc1.240.g4c46232300-goog
On Okt 29 2023, Peter Ujfalusi wrote:
> The core twl chip is probed via i2c and the dev->driver->of_match_table is
> NULL, causing the driver to fail to probe.
>
> This partially reverts commit 1e0c866887f4.
>
> Fixes: 1e0c866887f4 ("mfd: Use device_get_match_data() in a bunch of drivers")
That commit id does not exist, which is why it hasn't been picked up by
stable. The correct commit id is 830fafce06e6f.
--
Andreas Schwab, SUSE Labs, schwab(a)suse.de
GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE 1748 E4D4 88E3 0EEA B9D7
"And now for something completely different."
Here's the recently merged mds improvement patches adapted to latest stable tree.
I've only compile tested them, but since I have also done similar backports for
older kernels I'm sure they should work.
The main difference is in the definition of the CLEAR_CPU_BUFFERS macro since
5.4 doesn't contains the alternative relocation handling logic hence the verw
instruction is moved out of the alternative definition and instead we have a jump which
skips the verw instruction there. That way the relocation will be handled by the
toolchain rather than the kernel.
H. Peter Anvin (Intel) (1):
x86/asm: Add _ASM_RIP() macro for x86-64 (%rip) suffix
Pawan Gupta (5):
x86/bugs: Add asm helpers for executing VERW
x86/entry_64: Add VERW just before userspace transition
x86/entry_32: Add VERW just before userspace transition
x86/bugs: Use ALTERNATIVE() instead of mds_user_clear static key
KVM/VMX: Move VERW closer to VMentry for MDS mitigation
Sean Christopherson (1):
KVM/VMX: Use BT+JNC, i.e. EFLAGS.CF to select VMRESUME vs. VMLAUNCH
Documentation/x86/mds.rst | 38 ++++++++++++++++++++--------
arch/x86/entry/Makefile | 2 +-
arch/x86/entry/common.c | 2 --
arch/x86/entry/entry.S | 23 +++++++++++++++++
arch/x86/entry/entry_32.S | 3 +++
arch/x86/entry/entry_64.S | 10 ++++++++
arch/x86/entry/entry_64_compat.S | 1 +
arch/x86/include/asm/asm.h | 6 ++++-
arch/x86/include/asm/cpufeatures.h | 2 +-
arch/x86/include/asm/irqflags.h | 1 +
arch/x86/include/asm/nospec-branch.h | 26 ++++++++++---------
arch/x86/kernel/cpu/bugs.c | 15 +++++------
arch/x86/kernel/nmi.c | 3 ---
arch/x86/kvm/vmx/run_flags.h | 7 +++--
arch/x86/kvm/vmx/vmenter.S | 9 ++++---
arch/x86/kvm/vmx/vmx.c | 12 ++++++---
16 files changed, 111 insertions(+), 49 deletions(-)
create mode 100644 arch/x86/entry/entry.S
--
2.34.1
The patch below does not apply to the 4.19-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-4.19.y
git checkout FETCH_HEAD
git cherry-pick -x 8d3a7dfb801d157ac423261d7cd62c33e95375f8
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024022635-thinner-disinfect-f761@gregkh' --subject-prefix 'PATCH 4.19.y' HEAD^..
Possible dependencies:
8d3a7dfb801d ("KVM: arm64: vgic-its: Test for valid IRQ in its_sync_lpi_pending_table()")
9ed24f4b712b ("KVM: arm64: Move virt/kvm/arm to arch/arm64")
3b50142d8528 ("MAINTAINERS: sort field names for all entries")
4400b7d68f6e ("MAINTAINERS: sort entries by entry name")
b032227c6293 ("Merge tag 'nios2-v5.7-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/lftan/nios2")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 8d3a7dfb801d157ac423261d7cd62c33e95375f8 Mon Sep 17 00:00:00 2001
From: Oliver Upton <oliver.upton(a)linux.dev>
Date: Wed, 21 Feb 2024 09:27:31 +0000
Subject: [PATCH] KVM: arm64: vgic-its: Test for valid IRQ in
its_sync_lpi_pending_table()
vgic_get_irq() may not return a valid descriptor if there is no ITS that
holds a valid translation for the specified INTID. If that is the case,
it is safe to silently ignore it and continue processing the LPI pending
table.
Cc: stable(a)vger.kernel.org
Fixes: 33d3bc9556a7 ("KVM: arm64: vgic-its: Read initial LPI pending table")
Signed-off-by: Oliver Upton <oliver.upton(a)linux.dev>
Link: https://lore.kernel.org/r/20240221092732.4126848-2-oliver.upton@linux.dev
Signed-off-by: Marc Zyngier <maz(a)kernel.org>
diff --git a/arch/arm64/kvm/vgic/vgic-its.c b/arch/arm64/kvm/vgic/vgic-its.c
index e2764d0ffa9f..082448de27ed 100644
--- a/arch/arm64/kvm/vgic/vgic-its.c
+++ b/arch/arm64/kvm/vgic/vgic-its.c
@@ -468,6 +468,9 @@ static int its_sync_lpi_pending_table(struct kvm_vcpu *vcpu)
}
irq = vgic_get_irq(vcpu->kvm, NULL, intids[i]);
+ if (!irq)
+ continue;
+
raw_spin_lock_irqsave(&irq->irq_lock, flags);
irq->pending_latch = pendmask & (1U << bit_nr);
vgic_queue_irq_unlock(vcpu->kvm, irq, flags);
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.4.y
git checkout FETCH_HEAD
git cherry-pick -x 8d3a7dfb801d157ac423261d7cd62c33e95375f8
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024022634-rut-premises-24cc@gregkh' --subject-prefix 'PATCH 5.4.y' HEAD^..
Possible dependencies:
8d3a7dfb801d ("KVM: arm64: vgic-its: Test for valid IRQ in its_sync_lpi_pending_table()")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 8d3a7dfb801d157ac423261d7cd62c33e95375f8 Mon Sep 17 00:00:00 2001
From: Oliver Upton <oliver.upton(a)linux.dev>
Date: Wed, 21 Feb 2024 09:27:31 +0000
Subject: [PATCH] KVM: arm64: vgic-its: Test for valid IRQ in
its_sync_lpi_pending_table()
vgic_get_irq() may not return a valid descriptor if there is no ITS that
holds a valid translation for the specified INTID. If that is the case,
it is safe to silently ignore it and continue processing the LPI pending
table.
Cc: stable(a)vger.kernel.org
Fixes: 33d3bc9556a7 ("KVM: arm64: vgic-its: Read initial LPI pending table")
Signed-off-by: Oliver Upton <oliver.upton(a)linux.dev>
Link: https://lore.kernel.org/r/20240221092732.4126848-2-oliver.upton@linux.dev
Signed-off-by: Marc Zyngier <maz(a)kernel.org>
diff --git a/arch/arm64/kvm/vgic/vgic-its.c b/arch/arm64/kvm/vgic/vgic-its.c
index e2764d0ffa9f..082448de27ed 100644
--- a/arch/arm64/kvm/vgic/vgic-its.c
+++ b/arch/arm64/kvm/vgic/vgic-its.c
@@ -468,6 +468,9 @@ static int its_sync_lpi_pending_table(struct kvm_vcpu *vcpu)
}
irq = vgic_get_irq(vcpu->kvm, NULL, intids[i]);
+ if (!irq)
+ continue;
+
raw_spin_lock_irqsave(&irq->irq_lock, flags);
irq->pending_latch = pendmask & (1U << bit_nr);
vgic_queue_irq_unlock(vcpu->kvm, irq, flags);
The patch below does not apply to the 4.19-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-4.19.y
git checkout FETCH_HEAD
git cherry-pick -x 85a71ee9a0700f6c18862ef3b0011ed9dad99aca
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024022619-opulently-accustom-dbe1@gregkh' --subject-prefix 'PATCH 4.19.y' HEAD^..
Possible dependencies:
85a71ee9a070 ("KVM: arm64: vgic-its: Test for valid IRQ in MOVALL handler")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 85a71ee9a0700f6c18862ef3b0011ed9dad99aca Mon Sep 17 00:00:00 2001
From: Oliver Upton <oliver.upton(a)linux.dev>
Date: Wed, 21 Feb 2024 09:27:32 +0000
Subject: [PATCH] KVM: arm64: vgic-its: Test for valid IRQ in MOVALL handler
It is possible that an LPI mapped in a different ITS gets unmapped while
handling the MOVALL command. If that is the case, there is no state that
can be migrated to the destination. Silently ignore it and continue
migrating other LPIs.
Cc: stable(a)vger.kernel.org
Fixes: ff9c114394aa ("KVM: arm/arm64: GICv4: Handle MOVALL applied to a vPE")
Signed-off-by: Oliver Upton <oliver.upton(a)linux.dev>
Link: https://lore.kernel.org/r/20240221092732.4126848-3-oliver.upton@linux.dev
Signed-off-by: Marc Zyngier <maz(a)kernel.org>
diff --git a/arch/arm64/kvm/vgic/vgic-its.c b/arch/arm64/kvm/vgic/vgic-its.c
index 082448de27ed..28a93074eca1 100644
--- a/arch/arm64/kvm/vgic/vgic-its.c
+++ b/arch/arm64/kvm/vgic/vgic-its.c
@@ -1435,6 +1435,8 @@ static int vgic_its_cmd_handle_movall(struct kvm *kvm, struct vgic_its *its,
for (i = 0; i < irq_count; i++) {
irq = vgic_get_irq(kvm, NULL, intids[i]);
+ if (!irq)
+ continue;
update_affinity(irq, vcpu2);
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.4.y
git checkout FETCH_HEAD
git cherry-pick -x 85a71ee9a0700f6c18862ef3b0011ed9dad99aca
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024022618-maternal-runny-28b5@gregkh' --subject-prefix 'PATCH 5.4.y' HEAD^..
Possible dependencies:
85a71ee9a070 ("KVM: arm64: vgic-its: Test for valid IRQ in MOVALL handler")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 85a71ee9a0700f6c18862ef3b0011ed9dad99aca Mon Sep 17 00:00:00 2001
From: Oliver Upton <oliver.upton(a)linux.dev>
Date: Wed, 21 Feb 2024 09:27:32 +0000
Subject: [PATCH] KVM: arm64: vgic-its: Test for valid IRQ in MOVALL handler
It is possible that an LPI mapped in a different ITS gets unmapped while
handling the MOVALL command. If that is the case, there is no state that
can be migrated to the destination. Silently ignore it and continue
migrating other LPIs.
Cc: stable(a)vger.kernel.org
Fixes: ff9c114394aa ("KVM: arm/arm64: GICv4: Handle MOVALL applied to a vPE")
Signed-off-by: Oliver Upton <oliver.upton(a)linux.dev>
Link: https://lore.kernel.org/r/20240221092732.4126848-3-oliver.upton@linux.dev
Signed-off-by: Marc Zyngier <maz(a)kernel.org>
diff --git a/arch/arm64/kvm/vgic/vgic-its.c b/arch/arm64/kvm/vgic/vgic-its.c
index 082448de27ed..28a93074eca1 100644
--- a/arch/arm64/kvm/vgic/vgic-its.c
+++ b/arch/arm64/kvm/vgic/vgic-its.c
@@ -1435,6 +1435,8 @@ static int vgic_its_cmd_handle_movall(struct kvm *kvm, struct vgic_its *its,
for (i = 0; i < irq_count; i++) {
irq = vgic_get_irq(kvm, NULL, intids[i]);
+ if (!irq)
+ continue;
update_affinity(irq, vcpu2);
The patch below does not apply to the 4.19-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-4.19.y
git checkout FETCH_HEAD
git cherry-pick -x e21a2f17566cbd64926fb8f16323972f7a064444
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024022639-selection-angrily-d6ff@gregkh' --subject-prefix 'PATCH 4.19.y' HEAD^..
Possible dependencies:
e21a2f17566c ("cachefiles: fix memory leak in cachefiles_add_cache()")
d1065b0a6fd9 ("cachefiles: Implement cache registration and withdrawal")
32759f7d7af5 ("cachefiles: Implement a function to get/create a directory in the cache")
1bd9c4e4f049 ("vfs, cachefiles: Mark a backing file in use with an inode flag")
80f94f29f677 ("cachefiles: Provide a function to check how much space there is")
8667d434b2a9 ("cachefiles: Register a miscdev and parse commands over it")
254947d47945 ("cachefiles: Add security derivation")
1493bf74bcf2 ("cachefiles: Add cache error reporting macro")
ecf5a6ce15f9 ("cachefiles: Add a couple of tracepoints for logging errors")
a70f6526267e ("cachefiles: Add some error injection support")
8390fbc46570 ("cachefiles: Define structs")
77443f6171f3 ("cachefiles: Introduce rewritten driver")
850cba069c26 ("cachefiles: Delete the cachefiles driver pending rewrite")
b6773cdb0e9f ("Merge tag 'for-5.16/ki_complete-2021-10-29' of git://git.kernel.dk/linux-block")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From e21a2f17566cbd64926fb8f16323972f7a064444 Mon Sep 17 00:00:00 2001
From: Baokun Li <libaokun1(a)huawei.com>
Date: Sat, 17 Feb 2024 16:14:31 +0800
Subject: [PATCH] cachefiles: fix memory leak in cachefiles_add_cache()
The following memory leak was reported after unbinding /dev/cachefiles:
==================================================================
unreferenced object 0xffff9b674176e3c0 (size 192):
comm "cachefilesd2", pid 680, jiffies 4294881224
hex dump (first 32 bytes):
01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
backtrace (crc ea38a44b):
[<ffffffff8eb8a1a5>] kmem_cache_alloc+0x2d5/0x370
[<ffffffff8e917f86>] prepare_creds+0x26/0x2e0
[<ffffffffc002eeef>] cachefiles_determine_cache_security+0x1f/0x120
[<ffffffffc00243ec>] cachefiles_add_cache+0x13c/0x3a0
[<ffffffffc0025216>] cachefiles_daemon_write+0x146/0x1c0
[<ffffffff8ebc4a3b>] vfs_write+0xcb/0x520
[<ffffffff8ebc5069>] ksys_write+0x69/0xf0
[<ffffffff8f6d4662>] do_syscall_64+0x72/0x140
[<ffffffff8f8000aa>] entry_SYSCALL_64_after_hwframe+0x6e/0x76
==================================================================
Put the reference count of cache_cred in cachefiles_daemon_unbind() to
fix the problem. And also put cache_cred in cachefiles_add_cache() error
branch to avoid memory leaks.
Fixes: 9ae326a69004 ("CacheFiles: A cache that backs onto a mounted filesystem")
CC: stable(a)vger.kernel.org
Signed-off-by: Baokun Li <libaokun1(a)huawei.com>
Link: https://lore.kernel.org/r/20240217081431.796809-1-libaokun1@huawei.com
Acked-by: David Howells <dhowells(a)redhat.com>
Reviewed-by: Jingbo Xu <jefflexu(a)linux.alibaba.com>
Reviewed-by: Jeff Layton <jlayton(a)kernel.org>
Signed-off-by: Christian Brauner <brauner(a)kernel.org>
diff --git a/fs/cachefiles/cache.c b/fs/cachefiles/cache.c
index 7077f72e6f47..f449f7340aad 100644
--- a/fs/cachefiles/cache.c
+++ b/fs/cachefiles/cache.c
@@ -168,6 +168,8 @@ int cachefiles_add_cache(struct cachefiles_cache *cache)
dput(root);
error_open_root:
cachefiles_end_secure(cache, saved_cred);
+ put_cred(cache->cache_cred);
+ cache->cache_cred = NULL;
error_getsec:
fscache_relinquish_cache(cache_cookie);
cache->cache = NULL;
diff --git a/fs/cachefiles/daemon.c b/fs/cachefiles/daemon.c
index 3f24905f4066..6465e2574230 100644
--- a/fs/cachefiles/daemon.c
+++ b/fs/cachefiles/daemon.c
@@ -816,6 +816,7 @@ static void cachefiles_daemon_unbind(struct cachefiles_cache *cache)
cachefiles_put_directory(cache->graveyard);
cachefiles_put_directory(cache->store);
mntput(cache->mnt);
+ put_cred(cache->cache_cred);
kfree(cache->rootdirname);
kfree(cache->secctx);
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.4.y
git checkout FETCH_HEAD
git cherry-pick -x e21a2f17566cbd64926fb8f16323972f7a064444
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024022638-jingling-atlas-4340@gregkh' --subject-prefix 'PATCH 5.4.y' HEAD^..
Possible dependencies:
e21a2f17566c ("cachefiles: fix memory leak in cachefiles_add_cache()")
d1065b0a6fd9 ("cachefiles: Implement cache registration and withdrawal")
32759f7d7af5 ("cachefiles: Implement a function to get/create a directory in the cache")
1bd9c4e4f049 ("vfs, cachefiles: Mark a backing file in use with an inode flag")
80f94f29f677 ("cachefiles: Provide a function to check how much space there is")
8667d434b2a9 ("cachefiles: Register a miscdev and parse commands over it")
254947d47945 ("cachefiles: Add security derivation")
1493bf74bcf2 ("cachefiles: Add cache error reporting macro")
ecf5a6ce15f9 ("cachefiles: Add a couple of tracepoints for logging errors")
a70f6526267e ("cachefiles: Add some error injection support")
8390fbc46570 ("cachefiles: Define structs")
77443f6171f3 ("cachefiles: Introduce rewritten driver")
850cba069c26 ("cachefiles: Delete the cachefiles driver pending rewrite")
b6773cdb0e9f ("Merge tag 'for-5.16/ki_complete-2021-10-29' of git://git.kernel.dk/linux-block")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From e21a2f17566cbd64926fb8f16323972f7a064444 Mon Sep 17 00:00:00 2001
From: Baokun Li <libaokun1(a)huawei.com>
Date: Sat, 17 Feb 2024 16:14:31 +0800
Subject: [PATCH] cachefiles: fix memory leak in cachefiles_add_cache()
The following memory leak was reported after unbinding /dev/cachefiles:
==================================================================
unreferenced object 0xffff9b674176e3c0 (size 192):
comm "cachefilesd2", pid 680, jiffies 4294881224
hex dump (first 32 bytes):
01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
backtrace (crc ea38a44b):
[<ffffffff8eb8a1a5>] kmem_cache_alloc+0x2d5/0x370
[<ffffffff8e917f86>] prepare_creds+0x26/0x2e0
[<ffffffffc002eeef>] cachefiles_determine_cache_security+0x1f/0x120
[<ffffffffc00243ec>] cachefiles_add_cache+0x13c/0x3a0
[<ffffffffc0025216>] cachefiles_daemon_write+0x146/0x1c0
[<ffffffff8ebc4a3b>] vfs_write+0xcb/0x520
[<ffffffff8ebc5069>] ksys_write+0x69/0xf0
[<ffffffff8f6d4662>] do_syscall_64+0x72/0x140
[<ffffffff8f8000aa>] entry_SYSCALL_64_after_hwframe+0x6e/0x76
==================================================================
Put the reference count of cache_cred in cachefiles_daemon_unbind() to
fix the problem. And also put cache_cred in cachefiles_add_cache() error
branch to avoid memory leaks.
Fixes: 9ae326a69004 ("CacheFiles: A cache that backs onto a mounted filesystem")
CC: stable(a)vger.kernel.org
Signed-off-by: Baokun Li <libaokun1(a)huawei.com>
Link: https://lore.kernel.org/r/20240217081431.796809-1-libaokun1@huawei.com
Acked-by: David Howells <dhowells(a)redhat.com>
Reviewed-by: Jingbo Xu <jefflexu(a)linux.alibaba.com>
Reviewed-by: Jeff Layton <jlayton(a)kernel.org>
Signed-off-by: Christian Brauner <brauner(a)kernel.org>
diff --git a/fs/cachefiles/cache.c b/fs/cachefiles/cache.c
index 7077f72e6f47..f449f7340aad 100644
--- a/fs/cachefiles/cache.c
+++ b/fs/cachefiles/cache.c
@@ -168,6 +168,8 @@ int cachefiles_add_cache(struct cachefiles_cache *cache)
dput(root);
error_open_root:
cachefiles_end_secure(cache, saved_cred);
+ put_cred(cache->cache_cred);
+ cache->cache_cred = NULL;
error_getsec:
fscache_relinquish_cache(cache_cookie);
cache->cache = NULL;
diff --git a/fs/cachefiles/daemon.c b/fs/cachefiles/daemon.c
index 3f24905f4066..6465e2574230 100644
--- a/fs/cachefiles/daemon.c
+++ b/fs/cachefiles/daemon.c
@@ -816,6 +816,7 @@ static void cachefiles_daemon_unbind(struct cachefiles_cache *cache)
cachefiles_put_directory(cache->graveyard);
cachefiles_put_directory(cache->store);
mntput(cache->mnt);
+ put_cred(cache->cache_cred);
kfree(cache->rootdirname);
kfree(cache->secctx);
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.10.y
git checkout FETCH_HEAD
git cherry-pick -x e21a2f17566cbd64926fb8f16323972f7a064444
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024022636-prevail-headway-01c9@gregkh' --subject-prefix 'PATCH 5.10.y' HEAD^..
Possible dependencies:
e21a2f17566c ("cachefiles: fix memory leak in cachefiles_add_cache()")
d1065b0a6fd9 ("cachefiles: Implement cache registration and withdrawal")
32759f7d7af5 ("cachefiles: Implement a function to get/create a directory in the cache")
1bd9c4e4f049 ("vfs, cachefiles: Mark a backing file in use with an inode flag")
80f94f29f677 ("cachefiles: Provide a function to check how much space there is")
8667d434b2a9 ("cachefiles: Register a miscdev and parse commands over it")
254947d47945 ("cachefiles: Add security derivation")
1493bf74bcf2 ("cachefiles: Add cache error reporting macro")
ecf5a6ce15f9 ("cachefiles: Add a couple of tracepoints for logging errors")
a70f6526267e ("cachefiles: Add some error injection support")
8390fbc46570 ("cachefiles: Define structs")
77443f6171f3 ("cachefiles: Introduce rewritten driver")
850cba069c26 ("cachefiles: Delete the cachefiles driver pending rewrite")
b6773cdb0e9f ("Merge tag 'for-5.16/ki_complete-2021-10-29' of git://git.kernel.dk/linux-block")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From e21a2f17566cbd64926fb8f16323972f7a064444 Mon Sep 17 00:00:00 2001
From: Baokun Li <libaokun1(a)huawei.com>
Date: Sat, 17 Feb 2024 16:14:31 +0800
Subject: [PATCH] cachefiles: fix memory leak in cachefiles_add_cache()
The following memory leak was reported after unbinding /dev/cachefiles:
==================================================================
unreferenced object 0xffff9b674176e3c0 (size 192):
comm "cachefilesd2", pid 680, jiffies 4294881224
hex dump (first 32 bytes):
01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
backtrace (crc ea38a44b):
[<ffffffff8eb8a1a5>] kmem_cache_alloc+0x2d5/0x370
[<ffffffff8e917f86>] prepare_creds+0x26/0x2e0
[<ffffffffc002eeef>] cachefiles_determine_cache_security+0x1f/0x120
[<ffffffffc00243ec>] cachefiles_add_cache+0x13c/0x3a0
[<ffffffffc0025216>] cachefiles_daemon_write+0x146/0x1c0
[<ffffffff8ebc4a3b>] vfs_write+0xcb/0x520
[<ffffffff8ebc5069>] ksys_write+0x69/0xf0
[<ffffffff8f6d4662>] do_syscall_64+0x72/0x140
[<ffffffff8f8000aa>] entry_SYSCALL_64_after_hwframe+0x6e/0x76
==================================================================
Put the reference count of cache_cred in cachefiles_daemon_unbind() to
fix the problem. And also put cache_cred in cachefiles_add_cache() error
branch to avoid memory leaks.
Fixes: 9ae326a69004 ("CacheFiles: A cache that backs onto a mounted filesystem")
CC: stable(a)vger.kernel.org
Signed-off-by: Baokun Li <libaokun1(a)huawei.com>
Link: https://lore.kernel.org/r/20240217081431.796809-1-libaokun1@huawei.com
Acked-by: David Howells <dhowells(a)redhat.com>
Reviewed-by: Jingbo Xu <jefflexu(a)linux.alibaba.com>
Reviewed-by: Jeff Layton <jlayton(a)kernel.org>
Signed-off-by: Christian Brauner <brauner(a)kernel.org>
diff --git a/fs/cachefiles/cache.c b/fs/cachefiles/cache.c
index 7077f72e6f47..f449f7340aad 100644
--- a/fs/cachefiles/cache.c
+++ b/fs/cachefiles/cache.c
@@ -168,6 +168,8 @@ int cachefiles_add_cache(struct cachefiles_cache *cache)
dput(root);
error_open_root:
cachefiles_end_secure(cache, saved_cred);
+ put_cred(cache->cache_cred);
+ cache->cache_cred = NULL;
error_getsec:
fscache_relinquish_cache(cache_cookie);
cache->cache = NULL;
diff --git a/fs/cachefiles/daemon.c b/fs/cachefiles/daemon.c
index 3f24905f4066..6465e2574230 100644
--- a/fs/cachefiles/daemon.c
+++ b/fs/cachefiles/daemon.c
@@ -816,6 +816,7 @@ static void cachefiles_daemon_unbind(struct cachefiles_cache *cache)
cachefiles_put_directory(cache->graveyard);
cachefiles_put_directory(cache->store);
mntput(cache->mnt);
+ put_cred(cache->cache_cred);
kfree(cache->rootdirname);
kfree(cache->secctx);
The patch below does not apply to the 5.15-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.15.y
git checkout FETCH_HEAD
git cherry-pick -x e21a2f17566cbd64926fb8f16323972f7a064444
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024022635-princess-penniless-bfa6@gregkh' --subject-prefix 'PATCH 5.15.y' HEAD^..
Possible dependencies:
e21a2f17566c ("cachefiles: fix memory leak in cachefiles_add_cache()")
d1065b0a6fd9 ("cachefiles: Implement cache registration and withdrawal")
32759f7d7af5 ("cachefiles: Implement a function to get/create a directory in the cache")
1bd9c4e4f049 ("vfs, cachefiles: Mark a backing file in use with an inode flag")
80f94f29f677 ("cachefiles: Provide a function to check how much space there is")
8667d434b2a9 ("cachefiles: Register a miscdev and parse commands over it")
254947d47945 ("cachefiles: Add security derivation")
1493bf74bcf2 ("cachefiles: Add cache error reporting macro")
ecf5a6ce15f9 ("cachefiles: Add a couple of tracepoints for logging errors")
a70f6526267e ("cachefiles: Add some error injection support")
8390fbc46570 ("cachefiles: Define structs")
77443f6171f3 ("cachefiles: Introduce rewritten driver")
850cba069c26 ("cachefiles: Delete the cachefiles driver pending rewrite")
b6773cdb0e9f ("Merge tag 'for-5.16/ki_complete-2021-10-29' of git://git.kernel.dk/linux-block")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From e21a2f17566cbd64926fb8f16323972f7a064444 Mon Sep 17 00:00:00 2001
From: Baokun Li <libaokun1(a)huawei.com>
Date: Sat, 17 Feb 2024 16:14:31 +0800
Subject: [PATCH] cachefiles: fix memory leak in cachefiles_add_cache()
The following memory leak was reported after unbinding /dev/cachefiles:
==================================================================
unreferenced object 0xffff9b674176e3c0 (size 192):
comm "cachefilesd2", pid 680, jiffies 4294881224
hex dump (first 32 bytes):
01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
backtrace (crc ea38a44b):
[<ffffffff8eb8a1a5>] kmem_cache_alloc+0x2d5/0x370
[<ffffffff8e917f86>] prepare_creds+0x26/0x2e0
[<ffffffffc002eeef>] cachefiles_determine_cache_security+0x1f/0x120
[<ffffffffc00243ec>] cachefiles_add_cache+0x13c/0x3a0
[<ffffffffc0025216>] cachefiles_daemon_write+0x146/0x1c0
[<ffffffff8ebc4a3b>] vfs_write+0xcb/0x520
[<ffffffff8ebc5069>] ksys_write+0x69/0xf0
[<ffffffff8f6d4662>] do_syscall_64+0x72/0x140
[<ffffffff8f8000aa>] entry_SYSCALL_64_after_hwframe+0x6e/0x76
==================================================================
Put the reference count of cache_cred in cachefiles_daemon_unbind() to
fix the problem. And also put cache_cred in cachefiles_add_cache() error
branch to avoid memory leaks.
Fixes: 9ae326a69004 ("CacheFiles: A cache that backs onto a mounted filesystem")
CC: stable(a)vger.kernel.org
Signed-off-by: Baokun Li <libaokun1(a)huawei.com>
Link: https://lore.kernel.org/r/20240217081431.796809-1-libaokun1@huawei.com
Acked-by: David Howells <dhowells(a)redhat.com>
Reviewed-by: Jingbo Xu <jefflexu(a)linux.alibaba.com>
Reviewed-by: Jeff Layton <jlayton(a)kernel.org>
Signed-off-by: Christian Brauner <brauner(a)kernel.org>
diff --git a/fs/cachefiles/cache.c b/fs/cachefiles/cache.c
index 7077f72e6f47..f449f7340aad 100644
--- a/fs/cachefiles/cache.c
+++ b/fs/cachefiles/cache.c
@@ -168,6 +168,8 @@ int cachefiles_add_cache(struct cachefiles_cache *cache)
dput(root);
error_open_root:
cachefiles_end_secure(cache, saved_cred);
+ put_cred(cache->cache_cred);
+ cache->cache_cred = NULL;
error_getsec:
fscache_relinquish_cache(cache_cookie);
cache->cache = NULL;
diff --git a/fs/cachefiles/daemon.c b/fs/cachefiles/daemon.c
index 3f24905f4066..6465e2574230 100644
--- a/fs/cachefiles/daemon.c
+++ b/fs/cachefiles/daemon.c
@@ -816,6 +816,7 @@ static void cachefiles_daemon_unbind(struct cachefiles_cache *cache)
cachefiles_put_directory(cache->graveyard);
cachefiles_put_directory(cache->store);
mntput(cache->mnt);
+ put_cred(cache->cache_cred);
kfree(cache->rootdirname);
kfree(cache->secctx);
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.4.y
git checkout FETCH_HEAD
git cherry-pick -x dbcbfd662a725641d118fb3ae5ffb7be4e3d0fb0
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024022608-trombone-banker-5ed4@gregkh' --subject-prefix 'PATCH 5.4.y' HEAD^..
Possible dependencies:
dbcbfd662a72 ("platform/x86: touchscreen_dmi: Allow partial (prefix) matches for ACPI names")
87eaede45385 ("platform/x86: touchscreen_dmi: Handle device properties with software node API")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From dbcbfd662a725641d118fb3ae5ffb7be4e3d0fb0 Mon Sep 17 00:00:00 2001
From: Hans de Goede <hdegoede(a)redhat.com>
Date: Mon, 12 Feb 2024 13:06:07 +0100
Subject: [PATCH] platform/x86: touchscreen_dmi: Allow partial (prefix) matches
for ACPI names
On some devices the ACPI name of the touchscreen is e.g. either
MSSL1680:00 or MSSL1680:01 depending on the BIOS version.
This happens for example on the "Chuwi Hi8 Air" tablet where the initial
commit's ts_data uses "MSSL1680:00" but the tablets from the github issue
and linux-hardware.org probe linked below both use "MSSL1680:01".
Replace the strcmp() match on ts_data->acpi_name with a strstarts()
check to allow using a partial match on just the ACPI HID of "MSSL1680"
and change the ts_data->acpi_name for the "Chuwi Hi8 Air" accordingly
to fix the touchscreen not working on models where it is "MSSL1680:01".
Note this drops the length check for I2C_NAME_SIZE. This never was
necessary since the ACPI names used are never more then 11 chars and
I2C_NAME_SIZE is 20 so the replaced strncmp() would always stop long
before reaching I2C_NAME_SIZE.
Link: https://linux-hardware.org/?computer=AC4301C0542A
Fixes: bbb97d728f77 ("platform/x86: touchscreen_dmi: Add info for the Chuwi Hi8 Air tablet")
Closes: https://github.com/onitake/gsl-firmware/issues/91
Cc: stable(a)vger.kernel.org
Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy(a)linux.intel.com>
Signed-off-by: Hans de Goede <hdegoede(a)redhat.com>
Link: https://lore.kernel.org/r/20240212120608.30469-1-hdegoede@redhat.com
diff --git a/drivers/platform/x86/touchscreen_dmi.c b/drivers/platform/x86/touchscreen_dmi.c
index 7aee5e9ff2b8..969477c83e56 100644
--- a/drivers/platform/x86/touchscreen_dmi.c
+++ b/drivers/platform/x86/touchscreen_dmi.c
@@ -81,7 +81,7 @@ static const struct property_entry chuwi_hi8_air_props[] = {
};
static const struct ts_dmi_data chuwi_hi8_air_data = {
- .acpi_name = "MSSL1680:00",
+ .acpi_name = "MSSL1680",
.properties = chuwi_hi8_air_props,
};
@@ -1821,7 +1821,7 @@ static void ts_dmi_add_props(struct i2c_client *client)
int error;
if (has_acpi_companion(dev) &&
- !strncmp(ts_data->acpi_name, client->name, I2C_NAME_SIZE)) {
+ strstarts(client->name, ts_data->acpi_name)) {
error = device_create_managed_software_node(dev, ts_data->properties, NULL);
if (error)
dev_err(dev, "failed to add properties: %d\n", error);
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.10.y
git checkout FETCH_HEAD
git cherry-pick -x dbcbfd662a725641d118fb3ae5ffb7be4e3d0fb0
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024022607-tainted-tinderbox-4920@gregkh' --subject-prefix 'PATCH 5.10.y' HEAD^..
Possible dependencies:
dbcbfd662a72 ("platform/x86: touchscreen_dmi: Allow partial (prefix) matches for ACPI names")
87eaede45385 ("platform/x86: touchscreen_dmi: Handle device properties with software node API")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From dbcbfd662a725641d118fb3ae5ffb7be4e3d0fb0 Mon Sep 17 00:00:00 2001
From: Hans de Goede <hdegoede(a)redhat.com>
Date: Mon, 12 Feb 2024 13:06:07 +0100
Subject: [PATCH] platform/x86: touchscreen_dmi: Allow partial (prefix) matches
for ACPI names
On some devices the ACPI name of the touchscreen is e.g. either
MSSL1680:00 or MSSL1680:01 depending on the BIOS version.
This happens for example on the "Chuwi Hi8 Air" tablet where the initial
commit's ts_data uses "MSSL1680:00" but the tablets from the github issue
and linux-hardware.org probe linked below both use "MSSL1680:01".
Replace the strcmp() match on ts_data->acpi_name with a strstarts()
check to allow using a partial match on just the ACPI HID of "MSSL1680"
and change the ts_data->acpi_name for the "Chuwi Hi8 Air" accordingly
to fix the touchscreen not working on models where it is "MSSL1680:01".
Note this drops the length check for I2C_NAME_SIZE. This never was
necessary since the ACPI names used are never more then 11 chars and
I2C_NAME_SIZE is 20 so the replaced strncmp() would always stop long
before reaching I2C_NAME_SIZE.
Link: https://linux-hardware.org/?computer=AC4301C0542A
Fixes: bbb97d728f77 ("platform/x86: touchscreen_dmi: Add info for the Chuwi Hi8 Air tablet")
Closes: https://github.com/onitake/gsl-firmware/issues/91
Cc: stable(a)vger.kernel.org
Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy(a)linux.intel.com>
Signed-off-by: Hans de Goede <hdegoede(a)redhat.com>
Link: https://lore.kernel.org/r/20240212120608.30469-1-hdegoede@redhat.com
diff --git a/drivers/platform/x86/touchscreen_dmi.c b/drivers/platform/x86/touchscreen_dmi.c
index 7aee5e9ff2b8..969477c83e56 100644
--- a/drivers/platform/x86/touchscreen_dmi.c
+++ b/drivers/platform/x86/touchscreen_dmi.c
@@ -81,7 +81,7 @@ static const struct property_entry chuwi_hi8_air_props[] = {
};
static const struct ts_dmi_data chuwi_hi8_air_data = {
- .acpi_name = "MSSL1680:00",
+ .acpi_name = "MSSL1680",
.properties = chuwi_hi8_air_props,
};
@@ -1821,7 +1821,7 @@ static void ts_dmi_add_props(struct i2c_client *client)
int error;
if (has_acpi_companion(dev) &&
- !strncmp(ts_data->acpi_name, client->name, I2C_NAME_SIZE)) {
+ strstarts(client->name, ts_data->acpi_name)) {
error = device_create_managed_software_node(dev, ts_data->properties, NULL);
if (error)
dev_err(dev, "failed to add properties: %d\n", error);
The patch below does not apply to the 4.19-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-4.19.y
git checkout FETCH_HEAD
git cherry-pick -x 13ddaf26be324a7f951891ecd9ccd04466d27458
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024022608-green-engaging-46db@gregkh' --subject-prefix 'PATCH 4.19.y' HEAD^..
Possible dependencies:
13ddaf26be32 ("mm/swap: fix race when skipping swapcache")
c9edc242811d ("swap: add swap_cache_get_folio()")
1baec203b77c ("mm/khugepaged: try to free transhuge swapcache when possible")
442701e7058b ("mm/swap: remove swap_cache_info statistics")
014bb1de4fc1 ("mm: create new mm/swap.h header file")
1493a1913e34 ("mm/swap: remember PG_anon_exclusive via a swp pte bit")
6c287605fd56 ("mm: remember exclusively mapped anonymous pages with PG_anon_exclusive")
78fbe906cc90 ("mm/page-flags: reuse PG_mappedtodisk as PG_anon_exclusive for PageAnon() pages")
6c54dc6c7437 ("mm/rmap: use page_move_anon_rmap() when reusing a mapped PageAnon() page exclusively")
28c5209dfd5f ("mm/rmap: pass rmap flags to hugepage_add_anon_rmap()")
f1e2db12e45b ("mm/rmap: remove do_page_add_anon_rmap()")
14f9135d5470 ("mm/rmap: convert RMAP flags to a proper distinct rmap_t type")
fb3d824d1a46 ("mm/rmap: split page_dup_rmap() into page_dup_file_rmap() and page_try_dup_anon_rmap()")
b51ad4f8679e ("mm/memory: slightly simplify copy_present_pte()")
623a1ddfeb23 ("mm/hugetlb: take src_mm->write_protect_seq in copy_hugetlb_page_range()")
3bff7e3f1f16 ("mm/huge_memory: streamline COW logic in do_huge_pmd_wp_page()")
c145e0b47c77 ("mm: streamline COW logic in do_swap_page()")
84d60fdd3733 ("mm: slightly clarify KSM logic in do_swap_page()")
53a05ad9f21d ("mm: optimize do_wp_page() for exclusive pages in the swapcache")
6b1f86f8e9c7 ("Merge tag 'folio-5.18b' of git://git.infradead.org/users/willy/pagecache")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 13ddaf26be324a7f951891ecd9ccd04466d27458 Mon Sep 17 00:00:00 2001
From: Kairui Song <kasong(a)tencent.com>
Date: Wed, 7 Feb 2024 02:25:59 +0800
Subject: [PATCH] mm/swap: fix race when skipping swapcache
When skipping swapcache for SWP_SYNCHRONOUS_IO, if two or more threads
swapin the same entry at the same time, they get different pages (A, B).
Before one thread (T0) finishes the swapin and installs page (A) to the
PTE, another thread (T1) could finish swapin of page (B), swap_free the
entry, then swap out the possibly modified page reusing the same entry.
It breaks the pte_same check in (T0) because PTE value is unchanged,
causing ABA problem. Thread (T0) will install a stalled page (A) into the
PTE and cause data corruption.
One possible callstack is like this:
CPU0 CPU1
---- ----
do_swap_page() do_swap_page() with same entry
<direct swapin path> <direct swapin path>
<alloc page A> <alloc page B>
swap_read_folio() <- read to page A swap_read_folio() <- read to page B
<slow on later locks or interrupt> <finished swapin first>
... set_pte_at()
swap_free() <- entry is free
<write to page B, now page A stalled>
<swap out page B to same swap entry>
pte_same() <- Check pass, PTE seems
unchanged, but page A
is stalled!
swap_free() <- page B content lost!
set_pte_at() <- staled page A installed!
And besides, for ZRAM, swap_free() allows the swap device to discard the
entry content, so even if page (B) is not modified, if swap_read_folio()
on CPU0 happens later than swap_free() on CPU1, it may also cause data
loss.
To fix this, reuse swapcache_prepare which will pin the swap entry using
the cache flag, and allow only one thread to swap it in, also prevent any
parallel code from putting the entry in the cache. Release the pin after
PT unlocked.
Racers just loop and wait since it's a rare and very short event. A
schedule_timeout_uninterruptible(1) call is added to avoid repeated page
faults wasting too much CPU, causing livelock or adding too much noise to
perf statistics. A similar livelock issue was described in commit
029c4628b2eb ("mm: swap: get rid of livelock in swapin readahead")
Reproducer:
This race issue can be triggered easily using a well constructed
reproducer and patched brd (with a delay in read path) [1]:
With latest 6.8 mainline, race caused data loss can be observed easily:
$ gcc -g -lpthread test-thread-swap-race.c && ./a.out
Polulating 32MB of memory region...
Keep swapping out...
Starting round 0...
Spawning 65536 workers...
32746 workers spawned, wait for done...
Round 0: Error on 0x5aa00, expected 32746, got 32743, 3 data loss!
Round 0: Error on 0x395200, expected 32746, got 32743, 3 data loss!
Round 0: Error on 0x3fd000, expected 32746, got 32737, 9 data loss!
Round 0 Failed, 15 data loss!
This reproducer spawns multiple threads sharing the same memory region
using a small swap device. Every two threads updates mapped pages one by
one in opposite direction trying to create a race, with one dedicated
thread keep swapping out the data out using madvise.
The reproducer created a reproduce rate of about once every 5 minutes, so
the race should be totally possible in production.
After this patch, I ran the reproducer for over a few hundred rounds and
no data loss observed.
Performance overhead is minimal, microbenchmark swapin 10G from 32G
zram:
Before: 10934698 us
After: 11157121 us
Cached: 13155355 us (Dropping SWP_SYNCHRONOUS_IO flag)
[kasong(a)tencent.com: v4]
Link: https://lkml.kernel.org/r/20240219082040.7495-1-ryncsn@gmail.com
Link: https://lkml.kernel.org/r/20240206182559.32264-1-ryncsn@gmail.com
Fixes: 0bcac06f27d7 ("mm, swap: skip swapcache for swapin of synchronous device")
Reported-by: "Huang, Ying" <ying.huang(a)intel.com>
Closes: https://lore.kernel.org/lkml/87bk92gqpx.fsf_-_@yhuang6-desk2.ccr.corp.intel…
Link: https://github.com/ryncsn/emm-test-project/tree/master/swap-stress-race [1]
Signed-off-by: Kairui Song <kasong(a)tencent.com>
Reviewed-by: "Huang, Ying" <ying.huang(a)intel.com>
Acked-by: Yu Zhao <yuzhao(a)google.com>
Acked-by: David Hildenbrand <david(a)redhat.com>
Acked-by: Chris Li <chrisl(a)kernel.org>
Cc: Hugh Dickins <hughd(a)google.com>
Cc: Johannes Weiner <hannes(a)cmpxchg.org>
Cc: Matthew Wilcox (Oracle) <willy(a)infradead.org>
Cc: Michal Hocko <mhocko(a)suse.com>
Cc: Minchan Kim <minchan(a)kernel.org>
Cc: Yosry Ahmed <yosryahmed(a)google.com>
Cc: Yu Zhao <yuzhao(a)google.com>
Cc: Barry Song <21cnbao(a)gmail.com>
Cc: SeongJae Park <sj(a)kernel.org>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
diff --git a/include/linux/swap.h b/include/linux/swap.h
index 4db00ddad261..8d28f6091a32 100644
--- a/include/linux/swap.h
+++ b/include/linux/swap.h
@@ -549,6 +549,11 @@ static inline int swap_duplicate(swp_entry_t swp)
return 0;
}
+static inline int swapcache_prepare(swp_entry_t swp)
+{
+ return 0;
+}
+
static inline void swap_free(swp_entry_t swp)
{
}
diff --git a/mm/memory.c b/mm/memory.c
index 15f8b10ea17c..0bfc8b007c01 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -3799,6 +3799,7 @@ vm_fault_t do_swap_page(struct vm_fault *vmf)
struct page *page;
struct swap_info_struct *si = NULL;
rmap_t rmap_flags = RMAP_NONE;
+ bool need_clear_cache = false;
bool exclusive = false;
swp_entry_t entry;
pte_t pte;
@@ -3867,6 +3868,20 @@ vm_fault_t do_swap_page(struct vm_fault *vmf)
if (!folio) {
if (data_race(si->flags & SWP_SYNCHRONOUS_IO) &&
__swap_count(entry) == 1) {
+ /*
+ * Prevent parallel swapin from proceeding with
+ * the cache flag. Otherwise, another thread may
+ * finish swapin first, free the entry, and swapout
+ * reusing the same entry. It's undetectable as
+ * pte_same() returns true due to entry reuse.
+ */
+ if (swapcache_prepare(entry)) {
+ /* Relax a bit to prevent rapid repeated page faults */
+ schedule_timeout_uninterruptible(1);
+ goto out;
+ }
+ need_clear_cache = true;
+
/* skip swapcache */
folio = vma_alloc_folio(GFP_HIGHUSER_MOVABLE, 0,
vma, vmf->address, false);
@@ -4117,6 +4132,9 @@ vm_fault_t do_swap_page(struct vm_fault *vmf)
if (vmf->pte)
pte_unmap_unlock(vmf->pte, vmf->ptl);
out:
+ /* Clear the swap cache pin for direct swapin after PTL unlock */
+ if (need_clear_cache)
+ swapcache_clear(si, entry);
if (si)
put_swap_device(si);
return ret;
@@ -4131,6 +4149,8 @@ vm_fault_t do_swap_page(struct vm_fault *vmf)
folio_unlock(swapcache);
folio_put(swapcache);
}
+ if (need_clear_cache)
+ swapcache_clear(si, entry);
if (si)
put_swap_device(si);
return ret;
diff --git a/mm/swap.h b/mm/swap.h
index 758c46ca671e..fc2f6ade7f80 100644
--- a/mm/swap.h
+++ b/mm/swap.h
@@ -41,6 +41,7 @@ void __delete_from_swap_cache(struct folio *folio,
void delete_from_swap_cache(struct folio *folio);
void clear_shadow_from_swap_cache(int type, unsigned long begin,
unsigned long end);
+void swapcache_clear(struct swap_info_struct *si, swp_entry_t entry);
struct folio *swap_cache_get_folio(swp_entry_t entry,
struct vm_area_struct *vma, unsigned long addr);
struct folio *filemap_get_incore_folio(struct address_space *mapping,
@@ -97,6 +98,10 @@ static inline int swap_writepage(struct page *p, struct writeback_control *wbc)
return 0;
}
+static inline void swapcache_clear(struct swap_info_struct *si, swp_entry_t entry)
+{
+}
+
static inline struct folio *swap_cache_get_folio(swp_entry_t entry,
struct vm_area_struct *vma, unsigned long addr)
{
diff --git a/mm/swapfile.c b/mm/swapfile.c
index 556ff7347d5f..746aa9da5302 100644
--- a/mm/swapfile.c
+++ b/mm/swapfile.c
@@ -3365,6 +3365,19 @@ int swapcache_prepare(swp_entry_t entry)
return __swap_duplicate(entry, SWAP_HAS_CACHE);
}
+void swapcache_clear(struct swap_info_struct *si, swp_entry_t entry)
+{
+ struct swap_cluster_info *ci;
+ unsigned long offset = swp_offset(entry);
+ unsigned char usage;
+
+ ci = lock_cluster_or_swap_info(si, offset);
+ usage = __swap_entry_free_locked(si, offset, SWAP_HAS_CACHE);
+ unlock_cluster_or_swap_info(si, ci);
+ if (!usage)
+ free_swap_slot(entry);
+}
+
struct swap_info_struct *swp_swap_info(swp_entry_t entry)
{
return swap_type_to_swap_info(swp_type(entry));
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.4.y
git checkout FETCH_HEAD
git cherry-pick -x 13ddaf26be324a7f951891ecd9ccd04466d27458
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024022604-labrador-edgy-5b56@gregkh' --subject-prefix 'PATCH 5.4.y' HEAD^..
Possible dependencies:
13ddaf26be32 ("mm/swap: fix race when skipping swapcache")
c9edc242811d ("swap: add swap_cache_get_folio()")
1baec203b77c ("mm/khugepaged: try to free transhuge swapcache when possible")
442701e7058b ("mm/swap: remove swap_cache_info statistics")
014bb1de4fc1 ("mm: create new mm/swap.h header file")
1493a1913e34 ("mm/swap: remember PG_anon_exclusive via a swp pte bit")
6c287605fd56 ("mm: remember exclusively mapped anonymous pages with PG_anon_exclusive")
78fbe906cc90 ("mm/page-flags: reuse PG_mappedtodisk as PG_anon_exclusive for PageAnon() pages")
6c54dc6c7437 ("mm/rmap: use page_move_anon_rmap() when reusing a mapped PageAnon() page exclusively")
28c5209dfd5f ("mm/rmap: pass rmap flags to hugepage_add_anon_rmap()")
f1e2db12e45b ("mm/rmap: remove do_page_add_anon_rmap()")
14f9135d5470 ("mm/rmap: convert RMAP flags to a proper distinct rmap_t type")
fb3d824d1a46 ("mm/rmap: split page_dup_rmap() into page_dup_file_rmap() and page_try_dup_anon_rmap()")
b51ad4f8679e ("mm/memory: slightly simplify copy_present_pte()")
623a1ddfeb23 ("mm/hugetlb: take src_mm->write_protect_seq in copy_hugetlb_page_range()")
3bff7e3f1f16 ("mm/huge_memory: streamline COW logic in do_huge_pmd_wp_page()")
c145e0b47c77 ("mm: streamline COW logic in do_swap_page()")
84d60fdd3733 ("mm: slightly clarify KSM logic in do_swap_page()")
53a05ad9f21d ("mm: optimize do_wp_page() for exclusive pages in the swapcache")
6b1f86f8e9c7 ("Merge tag 'folio-5.18b' of git://git.infradead.org/users/willy/pagecache")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 13ddaf26be324a7f951891ecd9ccd04466d27458 Mon Sep 17 00:00:00 2001
From: Kairui Song <kasong(a)tencent.com>
Date: Wed, 7 Feb 2024 02:25:59 +0800
Subject: [PATCH] mm/swap: fix race when skipping swapcache
When skipping swapcache for SWP_SYNCHRONOUS_IO, if two or more threads
swapin the same entry at the same time, they get different pages (A, B).
Before one thread (T0) finishes the swapin and installs page (A) to the
PTE, another thread (T1) could finish swapin of page (B), swap_free the
entry, then swap out the possibly modified page reusing the same entry.
It breaks the pte_same check in (T0) because PTE value is unchanged,
causing ABA problem. Thread (T0) will install a stalled page (A) into the
PTE and cause data corruption.
One possible callstack is like this:
CPU0 CPU1
---- ----
do_swap_page() do_swap_page() with same entry
<direct swapin path> <direct swapin path>
<alloc page A> <alloc page B>
swap_read_folio() <- read to page A swap_read_folio() <- read to page B
<slow on later locks or interrupt> <finished swapin first>
... set_pte_at()
swap_free() <- entry is free
<write to page B, now page A stalled>
<swap out page B to same swap entry>
pte_same() <- Check pass, PTE seems
unchanged, but page A
is stalled!
swap_free() <- page B content lost!
set_pte_at() <- staled page A installed!
And besides, for ZRAM, swap_free() allows the swap device to discard the
entry content, so even if page (B) is not modified, if swap_read_folio()
on CPU0 happens later than swap_free() on CPU1, it may also cause data
loss.
To fix this, reuse swapcache_prepare which will pin the swap entry using
the cache flag, and allow only one thread to swap it in, also prevent any
parallel code from putting the entry in the cache. Release the pin after
PT unlocked.
Racers just loop and wait since it's a rare and very short event. A
schedule_timeout_uninterruptible(1) call is added to avoid repeated page
faults wasting too much CPU, causing livelock or adding too much noise to
perf statistics. A similar livelock issue was described in commit
029c4628b2eb ("mm: swap: get rid of livelock in swapin readahead")
Reproducer:
This race issue can be triggered easily using a well constructed
reproducer and patched brd (with a delay in read path) [1]:
With latest 6.8 mainline, race caused data loss can be observed easily:
$ gcc -g -lpthread test-thread-swap-race.c && ./a.out
Polulating 32MB of memory region...
Keep swapping out...
Starting round 0...
Spawning 65536 workers...
32746 workers spawned, wait for done...
Round 0: Error on 0x5aa00, expected 32746, got 32743, 3 data loss!
Round 0: Error on 0x395200, expected 32746, got 32743, 3 data loss!
Round 0: Error on 0x3fd000, expected 32746, got 32737, 9 data loss!
Round 0 Failed, 15 data loss!
This reproducer spawns multiple threads sharing the same memory region
using a small swap device. Every two threads updates mapped pages one by
one in opposite direction trying to create a race, with one dedicated
thread keep swapping out the data out using madvise.
The reproducer created a reproduce rate of about once every 5 minutes, so
the race should be totally possible in production.
After this patch, I ran the reproducer for over a few hundred rounds and
no data loss observed.
Performance overhead is minimal, microbenchmark swapin 10G from 32G
zram:
Before: 10934698 us
After: 11157121 us
Cached: 13155355 us (Dropping SWP_SYNCHRONOUS_IO flag)
[kasong(a)tencent.com: v4]
Link: https://lkml.kernel.org/r/20240219082040.7495-1-ryncsn@gmail.com
Link: https://lkml.kernel.org/r/20240206182559.32264-1-ryncsn@gmail.com
Fixes: 0bcac06f27d7 ("mm, swap: skip swapcache for swapin of synchronous device")
Reported-by: "Huang, Ying" <ying.huang(a)intel.com>
Closes: https://lore.kernel.org/lkml/87bk92gqpx.fsf_-_@yhuang6-desk2.ccr.corp.intel…
Link: https://github.com/ryncsn/emm-test-project/tree/master/swap-stress-race [1]
Signed-off-by: Kairui Song <kasong(a)tencent.com>
Reviewed-by: "Huang, Ying" <ying.huang(a)intel.com>
Acked-by: Yu Zhao <yuzhao(a)google.com>
Acked-by: David Hildenbrand <david(a)redhat.com>
Acked-by: Chris Li <chrisl(a)kernel.org>
Cc: Hugh Dickins <hughd(a)google.com>
Cc: Johannes Weiner <hannes(a)cmpxchg.org>
Cc: Matthew Wilcox (Oracle) <willy(a)infradead.org>
Cc: Michal Hocko <mhocko(a)suse.com>
Cc: Minchan Kim <minchan(a)kernel.org>
Cc: Yosry Ahmed <yosryahmed(a)google.com>
Cc: Yu Zhao <yuzhao(a)google.com>
Cc: Barry Song <21cnbao(a)gmail.com>
Cc: SeongJae Park <sj(a)kernel.org>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
diff --git a/include/linux/swap.h b/include/linux/swap.h
index 4db00ddad261..8d28f6091a32 100644
--- a/include/linux/swap.h
+++ b/include/linux/swap.h
@@ -549,6 +549,11 @@ static inline int swap_duplicate(swp_entry_t swp)
return 0;
}
+static inline int swapcache_prepare(swp_entry_t swp)
+{
+ return 0;
+}
+
static inline void swap_free(swp_entry_t swp)
{
}
diff --git a/mm/memory.c b/mm/memory.c
index 15f8b10ea17c..0bfc8b007c01 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -3799,6 +3799,7 @@ vm_fault_t do_swap_page(struct vm_fault *vmf)
struct page *page;
struct swap_info_struct *si = NULL;
rmap_t rmap_flags = RMAP_NONE;
+ bool need_clear_cache = false;
bool exclusive = false;
swp_entry_t entry;
pte_t pte;
@@ -3867,6 +3868,20 @@ vm_fault_t do_swap_page(struct vm_fault *vmf)
if (!folio) {
if (data_race(si->flags & SWP_SYNCHRONOUS_IO) &&
__swap_count(entry) == 1) {
+ /*
+ * Prevent parallel swapin from proceeding with
+ * the cache flag. Otherwise, another thread may
+ * finish swapin first, free the entry, and swapout
+ * reusing the same entry. It's undetectable as
+ * pte_same() returns true due to entry reuse.
+ */
+ if (swapcache_prepare(entry)) {
+ /* Relax a bit to prevent rapid repeated page faults */
+ schedule_timeout_uninterruptible(1);
+ goto out;
+ }
+ need_clear_cache = true;
+
/* skip swapcache */
folio = vma_alloc_folio(GFP_HIGHUSER_MOVABLE, 0,
vma, vmf->address, false);
@@ -4117,6 +4132,9 @@ vm_fault_t do_swap_page(struct vm_fault *vmf)
if (vmf->pte)
pte_unmap_unlock(vmf->pte, vmf->ptl);
out:
+ /* Clear the swap cache pin for direct swapin after PTL unlock */
+ if (need_clear_cache)
+ swapcache_clear(si, entry);
if (si)
put_swap_device(si);
return ret;
@@ -4131,6 +4149,8 @@ vm_fault_t do_swap_page(struct vm_fault *vmf)
folio_unlock(swapcache);
folio_put(swapcache);
}
+ if (need_clear_cache)
+ swapcache_clear(si, entry);
if (si)
put_swap_device(si);
return ret;
diff --git a/mm/swap.h b/mm/swap.h
index 758c46ca671e..fc2f6ade7f80 100644
--- a/mm/swap.h
+++ b/mm/swap.h
@@ -41,6 +41,7 @@ void __delete_from_swap_cache(struct folio *folio,
void delete_from_swap_cache(struct folio *folio);
void clear_shadow_from_swap_cache(int type, unsigned long begin,
unsigned long end);
+void swapcache_clear(struct swap_info_struct *si, swp_entry_t entry);
struct folio *swap_cache_get_folio(swp_entry_t entry,
struct vm_area_struct *vma, unsigned long addr);
struct folio *filemap_get_incore_folio(struct address_space *mapping,
@@ -97,6 +98,10 @@ static inline int swap_writepage(struct page *p, struct writeback_control *wbc)
return 0;
}
+static inline void swapcache_clear(struct swap_info_struct *si, swp_entry_t entry)
+{
+}
+
static inline struct folio *swap_cache_get_folio(swp_entry_t entry,
struct vm_area_struct *vma, unsigned long addr)
{
diff --git a/mm/swapfile.c b/mm/swapfile.c
index 556ff7347d5f..746aa9da5302 100644
--- a/mm/swapfile.c
+++ b/mm/swapfile.c
@@ -3365,6 +3365,19 @@ int swapcache_prepare(swp_entry_t entry)
return __swap_duplicate(entry, SWAP_HAS_CACHE);
}
+void swapcache_clear(struct swap_info_struct *si, swp_entry_t entry)
+{
+ struct swap_cluster_info *ci;
+ unsigned long offset = swp_offset(entry);
+ unsigned char usage;
+
+ ci = lock_cluster_or_swap_info(si, offset);
+ usage = __swap_entry_free_locked(si, offset, SWAP_HAS_CACHE);
+ unlock_cluster_or_swap_info(si, ci);
+ if (!usage)
+ free_swap_slot(entry);
+}
+
struct swap_info_struct *swp_swap_info(swp_entry_t entry)
{
return swap_type_to_swap_info(swp_type(entry));
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.10.y
git checkout FETCH_HEAD
git cherry-pick -x 13ddaf26be324a7f951891ecd9ccd04466d27458
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024022601-skinny-audacious-d173@gregkh' --subject-prefix 'PATCH 5.10.y' HEAD^..
Possible dependencies:
13ddaf26be32 ("mm/swap: fix race when skipping swapcache")
c9edc242811d ("swap: add swap_cache_get_folio()")
1baec203b77c ("mm/khugepaged: try to free transhuge swapcache when possible")
442701e7058b ("mm/swap: remove swap_cache_info statistics")
014bb1de4fc1 ("mm: create new mm/swap.h header file")
1493a1913e34 ("mm/swap: remember PG_anon_exclusive via a swp pte bit")
6c287605fd56 ("mm: remember exclusively mapped anonymous pages with PG_anon_exclusive")
78fbe906cc90 ("mm/page-flags: reuse PG_mappedtodisk as PG_anon_exclusive for PageAnon() pages")
6c54dc6c7437 ("mm/rmap: use page_move_anon_rmap() when reusing a mapped PageAnon() page exclusively")
28c5209dfd5f ("mm/rmap: pass rmap flags to hugepage_add_anon_rmap()")
f1e2db12e45b ("mm/rmap: remove do_page_add_anon_rmap()")
14f9135d5470 ("mm/rmap: convert RMAP flags to a proper distinct rmap_t type")
fb3d824d1a46 ("mm/rmap: split page_dup_rmap() into page_dup_file_rmap() and page_try_dup_anon_rmap()")
b51ad4f8679e ("mm/memory: slightly simplify copy_present_pte()")
623a1ddfeb23 ("mm/hugetlb: take src_mm->write_protect_seq in copy_hugetlb_page_range()")
3bff7e3f1f16 ("mm/huge_memory: streamline COW logic in do_huge_pmd_wp_page()")
c145e0b47c77 ("mm: streamline COW logic in do_swap_page()")
84d60fdd3733 ("mm: slightly clarify KSM logic in do_swap_page()")
53a05ad9f21d ("mm: optimize do_wp_page() for exclusive pages in the swapcache")
6b1f86f8e9c7 ("Merge tag 'folio-5.18b' of git://git.infradead.org/users/willy/pagecache")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 13ddaf26be324a7f951891ecd9ccd04466d27458 Mon Sep 17 00:00:00 2001
From: Kairui Song <kasong(a)tencent.com>
Date: Wed, 7 Feb 2024 02:25:59 +0800
Subject: [PATCH] mm/swap: fix race when skipping swapcache
When skipping swapcache for SWP_SYNCHRONOUS_IO, if two or more threads
swapin the same entry at the same time, they get different pages (A, B).
Before one thread (T0) finishes the swapin and installs page (A) to the
PTE, another thread (T1) could finish swapin of page (B), swap_free the
entry, then swap out the possibly modified page reusing the same entry.
It breaks the pte_same check in (T0) because PTE value is unchanged,
causing ABA problem. Thread (T0) will install a stalled page (A) into the
PTE and cause data corruption.
One possible callstack is like this:
CPU0 CPU1
---- ----
do_swap_page() do_swap_page() with same entry
<direct swapin path> <direct swapin path>
<alloc page A> <alloc page B>
swap_read_folio() <- read to page A swap_read_folio() <- read to page B
<slow on later locks or interrupt> <finished swapin first>
... set_pte_at()
swap_free() <- entry is free
<write to page B, now page A stalled>
<swap out page B to same swap entry>
pte_same() <- Check pass, PTE seems
unchanged, but page A
is stalled!
swap_free() <- page B content lost!
set_pte_at() <- staled page A installed!
And besides, for ZRAM, swap_free() allows the swap device to discard the
entry content, so even if page (B) is not modified, if swap_read_folio()
on CPU0 happens later than swap_free() on CPU1, it may also cause data
loss.
To fix this, reuse swapcache_prepare which will pin the swap entry using
the cache flag, and allow only one thread to swap it in, also prevent any
parallel code from putting the entry in the cache. Release the pin after
PT unlocked.
Racers just loop and wait since it's a rare and very short event. A
schedule_timeout_uninterruptible(1) call is added to avoid repeated page
faults wasting too much CPU, causing livelock or adding too much noise to
perf statistics. A similar livelock issue was described in commit
029c4628b2eb ("mm: swap: get rid of livelock in swapin readahead")
Reproducer:
This race issue can be triggered easily using a well constructed
reproducer and patched brd (with a delay in read path) [1]:
With latest 6.8 mainline, race caused data loss can be observed easily:
$ gcc -g -lpthread test-thread-swap-race.c && ./a.out
Polulating 32MB of memory region...
Keep swapping out...
Starting round 0...
Spawning 65536 workers...
32746 workers spawned, wait for done...
Round 0: Error on 0x5aa00, expected 32746, got 32743, 3 data loss!
Round 0: Error on 0x395200, expected 32746, got 32743, 3 data loss!
Round 0: Error on 0x3fd000, expected 32746, got 32737, 9 data loss!
Round 0 Failed, 15 data loss!
This reproducer spawns multiple threads sharing the same memory region
using a small swap device. Every two threads updates mapped pages one by
one in opposite direction trying to create a race, with one dedicated
thread keep swapping out the data out using madvise.
The reproducer created a reproduce rate of about once every 5 minutes, so
the race should be totally possible in production.
After this patch, I ran the reproducer for over a few hundred rounds and
no data loss observed.
Performance overhead is minimal, microbenchmark swapin 10G from 32G
zram:
Before: 10934698 us
After: 11157121 us
Cached: 13155355 us (Dropping SWP_SYNCHRONOUS_IO flag)
[kasong(a)tencent.com: v4]
Link: https://lkml.kernel.org/r/20240219082040.7495-1-ryncsn@gmail.com
Link: https://lkml.kernel.org/r/20240206182559.32264-1-ryncsn@gmail.com
Fixes: 0bcac06f27d7 ("mm, swap: skip swapcache for swapin of synchronous device")
Reported-by: "Huang, Ying" <ying.huang(a)intel.com>
Closes: https://lore.kernel.org/lkml/87bk92gqpx.fsf_-_@yhuang6-desk2.ccr.corp.intel…
Link: https://github.com/ryncsn/emm-test-project/tree/master/swap-stress-race [1]
Signed-off-by: Kairui Song <kasong(a)tencent.com>
Reviewed-by: "Huang, Ying" <ying.huang(a)intel.com>
Acked-by: Yu Zhao <yuzhao(a)google.com>
Acked-by: David Hildenbrand <david(a)redhat.com>
Acked-by: Chris Li <chrisl(a)kernel.org>
Cc: Hugh Dickins <hughd(a)google.com>
Cc: Johannes Weiner <hannes(a)cmpxchg.org>
Cc: Matthew Wilcox (Oracle) <willy(a)infradead.org>
Cc: Michal Hocko <mhocko(a)suse.com>
Cc: Minchan Kim <minchan(a)kernel.org>
Cc: Yosry Ahmed <yosryahmed(a)google.com>
Cc: Yu Zhao <yuzhao(a)google.com>
Cc: Barry Song <21cnbao(a)gmail.com>
Cc: SeongJae Park <sj(a)kernel.org>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
diff --git a/include/linux/swap.h b/include/linux/swap.h
index 4db00ddad261..8d28f6091a32 100644
--- a/include/linux/swap.h
+++ b/include/linux/swap.h
@@ -549,6 +549,11 @@ static inline int swap_duplicate(swp_entry_t swp)
return 0;
}
+static inline int swapcache_prepare(swp_entry_t swp)
+{
+ return 0;
+}
+
static inline void swap_free(swp_entry_t swp)
{
}
diff --git a/mm/memory.c b/mm/memory.c
index 15f8b10ea17c..0bfc8b007c01 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -3799,6 +3799,7 @@ vm_fault_t do_swap_page(struct vm_fault *vmf)
struct page *page;
struct swap_info_struct *si = NULL;
rmap_t rmap_flags = RMAP_NONE;
+ bool need_clear_cache = false;
bool exclusive = false;
swp_entry_t entry;
pte_t pte;
@@ -3867,6 +3868,20 @@ vm_fault_t do_swap_page(struct vm_fault *vmf)
if (!folio) {
if (data_race(si->flags & SWP_SYNCHRONOUS_IO) &&
__swap_count(entry) == 1) {
+ /*
+ * Prevent parallel swapin from proceeding with
+ * the cache flag. Otherwise, another thread may
+ * finish swapin first, free the entry, and swapout
+ * reusing the same entry. It's undetectable as
+ * pte_same() returns true due to entry reuse.
+ */
+ if (swapcache_prepare(entry)) {
+ /* Relax a bit to prevent rapid repeated page faults */
+ schedule_timeout_uninterruptible(1);
+ goto out;
+ }
+ need_clear_cache = true;
+
/* skip swapcache */
folio = vma_alloc_folio(GFP_HIGHUSER_MOVABLE, 0,
vma, vmf->address, false);
@@ -4117,6 +4132,9 @@ vm_fault_t do_swap_page(struct vm_fault *vmf)
if (vmf->pte)
pte_unmap_unlock(vmf->pte, vmf->ptl);
out:
+ /* Clear the swap cache pin for direct swapin after PTL unlock */
+ if (need_clear_cache)
+ swapcache_clear(si, entry);
if (si)
put_swap_device(si);
return ret;
@@ -4131,6 +4149,8 @@ vm_fault_t do_swap_page(struct vm_fault *vmf)
folio_unlock(swapcache);
folio_put(swapcache);
}
+ if (need_clear_cache)
+ swapcache_clear(si, entry);
if (si)
put_swap_device(si);
return ret;
diff --git a/mm/swap.h b/mm/swap.h
index 758c46ca671e..fc2f6ade7f80 100644
--- a/mm/swap.h
+++ b/mm/swap.h
@@ -41,6 +41,7 @@ void __delete_from_swap_cache(struct folio *folio,
void delete_from_swap_cache(struct folio *folio);
void clear_shadow_from_swap_cache(int type, unsigned long begin,
unsigned long end);
+void swapcache_clear(struct swap_info_struct *si, swp_entry_t entry);
struct folio *swap_cache_get_folio(swp_entry_t entry,
struct vm_area_struct *vma, unsigned long addr);
struct folio *filemap_get_incore_folio(struct address_space *mapping,
@@ -97,6 +98,10 @@ static inline int swap_writepage(struct page *p, struct writeback_control *wbc)
return 0;
}
+static inline void swapcache_clear(struct swap_info_struct *si, swp_entry_t entry)
+{
+}
+
static inline struct folio *swap_cache_get_folio(swp_entry_t entry,
struct vm_area_struct *vma, unsigned long addr)
{
diff --git a/mm/swapfile.c b/mm/swapfile.c
index 556ff7347d5f..746aa9da5302 100644
--- a/mm/swapfile.c
+++ b/mm/swapfile.c
@@ -3365,6 +3365,19 @@ int swapcache_prepare(swp_entry_t entry)
return __swap_duplicate(entry, SWAP_HAS_CACHE);
}
+void swapcache_clear(struct swap_info_struct *si, swp_entry_t entry)
+{
+ struct swap_cluster_info *ci;
+ unsigned long offset = swp_offset(entry);
+ unsigned char usage;
+
+ ci = lock_cluster_or_swap_info(si, offset);
+ usage = __swap_entry_free_locked(si, offset, SWAP_HAS_CACHE);
+ unlock_cluster_or_swap_info(si, ci);
+ if (!usage)
+ free_swap_slot(entry);
+}
+
struct swap_info_struct *swp_swap_info(swp_entry_t entry)
{
return swap_type_to_swap_info(swp_type(entry));
The patch below does not apply to the 5.15-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.15.y
git checkout FETCH_HEAD
git cherry-pick -x 13ddaf26be324a7f951891ecd9ccd04466d27458
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024022658-expensive-autograph-1b92@gregkh' --subject-prefix 'PATCH 5.15.y' HEAD^..
Possible dependencies:
13ddaf26be32 ("mm/swap: fix race when skipping swapcache")
c9edc242811d ("swap: add swap_cache_get_folio()")
1baec203b77c ("mm/khugepaged: try to free transhuge swapcache when possible")
442701e7058b ("mm/swap: remove swap_cache_info statistics")
014bb1de4fc1 ("mm: create new mm/swap.h header file")
1493a1913e34 ("mm/swap: remember PG_anon_exclusive via a swp pte bit")
6c287605fd56 ("mm: remember exclusively mapped anonymous pages with PG_anon_exclusive")
78fbe906cc90 ("mm/page-flags: reuse PG_mappedtodisk as PG_anon_exclusive for PageAnon() pages")
6c54dc6c7437 ("mm/rmap: use page_move_anon_rmap() when reusing a mapped PageAnon() page exclusively")
28c5209dfd5f ("mm/rmap: pass rmap flags to hugepage_add_anon_rmap()")
f1e2db12e45b ("mm/rmap: remove do_page_add_anon_rmap()")
14f9135d5470 ("mm/rmap: convert RMAP flags to a proper distinct rmap_t type")
fb3d824d1a46 ("mm/rmap: split page_dup_rmap() into page_dup_file_rmap() and page_try_dup_anon_rmap()")
b51ad4f8679e ("mm/memory: slightly simplify copy_present_pte()")
623a1ddfeb23 ("mm/hugetlb: take src_mm->write_protect_seq in copy_hugetlb_page_range()")
3bff7e3f1f16 ("mm/huge_memory: streamline COW logic in do_huge_pmd_wp_page()")
c145e0b47c77 ("mm: streamline COW logic in do_swap_page()")
84d60fdd3733 ("mm: slightly clarify KSM logic in do_swap_page()")
53a05ad9f21d ("mm: optimize do_wp_page() for exclusive pages in the swapcache")
6b1f86f8e9c7 ("Merge tag 'folio-5.18b' of git://git.infradead.org/users/willy/pagecache")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 13ddaf26be324a7f951891ecd9ccd04466d27458 Mon Sep 17 00:00:00 2001
From: Kairui Song <kasong(a)tencent.com>
Date: Wed, 7 Feb 2024 02:25:59 +0800
Subject: [PATCH] mm/swap: fix race when skipping swapcache
When skipping swapcache for SWP_SYNCHRONOUS_IO, if two or more threads
swapin the same entry at the same time, they get different pages (A, B).
Before one thread (T0) finishes the swapin and installs page (A) to the
PTE, another thread (T1) could finish swapin of page (B), swap_free the
entry, then swap out the possibly modified page reusing the same entry.
It breaks the pte_same check in (T0) because PTE value is unchanged,
causing ABA problem. Thread (T0) will install a stalled page (A) into the
PTE and cause data corruption.
One possible callstack is like this:
CPU0 CPU1
---- ----
do_swap_page() do_swap_page() with same entry
<direct swapin path> <direct swapin path>
<alloc page A> <alloc page B>
swap_read_folio() <- read to page A swap_read_folio() <- read to page B
<slow on later locks or interrupt> <finished swapin first>
... set_pte_at()
swap_free() <- entry is free
<write to page B, now page A stalled>
<swap out page B to same swap entry>
pte_same() <- Check pass, PTE seems
unchanged, but page A
is stalled!
swap_free() <- page B content lost!
set_pte_at() <- staled page A installed!
And besides, for ZRAM, swap_free() allows the swap device to discard the
entry content, so even if page (B) is not modified, if swap_read_folio()
on CPU0 happens later than swap_free() on CPU1, it may also cause data
loss.
To fix this, reuse swapcache_prepare which will pin the swap entry using
the cache flag, and allow only one thread to swap it in, also prevent any
parallel code from putting the entry in the cache. Release the pin after
PT unlocked.
Racers just loop and wait since it's a rare and very short event. A
schedule_timeout_uninterruptible(1) call is added to avoid repeated page
faults wasting too much CPU, causing livelock or adding too much noise to
perf statistics. A similar livelock issue was described in commit
029c4628b2eb ("mm: swap: get rid of livelock in swapin readahead")
Reproducer:
This race issue can be triggered easily using a well constructed
reproducer and patched brd (with a delay in read path) [1]:
With latest 6.8 mainline, race caused data loss can be observed easily:
$ gcc -g -lpthread test-thread-swap-race.c && ./a.out
Polulating 32MB of memory region...
Keep swapping out...
Starting round 0...
Spawning 65536 workers...
32746 workers spawned, wait for done...
Round 0: Error on 0x5aa00, expected 32746, got 32743, 3 data loss!
Round 0: Error on 0x395200, expected 32746, got 32743, 3 data loss!
Round 0: Error on 0x3fd000, expected 32746, got 32737, 9 data loss!
Round 0 Failed, 15 data loss!
This reproducer spawns multiple threads sharing the same memory region
using a small swap device. Every two threads updates mapped pages one by
one in opposite direction trying to create a race, with one dedicated
thread keep swapping out the data out using madvise.
The reproducer created a reproduce rate of about once every 5 minutes, so
the race should be totally possible in production.
After this patch, I ran the reproducer for over a few hundred rounds and
no data loss observed.
Performance overhead is minimal, microbenchmark swapin 10G from 32G
zram:
Before: 10934698 us
After: 11157121 us
Cached: 13155355 us (Dropping SWP_SYNCHRONOUS_IO flag)
[kasong(a)tencent.com: v4]
Link: https://lkml.kernel.org/r/20240219082040.7495-1-ryncsn@gmail.com
Link: https://lkml.kernel.org/r/20240206182559.32264-1-ryncsn@gmail.com
Fixes: 0bcac06f27d7 ("mm, swap: skip swapcache for swapin of synchronous device")
Reported-by: "Huang, Ying" <ying.huang(a)intel.com>
Closes: https://lore.kernel.org/lkml/87bk92gqpx.fsf_-_@yhuang6-desk2.ccr.corp.intel…
Link: https://github.com/ryncsn/emm-test-project/tree/master/swap-stress-race [1]
Signed-off-by: Kairui Song <kasong(a)tencent.com>
Reviewed-by: "Huang, Ying" <ying.huang(a)intel.com>
Acked-by: Yu Zhao <yuzhao(a)google.com>
Acked-by: David Hildenbrand <david(a)redhat.com>
Acked-by: Chris Li <chrisl(a)kernel.org>
Cc: Hugh Dickins <hughd(a)google.com>
Cc: Johannes Weiner <hannes(a)cmpxchg.org>
Cc: Matthew Wilcox (Oracle) <willy(a)infradead.org>
Cc: Michal Hocko <mhocko(a)suse.com>
Cc: Minchan Kim <minchan(a)kernel.org>
Cc: Yosry Ahmed <yosryahmed(a)google.com>
Cc: Yu Zhao <yuzhao(a)google.com>
Cc: Barry Song <21cnbao(a)gmail.com>
Cc: SeongJae Park <sj(a)kernel.org>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
diff --git a/include/linux/swap.h b/include/linux/swap.h
index 4db00ddad261..8d28f6091a32 100644
--- a/include/linux/swap.h
+++ b/include/linux/swap.h
@@ -549,6 +549,11 @@ static inline int swap_duplicate(swp_entry_t swp)
return 0;
}
+static inline int swapcache_prepare(swp_entry_t swp)
+{
+ return 0;
+}
+
static inline void swap_free(swp_entry_t swp)
{
}
diff --git a/mm/memory.c b/mm/memory.c
index 15f8b10ea17c..0bfc8b007c01 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -3799,6 +3799,7 @@ vm_fault_t do_swap_page(struct vm_fault *vmf)
struct page *page;
struct swap_info_struct *si = NULL;
rmap_t rmap_flags = RMAP_NONE;
+ bool need_clear_cache = false;
bool exclusive = false;
swp_entry_t entry;
pte_t pte;
@@ -3867,6 +3868,20 @@ vm_fault_t do_swap_page(struct vm_fault *vmf)
if (!folio) {
if (data_race(si->flags & SWP_SYNCHRONOUS_IO) &&
__swap_count(entry) == 1) {
+ /*
+ * Prevent parallel swapin from proceeding with
+ * the cache flag. Otherwise, another thread may
+ * finish swapin first, free the entry, and swapout
+ * reusing the same entry. It's undetectable as
+ * pte_same() returns true due to entry reuse.
+ */
+ if (swapcache_prepare(entry)) {
+ /* Relax a bit to prevent rapid repeated page faults */
+ schedule_timeout_uninterruptible(1);
+ goto out;
+ }
+ need_clear_cache = true;
+
/* skip swapcache */
folio = vma_alloc_folio(GFP_HIGHUSER_MOVABLE, 0,
vma, vmf->address, false);
@@ -4117,6 +4132,9 @@ vm_fault_t do_swap_page(struct vm_fault *vmf)
if (vmf->pte)
pte_unmap_unlock(vmf->pte, vmf->ptl);
out:
+ /* Clear the swap cache pin for direct swapin after PTL unlock */
+ if (need_clear_cache)
+ swapcache_clear(si, entry);
if (si)
put_swap_device(si);
return ret;
@@ -4131,6 +4149,8 @@ vm_fault_t do_swap_page(struct vm_fault *vmf)
folio_unlock(swapcache);
folio_put(swapcache);
}
+ if (need_clear_cache)
+ swapcache_clear(si, entry);
if (si)
put_swap_device(si);
return ret;
diff --git a/mm/swap.h b/mm/swap.h
index 758c46ca671e..fc2f6ade7f80 100644
--- a/mm/swap.h
+++ b/mm/swap.h
@@ -41,6 +41,7 @@ void __delete_from_swap_cache(struct folio *folio,
void delete_from_swap_cache(struct folio *folio);
void clear_shadow_from_swap_cache(int type, unsigned long begin,
unsigned long end);
+void swapcache_clear(struct swap_info_struct *si, swp_entry_t entry);
struct folio *swap_cache_get_folio(swp_entry_t entry,
struct vm_area_struct *vma, unsigned long addr);
struct folio *filemap_get_incore_folio(struct address_space *mapping,
@@ -97,6 +98,10 @@ static inline int swap_writepage(struct page *p, struct writeback_control *wbc)
return 0;
}
+static inline void swapcache_clear(struct swap_info_struct *si, swp_entry_t entry)
+{
+}
+
static inline struct folio *swap_cache_get_folio(swp_entry_t entry,
struct vm_area_struct *vma, unsigned long addr)
{
diff --git a/mm/swapfile.c b/mm/swapfile.c
index 556ff7347d5f..746aa9da5302 100644
--- a/mm/swapfile.c
+++ b/mm/swapfile.c
@@ -3365,6 +3365,19 @@ int swapcache_prepare(swp_entry_t entry)
return __swap_duplicate(entry, SWAP_HAS_CACHE);
}
+void swapcache_clear(struct swap_info_struct *si, swp_entry_t entry)
+{
+ struct swap_cluster_info *ci;
+ unsigned long offset = swp_offset(entry);
+ unsigned char usage;
+
+ ci = lock_cluster_or_swap_info(si, offset);
+ usage = __swap_entry_free_locked(si, offset, SWAP_HAS_CACHE);
+ unlock_cluster_or_swap_info(si, ci);
+ if (!usage)
+ free_swap_slot(entry);
+}
+
struct swap_info_struct *swp_swap_info(swp_entry_t entry)
{
return swap_type_to_swap_info(swp_type(entry));
The patch below does not apply to the 6.1-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.1.y
git checkout FETCH_HEAD
git cherry-pick -x 5c6224bfabbf7f3e491c51ab50fd2c6f92ba1141
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024022643-rearview-spouse-b8a1@gregkh' --subject-prefix 'PATCH 6.1.y' HEAD^..
Possible dependencies:
5c6224bfabbf ("cxl/acpi: Fix load failures due to single window creation failure")
790815902ec6 ("cxl: Add support for _DSM Function for retrieving QTG ID")
91019b5bc7c2 ("cxl/acpi: Return 'rc' instead of '0' in cxl_parse_cfmws()")
4cf67d3cc999 ("cxl/acpi: Fix a use-after-free in cxl_parse_cfmws()")
d35b495ddf92 ("cxl/port: Fix find_cxl_root() for RCDs and simplify it")
a32320b71f08 ("cxl/region: Add region autodiscovery")
32ce3f185bbb ("cxl/port: Split endpoint and switch port probe")
9995576cef48 ("cxl/region: Move region-position validation to a helper")
86987c766276 ("cxl/region: Cleanup target list on attach error")
1b9b7a6fd618 ("cxl/region: Validate region mode vs decoder mode")
7d505f982f53 ("cxl/region: Add a mode attribute for regions")
02fedf146656 ("Merge branch 'for-6.2/cxl-xor' into for-6.2/cxl")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 5c6224bfabbf7f3e491c51ab50fd2c6f92ba1141 Mon Sep 17 00:00:00 2001
From: Dan Williams <dan.j.williams(a)intel.com>
Date: Fri, 16 Feb 2024 19:11:34 -0800
Subject: [PATCH] cxl/acpi: Fix load failures due to single window creation
failure
The expectation is that cxl_parse_cfwms() continues in the face the of
failure as evidenced by code like:
cxlrd = cxl_root_decoder_alloc(root_port, ways, cxl_calc_hb);
if (IS_ERR(cxlrd))
return 0;
There are other error paths in that function which mistakenly follow
idiomatic expectations and return an error when they should not. Most of
those mistakes are innocuous checks that hardly ever fail in practice.
However, a recent change succeed in making the implementation more
fragile by applying an idiomatic, but still wrong "fix" [1]. In this
failure case the kernel reports:
cxl root0: Failed to populate active decoder targets
cxl_acpi ACPI0017:00: Failed to add decode range: [mem 0x00000000-0x7fffffff flags 0x200]
...which is a real issue with that one window (to be fixed separately),
but ends up failing the entirety of cxl_acpi_probe().
Undo that recent breakage while also removing the confusion about
ignoring errors. Update all exits paths to return an error per typical
expectations and let an outer wrapper function handle dropping the
error.
Fixes: 91019b5bc7c2 ("cxl/acpi: Return 'rc' instead of '0' in cxl_parse_cfmws()") [1]
Cc: <stable(a)vger.kernel.org>
Cc: Breno Leitao <leitao(a)debian.org>
Cc: Alison Schofield <alison.schofield(a)intel.com>
Cc: Vishal Verma <vishal.l.verma(a)intel.com>
Signed-off-by: Dan Williams <dan.j.williams(a)intel.com>
diff --git a/drivers/cxl/acpi.c b/drivers/cxl/acpi.c
index dcf2b39e1048..1a3e6aafbdcc 100644
--- a/drivers/cxl/acpi.c
+++ b/drivers/cxl/acpi.c
@@ -316,31 +316,27 @@ static const struct cxl_root_ops acpi_root_ops = {
.qos_class = cxl_acpi_qos_class,
};
-static int cxl_parse_cfmws(union acpi_subtable_headers *header, void *arg,
- const unsigned long end)
+static int __cxl_parse_cfmws(struct acpi_cedt_cfmws *cfmws,
+ struct cxl_cfmws_context *ctx)
{
int target_map[CXL_DECODER_MAX_INTERLEAVE];
- struct cxl_cfmws_context *ctx = arg;
struct cxl_port *root_port = ctx->root_port;
struct resource *cxl_res = ctx->cxl_res;
struct cxl_cxims_context cxims_ctx;
struct cxl_root_decoder *cxlrd;
struct device *dev = ctx->dev;
- struct acpi_cedt_cfmws *cfmws;
cxl_calc_hb_fn cxl_calc_hb;
struct cxl_decoder *cxld;
unsigned int ways, i, ig;
struct resource *res;
int rc;
- cfmws = (struct acpi_cedt_cfmws *) header;
-
rc = cxl_acpi_cfmws_verify(dev, cfmws);
if (rc) {
dev_err(dev, "CFMWS range %#llx-%#llx not registered\n",
cfmws->base_hpa,
cfmws->base_hpa + cfmws->window_size - 1);
- return 0;
+ return rc;
}
rc = eiw_to_ways(cfmws->interleave_ways, &ways);
@@ -376,7 +372,7 @@ static int cxl_parse_cfmws(union acpi_subtable_headers *header, void *arg,
cxlrd = cxl_root_decoder_alloc(root_port, ways, cxl_calc_hb);
if (IS_ERR(cxlrd))
- return 0;
+ return PTR_ERR(cxlrd);
cxld = &cxlrd->cxlsd.cxld;
cxld->flags = cfmws_to_decoder_flags(cfmws->restrictions);
@@ -420,16 +416,7 @@ static int cxl_parse_cfmws(union acpi_subtable_headers *header, void *arg,
put_device(&cxld->dev);
else
rc = cxl_decoder_autoremove(dev, cxld);
- if (rc) {
- dev_err(dev, "Failed to add decode range: %pr", res);
- return rc;
- }
- dev_dbg(dev, "add: %s node: %d range [%#llx - %#llx]\n",
- dev_name(&cxld->dev),
- phys_to_target_node(cxld->hpa_range.start),
- cxld->hpa_range.start, cxld->hpa_range.end);
-
- return 0;
+ return rc;
err_insert:
kfree(res->name);
@@ -438,6 +425,29 @@ static int cxl_parse_cfmws(union acpi_subtable_headers *header, void *arg,
return -ENOMEM;
}
+static int cxl_parse_cfmws(union acpi_subtable_headers *header, void *arg,
+ const unsigned long end)
+{
+ struct acpi_cedt_cfmws *cfmws = (struct acpi_cedt_cfmws *)header;
+ struct cxl_cfmws_context *ctx = arg;
+ struct device *dev = ctx->dev;
+ int rc;
+
+ rc = __cxl_parse_cfmws(cfmws, ctx);
+ if (rc)
+ dev_err(dev,
+ "Failed to add decode range: [%#llx - %#llx] (%d)\n",
+ cfmws->base_hpa,
+ cfmws->base_hpa + cfmws->window_size - 1, rc);
+ else
+ dev_dbg(dev, "decode range: node: %d range [%#llx - %#llx]\n",
+ phys_to_target_node(cfmws->base_hpa), cfmws->base_hpa,
+ cfmws->base_hpa + cfmws->window_size - 1);
+
+ /* never fail cxl_acpi load for a single window failure */
+ return 0;
+}
+
__mock struct acpi_device *to_cxl_host_bridge(struct device *host,
struct device *dev)
{
The patch below does not apply to the 6.1-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.1.y
git checkout FETCH_HEAD
git cherry-pick -x bd915ae73a2d78559b376ad2caf5e4ef51de2455
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024022642-retool-clover-ecd7@gregkh' --subject-prefix 'PATCH 6.1.y' HEAD^..
Possible dependencies:
bd915ae73a2d ("drm/meson: Don't remove bridges which are created by other drivers")
42dcf15f901c ("drm/meson: add DSI encoder")
6a044642988b ("drm/meson: fix unbind path if HDMI fails to bind")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From bd915ae73a2d78559b376ad2caf5e4ef51de2455 Mon Sep 17 00:00:00 2001
From: Martin Blumenstingl <martin.blumenstingl(a)googlemail.com>
Date: Thu, 15 Feb 2024 23:04:42 +0100
Subject: [PATCH] drm/meson: Don't remove bridges which are created by other
drivers
Stop calling drm_bridge_remove() for bridges allocated/managed by other
drivers in the remove paths of meson_encoder_{cvbs,dsi,hdmi}.
drm_bridge_remove() unregisters the bridge so it cannot be used
anymore. Doing so for bridges we don't own can lead to the video
pipeline not being able to come up after -EPROBE_DEFER of the VPU
because we're unregistering a bridge that's managed by another driver.
The other driver doesn't know that we have unregistered it's bridge
and on subsequent .probe() we're not able to find those bridges anymore
(since nobody re-creates them).
This fixes probe errors on Meson8b boards with the CVBS outputs enabled.
Fixes: 09847723c12f ("drm/meson: remove drm bridges at aggregate driver unbind time")
Fixes: 42dcf15f901c ("drm/meson: add DSI encoder")
Cc: <stable(a)vger.kernel.org>
Reported-by: Steve Morvai <stevemorvai(a)hotmail.com>
Signed-off-by: Martin Blumenstingl <martin.blumenstingl(a)googlemail.com>
Reviewed-by: Neil Armstrong <neil.armstrong(a)linaro.org>
Tested-by: Steve Morvai <stevemorvai(a)hotmail.com>
Link: https://lore.kernel.org/r/20240215220442.1343152-1-martin.blumenstingl@goog…
Reviewed-by: Neil Armstrong <neil.armstrong(a)linaro.org>
Signed-off-by: Neil Armstrong <neil.armstrong(a)linaro.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20240215220442.1343152-1-mart…
diff --git a/drivers/gpu/drm/meson/meson_encoder_cvbs.c b/drivers/gpu/drm/meson/meson_encoder_cvbs.c
index 3f73b211fa8e..3407450435e2 100644
--- a/drivers/gpu/drm/meson/meson_encoder_cvbs.c
+++ b/drivers/gpu/drm/meson/meson_encoder_cvbs.c
@@ -294,6 +294,5 @@ void meson_encoder_cvbs_remove(struct meson_drm *priv)
if (priv->encoders[MESON_ENC_CVBS]) {
meson_encoder_cvbs = priv->encoders[MESON_ENC_CVBS];
drm_bridge_remove(&meson_encoder_cvbs->bridge);
- drm_bridge_remove(meson_encoder_cvbs->next_bridge);
}
}
diff --git a/drivers/gpu/drm/meson/meson_encoder_dsi.c b/drivers/gpu/drm/meson/meson_encoder_dsi.c
index 3f93c70488ca..311b91630fbe 100644
--- a/drivers/gpu/drm/meson/meson_encoder_dsi.c
+++ b/drivers/gpu/drm/meson/meson_encoder_dsi.c
@@ -168,6 +168,5 @@ void meson_encoder_dsi_remove(struct meson_drm *priv)
if (priv->encoders[MESON_ENC_DSI]) {
meson_encoder_dsi = priv->encoders[MESON_ENC_DSI];
drm_bridge_remove(&meson_encoder_dsi->bridge);
- drm_bridge_remove(meson_encoder_dsi->next_bridge);
}
}
diff --git a/drivers/gpu/drm/meson/meson_encoder_hdmi.c b/drivers/gpu/drm/meson/meson_encoder_hdmi.c
index 25ea76558690..c4686568c9ca 100644
--- a/drivers/gpu/drm/meson/meson_encoder_hdmi.c
+++ b/drivers/gpu/drm/meson/meson_encoder_hdmi.c
@@ -474,6 +474,5 @@ void meson_encoder_hdmi_remove(struct meson_drm *priv)
if (priv->encoders[MESON_ENC_HDMI]) {
meson_encoder_hdmi = priv->encoders[MESON_ENC_HDMI];
drm_bridge_remove(&meson_encoder_hdmi->bridge);
- drm_bridge_remove(meson_encoder_hdmi->next_bridge);
}
}
The patch below does not apply to the 6.1-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.1.y
git checkout FETCH_HEAD
git cherry-pick -x b0ad381fa7690244802aed119b478b4bdafc31dd
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024022655-fled-exes-9ef4@gregkh' --subject-prefix 'PATCH 6.1.y' HEAD^..
Possible dependencies:
b0ad381fa769 ("btrfs: fix deadlock with fiemap and extent locking")
519b7e13b5ae ("btrfs: lock the inode in shared mode before starting fiemap")
b3e744fe6d28 ("btrfs: use cached state when looking for delalloc ranges with fiemap")
8c6e53a79d16 ("btrfs: allow passing a cached state record to count_range_bits()")
8ddc8274e4be ("btrfs: search for delalloc more efficiently during lseek/fiemap")
af979fd618a4 ("btrfs: skip unnecessary delalloc searches during lseek/fiemap")
40daf3e095db ("btrfs: add an early exit when searching for delalloc range for lseek/fiemap")
af142b6f44d3 ("btrfs: move file prototypes to file.h")
7572dec8f522 ("btrfs: move ioctl prototypes into ioctl.h")
c7a03b524d30 ("btrfs: move uuid tree prototypes to uuid-tree.h")
7c8ede162805 ("btrfs: move file-item prototypes into their own header")
f2b39277b87d ("btrfs: move dir-item prototypes into dir-item.h")
59b818e064ab ("btrfs: move defrag related prototypes to their own header")
a6a01ca61f49 ("btrfs: move the file defrag code into defrag.c")
6e3df18ba7e8 ("btrfs: move the auto defrag code to defrag.c")
2885fd632050 ("btrfs: move inode prototypes to btrfs_inode.h")
911bd75aca73 ("btrfs: remove unused function prototypes")
45c40c8f9541 ("btrfs: move root tree prototypes to their own header")
2839c2c142dd ("btrfs: move delalloc space related prototypes to delalloc-space.h")
a0231804affe ("btrfs: move extent-tree helpers into their own header file")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From b0ad381fa7690244802aed119b478b4bdafc31dd Mon Sep 17 00:00:00 2001
From: Josef Bacik <josef(a)toxicpanda.com>
Date: Mon, 12 Feb 2024 11:56:02 -0500
Subject: [PATCH] btrfs: fix deadlock with fiemap and extent locking
While working on the patchset to remove extent locking I got a lockdep
splat with fiemap and pagefaulting with my new extent lock replacement
lock.
This deadlock exists with our normal code, we just don't have lockdep
annotations with the extent locking so we've never noticed it.
Since we're copying the fiemap extent to user space on every iteration
we have the chance of pagefaulting. Because we hold the extent lock for
the entire range we could mkwrite into a range in the file that we have
mmap'ed. This would deadlock with the following stack trace
[<0>] lock_extent+0x28d/0x2f0
[<0>] btrfs_page_mkwrite+0x273/0x8a0
[<0>] do_page_mkwrite+0x50/0xb0
[<0>] do_fault+0xc1/0x7b0
[<0>] __handle_mm_fault+0x2fa/0x460
[<0>] handle_mm_fault+0xa4/0x330
[<0>] do_user_addr_fault+0x1f4/0x800
[<0>] exc_page_fault+0x7c/0x1e0
[<0>] asm_exc_page_fault+0x26/0x30
[<0>] rep_movs_alternative+0x33/0x70
[<0>] _copy_to_user+0x49/0x70
[<0>] fiemap_fill_next_extent+0xc8/0x120
[<0>] emit_fiemap_extent+0x4d/0xa0
[<0>] extent_fiemap+0x7f8/0xad0
[<0>] btrfs_fiemap+0x49/0x80
[<0>] __x64_sys_ioctl+0x3e1/0xb50
[<0>] do_syscall_64+0x94/0x1a0
[<0>] entry_SYSCALL_64_after_hwframe+0x6e/0x76
I wrote an fstest to reproduce this deadlock without my replacement lock
and verified that the deadlock exists with our existing locking.
To fix this simply don't take the extent lock for the entire duration of
the fiemap. This is safe in general because we keep track of where we
are when we're searching the tree, so if an ordered extent updates in
the middle of our fiemap call we'll still emit the correct extents
because we know what offset we were on before.
The only place we maintain the lock is searching delalloc. Since the
delalloc stuff can change during writeback we want to lock the extent
range so we have a consistent view of delalloc at the time we're
checking to see if we need to set the delalloc flag.
With this patch applied we no longer deadlock with my testcase.
CC: stable(a)vger.kernel.org # 6.1+
Reviewed-by: Filipe Manana <fdmanana(a)suse.com>
Signed-off-by: Josef Bacik <josef(a)toxicpanda.com>
Reviewed-by: David Sterba <dsterba(a)suse.com>
Signed-off-by: David Sterba <dsterba(a)suse.com>
diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
index a0ffd41c5cc1..61d961a30dee 100644
--- a/fs/btrfs/extent_io.c
+++ b/fs/btrfs/extent_io.c
@@ -2689,16 +2689,34 @@ static int fiemap_process_hole(struct btrfs_inode *inode,
* it beyond i_size.
*/
while (cur_offset < end && cur_offset < i_size) {
+ struct extent_state *cached_state = NULL;
u64 delalloc_start;
u64 delalloc_end;
u64 prealloc_start;
+ u64 lockstart;
+ u64 lockend;
u64 prealloc_len = 0;
bool delalloc;
+ lockstart = round_down(cur_offset, inode->root->fs_info->sectorsize);
+ lockend = round_up(end, inode->root->fs_info->sectorsize);
+
+ /*
+ * We are only locking for the delalloc range because that's the
+ * only thing that can change here. With fiemap we have a lock
+ * on the inode, so no buffered or direct writes can happen.
+ *
+ * However mmaps and normal page writeback will cause this to
+ * change arbitrarily. We have to lock the extent lock here to
+ * make sure that nobody messes with the tree while we're doing
+ * btrfs_find_delalloc_in_range.
+ */
+ lock_extent(&inode->io_tree, lockstart, lockend, &cached_state);
delalloc = btrfs_find_delalloc_in_range(inode, cur_offset, end,
delalloc_cached_state,
&delalloc_start,
&delalloc_end);
+ unlock_extent(&inode->io_tree, lockstart, lockend, &cached_state);
if (!delalloc)
break;
@@ -2866,15 +2884,15 @@ int extent_fiemap(struct btrfs_inode *inode, struct fiemap_extent_info *fieinfo,
u64 start, u64 len)
{
const u64 ino = btrfs_ino(inode);
- struct extent_state *cached_state = NULL;
struct extent_state *delalloc_cached_state = NULL;
struct btrfs_path *path;
struct fiemap_cache cache = { 0 };
struct btrfs_backref_share_check_ctx *backref_ctx;
u64 last_extent_end;
u64 prev_extent_end;
- u64 lockstart;
- u64 lockend;
+ u64 range_start;
+ u64 range_end;
+ const u64 sectorsize = inode->root->fs_info->sectorsize;
bool stopped = false;
int ret;
@@ -2885,12 +2903,11 @@ int extent_fiemap(struct btrfs_inode *inode, struct fiemap_extent_info *fieinfo,
goto out;
}
- lockstart = round_down(start, inode->root->fs_info->sectorsize);
- lockend = round_up(start + len, inode->root->fs_info->sectorsize);
- prev_extent_end = lockstart;
+ range_start = round_down(start, sectorsize);
+ range_end = round_up(start + len, sectorsize);
+ prev_extent_end = range_start;
btrfs_inode_lock(inode, BTRFS_ILOCK_SHARED);
- lock_extent(&inode->io_tree, lockstart, lockend, &cached_state);
ret = fiemap_find_last_extent_offset(inode, path, &last_extent_end);
if (ret < 0)
@@ -2898,7 +2915,7 @@ int extent_fiemap(struct btrfs_inode *inode, struct fiemap_extent_info *fieinfo,
btrfs_release_path(path);
path->reada = READA_FORWARD;
- ret = fiemap_search_slot(inode, path, lockstart);
+ ret = fiemap_search_slot(inode, path, range_start);
if (ret < 0) {
goto out_unlock;
} else if (ret > 0) {
@@ -2910,7 +2927,7 @@ int extent_fiemap(struct btrfs_inode *inode, struct fiemap_extent_info *fieinfo,
goto check_eof_delalloc;
}
- while (prev_extent_end < lockend) {
+ while (prev_extent_end < range_end) {
struct extent_buffer *leaf = path->nodes[0];
struct btrfs_file_extent_item *ei;
struct btrfs_key key;
@@ -2933,19 +2950,19 @@ int extent_fiemap(struct btrfs_inode *inode, struct fiemap_extent_info *fieinfo,
* The first iteration can leave us at an extent item that ends
* before our range's start. Move to the next item.
*/
- if (extent_end <= lockstart)
+ if (extent_end <= range_start)
goto next_item;
backref_ctx->curr_leaf_bytenr = leaf->start;
/* We have in implicit hole (NO_HOLES feature enabled). */
if (prev_extent_end < key.offset) {
- const u64 range_end = min(key.offset, lockend) - 1;
+ const u64 hole_end = min(key.offset, range_end) - 1;
ret = fiemap_process_hole(inode, fieinfo, &cache,
&delalloc_cached_state,
backref_ctx, 0, 0, 0,
- prev_extent_end, range_end);
+ prev_extent_end, hole_end);
if (ret < 0) {
goto out_unlock;
} else if (ret > 0) {
@@ -2955,7 +2972,7 @@ int extent_fiemap(struct btrfs_inode *inode, struct fiemap_extent_info *fieinfo,
}
/* We've reached the end of the fiemap range, stop. */
- if (key.offset >= lockend) {
+ if (key.offset >= range_end) {
stopped = true;
break;
}
@@ -3049,29 +3066,41 @@ int extent_fiemap(struct btrfs_inode *inode, struct fiemap_extent_info *fieinfo,
btrfs_free_path(path);
path = NULL;
- if (!stopped && prev_extent_end < lockend) {
+ if (!stopped && prev_extent_end < range_end) {
ret = fiemap_process_hole(inode, fieinfo, &cache,
&delalloc_cached_state, backref_ctx,
- 0, 0, 0, prev_extent_end, lockend - 1);
+ 0, 0, 0, prev_extent_end, range_end - 1);
if (ret < 0)
goto out_unlock;
- prev_extent_end = lockend;
+ prev_extent_end = range_end;
}
if (cache.cached && cache.offset + cache.len >= last_extent_end) {
const u64 i_size = i_size_read(&inode->vfs_inode);
if (prev_extent_end < i_size) {
+ struct extent_state *cached_state = NULL;
u64 delalloc_start;
u64 delalloc_end;
+ u64 lockstart;
+ u64 lockend;
bool delalloc;
+ lockstart = round_down(prev_extent_end, sectorsize);
+ lockend = round_up(i_size, sectorsize);
+
+ /*
+ * See the comment in fiemap_process_hole as to why
+ * we're doing the locking here.
+ */
+ lock_extent(&inode->io_tree, lockstart, lockend, &cached_state);
delalloc = btrfs_find_delalloc_in_range(inode,
prev_extent_end,
i_size - 1,
&delalloc_cached_state,
&delalloc_start,
&delalloc_end);
+ unlock_extent(&inode->io_tree, lockstart, lockend, &cached_state);
if (!delalloc)
cache.flags |= FIEMAP_EXTENT_LAST;
} else {
@@ -3082,7 +3111,6 @@ int extent_fiemap(struct btrfs_inode *inode, struct fiemap_extent_info *fieinfo,
ret = emit_last_fiemap_cache(fieinfo, &cache);
out_unlock:
- unlock_extent(&inode->io_tree, lockstart, lockend, &cached_state);
btrfs_inode_unlock(inode, BTRFS_ILOCK_SHARED);
out:
free_extent_state(delalloc_cached_state);
The patch below does not apply to the 6.1-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.1.y
git checkout FETCH_HEAD
git cherry-pick -x e42b9d8b9ea2672811285e6a7654887ff64d23f3
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024022640-deepness-manatee-1a30@gregkh' --subject-prefix 'PATCH 6.1.y' HEAD^..
Possible dependencies:
e42b9d8b9ea2 ("btrfs: defrag: avoid unnecessary defrag caused by incorrect extent size")
a6a01ca61f49 ("btrfs: move the file defrag code into defrag.c")
6e3df18ba7e8 ("btrfs: move the auto defrag code to defrag.c")
07e81dc94474 ("btrfs: move accessor helpers into accessors.h")
ad1ac5012c2b ("btrfs: move btrfs_map_token to accessors")
55e5cfd36da5 ("btrfs: remove fs_info::pending_changes and related code")
7966a6b5959b ("btrfs: move fs_info::flags enum to fs.h")
fc97a410bd78 ("btrfs: move mount option definitions to fs.h")
0d3a9cf8c306 ("btrfs: convert incompat and compat flag test helpers to macros")
ec8eb376e271 ("btrfs: move BTRFS_FS_STATE* definitions and helpers to fs.h")
9b569ea0be6f ("btrfs: move the printk helpers out of ctree.h")
e118578a8df7 ("btrfs: move assert helpers out of ctree.h")
c7f13d428ea1 ("btrfs: move fs wide helpers out of ctree.h")
63a7cb130718 ("btrfs: auto enable discard=async when possible")
7a66eda351ba ("btrfs: move the btrfs_verity_descriptor_item defs up in ctree.h")
956504a331a6 ("btrfs: move trans_handle_cachep out of ctree.h")
f1e5c6185ca1 ("btrfs: move flush related definitions to space-info.h")
ed4c491a3db2 ("btrfs: move BTRFS_MAX_MIRRORS into scrub.c")
4300c58f8090 ("btrfs: move btrfs on-disk definitions out of ctree.h")
d60d956eb41f ("btrfs: remove unused set/clear_pending_info helpers")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From e42b9d8b9ea2672811285e6a7654887ff64d23f3 Mon Sep 17 00:00:00 2001
From: Qu Wenruo <wqu(a)suse.com>
Date: Wed, 7 Feb 2024 10:00:42 +1030
Subject: [PATCH] btrfs: defrag: avoid unnecessary defrag caused by incorrect
extent size
[BUG]
With the following file extent layout, defrag would do unnecessary IO
and result more on-disk space usage.
# mkfs.btrfs -f $dev
# mount $dev $mnt
# xfs_io -f -c "pwrite 0 40m" $mnt/foobar
# sync
# xfs_io -f -c "pwrite 40m 16k" $mnt/foobar
# sync
Above command would lead to the following file extent layout:
item 6 key (257 EXTENT_DATA 0) itemoff 15816 itemsize 53
generation 7 type 1 (regular)
extent data disk byte 298844160 nr 41943040
extent data offset 0 nr 41943040 ram 41943040
extent compression 0 (none)
item 7 key (257 EXTENT_DATA 41943040) itemoff 15763 itemsize 53
generation 8 type 1 (regular)
extent data disk byte 13631488 nr 16384
extent data offset 0 nr 16384 ram 16384
extent compression 0 (none)
Which is mostly fine. We can allow the final 16K to be merged with the
previous 40M, but it's upon the end users' preference.
But if we defrag the file using the default parameters, it would result
worse file layout:
# btrfs filesystem defrag $mnt/foobar
# sync
item 6 key (257 EXTENT_DATA 0) itemoff 15816 itemsize 53
generation 7 type 1 (regular)
extent data disk byte 298844160 nr 41943040
extent data offset 0 nr 8650752 ram 41943040
extent compression 0 (none)
item 7 key (257 EXTENT_DATA 8650752) itemoff 15763 itemsize 53
generation 9 type 1 (regular)
extent data disk byte 340787200 nr 33292288
extent data offset 0 nr 33292288 ram 33292288
extent compression 0 (none)
item 8 key (257 EXTENT_DATA 41943040) itemoff 15710 itemsize 53
generation 8 type 1 (regular)
extent data disk byte 13631488 nr 16384
extent data offset 0 nr 16384 ram 16384
extent compression 0 (none)
Note the original 40M extent is still there, but a new 32M extent is
created for no benefit at all.
[CAUSE]
There is an existing check to make sure we won't defrag a large enough
extent (the threshold is by default 32M).
But the check is using the length to the end of the extent:
range_len = em->len - (cur - em->start);
/* Skip too large extent */
if (range_len >= extent_thresh)
goto next;
This means, for the first 8MiB of the extent, the range_len is always
smaller than the default threshold, and would not be defragged.
But after the first 8MiB, the remaining part would fit the requirement,
and be defragged.
Such different behavior inside the same extent caused the above problem,
and we should avoid different defrag decision inside the same extent.
[FIX]
Instead of using @range_len, just use @em->len, so that we have a
consistent decision among the same file extent.
Now with this fix, we won't touch the extent, thus not making it any
worse.
Reported-by: Filipe Manana <fdmanana(a)suse.com>
Fixes: 0cb5950f3f3b ("btrfs: fix deadlock when reserving space during defrag")
CC: stable(a)vger.kernel.org # 6.1+
Reviewed-by: Boris Burkov <boris(a)bur.io>
Reviewed-by: Filipe Manana <fdmanana(a)suse.com>
Signed-off-by: Qu Wenruo <wqu(a)suse.com>
Signed-off-by: David Sterba <dsterba(a)suse.com>
diff --git a/fs/btrfs/defrag.c b/fs/btrfs/defrag.c
index c276b136ab63..5b0b64571418 100644
--- a/fs/btrfs/defrag.c
+++ b/fs/btrfs/defrag.c
@@ -1046,7 +1046,7 @@ static int defrag_collect_targets(struct btrfs_inode *inode,
goto add;
/* Skip too large extent */
- if (range_len >= extent_thresh)
+ if (em->len >= extent_thresh)
goto next;
/*
From: Rui Qi <qirui.001(a)bytedance.com>
Since kernel version 5.4.250 LTS, there has been an issue with the kernel live patching feature becoming unavailable. When compiling the sample code for kernel live patching, the following message is displayed when enabled:
livepatch: klp_check_stack: kworker/u256:6:23490 has an unreliable stack
After investigation, it was found that this is due to objtool not supporting intra-function calls, resulting in incorrect orc entry generation.
This patchset adds support for intra-function calls, allowing the kernel live patching feature to work correctly.
Alexandre Chartre (2):
objtool: is_fentry_call() crashes if call has no destination
objtool: Add support for intra-function calls
Rui Qi (1):
x86/speculation: Support intra-function call validation
arch/x86/include/asm/nospec-branch.h | 7 ++
include/linux/frame.h | 11 ++++
.../Documentation/stack-validation.txt | 8 +++
tools/objtool/arch/x86/decode.c | 6 ++
tools/objtool/check.c | 64 +++++++++++++++++--
5 files changed, 91 insertions(+), 5 deletions(-)
--
2.39.2 (Apple Git-143)
commit 912680064f94 ("media: atomisp: make sh_css similar to Intel Aero driver")
removes the affected code, but in versions
tags/v5.8-rc1~10^2~220 - tags/v5.17-rc1~114^2~261
there is no check for the return value of the
ia_css_pipeline_create_and_add_stage() function.
ia_css_pipeline_create_and_add_stage() may return an
error code, so check and return it on error.
Found by Linux Verification Center (linuxtesting.org) with SVACE.
Fixes: 7796e455170e ("media: staging: media: atomisp: Fix alignment and line length issues")
Signed-off-by: Alexandra Diupina <adiupina(a)astralinux.ru>
---
drivers/staging/media/atomisp/pci/sh_css.c | 4 ++++
1 file changed, 4 insertions(+)
diff --git a/drivers/staging/media/atomisp/pci/sh_css.c b/drivers/staging/media/atomisp/pci/sh_css.c
index ba25d0da8b81..8502adb75a5a 100644
--- a/drivers/staging/media/atomisp/pci/sh_css.c
+++ b/drivers/staging/media/atomisp/pci/sh_css.c
@@ -7912,6 +7912,10 @@ create_host_regular_capture_pipeline(struct ia_css_pipe *pipe)
out_frames, in_frame, NULL);
err = ia_css_pipeline_create_and_add_stage(me, &stage_desc,
NULL);
+ if (err) {
+ IA_CSS_LEAVE_ERR_PRIVATE(err);
+ return err;
+ }
} else if (need_pp && current_stage) {
in_frame = current_stage->args.out_frame[0];
err = add_capture_pp_stage(pipe, me, in_frame,
--
2.30.2
From: Rui Qi <qirui.001(a)bytedance.com>
Since kernel version 5.4.250 LTS, there has been an issue with the kernel live patching feature becoming unavailable. When compiling the sample code for kernel live patching, the following message is displayed when enabled:
livepatch: klp_check_stack: kworker/u256:6:23490 has an unreliable stack
After investigation, it was found that this is due to objtool not supporting intra-function calls, resulting in incorrect orc entry generation.
This patchset adds support for intra-function calls, allowing the kernel live patching feature to work correctly.
Alexandre Chartre (2):
objtool: is_fentry_call() crashes if call has no destination
objtool: Add support for intra-function calls
Rui Qi (1):
x86/speculation: Support intra-function call validation
arch/x86/include/asm/nospec-branch.h | 7 ++
include/linux/frame.h | 11 ++++
.../Documentation/stack-validation.txt | 8 +++
tools/objtool/arch/x86/decode.c | 6 ++
tools/objtool/check.c | 64 +++++++++++++++++--
5 files changed, 91 insertions(+), 5 deletions(-)
--
2.39.2 (Apple Git-143)
From: Ondrej Jirman <megi(a)xff.cz>
The reverted commit makes the state machine only ever go from SRC_ATTACH_WAIT
to SNK_TRY in endless loop when toggling. After revert it goes to SRC_ATTACHED
after initially trying SNK_TRY earlier, as it should for toggling to ever detect
the power source mode and the port is again able to provide power to attached
power sinks.
This reverts commit 2d6d80127006ae3da26b1f21a65eccf957f2d1e5.
Cc: stable(a)vger.kernel.org
Fixes: 2d6d80127006 ("usb: typec: tcpm: reset counter when enter into unattached state after try role")
Signed-of-by: Ondrej Jirman <megi(a)xff.cz>
---
drivers/usb/typec/tcpm/tcpm.c | 3 ---
1 file changed, 3 deletions(-)
See https://lore.kernel.org/all/odggrbbgjpardze76qiv57mw6tllisyu5sbrta37iadjzwa…
for more.
diff --git a/drivers/usb/typec/tcpm/tcpm.c b/drivers/usb/typec/tcpm/tcpm.c
index f7d7daa60c8d..295ae7eb912c 100644
--- a/drivers/usb/typec/tcpm/tcpm.c
+++ b/drivers/usb/typec/tcpm/tcpm.c
@@ -3743,9 +3743,6 @@ static void tcpm_detach(struct tcpm_port *port)
if (tcpm_port_is_disconnected(port))
port->hard_reset_count = 0;
- port->try_src_count = 0;
- port->try_snk_count = 0;
-
if (!port->attached)
return;
--
2.43.0
From: Roman Gushchin <guro(a)fb.com>
commit e1a366be5cb4f849ec4de170d50eebc08bb0af20 upstream.
Commit 72f0184c8a00 ("mm, memcg: remove hotplug locking from try_charge")
introduced css_tryget()/css_put() calls in drain_all_stock(), which are
supposed to protect the target memory cgroup from being released during
the mem_cgroup_is_descendant() call.
However, it's not completely safe. In theory, memcg can go away between
reading stock->cached pointer and calling css_tryget().
This can happen if drain_all_stock() races with drain_local_stock()
performed on the remote cpu as a result of a work, scheduled by the
previous invocation of drain_all_stock().
The race is a bit theoretical and there are few chances to trigger it, but
the current code looks a bit confusing, so it makes sense to fix it
anyway. The code looks like as if css_tryget() and css_put() are used to
protect stocks drainage. It's not necessary because stocked pages are
holding references to the cached cgroup. And it obviously won't work for
works, scheduled on other cpus.
So, let's read the stock->cached pointer and evaluate the memory cgroup
inside a rcu read section, and get rid of css_tryget()/css_put() calls.
Link: http://lkml.kernel.org/r/20190802192241.3253165-1-guro@fb.com
Signed-off-by: Roman Gushchin <guro(a)fb.com>
Acked-by: Michal Hocko <mhocko(a)suse.com>
Cc: Hillf Danton <hdanton(a)sina.com>
Cc: Johannes Weiner <hannes(a)cmpxchg.org>
Cc: Vladimir Davydov <vdavydov.dev(a)gmail.com>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds(a)linux-foundation.org>
Cc: stable(a)vger.kernel.org # 4.19
Fixes: cdec2e4265df ("memcg: coalesce charging via percpu storage")
Signed-off-by: GONG, Ruiqi <gongruiqi1(a)huawei.com>
---
This patch [1] fixed a UAF problem in drain_all_stock() existed prior to
5.9, and following discussions [2] mentioned that the fix depends on an
RCU read protection to stock->cached (introduced in 5.4), which doesn't
existed in 4.19. So backport this part to 4.19 as well.
[1]: https://lore.kernel.org/all/20240221081801.69764-1-gongruiqi1@huawei.com/
[2]: https://lore.kernel.org/all/ZdXLgjpUfpwEwAe0@tiehlicka/
mm/memcontrol.c | 17 +++++++++--------
1 file changed, 9 insertions(+), 8 deletions(-)
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 8c04296df1c7..d187bfb43b1f 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -2094,21 +2094,22 @@ static void drain_all_stock(struct mem_cgroup *root_memcg)
for_each_online_cpu(cpu) {
struct memcg_stock_pcp *stock = &per_cpu(memcg_stock, cpu);
struct mem_cgroup *memcg;
+ bool flush = false;
+ rcu_read_lock();
memcg = stock->cached;
- if (!memcg || !stock->nr_pages || !css_tryget(&memcg->css))
- continue;
- if (!mem_cgroup_is_descendant(memcg, root_memcg)) {
- css_put(&memcg->css);
- continue;
- }
- if (!test_and_set_bit(FLUSHING_CACHED_CHARGE, &stock->flags)) {
+ if (memcg && stock->nr_pages &&
+ mem_cgroup_is_descendant(memcg, root_memcg))
+ flush = true;
+ rcu_read_unlock();
+
+ if (flush &&
+ !test_and_set_bit(FLUSHING_CACHED_CHARGE, &stock->flags)) {
if (cpu == curcpu)
drain_local_stock(&stock->work);
else
schedule_work_on(cpu, &stock->work);
}
- css_put(&memcg->css);
}
put_cpu();
mutex_unlock(&percpu_charge_mutex);
--
2.25.1
Syzkaller reports warning in ext4_set_page_dirty() in 5.10 and 5.15
stable releases. It happens because invalidate_inode_page() frees pages
that are needed for the system. To fix this we need to add additional
checks to the function. page_mapped() checks if a page exists in the
page tables, but this is not enough. The page can be used in other places:
https://elixir.bootlin.com/linux/v6.8-rc1/source/include/linux/page_ref.h#L…
Kernel outputs an error line related to direct I/O:
https://syzkaller.appspot.com/text?tag=CrashLog&x=14ab52dac80000
The problem can be fixed in 5.10 and 5.15 stable releases by the
following patch.
The patch replaces page_mapped() call with check that finds additional
references to the page excluding page cache and filesystem private data.
If additional references exist, the page cannot be freed.
This version does not include the first patch from the first version.
The problem can be fixed without it.
Found by Linux Verification Center (linuxtesting.org) with Syzkaller.
Link: https://syzkaller.appspot.com/bug?extid=02f21431b65c214aa1d6
Previous discussion:
https://lore.kernel.org/all/20240125130947.600632-1-r.smirnov@omp.ru/T/
Matthew Wilcox (Oracle) (1):
mm/truncate: Replace page_mapped() call in invalidate_inode_page()
mm/truncate.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
--
2.34.1
[2024-02-23 18:45] Sasha Levin:
> This is a note to let you know that I've just added the patch titled
>
> xhci: fix possible null pointer deref during xhci urb enqueue
>
> to the 6.7-stable tree which can be found at:
> http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
>
> The filename of the patch is:
> xhci-fix-possible-null-pointer-deref-during-xhci-urb.patch
> and it can be found in the queue-6.7 subdirectory.
>
> If you, or anyone else, feels it should not be added to the stable tree,
> please let <stable(a)vger.kernel.org> know about it.
>
>
>
> commit fb9100c2c6b7b172650ba25283cc4cf9af1d082c
> Author: Mathias Nyman <mathias.nyman(a)linux.intel.com>
> Date: Fri Dec 1 17:06:47 2023 +0200
>
> xhci: fix possible null pointer deref during xhci urb enqueue
>
> [ Upstream commit e2e2aacf042f52854c92775b7800ba668e0bdfe4 ]
>
> There is a short gap between urb being submitted and actually added to the
> endpoint queue (linked). If the device is disconnected during this time
> then usb core is not yet aware of the pending urb, and device may be freed
> just before xhci_urq_enqueue() continues, dereferencing the freed device.
>
> Freeing the device is protected by the xhci spinlock, so make sure we take
> and keep the lock while checking that device exists, dereference it, and
> add the urb to the queue.
>
> Remove the unnecessary URB check, usb core checks it before calling
> xhci_urb_enqueue()
>
> Suggested-by: Kuen-Han Tsai <khtsai(a)google.com>
> Signed-off-by: Mathias Nyman <mathias.nyman(a)linux.intel.com>
> Link: https://lore.kernel.org/r/20231201150647.1307406-20-mathias.nyman@linux.int…
> Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
> Signed-off-by: Sasha Levin <sashal(a)kernel.org>
>
> diff --git a/drivers/usb/host/xhci.c b/drivers/usb/host/xhci.c
> index 884b0898d9c95..ddb686301af5d 100644
> --- a/drivers/usb/host/xhci.c
> +++ b/drivers/usb/host/xhci.c
> @@ -1522,24 +1522,7 @@ static int xhci_urb_enqueue(struct usb_hcd *hcd, struct urb *urb, gfp_t mem_flag
> struct urb_priv *urb_priv;
> int num_tds;
>
> - if (!urb)
> - return -EINVAL;
> - ret = xhci_check_args(hcd, urb->dev, urb->ep,
> - true, true, __func__);
> - if (ret <= 0)
> - return ret ? ret : -EINVAL;
> -
> - slot_id = urb->dev->slot_id;
> ep_index = xhci_get_endpoint_index(&urb->ep->desc);
> - ep_state = &xhci->devs[slot_id]->eps[ep_index].ep_state;
> -
> - if (!HCD_HW_ACCESSIBLE(hcd))
> - return -ESHUTDOWN;
> -
> - if (xhci->devs[slot_id]->flags & VDEV_PORT_ERROR) {
> - xhci_dbg(xhci, "Can't queue urb, port error, link inactive\n");
> - return -ENODEV;
> - }
>
> if (usb_endpoint_xfer_isoc(&urb->ep->desc))
> num_tds = urb->number_of_packets;
> @@ -1578,12 +1561,35 @@ static int xhci_urb_enqueue(struct usb_hcd *hcd, struct urb *urb, gfp_t mem_flag
>
> spin_lock_irqsave(&xhci->lock, flags);
>
> + ret = xhci_check_args(hcd, urb->dev, urb->ep,
> + true, true, __func__);
> + if (ret <= 0) {
> + ret = ret ? ret : -EINVAL;
> + goto free_priv;
> + }
> +
> + slot_id = urb->dev->slot_id;
> +
> + if (!HCD_HW_ACCESSIBLE(hcd)) {
> + ret = -ESHUTDOWN;
> + goto free_priv;
> + }
> +
> + if (xhci->devs[slot_id]->flags & VDEV_PORT_ERROR) {
> + xhci_dbg(xhci, "Can't queue urb, port error, link inactive\n");
> + ret = -ENODEV;
> + goto free_priv;
> + }
> +
> if (xhci->xhc_state & XHCI_STATE_DYING) {
> xhci_dbg(xhci, "Ep 0x%x: URB %p submitted for non-responsive xHCI host.\n",
> urb->ep->desc.bEndpointAddress, urb);
> ret = -ESHUTDOWN;
> goto free_priv;
> }
> +
> + ep_state = &xhci->devs[slot_id]->eps[ep_index].ep_state;
> +
> if (*ep_state & (EP_GETTING_STREAMS | EP_GETTING_NO_STREAMS)) {
> xhci_warn(xhci, "WARN: Can't enqueue URB, ep in streams transition state %x\n",
> *ep_state);
Hi, this patch is causing my laptop (Dell Precision 7530) to crash
during early boot with a kernel 6.7.6 with all the patches from your
current stable-queue applied on top
(https://git.kernel.org/pub/scm/linux/kernel/git/stable/stable-queue.git/tre…).
Booting with "module_blacklist=xhci_pci,xhci_pci_renesas" stops the
crashes. This patch was already thrown out a few weeks ago because it
was causing problems:
https://lore.kernel.org/stable/2024020331-confetti-ducking-8afb@gregkh/
Regards
Pascal
Hi Greg and Sasha,
Please apply commit 56778b49c9a2 ("kunit: Add a macro to wrap a deferred
action function") to the 6.7 branch, as there are reported kCFI failures
that are resolved by that change:
https://github.com/ClangBuiltLinux/linux/issues/1998
It applies cleanly for me. We may want this in earlier branches as well
but there are some conflicts that I did not have too much time to look
at, we can always wait for other reports to come in before going further
back as well.
Cheers,
Nathan
From: Bjorn Helgaas <bhelgaas(a)google.com>
When booting with "pci=noaer", we don't request control of AER, but we
previously *did* request control of DPC, as in the dmesg log attached at
the bugzilla below:
Command line: ... pci=noaer
acpi PNP0A08:00: _OSC: OS supports [ExtendedConfig ASPM ClockPM Segments MSI EDR HPX-Type3]
acpi PNP0A08:00: _OSC: OS now controls [PCIeHotplug SHPCHotplug PME PCIeCapability LTR DPC]
That's illegal per PCI Firmware Spec, r3.3, sec 4.5.1, table 4-5, which
says:
If the operating system sets this bit [OSC_PCI_EXPRESS_DPC_CONTROL], it
must also set bit 7 of the Support field (indicating support for Error
Disconnect Recover notifications) and bits 3 and 4 of the Control field
(requesting control of PCI Express Advanced Error Reporting and the PCI
Express Capability Structure).
Request DPC control only if we have also requested AER control.
Fixes: ac1c8e35a326 ("PCI/DPC: Add Error Disconnect Recover (EDR) support")
Link: https://bugzilla.kernel.org/show_bug.cgi?id=218491#c12
Signed-off-by: Bjorn Helgaas <bhelgaas(a)google.com>
Cc: <stable(a)vger.kernel.org> # v5.7+
Cc: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy(a)linux.intel.com>
Cc: Matthew W Carlis <mattc(a)purestorage.com>
Cc: Keith Busch <kbusch(a)kernel.org>
Cc: Lukas Wunner <lukas(a)wunner.de>
Cc: Mika Westerberg <mika.westerberg(a)linux.intel.com>
Cc: Jesse Brandeburg <jesse.brandeburg(a)intel.com>
---
drivers/acpi/pci_root.c | 20 +++++++++++---------
1 file changed, 11 insertions(+), 9 deletions(-)
diff --git a/drivers/acpi/pci_root.c b/drivers/acpi/pci_root.c
index 58b89b8d950e..1c16965427b3 100644
--- a/drivers/acpi/pci_root.c
+++ b/drivers/acpi/pci_root.c
@@ -518,17 +518,19 @@ static u32 calculate_control(void)
if (IS_ENABLED(CONFIG_HOTPLUG_PCI_SHPC))
control |= OSC_PCI_SHPC_NATIVE_HP_CONTROL;
- if (pci_aer_available())
+ if (pci_aer_available()) {
control |= OSC_PCI_EXPRESS_AER_CONTROL;
- /*
- * Per the Downstream Port Containment Related Enhancements ECN to
- * the PCI Firmware Spec, r3.2, sec 4.5.1, table 4-5,
- * OSC_PCI_EXPRESS_DPC_CONTROL indicates the OS supports both DPC
- * and EDR.
- */
- if (IS_ENABLED(CONFIG_PCIE_DPC) && IS_ENABLED(CONFIG_PCIE_EDR))
- control |= OSC_PCI_EXPRESS_DPC_CONTROL;
+ /*
+ * Per PCI Firmware Spec, r3.3, sec 4.5.1, table 4-5, the
+ * OS can request DPC control only if it has advertised
+ * OSC_PCI_EDR_SUPPORT and requested both
+ * OSC_PCI_EXPRESS_CAPABILITY_CONTROL and
+ * OSC_PCI_EXPRESS_AER_CONTROL.
+ */
+ if (IS_ENABLED(CONFIG_PCIE_DPC) && IS_ENABLED(CONFIG_PCIE_EDR))
+ control |= OSC_PCI_EXPRESS_DPC_CONTROL;
+ }
return control;
}
--
2.34.1
From: Paul E. McKenney <paulmck(a)kernel.org>
commit bc31e6cb27a9334140ff2f0a209d59b08bc0bc8c upstream.
Holding a mutex across synchronize_rcu_tasks() and acquiring
that same mutex in code called from do_exit() after its call to
exit_tasks_rcu_start() but before its call to exit_tasks_rcu_stop()
results in deadlock. This is by design, because tasks that are far
enough into do_exit() are no longer present on the tasks list, making
it a bit difficult for RCU Tasks to find them, let alone wait on them
to do a voluntary context switch. However, such deadlocks are becoming
more frequent. In addition, lockdep currently does not detect such
deadlocks and they can be difficult to reproduce.
In addition, if a task voluntarily context switches during that time
(for example, if it blocks acquiring a mutex), then this task is in an
RCU Tasks quiescent state. And with some adjustments, RCU Tasks could
just as well take advantage of that fact.
This commit therefore eliminates these deadlock by replacing the
SRCU-based wait for do_exit() completion with per-CPU lists of tasks
currently exiting. A given task will be on one of these per-CPU lists for
the same period of time that this task would previously have been in the
previous SRCU read-side critical section. These lists enable RCU Tasks
to find the tasks that have already been removed from the tasks list,
but that must nevertheless be waited upon.
The RCU Tasks grace period gathers any of these do_exit() tasks that it
must wait on, and adds them to the list of holdouts. Per-CPU locking
and get_task_struct() are used to synchronize addition to and removal
from these lists.
Link: https://lore.kernel.org/all/20240118021842.290665-1-chenzhongjin@huawei.com/
Reported-by: Chen Zhongjin <chenzhongjin(a)huawei.com>
Signed-off-by: Paul E. McKenney <paulmck(a)kernel.org>
Signed-off-by: Zqiang <qiang.zhang1211(a)gmail.com>
---
include/linux/sched.h | 1 +
init/init_task.c | 1 +
kernel/fork.c | 1 +
kernel/rcu/update.c | 65 ++++++++++++++++++++++++++++++-------------
4 files changed, 49 insertions(+), 19 deletions(-)
diff --git a/include/linux/sched.h b/include/linux/sched.h
index fd4899236037..0b555d8e9d5e 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -679,6 +679,7 @@ struct task_struct {
u8 rcu_tasks_idx;
int rcu_tasks_idle_cpu;
struct list_head rcu_tasks_holdout_list;
+ struct list_head rcu_tasks_exit_list;
#endif /* #ifdef CONFIG_TASKS_RCU */
struct sched_info sched_info;
diff --git a/init/init_task.c b/init/init_task.c
index 994ffe018120..f741cbfd891c 100644
--- a/init/init_task.c
+++ b/init/init_task.c
@@ -139,6 +139,7 @@ struct task_struct init_task
.rcu_tasks_holdout = false,
.rcu_tasks_holdout_list = LIST_HEAD_INIT(init_task.rcu_tasks_holdout_list),
.rcu_tasks_idle_cpu = -1,
+ .rcu_tasks_exit_list = LIST_HEAD_INIT(init_task.rcu_tasks_exit_list),
#endif
#ifdef CONFIG_CPUSETS
.mems_allowed_seq = SEQCNT_ZERO(init_task.mems_allowed_seq),
diff --git a/kernel/fork.c b/kernel/fork.c
index b65871600507..d416d16df62f 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -1626,6 +1626,7 @@ static inline void rcu_copy_process(struct task_struct *p)
p->rcu_tasks_holdout = false;
INIT_LIST_HEAD(&p->rcu_tasks_holdout_list);
p->rcu_tasks_idle_cpu = -1;
+ INIT_LIST_HEAD(&p->rcu_tasks_exit_list);
#endif /* #ifdef CONFIG_TASKS_RCU */
}
diff --git a/kernel/rcu/update.c b/kernel/rcu/update.c
index 81688a133552..5227cb5c1bea 100644
--- a/kernel/rcu/update.c
+++ b/kernel/rcu/update.c
@@ -527,7 +527,8 @@ static DECLARE_WAIT_QUEUE_HEAD(rcu_tasks_cbs_wq);
static DEFINE_RAW_SPINLOCK(rcu_tasks_cbs_lock);
/* Track exiting tasks in order to allow them to be waited for. */
-DEFINE_STATIC_SRCU(tasks_rcu_exit_srcu);
+static LIST_HEAD(rtp_exit_list);
+static DEFINE_RAW_SPINLOCK(rtp_exit_list_lock);
/* Control stall timeouts. Disable with <= 0, otherwise jiffies till stall. */
#define RCU_TASK_STALL_TIMEOUT (HZ * 60 * 10)
@@ -661,6 +662,17 @@ static void check_holdout_task(struct task_struct *t,
sched_show_task(t);
}
+static void rcu_tasks_pertask(struct task_struct *t, struct list_head *hop)
+{
+ if (t != current && READ_ONCE(t->on_rq) &&
+ !is_idle_task(t)) {
+ get_task_struct(t);
+ t->rcu_tasks_nvcsw = READ_ONCE(t->nvcsw);
+ WRITE_ONCE(t->rcu_tasks_holdout, true);
+ list_add(&t->rcu_tasks_holdout_list, hop);
+ }
+}
+
/* RCU-tasks kthread that detects grace periods and invokes callbacks. */
static int __noreturn rcu_tasks_kthread(void *arg)
{
@@ -726,14 +738,7 @@ static int __noreturn rcu_tasks_kthread(void *arg)
*/
rcu_read_lock();
for_each_process_thread(g, t) {
- if (t != current && READ_ONCE(t->on_rq) &&
- !is_idle_task(t)) {
- get_task_struct(t);
- t->rcu_tasks_nvcsw = READ_ONCE(t->nvcsw);
- WRITE_ONCE(t->rcu_tasks_holdout, true);
- list_add(&t->rcu_tasks_holdout_list,
- &rcu_tasks_holdouts);
- }
+ rcu_tasks_pertask(t, &rcu_tasks_holdouts);
}
rcu_read_unlock();
@@ -744,8 +749,12 @@ static int __noreturn rcu_tasks_kthread(void *arg)
* where they have disabled preemption, allowing the
* later synchronize_sched() to finish the job.
*/
- synchronize_srcu(&tasks_rcu_exit_srcu);
-
+ raw_spin_lock_irqsave(&rtp_exit_list_lock, flags);
+ list_for_each_entry(t, &rtp_exit_list, rcu_tasks_exit_list) {
+ if (list_empty(&t->rcu_tasks_holdout_list))
+ rcu_tasks_pertask(t, &rcu_tasks_holdouts);
+ }
+ raw_spin_unlock_irqrestore(&rtp_exit_list_lock, flags);
/*
* Each pass through the following loop scans the list
* of holdout tasks, removing any that are no longer
@@ -802,8 +811,7 @@ static int __noreturn rcu_tasks_kthread(void *arg)
*
* In addition, this synchronize_sched() waits for exiting
* tasks to complete their final preempt_disable() region
- * of execution, cleaning up after the synchronize_srcu()
- * above.
+ * of execution.
*/
synchronize_sched();
@@ -834,20 +842,39 @@ static int __init rcu_spawn_tasks_kthread(void)
}
core_initcall(rcu_spawn_tasks_kthread);
-/* Do the srcu_read_lock() for the above synchronize_srcu(). */
+/*
+ * Protect against tasklist scan blind spot while the task is exiting and
+ * may be removed from the tasklist. Do this by adding the task to yet
+ * another list.
+ */
void exit_tasks_rcu_start(void)
{
+ unsigned long flags;
+ struct task_struct *t = current;
+
+ WARN_ON_ONCE(!list_empty(&t->rcu_tasks_exit_list));
+ get_task_struct(t);
preempt_disable();
- current->rcu_tasks_idx = __srcu_read_lock(&tasks_rcu_exit_srcu);
+ raw_spin_lock_irqsave(&rtp_exit_list_lock, flags);
+ list_add(&t->rcu_tasks_exit_list, &rtp_exit_list);
+ raw_spin_unlock_irqrestore(&rtp_exit_list_lock, flags);
preempt_enable();
}
-/* Do the srcu_read_unlock() for the above synchronize_srcu(). */
+/*
+ * Remove the task from the "yet another list" because do_exit() is now
+ * non-preemptible, allowing synchronize_rcu() to wait beyond this point.
+ */
void exit_tasks_rcu_finish(void)
{
- preempt_disable();
- __srcu_read_unlock(&tasks_rcu_exit_srcu, current->rcu_tasks_idx);
- preempt_enable();
+ unsigned long flags;
+ struct task_struct *t = current;
+
+ WARN_ON_ONCE(list_empty(&t->rcu_tasks_exit_list));
+ raw_spin_lock_irqsave(&rtp_exit_list_lock, flags);
+ list_del_init(&t->rcu_tasks_exit_list);
+ raw_spin_unlock_irqrestore(&rtp_exit_list_lock, flags);
+ put_task_struct(t);
}
#endif /* #ifdef CONFIG_TASKS_RCU */
--
2.17.1
The patch below does not apply to the 5.15-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.15.y
git checkout FETCH_HEAD
git cherry-pick -x d877550eaf2dc9090d782864c96939397a3c6835
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024021941-reprimand-grudge-7734@gregkh' --subject-prefix 'PATCH 5.15.y' HEAD^..
Possible dependencies:
d877550eaf2d ("x86/fpu: Stop relying on userspace for info to fault in xsave buffer")
c03098d4b9ad ("Merge tag 'gfs2-v5.15-rc5-mmap-fault' of git://git.kernel.org/pub/scm/linux/kernel/git/gfs2/linux-gfs2")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From d877550eaf2dc9090d782864c96939397a3c6835 Mon Sep 17 00:00:00 2001
From: Andrei Vagin <avagin(a)google.com>
Date: Mon, 29 Jan 2024 22:36:03 -0800
Subject: [PATCH] x86/fpu: Stop relying on userspace for info to fault in xsave
buffer
Before this change, the expected size of the user space buffer was
taken from fx_sw->xstate_size. fx_sw->xstate_size can be changed
from user-space, so it is possible construct a sigreturn frame where:
* fx_sw->xstate_size is smaller than the size required by valid bits in
fx_sw->xfeatures.
* user-space unmaps parts of the sigrame fpu buffer so that not all of
the buffer required by xrstor is accessible.
In this case, xrstor tries to restore and accesses the unmapped area
which results in a fault. But fault_in_readable succeeds because buf +
fx_sw->xstate_size is within the still mapped area, so it goes back and
tries xrstor again. It will spin in this loop forever.
Instead, fault in the maximum size which can be touched by XRSTOR (taken
from fpstate->user_size).
[ dhansen: tweak subject / changelog ]
Fixes: fcb3635f5018 ("x86/fpu/signal: Handle #PF in the direct restore path")
Reported-by: Konstantin Bogomolov <bogomolov(a)google.com>
Suggested-by: Thomas Gleixner <tglx(a)linutronix.de>
Signed-off-by: Andrei Vagin <avagin(a)google.com>
Signed-off-by: Dave Hansen <dave.hansen(a)linux.intel.com>
Cc:stable@vger.kernel.org
Link: https://lore.kernel.org/all/20240130063603.3392627-1-avagin%40google.com
diff --git a/arch/x86/kernel/fpu/signal.c b/arch/x86/kernel/fpu/signal.c
index 558076dbde5b..247f2225aa9f 100644
--- a/arch/x86/kernel/fpu/signal.c
+++ b/arch/x86/kernel/fpu/signal.c
@@ -274,12 +274,13 @@ static int __restore_fpregs_from_user(void __user *buf, u64 ufeatures,
* Attempt to restore the FPU registers directly from user memory.
* Pagefaults are handled and any errors returned are fatal.
*/
-static bool restore_fpregs_from_user(void __user *buf, u64 xrestore,
- bool fx_only, unsigned int size)
+static bool restore_fpregs_from_user(void __user *buf, u64 xrestore, bool fx_only)
{
struct fpu *fpu = ¤t->thread.fpu;
int ret;
+ /* Restore enabled features only. */
+ xrestore &= fpu->fpstate->user_xfeatures;
retry:
fpregs_lock();
/* Ensure that XFD is up to date */
@@ -309,7 +310,7 @@ static bool restore_fpregs_from_user(void __user *buf, u64 xrestore,
if (ret != X86_TRAP_PF)
return false;
- if (!fault_in_readable(buf, size))
+ if (!fault_in_readable(buf, fpu->fpstate->user_size))
goto retry;
return false;
}
@@ -339,7 +340,6 @@ static bool __fpu_restore_sig(void __user *buf, void __user *buf_fx,
struct user_i387_ia32_struct env;
bool success, fx_only = false;
union fpregs_state *fpregs;
- unsigned int state_size;
u64 user_xfeatures = 0;
if (use_xsave()) {
@@ -349,17 +349,14 @@ static bool __fpu_restore_sig(void __user *buf, void __user *buf_fx,
return false;
fx_only = !fx_sw_user.magic1;
- state_size = fx_sw_user.xstate_size;
user_xfeatures = fx_sw_user.xfeatures;
} else {
user_xfeatures = XFEATURE_MASK_FPSSE;
- state_size = fpu->fpstate->user_size;
}
if (likely(!ia32_fxstate)) {
/* Restore the FPU registers directly from user memory. */
- return restore_fpregs_from_user(buf_fx, user_xfeatures, fx_only,
- state_size);
+ return restore_fpregs_from_user(buf_fx, user_xfeatures, fx_only);
}
/*
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.10.y
git checkout FETCH_HEAD
git cherry-pick -x 3c12466b6b7bf1e56f9b32c366a3d83d87afb4de
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024012650-altitude-gush-572f@gregkh' --subject-prefix 'PATCH 5.10.y' HEAD^..
Possible dependencies:
3c12466b6b7b ("erofs: fix lz4 inplace decompression")
123ec246ebe3 ("erofs: get rid of the remaining kmap_atomic()")
ab749badf9f4 ("erofs: support unaligned data decompression")
10e5f6e482e1 ("erofs: introduce z_erofs_fixup_insize")
d67aee76d418 ("erofs: tidy up z_erofs_lz4_decompress")
7e508f2ca8bb ("erofs: rename lz4_0pading to zero_padding")
eaa9172ad988 ("erofs: get rid of ->lru usage")
622ceaddb764 ("erofs: lzma compression support")
966edfb0a3dc ("erofs: rename some generic methods in decompressor")
386292919c25 ("erofs: introduce readmore decompression strategy")
8f89926290c4 ("erofs: get compression algorithms directly on mapping")
dfeab2e95a75 ("erofs: add multiple device support")
e62424651f43 ("erofs: decouple basic mount options from fs_context")
5b6e7e120e71 ("erofs: remove the fast path of per-CPU buffer decompression")
2e5fd489a4e5 ("Merge tag 'libnvdimm-for-5.15' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 3c12466b6b7bf1e56f9b32c366a3d83d87afb4de Mon Sep 17 00:00:00 2001
From: Gao Xiang <xiang(a)kernel.org>
Date: Wed, 6 Dec 2023 12:55:34 +0800
Subject: [PATCH] erofs: fix lz4 inplace decompression
Currently EROFS can map another compressed buffer for inplace
decompression, that was used to handle the cases that some pages of
compressed data are actually not in-place I/O.
However, like most simple LZ77 algorithms, LZ4 expects the compressed
data is arranged at the end of the decompressed buffer and it
explicitly uses memmove() to handle overlapping:
__________________________________________________________
|_ direction of decompression --> ____ |_ compressed data _|
Although EROFS arranges compressed data like this, it typically maps two
individual virtual buffers so the relative order is uncertain.
Previously, it was hardly observed since LZ4 only uses memmove() for
short overlapped literals and x86/arm64 memmove implementations seem to
completely cover it up and they don't have this issue. Juhyung reported
that EROFS data corruption can be found on a new Intel x86 processor.
After some analysis, it seems that recent x86 processors with the new
FSRM feature expose this issue with "rep movsb".
Let's strictly use the decompressed buffer for lz4 inplace
decompression for now. Later, as an useful improvement, we could try
to tie up these two buffers together in the correct order.
Reported-and-tested-by: Juhyung Park <qkrwngud825(a)gmail.com>
Closes: https://lore.kernel.org/r/CAD14+f2AVKf8Fa2OO1aAUdDNTDsVzzR6ctU_oJSmTyd6zSYR…
Fixes: 0ffd71bcc3a0 ("staging: erofs: introduce LZ4 decompression inplace")
Fixes: 598162d05080 ("erofs: support decompress big pcluster for lz4 backend")
Cc: stable <stable(a)vger.kernel.org> # 5.4+
Tested-by: Yifan Zhao <zhaoyifan(a)sjtu.edu.cn>
Signed-off-by: Gao Xiang <hsiangkao(a)linux.alibaba.com>
Link: https://lore.kernel.org/r/20231206045534.3920847-1-hsiangkao@linux.alibaba.…
diff --git a/fs/erofs/decompressor.c b/fs/erofs/decompressor.c
index 021be5feb1bc..e0d609c3958f 100644
--- a/fs/erofs/decompressor.c
+++ b/fs/erofs/decompressor.c
@@ -121,11 +121,11 @@ static int z_erofs_lz4_prepare_dstpages(struct z_erofs_lz4_decompress_ctx *ctx,
}
static void *z_erofs_lz4_handle_overlap(struct z_erofs_lz4_decompress_ctx *ctx,
- void *inpage, unsigned int *inputmargin, int *maptype,
- bool may_inplace)
+ void *inpage, void *out, unsigned int *inputmargin,
+ int *maptype, bool may_inplace)
{
struct z_erofs_decompress_req *rq = ctx->rq;
- unsigned int omargin, total, i, j;
+ unsigned int omargin, total, i;
struct page **in;
void *src, *tmp;
@@ -135,12 +135,13 @@ static void *z_erofs_lz4_handle_overlap(struct z_erofs_lz4_decompress_ctx *ctx,
omargin < LZ4_DECOMPRESS_INPLACE_MARGIN(rq->inputsize))
goto docopy;
- for (i = 0; i < ctx->inpages; ++i) {
- DBG_BUGON(rq->in[i] == NULL);
- for (j = 0; j < ctx->outpages - ctx->inpages + i; ++j)
- if (rq->out[j] == rq->in[i])
- goto docopy;
- }
+ for (i = 0; i < ctx->inpages; ++i)
+ if (rq->out[ctx->outpages - ctx->inpages + i] !=
+ rq->in[i])
+ goto docopy;
+ kunmap_local(inpage);
+ *maptype = 3;
+ return out + ((ctx->outpages - ctx->inpages) << PAGE_SHIFT);
}
if (ctx->inpages <= 1) {
@@ -148,7 +149,6 @@ static void *z_erofs_lz4_handle_overlap(struct z_erofs_lz4_decompress_ctx *ctx,
return inpage;
}
kunmap_local(inpage);
- might_sleep();
src = erofs_vm_map_ram(rq->in, ctx->inpages);
if (!src)
return ERR_PTR(-ENOMEM);
@@ -204,12 +204,12 @@ int z_erofs_fixup_insize(struct z_erofs_decompress_req *rq, const char *padbuf,
}
static int z_erofs_lz4_decompress_mem(struct z_erofs_lz4_decompress_ctx *ctx,
- u8 *out)
+ u8 *dst)
{
struct z_erofs_decompress_req *rq = ctx->rq;
bool support_0padding = false, may_inplace = false;
unsigned int inputmargin;
- u8 *headpage, *src;
+ u8 *out, *headpage, *src;
int ret, maptype;
DBG_BUGON(*rq->in == NULL);
@@ -230,11 +230,12 @@ static int z_erofs_lz4_decompress_mem(struct z_erofs_lz4_decompress_ctx *ctx,
}
inputmargin = rq->pageofs_in;
- src = z_erofs_lz4_handle_overlap(ctx, headpage, &inputmargin,
+ src = z_erofs_lz4_handle_overlap(ctx, headpage, dst, &inputmargin,
&maptype, may_inplace);
if (IS_ERR(src))
return PTR_ERR(src);
+ out = dst + rq->pageofs_out;
/* legacy format could compress extra data in a pcluster. */
if (rq->partial_decoding || !support_0padding)
ret = LZ4_decompress_safe_partial(src + inputmargin, out,
@@ -265,7 +266,7 @@ static int z_erofs_lz4_decompress_mem(struct z_erofs_lz4_decompress_ctx *ctx,
vm_unmap_ram(src, ctx->inpages);
} else if (maptype == 2) {
erofs_put_pcpubuf(src);
- } else {
+ } else if (maptype != 3) {
DBG_BUGON(1);
return -EFAULT;
}
@@ -308,7 +309,7 @@ static int z_erofs_lz4_decompress(struct z_erofs_decompress_req *rq,
}
dstmap_out:
- ret = z_erofs_lz4_decompress_mem(&ctx, dst + rq->pageofs_out);
+ ret = z_erofs_lz4_decompress_mem(&ctx, dst);
if (!dst_maptype)
kunmap_local(dst);
else if (dst_maptype == 2)
From: Jean-Baptiste Maneyrol <jean-baptiste.maneyrol(a)tdk.com>
Now that we are reading the full FIFO in the interrupt handler,
it is possible to have an emply FIFO since we are still receiving
1 interrupt per data. Handle correctly this case instead of having
an error causing a reset of the FIFO.
Fixes: 0829edc43e0a ("iio: imu: inv_mpu6050: read the full fifo when processing data")
Cc: stable(a)vger.kernel.org
Signed-off-by: Jean-Baptiste Maneyrol <jean-baptiste.maneyrol(a)tdk.com>
---
V2: add missing stable tag
drivers/iio/imu/inv_mpu6050/inv_mpu_ring.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/drivers/iio/imu/inv_mpu6050/inv_mpu_ring.c b/drivers/iio/imu/inv_mpu6050/inv_mpu_ring.c
index 66d4ba088e70..d4f9b5d8d28d 100644
--- a/drivers/iio/imu/inv_mpu6050/inv_mpu_ring.c
+++ b/drivers/iio/imu/inv_mpu6050/inv_mpu_ring.c
@@ -109,6 +109,8 @@ irqreturn_t inv_mpu6050_read_fifo(int irq, void *p)
/* compute and process only all complete datum */
nb = fifo_count / bytes_per_datum;
fifo_count = nb * bytes_per_datum;
+ if (nb == 0)
+ goto end_session;
/* Each FIFO data contains all sensors, so same number for FIFO and sensor data */
fifo_period = NSEC_PER_SEC / INV_MPU6050_DIVIDER_TO_FIFO_RATE(st->chip_config.divider);
inv_sensors_timestamp_interrupt(&st->timestamp, fifo_period, nb, nb, pf->timestamp);
--
2.34.1
Resolving a frequency to an efficient one should not transgress policy->max
(which can be set for thermal reason) and policy->min. Currently there is
possibility where scaling_cur_freq can exceed scaling_max_freq when
scaling_max_freq is inefficient frequency. Add additional check to ensure
that resolving a frequency will respect policy->min/max.
Fixes: 1f39fa0dccff ("cpufreq: Introducing CPUFREQ_RELATION_E")
Signed-off-by: Shivnandan Kumar <quic_kshivnan(a)quicinc.com>
---
include/linux/cpufreq.h | 18 +++++++++++++++++-
1 file changed, 17 insertions(+), 1 deletion(-)
diff --git a/include/linux/cpufreq.h b/include/linux/cpufreq.h
index afda5f24d3dd..42d98b576a36 100644
--- a/include/linux/cpufreq.h
+++ b/include/linux/cpufreq.h
@@ -1021,6 +1021,19 @@ static inline int cpufreq_table_find_index_c(struct cpufreq_policy *policy,
efficiencies);
}
+static inline bool cpufreq_table_index_is_in_limits(struct cpufreq_policy *policy,
+ int idx)
+{
+ unsigned int freq;
+
+ if (idx < 0)
+ return false;
+
+ freq = policy->freq_table[idx].frequency;
+
+ return (freq == clamp_val(freq, policy->min, policy->max));
+}
+
static inline int cpufreq_frequency_table_target(struct cpufreq_policy *policy,
unsigned int target_freq,
unsigned int relation)
@@ -1054,7 +1067,10 @@ static inline int cpufreq_frequency_table_target(struct cpufreq_policy *policy,
return 0;
}
- if (idx < 0 && efficiencies) {
+ /*
+ * Limit frequency index to honor policy->min/max
+ */
+ if (!cpufreq_table_index_is_in_limits(policy, idx) && efficiencies) {
efficiencies = false;
goto retry;
}
--
2.25.1
Resolving a frequency to an efficient one should not transgress policy->max
(which can be set for thermal reason) and policy->min. Currently there is
possibility where scaling_cur_freq can exceed scaling_max_freq when
scaling_max_freq is inefficient frequency. Add additional check to ensure
that resolving a frequency will respect policy->min/max.
Cc: <stable(a)vger.kernel.org>
Fixes: 1f39fa0dccff ("cpufreq: Introducing CPUFREQ_RELATION_E")
Signed-off-by: Shivnandan Kumar <quic_kshivnan(a)quicinc.com>
--
Changes in v2:
-rename function name from cpufreq_table_index_is_in_limits to cpufreq_is_in_limits
-remove redundant outer parenthesis in return statement
-Make comment single line
--
---
include/linux/cpufreq.h | 16 +++++++++++++++-
1 file changed, 15 insertions(+), 1 deletion(-)
diff --git a/include/linux/cpufreq.h b/include/linux/cpufreq.h
index afda5f24d3dd..7741244dee6e 100644
--- a/include/linux/cpufreq.h
+++ b/include/linux/cpufreq.h
@@ -1021,6 +1021,19 @@ static inline int cpufreq_table_find_index_c(struct cpufreq_policy *policy,
efficiencies);
}
+static inline bool cpufreq_is_in_limits(struct cpufreq_policy *policy,
+ int idx)
+{
+ unsigned int freq;
+
+ if (idx < 0)
+ return false;
+
+ freq = policy->freq_table[idx].frequency;
+
+ return freq == clamp_val(freq, policy->min, policy->max);
+}
+
static inline int cpufreq_frequency_table_target(struct cpufreq_policy *policy,
unsigned int target_freq,
unsigned int relation)
@@ -1054,7 +1067,8 @@ static inline int cpufreq_frequency_table_target(struct cpufreq_policy *policy,
return 0;
}
- if (idx < 0 && efficiencies) {
+ /* Limit frequency index to honor policy->min/max */
+ if (!cpufreq_is_in_limits(policy, idx) && efficiencies) {
efficiencies = false;
goto retry;
}
--
2.25.1
From: Christophe JAILLET <christophe.jaillet(a)wanadoo.fr>
ida_alloc() and ida_free() should be preferred to the deprecated
ida_simple_get() and ida_simple_remove().
Note that the upper limit of ida_simple_get() is exclusive, but the one of
ida_alloc_range() is inclusive. So change this change allows one more
device. Previously address 0xFE was never used.
Fixes: 46a2bb5a7f7e ("slimbus: core: Add slim controllers support")
Cc: <Stable(a)vger.kernel.org>
Signed-off-by: Christophe JAILLET <christophe.jaillet(a)wanadoo.fr>
Signed-off-by: Srinivas Kandagatla <srinivas.kandagatla(a)linaro.org>
---
drivers/slimbus/core.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/slimbus/core.c b/drivers/slimbus/core.c
index d43873bb5fe6..01cbd4621981 100644
--- a/drivers/slimbus/core.c
+++ b/drivers/slimbus/core.c
@@ -436,8 +436,8 @@ static int slim_device_alloc_laddr(struct slim_device *sbdev,
if (ret < 0)
goto err;
} else if (report_present) {
- ret = ida_simple_get(&ctrl->laddr_ida,
- 0, SLIM_LA_MANAGER - 1, GFP_KERNEL);
+ ret = ida_alloc_max(&ctrl->laddr_ida,
+ SLIM_LA_MANAGER - 1, GFP_KERNEL);
if (ret < 0)
goto err;
--
2.25.1
Some Makefiles under tools/ use the 'override CFLAGS += ...' construct
to add a few required options to CFLAGS passed by the user.
Unfortunately that only works when user passes CFLAGS as an environment
variable, i.e.
CFLAGS=... make ...
and not in case when CFLAGS are passed as make command line arguments:
make ... CFLAGS=...
It happens because in the latter case CFLAGS=... is recorded in the make
variable MAKEOVERRIDES and this variable is passed in its original form
to all $(MAKE) subcommands, taking precedence over modified CFLAGS value
passed in the environment variable. E.g. this causes build failure for
gpio and iio tools when the build is run with user CFLAGS because of
missing _GNU_SOURCE definition needed for the asprintf().
One way to fix it is by removing overridden variables from the
MAKEOVERRIDES. Add macro 'drop-var-from-overrides' that removes a
definition of a variable passed to it from the MAKEOVERRIDES and use it
to fix CFLAGS passing for tools/gpio and tools/iio.
This implementation tries to be precise in string processing and handle
variables with embedded spaces and backslashes correctly. To achieve
that it replaces every '\\' sequence with '\-' to make sure that every
'\' in the resulting string is an escape character. It then replaces
every '\ ' sequence with '\_' to turn string values with embedded spaces
into single words. After filtering the overridden variable definition
out of the resulting string these two transformations are reversed.
Cc: stable(a)vger.kernel.org
Fixes: 4ccc98a48958 ("tools gpio: Allow overriding CFLAGS")
Fixes: 572974610273 ("tools iio: Override CFLAGS assignments")
Signed-off-by: Max Filippov <jcmvbkbc(a)gmail.com>
---
Changes v1->v2:
- make drop-var-from-overrides-code work correctly with arbitrary
variables, including thoses ending with '\'.
tools/gpio/Makefile | 1 +
tools/iio/Makefile | 1 +
tools/scripts/Makefile.include | 9 +++++++++
3 files changed, 11 insertions(+)
diff --git a/tools/gpio/Makefile b/tools/gpio/Makefile
index d29c9c49e251..46fc38d51639 100644
--- a/tools/gpio/Makefile
+++ b/tools/gpio/Makefile
@@ -24,6 +24,7 @@ ALL_PROGRAMS := $(patsubst %,$(OUTPUT)%,$(ALL_TARGETS))
all: $(ALL_PROGRAMS)
export srctree OUTPUT CC LD CFLAGS
+$(call drop-var-from-overrides,CFLAGS)
include $(srctree)/tools/build/Makefile.include
#
diff --git a/tools/iio/Makefile b/tools/iio/Makefile
index fa720f062229..04307588dd3f 100644
--- a/tools/iio/Makefile
+++ b/tools/iio/Makefile
@@ -20,6 +20,7 @@ ALL_PROGRAMS := $(patsubst %,$(OUTPUT)%,$(ALL_TARGETS))
all: $(ALL_PROGRAMS)
export srctree OUTPUT CC LD CFLAGS
+$(call drop-var-from-overrides,CFLAGS)
include $(srctree)/tools/build/Makefile.include
#
diff --git a/tools/scripts/Makefile.include b/tools/scripts/Makefile.include
index 6fba29f3222d..0f68b95cf55c 100644
--- a/tools/scripts/Makefile.include
+++ b/tools/scripts/Makefile.include
@@ -51,6 +51,15 @@ define allow-override
$(eval $(1) = $(2)))
endef
+# When a Makefile overrides a variable and exports it for the nested $(MAKE)
+# invocations to use its modified value, it must remove that variable definition
+# from the MAKEOVERRIDES variable, otherwise the original definition from the
+# MAKEOVERRIDES takes precedence over the exported value.
+drop-var-from-overrides = $(eval $(drop-var-from-overrides-code))
+define drop-var-from-overrides-code
+MAKEOVERRIDES := $(subst \-,\\,$(subst \_,\ ,$(filter-out $(1)=%,$(subst \ ,\_,$(subst \\,\-,$(MAKEOVERRIDES))))))
+endef
+
ifneq ($(LLVM),)
ifneq ($(filter %/,$(LLVM)),)
LLVM_PREFIX := $(LLVM)
--
2.39.2
The patch below does not apply to the 5.15-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.15.y
git checkout FETCH_HEAD
git cherry-pick -x 3c12466b6b7bf1e56f9b32c366a3d83d87afb4de
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024012648-backwater-colt-7290@gregkh' --subject-prefix 'PATCH 5.15.y' HEAD^..
Possible dependencies:
3c12466b6b7b ("erofs: fix lz4 inplace decompression")
123ec246ebe3 ("erofs: get rid of the remaining kmap_atomic()")
ab749badf9f4 ("erofs: support unaligned data decompression")
10e5f6e482e1 ("erofs: introduce z_erofs_fixup_insize")
d67aee76d418 ("erofs: tidy up z_erofs_lz4_decompress")
7e508f2ca8bb ("erofs: rename lz4_0pading to zero_padding")
eaa9172ad988 ("erofs: get rid of ->lru usage")
622ceaddb764 ("erofs: lzma compression support")
966edfb0a3dc ("erofs: rename some generic methods in decompressor")
386292919c25 ("erofs: introduce readmore decompression strategy")
8f89926290c4 ("erofs: get compression algorithms directly on mapping")
dfeab2e95a75 ("erofs: add multiple device support")
e62424651f43 ("erofs: decouple basic mount options from fs_context")
5b6e7e120e71 ("erofs: remove the fast path of per-CPU buffer decompression")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 3c12466b6b7bf1e56f9b32c366a3d83d87afb4de Mon Sep 17 00:00:00 2001
From: Gao Xiang <xiang(a)kernel.org>
Date: Wed, 6 Dec 2023 12:55:34 +0800
Subject: [PATCH] erofs: fix lz4 inplace decompression
Currently EROFS can map another compressed buffer for inplace
decompression, that was used to handle the cases that some pages of
compressed data are actually not in-place I/O.
However, like most simple LZ77 algorithms, LZ4 expects the compressed
data is arranged at the end of the decompressed buffer and it
explicitly uses memmove() to handle overlapping:
__________________________________________________________
|_ direction of decompression --> ____ |_ compressed data _|
Although EROFS arranges compressed data like this, it typically maps two
individual virtual buffers so the relative order is uncertain.
Previously, it was hardly observed since LZ4 only uses memmove() for
short overlapped literals and x86/arm64 memmove implementations seem to
completely cover it up and they don't have this issue. Juhyung reported
that EROFS data corruption can be found on a new Intel x86 processor.
After some analysis, it seems that recent x86 processors with the new
FSRM feature expose this issue with "rep movsb".
Let's strictly use the decompressed buffer for lz4 inplace
decompression for now. Later, as an useful improvement, we could try
to tie up these two buffers together in the correct order.
Reported-and-tested-by: Juhyung Park <qkrwngud825(a)gmail.com>
Closes: https://lore.kernel.org/r/CAD14+f2AVKf8Fa2OO1aAUdDNTDsVzzR6ctU_oJSmTyd6zSYR…
Fixes: 0ffd71bcc3a0 ("staging: erofs: introduce LZ4 decompression inplace")
Fixes: 598162d05080 ("erofs: support decompress big pcluster for lz4 backend")
Cc: stable <stable(a)vger.kernel.org> # 5.4+
Tested-by: Yifan Zhao <zhaoyifan(a)sjtu.edu.cn>
Signed-off-by: Gao Xiang <hsiangkao(a)linux.alibaba.com>
Link: https://lore.kernel.org/r/20231206045534.3920847-1-hsiangkao@linux.alibaba.…
diff --git a/fs/erofs/decompressor.c b/fs/erofs/decompressor.c
index 021be5feb1bc..e0d609c3958f 100644
--- a/fs/erofs/decompressor.c
+++ b/fs/erofs/decompressor.c
@@ -121,11 +121,11 @@ static int z_erofs_lz4_prepare_dstpages(struct z_erofs_lz4_decompress_ctx *ctx,
}
static void *z_erofs_lz4_handle_overlap(struct z_erofs_lz4_decompress_ctx *ctx,
- void *inpage, unsigned int *inputmargin, int *maptype,
- bool may_inplace)
+ void *inpage, void *out, unsigned int *inputmargin,
+ int *maptype, bool may_inplace)
{
struct z_erofs_decompress_req *rq = ctx->rq;
- unsigned int omargin, total, i, j;
+ unsigned int omargin, total, i;
struct page **in;
void *src, *tmp;
@@ -135,12 +135,13 @@ static void *z_erofs_lz4_handle_overlap(struct z_erofs_lz4_decompress_ctx *ctx,
omargin < LZ4_DECOMPRESS_INPLACE_MARGIN(rq->inputsize))
goto docopy;
- for (i = 0; i < ctx->inpages; ++i) {
- DBG_BUGON(rq->in[i] == NULL);
- for (j = 0; j < ctx->outpages - ctx->inpages + i; ++j)
- if (rq->out[j] == rq->in[i])
- goto docopy;
- }
+ for (i = 0; i < ctx->inpages; ++i)
+ if (rq->out[ctx->outpages - ctx->inpages + i] !=
+ rq->in[i])
+ goto docopy;
+ kunmap_local(inpage);
+ *maptype = 3;
+ return out + ((ctx->outpages - ctx->inpages) << PAGE_SHIFT);
}
if (ctx->inpages <= 1) {
@@ -148,7 +149,6 @@ static void *z_erofs_lz4_handle_overlap(struct z_erofs_lz4_decompress_ctx *ctx,
return inpage;
}
kunmap_local(inpage);
- might_sleep();
src = erofs_vm_map_ram(rq->in, ctx->inpages);
if (!src)
return ERR_PTR(-ENOMEM);
@@ -204,12 +204,12 @@ int z_erofs_fixup_insize(struct z_erofs_decompress_req *rq, const char *padbuf,
}
static int z_erofs_lz4_decompress_mem(struct z_erofs_lz4_decompress_ctx *ctx,
- u8 *out)
+ u8 *dst)
{
struct z_erofs_decompress_req *rq = ctx->rq;
bool support_0padding = false, may_inplace = false;
unsigned int inputmargin;
- u8 *headpage, *src;
+ u8 *out, *headpage, *src;
int ret, maptype;
DBG_BUGON(*rq->in == NULL);
@@ -230,11 +230,12 @@ static int z_erofs_lz4_decompress_mem(struct z_erofs_lz4_decompress_ctx *ctx,
}
inputmargin = rq->pageofs_in;
- src = z_erofs_lz4_handle_overlap(ctx, headpage, &inputmargin,
+ src = z_erofs_lz4_handle_overlap(ctx, headpage, dst, &inputmargin,
&maptype, may_inplace);
if (IS_ERR(src))
return PTR_ERR(src);
+ out = dst + rq->pageofs_out;
/* legacy format could compress extra data in a pcluster. */
if (rq->partial_decoding || !support_0padding)
ret = LZ4_decompress_safe_partial(src + inputmargin, out,
@@ -265,7 +266,7 @@ static int z_erofs_lz4_decompress_mem(struct z_erofs_lz4_decompress_ctx *ctx,
vm_unmap_ram(src, ctx->inpages);
} else if (maptype == 2) {
erofs_put_pcpubuf(src);
- } else {
+ } else if (maptype != 3) {
DBG_BUGON(1);
return -EFAULT;
}
@@ -308,7 +309,7 @@ static int z_erofs_lz4_decompress(struct z_erofs_decompress_req *rq,
}
dstmap_out:
- ret = z_erofs_lz4_decompress_mem(&ctx, dst + rq->pageofs_out);
+ ret = z_erofs_lz4_decompress_mem(&ctx, dst);
if (!dst_maptype)
kunmap_local(dst);
else if (dst_maptype == 2)
The quilt patch titled
Subject: mm: cachestat: fix folio read-after-free in cache walk
has been removed from the -mm tree. Its filename was
mm-cachestat-fix-folio-read-after-free-in-cache-walk.patch
This patch was dropped because it was merged into the mm-hotfixes-stable branch
of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
------------------------------------------------------
From: Nhat Pham <nphamcs(a)gmail.com>
Subject: mm: cachestat: fix folio read-after-free in cache walk
Date: Mon, 19 Feb 2024 19:01:21 -0800
In cachestat, we access the folio from the page cache's xarray to compute
its page offset, and check for its dirty and writeback flags. However, we
do not hold a reference to the folio before performing these actions,
which means the folio can concurrently be released and reused as another
folio/page/slab.
Get around this altogether by just using xarray's existing machinery for
the folio page offsets and dirty/writeback states.
This changes behavior for tmpfs files to now always report zeroes in their
dirty and writeback counters. This is okay as tmpfs doesn't follow
conventional writeback cache behavior: its pages get "cleaned" during
swapout, after which they're no longer resident etc.
Link: https://lkml.kernel.org/r/20240220153409.GA216065@cmpxchg.org
Fixes: cf264e1329fb ("cachestat: implement cachestat syscall")
Reported-by: Jann Horn <jannh(a)google.com>
Suggested-by: Matthew Wilcox <willy(a)infradead.org>
Signed-off-by: Nhat Pham <nphamcs(a)gmail.com>
Signed-off-by: Johannes Weiner <hannes(a)cmpxchg.org>
Tested-by: Jann Horn <jannh(a)google.com>
Cc: <stable(a)vger.kernel.org> [6.4+]
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/filemap.c | 51 ++++++++++++++++++++++++-------------------------
1 file changed, 26 insertions(+), 25 deletions(-)
--- a/mm/filemap.c~mm-cachestat-fix-folio-read-after-free-in-cache-walk
+++ a/mm/filemap.c
@@ -4111,28 +4111,40 @@ static void filemap_cachestat(struct add
rcu_read_lock();
xas_for_each(&xas, folio, last_index) {
+ int order;
unsigned long nr_pages;
pgoff_t folio_first_index, folio_last_index;
+ /*
+ * Don't deref the folio. It is not pinned, and might
+ * get freed (and reused) underneath us.
+ *
+ * We *could* pin it, but that would be expensive for
+ * what should be a fast and lightweight syscall.
+ *
+ * Instead, derive all information of interest from
+ * the rcu-protected xarray.
+ */
+
if (xas_retry(&xas, folio))
continue;
+ order = xa_get_order(xas.xa, xas.xa_index);
+ nr_pages = 1 << order;
+ folio_first_index = round_down(xas.xa_index, 1 << order);
+ folio_last_index = folio_first_index + nr_pages - 1;
+
+ /* Folios might straddle the range boundaries, only count covered pages */
+ if (folio_first_index < first_index)
+ nr_pages -= first_index - folio_first_index;
+
+ if (folio_last_index > last_index)
+ nr_pages -= folio_last_index - last_index;
+
if (xa_is_value(folio)) {
/* page is evicted */
void *shadow = (void *)folio;
bool workingset; /* not used */
- int order = xa_get_order(xas.xa, xas.xa_index);
-
- nr_pages = 1 << order;
- folio_first_index = round_down(xas.xa_index, 1 << order);
- folio_last_index = folio_first_index + nr_pages - 1;
-
- /* Folios might straddle the range boundaries, only count covered pages */
- if (folio_first_index < first_index)
- nr_pages -= first_index - folio_first_index;
-
- if (folio_last_index > last_index)
- nr_pages -= folio_last_index - last_index;
cs->nr_evicted += nr_pages;
@@ -4150,24 +4162,13 @@ static void filemap_cachestat(struct add
goto resched;
}
- nr_pages = folio_nr_pages(folio);
- folio_first_index = folio_pgoff(folio);
- folio_last_index = folio_first_index + nr_pages - 1;
-
- /* Folios might straddle the range boundaries, only count covered pages */
- if (folio_first_index < first_index)
- nr_pages -= first_index - folio_first_index;
-
- if (folio_last_index > last_index)
- nr_pages -= folio_last_index - last_index;
-
/* page is in cache */
cs->nr_cache += nr_pages;
- if (folio_test_dirty(folio))
+ if (xas_get_mark(&xas, PAGECACHE_TAG_DIRTY))
cs->nr_dirty += nr_pages;
- if (folio_test_writeback(folio))
+ if (xas_get_mark(&xas, PAGECACHE_TAG_WRITEBACK))
cs->nr_writeback += nr_pages;
resched:
_
Patches currently in -mm which might be from nphamcs(a)gmail.com are
The patch titled
Subject: init/Kconfig: lower GCC version check for -Warray-bounds
has been added to the -mm mm-hotfixes-unstable branch. Its filename is
init-kconfig-lower-gcc-version-check-for-warray-bounds.patch
This patch will shortly appear at
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patche…
This patch will later appear in the mm-hotfixes-unstable branch at
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next via the mm-everything
branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there every 2-3 working days
------------------------------------------------------
From: Kees Cook <keescook(a)chromium.org>
Subject: init/Kconfig: lower GCC version check for -Warray-bounds
Date: Fri, 23 Feb 2024 09:08:27 -0800
We continue to see false positives from -Warray-bounds even in GCC 10,
which is getting reported in a few places[1] still:
security/security.c:811:2: warning: `memcpy' offset 32 is out of the bounds [0, 0] [-Warray-bounds]
Lower the GCC version check from 11 to 10.
Link: https://lkml.kernel.org/r/20240223170824.work.768-kees@kernel.org
Reported-by: Lu Yao <yaolu(a)kylinos.cn>
Closes: https://lore.kernel.org/lkml/20240117014541.8887-1-yaolu@kylinos.cn/
Link: https://lore.kernel.org/linux-next/65d84438.620a0220.7d171.81a7@mx.google.c… [1]
Signed-off-by: Kees Cook <keescook(a)chromium.org>
Cc: Ard Biesheuvel <ardb(a)kernel.org>
Cc: Christophe Leroy <christophe.leroy(a)csgroup.eu>
Cc: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Cc: "Gustavo A. R. Silva" <gustavoars(a)kernel.org>
Cc: Johannes Weiner <hannes(a)cmpxchg.org>
Cc: Marc Aur��le La France <tsi(a)tuyoix.net>
Cc: Masahiro Yamada <masahiroy(a)kernel.org>
Cc: Nathan Chancellor <nathan(a)kernel.org>
Cc: Nhat Pham <nphamcs(a)gmail.com>
Cc: Paul Moore <paul(a)paul-moore.com>
Cc: Petr Mladek <pmladek(a)suse.com>
Cc: Randy Dunlap <rdunlap(a)infradead.org>
Cc: Suren Baghdasaryan <surenb(a)google.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
init/Kconfig | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
--- a/init/Kconfig~init-kconfig-lower-gcc-version-check-for-warray-bounds
+++ a/init/Kconfig
@@ -876,14 +876,14 @@ config CC_IMPLICIT_FALLTHROUGH
default "-Wimplicit-fallthrough=5" if CC_IS_GCC && $(cc-option,-Wimplicit-fallthrough=5)
default "-Wimplicit-fallthrough" if CC_IS_CLANG && $(cc-option,-Wunreachable-code-fallthrough)
-# Currently, disable gcc-11+ array-bounds globally.
+# Currently, disable gcc-10+ array-bounds globally.
# It's still broken in gcc-13, so no upper bound yet.
-config GCC11_NO_ARRAY_BOUNDS
+config GCC10_NO_ARRAY_BOUNDS
def_bool y
config CC_NO_ARRAY_BOUNDS
bool
- default y if CC_IS_GCC && GCC_VERSION >= 110000 && GCC11_NO_ARRAY_BOUNDS
+ default y if CC_IS_GCC && GCC_VERSION >= 100000 && GCC10_NO_ARRAY_BOUNDS
# Currently, disable -Wstringop-overflow for GCC globally.
config GCC_NO_STRINGOP_OVERFLOW
_
Patches currently in -mm which might be from keescook(a)chromium.org are
init-kconfig-lower-gcc-version-check-for-warray-bounds.patch