Patch "fs/aio: Make io_cancel() generate completions again" is based on the
assumption that calling kiocb->ki_cancel() does not complete R/W requests.
This is incorrect: the two drivers that call kiocb_set_cancel_fn() set a
cancellation function that calls usb_ep_dequeue(). According to its
documentation, usb_ep_dequeue() calls the completion routine with status
-ECONNRESET. Hence this revert.
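The double completion can be illustrated with a small userspace model. This is only a sketch: struct req_model and the helper names below are hypothetical and are not the fs/aio.c data structures. It assumes, as the usb_ep_dequeue() documentation states, that the cancel callback itself completes the request.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical request model: counts how often it was completed. */
struct req_model {
	int completions;
	int status;
};

static void complete_req(struct req_model *r, int status)
{
	assert(r != NULL);
	r->completions++;
	r->status = status;
}

/* Models a ki_cancel() callback that completes the request itself. */
static int cancel_completes(struct req_model *r)
{
	complete_req(r, -ECONNRESET);
	return 0;
}

/* Behaviour being reverted: complete again after a successful cancel. */
static int cancel_then_complete(struct req_model *r)
{
	int ret = cancel_completes(r);

	if (ret == 0)
		complete_req(r, -EINTR);	/* second completion */
	return ret;
}

/* Behaviour after the revert: report -EINPROGRESS, complete only once. */
static int cancel_only(struct req_model *r)
{
	int ret = cancel_completes(r);

	if (!ret)
		ret = -EINPROGRESS;
	return ret;
}
```

In this model cancel_then_complete() leaves the request completed twice, while cancel_only() leaves the single completion issued by the cancel callback in place.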
Cc: Benjamin LaHaise <ben(a)communityfibre.ca>
Cc: Eric Biggers <ebiggers(a)google.com>
Cc: Christoph Hellwig <hch(a)lst.de>
Cc: Avi Kivity <avi(a)scylladb.com>
Cc: Sandeep Dhavale <dhavale(a)google.com>
Cc: Jens Axboe <axboe(a)kernel.dk>
Cc: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Cc: Kent Overstreet <kent.overstreet(a)linux.dev>
Cc: stable(a)vger.kernel.org
Reported-by: syzbot+b91eb2ed18f599dd3c31(a)syzkaller.appspotmail.com
Fixes: 54cbc058d86b ("fs/aio: Make io_cancel() generate completions again")
Signed-off-by: Bart Van Assche <bvanassche(a)acm.org>
---
fs/aio.c | 27 ++++++++++++++++-----------
1 file changed, 16 insertions(+), 11 deletions(-)
diff --git a/fs/aio.c b/fs/aio.c
index 28223f511931..da18dbcfcb22 100644
--- a/fs/aio.c
+++ b/fs/aio.c
@@ -2165,11 +2165,14 @@ COMPAT_SYSCALL_DEFINE3(io_submit, compat_aio_context_t, ctx_id,
#endif
/* sys_io_cancel:
- * Attempts to cancel an iocb previously passed to io_submit(). If the
- * operation is successfully cancelled 0 is returned. May fail with
- * -EFAULT if any of the data structures pointed to are invalid. May
- * fail with -EINVAL if aio_context specified by ctx_id is invalid. Will
- * fail with -ENOSYS if not implemented.
+ * Attempts to cancel an iocb previously passed to io_submit. If
+ * the operation is successfully cancelled, the resulting event is
+ * copied into the memory pointed to by result without being placed
+ * into the completion queue and 0 is returned. May fail with
+ * -EFAULT if any of the data structures pointed to are invalid.
+ * May fail with -EINVAL if aio_context specified by ctx_id is
+ * invalid. May fail with -EAGAIN if the iocb specified was not
+ * cancelled. Will fail with -ENOSYS if not implemented.
*/
SYSCALL_DEFINE3(io_cancel, aio_context_t, ctx_id, struct iocb __user *, iocb,
struct io_event __user *, result)
@@ -2200,12 +2203,14 @@ SYSCALL_DEFINE3(io_cancel, aio_context_t, ctx_id, struct iocb __user *, iocb,
}
spin_unlock_irq(&ctx->ctx_lock);
- /*
- * The result argument is no longer used - the io_event is always
- * delivered via the ring buffer.
- */
- if (ret == 0 && kiocb->rw.ki_flags & IOCB_AIO_RW)
- aio_complete_rw(&kiocb->rw, -EINTR);
+ if (!ret) {
+ /*
+ * The result argument is no longer used - the io_event is
+ * always delivered via the ring buffer. -EINPROGRESS indicates
+ * cancellation is in progress:
+ */
+ ret = -EINPROGRESS;
+ }
percpu_ref_put(&ctx->users);
This is the start of the stable review cycle for the 5.15.151 release.
There are 84 patches in this series, all will be posted as a response
to this one. If anyone has any issues with these being applied, please
let me know.
Responses should be made by Wed, 06 Mar 2024 21:15:26 +0000.
Anything received after that time might be too late.
The whole patch series can be found in one patch at:
https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.15.151-r…
or in the git tree and branch at:
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.15.y
and the diffstat can be found below.
thanks,
greg k-h
-------------
Pseudo-Shortlog of commits:
Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Linux 5.15.151-rc1
Davide Caratti <dcaratti(a)redhat.com>
mptcp: fix double-free on socket dismantle
Gal Pressman <gal(a)nvidia.com>
Revert "tls: rx: move counting TlsDecryptErrors for sync"
Jakub Kicinski <kuba(a)kernel.org>
net: tls: fix async vs NIC crypto offload
Martynas Pumputis <m(a)lambda.lt>
bpf: Derive source IP addr via bpf_*_fib_lookup()
Louis DeLosSantos <louis.delos.devel(a)gmail.com>
bpf: Add table ID to bpf_fib_lookup BPF helper
Martin KaFai Lau <martin.lau(a)kernel.org>
bpf: Add BPF_FIB_LOOKUP_SKIP_NEIGH for bpf_fib_lookup
Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Revert "interconnect: Teach lockdep about icc_bw_lock order"
Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Revert "interconnect: Fix locking for runpm vs reclaim"
Bartosz Golaszewski <bartosz.golaszewski(a)linaro.org>
gpio: fix resource unwinding order in error path
Andy Shevchenko <andriy.shevchenko(a)linux.intel.com>
gpiolib: Fix the error path order in gpiochip_add_data_with_key()
Arturas Moskvinas <arturas.moskvinas(a)gmail.com>
gpio: 74x164: Enable output pins after registers are reset
Kuniyuki Iwashima <kuniyu(a)amazon.com>
af_unix: Drop oob_skb ref before purging queue in GC.
Max Krummenacher <max.krummenacher(a)toradex.com>
Revert "drm/bridge: lt8912b: Register and attach our DSI device at probe"
Oscar Salvador <osalvador(a)suse.de>
fs,hugetlb: fix NULL pointer dereference in hugetlbs_fill_super
Baokun Li <libaokun1(a)huawei.com>
cachefiles: fix memory leak in cachefiles_add_cache()
Paolo Abeni <pabeni(a)redhat.com>
mptcp: fix possible deadlock in subflow diag
Paolo Abeni <pabeni(a)redhat.com>
mptcp: push at DSS boundaries
Geliang Tang <tanggeliang(a)kylinos.cn>
mptcp: add needs_id for netlink appending addr
Jean Sacren <sakiwit(a)gmail.com>
mptcp: clean up harmless false expressions
Matthieu Baerts (NGI0) <matttbe(a)kernel.org>
selftests: mptcp: add missing kconfig for NF Filter in v6
Matthieu Baerts (NGI0) <matttbe(a)kernel.org>
selftests: mptcp: add missing kconfig for NF Filter
Paolo Abeni <pabeni(a)redhat.com>
mptcp: rename timer related helper to less confusing names
Paolo Abeni <pabeni(a)redhat.com>
mptcp: process pending subflow error on close
Paolo Abeni <pabeni(a)redhat.com>
mptcp: move __mptcp_error_report in protocol.c
Paolo Bonzini <pbonzini(a)redhat.com>
x86/cpu/intel: Detect TME keyid bits before setting MTRR mask registers
Bjorn Andersson <quic_bjorande(a)quicinc.com>
pmdomain: qcom: rpmhpd: Fix enabled_corner aggregation
Zong Li <zong.li(a)sifive.com>
riscv: add CALLER_ADDRx support
Elad Nachman <enachman(a)marvell.com>
mmc: sdhci-xenon: fix PHY init clock stability
Elad Nachman <enachman(a)marvell.com>
mmc: sdhci-xenon: add timeout for PHY init complete
Ivan Semenov <ivan(a)semenov.dev>
mmc: core: Fix eMMC initialization with 1-bit bus connection
Curtis Klein <curtis.klein(a)hpe.com>
dmaengine: fsl-qdma: init irq after reg initialization
Tadeusz Struk <tstruk(a)gigaio.com>
dmaengine: ptdma: use consistent DMA masks
Peng Ma <peng.ma(a)nxp.com>
dmaengine: fsl-qdma: fix SoC may hang on 16 byte unaligned read
David Sterba <dsterba(a)suse.com>
btrfs: dev-replace: properly validate device names
Johannes Berg <johannes.berg(a)intel.com>
wifi: nl80211: reject iftype change with mesh ID change
Alexander Ofitserov <oficerovas(a)altlinux.org>
gtp: fix use-after-free and null-ptr-deref in gtp_newlink()
Takashi Sakamoto <o-takashi(a)sakamocchi.jp>
ALSA: firewire-lib: fix to check cycle continuity
Tetsuo Handa <penguin-kernel(a)I-love.SAKURA.ne.jp>
tomoyo: fix UAF write bug in tomoyo_write_control()
Dimitris Vlachos <dvlachos(a)ics.forth.gr>
riscv: Sparse-Memory/vmemmap out-of-bounds fix
David Howells <dhowells(a)redhat.com>
afs: Fix endless loop in directory parsing
Jiri Slaby (SUSE) <jirislaby(a)kernel.org>
fbcon: always restore the old font data in fbcon_do_set_font()
Takashi Iwai <tiwai(a)suse.de>
ALSA: Drop leftover snd-rtctimer stuff from Makefile
Hans de Goede <hdegoede(a)redhat.com>
power: supply: bq27xxx-i2c: Do not free non existing IRQ
Arnd Bergmann <arnd(a)arndb.de>
efi/capsule-loader: fix incorrect allocation size
Sabrina Dubroca <sd(a)queasysnail.net>
tls: decrement decrypt_pending if no async completion will be called
Jakub Kicinski <kuba(a)kernel.org>
tls: rx: use async as an in-out argument
Jakub Kicinski <kuba(a)kernel.org>
tls: rx: assume crypto always calls our callback
Jakub Kicinski <kuba(a)kernel.org>
tls: rx: move counting TlsDecryptErrors for sync
Jakub Kicinski <kuba(a)kernel.org>
tls: rx: don't track the async count
Jakub Kicinski <kuba(a)kernel.org>
tls: rx: factor out writing ContentType to cmsg
Jakub Kicinski <kuba(a)kernel.org>
tls: rx: wrap decryption arguments in a structure
Jakub Kicinski <kuba(a)kernel.org>
tls: rx: don't report text length from the bowels of decrypt
Jakub Kicinski <kuba(a)kernel.org>
tls: rx: drop unnecessary arguments from tls_setup_from_iter()
Jakub Kicinski <kuba(a)kernel.org>
tls: hw: rx: use return value of tls_device_decrypted() to carry status
Jakub Kicinski <kuba(a)kernel.org>
tls: rx: refactor decrypt_skb_update()
Jakub Kicinski <kuba(a)kernel.org>
tls: rx: don't issue wake ups when data is decrypted
Jakub Kicinski <kuba(a)kernel.org>
tls: rx: don't store the decryption status in socket context
Jakub Kicinski <kuba(a)kernel.org>
tls: rx: don't store the record type in socket context
Oleksij Rempel <linux(a)rempel-privat.de>
igb: extend PTP timestamp adjustments to i211
Lin Ma <linma(a)zju.edu.cn>
rtnetlink: fix error logic of IFLA_BRIDGE_FLAGS writing back
Florian Westphal <fw(a)strlen.de>
netfilter: bridge: confirm multicast packets before passing them up the stack
Florian Westphal <fw(a)strlen.de>
netfilter: let reset rules clean out conntrack entries
Florian Westphal <fw(a)strlen.de>
netfilter: make function op structures const
Florian Westphal <fw(a)strlen.de>
netfilter: core: move ip_ct_attach indirection to struct nf_ct_hook
Florian Westphal <fw(a)strlen.de>
netfilter: nfnetlink_queue: silence bogus compiler warning
Ignat Korchagin <ignat(a)cloudflare.com>
netfilter: nf_tables: allow NFPROTO_INET in nft_(match/target)_validate()
Kai-Heng Feng <kai.heng.feng(a)canonical.com>
Bluetooth: Enforce validation on max value of connection interval
Luiz Augusto von Dentz <luiz.von.dentz(a)intel.com>
Bluetooth: hci_event: Fix handling of HCI_EV_IO_CAPA_REQUEST
Zijun Hu <quic_zijuhu(a)quicinc.com>
Bluetooth: hci_event: Fix wrongly recorded wakeup BD_ADDR
Ying Hsu <yinghsu(a)chromium.org>
Bluetooth: Avoid potential use-after-free in hci_error_reset
Jakub Raczynski <j.raczynski(a)samsung.com>
stmmac: Clear variable when destroying workqueue
Justin Iurman <justin.iurman(a)uliege.be>
uapi: in6: replace temporary label with rfc9486
Javier Carrasco <javier.carrasco.cruz(a)gmail.com>
net: usb: dm9601: fix wrong return value in dm9601_mdio_read
Jakub Kicinski <kuba(a)kernel.org>
veth: try harder when allocating queue memory
Vasily Averin <vvs(a)openvz.org>
net: enable memcg accounting for veth queues
Oleksij Rempel <linux(a)rempel-privat.de>
lan78xx: enable auto speed configuration for LAN7850 if no EEPROM is detected
Eric Dumazet <edumazet(a)google.com>
ipv6: fix potential "struct net" leak in inet6_rtm_getaddr()
Jakub Kicinski <kuba(a)kernel.org>
net: veth: clear GRO when clearing XDP even when down
Doug Smythies <dsmythies(a)telus.net>
cpufreq: intel_pstate: fix pstate limits enforcement for adjust_perf call back
Yunjian Wang <wangyunjian(a)huawei.com>
tun: Fix xdp_rxq_info's queue_index when detaching
Florian Westphal <fw(a)strlen.de>
net: ip_tunnel: prevent perpetual headroom growth
Ryosuke Yasuoka <ryasuoka(a)redhat.com>
netlink: Fix kernel-infoleak-after-free in __skb_datagram_iter
Han Xu <han.xu(a)nxp.com>
mtd: spinand: gigadevice: Fix the get ecc status issue
Pablo Neira Ayuso <pablo(a)netfilter.org>
netfilter: nf_tables: disallow timeout for anonymous sets
-------------
Diffstat:
Makefile | 4 +-
arch/riscv/include/asm/ftrace.h | 5 +
arch/riscv/include/asm/pgtable.h | 2 +-
arch/riscv/kernel/Makefile | 2 +
arch/riscv/kernel/return_address.c | 48 ++++
arch/x86/kernel/cpu/intel.c | 178 ++++++------
drivers/cpufreq/intel_pstate.c | 3 +
drivers/dma/fsl-qdma.c | 25 +-
drivers/dma/ptdma/ptdma-dmaengine.c | 2 -
drivers/firmware/efi/capsule-loader.c | 2 +-
drivers/gpio/gpio-74x164.c | 4 +-
drivers/gpio/gpiolib.c | 12 +-
drivers/gpu/drm/bridge/lontium-lt8912b.c | 11 +-
drivers/interconnect/core.c | 18 +-
drivers/mmc/core/mmc.c | 2 +
drivers/mmc/host/sdhci-xenon-phy.c | 48 +++-
drivers/mtd/nand/spi/gigadevice.c | 6 +-
drivers/net/ethernet/intel/igb/igb_ptp.c | 5 +-
drivers/net/ethernet/stmicro/stmmac/stmmac_main.c | 4 +-
drivers/net/gtp.c | 12 +-
drivers/net/tun.c | 1 +
drivers/net/usb/dm9601.c | 2 +-
drivers/net/usb/lan78xx.c | 3 +-
drivers/net/veth.c | 40 +--
drivers/power/supply/bq27xxx_battery_i2c.c | 4 +-
drivers/soc/qcom/rpmhpd.c | 7 +-
drivers/video/fbdev/core/fbcon.c | 8 +-
fs/afs/dir.c | 4 +-
fs/btrfs/dev-replace.c | 24 +-
fs/cachefiles/bind.c | 3 +
fs/hugetlbfs/inode.c | 6 +-
include/linux/netfilter.h | 14 +-
include/net/ipv6_stubs.h | 5 +
include/net/netfilter/nf_conntrack.h | 8 +
include/net/strparser.h | 4 +
include/net/tls.h | 11 +-
include/uapi/linux/bpf.h | 37 ++-
include/uapi/linux/in6.h | 2 +-
net/bluetooth/hci_core.c | 7 +-
net/bluetooth/hci_event.c | 13 +-
net/bluetooth/l2cap_core.c | 8 +-
net/bridge/br_netfilter_hooks.c | 96 +++++++
net/bridge/netfilter/nf_conntrack_bridge.c | 30 ++
net/core/filter.c | 67 ++++-
net/core/rtnetlink.c | 11 +-
net/ipv4/ip_tunnel.c | 28 +-
net/ipv4/netfilter/nf_reject_ipv4.c | 1 +
net/ipv6/addrconf.c | 7 +-
net/ipv6/af_inet6.c | 1 +
net/ipv6/netfilter/nf_reject_ipv6.c | 1 +
net/mptcp/diag.c | 3 +
net/mptcp/pm_netlink.c | 30 +-
net/mptcp/protocol.c | 123 +++++++--
net/mptcp/subflow.c | 36 ---
net/netfilter/core.c | 45 +--
net/netfilter/nf_conntrack_core.c | 21 +-
net/netfilter/nf_conntrack_netlink.c | 4 +-
net/netfilter/nf_conntrack_proto_tcp.c | 35 +++
net/netfilter/nf_nat_core.c | 2 +-
net/netfilter/nf_tables_api.c | 7 +
net/netfilter/nfnetlink_queue.c | 10 +-
net/netfilter/nft_compat.c | 20 ++
net/netlink/af_netlink.c | 2 +-
net/tls/tls_device.c | 6 +-
net/tls/tls_sw.c | 316 ++++++++++------------
net/unix/garbage.c | 22 +-
net/wireless/nl80211.c | 2 +
security/tomoyo/common.c | 3 +-
sound/core/Makefile | 1 -
sound/firewire/amdtp-stream.c | 2 +-
tools/include/uapi/linux/bpf.h | 37 ++-
tools/testing/selftests/net/mptcp/config | 2 +
72 files changed, 1046 insertions(+), 529 deletions(-)
Larry Finger <Larry.Finger(a)gmail.com> wrote:
> From: Nick Morrow <morrownr(a)gmail.com>
>
> Add VID/PIDs that are known to be missing for this driver.
>
> Removed /* 8811CU */ and /* 8821CU */ as they are redundant
> since the file is specific to those chips.
>
> Removed /* TOTOLINK A650UA v3 */ as the manufacturer. It has a REALTEK
> VID so it may not be specific to this adapter.
>
> Verified and tested.
>
> Cc: stable(a)vger.kernel.org
> Signed-off-by: Nick Morrow <morrownr(a)gmail.com>
> Signed-off-by: Larry Finger <Larry.Finger(a)lwfinger.net>
> Acked-by: Ping-Ke Shih <pkshih(a)realtek.com>
Patch applied to wireless-next.git, thanks.
b8a62478f3b1 wifi: rtw88: Add missing VID/PIDs for 8811CU and 8821CU
--
https://patchwork.kernel.org/project/linux-wireless/patch/4ume7mjw63u7.XlMU…
https://wireless.wiki.kernel.org/en/developers/documentation/submittingpatc…
The patch titled
Subject: mm: swap: fix race between free_swap_and_cache() and swapoff()
has been added to the -mm mm-unstable branch. Its filename is
mm-swap-fix-race-between-free_swap_and_cache-and-swapoff.patch
This patch will shortly appear at
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patche…
This patch will later appear in the mm-unstable branch at
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next via the mm-everything
branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there every 2-3 working days
------------------------------------------------------
From: Ryan Roberts <ryan.roberts(a)arm.com>
Subject: mm: swap: fix race between free_swap_and_cache() and swapoff()
Date: Tue, 5 Mar 2024 15:13:49 +0000
There was previously a theoretical window where swapoff() could run and
tear down a swap_info_struct while a call to free_swap_and_cache() was
running in another thread. This could cause, amongst other bad
possibilities, swap_page_trans_huge_swapped() (called by
free_swap_and_cache()) to access the freed memory for swap_map.
This is a theoretical problem and I haven't been able to provoke it from a
test case. But there has been agreement based on code review that this is
possible (see link below).
Fix it by using get_swap_device()/put_swap_device(), which will stall
swapoff(). There was an extra check in _swap_info_get() to confirm that
the swap entry was valid. This wasn't present in get_swap_device() so
I've added it. I couldn't find any existing get_swap_device() call sites
where this extra check would cause any false alarms.
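The get/put guard can be sketched in userspace as follows. This is a simplified model, not the mm/swapfile.c code: struct dev_model, dev_get(), dev_put() and dev_try_teardown() are hypothetical names, a plain atomic counter stands in for the percpu_ref, and teardown simply fails while references are held instead of waiting as swapoff() does.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for swap_info_struct; not the kernel layout. */
struct dev_model {
	atomic_int users;	/* > 0 while the device is live */
	unsigned char map[8];	/* stand-in for swap_map */
	size_t max;
};

/*
 * Take a reference only if the device is live, the offset is in range
 * and the slot is in use (the extra check added by the patch).
 */
static struct dev_model *dev_get(struct dev_model *d, size_t off)
{
	int old = atomic_load(&d->users);

	do {
		if (old == 0)
			return NULL;	/* teardown already happened */
	} while (!atomic_compare_exchange_weak(&d->users, &old, old + 1));

	if (off >= d->max || !d->map[off]) {
		atomic_fetch_sub(&d->users, 1);
		return NULL;
	}
	return d;
}

static void dev_put(struct dev_model *d)
{
	int prev = atomic_fetch_sub(&d->users, 1);

	assert(prev > 0);
	(void)prev;
}

/* Teardown succeeds only when nothing but the initial reference is left. */
static bool dev_try_teardown(struct dev_model *d)
{
	int expected = 1;

	return atomic_compare_exchange_strong(&d->users, &expected, 0);
}
```

While a caller holds a reference from dev_get(), dev_try_teardown() cannot succeed, which is the property free_swap_and_cache() relies on in the fix.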
Details of how to provoke one possible issue (thanks to David Hildenbrand
for deriving this):
--8<-----
__swap_entry_free() might be the last user and result in
"count == SWAP_HAS_CACHE".
swapoff->try_to_unuse() will stop as soon as si->inuse_pages==0.
So the question is: could someone reclaim the folio and turn
si->inuse_pages==0, before we completed swap_page_trans_huge_swapped().
Imagine the following: 2 MiB folio in the swapcache. Only 2 subpages are
still referenced by swap entries.
Process 1 still references subpage 0 via swap entry.
Process 2 still references subpage 1 via swap entry.
Process 1 quits. Calls free_swap_and_cache().
-> count == SWAP_HAS_CACHE
[then, preempted in the hypervisor etc.]
Process 2 quits. Calls free_swap_and_cache().
-> count == SWAP_HAS_CACHE
Process 2 goes ahead, passes swap_page_trans_huge_swapped(), and calls
__try_to_reclaim_swap().
__try_to_reclaim_swap()->folio_free_swap()->delete_from_swap_cache()->
put_swap_folio()->free_swap_slot()->swapcache_free_entries()->
swap_entry_free()->swap_range_free()->
...
WRITE_ONCE(si->inuse_pages, si->inuse_pages - nr_entries);
What stops swapoff from succeeding after process 2 reclaimed the swap cache
but before process 1 finished its call to swap_page_trans_huge_swapped()?
--8<-----
Link: https://lkml.kernel.org/r/20240305151349.3781428-1-ryan.roberts@arm.com
Fixes: 7c00bafee87c ("mm/swap: free swap slots in batch")
Closes: https://lore.kernel.org/linux-mm/65a66eb9-41f8-4790-8db2-0c70ea15979f@redha…
Signed-off-by: Ryan Roberts <ryan.roberts(a)arm.com>
Cc: David Hildenbrand <david(a)redhat.com>
Cc: "Huang, Ying" <ying.huang(a)intel.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/swapfile.c | 14 +++++++++++---
1 file changed, 11 insertions(+), 3 deletions(-)
--- a/mm/swapfile.c~mm-swap-fix-race-between-free_swap_and_cache-and-swapoff
+++ a/mm/swapfile.c
@@ -1281,7 +1281,9 @@ struct swap_info_struct *get_swap_device
smp_rmb();
offset = swp_offset(entry);
if (offset >= si->max)
- goto put_out;
+ goto bad_offset;
+ if (data_race(!si->swap_map[swp_offset(entry)]))
+ goto bad_free;
return si;
bad_nofile:
@@ -1289,9 +1291,14 @@ bad_nofile:
out:
return NULL;
put_out:
- pr_err("%s: %s%08lx\n", __func__, Bad_offset, entry.val);
percpu_ref_put(&si->users);
return NULL;
+bad_offset:
+ pr_err("%s: %s%08lx\n", __func__, Bad_offset, entry.val);
+ goto put_out;
+bad_free:
+ pr_err("%s: %s%08lx\n", __func__, Unused_offset, entry.val);
+ goto put_out;
}
static unsigned char __swap_entry_free(struct swap_info_struct *p,
@@ -1609,13 +1616,14 @@ int free_swap_and_cache(swp_entry_t entr
if (non_swap_entry(entry))
return 1;
- p = _swap_info_get(entry);
+ p = get_swap_device(entry);
if (p) {
count = __swap_entry_free(p, entry);
if (count == SWAP_HAS_CACHE &&
!swap_page_trans_huge_swapped(p, entry))
__try_to_reclaim_swap(p, swp_offset(entry),
TTRS_UNMAPPED | TTRS_FULL);
+ put_swap_device(p);
}
return p != NULL;
}
_
Patches currently in -mm which might be from ryan.roberts(a)arm.com are
mm-swap-fix-race-between-free_swap_and_cache-and-swapoff.patch
From: Zi Yan <ziy(a)nvidia.com>
The tail pages in a THP can have swap entry information stored in their
private field. When migrating to a new page, all tail pages of the new
page need to update ->private to avoid future data corruption.
Corresponding swapcache entries need to be updated as well.
e71769ae5260 ("mm: enable thp migration for shmem thp") fixed it already.
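A minimal userspace sketch of the per-subpage copy, assuming a simplified page model: struct page_model, its private_val field and copy_private() are hypothetical stand-ins, not the kernel's struct page or helpers.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified page: only the field the fix cares about. */
struct page_model {
	unsigned long private_val;	/* stand-in for page_private() */
};

/*
 * Copy the private value of every subpage of a compound page of the
 * given order, not just the head page.
 */
static void copy_private(struct page_model *newpage,
			 const struct page_model *page, unsigned int order)
{
	size_t i, nr = (size_t)1 << order;

	assert(newpage != NULL && page != NULL);
	for (i = 0; i < nr; i++)
		newpage[i].private_val = page[i].private_val;
}
```

With order 0 this degenerates to the old single-page copy, so base pages behave as before.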
Fixes: 616b8371539a ("mm: thp: enable thp migration in generic path")
Signed-off-by: Zi Yan <ziy(a)nvidia.com>
---
mm/migrate.c | 6 +++++-
1 file changed, 5 insertions(+), 1 deletion(-)
diff --git a/mm/migrate.c b/mm/migrate.c
index c7d5566623ad..c37af50f312d 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -424,8 +424,12 @@ int migrate_page_move_mapping(struct address_space *mapping,
if (PageSwapBacked(page)) {
__SetPageSwapBacked(newpage);
if (PageSwapCache(page)) {
+ int i;
+
SetPageSwapCache(newpage);
- set_page_private(newpage, page_private(page));
+ for (i = 0; i < (1 << compound_order(page)); i++)
+ set_page_private(newpage + i,
+ page_private(page + i));
}
} else {
VM_BUG_ON_PAGE(PageSwapCache(page), page);
--
2.43.0
Commit fb24ea52f78e0d595852e ("drivers: Remove explicit invocations of
mmiowb()") removed all mmiowb() calls in drivers, but it says:
"NOTE: mmiowb() has only ever guaranteed ordering in conjunction with
spin_unlock(). However, pairing each mmiowb() removal in this patch with
the corresponding call to spin_unlock() is not at all trivial, so there
is a small chance that this change may regress any drivers incorrectly
relying on mmiowb() to order MMIO writes between CPUs using lock-free
synchronisation."
The mmio in radeon_ring_commit() is protected by a mutex rather than a
spinlock, but in the mutex fastpath it behaves similarly to a spinlock. We
could add mmiowb() calls in the radeon driver, but the maintainer says he
doesn't like such a workaround, and radeon is not the only example of
mutex-protected mmio.
So we should extend the mmiowb tracking system from spinlock to mutex,
and maybe other locking primitives. This is not easy and is error prone, so
we solve it in the architectural code, by simply defining the __io_aw()
hook as mmiowb(). And we no longer need to override queued_spin_unlock(),
so use the generic definition.
Without this, we get errors such as the following when running 'glxgears'
on weakly ordered architectures such as LoongArch:
radeon 0000:04:00.0: ring 0 stalled for more than 10324msec
radeon 0000:04:00.0: ring 3 stalled for more than 10240msec
radeon 0000:04:00.0: GPU lockup (current fence id 0x000000000001f412 last fence id 0x000000000001f414 on ring 3)
radeon 0000:04:00.0: GPU lockup (current fence id 0x000000000000f940 last fence id 0x000000000000f941 on ring 0)
radeon 0000:04:00.0: scheduling IB failed (-35).
[drm:radeon_gem_va_ioctl [radeon]] *ERROR* Couldn't update BO_VA (-35)
radeon 0000:04:00.0: scheduling IB failed (-35).
[drm:radeon_gem_va_ioctl [radeon]] *ERROR* Couldn't update BO_VA (-35)
radeon 0000:04:00.0: scheduling IB failed (-35).
[drm:radeon_gem_va_ioctl [radeon]] *ERROR* Couldn't update BO_VA (-35)
radeon 0000:04:00.0: scheduling IB failed (-35).
[drm:radeon_gem_va_ioctl [radeon]] *ERROR* Couldn't update BO_VA (-35)
radeon 0000:04:00.0: scheduling IB failed (-35).
[drm:radeon_gem_va_ioctl [radeon]] *ERROR* Couldn't update BO_VA (-35)
radeon 0000:04:00.0: scheduling IB failed (-35).
[drm:radeon_gem_va_ioctl [radeon]] *ERROR* Couldn't update BO_VA (-35)
radeon 0000:04:00.0: scheduling IB failed (-35).
[drm:radeon_gem_va_ioctl [radeon]] *ERROR* Couldn't update BO_VA (-35)
Link: https://lore.kernel.org/dri-devel/29df7e26-d7a8-4f67-b988-44353c4270ac@amd.…
Link: https://lore.kernel.org/linux-arch/20240301130532.3953167-1-chenhuacai@loon…
Cc: stable(a)vger.kernel.org
Signed-off-by: Huacai Chen <chenhuacai(a)loongson.cn>
---
arch/loongarch/include/asm/Kbuild | 1 +
arch/loongarch/include/asm/io.h | 2 ++
arch/loongarch/include/asm/qspinlock.h | 18 ------------------
3 files changed, 3 insertions(+), 18 deletions(-)
delete mode 100644 arch/loongarch/include/asm/qspinlock.h
diff --git a/arch/loongarch/include/asm/Kbuild b/arch/loongarch/include/asm/Kbuild
index a97c0edbb866..2dbec7853ae8 100644
--- a/arch/loongarch/include/asm/Kbuild
+++ b/arch/loongarch/include/asm/Kbuild
@@ -6,6 +6,7 @@ generic-y += mcs_spinlock.h
generic-y += parport.h
generic-y += early_ioremap.h
generic-y += qrwlock.h
+generic-y += qspinlock.h
generic-y += rwsem.h
generic-y += segment.h
generic-y += user.h
diff --git a/arch/loongarch/include/asm/io.h b/arch/loongarch/include/asm/io.h
index c486c2341b66..4a8adcca329b 100644
--- a/arch/loongarch/include/asm/io.h
+++ b/arch/loongarch/include/asm/io.h
@@ -71,6 +71,8 @@ extern void __memcpy_fromio(void *to, const volatile void __iomem *from, size_t
#define memcpy_fromio(a, c, l) __memcpy_fromio((a), (c), (l))
#define memcpy_toio(c, a, l) __memcpy_toio((c), (a), (l))
+#define __io_aw() mmiowb()
+
#include <asm-generic/io.h>
#define ARCH_HAS_VALID_PHYS_ADDR_RANGE
diff --git a/arch/loongarch/include/asm/qspinlock.h b/arch/loongarch/include/asm/qspinlock.h
deleted file mode 100644
index 34f43f8ad591..000000000000
--- a/arch/loongarch/include/asm/qspinlock.h
+++ /dev/null
@@ -1,18 +0,0 @@
-/* SPDX-License-Identifier: GPL-2.0 */
-#ifndef _ASM_QSPINLOCK_H
-#define _ASM_QSPINLOCK_H
-
-#include <asm-generic/qspinlock_types.h>
-
-#define queued_spin_unlock queued_spin_unlock
-
-static inline void queued_spin_unlock(struct qspinlock *lock)
-{
- compiletime_assert_atomic_type(lock->locked);
- c_sync();
- WRITE_ONCE(lock->locked, 0);
-}
-
-#include <asm-generic/qspinlock.h>
-
-#endif /* _ASM_QSPINLOCK_H */
--
2.43.0
From: Ville Syrjälä <ville.syrjala(a)linux.intel.com>
Reinstate commit 88b065943cb5 ("drm/i915/dsi: Do display on
sequence later on icl+"), for the most part. Turns out some
machines (eg. Chuwi Minibook X) really do need that updated order.
It is also the order the Windows driver uses.
However we can't just undo the revert since that would again
break Lenovo 82TQ. After staring at the VBT sequences for both
machines I've concluded that the Lenovo 82TQ sequences look
somewhat broken:
- INIT_OTP is not present at all
- what should be in INIT_OTP is found in DISPLAY_ON
- what should be in DISPLAY_ON is found in BACKLIGHT_ON
(along with the actual backlight stuff)
The Chuwi Minibook X on the other hand has a full complement
of sequences in its VBT.
So let's try to deal with the broken sequences in the
Lenovo 82TQ VBT by simply swapping the (non-existent)
INIT_OTP sequence with the DISPLAY_ON sequence. Thus we
execute DISPLAY_ON when intending to execute INIT_OTP,
and execute nothing at all when intending to execute
DISPLAY_ON. That should be 100% equivalent to the
revert, for such broken VBTs.
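The swap can be sketched in isolation as below. This is a userspace model under stated assumptions: the enum values and fixup_sequences() are hypothetical and only mirror the idea of the sequence table, not the i915 definitions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical sequence indices; not the MIPI_SEQ_* values. */
enum seq_id { SEQ_INIT_OTP, SEQ_DISPLAY_ON, SEQ_BACKLIGHT_ON, SEQ_MAX };

/*
 * If INIT_OTP is missing but DISPLAY_ON is present, move DISPLAY_ON
 * into the INIT_OTP slot. Returns 1 if the table was changed.
 */
static int fixup_sequences(const unsigned char *seq[SEQ_MAX])
{
	assert(seq != NULL);
	if (!seq[SEQ_INIT_OTP] && seq[SEQ_DISPLAY_ON]) {
		seq[SEQ_INIT_OTP] = seq[SEQ_DISPLAY_ON];
		seq[SEQ_DISPLAY_ON] = NULL;
		return 1;
	}
	return 0;
}
```

A table that already has an INIT_OTP entry is left untouched, so in this model a complete set of sequences is unaffected.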
Cc: stable(a)vger.kernel.org
Fixes: dc524d05974f ("Revert "drm/i915/dsi: Do display on sequence later on icl+"")
References: https://gitlab.freedesktop.org/drm/intel/-/issues/10071
Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/10334
Signed-off-by: Ville Syrjälä <ville.syrjala(a)linux.intel.com>
---
drivers/gpu/drm/i915/display/icl_dsi.c | 3 +-
drivers/gpu/drm/i915/display/intel_bios.c | 43 +++++++++++++++++++----
2 files changed, 39 insertions(+), 7 deletions(-)
diff --git a/drivers/gpu/drm/i915/display/icl_dsi.c b/drivers/gpu/drm/i915/display/icl_dsi.c
index eda4a8b88590..ac456a2275db 100644
--- a/drivers/gpu/drm/i915/display/icl_dsi.c
+++ b/drivers/gpu/drm/i915/display/icl_dsi.c
@@ -1155,7 +1155,6 @@ static void gen11_dsi_powerup_panel(struct intel_encoder *encoder)
}
intel_dsi_vbt_exec_sequence(intel_dsi, MIPI_SEQ_INIT_OTP);
- intel_dsi_vbt_exec_sequence(intel_dsi, MIPI_SEQ_DISPLAY_ON);
/* ensure all panel commands dispatched before enabling transcoder */
wait_for_cmds_dispatched_to_panel(encoder);
@@ -1256,6 +1255,8 @@ static void gen11_dsi_enable(struct intel_atomic_state *state,
/* step6d: enable dsi transcoder */
gen11_dsi_enable_transcoder(encoder);
+ intel_dsi_vbt_exec_sequence(intel_dsi, MIPI_SEQ_DISPLAY_ON);
+
/* step7: enable backlight */
intel_backlight_enable(crtc_state, conn_state);
intel_dsi_vbt_exec_sequence(intel_dsi, MIPI_SEQ_BACKLIGHT_ON);
diff --git a/drivers/gpu/drm/i915/display/intel_bios.c b/drivers/gpu/drm/i915/display/intel_bios.c
index 343726de9aa7..373291d10af9 100644
--- a/drivers/gpu/drm/i915/display/intel_bios.c
+++ b/drivers/gpu/drm/i915/display/intel_bios.c
@@ -1955,16 +1955,12 @@ static int get_init_otp_deassert_fragment_len(struct drm_i915_private *i915,
* these devices we split the init OTP sequence into a deassert sequence and
* the actual init OTP part.
*/
-static void fixup_mipi_sequences(struct drm_i915_private *i915,
- struct intel_panel *panel)
+static void vlv_fixup_mipi_sequences(struct drm_i915_private *i915,
+ struct intel_panel *panel)
{
u8 *init_otp;
int len;
- /* Limit this to VLV for now. */
- if (!IS_VALLEYVIEW(i915))
- return;
-
/* Limit this to v1 vid-mode sequences */
if (panel->vbt.dsi.config->is_cmd_mode ||
panel->vbt.dsi.seq_version != 1)
@@ -2000,6 +1996,41 @@ static void fixup_mipi_sequences(struct drm_i915_private *i915,
panel->vbt.dsi.sequence[MIPI_SEQ_INIT_OTP] = init_otp + len - 1;
}
+/*
+ * Some machines (eg. Lenovo 82TQ) appear to have broken
+ * VBT sequences:
+ * - INIT_OTP is not present at all
+ * - what should be in INIT_OTP is in DISPLAY_ON
+ * - what should be in DISPLAY_ON is in BACKLIGHT_ON
+ * (along with the actual backlight stuff)
+ *
+ * To make those work we simply swap DISPLAY_ON and INIT_OTP.
+ *
+ * TODO: Do we need to limit this to specific machines,
+ * or examine the contents of the sequences to
+ * avoid false positives?
+ */
+static void icl_fixup_mipi_sequences(struct drm_i915_private *i915,
+ struct intel_panel *panel)
+{
+ if (!panel->vbt.dsi.sequence[MIPI_SEQ_INIT_OTP] &&
+ panel->vbt.dsi.sequence[MIPI_SEQ_DISPLAY_ON]) {
+ drm_dbg_kms(&i915->drm, "Broken VBT: Swapping INIT_OTP and DISPLAY_ON sequences\n");
+
+ swap(panel->vbt.dsi.sequence[MIPI_SEQ_INIT_OTP],
+ panel->vbt.dsi.sequence[MIPI_SEQ_DISPLAY_ON]);
+ }
+}
+
+static void fixup_mipi_sequences(struct drm_i915_private *i915,
+ struct intel_panel *panel)
+{
+ if (DISPLAY_VER(i915) >= 11)
+ icl_fixup_mipi_sequences(i915, panel);
+ else if (IS_VALLEYVIEW(i915))
+ vlv_fixup_mipi_sequences(i915, panel);
+}
+
static void
parse_mipi_sequence(struct drm_i915_private *i915,
struct intel_panel *panel)
--
2.43.0
With the to-be-fixed commit, the reset_work handler cleared 'host->mrq'
outside of the spinlock protected critical section. That leaves a small
race window during execution of 'tmio_mmc_reset()' where the done_work
handler could grab a pointer to the now invalid 'host->mrq'. Both would
use it to call mmc_request_done() causing problems (see link below).
However, 'host->mrq' cannot simply be cleared earlier inside the
critical section. That would allow new mrqs to come in asynchronously
while the actual reset of the controller still needs to be done. So,
like 'tmio_mmc_set_ios()', an ERR_PTR is used to prevent new mrqs from
coming in while still avoiding concurrency between work handlers.
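The sentinel idea can be sketched in userspace as follows. The helpers below imitate the kernel's ERR_PTR()/IS_ERR_OR_NULL() but are local reimplementations, and struct host_model with its two predicates is a hypothetical simplification of the driver's checks.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_ERRNO 4095

/* Encode a negative errno in a pointer value, as ERR_PTR() does. */
static inline void *err_ptr(long err)
{
	assert(err < 0 && err >= -MAX_ERRNO);
	return (void *)(intptr_t)err;
}

static inline bool is_err_or_null(const void *p)
{
	return !p || (uintptr_t)p >= (uintptr_t)-MAX_ERRNO;
}

/* Hypothetical host model: one slot for the request in flight. */
struct host_model {
	void *mrq;
};

/* Done-work side: only a real request pointer may be completed. */
static bool done_work_may_complete(const struct host_model *h)
{
	return !is_err_or_null(h->mrq);
}

/* Submission side: a new request is accepted only if the slot is NULL. */
static bool may_accept_new(const struct host_model *h)
{
	return h->mrq == NULL;
}
```

With the slot set to err_ptr(-EBUSY), both predicates return false: no new request is accepted and the done-work side has nothing to complete.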
Reported-by: Dirk Behme <dirk.behme(a)de.bosch.com>
Closes: https://lore.kernel.org/all/20240220061356.3001761-1-dirk.behme@de.bosch.co…
Fixes: df3ef2d3c92c ("mmc: protect the tmio_mmc driver against a theoretical race")
Signed-off-by: Wolfram Sang <wsa+renesas(a)sang-engineering.com>
Tested-by: Dirk Behme <dirk.behme(a)de.bosch.com>
Reviewed-by: Dirk Behme <dirk.behme(a)de.bosch.com>
Cc: stable(a)vger.kernel.org # 3.0+
---
Change since v1/RFT: added Dirk's tags and stable tag
@Ulf: this is nasty, subtle stuff. Would be awesome to have it in 6.8
already!
drivers/mmc/host/tmio_mmc_core.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/drivers/mmc/host/tmio_mmc_core.c b/drivers/mmc/host/tmio_mmc_core.c
index be7f18fd4836..c253d176db69 100644
--- a/drivers/mmc/host/tmio_mmc_core.c
+++ b/drivers/mmc/host/tmio_mmc_core.c
@@ -259,6 +259,8 @@ static void tmio_mmc_reset_work(struct work_struct *work)
else
mrq->cmd->error = -ETIMEDOUT;
+ /* No new calls yet, but disallow concurrent tmio_mmc_done_work() */
+ host->mrq = ERR_PTR(-EBUSY);
host->cmd = NULL;
host->data = NULL;
--
2.43.0
If VM_BIND is enabled on the client, the legacy submission ioctl can't be
used. If a client tries to use it regardless, the ioctl returns an
error. In this case the client's mutex remained locked, leading to a
deadlock inside nouveau_drm_postclose or any other nouveau ioctl call.
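The leaked lock on the early-return path can be modelled in userspace like this. It is a sketch only: struct abi_model and the abi_get()/abi_put() helpers are hypothetical and use a boolean in place of the real mutex; they merely mimic the get/put pairing of the nouveau abi16 helpers.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

struct abi_model {
	bool locked;	/* stand-in for the client mutex */
	bool uvmm;	/* true when VM_BIND is enabled */
};

static struct abi_model *abi_get(struct abi_model *a)
{
	assert(!a->locked);	/* a second get would deadlock for real */
	a->locked = true;
	return a;
}

/* Release the lock and pass the return code through. */
static int abi_put(struct abi_model *a, int ret)
{
	a->locked = false;
	return ret;
}

/* Before the fix: the error path returns with the lock still held. */
static int pushbuf_buggy(struct abi_model *a)
{
	abi_get(a);
	if (a->uvmm)
		return -ENOSYS;
	return abi_put(a, 0);
}

/* After the fix: every exit goes through abi_put(). */
static int pushbuf_fixed(struct abi_model *a)
{
	abi_get(a);
	if (a->uvmm)
		return abi_put(a, -ENOSYS);
	return abi_put(a, 0);
}
```

Both variants return -ENOSYS when uvmm is set, but only pushbuf_fixed() leaves the lock released for the next caller.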
Fixes: b88baab82871 ("drm/nouveau: implement new VM_BIND uAPI")
Cc: Danilo Krummrich <dakr(a)redhat.com>
Cc: <stable(a)vger.kernel.org> # v6.6+
Signed-off-by: Karol Herbst <kherbst(a)redhat.com>
Reviewed-by: Lyude Paul <lyude(a)redhat.com>
Reviewed-by: Danilo Krummrich <dakr(a)redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240304183157.1587152-1-kher…
---
drivers/gpu/drm/nouveau/nouveau_gem.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/nouveau/nouveau_gem.c b/drivers/gpu/drm/nouveau/nouveau_gem.c
index 49c2bcbef1299..5a887d67dc0e8 100644
--- a/drivers/gpu/drm/nouveau/nouveau_gem.c
+++ b/drivers/gpu/drm/nouveau/nouveau_gem.c
@@ -764,7 +764,7 @@ nouveau_gem_ioctl_pushbuf(struct drm_device *dev, void *data,
return -ENOMEM;
if (unlikely(nouveau_cli_uvmm(cli)))
- return -ENOSYS;
+ return nouveau_abi16_put(abi16, -ENOSYS);
list_for_each_entry(temp, &abi16->channels, head) {
if (temp->chan->chid == req->channel) {
--
2.44.0