The following commit has been merged into the x86/urgent branch of tip:
Commit-ID: d794734c9bbfe22f86686dc2909c25f5ffe1a572
Gitweb: https://git.kernel.org/tip/d794734c9bbfe22f86686dc2909c25f5ffe1a572
Author: Steve Wahl <steve.wahl(a)hpe.com>
AuthorDate: Fri, 26 Jan 2024 10:48:41 -06:00
Committer: Dave Hansen <dave.hansen(a)linux.intel.com>
CommitterDate: Mon, 12 Feb 2024 14:53:42 -08:00
x86/mm/ident_map: Use gbpages only where full GB page should be mapped.
When ident_pud_init() uses only gbpages to create identity maps, large
ranges of addresses not actually requested can be included in the
resulting table; a 4K request will map a full GB. On UV systems, this
ends up including regions that will cause hardware to halt the system
if accessed (these are marked "reserved" by BIOS). Even processor
speculation into these regions is enough to trigger the system halt.
Only use gbpages when map creation requests include the full GB page
of space. Fall back to using smaller 2M pages when only portions of a
GB page are included in the request.
No attempt is made to coalesce mapping requests. If a request requires
a map entry at the 2M (pmd) level, subsequent mapping requests within
the same 1G region will also be at the pmd level, even if adjacent or
overlapping such requests could have been combined to map a full
gbpage. Existing usage starts with larger regions and then adds
smaller regions, so this should not have any great consequence.
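As a minimal standalone sketch of the new test (mirroring the hunk below, with
PUD_SIZE/PUD_MASK hard-coded to their x86-64 values and the existing-mapping
check omitted):

    #include <stdbool.h>
    #include <stdint.h>

    #define PUD_SIZE (1ULL << 30)           /* 1 GiB */
    #define PUD_MASK (~(PUD_SIZE - 1))

    /* A gbpage is used only when the chunk [addr, next) covers a whole
     * 1 GiB region, i.e. both ends are GB-aligned; otherwise the code
     * falls back to 2M (pmd) mappings.  A 4K request such as
     * [0x40000000, 0x40001000) therefore no longer maps a full GB. */
    static bool use_gbpage(uint64_t addr, uint64_t next, bool direct_gbpages)
    {
            return direct_gbpages &&
                   (addr & ~PUD_MASK) == 0 &&
                   (next & ~PUD_MASK) == 0;
    }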
[ dhansen: fix up comment formatting, simplify changelog ]
Signed-off-by: Steve Wahl <steve.wahl(a)hpe.com>
Signed-off-by: Dave Hansen <dave.hansen(a)linux.intel.com>
Cc: stable(a)vger.kernel.org
Link: https://lore.kernel.org/all/20240126164841.170866-1-steve.wahl%40hpe.com
---
arch/x86/mm/ident_map.c | 23 ++++++++++++++++++-----
1 file changed, 18 insertions(+), 5 deletions(-)
diff --git a/arch/x86/mm/ident_map.c b/arch/x86/mm/ident_map.c
index 968d700..f50cc21 100644
--- a/arch/x86/mm/ident_map.c
+++ b/arch/x86/mm/ident_map.c
@@ -26,18 +26,31 @@ static int ident_pud_init(struct x86_mapping_info *info, pud_t *pud_page,
for (; addr < end; addr = next) {
pud_t *pud = pud_page + pud_index(addr);
pmd_t *pmd;
+ bool use_gbpage;
next = (addr & PUD_MASK) + PUD_SIZE;
if (next > end)
next = end;
- if (info->direct_gbpages) {
- pud_t pudval;
+ /* if this is already a gbpage, this portion is already mapped */
+ if (pud_large(*pud))
+ continue;
+
+ /* Is using a gbpage allowed? */
+ use_gbpage = info->direct_gbpages;
- if (pud_present(*pud))
- continue;
+ /* Don't use gbpage if it maps more than the requested region. */
+ /* at the beginning: */
+ use_gbpage &= ((addr & ~PUD_MASK) == 0);
+ /* ... or at the end: */
+ use_gbpage &= ((next & ~PUD_MASK) == 0);
+
+ /* Never overwrite existing mappings */
+ use_gbpage &= !pud_present(*pud);
+
+ if (use_gbpage) {
+ pud_t pudval;
- addr &= PUD_MASK;
pudval = __pud((addr - info->offset) | info->page_flag);
set_pud(pud, pudval);
continue;
The patch titled
Subject: kasan/test: avoid gcc warning for intentional overflow
has been added to the -mm mm-unstable branch. Its filename is
kasan-test-avoid-gcc-warning-for-intentional-overflow.patch
This patch will shortly appear at
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patche…
This patch will later appear in the mm-unstable branch at
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next via the mm-everything
branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there every 2-3 working days
------------------------------------------------------
From: Arnd Bergmann <arnd(a)arndb.de>
Subject: kasan/test: avoid gcc warning for intentional overflow
Date: Mon, 12 Feb 2024 12:15:52 +0100
The out-of-bounds test allocates an object that is three bytes too short
in order to validate the bounds checking. Starting with gcc-14, this
causes a compile-time warning as gcc has grown smart enough to understand
the sizeof() logic:
mm/kasan/kasan_test.c: In function 'kmalloc_oob_16':
mm/kasan/kasan_test.c:443:14: error: allocation of insufficient size '13' for type 'struct <anonymous>' with size '16' [-Werror=alloc-size]
443 | ptr1 = kmalloc(sizeof(*ptr1) - 3, GFP_KERNEL);
| ^
Hide the actual computation behind a RELOC_HIDE() that ensures
the compiler misses the intentional bug.
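As a minimal userspace sketch of the trick (RELOC_HIDE() adapted from
include/linux/compiler-gcc.h; 'struct sixteen' stands in for the test's
anonymous struct, and whether -Walloc-size fires for plain malloc() depends
on the libc's alloc_size annotations):

    #include <stdio.h>
    #include <stdlib.h>

    /* Adapted from include/linux/compiler-gcc.h: the empty asm makes the
     * pointer's origin opaque to the optimizer, so gcc can no longer tie
     * the 13-byte allocation to the 16-byte type it is assigned to. */
    #define RELOC_HIDE(ptr, off)                                        \
    ({                                                                  \
            unsigned long __ptr;                                        \
            __asm__ ("" : "=r"(__ptr) : "0"(ptr));                      \
            (typeof(ptr)) (__ptr + (off));                              \
    })

    struct sixteen { char b[16]; };

    int main(void)
    {
            /* Without RELOC_HIDE(), gcc-14 -Walloc-size warns that the
             * allocation is too small for the pointed-to type. */
            struct sixteen *p = RELOC_HIDE(malloc(sizeof(*p) - 3), 0);

            free(p);
            return 0;
    }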
Link: https://lkml.kernel.org/r/20240212111609.869266-1-arnd@kernel.org
Fixes: 3f15801cdc23 ("lib: add kasan test module")
Signed-off-by: Arnd Bergmann <arnd(a)arndb.de>
Cc: Alexander Potapenko <glider(a)google.com>
Cc: Andrey Konovalov <andreyknvl(a)gmail.com>
Cc: Andrey Ryabinin <ryabinin.a.a(a)gmail.com>
Cc: Arnd Bergmann <arnd(a)arndb.de>
Cc: Dmitry Vyukov <dvyukov(a)google.com>
Cc: Marco Elver <elver(a)google.com>
Cc: Vincenzo Frascino <vincenzo.frascino(a)arm.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/kasan/kasan_test.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
--- a/mm/kasan/kasan_test.c~kasan-test-avoid-gcc-warning-for-intentional-overflow
+++ a/mm/kasan/kasan_test.c
@@ -440,7 +440,8 @@ static void kmalloc_oob_16(struct kunit
/* This test is specifically crafted for the generic mode. */
KASAN_TEST_NEEDS_CONFIG_ON(test, CONFIG_KASAN_GENERIC);
- ptr1 = kmalloc(sizeof(*ptr1) - 3, GFP_KERNEL);
+ /* RELOC_HIDE to prevent gcc from warning about short alloc */
+ ptr1 = RELOC_HIDE(kmalloc(sizeof(*ptr1) - 3, GFP_KERNEL), 0);
KUNIT_ASSERT_NOT_ERR_OR_NULL(test, ptr1);
ptr2 = kmalloc(sizeof(*ptr2), GFP_KERNEL);
_
Patches currently in -mm which might be from arnd(a)arndb.de are
mm-damon-dbgfs-implement-deprecation-notice-file-fix.patch
kasan-test-avoid-gcc-warning-for-intentional-overflow.patch
Limit the WiFi PCIe link speed to Gen2 speed (500 MB/s), which is the
speed that the boot firmware has brought up the link at (and that
Windows uses).
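(For reference: Gen2 signalling runs at 5.0 GT/s per lane; with 8b/10b
encoding that is 5.0 GT/s * 8/10 / 8 = 500 MB/s of payload bandwidth per
lane, per direction.)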
This is specifically needed to avoid a large number of link errors when
restarting the link during boot (errors which are currently not reported).
This may potentially also help with intermittent failures to download
the ath11k firmware during boot which can be seen when there is a
longer delay between restarting the link and loading the WiFi driver
(e.g. when using full disk encryption).
Fixes: 123b30a75623 ("arm64: dts: qcom: sc8280xp-x13s: enable WiFi controller")
Cc: stable(a)vger.kernel.org # 6.2
Signed-off-by: Johan Hovold <johan+linaro(a)kernel.org>
---
arch/arm64/boot/dts/qcom/sc8280xp-lenovo-thinkpad-x13s.dts | 2 ++
1 file changed, 2 insertions(+)
diff --git a/arch/arm64/boot/dts/qcom/sc8280xp-lenovo-thinkpad-x13s.dts b/arch/arm64/boot/dts/qcom/sc8280xp-lenovo-thinkpad-x13s.dts
index 511d53d9c5a1..ff4b896b1bbf 100644
--- a/arch/arm64/boot/dts/qcom/sc8280xp-lenovo-thinkpad-x13s.dts
+++ b/arch/arm64/boot/dts/qcom/sc8280xp-lenovo-thinkpad-x13s.dts
@@ -863,6 +863,8 @@ &pcie3a_phy {
};
&pcie4 {
+ max-link-speed = <2>;
+
perst-gpios = <&tlmm 141 GPIO_ACTIVE_LOW>;
wake-gpios = <&tlmm 139 GPIO_ACTIVE_LOW>;
--
2.43.0
Originally io_cancel() only supported cancelling USB reads and writes.
If I/O was cancelled successfully, information about the cancelled I/O
operation was copied to the data structure the io_cancel() 'result'
argument points at. Commit 63b05203af57 ("[PATCH] AIO: retry
infrastructure fixes and enhancements") changed the io_cancel() behavior
from reporting status information via the 'result' argument into
reporting status information on the completion ring. Commit 41003a7bcfed
("aio: remove retry-based AIO") accidentally changed the behavior into
not reporting a completion event on the completion ring for cancelled
requests. This is a bug because successful cancellation leads to an iocb
leak in user space. Since this bug was introduced more than ten years
ago and since nobody has complained since then, remove support for I/O
cancellation. Keep support for cancellation of IOCB_CMD_POLL requests.
Calling kiocb_set_cancel_fn() without knowing whether the caller
submitted a struct kiocb or a struct aio_kiocb is unsafe. The
following call trace illustrates that without this patch an
out-of-bounds write happens if I/O is submitted by io_uring (from a
phone with an ARM CPU and kernel 6.1):
WARNING: CPU: 3 PID: 368 at fs/aio.c:598 kiocb_set_cancel_fn+0x9c/0xa8
Call trace:
kiocb_set_cancel_fn+0x9c/0xa8
ffs_epfile_read_iter+0x144/0x1d0
io_read+0x19c/0x498
io_issue_sqe+0x118/0x27c
io_submit_sqes+0x25c/0x5fc
__arm64_sys_io_uring_enter+0x104/0xab0
invoke_syscall+0x58/0x11c
el0_svc_common+0xb4/0xf4
do_el0_svc+0x2c/0xb0
el0_svc+0x2c/0xa4
el0t_64_sync_handler+0x68/0xb4
el0t_64_sync+0x1a4/0x1a8
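As a minimal userspace sketch of the container_of() hazard behind this trace
(hypothetical struct names; the real layouts live in fs/aio.c and io_uring):

    #include <stddef.h>

    #define container_of(ptr, type, member) \
            ((type *)((char *)(ptr) - offsetof(type, member)))

    struct kiocb_like { int ki_flags; };

    /* Layout kiocb_set_cancel_fn() assumes: the kiocb sits inside an aio
     * request with aio-specific fields around it. */
    struct aio_req { struct kiocb_like rw; void *ki_cancel; };

    /* Layout used by another submitter (io_uring embeds the kiocb in its
     * own request structure): no aio fields follow the kiocb. */
    struct other_req { struct kiocb_like rw; };

    static void set_cancel(struct kiocb_like *iocb, void *cancel)
    {
            /* Unconditional container_of(): only valid when iocb really is
             * the 'rw' member of a struct aio_req. */
            struct aio_req *req = container_of(iocb, struct aio_req, rw);

            req->ki_cancel = cancel;    /* out of bounds for struct other_req */
    }

    int main(void)
    {
            struct other_req r = {{ 0 }};

            set_cancel(&r.rw, NULL);    /* writes past the end of 'r' */
            return 0;
    }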
Cc: Christoph Hellwig <hch(a)lst.de>
Cc: Avi Kivity <avi(a)scylladb.com>
Cc: Sandeep Dhavale <dhavale(a)google.com>
Cc: Jens Axboe <axboe(a)kernel.dk>
Cc: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Fixes: 63b05203af57 ("[PATCH] AIO: retry infrastructure fixes and enhancements")
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Bart Van Assche <bvanassche(a)acm.org>
---
drivers/usb/gadget/function/f_fs.c | 25 -------------------
drivers/usb/gadget/legacy/inode.c | 20 ---------------
fs/aio.c | 39 +++++-------------------------
include/linux/aio.h | 9 -------
4 files changed, 6 insertions(+), 87 deletions(-)
diff --git a/drivers/usb/gadget/function/f_fs.c b/drivers/usb/gadget/function/f_fs.c
index 6bff6cb93789..59789292f4f7 100644
--- a/drivers/usb/gadget/function/f_fs.c
+++ b/drivers/usb/gadget/function/f_fs.c
@@ -1157,25 +1157,6 @@ ffs_epfile_open(struct inode *inode, struct file *file)
return stream_open(inode, file);
}
-static int ffs_aio_cancel(struct kiocb *kiocb)
-{
- struct ffs_io_data *io_data = kiocb->private;
- struct ffs_epfile *epfile = kiocb->ki_filp->private_data;
- unsigned long flags;
- int value;
-
- spin_lock_irqsave(&epfile->ffs->eps_lock, flags);
-
- if (io_data && io_data->ep && io_data->req)
- value = usb_ep_dequeue(io_data->ep, io_data->req);
- else
- value = -EINVAL;
-
- spin_unlock_irqrestore(&epfile->ffs->eps_lock, flags);
-
- return value;
-}
-
static ssize_t ffs_epfile_write_iter(struct kiocb *kiocb, struct iov_iter *from)
{
struct ffs_io_data io_data, *p = &io_data;
@@ -1198,9 +1179,6 @@ static ssize_t ffs_epfile_write_iter(struct kiocb *kiocb, struct iov_iter *from)
kiocb->private = p;
- if (p->aio)
- kiocb_set_cancel_fn(kiocb, ffs_aio_cancel);
-
res = ffs_epfile_io(kiocb->ki_filp, p);
if (res == -EIOCBQUEUED)
return res;
@@ -1242,9 +1220,6 @@ static ssize_t ffs_epfile_read_iter(struct kiocb *kiocb, struct iov_iter *to)
kiocb->private = p;
- if (p->aio)
- kiocb_set_cancel_fn(kiocb, ffs_aio_cancel);
-
res = ffs_epfile_io(kiocb->ki_filp, p);
if (res == -EIOCBQUEUED)
return res;
diff --git a/drivers/usb/gadget/legacy/inode.c b/drivers/usb/gadget/legacy/inode.c
index 03179b1880fd..99b7366d77af 100644
--- a/drivers/usb/gadget/legacy/inode.c
+++ b/drivers/usb/gadget/legacy/inode.c
@@ -446,25 +446,6 @@ struct kiocb_priv {
unsigned actual;
};
-static int ep_aio_cancel(struct kiocb *iocb)
-{
- struct kiocb_priv *priv = iocb->private;
- struct ep_data *epdata;
- int value;
-
- local_irq_disable();
- epdata = priv->epdata;
- // spin_lock(&epdata->dev->lock);
- if (likely(epdata && epdata->ep && priv->req))
- value = usb_ep_dequeue (epdata->ep, priv->req);
- else
- value = -EINVAL;
- // spin_unlock(&epdata->dev->lock);
- local_irq_enable();
-
- return value;
-}
-
static void ep_user_copy_worker(struct work_struct *work)
{
struct kiocb_priv *priv = container_of(work, struct kiocb_priv, work);
@@ -537,7 +518,6 @@ static ssize_t ep_aio(struct kiocb *iocb,
iocb->private = priv;
priv->iocb = iocb;
- kiocb_set_cancel_fn(iocb, ep_aio_cancel);
get_ep(epdata);
priv->epdata = epdata;
priv->actual = 0;
diff --git a/fs/aio.c b/fs/aio.c
index bb2ff48991f3..c20946d5fcf3 100644
--- a/fs/aio.c
+++ b/fs/aio.c
@@ -203,7 +203,7 @@ struct aio_kiocb {
};
struct kioctx *ki_ctx;
- kiocb_cancel_fn *ki_cancel;
+ int (*ki_cancel)(struct kiocb *);
struct io_event ki_res;
@@ -587,22 +587,6 @@ static int aio_setup_ring(struct kioctx *ctx, unsigned int nr_events)
#define AIO_EVENTS_FIRST_PAGE ((PAGE_SIZE - sizeof(struct aio_ring)) / sizeof(struct io_event))
#define AIO_EVENTS_OFFSET (AIO_EVENTS_PER_PAGE - AIO_EVENTS_FIRST_PAGE)
-void kiocb_set_cancel_fn(struct kiocb *iocb, kiocb_cancel_fn *cancel)
-{
- struct aio_kiocb *req = container_of(iocb, struct aio_kiocb, rw);
- struct kioctx *ctx = req->ki_ctx;
- unsigned long flags;
-
- if (WARN_ON_ONCE(!list_empty(&req->ki_list)))
- return;
-
- spin_lock_irqsave(&ctx->ctx_lock, flags);
- list_add_tail(&req->ki_list, &ctx->active_reqs);
- req->ki_cancel = cancel;
- spin_unlock_irqrestore(&ctx->ctx_lock, flags);
-}
-EXPORT_SYMBOL(kiocb_set_cancel_fn);
-
/*
* free_ioctx() should be RCU delayed to synchronize against the RCU
* protected lookup_ioctx() and also needs process context to call
@@ -2158,13 +2142,11 @@ COMPAT_SYSCALL_DEFINE3(io_submit, compat_aio_context_t, ctx_id,
#endif
/* sys_io_cancel:
- * Attempts to cancel an iocb previously passed to io_submit. If
- * the operation is successfully cancelled, the resulting event is
- * copied into the memory pointed to by result without being placed
- * into the completion queue and 0 is returned. May fail with
- * -EFAULT if any of the data structures pointed to are invalid.
- * May fail with -EINVAL if aio_context specified by ctx_id is
- * invalid. May fail with -EAGAIN if the iocb specified was not
+ * Attempts to cancel an IOCB_CMD_POLL iocb previously passed to
+ * io_submit(). If the operation is successfully cancelled 0 is returned.
+ * May fail with -EFAULT if any of the data structures pointed to are
+ * invalid. May fail with -EINVAL if aio_context specified by ctx_id is
+ * invalid. May fail with -EINPROGRESS if the iocb specified was not
* cancelled. Will fail with -ENOSYS if not implemented.
*/
SYSCALL_DEFINE3(io_cancel, aio_context_t, ctx_id, struct iocb __user *, iocb,
@@ -2196,15 +2178,6 @@ SYSCALL_DEFINE3(io_cancel, aio_context_t, ctx_id, struct iocb __user *, iocb,
}
spin_unlock_irq(&ctx->ctx_lock);
- if (!ret) {
- /*
- * The result argument is no longer used - the io_event is
- * always delivered via the ring buffer. -EINPROGRESS indicates
- * cancellation is progress:
- */
- ret = -EINPROGRESS;
- }
-
percpu_ref_put(&ctx->users);
return ret;
diff --git a/include/linux/aio.h b/include/linux/aio.h
index 86892a4fe7c8..9aabca4a0eed 100644
--- a/include/linux/aio.h
+++ b/include/linux/aio.h
@@ -2,22 +2,13 @@
#ifndef __LINUX__AIO_H
#define __LINUX__AIO_H
-#include <linux/aio_abi.h>
-
-struct kioctx;
-struct kiocb;
struct mm_struct;
-typedef int (kiocb_cancel_fn)(struct kiocb *);
-
/* prototypes */
#ifdef CONFIG_AIO
extern void exit_aio(struct mm_struct *mm);
-void kiocb_set_cancel_fn(struct kiocb *req, kiocb_cancel_fn *cancel);
#else
static inline void exit_aio(struct mm_struct *mm) { }
-static inline void kiocb_set_cancel_fn(struct kiocb *req,
- kiocb_cancel_fn *cancel) { }
#endif /* CONFIG_AIO */
#endif /* __LINUX__AIO_H */
Patches 1-4 are fixes for issues found by Paolo while working on adding
TCP_NOTSENT_LOWAT support. The latter will need to track more state
under the msk data lock. Since the msk locking schema is already quite
complex, do a long-awaited clean-up step by moving several confusing
lockless initializations under the relevant locks. Note that it is
unlikely a real race could happen even prior to these patches, as the
MPTCP-level state machine implicitly ensures proper serialization of the
write accesses even without an explicit lock. Still, the simplification
is welcome and will help with maintenance. This can be backported up
to v5.6.
Patch 5 is a fix for the userspace PM so that it does not add new local
address entries if the address is already in the list. This behaviour
has been present since v5.19.
Patch 6 fixes an issue that can occur when Fastopen is used; it has been
possible since v6.2. A previous fix has already been applied, but
according to syzbot it did not cover all cases.
Patch 7 updates Geliang's email address in the MAINTAINERS file.
Signed-off-by: Matthieu Baerts (NGI0) <matttbe(a)kernel.org>
---
Geliang Tang (2):
mptcp: check addrs list in userspace_pm_get_local_id
MAINTAINERS: update Geliang's email address
Paolo Abeni (5):
mptcp: drop the push_pending field
mptcp: fix rcv space initialization
mptcp: fix more tx path fields initialization
mptcp: corner case locking for rx path fields initialization
mptcp: really cope with fastopen race
.mailmap | 9 +++---
MAINTAINERS | 2 +-
net/mptcp/fastopen.c | 6 ++--
net/mptcp/options.c | 9 +++---
net/mptcp/pm_userspace.c | 13 ++++++++-
net/mptcp/protocol.c | 31 +++++++++++----------
net/mptcp/protocol.h | 16 ++++++-----
net/mptcp/subflow.c | 71 ++++++++++++++++++++++++++++++------------------
8 files changed, 95 insertions(+), 62 deletions(-)
---
base-commit: 335bac1daae3fd9070d0f9f34d7d7ba708729256
change-id: 20240202-upstream-net-20240202-locking-cleanup-misc-5f2ee79d8356
Best regards,
--
Matthieu Baerts (NGI0) <matttbe(a)kernel.org>