On certain i.MX8 series parts [1], the PPS channel 0
is routed internally to eDMA, and the external PPS
pin is available on channel 1. In addition, on
certain boards, the PPS may be wired on the PCB to
an EVENTOUTn pin other than 0. On these systems
it is necessary that the PPS channel be able
to be configured from the Device Tree.
[1] https://lore.kernel.org/all/ZrPYOWA3FESx197L@lizhi-Precision-Tower-5810/
Francesco Dolcini (3):
dt-bindings: net: fec: add pps channel property
net: fec: refactor PPS channel configuration
net: fec: make PPS channel configurable
Documentation/devicetree/bindings/net/fsl,fec.yaml | 7 +++++++
drivers/net/ethernet/freescale/fec_ptp.c | 11 ++++++-----
2 files changed, 13 insertions(+), 5 deletions(-)
--
2.34.1
The lifetime of a struct pps_device and the struct cdev embedded in it
is under control of the associated struct device.
The cdev's .open/.release methods (pps_cdev_{open,release}()) try to
keep the device alive while the cdev is open, but this is insufficient.
Consider this sequence:
1. Attach the PPS line discipline to a TTY.
2. Open the created /dev/pps* file.
3. Detach the PPS line discipline.
4. Close the file.
In this scenario, the last reference to the cdev is not released from
pps code, but in __fput(), *after* running the .release method, which
frees the pps_device (including struct cdev).
Fix this by using cdev_set_parent() to ensure that the pps_device
outlives the cdev.
cdev_set_parent() should be used before cdev_add(), but we can't do that
here, because at that point we don't have the dev yet. It's created
in the next step with device_create(). To compensate, bump the refcount
of the dev, which is what cdev_add() would have done if we followed the
usual order. This will be cleaned up in subsequent patches.
With the parent relationship in place, the .open/.release methods no
longer need to change the refcount. The cdev reference held by the core
filesystem code is enough to keep the pps device alive.
The .release method had nothing else to do, so remove it.
Move the cdev_del() from pps_device_destruct() to pps_unregister_cdev().
This is necessary. Otherwise, the pps_device would be holding a
reference to itself and never get released. It also brings symmetry
between pps_register_cdev() and pps_unregister_cdev().
KASAN detection of the bug:
pps_core: deallocating pps0
==================================================================
BUG: KASAN: slab-use-after-free in cdev_put+0x4e/0x50
Read of size 8 at addr ff1100001c1c7360 by task sleep/1192
CPU: 0 UID: 0 PID: 1192 Comm: sleep Not tainted 6.12.0-0.rc7.59.fc42.x86_64+debug #1
Hardware name: Red Hat OpenStack Compute, BIOS 1.14.0-1.module+el8.4.0+8855+a9e237a9 04/01/2014
Call Trace:
<TASK>
dump_stack_lvl+0x84/0xd0
print_report+0x174/0x505
kasan_report+0xab/0x180
cdev_put+0x4e/0x50
__fput+0x725/0xaa0
task_work_run+0x119/0x200
do_exit+0x8ef/0x27a0
do_group_exit+0xbc/0x250
get_signal+0x1b78/0x1e00
arch_do_signal_or_restart+0x8f/0x570
syscall_exit_to_user_mode+0x1f4/0x290
do_syscall_64+0xa3/0x190
entry_SYSCALL_64_after_hwframe+0x76/0x7e
RIP: 0033:0x7f6c598342d6
Code: Unable to access opcode bytes at 0x7f6c598342ac.
RSP: 002b:00007ffff3528160 EFLAGS: 00000202 ORIG_RAX: 00000000000000e6
RAX: fffffffffffffdfc RBX: 00007f6c597c5740 RCX: 00007f6c598342d6
RDX: 00007ffff35281f0 RSI: 0000000000000000 RDI: 0000000000000000
RBP: 00007ffff3528170 R08: 0000000000000000 R09: 0000000000000000
R10: 00007ffff35281e0 R11: 0000000000000202 R12: 00007f6c597c56c8
R13: 00007ffff35281e0 R14: 000000000000003c R15: 00007ffff35281e0
</TASK>
Allocated by task 1186:
kasan_save_stack+0x30/0x50
kasan_save_track+0x14/0x30
__kasan_kmalloc+0x8f/0xa0
pps_register_source+0xe4/0x360
pps_tty_open+0x191/0x220 [pps_ldisc]
tty_ldisc_open+0x75/0xc0
tty_set_ldisc+0x29e/0x730
tty_ioctl+0x866/0x11e0
__x64_sys_ioctl+0x12e/0x1a0
do_syscall_64+0x97/0x190
entry_SYSCALL_64_after_hwframe+0x76/0x7e
Freed by task 1192:
kasan_save_stack+0x30/0x50
kasan_save_track+0x14/0x30
kasan_save_free_info+0x3b/0x70
__kasan_slab_free+0x37/0x50
kfree+0x140/0x4d0
device_release+0x9c/0x210
kobject_put+0x17c/0x4b0
pps_cdev_release+0x56/0x70
__fput+0x368/0xaa0
task_work_run+0x119/0x200
do_exit+0x8ef/0x27a0
do_group_exit+0xbc/0x250
get_signal+0x1b78/0x1e00
arch_do_signal_or_restart+0x8f/0x570
syscall_exit_to_user_mode+0x1f4/0x290
do_syscall_64+0xa3/0x190
entry_SYSCALL_64_after_hwframe+0x76/0x7e
The buggy address belongs to the object at ff1100001c1c7200
which belongs to the cache kmalloc-rnd-02-512 of size 512
The buggy address is located 352 bytes inside of
freed 512-byte region [ff1100001c1c7200, ff1100001c1c7400)
[...]
==================================================================
Fixes: eae9d2ba0cfc ("LinuxPPS: core support")
Fixes: d953e0e837e6 ("pps: Fix a use-after free bug when unregistering a source.")
Cc: stable(a)vger.kernel.org
Signed-off-by: Michal Schmidt <mschmidt(a)redhat.com>
---
drivers/pps/pps.c | 17 +++++------------
1 file changed, 5 insertions(+), 12 deletions(-)
diff --git a/drivers/pps/pps.c b/drivers/pps/pps.c
index 25d47907db17..4f497353daa2 100644
--- a/drivers/pps/pps.c
+++ b/drivers/pps/pps.c
@@ -301,15 +301,6 @@ static int pps_cdev_open(struct inode *inode, struct file *file)
struct pps_device *pps = container_of(inode->i_cdev,
struct pps_device, cdev);
file->private_data = pps;
- kobject_get(&pps->dev->kobj);
- return 0;
-}
-
-static int pps_cdev_release(struct inode *inode, struct file *file)
-{
- struct pps_device *pps = container_of(inode->i_cdev,
- struct pps_device, cdev);
- kobject_put(&pps->dev->kobj);
return 0;
}
@@ -324,15 +315,12 @@ static const struct file_operations pps_cdev_fops = {
.compat_ioctl = pps_cdev_compat_ioctl,
.unlocked_ioctl = pps_cdev_ioctl,
.open = pps_cdev_open,
- .release = pps_cdev_release,
};
static void pps_device_destruct(struct device *dev)
{
struct pps_device *pps = dev_get_drvdata(dev);
- cdev_del(&pps->cdev);
-
/* Now we can release the ID for re-use */
pr_debug("deallocating pps%d\n", pps->id);
mutex_lock(&pps_idr_lock);
@@ -383,6 +371,10 @@ int pps_register_cdev(struct pps_device *pps)
goto del_cdev;
}
+ cdev_set_parent(&pps->cdev, &pps->dev->kobj);
+ /* Compensate for setting the parent after cdev_add() */
+ get_device(pps->dev);
+
/* Override the release function with our own */
pps->dev->release = pps_device_destruct;
@@ -407,6 +399,7 @@ void pps_unregister_cdev(struct pps_device *pps)
pr_debug("unregistering pps%d\n", pps->id);
pps->lookup_cookie = NULL;
device_destroy(pps_class, pps->dev->devt);
+ cdev_del(&pps->cdev);
}
/*
--
2.47.0
The patch below does not apply to the 6.11-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.11.y
git checkout FETCH_HEAD
git cherry-pick -x 2714b0d1f36999dbd99a3474a24e7301acbd74f1
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024120203-carat-landlord-afcf@gregkh' --subject-prefix 'PATCH 6.11.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 2714b0d1f36999dbd99a3474a24e7301acbd74f1 Mon Sep 17 00:00:00 2001
From: Christian Brauner <brauner(a)kernel.org>
Date: Tue, 8 Oct 2024 13:30:49 +0200
Subject: [PATCH] fcntl: make F_DUPFD_QUERY associative
Currently when passing a closed file descriptor to
fcntl(fd, F_DUPFD_QUERY, fd_dup) the order matters:
fd = open("/dev/null");
fd_dup = dup(fd);
When we now close one of the file descriptors we get:
(1) fcntl(fd, fd_dup) // -EBADF
(2) fcntl(fd_dup, fd) // 0 aka not equal
depending on which file descriptor is passed first. That's not a huge
deal but it gives the api I slightly weird feel. Make it so that the
order doesn't matter by requiring that both file descriptors are valid:
(1') fcntl(fd, fd_dup) // -EBADF
(2') fcntl(fd_dup, fd) // -EBADF
Link: https://lore.kernel.org/r/20241008-duften-formel-251f967602d5@brauner
Fixes: c62b758bae6a ("fcntl: add F_DUPFD_QUERY fcntl()")
Reviewed-by: Jeff Layton <jlayton(a)kernel.org>
Reviewed-By: Lennart Poettering <lennart(a)poettering.net>
Cc: stable(a)vger.kernel.org
Reported-by: Lennart Poettering <lennart(a)poettering.net>
Signed-off-by: Christian Brauner <brauner(a)kernel.org>
diff --git a/fs/fcntl.c b/fs/fcntl.c
index 22dd9dcce7ec..3d89de31066a 100644
--- a/fs/fcntl.c
+++ b/fs/fcntl.c
@@ -397,6 +397,9 @@ static long f_dupfd_query(int fd, struct file *filp)
{
CLASS(fd_raw, f)(fd);
+ if (fd_empty(f))
+ return -EBADF;
+
/*
* We can do the 'fdput()' immediately, as the only thing that
* matters is the pointer value which isn't changed by the fdput.
-Wenum-enum-conversion was strengthened in clang-19 to warn for C, which
caused the kernel to move it to W=1 in commit 75b5ab134bb5 ("kbuild:
Move -Wenum-{compare-conditional,enum-conversion} into W=1") because
there were numerous instances that would break builds with -Werror.
Unfortunately, this is not a full solution, as more and more developers,
subsystems, and distributors are building with W=1 as well, so they
continue to see the numerous instances of this warning.
Since the move to W=1, there have not been many new instances that have
appeared through various build reports and the ones that have appeared
seem to be following similar existing patterns, suggesting that most
instances of this warning will not be real issues. The only alternatives
for silencing this warning are adding casts (which is generally seen as
an ugly practice) or refactoring the enums to macro defines or a unified
enum (which may be undesirable because of type safety in other parts of
the code).
Move the warning to W=2, where warnings that occur frequently but may be
relevant should reside.
Cc: stable(a)vger.kernel.org
Fixes: 75b5ab134bb5 ("kbuild: Move -Wenum-{compare-conditional,enum-conversion} into W=1")
Link: https://lore.kernel.org/ZwRA9SOcOjjLJcpi@google.com/
Signed-off-by: Nathan Chancellor <nathan(a)kernel.org>
---
Changes in v2:
- Move -Wenum-enum-conversion to W=2, instead of disabling it
outright (Arnd)
- Leave -Wenum-compare-conditional in W=1, as there are not that
many instances, so it can be turned on fully at some point (Arnd)
- Link to v1: https://lore.kernel.org/r/20241016-disable-two-clang-enum-warnings-v1-1-ae8…
---
scripts/Makefile.extrawarn | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/scripts/Makefile.extrawarn b/scripts/Makefile.extrawarn
index 1d13cecc7cc7808610e635ddc03476cf92b3a8c1..04faf15ed316a9c291dc952b6cc40fb6c8c330cf 100644
--- a/scripts/Makefile.extrawarn
+++ b/scripts/Makefile.extrawarn
@@ -130,7 +130,6 @@ KBUILD_CFLAGS += $(call cc-disable-warning, pointer-to-enum-cast)
KBUILD_CFLAGS += -Wno-tautological-constant-out-of-range-compare
KBUILD_CFLAGS += $(call cc-disable-warning, unaligned-access)
KBUILD_CFLAGS += -Wno-enum-compare-conditional
-KBUILD_CFLAGS += -Wno-enum-enum-conversion
endif
endif
@@ -154,6 +153,10 @@ KBUILD_CFLAGS += -Wno-missing-field-initializers
KBUILD_CFLAGS += -Wno-type-limits
KBUILD_CFLAGS += -Wno-shift-negative-value
+ifdef CONFIG_CC_IS_CLANG
+KBUILD_CFLAGS += -Wno-enum-enum-conversion
+endif
+
ifdef CONFIG_CC_IS_GCC
KBUILD_CFLAGS += -Wno-maybe-uninitialized
endif
---
base-commit: 8e929cb546ee42c9a61d24fae60605e9e3192354
change-id: 20241016-disable-two-clang-enum-warnings-e7994d44f948
Best regards,
--
Nathan Chancellor <nathan(a)kernel.org>
The patch below does not apply to the 6.6-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.6.y
git checkout FETCH_HEAD
git cherry-pick -x 5e1c67abc9301d05130b7e267c204e7005503b33
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024120248-scuff-roster-854c@gregkh' --subject-prefix 'PATCH 6.6.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 5e1c67abc9301d05130b7e267c204e7005503b33 Mon Sep 17 00:00:00 2001
From: Kuangyi Chiang <ki.chiang65(a)gmail.com>
Date: Wed, 6 Nov 2024 12:14:45 +0200
Subject: [PATCH] xhci: Fix control transfer error on Etron xHCI host
Performing a stability stress test on a USB3.0 2.5G ethernet adapter
results in errors like this:
[ 91.441469] r8152 2-3:1.0 eth3: get_registers -71
[ 91.458659] r8152 2-3:1.0 eth3: get_registers -71
[ 91.475911] r8152 2-3:1.0 eth3: get_registers -71
[ 91.493203] r8152 2-3:1.0 eth3: get_registers -71
[ 91.510421] r8152 2-3:1.0 eth3: get_registers -71
The r8152 driver will periodically issue lots of control-IN requests
to access the status of ethernet adapter hardware registers during
the test.
This happens when the xHCI driver enqueue a control TD (which cross
over the Link TRB between two ring segments, as shown) in the endpoint
zero's transfer ring. Seems the Etron xHCI host can not perform this
TD correctly, causing the USB transfer error occurred, maybe the upper
driver retry that control-IN request can solve problem, but not all
drivers do this.
| |
-------
| TRB | Setup Stage
-------
| TRB | Link
-------
-------
| TRB | Data Stage
-------
| TRB | Status Stage
-------
| |
To work around this, the xHCI driver should enqueue a No Op TRB if
next available TRB is the Link TRB in the ring segment, this can
prevent the Setup and Data Stage TRB to be breaked by the Link TRB.
Check if the XHCI_ETRON_HOST quirk flag is set before invoking the
workaround in xhci_queue_ctrl_tx().
Fixes: d0e96f5a71a0 ("USB: xhci: Control transfer support.")
Cc: stable(a)vger.kernel.org
Signed-off-by: Kuangyi Chiang <ki.chiang65(a)gmail.com>
Signed-off-by: Mathias Nyman <mathias.nyman(a)linux.intel.com>
Link: https://lore.kernel.org/r/20241106101459.775897-20-mathias.nyman@linux.inte…
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
diff --git a/drivers/usb/host/xhci-ring.c b/drivers/usb/host/xhci-ring.c
index f62b243d0fc4..517df97ef496 100644
--- a/drivers/usb/host/xhci-ring.c
+++ b/drivers/usb/host/xhci-ring.c
@@ -3733,6 +3733,20 @@ int xhci_queue_ctrl_tx(struct xhci_hcd *xhci, gfp_t mem_flags,
if (!urb->setup_packet)
return -EINVAL;
+ if ((xhci->quirks & XHCI_ETRON_HOST) &&
+ urb->dev->speed >= USB_SPEED_SUPER) {
+ /*
+ * If next available TRB is the Link TRB in the ring segment then
+ * enqueue a No Op TRB, this can prevent the Setup and Data Stage
+ * TRB to be breaked by the Link TRB.
+ */
+ if (trb_is_link(ep_ring->enqueue + 1)) {
+ field = TRB_TYPE(TRB_TR_NOOP) | ep_ring->cycle_state;
+ queue_trb(xhci, ep_ring, false, 0, 0,
+ TRB_INTR_TARGET(0), field);
+ }
+ }
+
/* 1 TRB for setup, 1 for status */
num_trbs = 2;
/*
The patch below does not apply to the 6.1-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.1.y
git checkout FETCH_HEAD
git cherry-pick -x 5e1c67abc9301d05130b7e267c204e7005503b33
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024120247-spew-molasses-0b53@gregkh' --subject-prefix 'PATCH 6.1.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 5e1c67abc9301d05130b7e267c204e7005503b33 Mon Sep 17 00:00:00 2001
From: Kuangyi Chiang <ki.chiang65(a)gmail.com>
Date: Wed, 6 Nov 2024 12:14:45 +0200
Subject: [PATCH] xhci: Fix control transfer error on Etron xHCI host
Performing a stability stress test on a USB3.0 2.5G ethernet adapter
results in errors like this:
[ 91.441469] r8152 2-3:1.0 eth3: get_registers -71
[ 91.458659] r8152 2-3:1.0 eth3: get_registers -71
[ 91.475911] r8152 2-3:1.0 eth3: get_registers -71
[ 91.493203] r8152 2-3:1.0 eth3: get_registers -71
[ 91.510421] r8152 2-3:1.0 eth3: get_registers -71
The r8152 driver will periodically issue lots of control-IN requests
to access the status of ethernet adapter hardware registers during
the test.
This happens when the xHCI driver enqueue a control TD (which cross
over the Link TRB between two ring segments, as shown) in the endpoint
zero's transfer ring. Seems the Etron xHCI host can not perform this
TD correctly, causing the USB transfer error occurred, maybe the upper
driver retry that control-IN request can solve problem, but not all
drivers do this.
| |
-------
| TRB | Setup Stage
-------
| TRB | Link
-------
-------
| TRB | Data Stage
-------
| TRB | Status Stage
-------
| |
To work around this, the xHCI driver should enqueue a No Op TRB if
next available TRB is the Link TRB in the ring segment, this can
prevent the Setup and Data Stage TRB to be breaked by the Link TRB.
Check if the XHCI_ETRON_HOST quirk flag is set before invoking the
workaround in xhci_queue_ctrl_tx().
Fixes: d0e96f5a71a0 ("USB: xhci: Control transfer support.")
Cc: stable(a)vger.kernel.org
Signed-off-by: Kuangyi Chiang <ki.chiang65(a)gmail.com>
Signed-off-by: Mathias Nyman <mathias.nyman(a)linux.intel.com>
Link: https://lore.kernel.org/r/20241106101459.775897-20-mathias.nyman@linux.inte…
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
diff --git a/drivers/usb/host/xhci-ring.c b/drivers/usb/host/xhci-ring.c
index f62b243d0fc4..517df97ef496 100644
--- a/drivers/usb/host/xhci-ring.c
+++ b/drivers/usb/host/xhci-ring.c
@@ -3733,6 +3733,20 @@ int xhci_queue_ctrl_tx(struct xhci_hcd *xhci, gfp_t mem_flags,
if (!urb->setup_packet)
return -EINVAL;
+ if ((xhci->quirks & XHCI_ETRON_HOST) &&
+ urb->dev->speed >= USB_SPEED_SUPER) {
+ /*
+ * If next available TRB is the Link TRB in the ring segment then
+ * enqueue a No Op TRB, this can prevent the Setup and Data Stage
+ * TRB to be breaked by the Link TRB.
+ */
+ if (trb_is_link(ep_ring->enqueue + 1)) {
+ field = TRB_TYPE(TRB_TR_NOOP) | ep_ring->cycle_state;
+ queue_trb(xhci, ep_ring, false, 0, 0,
+ TRB_INTR_TARGET(0), field);
+ }
+ }
+
/* 1 TRB for setup, 1 for status */
num_trbs = 2;
/*
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.10.y
git checkout FETCH_HEAD
git cherry-pick -x 5e1c67abc9301d05130b7e267c204e7005503b33
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024120246-volley-handbook-e11d@gregkh' --subject-prefix 'PATCH 5.10.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 5e1c67abc9301d05130b7e267c204e7005503b33 Mon Sep 17 00:00:00 2001
From: Kuangyi Chiang <ki.chiang65(a)gmail.com>
Date: Wed, 6 Nov 2024 12:14:45 +0200
Subject: [PATCH] xhci: Fix control transfer error on Etron xHCI host
Performing a stability stress test on a USB3.0 2.5G ethernet adapter
results in errors like this:
[ 91.441469] r8152 2-3:1.0 eth3: get_registers -71
[ 91.458659] r8152 2-3:1.0 eth3: get_registers -71
[ 91.475911] r8152 2-3:1.0 eth3: get_registers -71
[ 91.493203] r8152 2-3:1.0 eth3: get_registers -71
[ 91.510421] r8152 2-3:1.0 eth3: get_registers -71
The r8152 driver will periodically issue lots of control-IN requests
to access the status of ethernet adapter hardware registers during
the test.
This happens when the xHCI driver enqueue a control TD (which cross
over the Link TRB between two ring segments, as shown) in the endpoint
zero's transfer ring. Seems the Etron xHCI host can not perform this
TD correctly, causing the USB transfer error occurred, maybe the upper
driver retry that control-IN request can solve problem, but not all
drivers do this.
| |
-------
| TRB | Setup Stage
-------
| TRB | Link
-------
-------
| TRB | Data Stage
-------
| TRB | Status Stage
-------
| |
To work around this, the xHCI driver should enqueue a No Op TRB if
next available TRB is the Link TRB in the ring segment, this can
prevent the Setup and Data Stage TRB to be breaked by the Link TRB.
Check if the XHCI_ETRON_HOST quirk flag is set before invoking the
workaround in xhci_queue_ctrl_tx().
Fixes: d0e96f5a71a0 ("USB: xhci: Control transfer support.")
Cc: stable(a)vger.kernel.org
Signed-off-by: Kuangyi Chiang <ki.chiang65(a)gmail.com>
Signed-off-by: Mathias Nyman <mathias.nyman(a)linux.intel.com>
Link: https://lore.kernel.org/r/20241106101459.775897-20-mathias.nyman@linux.inte…
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
diff --git a/drivers/usb/host/xhci-ring.c b/drivers/usb/host/xhci-ring.c
index f62b243d0fc4..517df97ef496 100644
--- a/drivers/usb/host/xhci-ring.c
+++ b/drivers/usb/host/xhci-ring.c
@@ -3733,6 +3733,20 @@ int xhci_queue_ctrl_tx(struct xhci_hcd *xhci, gfp_t mem_flags,
if (!urb->setup_packet)
return -EINVAL;
+ if ((xhci->quirks & XHCI_ETRON_HOST) &&
+ urb->dev->speed >= USB_SPEED_SUPER) {
+ /*
+ * If next available TRB is the Link TRB in the ring segment then
+ * enqueue a No Op TRB, this can prevent the Setup and Data Stage
+ * TRB to be breaked by the Link TRB.
+ */
+ if (trb_is_link(ep_ring->enqueue + 1)) {
+ field = TRB_TYPE(TRB_TR_NOOP) | ep_ring->cycle_state;
+ queue_trb(xhci, ep_ring, false, 0, 0,
+ TRB_INTR_TARGET(0), field);
+ }
+ }
+
/* 1 TRB for setup, 1 for status */
num_trbs = 2;
/*
The patch below does not apply to the 5.15-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.15.y
git checkout FETCH_HEAD
git cherry-pick -x 5e1c67abc9301d05130b7e267c204e7005503b33
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024120245-celestial-coherent-9ce2@gregkh' --subject-prefix 'PATCH 5.15.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 5e1c67abc9301d05130b7e267c204e7005503b33 Mon Sep 17 00:00:00 2001
From: Kuangyi Chiang <ki.chiang65(a)gmail.com>
Date: Wed, 6 Nov 2024 12:14:45 +0200
Subject: [PATCH] xhci: Fix control transfer error on Etron xHCI host
Performing a stability stress test on a USB3.0 2.5G ethernet adapter
results in errors like this:
[ 91.441469] r8152 2-3:1.0 eth3: get_registers -71
[ 91.458659] r8152 2-3:1.0 eth3: get_registers -71
[ 91.475911] r8152 2-3:1.0 eth3: get_registers -71
[ 91.493203] r8152 2-3:1.0 eth3: get_registers -71
[ 91.510421] r8152 2-3:1.0 eth3: get_registers -71
The r8152 driver will periodically issue lots of control-IN requests
to access the status of ethernet adapter hardware registers during
the test.
This happens when the xHCI driver enqueue a control TD (which cross
over the Link TRB between two ring segments, as shown) in the endpoint
zero's transfer ring. Seems the Etron xHCI host can not perform this
TD correctly, causing the USB transfer error occurred, maybe the upper
driver retry that control-IN request can solve problem, but not all
drivers do this.
| |
-------
| TRB | Setup Stage
-------
| TRB | Link
-------
-------
| TRB | Data Stage
-------
| TRB | Status Stage
-------
| |
To work around this, the xHCI driver should enqueue a No Op TRB if
next available TRB is the Link TRB in the ring segment, this can
prevent the Setup and Data Stage TRB to be breaked by the Link TRB.
Check if the XHCI_ETRON_HOST quirk flag is set before invoking the
workaround in xhci_queue_ctrl_tx().
Fixes: d0e96f5a71a0 ("USB: xhci: Control transfer support.")
Cc: stable(a)vger.kernel.org
Signed-off-by: Kuangyi Chiang <ki.chiang65(a)gmail.com>
Signed-off-by: Mathias Nyman <mathias.nyman(a)linux.intel.com>
Link: https://lore.kernel.org/r/20241106101459.775897-20-mathias.nyman@linux.inte…
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
diff --git a/drivers/usb/host/xhci-ring.c b/drivers/usb/host/xhci-ring.c
index f62b243d0fc4..517df97ef496 100644
--- a/drivers/usb/host/xhci-ring.c
+++ b/drivers/usb/host/xhci-ring.c
@@ -3733,6 +3733,20 @@ int xhci_queue_ctrl_tx(struct xhci_hcd *xhci, gfp_t mem_flags,
if (!urb->setup_packet)
return -EINVAL;
+ if ((xhci->quirks & XHCI_ETRON_HOST) &&
+ urb->dev->speed >= USB_SPEED_SUPER) {
+ /*
+ * If next available TRB is the Link TRB in the ring segment then
+ * enqueue a No Op TRB, this can prevent the Setup and Data Stage
+ * TRB to be breaked by the Link TRB.
+ */
+ if (trb_is_link(ep_ring->enqueue + 1)) {
+ field = TRB_TYPE(TRB_TR_NOOP) | ep_ring->cycle_state;
+ queue_trb(xhci, ep_ring, false, 0, 0,
+ TRB_INTR_TARGET(0), field);
+ }
+ }
+
/* 1 TRB for setup, 1 for status */
num_trbs = 2;
/*
Hi,
I wanted to check if you’d be interested in acquiring the attendees list of I/ITSEC: The Interservice/Industry Training, Simulation & Education Conference 2024.
Date: 02 - 05 Dec 2024
Location: Orlando, USA.
Registrants Counts: 17,157+ visitors.
Each contact contains: Contact Name, Job Title, Company Name, Company website, Physical Address, Phone Number, and Official Email address.
Let me know if you are interested to see the pricing.
Regards,
Jennifer Martin
Sr. Marketing Manager
If you do not wish to receive this newsletter reply “Not interested”
The patch below does not apply to the 4.19-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-4.19.y
git checkout FETCH_HEAD
git cherry-pick -x ce8f9fb651fac95dd41f69afe54d935420b945bd
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024120212-germless-strongly-3499@gregkh' --subject-prefix 'PATCH 4.19.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From ce8f9fb651fac95dd41f69afe54d935420b945bd Mon Sep 17 00:00:00 2001
From: Jann Horn <jannh(a)google.com>
Date: Thu, 17 Oct 2024 21:07:45 +0200
Subject: [PATCH] comedi: Flush partial mappings in error case
If some remap_pfn_range() calls succeeded before one failed, we still have
buffer pages mapped into the userspace page tables when we drop the buffer
reference with comedi_buf_map_put(bm). The userspace mappings are only
cleaned up later in the mmap error path.
Fix it by explicitly flushing all mappings in our VMA on the error path.
See commit 79a61cc3fc04 ("mm: avoid leaving partial pfn mappings around in
error case").
Cc: stable(a)vger.kernel.org
Fixes: ed9eccbe8970 ("Staging: add comedi core")
Signed-off-by: Jann Horn <jannh(a)google.com>
Link: https://lore.kernel.org/r/20241017-comedi-tlb-v3-1-16b82f9372ce@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
diff --git a/drivers/comedi/comedi_fops.c b/drivers/comedi/comedi_fops.c
index 1b481731df96..b9df9b19d4bd 100644
--- a/drivers/comedi/comedi_fops.c
+++ b/drivers/comedi/comedi_fops.c
@@ -2407,6 +2407,18 @@ static int comedi_mmap(struct file *file, struct vm_area_struct *vma)
start += PAGE_SIZE;
}
+
+#ifdef CONFIG_MMU
+ /*
+ * Leaving behind a partial mapping of a buffer we're about to
+ * drop is unsafe, see remap_pfn_range_notrack().
+ * We need to zap the range here ourselves instead of relying
+ * on the automatic zapping in remap_pfn_range() because we call
+ * remap_pfn_range() in a loop.
+ */
+ if (retval)
+ zap_vma_ptes(vma, vma->vm_start, size);
+#endif
}
if (retval == 0) {
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.4.y
git checkout FETCH_HEAD
git cherry-pick -x 5e1c67abc9301d05130b7e267c204e7005503b33
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024120211-exit-maturely-4f37@gregkh' --subject-prefix 'PATCH 5.4.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 5e1c67abc9301d05130b7e267c204e7005503b33 Mon Sep 17 00:00:00 2001
From: Kuangyi Chiang <ki.chiang65(a)gmail.com>
Date: Wed, 6 Nov 2024 12:14:45 +0200
Subject: [PATCH] xhci: Fix control transfer error on Etron xHCI host
Performing a stability stress test on a USB3.0 2.5G ethernet adapter
results in errors like this:
[ 91.441469] r8152 2-3:1.0 eth3: get_registers -71
[ 91.458659] r8152 2-3:1.0 eth3: get_registers -71
[ 91.475911] r8152 2-3:1.0 eth3: get_registers -71
[ 91.493203] r8152 2-3:1.0 eth3: get_registers -71
[ 91.510421] r8152 2-3:1.0 eth3: get_registers -71
The r8152 driver will periodically issue lots of control-IN requests
to access the status of ethernet adapter hardware registers during
the test.
This happens when the xHCI driver enqueue a control TD (which cross
over the Link TRB between two ring segments, as shown) in the endpoint
zero's transfer ring. Seems the Etron xHCI host can not perform this
TD correctly, causing the USB transfer error occurred, maybe the upper
driver retry that control-IN request can solve problem, but not all
drivers do this.
| |
-------
| TRB | Setup Stage
-------
| TRB | Link
-------
-------
| TRB | Data Stage
-------
| TRB | Status Stage
-------
| |
To work around this, the xHCI driver should enqueue a No Op TRB if
next available TRB is the Link TRB in the ring segment, this can
prevent the Setup and Data Stage TRB to be breaked by the Link TRB.
Check if the XHCI_ETRON_HOST quirk flag is set before invoking the
workaround in xhci_queue_ctrl_tx().
Fixes: d0e96f5a71a0 ("USB: xhci: Control transfer support.")
Cc: stable(a)vger.kernel.org
Signed-off-by: Kuangyi Chiang <ki.chiang65(a)gmail.com>
Signed-off-by: Mathias Nyman <mathias.nyman(a)linux.intel.com>
Link: https://lore.kernel.org/r/20241106101459.775897-20-mathias.nyman@linux.inte…
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
diff --git a/drivers/usb/host/xhci-ring.c b/drivers/usb/host/xhci-ring.c
index f62b243d0fc4..517df97ef496 100644
--- a/drivers/usb/host/xhci-ring.c
+++ b/drivers/usb/host/xhci-ring.c
@@ -3733,6 +3733,20 @@ int xhci_queue_ctrl_tx(struct xhci_hcd *xhci, gfp_t mem_flags,
if (!urb->setup_packet)
return -EINVAL;
+ if ((xhci->quirks & XHCI_ETRON_HOST) &&
+ urb->dev->speed >= USB_SPEED_SUPER) {
+ /*
+ * If next available TRB is the Link TRB in the ring segment then
+ * enqueue a No Op TRB, this can prevent the Setup and Data Stage
+ * TRB to be breaked by the Link TRB.
+ */
+ if (trb_is_link(ep_ring->enqueue + 1)) {
+ field = TRB_TYPE(TRB_TR_NOOP) | ep_ring->cycle_state;
+ queue_trb(xhci, ep_ring, false, 0, 0,
+ TRB_INTR_TARGET(0), field);
+ }
+ }
+
/* 1 TRB for setup, 1 for status */
num_trbs = 2;
/*
The patch below does not apply to the 4.19-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-4.19.y
git checkout FETCH_HEAD
git cherry-pick -x 5e1c67abc9301d05130b7e267c204e7005503b33
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024120210-tastiness-guru-3aba@gregkh' --subject-prefix 'PATCH 4.19.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 5e1c67abc9301d05130b7e267c204e7005503b33 Mon Sep 17 00:00:00 2001
From: Kuangyi Chiang <ki.chiang65(a)gmail.com>
Date: Wed, 6 Nov 2024 12:14:45 +0200
Subject: [PATCH] xhci: Fix control transfer error on Etron xHCI host
Performing a stability stress test on a USB3.0 2.5G ethernet adapter
results in errors like this:
[ 91.441469] r8152 2-3:1.0 eth3: get_registers -71
[ 91.458659] r8152 2-3:1.0 eth3: get_registers -71
[ 91.475911] r8152 2-3:1.0 eth3: get_registers -71
[ 91.493203] r8152 2-3:1.0 eth3: get_registers -71
[ 91.510421] r8152 2-3:1.0 eth3: get_registers -71
The r8152 driver will periodically issue lots of control-IN requests
to access the status of ethernet adapter hardware registers during
the test.
This happens when the xHCI driver enqueue a control TD (which cross
over the Link TRB between two ring segments, as shown) in the endpoint
zero's transfer ring. Seems the Etron xHCI host can not perform this
TD correctly, causing the USB transfer error occurred, maybe the upper
driver retry that control-IN request can solve problem, but not all
drivers do this.
| |
-------
| TRB | Setup Stage
-------
| TRB | Link
-------
-------
| TRB | Data Stage
-------
| TRB | Status Stage
-------
| |
To work around this, the xHCI driver should enqueue a No Op TRB if
next available TRB is the Link TRB in the ring segment, this can
prevent the Setup and Data Stage TRB to be breaked by the Link TRB.
Check if the XHCI_ETRON_HOST quirk flag is set before invoking the
workaround in xhci_queue_ctrl_tx().
Fixes: d0e96f5a71a0 ("USB: xhci: Control transfer support.")
Cc: stable(a)vger.kernel.org
Signed-off-by: Kuangyi Chiang <ki.chiang65(a)gmail.com>
Signed-off-by: Mathias Nyman <mathias.nyman(a)linux.intel.com>
Link: https://lore.kernel.org/r/20241106101459.775897-20-mathias.nyman@linux.inte…
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
diff --git a/drivers/usb/host/xhci-ring.c b/drivers/usb/host/xhci-ring.c
index f62b243d0fc4..517df97ef496 100644
--- a/drivers/usb/host/xhci-ring.c
+++ b/drivers/usb/host/xhci-ring.c
@@ -3733,6 +3733,20 @@ int xhci_queue_ctrl_tx(struct xhci_hcd *xhci, gfp_t mem_flags,
if (!urb->setup_packet)
return -EINVAL;
+ if ((xhci->quirks & XHCI_ETRON_HOST) &&
+ urb->dev->speed >= USB_SPEED_SUPER) {
+ /*
+ * If next available TRB is the Link TRB in the ring segment then
+ * enqueue a No Op TRB, this can prevent the Setup and Data Stage
+ * TRB to be breaked by the Link TRB.
+ */
+ if (trb_is_link(ep_ring->enqueue + 1)) {
+ field = TRB_TYPE(TRB_TR_NOOP) | ep_ring->cycle_state;
+ queue_trb(xhci, ep_ring, false, 0, 0,
+ TRB_INTR_TARGET(0), field);
+ }
+ }
+
/* 1 TRB for setup, 1 for status */
num_trbs = 2;
/*
Disabling card detect from the host's ->shutdown_pre() callback turned out
to not be the complete solution. More precisely, beyond the point when the
mmc_bus->shutdown() has been called, to gracefully power off the card, we
need to prevent card detect. Otherwise the mmc_rescan work may poll for the
card with a CMD13, to see if it's still alive, which then will fail and
hang as the card has already been powered off.
To fix this problem, let's disable mmc_rescan prior to power off the card
during shutdown.
Reported-by: Anthony Pighin <anthony.pighin(a)nokia.com>
Fixes: 66c915d09b94 ("mmc: core: Disable card detect during shutdown")
Cc: stable(a)vger.kernel.org
Signed-off-by: Ulf Hansson <ulf.hansson(a)linaro.org>
---
drivers/mmc/core/bus.c | 2 ++
drivers/mmc/core/core.c | 3 +++
2 files changed, 5 insertions(+)
diff --git a/drivers/mmc/core/bus.c b/drivers/mmc/core/bus.c
index 9283b28bc69f..1cf64e0952fb 100644
--- a/drivers/mmc/core/bus.c
+++ b/drivers/mmc/core/bus.c
@@ -149,6 +149,8 @@ static void mmc_bus_shutdown(struct device *dev)
if (dev->driver && drv->shutdown)
drv->shutdown(card);
+ __mmc_stop_host(host);
+
if (host->bus_ops->shutdown) {
ret = host->bus_ops->shutdown(host);
if (ret)
diff --git a/drivers/mmc/core/core.c b/drivers/mmc/core/core.c
index a499f3c59de5..d996d39c0d6f 100644
--- a/drivers/mmc/core/core.c
+++ b/drivers/mmc/core/core.c
@@ -2335,6 +2335,9 @@ void mmc_start_host(struct mmc_host *host)
void __mmc_stop_host(struct mmc_host *host)
{
+ if (host->rescan_disable)
+ return;
+
if (host->slot.cd_irq >= 0) {
mmc_gpio_set_cd_wake(host, false);
disable_irq(host->slot.cd_irq);
--
2.43.0
The patch below does not apply to the 4.19-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-4.19.y
git checkout FETCH_HEAD
git cherry-pick -x 76d98856b1c6d06ce18f32c20527a4f9d283e660
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024120230-racoon-massager-3c4c@gregkh' --subject-prefix 'PATCH 4.19.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 76d98856b1c6d06ce18f32c20527a4f9d283e660 Mon Sep 17 00:00:00 2001
From: Kuangyi Chiang <ki.chiang65(a)gmail.com>
Date: Wed, 6 Nov 2024 12:14:44 +0200
Subject: [PATCH] xhci: Don't issue Reset Device command to Etron xHCI host
Sometimes the hub driver does not recognize the USB device connected
to the external USB2.0 hub when the system resumes from S4.
After the SetPortFeature(PORT_RESET) request is completed, the hub
driver calls the HCD reset_device callback, which will issue a Reset
Device command and free all structures associated with endpoints
that were disabled.
This happens when the xHCI driver issue a Reset Device command to
inform the Etron xHCI host that the USB device associated with a
device slot has been reset. Seems that the Etron xHCI host can not
perform this command correctly, affecting the USB device.
To work around this, the xHCI driver should obtain a new device slot
with reference to commit 651aaf36a7d7 ("usb: xhci: Handle USB transaction
error on address command"), which is another way to inform the Etron
xHCI host that the USB device has been reset.
Add a new XHCI_ETRON_HOST quirk flag to invoke the workaround in
xhci_discover_or_reset_device().
Fixes: 2a8f82c4ceaf ("USB: xhci: Notify the xHC when a device is reset.")
Cc: stable(a)vger.kernel.org
Signed-off-by: Kuangyi Chiang <ki.chiang65(a)gmail.com>
Signed-off-by: Mathias Nyman <mathias.nyman(a)linux.intel.com>
Link: https://lore.kernel.org/r/20241106101459.775897-19-mathias.nyman@linux.inte…
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
diff --git a/drivers/usb/host/xhci-pci.c b/drivers/usb/host/xhci-pci.c
index db3c7e738213..4b8c93e59d6d 100644
--- a/drivers/usb/host/xhci-pci.c
+++ b/drivers/usb/host/xhci-pci.c
@@ -396,6 +396,7 @@ static void xhci_pci_quirks(struct device *dev, struct xhci_hcd *xhci)
if (pdev->vendor == PCI_VENDOR_ID_ETRON &&
(pdev->device == PCI_DEVICE_ID_EJ168 ||
pdev->device == PCI_DEVICE_ID_EJ188)) {
+ xhci->quirks |= XHCI_ETRON_HOST;
xhci->quirks |= XHCI_RESET_ON_RESUME;
xhci->quirks |= XHCI_BROKEN_STREAMS;
}
diff --git a/drivers/usb/host/xhci.c b/drivers/usb/host/xhci.c
index aa8c877f47ac..ae16253b53fb 100644
--- a/drivers/usb/host/xhci.c
+++ b/drivers/usb/host/xhci.c
@@ -3733,6 +3733,8 @@ void xhci_free_device_endpoint_resources(struct xhci_hcd *xhci,
xhci->num_active_eps);
}
+static void xhci_free_dev(struct usb_hcd *hcd, struct usb_device *udev);
+
/*
* This submits a Reset Device Command, which will set the device state to 0,
* set the device address to 0, and disable all the endpoints except the default
@@ -3803,6 +3805,23 @@ static int xhci_discover_or_reset_device(struct usb_hcd *hcd,
SLOT_STATE_DISABLED)
return 0;
+ if (xhci->quirks & XHCI_ETRON_HOST) {
+ /*
+ * Obtaining a new device slot to inform the xHCI host that
+ * the USB device has been reset.
+ */
+ ret = xhci_disable_slot(xhci, udev->slot_id);
+ xhci_free_virt_device(xhci, udev->slot_id);
+ if (!ret) {
+ ret = xhci_alloc_dev(hcd, udev);
+ if (ret == 1)
+ ret = 0;
+ else
+ ret = -EINVAL;
+ }
+ return ret;
+ }
+
trace_xhci_discover_or_reset_device(slot_ctx);
xhci_dbg(xhci, "Resetting device with slot ID %u\n", slot_id);
diff --git a/drivers/usb/host/xhci.h b/drivers/usb/host/xhci.h
index d3b250c736b8..a0204e10486d 100644
--- a/drivers/usb/host/xhci.h
+++ b/drivers/usb/host/xhci.h
@@ -1631,6 +1631,7 @@ struct xhci_hcd {
#define XHCI_ZHAOXIN_HOST BIT_ULL(46)
#define XHCI_WRITE_64_HI_LO BIT_ULL(47)
#define XHCI_CDNS_SCTX_QUIRK BIT_ULL(48)
+#define XHCI_ETRON_HOST BIT_ULL(49)
unsigned int num_active_eps;
unsigned int limit_active_eps;
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.4.y
git checkout FETCH_HEAD
git cherry-pick -x 76d98856b1c6d06ce18f32c20527a4f9d283e660
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024120229-papaya-corned-f071@gregkh' --subject-prefix 'PATCH 5.4.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 76d98856b1c6d06ce18f32c20527a4f9d283e660 Mon Sep 17 00:00:00 2001
From: Kuangyi Chiang <ki.chiang65(a)gmail.com>
Date: Wed, 6 Nov 2024 12:14:44 +0200
Subject: [PATCH] xhci: Don't issue Reset Device command to Etron xHCI host
Sometimes the hub driver does not recognize the USB device connected
to the external USB2.0 hub when the system resumes from S4.
After the SetPortFeature(PORT_RESET) request is completed, the hub
driver calls the HCD reset_device callback, which will issue a Reset
Device command and free all structures associated with endpoints
that were disabled.
This happens when the xHCI driver issue a Reset Device command to
inform the Etron xHCI host that the USB device associated with a
device slot has been reset. Seems that the Etron xHCI host can not
perform this command correctly, affecting the USB device.
To work around this, the xHCI driver should obtain a new device slot
with reference to commit 651aaf36a7d7 ("usb: xhci: Handle USB transaction
error on address command"), which is another way to inform the Etron
xHCI host that the USB device has been reset.
Add a new XHCI_ETRON_HOST quirk flag to invoke the workaround in
xhci_discover_or_reset_device().
Fixes: 2a8f82c4ceaf ("USB: xhci: Notify the xHC when a device is reset.")
Cc: stable(a)vger.kernel.org
Signed-off-by: Kuangyi Chiang <ki.chiang65(a)gmail.com>
Signed-off-by: Mathias Nyman <mathias.nyman(a)linux.intel.com>
Link: https://lore.kernel.org/r/20241106101459.775897-19-mathias.nyman@linux.inte…
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
diff --git a/drivers/usb/host/xhci-pci.c b/drivers/usb/host/xhci-pci.c
index db3c7e738213..4b8c93e59d6d 100644
--- a/drivers/usb/host/xhci-pci.c
+++ b/drivers/usb/host/xhci-pci.c
@@ -396,6 +396,7 @@ static void xhci_pci_quirks(struct device *dev, struct xhci_hcd *xhci)
if (pdev->vendor == PCI_VENDOR_ID_ETRON &&
(pdev->device == PCI_DEVICE_ID_EJ168 ||
pdev->device == PCI_DEVICE_ID_EJ188)) {
+ xhci->quirks |= XHCI_ETRON_HOST;
xhci->quirks |= XHCI_RESET_ON_RESUME;
xhci->quirks |= XHCI_BROKEN_STREAMS;
}
diff --git a/drivers/usb/host/xhci.c b/drivers/usb/host/xhci.c
index aa8c877f47ac..ae16253b53fb 100644
--- a/drivers/usb/host/xhci.c
+++ b/drivers/usb/host/xhci.c
@@ -3733,6 +3733,8 @@ void xhci_free_device_endpoint_resources(struct xhci_hcd *xhci,
xhci->num_active_eps);
}
+static void xhci_free_dev(struct usb_hcd *hcd, struct usb_device *udev);
+
/*
* This submits a Reset Device Command, which will set the device state to 0,
* set the device address to 0, and disable all the endpoints except the default
@@ -3803,6 +3805,23 @@ static int xhci_discover_or_reset_device(struct usb_hcd *hcd,
SLOT_STATE_DISABLED)
return 0;
+ if (xhci->quirks & XHCI_ETRON_HOST) {
+ /*
+ * Obtaining a new device slot to inform the xHCI host that
+ * the USB device has been reset.
+ */
+ ret = xhci_disable_slot(xhci, udev->slot_id);
+ xhci_free_virt_device(xhci, udev->slot_id);
+ if (!ret) {
+ ret = xhci_alloc_dev(hcd, udev);
+ if (ret == 1)
+ ret = 0;
+ else
+ ret = -EINVAL;
+ }
+ return ret;
+ }
+
trace_xhci_discover_or_reset_device(slot_ctx);
xhci_dbg(xhci, "Resetting device with slot ID %u\n", slot_id);
diff --git a/drivers/usb/host/xhci.h b/drivers/usb/host/xhci.h
index d3b250c736b8..a0204e10486d 100644
--- a/drivers/usb/host/xhci.h
+++ b/drivers/usb/host/xhci.h
@@ -1631,6 +1631,7 @@ struct xhci_hcd {
#define XHCI_ZHAOXIN_HOST BIT_ULL(46)
#define XHCI_WRITE_64_HI_LO BIT_ULL(47)
#define XHCI_CDNS_SCTX_QUIRK BIT_ULL(48)
+#define XHCI_ETRON_HOST BIT_ULL(49)
unsigned int num_active_eps;
unsigned int limit_active_eps;
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.10.y
git checkout FETCH_HEAD
git cherry-pick -x 76d98856b1c6d06ce18f32c20527a4f9d283e660
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024120228-twentieth-lung-0df0@gregkh' --subject-prefix 'PATCH 5.10.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 76d98856b1c6d06ce18f32c20527a4f9d283e660 Mon Sep 17 00:00:00 2001
From: Kuangyi Chiang <ki.chiang65(a)gmail.com>
Date: Wed, 6 Nov 2024 12:14:44 +0200
Subject: [PATCH] xhci: Don't issue Reset Device command to Etron xHCI host
Sometimes the hub driver does not recognize the USB device connected
to the external USB2.0 hub when the system resumes from S4.
After the SetPortFeature(PORT_RESET) request is completed, the hub
driver calls the HCD reset_device callback, which will issue a Reset
Device command and free all structures associated with endpoints
that were disabled.
This happens when the xHCI driver issue a Reset Device command to
inform the Etron xHCI host that the USB device associated with a
device slot has been reset. Seems that the Etron xHCI host can not
perform this command correctly, affecting the USB device.
To work around this, the xHCI driver should obtain a new device slot
with reference to commit 651aaf36a7d7 ("usb: xhci: Handle USB transaction
error on address command"), which is another way to inform the Etron
xHCI host that the USB device has been reset.
Add a new XHCI_ETRON_HOST quirk flag to invoke the workaround in
xhci_discover_or_reset_device().
Fixes: 2a8f82c4ceaf ("USB: xhci: Notify the xHC when a device is reset.")
Cc: stable(a)vger.kernel.org
Signed-off-by: Kuangyi Chiang <ki.chiang65(a)gmail.com>
Signed-off-by: Mathias Nyman <mathias.nyman(a)linux.intel.com>
Link: https://lore.kernel.org/r/20241106101459.775897-19-mathias.nyman@linux.inte…
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
diff --git a/drivers/usb/host/xhci-pci.c b/drivers/usb/host/xhci-pci.c
index db3c7e738213..4b8c93e59d6d 100644
--- a/drivers/usb/host/xhci-pci.c
+++ b/drivers/usb/host/xhci-pci.c
@@ -396,6 +396,7 @@ static void xhci_pci_quirks(struct device *dev, struct xhci_hcd *xhci)
if (pdev->vendor == PCI_VENDOR_ID_ETRON &&
(pdev->device == PCI_DEVICE_ID_EJ168 ||
pdev->device == PCI_DEVICE_ID_EJ188)) {
+ xhci->quirks |= XHCI_ETRON_HOST;
xhci->quirks |= XHCI_RESET_ON_RESUME;
xhci->quirks |= XHCI_BROKEN_STREAMS;
}
diff --git a/drivers/usb/host/xhci.c b/drivers/usb/host/xhci.c
index aa8c877f47ac..ae16253b53fb 100644
--- a/drivers/usb/host/xhci.c
+++ b/drivers/usb/host/xhci.c
@@ -3733,6 +3733,8 @@ void xhci_free_device_endpoint_resources(struct xhci_hcd *xhci,
xhci->num_active_eps);
}
+static void xhci_free_dev(struct usb_hcd *hcd, struct usb_device *udev);
+
/*
* This submits a Reset Device Command, which will set the device state to 0,
* set the device address to 0, and disable all the endpoints except the default
@@ -3803,6 +3805,23 @@ static int xhci_discover_or_reset_device(struct usb_hcd *hcd,
SLOT_STATE_DISABLED)
return 0;
+ if (xhci->quirks & XHCI_ETRON_HOST) {
+ /*
+ * Obtaining a new device slot to inform the xHCI host that
+ * the USB device has been reset.
+ */
+ ret = xhci_disable_slot(xhci, udev->slot_id);
+ xhci_free_virt_device(xhci, udev->slot_id);
+ if (!ret) {
+ ret = xhci_alloc_dev(hcd, udev);
+ if (ret == 1)
+ ret = 0;
+ else
+ ret = -EINVAL;
+ }
+ return ret;
+ }
+
trace_xhci_discover_or_reset_device(slot_ctx);
xhci_dbg(xhci, "Resetting device with slot ID %u\n", slot_id);
diff --git a/drivers/usb/host/xhci.h b/drivers/usb/host/xhci.h
index d3b250c736b8..a0204e10486d 100644
--- a/drivers/usb/host/xhci.h
+++ b/drivers/usb/host/xhci.h
@@ -1631,6 +1631,7 @@ struct xhci_hcd {
#define XHCI_ZHAOXIN_HOST BIT_ULL(46)
#define XHCI_WRITE_64_HI_LO BIT_ULL(47)
#define XHCI_CDNS_SCTX_QUIRK BIT_ULL(48)
+#define XHCI_ETRON_HOST BIT_ULL(49)
unsigned int num_active_eps;
unsigned int limit_active_eps;
The patch below does not apply to the 5.15-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.15.y
git checkout FETCH_HEAD
git cherry-pick -x 76d98856b1c6d06ce18f32c20527a4f9d283e660
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024120227-yapping-outright-27f7@gregkh' --subject-prefix 'PATCH 5.15.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 76d98856b1c6d06ce18f32c20527a4f9d283e660 Mon Sep 17 00:00:00 2001
From: Kuangyi Chiang <ki.chiang65(a)gmail.com>
Date: Wed, 6 Nov 2024 12:14:44 +0200
Subject: [PATCH] xhci: Don't issue Reset Device command to Etron xHCI host
Sometimes the hub driver does not recognize the USB device connected
to the external USB2.0 hub when the system resumes from S4.
After the SetPortFeature(PORT_RESET) request is completed, the hub
driver calls the HCD reset_device callback, which will issue a Reset
Device command and free all structures associated with endpoints
that were disabled.
This happens when the xHCI driver issue a Reset Device command to
inform the Etron xHCI host that the USB device associated with a
device slot has been reset. Seems that the Etron xHCI host can not
perform this command correctly, affecting the USB device.
To work around this, the xHCI driver should obtain a new device slot
with reference to commit 651aaf36a7d7 ("usb: xhci: Handle USB transaction
error on address command"), which is another way to inform the Etron
xHCI host that the USB device has been reset.
Add a new XHCI_ETRON_HOST quirk flag to invoke the workaround in
xhci_discover_or_reset_device().
Fixes: 2a8f82c4ceaf ("USB: xhci: Notify the xHC when a device is reset.")
Cc: stable(a)vger.kernel.org
Signed-off-by: Kuangyi Chiang <ki.chiang65(a)gmail.com>
Signed-off-by: Mathias Nyman <mathias.nyman(a)linux.intel.com>
Link: https://lore.kernel.org/r/20241106101459.775897-19-mathias.nyman@linux.inte…
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
diff --git a/drivers/usb/host/xhci-pci.c b/drivers/usb/host/xhci-pci.c
index db3c7e738213..4b8c93e59d6d 100644
--- a/drivers/usb/host/xhci-pci.c
+++ b/drivers/usb/host/xhci-pci.c
@@ -396,6 +396,7 @@ static void xhci_pci_quirks(struct device *dev, struct xhci_hcd *xhci)
if (pdev->vendor == PCI_VENDOR_ID_ETRON &&
(pdev->device == PCI_DEVICE_ID_EJ168 ||
pdev->device == PCI_DEVICE_ID_EJ188)) {
+ xhci->quirks |= XHCI_ETRON_HOST;
xhci->quirks |= XHCI_RESET_ON_RESUME;
xhci->quirks |= XHCI_BROKEN_STREAMS;
}
diff --git a/drivers/usb/host/xhci.c b/drivers/usb/host/xhci.c
index aa8c877f47ac..ae16253b53fb 100644
--- a/drivers/usb/host/xhci.c
+++ b/drivers/usb/host/xhci.c
@@ -3733,6 +3733,8 @@ void xhci_free_device_endpoint_resources(struct xhci_hcd *xhci,
xhci->num_active_eps);
}
+static void xhci_free_dev(struct usb_hcd *hcd, struct usb_device *udev);
+
/*
* This submits a Reset Device Command, which will set the device state to 0,
* set the device address to 0, and disable all the endpoints except the default
@@ -3803,6 +3805,23 @@ static int xhci_discover_or_reset_device(struct usb_hcd *hcd,
SLOT_STATE_DISABLED)
return 0;
+ if (xhci->quirks & XHCI_ETRON_HOST) {
+ /*
+ * Obtaining a new device slot to inform the xHCI host that
+ * the USB device has been reset.
+ */
+ ret = xhci_disable_slot(xhci, udev->slot_id);
+ xhci_free_virt_device(xhci, udev->slot_id);
+ if (!ret) {
+ ret = xhci_alloc_dev(hcd, udev);
+ if (ret == 1)
+ ret = 0;
+ else
+ ret = -EINVAL;
+ }
+ return ret;
+ }
+
trace_xhci_discover_or_reset_device(slot_ctx);
xhci_dbg(xhci, "Resetting device with slot ID %u\n", slot_id);
diff --git a/drivers/usb/host/xhci.h b/drivers/usb/host/xhci.h
index d3b250c736b8..a0204e10486d 100644
--- a/drivers/usb/host/xhci.h
+++ b/drivers/usb/host/xhci.h
@@ -1631,6 +1631,7 @@ struct xhci_hcd {
#define XHCI_ZHAOXIN_HOST BIT_ULL(46)
#define XHCI_WRITE_64_HI_LO BIT_ULL(47)
#define XHCI_CDNS_SCTX_QUIRK BIT_ULL(48)
+#define XHCI_ETRON_HOST BIT_ULL(49)
unsigned int num_active_eps;
unsigned int limit_active_eps;
The patch below does not apply to the 6.1-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.1.y
git checkout FETCH_HEAD
git cherry-pick -x 76d98856b1c6d06ce18f32c20527a4f9d283e660
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024120226-secret-penknife-88db@gregkh' --subject-prefix 'PATCH 6.1.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 76d98856b1c6d06ce18f32c20527a4f9d283e660 Mon Sep 17 00:00:00 2001
From: Kuangyi Chiang <ki.chiang65(a)gmail.com>
Date: Wed, 6 Nov 2024 12:14:44 +0200
Subject: [PATCH] xhci: Don't issue Reset Device command to Etron xHCI host
Sometimes the hub driver does not recognize the USB device connected
to the external USB2.0 hub when the system resumes from S4.
After the SetPortFeature(PORT_RESET) request is completed, the hub
driver calls the HCD reset_device callback, which will issue a Reset
Device command and free all structures associated with endpoints
that were disabled.
This happens when the xHCI driver issue a Reset Device command to
inform the Etron xHCI host that the USB device associated with a
device slot has been reset. Seems that the Etron xHCI host can not
perform this command correctly, affecting the USB device.
To work around this, the xHCI driver should obtain a new device slot
with reference to commit 651aaf36a7d7 ("usb: xhci: Handle USB transaction
error on address command"), which is another way to inform the Etron
xHCI host that the USB device has been reset.
Add a new XHCI_ETRON_HOST quirk flag to invoke the workaround in
xhci_discover_or_reset_device().
Fixes: 2a8f82c4ceaf ("USB: xhci: Notify the xHC when a device is reset.")
Cc: stable(a)vger.kernel.org
Signed-off-by: Kuangyi Chiang <ki.chiang65(a)gmail.com>
Signed-off-by: Mathias Nyman <mathias.nyman(a)linux.intel.com>
Link: https://lore.kernel.org/r/20241106101459.775897-19-mathias.nyman@linux.inte…
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
diff --git a/drivers/usb/host/xhci-pci.c b/drivers/usb/host/xhci-pci.c
index db3c7e738213..4b8c93e59d6d 100644
--- a/drivers/usb/host/xhci-pci.c
+++ b/drivers/usb/host/xhci-pci.c
@@ -396,6 +396,7 @@ static void xhci_pci_quirks(struct device *dev, struct xhci_hcd *xhci)
if (pdev->vendor == PCI_VENDOR_ID_ETRON &&
(pdev->device == PCI_DEVICE_ID_EJ168 ||
pdev->device == PCI_DEVICE_ID_EJ188)) {
+ xhci->quirks |= XHCI_ETRON_HOST;
xhci->quirks |= XHCI_RESET_ON_RESUME;
xhci->quirks |= XHCI_BROKEN_STREAMS;
}
diff --git a/drivers/usb/host/xhci.c b/drivers/usb/host/xhci.c
index aa8c877f47ac..ae16253b53fb 100644
--- a/drivers/usb/host/xhci.c
+++ b/drivers/usb/host/xhci.c
@@ -3733,6 +3733,8 @@ void xhci_free_device_endpoint_resources(struct xhci_hcd *xhci,
xhci->num_active_eps);
}
+static void xhci_free_dev(struct usb_hcd *hcd, struct usb_device *udev);
+
/*
* This submits a Reset Device Command, which will set the device state to 0,
* set the device address to 0, and disable all the endpoints except the default
@@ -3803,6 +3805,23 @@ static int xhci_discover_or_reset_device(struct usb_hcd *hcd,
SLOT_STATE_DISABLED)
return 0;
+ if (xhci->quirks & XHCI_ETRON_HOST) {
+ /*
+ * Obtaining a new device slot to inform the xHCI host that
+ * the USB device has been reset.
+ */
+ ret = xhci_disable_slot(xhci, udev->slot_id);
+ xhci_free_virt_device(xhci, udev->slot_id);
+ if (!ret) {
+ ret = xhci_alloc_dev(hcd, udev);
+ if (ret == 1)
+ ret = 0;
+ else
+ ret = -EINVAL;
+ }
+ return ret;
+ }
+
trace_xhci_discover_or_reset_device(slot_ctx);
xhci_dbg(xhci, "Resetting device with slot ID %u\n", slot_id);
diff --git a/drivers/usb/host/xhci.h b/drivers/usb/host/xhci.h
index d3b250c736b8..a0204e10486d 100644
--- a/drivers/usb/host/xhci.h
+++ b/drivers/usb/host/xhci.h
@@ -1631,6 +1631,7 @@ struct xhci_hcd {
#define XHCI_ZHAOXIN_HOST BIT_ULL(46)
#define XHCI_WRITE_64_HI_LO BIT_ULL(47)
#define XHCI_CDNS_SCTX_QUIRK BIT_ULL(48)
+#define XHCI_ETRON_HOST BIT_ULL(49)
unsigned int num_active_eps;
unsigned int limit_active_eps;
The patch below does not apply to the 6.6-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.6.y
git checkout FETCH_HEAD
git cherry-pick -x 76d98856b1c6d06ce18f32c20527a4f9d283e660
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024120225-presoak-fresh-7f09@gregkh' --subject-prefix 'PATCH 6.6.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 76d98856b1c6d06ce18f32c20527a4f9d283e660 Mon Sep 17 00:00:00 2001
From: Kuangyi Chiang <ki.chiang65(a)gmail.com>
Date: Wed, 6 Nov 2024 12:14:44 +0200
Subject: [PATCH] xhci: Don't issue Reset Device command to Etron xHCI host
Sometimes the hub driver does not recognize the USB device connected
to the external USB2.0 hub when the system resumes from S4.
After the SetPortFeature(PORT_RESET) request is completed, the hub
driver calls the HCD reset_device callback, which will issue a Reset
Device command and free all structures associated with endpoints
that were disabled.
This happens when the xHCI driver issue a Reset Device command to
inform the Etron xHCI host that the USB device associated with a
device slot has been reset. Seems that the Etron xHCI host can not
perform this command correctly, affecting the USB device.
To work around this, the xHCI driver should obtain a new device slot
with reference to commit 651aaf36a7d7 ("usb: xhci: Handle USB transaction
error on address command"), which is another way to inform the Etron
xHCI host that the USB device has been reset.
Add a new XHCI_ETRON_HOST quirk flag to invoke the workaround in
xhci_discover_or_reset_device().
Fixes: 2a8f82c4ceaf ("USB: xhci: Notify the xHC when a device is reset.")
Cc: stable(a)vger.kernel.org
Signed-off-by: Kuangyi Chiang <ki.chiang65(a)gmail.com>
Signed-off-by: Mathias Nyman <mathias.nyman(a)linux.intel.com>
Link: https://lore.kernel.org/r/20241106101459.775897-19-mathias.nyman@linux.inte…
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
diff --git a/drivers/usb/host/xhci-pci.c b/drivers/usb/host/xhci-pci.c
index db3c7e738213..4b8c93e59d6d 100644
--- a/drivers/usb/host/xhci-pci.c
+++ b/drivers/usb/host/xhci-pci.c
@@ -396,6 +396,7 @@ static void xhci_pci_quirks(struct device *dev, struct xhci_hcd *xhci)
if (pdev->vendor == PCI_VENDOR_ID_ETRON &&
(pdev->device == PCI_DEVICE_ID_EJ168 ||
pdev->device == PCI_DEVICE_ID_EJ188)) {
+ xhci->quirks |= XHCI_ETRON_HOST;
xhci->quirks |= XHCI_RESET_ON_RESUME;
xhci->quirks |= XHCI_BROKEN_STREAMS;
}
diff --git a/drivers/usb/host/xhci.c b/drivers/usb/host/xhci.c
index aa8c877f47ac..ae16253b53fb 100644
--- a/drivers/usb/host/xhci.c
+++ b/drivers/usb/host/xhci.c
@@ -3733,6 +3733,8 @@ void xhci_free_device_endpoint_resources(struct xhci_hcd *xhci,
xhci->num_active_eps);
}
+static void xhci_free_dev(struct usb_hcd *hcd, struct usb_device *udev);
+
/*
* This submits a Reset Device Command, which will set the device state to 0,
* set the device address to 0, and disable all the endpoints except the default
@@ -3803,6 +3805,23 @@ static int xhci_discover_or_reset_device(struct usb_hcd *hcd,
SLOT_STATE_DISABLED)
return 0;
+ if (xhci->quirks & XHCI_ETRON_HOST) {
+ /*
+ * Obtaining a new device slot to inform the xHCI host that
+ * the USB device has been reset.
+ */
+ ret = xhci_disable_slot(xhci, udev->slot_id);
+ xhci_free_virt_device(xhci, udev->slot_id);
+ if (!ret) {
+ ret = xhci_alloc_dev(hcd, udev);
+ if (ret == 1)
+ ret = 0;
+ else
+ ret = -EINVAL;
+ }
+ return ret;
+ }
+
trace_xhci_discover_or_reset_device(slot_ctx);
xhci_dbg(xhci, "Resetting device with slot ID %u\n", slot_id);
diff --git a/drivers/usb/host/xhci.h b/drivers/usb/host/xhci.h
index d3b250c736b8..a0204e10486d 100644
--- a/drivers/usb/host/xhci.h
+++ b/drivers/usb/host/xhci.h
@@ -1631,6 +1631,7 @@ struct xhci_hcd {
#define XHCI_ZHAOXIN_HOST BIT_ULL(46)
#define XHCI_WRITE_64_HI_LO BIT_ULL(47)
#define XHCI_CDNS_SCTX_QUIRK BIT_ULL(48)
+#define XHCI_ETRON_HOST BIT_ULL(49)
unsigned int num_active_eps;
unsigned int limit_active_eps;
The patch below does not apply to the 4.19-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-4.19.y
git checkout FETCH_HEAD
git cherry-pick -x d7b11fe5790203fcc0db182249d7bfd945e44ccb
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024120252-sheath-spotted-fbe5@gregkh' --subject-prefix 'PATCH 4.19.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From d7b11fe5790203fcc0db182249d7bfd945e44ccb Mon Sep 17 00:00:00 2001
From: Kuangyi Chiang <ki.chiang65(a)gmail.com>
Date: Wed, 6 Nov 2024 12:14:43 +0200
Subject: [PATCH] xhci: Combine two if statements for Etron xHCI host
Combine two if statements, because these hosts have the same
quirk flags applied.
[Mathias: has stable tag because other fixes in series depend on this]
Fixes: 91f7a1524a92 ("xhci: Apply broken streams quirk to Etron EJ188 xHCI host")
Cc: stable(a)vger.kernel.org
Signed-off-by: Kuangyi Chiang <ki.chiang65(a)gmail.com>
Signed-off-by: Mathias Nyman <mathias.nyman(a)linux.intel.com>
Link: https://lore.kernel.org/r/20241106101459.775897-18-mathias.nyman@linux.inte…
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
diff --git a/drivers/usb/host/xhci-pci.c b/drivers/usb/host/xhci-pci.c
index 7803ff1f1c9f..db3c7e738213 100644
--- a/drivers/usb/host/xhci-pci.c
+++ b/drivers/usb/host/xhci-pci.c
@@ -394,12 +394,8 @@ static void xhci_pci_quirks(struct device *dev, struct xhci_hcd *xhci)
xhci->quirks |= XHCI_DEFAULT_PM_RUNTIME_ALLOW;
if (pdev->vendor == PCI_VENDOR_ID_ETRON &&
- pdev->device == PCI_DEVICE_ID_EJ168) {
- xhci->quirks |= XHCI_RESET_ON_RESUME;
- xhci->quirks |= XHCI_BROKEN_STREAMS;
- }
- if (pdev->vendor == PCI_VENDOR_ID_ETRON &&
- pdev->device == PCI_DEVICE_ID_EJ188) {
+ (pdev->device == PCI_DEVICE_ID_EJ168 ||
+ pdev->device == PCI_DEVICE_ID_EJ188)) {
xhci->quirks |= XHCI_RESET_ON_RESUME;
xhci->quirks |= XHCI_BROKEN_STREAMS;
}
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.4.y
git checkout FETCH_HEAD
git cherry-pick -x d7b11fe5790203fcc0db182249d7bfd945e44ccb
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024120251-backrest-passion-3be3@gregkh' --subject-prefix 'PATCH 5.4.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From d7b11fe5790203fcc0db182249d7bfd945e44ccb Mon Sep 17 00:00:00 2001
From: Kuangyi Chiang <ki.chiang65(a)gmail.com>
Date: Wed, 6 Nov 2024 12:14:43 +0200
Subject: [PATCH] xhci: Combine two if statements for Etron xHCI host
Combine two if statements, because these hosts have the same
quirk flags applied.
[Mathias: has stable tag because other fixes in series depend on this]
Fixes: 91f7a1524a92 ("xhci: Apply broken streams quirk to Etron EJ188 xHCI host")
Cc: stable(a)vger.kernel.org
Signed-off-by: Kuangyi Chiang <ki.chiang65(a)gmail.com>
Signed-off-by: Mathias Nyman <mathias.nyman(a)linux.intel.com>
Link: https://lore.kernel.org/r/20241106101459.775897-18-mathias.nyman@linux.inte…
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
diff --git a/drivers/usb/host/xhci-pci.c b/drivers/usb/host/xhci-pci.c
index 7803ff1f1c9f..db3c7e738213 100644
--- a/drivers/usb/host/xhci-pci.c
+++ b/drivers/usb/host/xhci-pci.c
@@ -394,12 +394,8 @@ static void xhci_pci_quirks(struct device *dev, struct xhci_hcd *xhci)
xhci->quirks |= XHCI_DEFAULT_PM_RUNTIME_ALLOW;
if (pdev->vendor == PCI_VENDOR_ID_ETRON &&
- pdev->device == PCI_DEVICE_ID_EJ168) {
- xhci->quirks |= XHCI_RESET_ON_RESUME;
- xhci->quirks |= XHCI_BROKEN_STREAMS;
- }
- if (pdev->vendor == PCI_VENDOR_ID_ETRON &&
- pdev->device == PCI_DEVICE_ID_EJ188) {
+ (pdev->device == PCI_DEVICE_ID_EJ168 ||
+ pdev->device == PCI_DEVICE_ID_EJ188)) {
xhci->quirks |= XHCI_RESET_ON_RESUME;
xhci->quirks |= XHCI_BROKEN_STREAMS;
}
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.10.y
git checkout FETCH_HEAD
git cherry-pick -x d7b11fe5790203fcc0db182249d7bfd945e44ccb
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024120251-exponent-polygon-9f02@gregkh' --subject-prefix 'PATCH 5.10.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From d7b11fe5790203fcc0db182249d7bfd945e44ccb Mon Sep 17 00:00:00 2001
From: Kuangyi Chiang <ki.chiang65(a)gmail.com>
Date: Wed, 6 Nov 2024 12:14:43 +0200
Subject: [PATCH] xhci: Combine two if statements for Etron xHCI host
Combine two if statements, because these hosts have the same
quirk flags applied.
[Mathias: has stable tag because other fixes in series depend on this]
Fixes: 91f7a1524a92 ("xhci: Apply broken streams quirk to Etron EJ188 xHCI host")
Cc: stable(a)vger.kernel.org
Signed-off-by: Kuangyi Chiang <ki.chiang65(a)gmail.com>
Signed-off-by: Mathias Nyman <mathias.nyman(a)linux.intel.com>
Link: https://lore.kernel.org/r/20241106101459.775897-18-mathias.nyman@linux.inte…
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
diff --git a/drivers/usb/host/xhci-pci.c b/drivers/usb/host/xhci-pci.c
index 7803ff1f1c9f..db3c7e738213 100644
--- a/drivers/usb/host/xhci-pci.c
+++ b/drivers/usb/host/xhci-pci.c
@@ -394,12 +394,8 @@ static void xhci_pci_quirks(struct device *dev, struct xhci_hcd *xhci)
xhci->quirks |= XHCI_DEFAULT_PM_RUNTIME_ALLOW;
if (pdev->vendor == PCI_VENDOR_ID_ETRON &&
- pdev->device == PCI_DEVICE_ID_EJ168) {
- xhci->quirks |= XHCI_RESET_ON_RESUME;
- xhci->quirks |= XHCI_BROKEN_STREAMS;
- }
- if (pdev->vendor == PCI_VENDOR_ID_ETRON &&
- pdev->device == PCI_DEVICE_ID_EJ188) {
+ (pdev->device == PCI_DEVICE_ID_EJ168 ||
+ pdev->device == PCI_DEVICE_ID_EJ188)) {
xhci->quirks |= XHCI_RESET_ON_RESUME;
xhci->quirks |= XHCI_BROKEN_STREAMS;
}
The patch below does not apply to the 6.1-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.1.y
git checkout FETCH_HEAD
git cherry-pick -x d7b11fe5790203fcc0db182249d7bfd945e44ccb
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024120249-dynamite-flounder-73f6@gregkh' --subject-prefix 'PATCH 6.1.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From d7b11fe5790203fcc0db182249d7bfd945e44ccb Mon Sep 17 00:00:00 2001
From: Kuangyi Chiang <ki.chiang65(a)gmail.com>
Date: Wed, 6 Nov 2024 12:14:43 +0200
Subject: [PATCH] xhci: Combine two if statements for Etron xHCI host
Combine two if statements, because these hosts have the same
quirk flags applied.
[Mathias: has stable tag because other fixes in series depend on this]
Fixes: 91f7a1524a92 ("xhci: Apply broken streams quirk to Etron EJ188 xHCI host")
Cc: stable(a)vger.kernel.org
Signed-off-by: Kuangyi Chiang <ki.chiang65(a)gmail.com>
Signed-off-by: Mathias Nyman <mathias.nyman(a)linux.intel.com>
Link: https://lore.kernel.org/r/20241106101459.775897-18-mathias.nyman@linux.inte…
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
diff --git a/drivers/usb/host/xhci-pci.c b/drivers/usb/host/xhci-pci.c
index 7803ff1f1c9f..db3c7e738213 100644
--- a/drivers/usb/host/xhci-pci.c
+++ b/drivers/usb/host/xhci-pci.c
@@ -394,12 +394,8 @@ static void xhci_pci_quirks(struct device *dev, struct xhci_hcd *xhci)
xhci->quirks |= XHCI_DEFAULT_PM_RUNTIME_ALLOW;
if (pdev->vendor == PCI_VENDOR_ID_ETRON &&
- pdev->device == PCI_DEVICE_ID_EJ168) {
- xhci->quirks |= XHCI_RESET_ON_RESUME;
- xhci->quirks |= XHCI_BROKEN_STREAMS;
- }
- if (pdev->vendor == PCI_VENDOR_ID_ETRON &&
- pdev->device == PCI_DEVICE_ID_EJ188) {
+ (pdev->device == PCI_DEVICE_ID_EJ168 ||
+ pdev->device == PCI_DEVICE_ID_EJ188)) {
xhci->quirks |= XHCI_RESET_ON_RESUME;
xhci->quirks |= XHCI_BROKEN_STREAMS;
}
The patch below does not apply to the 5.15-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.15.y
git checkout FETCH_HEAD
git cherry-pick -x d7b11fe5790203fcc0db182249d7bfd945e44ccb
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024120250-wok-batboy-aebc@gregkh' --subject-prefix 'PATCH 5.15.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From d7b11fe5790203fcc0db182249d7bfd945e44ccb Mon Sep 17 00:00:00 2001
From: Kuangyi Chiang <ki.chiang65(a)gmail.com>
Date: Wed, 6 Nov 2024 12:14:43 +0200
Subject: [PATCH] xhci: Combine two if statements for Etron xHCI host
Combine two if statements, because these hosts have the same
quirk flags applied.
[Mathias: has stable tag because other fixes in series depend on this]
Fixes: 91f7a1524a92 ("xhci: Apply broken streams quirk to Etron EJ188 xHCI host")
Cc: stable(a)vger.kernel.org
Signed-off-by: Kuangyi Chiang <ki.chiang65(a)gmail.com>
Signed-off-by: Mathias Nyman <mathias.nyman(a)linux.intel.com>
Link: https://lore.kernel.org/r/20241106101459.775897-18-mathias.nyman@linux.inte…
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
diff --git a/drivers/usb/host/xhci-pci.c b/drivers/usb/host/xhci-pci.c
index 7803ff1f1c9f..db3c7e738213 100644
--- a/drivers/usb/host/xhci-pci.c
+++ b/drivers/usb/host/xhci-pci.c
@@ -394,12 +394,8 @@ static void xhci_pci_quirks(struct device *dev, struct xhci_hcd *xhci)
xhci->quirks |= XHCI_DEFAULT_PM_RUNTIME_ALLOW;
if (pdev->vendor == PCI_VENDOR_ID_ETRON &&
- pdev->device == PCI_DEVICE_ID_EJ168) {
- xhci->quirks |= XHCI_RESET_ON_RESUME;
- xhci->quirks |= XHCI_BROKEN_STREAMS;
- }
- if (pdev->vendor == PCI_VENDOR_ID_ETRON &&
- pdev->device == PCI_DEVICE_ID_EJ188) {
+ (pdev->device == PCI_DEVICE_ID_EJ168 ||
+ pdev->device == PCI_DEVICE_ID_EJ188)) {
xhci->quirks |= XHCI_RESET_ON_RESUME;
xhci->quirks |= XHCI_BROKEN_STREAMS;
}
The patch below does not apply to the 6.6-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.6.y
git checkout FETCH_HEAD
git cherry-pick -x d7b11fe5790203fcc0db182249d7bfd945e44ccb
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024120249-tribunal-catacomb-b0ad@gregkh' --subject-prefix 'PATCH 6.6.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From d7b11fe5790203fcc0db182249d7bfd945e44ccb Mon Sep 17 00:00:00 2001
From: Kuangyi Chiang <ki.chiang65(a)gmail.com>
Date: Wed, 6 Nov 2024 12:14:43 +0200
Subject: [PATCH] xhci: Combine two if statements for Etron xHCI host
Combine two if statements, because these hosts have the same
quirk flags applied.
[Mathias: has stable tag because other fixes in series depend on this]
Fixes: 91f7a1524a92 ("xhci: Apply broken streams quirk to Etron EJ188 xHCI host")
Cc: stable(a)vger.kernel.org
Signed-off-by: Kuangyi Chiang <ki.chiang65(a)gmail.com>
Signed-off-by: Mathias Nyman <mathias.nyman(a)linux.intel.com>
Link: https://lore.kernel.org/r/20241106101459.775897-18-mathias.nyman@linux.inte…
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
diff --git a/drivers/usb/host/xhci-pci.c b/drivers/usb/host/xhci-pci.c
index 7803ff1f1c9f..db3c7e738213 100644
--- a/drivers/usb/host/xhci-pci.c
+++ b/drivers/usb/host/xhci-pci.c
@@ -394,12 +394,8 @@ static void xhci_pci_quirks(struct device *dev, struct xhci_hcd *xhci)
xhci->quirks |= XHCI_DEFAULT_PM_RUNTIME_ALLOW;
if (pdev->vendor == PCI_VENDOR_ID_ETRON &&
- pdev->device == PCI_DEVICE_ID_EJ168) {
- xhci->quirks |= XHCI_RESET_ON_RESUME;
- xhci->quirks |= XHCI_BROKEN_STREAMS;
- }
- if (pdev->vendor == PCI_VENDOR_ID_ETRON &&
- pdev->device == PCI_DEVICE_ID_EJ188) {
+ (pdev->device == PCI_DEVICE_ID_EJ168 ||
+ pdev->device == PCI_DEVICE_ID_EJ188)) {
xhci->quirks |= XHCI_RESET_ON_RESUME;
xhci->quirks |= XHCI_BROKEN_STREAMS;
}
The patch below does not apply to the 4.19-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-4.19.y
git checkout FETCH_HEAD
git cherry-pick -x e9649129d33dca561305fc590a7c4ba8c3e5675a
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024120253-crawling-traverse-68fd@gregkh' --subject-prefix 'PATCH 4.19.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From e9649129d33dca561305fc590a7c4ba8c3e5675a Mon Sep 17 00:00:00 2001
From: Kunkun Jiang <jiangkunkun(a)huawei.com>
Date: Thu, 7 Nov 2024 13:41:36 -0800
Subject: [PATCH] KVM: arm64: vgic-its: Clear DTE when MAPD unmaps a device
vgic_its_save_device_tables will traverse its->device_list to
save DTE for each device. vgic_its_restore_device_tables will
traverse each entry of device table and check if it is valid.
Restore if valid.
But when MAPD unmaps a device, it does not invalidate the
corresponding DTE. In the scenario of continuous saves
and restores, there may be a situation where a device's DTE
is not saved but is restored. This is unreasonable and may
cause restore to fail. This patch clears the corresponding
DTE when MAPD unmaps a device.
Cc: stable(a)vger.kernel.org
Fixes: 57a9a117154c ("KVM: arm64: vgic-its: Device table save/restore")
Co-developed-by: Shusen Li <lishusen2(a)huawei.com>
Signed-off-by: Shusen Li <lishusen2(a)huawei.com>
Signed-off-by: Kunkun Jiang <jiangkunkun(a)huawei.com>
[Jing: Update with entry write helper]
Signed-off-by: Jing Zhang <jingzhangos(a)google.com>
Link: https://lore.kernel.org/r/20241107214137.428439-5-jingzhangos@google.com
Signed-off-by: Oliver Upton <oliver.upton(a)linux.dev>
diff --git a/arch/arm64/kvm/vgic/vgic-its.c b/arch/arm64/kvm/vgic/vgic-its.c
index 68ba7e2453cd..b77fa99eafed 100644
--- a/arch/arm64/kvm/vgic/vgic-its.c
+++ b/arch/arm64/kvm/vgic/vgic-its.c
@@ -1139,9 +1139,11 @@ static int vgic_its_cmd_handle_mapd(struct kvm *kvm, struct vgic_its *its,
bool valid = its_cmd_get_validbit(its_cmd);
u8 num_eventid_bits = its_cmd_get_size(its_cmd);
gpa_t itt_addr = its_cmd_get_ittaddr(its_cmd);
+ int dte_esz = vgic_its_get_abi(its)->dte_esz;
struct its_device *device;
+ gpa_t gpa;
- if (!vgic_its_check_id(its, its->baser_device_table, device_id, NULL))
+ if (!vgic_its_check_id(its, its->baser_device_table, device_id, &gpa))
return E_ITS_MAPD_DEVICE_OOR;
if (valid && num_eventid_bits > VITS_TYPER_IDBITS)
@@ -1162,7 +1164,7 @@ static int vgic_its_cmd_handle_mapd(struct kvm *kvm, struct vgic_its *its,
* is an error, so we are done in any case.
*/
if (!valid)
- return 0;
+ return vgic_its_write_entry_lock(its, gpa, 0, dte_esz);
device = vgic_its_alloc_device(its, device_id, itt_addr,
num_eventid_bits);
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.4.y
git checkout FETCH_HEAD
git cherry-pick -x e9649129d33dca561305fc590a7c4ba8c3e5675a
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024120252-submersed-reorder-64c3@gregkh' --subject-prefix 'PATCH 5.4.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From e9649129d33dca561305fc590a7c4ba8c3e5675a Mon Sep 17 00:00:00 2001
From: Kunkun Jiang <jiangkunkun(a)huawei.com>
Date: Thu, 7 Nov 2024 13:41:36 -0800
Subject: [PATCH] KVM: arm64: vgic-its: Clear DTE when MAPD unmaps a device
vgic_its_save_device_tables will traverse its->device_list to
save DTE for each device. vgic_its_restore_device_tables will
traverse each entry of device table and check if it is valid.
Restore if valid.
But when MAPD unmaps a device, it does not invalidate the
corresponding DTE. In the scenario of continuous saves
and restores, there may be a situation where a device's DTE
is not saved but is restored. This is unreasonable and may
cause restore to fail. This patch clears the corresponding
DTE when MAPD unmaps a device.
Cc: stable(a)vger.kernel.org
Fixes: 57a9a117154c ("KVM: arm64: vgic-its: Device table save/restore")
Co-developed-by: Shusen Li <lishusen2(a)huawei.com>
Signed-off-by: Shusen Li <lishusen2(a)huawei.com>
Signed-off-by: Kunkun Jiang <jiangkunkun(a)huawei.com>
[Jing: Update with entry write helper]
Signed-off-by: Jing Zhang <jingzhangos(a)google.com>
Link: https://lore.kernel.org/r/20241107214137.428439-5-jingzhangos@google.com
Signed-off-by: Oliver Upton <oliver.upton(a)linux.dev>
diff --git a/arch/arm64/kvm/vgic/vgic-its.c b/arch/arm64/kvm/vgic/vgic-its.c
index 68ba7e2453cd..b77fa99eafed 100644
--- a/arch/arm64/kvm/vgic/vgic-its.c
+++ b/arch/arm64/kvm/vgic/vgic-its.c
@@ -1139,9 +1139,11 @@ static int vgic_its_cmd_handle_mapd(struct kvm *kvm, struct vgic_its *its,
bool valid = its_cmd_get_validbit(its_cmd);
u8 num_eventid_bits = its_cmd_get_size(its_cmd);
gpa_t itt_addr = its_cmd_get_ittaddr(its_cmd);
+ int dte_esz = vgic_its_get_abi(its)->dte_esz;
struct its_device *device;
+ gpa_t gpa;
- if (!vgic_its_check_id(its, its->baser_device_table, device_id, NULL))
+ if (!vgic_its_check_id(its, its->baser_device_table, device_id, &gpa))
return E_ITS_MAPD_DEVICE_OOR;
if (valid && num_eventid_bits > VITS_TYPER_IDBITS)
@@ -1162,7 +1164,7 @@ static int vgic_its_cmd_handle_mapd(struct kvm *kvm, struct vgic_its *its,
* is an error, so we are done in any case.
*/
if (!valid)
- return 0;
+ return vgic_its_write_entry_lock(its, gpa, 0, dte_esz);
device = vgic_its_alloc_device(its, device_id, itt_addr,
num_eventid_bits);
The patch below does not apply to the 4.19-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-4.19.y
git checkout FETCH_HEAD
git cherry-pick -x 7602ffd1d5e8927fadd5187cb4aed2fdc9c47143
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024120229-transfer-capped-7af1@gregkh' --subject-prefix 'PATCH 4.19.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 7602ffd1d5e8927fadd5187cb4aed2fdc9c47143 Mon Sep 17 00:00:00 2001
From: Kunkun Jiang <jiangkunkun(a)huawei.com>
Date: Thu, 7 Nov 2024 13:41:37 -0800
Subject: [PATCH] KVM: arm64: vgic-its: Clear ITE when DISCARD frees an ITE
When DISCARD frees an ITE, it does not invalidate the
corresponding ITE. In the scenario of continuous saves and
restores, there may be a situation where an ITE is not saved
but is restored. This is unreasonable and may cause restore
to fail. This patch clears the corresponding ITE when DISCARD
frees an ITE.
Cc: stable(a)vger.kernel.org
Fixes: eff484e0298d ("KVM: arm64: vgic-its: ITT save and restore")
Signed-off-by: Kunkun Jiang <jiangkunkun(a)huawei.com>
[Jing: Update with entry write helper]
Signed-off-by: Jing Zhang <jingzhangos(a)google.com>
Link: https://lore.kernel.org/r/20241107214137.428439-6-jingzhangos@google.com
Signed-off-by: Oliver Upton <oliver.upton(a)linux.dev>
diff --git a/arch/arm64/kvm/vgic/vgic-its.c b/arch/arm64/kvm/vgic/vgic-its.c
index b77fa99eafed..198296933e7e 100644
--- a/arch/arm64/kvm/vgic/vgic-its.c
+++ b/arch/arm64/kvm/vgic/vgic-its.c
@@ -782,6 +782,9 @@ static int vgic_its_cmd_handle_discard(struct kvm *kvm, struct vgic_its *its,
ite = find_ite(its, device_id, event_id);
if (ite && its_is_collection_mapped(ite->collection)) {
+ struct its_device *device = find_its_device(its, device_id);
+ int ite_esz = vgic_its_get_abi(its)->ite_esz;
+ gpa_t gpa = device->itt_addr + ite->event_id * ite_esz;
/*
* Though the spec talks about removing the pending state, we
* don't bother here since we clear the ITTE anyway and the
@@ -790,7 +793,8 @@ static int vgic_its_cmd_handle_discard(struct kvm *kvm, struct vgic_its *its,
vgic_its_invalidate_cache(its);
its_free_ite(kvm, ite);
- return 0;
+
+ return vgic_its_write_entry_lock(its, gpa, 0, ite_esz);
}
return E_ITS_DISCARD_UNMAPPED_INTERRUPT;
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.4.y
git checkout FETCH_HEAD
git cherry-pick -x 7602ffd1d5e8927fadd5187cb4aed2fdc9c47143
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024120229-backshift-recolor-5b9b@gregkh' --subject-prefix 'PATCH 5.4.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 7602ffd1d5e8927fadd5187cb4aed2fdc9c47143 Mon Sep 17 00:00:00 2001
From: Kunkun Jiang <jiangkunkun(a)huawei.com>
Date: Thu, 7 Nov 2024 13:41:37 -0800
Subject: [PATCH] KVM: arm64: vgic-its: Clear ITE when DISCARD frees an ITE
When DISCARD frees an ITE, it does not invalidate the
corresponding ITE. In the scenario of continuous saves and
restores, there may be a situation where an ITE is not saved
but is restored. This is unreasonable and may cause restore
to fail. This patch clears the corresponding ITE when DISCARD
frees an ITE.
Cc: stable(a)vger.kernel.org
Fixes: eff484e0298d ("KVM: arm64: vgic-its: ITT save and restore")
Signed-off-by: Kunkun Jiang <jiangkunkun(a)huawei.com>
[Jing: Update with entry write helper]
Signed-off-by: Jing Zhang <jingzhangos(a)google.com>
Link: https://lore.kernel.org/r/20241107214137.428439-6-jingzhangos@google.com
Signed-off-by: Oliver Upton <oliver.upton(a)linux.dev>
diff --git a/arch/arm64/kvm/vgic/vgic-its.c b/arch/arm64/kvm/vgic/vgic-its.c
index b77fa99eafed..198296933e7e 100644
--- a/arch/arm64/kvm/vgic/vgic-its.c
+++ b/arch/arm64/kvm/vgic/vgic-its.c
@@ -782,6 +782,9 @@ static int vgic_its_cmd_handle_discard(struct kvm *kvm, struct vgic_its *its,
ite = find_ite(its, device_id, event_id);
if (ite && its_is_collection_mapped(ite->collection)) {
+ struct its_device *device = find_its_device(its, device_id);
+ int ite_esz = vgic_its_get_abi(its)->ite_esz;
+ gpa_t gpa = device->itt_addr + ite->event_id * ite_esz;
/*
* Though the spec talks about removing the pending state, we
* don't bother here since we clear the ITTE anyway and the
@@ -790,7 +793,8 @@ static int vgic_its_cmd_handle_discard(struct kvm *kvm, struct vgic_its *its,
vgic_its_invalidate_cache(its);
its_free_ite(kvm, ite);
- return 0;
+
+ return vgic_its_write_entry_lock(its, gpa, 0, ite_esz);
}
return E_ITS_DISCARD_UNMAPPED_INTERRUPT;
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.10.y
git checkout FETCH_HEAD
git cherry-pick -x e735a5da64420a86be370b216c269b5dd8e830e2
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024120209-cranberry-font-7473@gregkh' --subject-prefix 'PATCH 5.10.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From e735a5da64420a86be370b216c269b5dd8e830e2 Mon Sep 17 00:00:00 2001
From: Oliver Upton <oliver.upton(a)linux.dev>
Date: Fri, 25 Oct 2024 20:31:03 +0000
Subject: [PATCH] KVM: arm64: Don't retire aborted MMIO instruction
Returning an abort to the guest for an unsupported MMIO access is a
documented feature of the KVM UAPI. Nevertheless, it's clear that this
plumbing has seen limited testing, since userspace can trivially cause a
WARN in the MMIO return:
WARNING: CPU: 0 PID: 30558 at arch/arm64/include/asm/kvm_emulate.h:536 kvm_handle_mmio_return+0x46c/0x5c4 arch/arm64/include/asm/kvm_emulate.h:536
Call trace:
kvm_handle_mmio_return+0x46c/0x5c4 arch/arm64/include/asm/kvm_emulate.h:536
kvm_arch_vcpu_ioctl_run+0x98/0x15b4 arch/arm64/kvm/arm.c:1133
kvm_vcpu_ioctl+0x75c/0xa78 virt/kvm/kvm_main.c:4487
__do_sys_ioctl fs/ioctl.c:51 [inline]
__se_sys_ioctl fs/ioctl.c:893 [inline]
__arm64_sys_ioctl+0x14c/0x1c8 fs/ioctl.c:893
__invoke_syscall arch/arm64/kernel/syscall.c:35 [inline]
invoke_syscall+0x98/0x2b8 arch/arm64/kernel/syscall.c:49
el0_svc_common+0x1e0/0x23c arch/arm64/kernel/syscall.c:132
do_el0_svc+0x48/0x58 arch/arm64/kernel/syscall.c:151
el0_svc+0x38/0x68 arch/arm64/kernel/entry-common.c:712
el0t_64_sync_handler+0x90/0xfc arch/arm64/kernel/entry-common.c:730
el0t_64_sync+0x190/0x194 arch/arm64/kernel/entry.S:598
The splat is complaining that KVM is advancing PC while an exception is
pending, i.e. that KVM is retiring the MMIO instruction despite a
pending synchronous external abort. Womp womp.
Fix the glaring UAPI bug by skipping over all the MMIO emulation in
case there is a pending synchronous exception. Note that while userspace
is capable of pending an asynchronous exception (SError, IRQ, or FIQ),
it is still safe to retire the MMIO instruction in this case as (1) they
are by definition asynchronous, and (2) KVM relies on hardware support
for pending/delivering these exceptions instead of the software state
machine for advancing PC.
Cc: stable(a)vger.kernel.org
Fixes: da345174ceca ("KVM: arm/arm64: Allow user injection of external data aborts")
Reported-by: Alexander Potapenko <glider(a)google.com>
Reviewed-by: Marc Zyngier <maz(a)kernel.org>
Link: https://lore.kernel.org/r/20241025203106.3529261-2-oliver.upton@linux.dev
Signed-off-by: Oliver Upton <oliver.upton(a)linux.dev>
diff --git a/arch/arm64/kvm/mmio.c b/arch/arm64/kvm/mmio.c
index cd6b7b83e2c3..ab365e839874 100644
--- a/arch/arm64/kvm/mmio.c
+++ b/arch/arm64/kvm/mmio.c
@@ -72,6 +72,31 @@ unsigned long kvm_mmio_read_buf(const void *buf, unsigned int len)
return data;
}
+static bool kvm_pending_sync_exception(struct kvm_vcpu *vcpu)
+{
+ if (!vcpu_get_flag(vcpu, PENDING_EXCEPTION))
+ return false;
+
+ if (vcpu_el1_is_32bit(vcpu)) {
+ switch (vcpu_get_flag(vcpu, EXCEPT_MASK)) {
+ case unpack_vcpu_flag(EXCEPT_AA32_UND):
+ case unpack_vcpu_flag(EXCEPT_AA32_IABT):
+ case unpack_vcpu_flag(EXCEPT_AA32_DABT):
+ return true;
+ default:
+ return false;
+ }
+ } else {
+ switch (vcpu_get_flag(vcpu, EXCEPT_MASK)) {
+ case unpack_vcpu_flag(EXCEPT_AA64_EL1_SYNC):
+ case unpack_vcpu_flag(EXCEPT_AA64_EL2_SYNC):
+ return true;
+ default:
+ return false;
+ }
+ }
+}
+
/**
* kvm_handle_mmio_return -- Handle MMIO loads after user space emulation
* or in-kernel IO emulation
@@ -84,8 +109,11 @@ int kvm_handle_mmio_return(struct kvm_vcpu *vcpu)
unsigned int len;
int mask;
- /* Detect an already handled MMIO return */
- if (unlikely(!vcpu->mmio_needed))
+ /*
+ * Detect if the MMIO return was already handled or if userspace aborted
+ * the MMIO access.
+ */
+ if (unlikely(!vcpu->mmio_needed || kvm_pending_sync_exception(vcpu)))
return 1;
vcpu->mmio_needed = 0;
The patch below does not apply to the 5.15-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.15.y
git checkout FETCH_HEAD
git cherry-pick -x e735a5da64420a86be370b216c269b5dd8e830e2
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024120208-mundane-finite-682f@gregkh' --subject-prefix 'PATCH 5.15.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From e735a5da64420a86be370b216c269b5dd8e830e2 Mon Sep 17 00:00:00 2001
From: Oliver Upton <oliver.upton(a)linux.dev>
Date: Fri, 25 Oct 2024 20:31:03 +0000
Subject: [PATCH] KVM: arm64: Don't retire aborted MMIO instruction
Returning an abort to the guest for an unsupported MMIO access is a
documented feature of the KVM UAPI. Nevertheless, it's clear that this
plumbing has seen limited testing, since userspace can trivially cause a
WARN in the MMIO return:
WARNING: CPU: 0 PID: 30558 at arch/arm64/include/asm/kvm_emulate.h:536 kvm_handle_mmio_return+0x46c/0x5c4 arch/arm64/include/asm/kvm_emulate.h:536
Call trace:
kvm_handle_mmio_return+0x46c/0x5c4 arch/arm64/include/asm/kvm_emulate.h:536
kvm_arch_vcpu_ioctl_run+0x98/0x15b4 arch/arm64/kvm/arm.c:1133
kvm_vcpu_ioctl+0x75c/0xa78 virt/kvm/kvm_main.c:4487
__do_sys_ioctl fs/ioctl.c:51 [inline]
__se_sys_ioctl fs/ioctl.c:893 [inline]
__arm64_sys_ioctl+0x14c/0x1c8 fs/ioctl.c:893
__invoke_syscall arch/arm64/kernel/syscall.c:35 [inline]
invoke_syscall+0x98/0x2b8 arch/arm64/kernel/syscall.c:49
el0_svc_common+0x1e0/0x23c arch/arm64/kernel/syscall.c:132
do_el0_svc+0x48/0x58 arch/arm64/kernel/syscall.c:151
el0_svc+0x38/0x68 arch/arm64/kernel/entry-common.c:712
el0t_64_sync_handler+0x90/0xfc arch/arm64/kernel/entry-common.c:730
el0t_64_sync+0x190/0x194 arch/arm64/kernel/entry.S:598
The splat is complaining that KVM is advancing PC while an exception is
pending, i.e. that KVM is retiring the MMIO instruction despite a
pending synchronous external abort. Womp womp.
Fix the glaring UAPI bug by skipping over all the MMIO emulation in
case there is a pending synchronous exception. Note that while userspace
is capable of pending an asynchronous exception (SError, IRQ, or FIQ),
it is still safe to retire the MMIO instruction in this case as (1) they
are by definition asynchronous, and (2) KVM relies on hardware support
for pending/delivering these exceptions instead of the software state
machine for advancing PC.
Cc: stable(a)vger.kernel.org
Fixes: da345174ceca ("KVM: arm/arm64: Allow user injection of external data aborts")
Reported-by: Alexander Potapenko <glider(a)google.com>
Reviewed-by: Marc Zyngier <maz(a)kernel.org>
Link: https://lore.kernel.org/r/20241025203106.3529261-2-oliver.upton@linux.dev
Signed-off-by: Oliver Upton <oliver.upton(a)linux.dev>
diff --git a/arch/arm64/kvm/mmio.c b/arch/arm64/kvm/mmio.c
index cd6b7b83e2c3..ab365e839874 100644
--- a/arch/arm64/kvm/mmio.c
+++ b/arch/arm64/kvm/mmio.c
@@ -72,6 +72,31 @@ unsigned long kvm_mmio_read_buf(const void *buf, unsigned int len)
return data;
}
+static bool kvm_pending_sync_exception(struct kvm_vcpu *vcpu)
+{
+ if (!vcpu_get_flag(vcpu, PENDING_EXCEPTION))
+ return false;
+
+ if (vcpu_el1_is_32bit(vcpu)) {
+ switch (vcpu_get_flag(vcpu, EXCEPT_MASK)) {
+ case unpack_vcpu_flag(EXCEPT_AA32_UND):
+ case unpack_vcpu_flag(EXCEPT_AA32_IABT):
+ case unpack_vcpu_flag(EXCEPT_AA32_DABT):
+ return true;
+ default:
+ return false;
+ }
+ } else {
+ switch (vcpu_get_flag(vcpu, EXCEPT_MASK)) {
+ case unpack_vcpu_flag(EXCEPT_AA64_EL1_SYNC):
+ case unpack_vcpu_flag(EXCEPT_AA64_EL2_SYNC):
+ return true;
+ default:
+ return false;
+ }
+ }
+}
+
/**
* kvm_handle_mmio_return -- Handle MMIO loads after user space emulation
* or in-kernel IO emulation
@@ -84,8 +109,11 @@ int kvm_handle_mmio_return(struct kvm_vcpu *vcpu)
unsigned int len;
int mask;
- /* Detect an already handled MMIO return */
- if (unlikely(!vcpu->mmio_needed))
+ /*
+ * Detect if the MMIO return was already handled or if userspace aborted
+ * the MMIO access.
+ */
+ if (unlikely(!vcpu->mmio_needed || kvm_pending_sync_exception(vcpu)))
return 1;
vcpu->mmio_needed = 0;
The patch below does not apply to the 6.1-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.1.y
git checkout FETCH_HEAD
git cherry-pick -x e735a5da64420a86be370b216c269b5dd8e830e2
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024120207-banker-blast-f652@gregkh' --subject-prefix 'PATCH 6.1.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From e735a5da64420a86be370b216c269b5dd8e830e2 Mon Sep 17 00:00:00 2001
From: Oliver Upton <oliver.upton(a)linux.dev>
Date: Fri, 25 Oct 2024 20:31:03 +0000
Subject: [PATCH] KVM: arm64: Don't retire aborted MMIO instruction
Returning an abort to the guest for an unsupported MMIO access is a
documented feature of the KVM UAPI. Nevertheless, it's clear that this
plumbing has seen limited testing, since userspace can trivially cause a
WARN in the MMIO return:
WARNING: CPU: 0 PID: 30558 at arch/arm64/include/asm/kvm_emulate.h:536 kvm_handle_mmio_return+0x46c/0x5c4 arch/arm64/include/asm/kvm_emulate.h:536
Call trace:
kvm_handle_mmio_return+0x46c/0x5c4 arch/arm64/include/asm/kvm_emulate.h:536
kvm_arch_vcpu_ioctl_run+0x98/0x15b4 arch/arm64/kvm/arm.c:1133
kvm_vcpu_ioctl+0x75c/0xa78 virt/kvm/kvm_main.c:4487
__do_sys_ioctl fs/ioctl.c:51 [inline]
__se_sys_ioctl fs/ioctl.c:893 [inline]
__arm64_sys_ioctl+0x14c/0x1c8 fs/ioctl.c:893
__invoke_syscall arch/arm64/kernel/syscall.c:35 [inline]
invoke_syscall+0x98/0x2b8 arch/arm64/kernel/syscall.c:49
el0_svc_common+0x1e0/0x23c arch/arm64/kernel/syscall.c:132
do_el0_svc+0x48/0x58 arch/arm64/kernel/syscall.c:151
el0_svc+0x38/0x68 arch/arm64/kernel/entry-common.c:712
el0t_64_sync_handler+0x90/0xfc arch/arm64/kernel/entry-common.c:730
el0t_64_sync+0x190/0x194 arch/arm64/kernel/entry.S:598
The splat is complaining that KVM is advancing PC while an exception is
pending, i.e. that KVM is retiring the MMIO instruction despite a
pending synchronous external abort. Womp womp.
Fix the glaring UAPI bug by skipping over all the MMIO emulation in
case there is a pending synchronous exception. Note that while userspace
is capable of pending an asynchronous exception (SError, IRQ, or FIQ),
it is still safe to retire the MMIO instruction in this case as (1) they
are by definition asynchronous, and (2) KVM relies on hardware support
for pending/delivering these exceptions instead of the software state
machine for advancing PC.
Cc: stable(a)vger.kernel.org
Fixes: da345174ceca ("KVM: arm/arm64: Allow user injection of external data aborts")
Reported-by: Alexander Potapenko <glider(a)google.com>
Reviewed-by: Marc Zyngier <maz(a)kernel.org>
Link: https://lore.kernel.org/r/20241025203106.3529261-2-oliver.upton@linux.dev
Signed-off-by: Oliver Upton <oliver.upton(a)linux.dev>
diff --git a/arch/arm64/kvm/mmio.c b/arch/arm64/kvm/mmio.c
index cd6b7b83e2c3..ab365e839874 100644
--- a/arch/arm64/kvm/mmio.c
+++ b/arch/arm64/kvm/mmio.c
@@ -72,6 +72,31 @@ unsigned long kvm_mmio_read_buf(const void *buf, unsigned int len)
return data;
}
+static bool kvm_pending_sync_exception(struct kvm_vcpu *vcpu)
+{
+ if (!vcpu_get_flag(vcpu, PENDING_EXCEPTION))
+ return false;
+
+ if (vcpu_el1_is_32bit(vcpu)) {
+ switch (vcpu_get_flag(vcpu, EXCEPT_MASK)) {
+ case unpack_vcpu_flag(EXCEPT_AA32_UND):
+ case unpack_vcpu_flag(EXCEPT_AA32_IABT):
+ case unpack_vcpu_flag(EXCEPT_AA32_DABT):
+ return true;
+ default:
+ return false;
+ }
+ } else {
+ switch (vcpu_get_flag(vcpu, EXCEPT_MASK)) {
+ case unpack_vcpu_flag(EXCEPT_AA64_EL1_SYNC):
+ case unpack_vcpu_flag(EXCEPT_AA64_EL2_SYNC):
+ return true;
+ default:
+ return false;
+ }
+ }
+}
+
/**
* kvm_handle_mmio_return -- Handle MMIO loads after user space emulation
* or in-kernel IO emulation
@@ -84,8 +109,11 @@ int kvm_handle_mmio_return(struct kvm_vcpu *vcpu)
unsigned int len;
int mask;
- /* Detect an already handled MMIO return */
- if (unlikely(!vcpu->mmio_needed))
+ /*
+ * Detect if the MMIO return was already handled or if userspace aborted
+ * the MMIO access.
+ */
+ if (unlikely(!vcpu->mmio_needed || kvm_pending_sync_exception(vcpu)))
return 1;
vcpu->mmio_needed = 0;
The patch below does not apply to the 6.6-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.6.y
git checkout FETCH_HEAD
git cherry-pick -x e735a5da64420a86be370b216c269b5dd8e830e2
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024120206-conflict-ambiguity-0106@gregkh' --subject-prefix 'PATCH 6.6.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From e735a5da64420a86be370b216c269b5dd8e830e2 Mon Sep 17 00:00:00 2001
From: Oliver Upton <oliver.upton(a)linux.dev>
Date: Fri, 25 Oct 2024 20:31:03 +0000
Subject: [PATCH] KVM: arm64: Don't retire aborted MMIO instruction
Returning an abort to the guest for an unsupported MMIO access is a
documented feature of the KVM UAPI. Nevertheless, it's clear that this
plumbing has seen limited testing, since userspace can trivially cause a
WARN in the MMIO return:
WARNING: CPU: 0 PID: 30558 at arch/arm64/include/asm/kvm_emulate.h:536 kvm_handle_mmio_return+0x46c/0x5c4 arch/arm64/include/asm/kvm_emulate.h:536
Call trace:
kvm_handle_mmio_return+0x46c/0x5c4 arch/arm64/include/asm/kvm_emulate.h:536
kvm_arch_vcpu_ioctl_run+0x98/0x15b4 arch/arm64/kvm/arm.c:1133
kvm_vcpu_ioctl+0x75c/0xa78 virt/kvm/kvm_main.c:4487
__do_sys_ioctl fs/ioctl.c:51 [inline]
__se_sys_ioctl fs/ioctl.c:893 [inline]
__arm64_sys_ioctl+0x14c/0x1c8 fs/ioctl.c:893
__invoke_syscall arch/arm64/kernel/syscall.c:35 [inline]
invoke_syscall+0x98/0x2b8 arch/arm64/kernel/syscall.c:49
el0_svc_common+0x1e0/0x23c arch/arm64/kernel/syscall.c:132
do_el0_svc+0x48/0x58 arch/arm64/kernel/syscall.c:151
el0_svc+0x38/0x68 arch/arm64/kernel/entry-common.c:712
el0t_64_sync_handler+0x90/0xfc arch/arm64/kernel/entry-common.c:730
el0t_64_sync+0x190/0x194 arch/arm64/kernel/entry.S:598
The splat is complaining that KVM is advancing PC while an exception is
pending, i.e. that KVM is retiring the MMIO instruction despite a
pending synchronous external abort. Womp womp.
Fix the glaring UAPI bug by skipping over all the MMIO emulation in
case there is a pending synchronous exception. Note that while userspace
is capable of pending an asynchronous exception (SError, IRQ, or FIQ),
it is still safe to retire the MMIO instruction in this case as (1) they
are by definition asynchronous, and (2) KVM relies on hardware support
for pending/delivering these exceptions instead of the software state
machine for advancing PC.
Cc: stable(a)vger.kernel.org
Fixes: da345174ceca ("KVM: arm/arm64: Allow user injection of external data aborts")
Reported-by: Alexander Potapenko <glider(a)google.com>
Reviewed-by: Marc Zyngier <maz(a)kernel.org>
Link: https://lore.kernel.org/r/20241025203106.3529261-2-oliver.upton@linux.dev
Signed-off-by: Oliver Upton <oliver.upton(a)linux.dev>
diff --git a/arch/arm64/kvm/mmio.c b/arch/arm64/kvm/mmio.c
index cd6b7b83e2c3..ab365e839874 100644
--- a/arch/arm64/kvm/mmio.c
+++ b/arch/arm64/kvm/mmio.c
@@ -72,6 +72,31 @@ unsigned long kvm_mmio_read_buf(const void *buf, unsigned int len)
return data;
}
+static bool kvm_pending_sync_exception(struct kvm_vcpu *vcpu)
+{
+ if (!vcpu_get_flag(vcpu, PENDING_EXCEPTION))
+ return false;
+
+ if (vcpu_el1_is_32bit(vcpu)) {
+ switch (vcpu_get_flag(vcpu, EXCEPT_MASK)) {
+ case unpack_vcpu_flag(EXCEPT_AA32_UND):
+ case unpack_vcpu_flag(EXCEPT_AA32_IABT):
+ case unpack_vcpu_flag(EXCEPT_AA32_DABT):
+ return true;
+ default:
+ return false;
+ }
+ } else {
+ switch (vcpu_get_flag(vcpu, EXCEPT_MASK)) {
+ case unpack_vcpu_flag(EXCEPT_AA64_EL1_SYNC):
+ case unpack_vcpu_flag(EXCEPT_AA64_EL2_SYNC):
+ return true;
+ default:
+ return false;
+ }
+ }
+}
+
/**
* kvm_handle_mmio_return -- Handle MMIO loads after user space emulation
* or in-kernel IO emulation
@@ -84,8 +109,11 @@ int kvm_handle_mmio_return(struct kvm_vcpu *vcpu)
unsigned int len;
int mask;
- /* Detect an already handled MMIO return */
- if (unlikely(!vcpu->mmio_needed))
+ /*
+ * Detect if the MMIO return was already handled or if userspace aborted
+ * the MMIO access.
+ */
+ if (unlikely(!vcpu->mmio_needed || kvm_pending_sync_exception(vcpu)))
return 1;
vcpu->mmio_needed = 0;
The patch below does not apply to the 6.1-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.1.y
git checkout FETCH_HEAD
git cherry-pick -x 85434c3c73fcad58870016ddfe5eaa5036672675
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024120245-frighten-linseed-1821@gregkh' --subject-prefix 'PATCH 6.1.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 85434c3c73fcad58870016ddfe5eaa5036672675 Mon Sep 17 00:00:00 2001
From: Sean Christopherson <seanjc(a)google.com>
Date: Mon, 18 Nov 2024 17:14:33 -0800
Subject: [PATCH] Revert "KVM: VMX: Move LOAD_IA32_PERF_GLOBAL_CTRL errata
handling out of setup_vmcs_config()"
Revert back to clearing VM_{ENTRY,EXIT}_LOAD_IA32_PERF_GLOBAL_CTRL in KVM's
golden VMCS config, as applying the workaround during vCPU creation is
pointless and broken. KVM *unconditionally* clears the controls in the
values returned by vmx_vmentry_ctrl() and vmx_vmexit_ctrl(), as KVM loads
PERF_GLOBAL_CTRL if and only if its necessary to do so. E.g. if KVM wants
to run the guest with the same PERF_GLOBAL_CTRL as the host, then there's
no need to re-load the MSR on entry and exit.
Even worse, the buggy commit failed to apply the erratum where it's
actually needed, add_atomic_switch_msr(). As a result, KVM completely
ignores the erratum for all intents and purposes, i.e. uses the flawed
VMCS controls to load PERF_GLOBAL_CTRL.
To top things off, the patch was intended to be dropped, as the premise
of an L1 VMM being able to pivot on FMS is flawed, and KVM can (and now
does) fully emulate the controls in software. Simply revert the commit,
as all upstream supported kernels that have the buggy commit should also
have commit f4c93d1a0e71 ("KVM: nVMX: Always emulate PERF_GLOBAL_CTRL
VM-Entry/VM-Exit controls"), i.e. the (likely theoretical) live migration
concern is a complete non-issue.
Opportunistically drop the manual "kvm: " scope from the warning about
the erratum, as KVM now uses pr_fmt() to provide the correct scope (v6.1
kernels and earlier don't, but the erratum only applies to CPUs that are
15+ years old; it's not worth a separate patch).
This reverts commit 9d78d6fb186bc4aff41b5d6c4726b76649d3cb53.
Link: https://lore.kernel.org/all/YtnZmCutdd5tpUmz@google.com
Fixes: 9d78d6fb186b ("KVM: VMX: Move LOAD_IA32_PERF_GLOBAL_CTRL errata handling out of setup_vmcs_config()")
Cc: stable(a)vger.kernel.org
Cc: Vitaly Kuznetsov <vkuznets(a)redhat.com>
Cc: Maxim Levitsky <mlevitsk(a)redhat.com>
Signed-off-by: Sean Christopherson <seanjc(a)google.com>
Reviewed-by: Vitaly Kuznetsov <vkuznets(a)redhat.com>
Message-ID: <20241119011433.1797921-1-seanjc(a)google.com>
Signed-off-by: Paolo Bonzini <pbonzini(a)redhat.com>
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 0f008f5ef6f0..893366e53732 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -2548,28 +2548,6 @@ static bool cpu_has_sgx(void)
return cpuid_eax(0) >= 0x12 && (cpuid_eax(0x12) & BIT(0));
}
-/*
- * Some cpus support VM_{ENTRY,EXIT}_IA32_PERF_GLOBAL_CTRL but they
- * can't be used due to errata where VM Exit may incorrectly clear
- * IA32_PERF_GLOBAL_CTRL[34:32]. Work around the errata by using the
- * MSR load mechanism to switch IA32_PERF_GLOBAL_CTRL.
- */
-static bool cpu_has_perf_global_ctrl_bug(void)
-{
- switch (boot_cpu_data.x86_vfm) {
- case INTEL_NEHALEM_EP: /* AAK155 */
- case INTEL_NEHALEM: /* AAP115 */
- case INTEL_WESTMERE: /* AAT100 */
- case INTEL_WESTMERE_EP: /* BC86,AAY89,BD102 */
- case INTEL_NEHALEM_EX: /* BA97 */
- return true;
- default:
- break;
- }
-
- return false;
-}
-
static int adjust_vmx_controls(u32 ctl_min, u32 ctl_opt, u32 msr, u32 *result)
{
u32 vmx_msr_low, vmx_msr_high;
@@ -2729,6 +2707,27 @@ static int setup_vmcs_config(struct vmcs_config *vmcs_conf,
_vmexit_control &= ~x_ctrl;
}
+ /*
+ * Some cpus support VM_{ENTRY,EXIT}_IA32_PERF_GLOBAL_CTRL but they
+ * can't be used due to an errata where VM Exit may incorrectly clear
+ * IA32_PERF_GLOBAL_CTRL[34:32]. Workaround the errata by using the
+ * MSR load mechanism to switch IA32_PERF_GLOBAL_CTRL.
+ */
+ switch (boot_cpu_data.x86_vfm) {
+ case INTEL_NEHALEM_EP: /* AAK155 */
+ case INTEL_NEHALEM: /* AAP115 */
+ case INTEL_WESTMERE: /* AAT100 */
+ case INTEL_WESTMERE_EP: /* BC86,AAY89,BD102 */
+ case INTEL_NEHALEM_EX: /* BA97 */
+ _vmentry_control &= ~VM_ENTRY_LOAD_IA32_PERF_GLOBAL_CTRL;
+ _vmexit_control &= ~VM_EXIT_LOAD_IA32_PERF_GLOBAL_CTRL;
+ pr_warn_once("VM_EXIT_LOAD_IA32_PERF_GLOBAL_CTRL "
+ "does not work properly. Using workaround\n");
+ break;
+ default:
+ break;
+ }
+
rdmsrl(MSR_IA32_VMX_BASIC, basic_msr);
/* IA-32 SDM Vol 3B: VMCS size is never greater than 4kB. */
@@ -4432,9 +4431,6 @@ static u32 vmx_vmentry_ctrl(void)
VM_ENTRY_LOAD_IA32_EFER |
VM_ENTRY_IA32E_MODE);
- if (cpu_has_perf_global_ctrl_bug())
- vmentry_ctrl &= ~VM_ENTRY_LOAD_IA32_PERF_GLOBAL_CTRL;
-
return vmentry_ctrl;
}
@@ -4452,10 +4448,6 @@ static u32 vmx_vmexit_ctrl(void)
if (vmx_pt_mode_is_system())
vmexit_ctrl &= ~(VM_EXIT_PT_CONCEAL_PIP |
VM_EXIT_CLEAR_IA32_RTIT_CTL);
-
- if (cpu_has_perf_global_ctrl_bug())
- vmexit_ctrl &= ~VM_EXIT_LOAD_IA32_PERF_GLOBAL_CTRL;
-
/* Loading of EFER and PERF_GLOBAL_CTRL are toggled dynamically */
return vmexit_ctrl &
~(VM_EXIT_LOAD_IA32_PERF_GLOBAL_CTRL | VM_EXIT_LOAD_IA32_EFER);
@@ -8415,10 +8407,6 @@ __init int vmx_hardware_setup(void)
if (setup_vmcs_config(&vmcs_config, &vmx_capability) < 0)
return -EIO;
- if (cpu_has_perf_global_ctrl_bug())
- pr_warn_once("VM_EXIT_LOAD_IA32_PERF_GLOBAL_CTRL "
- "does not work properly. Using workaround\n");
-
if (boot_cpu_has(X86_FEATURE_NX))
kvm_enable_efer_bits(EFER_NX);
The patch below does not apply to the 6.6-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.6.y
git checkout FETCH_HEAD
git cherry-pick -x 85434c3c73fcad58870016ddfe5eaa5036672675
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024120244-luxury-rebirth-25fd@gregkh' --subject-prefix 'PATCH 6.6.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 85434c3c73fcad58870016ddfe5eaa5036672675 Mon Sep 17 00:00:00 2001
From: Sean Christopherson <seanjc(a)google.com>
Date: Mon, 18 Nov 2024 17:14:33 -0800
Subject: [PATCH] Revert "KVM: VMX: Move LOAD_IA32_PERF_GLOBAL_CTRL errata
handling out of setup_vmcs_config()"
Revert back to clearing VM_{ENTRY,EXIT}_LOAD_IA32_PERF_GLOBAL_CTRL in KVM's
golden VMCS config, as applying the workaround during vCPU creation is
pointless and broken. KVM *unconditionally* clears the controls in the
values returned by vmx_vmentry_ctrl() and vmx_vmexit_ctrl(), as KVM loads
PERF_GLOBAL_CTRL if and only if its necessary to do so. E.g. if KVM wants
to run the guest with the same PERF_GLOBAL_CTRL as the host, then there's
no need to re-load the MSR on entry and exit.
Even worse, the buggy commit failed to apply the erratum where it's
actually needed, add_atomic_switch_msr(). As a result, KVM completely
ignores the erratum for all intents and purposes, i.e. uses the flawed
VMCS controls to load PERF_GLOBAL_CTRL.
To top things off, the patch was intended to be dropped, as the premise
of an L1 VMM being able to pivot on FMS is flawed, and KVM can (and now
does) fully emulate the controls in software. Simply revert the commit,
as all upstream supported kernels that have the buggy commit should also
have commit f4c93d1a0e71 ("KVM: nVMX: Always emulate PERF_GLOBAL_CTRL
VM-Entry/VM-Exit controls"), i.e. the (likely theoretical) live migration
concern is a complete non-issue.
Opportunistically drop the manual "kvm: " scope from the warning about
the erratum, as KVM now uses pr_fmt() to provide the correct scope (v6.1
kernels and earlier don't, but the erratum only applies to CPUs that are
15+ years old; it's not worth a separate patch).
This reverts commit 9d78d6fb186bc4aff41b5d6c4726b76649d3cb53.
Link: https://lore.kernel.org/all/YtnZmCutdd5tpUmz@google.com
Fixes: 9d78d6fb186b ("KVM: VMX: Move LOAD_IA32_PERF_GLOBAL_CTRL errata handling out of setup_vmcs_config()")
Cc: stable(a)vger.kernel.org
Cc: Vitaly Kuznetsov <vkuznets(a)redhat.com>
Cc: Maxim Levitsky <mlevitsk(a)redhat.com>
Signed-off-by: Sean Christopherson <seanjc(a)google.com>
Reviewed-by: Vitaly Kuznetsov <vkuznets(a)redhat.com>
Message-ID: <20241119011433.1797921-1-seanjc(a)google.com>
Signed-off-by: Paolo Bonzini <pbonzini(a)redhat.com>
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 0f008f5ef6f0..893366e53732 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -2548,28 +2548,6 @@ static bool cpu_has_sgx(void)
return cpuid_eax(0) >= 0x12 && (cpuid_eax(0x12) & BIT(0));
}
-/*
- * Some cpus support VM_{ENTRY,EXIT}_IA32_PERF_GLOBAL_CTRL but they
- * can't be used due to errata where VM Exit may incorrectly clear
- * IA32_PERF_GLOBAL_CTRL[34:32]. Work around the errata by using the
- * MSR load mechanism to switch IA32_PERF_GLOBAL_CTRL.
- */
-static bool cpu_has_perf_global_ctrl_bug(void)
-{
- switch (boot_cpu_data.x86_vfm) {
- case INTEL_NEHALEM_EP: /* AAK155 */
- case INTEL_NEHALEM: /* AAP115 */
- case INTEL_WESTMERE: /* AAT100 */
- case INTEL_WESTMERE_EP: /* BC86,AAY89,BD102 */
- case INTEL_NEHALEM_EX: /* BA97 */
- return true;
- default:
- break;
- }
-
- return false;
-}
-
static int adjust_vmx_controls(u32 ctl_min, u32 ctl_opt, u32 msr, u32 *result)
{
u32 vmx_msr_low, vmx_msr_high;
@@ -2729,6 +2707,27 @@ static int setup_vmcs_config(struct vmcs_config *vmcs_conf,
_vmexit_control &= ~x_ctrl;
}
+ /*
+ * Some cpus support VM_{ENTRY,EXIT}_IA32_PERF_GLOBAL_CTRL but they
+ * can't be used due to an errata where VM Exit may incorrectly clear
+ * IA32_PERF_GLOBAL_CTRL[34:32]. Workaround the errata by using the
+ * MSR load mechanism to switch IA32_PERF_GLOBAL_CTRL.
+ */
+ switch (boot_cpu_data.x86_vfm) {
+ case INTEL_NEHALEM_EP: /* AAK155 */
+ case INTEL_NEHALEM: /* AAP115 */
+ case INTEL_WESTMERE: /* AAT100 */
+ case INTEL_WESTMERE_EP: /* BC86,AAY89,BD102 */
+ case INTEL_NEHALEM_EX: /* BA97 */
+ _vmentry_control &= ~VM_ENTRY_LOAD_IA32_PERF_GLOBAL_CTRL;
+ _vmexit_control &= ~VM_EXIT_LOAD_IA32_PERF_GLOBAL_CTRL;
+ pr_warn_once("VM_EXIT_LOAD_IA32_PERF_GLOBAL_CTRL "
+ "does not work properly. Using workaround\n");
+ break;
+ default:
+ break;
+ }
+
rdmsrl(MSR_IA32_VMX_BASIC, basic_msr);
/* IA-32 SDM Vol 3B: VMCS size is never greater than 4kB. */
@@ -4432,9 +4431,6 @@ static u32 vmx_vmentry_ctrl(void)
VM_ENTRY_LOAD_IA32_EFER |
VM_ENTRY_IA32E_MODE);
- if (cpu_has_perf_global_ctrl_bug())
- vmentry_ctrl &= ~VM_ENTRY_LOAD_IA32_PERF_GLOBAL_CTRL;
-
return vmentry_ctrl;
}
@@ -4452,10 +4448,6 @@ static u32 vmx_vmexit_ctrl(void)
if (vmx_pt_mode_is_system())
vmexit_ctrl &= ~(VM_EXIT_PT_CONCEAL_PIP |
VM_EXIT_CLEAR_IA32_RTIT_CTL);
-
- if (cpu_has_perf_global_ctrl_bug())
- vmexit_ctrl &= ~VM_EXIT_LOAD_IA32_PERF_GLOBAL_CTRL;
-
/* Loading of EFER and PERF_GLOBAL_CTRL are toggled dynamically */
return vmexit_ctrl &
~(VM_EXIT_LOAD_IA32_PERF_GLOBAL_CTRL | VM_EXIT_LOAD_IA32_EFER);
@@ -8415,10 +8407,6 @@ __init int vmx_hardware_setup(void)
if (setup_vmcs_config(&vmcs_config, &vmx_capability) < 0)
return -EIO;
- if (cpu_has_perf_global_ctrl_bug())
- pr_warn_once("VM_EXIT_LOAD_IA32_PERF_GLOBAL_CTRL "
- "does not work properly. Using workaround\n");
-
if (boot_cpu_has(X86_FEATURE_NX))
kvm_enable_efer_bits(EFER_NX);
The patch below does not apply to the 6.11-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.11.y
git checkout FETCH_HEAD
git cherry-pick -x 85434c3c73fcad58870016ddfe5eaa5036672675
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024120243-chimp-penknife-28f5@gregkh' --subject-prefix 'PATCH 6.11.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 85434c3c73fcad58870016ddfe5eaa5036672675 Mon Sep 17 00:00:00 2001
From: Sean Christopherson <seanjc(a)google.com>
Date: Mon, 18 Nov 2024 17:14:33 -0800
Subject: [PATCH] Revert "KVM: VMX: Move LOAD_IA32_PERF_GLOBAL_CTRL errata
handling out of setup_vmcs_config()"
Revert back to clearing VM_{ENTRY,EXIT}_LOAD_IA32_PERF_GLOBAL_CTRL in KVM's
golden VMCS config, as applying the workaround during vCPU creation is
pointless and broken. KVM *unconditionally* clears the controls in the
values returned by vmx_vmentry_ctrl() and vmx_vmexit_ctrl(), as KVM loads
PERF_GLOBAL_CTRL if and only if its necessary to do so. E.g. if KVM wants
to run the guest with the same PERF_GLOBAL_CTRL as the host, then there's
no need to re-load the MSR on entry and exit.
Even worse, the buggy commit failed to apply the erratum where it's
actually needed, add_atomic_switch_msr(). As a result, KVM completely
ignores the erratum for all intents and purposes, i.e. uses the flawed
VMCS controls to load PERF_GLOBAL_CTRL.
To top things off, the patch was intended to be dropped, as the premise
of an L1 VMM being able to pivot on FMS is flawed, and KVM can (and now
does) fully emulate the controls in software. Simply revert the commit,
as all upstream supported kernels that have the buggy commit should also
have commit f4c93d1a0e71 ("KVM: nVMX: Always emulate PERF_GLOBAL_CTRL
VM-Entry/VM-Exit controls"), i.e. the (likely theoretical) live migration
concern is a complete non-issue.
Opportunistically drop the manual "kvm: " scope from the warning about
the erratum, as KVM now uses pr_fmt() to provide the correct scope (v6.1
kernels and earlier don't, but the erratum only applies to CPUs that are
15+ years old; it's not worth a separate patch).
This reverts commit 9d78d6fb186bc4aff41b5d6c4726b76649d3cb53.
Link: https://lore.kernel.org/all/YtnZmCutdd5tpUmz@google.com
Fixes: 9d78d6fb186b ("KVM: VMX: Move LOAD_IA32_PERF_GLOBAL_CTRL errata handling out of setup_vmcs_config()")
Cc: stable(a)vger.kernel.org
Cc: Vitaly Kuznetsov <vkuznets(a)redhat.com>
Cc: Maxim Levitsky <mlevitsk(a)redhat.com>
Signed-off-by: Sean Christopherson <seanjc(a)google.com>
Reviewed-by: Vitaly Kuznetsov <vkuznets(a)redhat.com>
Message-ID: <20241119011433.1797921-1-seanjc(a)google.com>
Signed-off-by: Paolo Bonzini <pbonzini(a)redhat.com>
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 0f008f5ef6f0..893366e53732 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -2548,28 +2548,6 @@ static bool cpu_has_sgx(void)
return cpuid_eax(0) >= 0x12 && (cpuid_eax(0x12) & BIT(0));
}
-/*
- * Some cpus support VM_{ENTRY,EXIT}_IA32_PERF_GLOBAL_CTRL but they
- * can't be used due to errata where VM Exit may incorrectly clear
- * IA32_PERF_GLOBAL_CTRL[34:32]. Work around the errata by using the
- * MSR load mechanism to switch IA32_PERF_GLOBAL_CTRL.
- */
-static bool cpu_has_perf_global_ctrl_bug(void)
-{
- switch (boot_cpu_data.x86_vfm) {
- case INTEL_NEHALEM_EP: /* AAK155 */
- case INTEL_NEHALEM: /* AAP115 */
- case INTEL_WESTMERE: /* AAT100 */
- case INTEL_WESTMERE_EP: /* BC86,AAY89,BD102 */
- case INTEL_NEHALEM_EX: /* BA97 */
- return true;
- default:
- break;
- }
-
- return false;
-}
-
static int adjust_vmx_controls(u32 ctl_min, u32 ctl_opt, u32 msr, u32 *result)
{
u32 vmx_msr_low, vmx_msr_high;
@@ -2729,6 +2707,27 @@ static int setup_vmcs_config(struct vmcs_config *vmcs_conf,
_vmexit_control &= ~x_ctrl;
}
+ /*
+ * Some cpus support VM_{ENTRY,EXIT}_IA32_PERF_GLOBAL_CTRL but they
+ * can't be used due to an errata where VM Exit may incorrectly clear
+ * IA32_PERF_GLOBAL_CTRL[34:32]. Workaround the errata by using the
+ * MSR load mechanism to switch IA32_PERF_GLOBAL_CTRL.
+ */
+ switch (boot_cpu_data.x86_vfm) {
+ case INTEL_NEHALEM_EP: /* AAK155 */
+ case INTEL_NEHALEM: /* AAP115 */
+ case INTEL_WESTMERE: /* AAT100 */
+ case INTEL_WESTMERE_EP: /* BC86,AAY89,BD102 */
+ case INTEL_NEHALEM_EX: /* BA97 */
+ _vmentry_control &= ~VM_ENTRY_LOAD_IA32_PERF_GLOBAL_CTRL;
+ _vmexit_control &= ~VM_EXIT_LOAD_IA32_PERF_GLOBAL_CTRL;
+ pr_warn_once("VM_EXIT_LOAD_IA32_PERF_GLOBAL_CTRL "
+ "does not work properly. Using workaround\n");
+ break;
+ default:
+ break;
+ }
+
rdmsrl(MSR_IA32_VMX_BASIC, basic_msr);
/* IA-32 SDM Vol 3B: VMCS size is never greater than 4kB. */
@@ -4432,9 +4431,6 @@ static u32 vmx_vmentry_ctrl(void)
VM_ENTRY_LOAD_IA32_EFER |
VM_ENTRY_IA32E_MODE);
- if (cpu_has_perf_global_ctrl_bug())
- vmentry_ctrl &= ~VM_ENTRY_LOAD_IA32_PERF_GLOBAL_CTRL;
-
return vmentry_ctrl;
}
@@ -4452,10 +4448,6 @@ static u32 vmx_vmexit_ctrl(void)
if (vmx_pt_mode_is_system())
vmexit_ctrl &= ~(VM_EXIT_PT_CONCEAL_PIP |
VM_EXIT_CLEAR_IA32_RTIT_CTL);
-
- if (cpu_has_perf_global_ctrl_bug())
- vmexit_ctrl &= ~VM_EXIT_LOAD_IA32_PERF_GLOBAL_CTRL;
-
/* Loading of EFER and PERF_GLOBAL_CTRL are toggled dynamically */
return vmexit_ctrl &
~(VM_EXIT_LOAD_IA32_PERF_GLOBAL_CTRL | VM_EXIT_LOAD_IA32_EFER);
@@ -8415,10 +8407,6 @@ __init int vmx_hardware_setup(void)
if (setup_vmcs_config(&vmcs_config, &vmx_capability) < 0)
return -EIO;
- if (cpu_has_perf_global_ctrl_bug())
- pr_warn_once("VM_EXIT_LOAD_IA32_PERF_GLOBAL_CTRL "
- "does not work properly. Using workaround\n");
-
if (boot_cpu_has(X86_FEATURE_NX))
kvm_enable_efer_bits(EFER_NX);
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.4.y
git checkout FETCH_HEAD
git cherry-pick -x 5b590160d2cf776b304eb054afafea2bd55e3620
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024120220-comply-kissing-b2b3@gregkh' --subject-prefix 'PATCH 5.4.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 5b590160d2cf776b304eb054afafea2bd55e3620 Mon Sep 17 00:00:00 2001
From: Adrian Hunter <adrian.hunter(a)intel.com>
Date: Tue, 22 Oct 2024 18:59:07 +0300
Subject: [PATCH] perf/x86/intel/pt: Fix buffer full but size is 0 case
If the trace data buffer becomes full, a truncated flag [T] is reported
in PERF_RECORD_AUX. In some cases, the size reported is 0, even though
data must have been added to make the buffer full.
That happens when the buffer fills up from empty to full before the
Intel PT driver has updated the buffer position. Then the driver
calculates the new buffer position before calculating the data size.
If the old and new positions are the same, the data size is reported
as 0, even though it is really the whole buffer size.
Fix by detecting when the buffer position is wrapped, and adjust the
data size calculation accordingly.
Example
Use a very small buffer size (8K) and observe the size of truncated [T]
data. Before the fix, it is possible to see records of 0 size.
Before:
$ perf record -m,8K -e intel_pt// uname
Linux
[ perf record: Woken up 2 times to write data ]
[ perf record: Captured and wrote 0.105 MB perf.data ]
$ perf script -D --no-itrace | grep AUX | grep -F '[T]'
Warning:
AUX data lost 2 times out of 3!
5 19462712368111 0x19710 [0x40]: PERF_RECORD_AUX offset: 0 size: 0 flags: 0x1 [T]
5 19462712700046 0x19ba8 [0x40]: PERF_RECORD_AUX offset: 0x170 size: 0xe90 flags: 0x1 [T]
After:
$ perf record -m,8K -e intel_pt// uname
Linux
[ perf record: Woken up 3 times to write data ]
[ perf record: Captured and wrote 0.040 MB perf.data ]
$ perf script -D --no-itrace | grep AUX | grep -F '[T]'
Warning:
AUX data lost 2 times out of 3!
1 113720802995 0x4948 [0x40]: PERF_RECORD_AUX offset: 0 size: 0x2000 flags: 0x1 [T]
1 113720979812 0x6b10 [0x40]: PERF_RECORD_AUX offset: 0x2000 size: 0x2000 flags: 0x1 [T]
Fixes: 52ca9ced3f70 ("perf/x86/intel/pt: Add Intel PT PMU driver")
Signed-off-by: Adrian Hunter <adrian.hunter(a)intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz(a)infradead.org>
Cc: stable(a)vger.kernel.org
Link: https://lkml.kernel.org/r/20241022155920.17511-2-adrian.hunter@intel.com
diff --git a/arch/x86/events/intel/pt.c b/arch/x86/events/intel/pt.c
index fd4670a6694e..a087bc0c5498 100644
--- a/arch/x86/events/intel/pt.c
+++ b/arch/x86/events/intel/pt.c
@@ -828,11 +828,13 @@ static void pt_buffer_advance(struct pt_buffer *buf)
buf->cur_idx++;
if (buf->cur_idx == buf->cur->last) {
- if (buf->cur == buf->last)
+ if (buf->cur == buf->last) {
buf->cur = buf->first;
- else
+ buf->wrapped = true;
+ } else {
buf->cur = list_entry(buf->cur->list.next, struct topa,
list);
+ }
buf->cur_idx = 0;
}
}
@@ -846,8 +848,11 @@ static void pt_buffer_advance(struct pt_buffer *buf)
static void pt_update_head(struct pt *pt)
{
struct pt_buffer *buf = perf_get_aux(&pt->handle);
+ bool wrapped = buf->wrapped;
u64 topa_idx, base, old;
+ buf->wrapped = false;
+
if (buf->single) {
local_set(&buf->data_size, buf->output_off);
return;
@@ -865,7 +870,7 @@ static void pt_update_head(struct pt *pt)
} else {
old = (local64_xchg(&buf->head, base) &
((buf->nr_pages << PAGE_SHIFT) - 1));
- if (base < old)
+ if (base < old || (base == old && wrapped))
base += buf->nr_pages << PAGE_SHIFT;
local_add(base - old, &buf->data_size);
diff --git a/arch/x86/events/intel/pt.h b/arch/x86/events/intel/pt.h
index f5e46c04c145..a1b6c04b7f68 100644
--- a/arch/x86/events/intel/pt.h
+++ b/arch/x86/events/intel/pt.h
@@ -65,6 +65,7 @@ struct pt_pmu {
* @head: logical write offset inside the buffer
* @snapshot: if this is for a snapshot/overwrite counter
* @single: use Single Range Output instead of ToPA
+ * @wrapped: buffer advance wrapped back to the first topa table
* @stop_pos: STOP topa entry index
* @intr_pos: INT topa entry index
* @stop_te: STOP topa entry pointer
@@ -82,6 +83,7 @@ struct pt_buffer {
local64_t head;
bool snapshot;
bool single;
+ bool wrapped;
long stop_pos, intr_pos;
struct topa_entry *stop_te, *intr_te;
void **data_pages;
From: Chuck Lever <chuck.lever(a)oracle.com>
Backport the upstream patch that disables background NFSv4.2 COPY
operations.
Unlike later LTS kernels, the patches that limit the number of
background COPY operations do not apply at all to v5.4. Because
there is no support for server-to-server COPY in v5.4, disabling
background COPY operations should not be noticeable.
Chuck Lever (1):
NFSD: Force all NFSv4.2 COPY requests to be synchronous
fs/nfsd/nfs4proc.c | 11 ++++++-----
1 file changed, 6 insertions(+), 5 deletions(-)
--
2.47.0
v1 -> v2:
- init_codec to always update with latest payload from firmware
(Dmitry/Bryan)
- Rewrite the logic of packet parsing to consider payload size for
different packet type (Bryan)
- Consider reading sfr data till available space (Dmitry)
- Add reviewed-by tags
v1:
https://lore.kernel.org/all/20241105-venus_oob-v1-0-8d4feedfe2bb@quicinc.co…
This series primarily adds check at relevant places in venus driver where there
are possible OOB accesses due to unexpected payload from venus firmware. The
patches describes the specific OOB possibility.
Please review and share your feedback.
Validated on sc7180(v4) and rb5(v6).
Stan, please help to extend the test on db410c(v1).
Signed-off-by: Vikash Garodia <quic_vgarodia(a)quicinc.com>
---
Vikash Garodia (4):
media: venus: hfi_parser: add check to avoid out of bound access
media: venus: hfi_parser: avoid OOB access beyond payload word count
media: venus: hfi: add check to handle incorrect queue size
media: venus: hfi: add a check to handle OOB in sfr region
drivers/media/platform/qcom/venus/hfi_parser.c | 58 +++++++++++++++++++++-----
drivers/media/platform/qcom/venus/hfi_venus.c | 15 ++++++-
2 files changed, 60 insertions(+), 13 deletions(-)
---
base-commit: c7ccf3683ac9746b263b0502255f5ce47f64fe0a
change-id: 20241115-venus_oob_2-21708239176a
Best regards,
--
Vikash Garodia <quic_vgarodia(a)quicinc.com>
5.4 doesn't have commit backported which introduces RCU for cgroup
root_list, but has a commit fixing a UAF (and thus a CVE) which depends
on it.
Thus, we need to backport the original commits. See thread:
https://lore.kernel.org/all/xr93ikus2nd1.fsf@gthelen-cloudtop.c.googlers.co…
This patch series backports the requisite commits to 5.4.y, which are
picked up from the above mentioned thread.
Thanks,
Siddh
Waiman Long (1):
cgroup: Move rcu_head up near the top of cgroup_root
Yafang Shao (1):
cgroup: Make operations on the cgroup root_list RCU safe
include/linux/cgroup-defs.h | 7 ++++---
kernel/cgroup/cgroup-internal.h | 3 ++-
kernel/cgroup/cgroup.c | 23 ++++++++++++++++-------
3 files changed, 22 insertions(+), 11 deletions(-)
--
2.45.2
From: Mikulas Patocka <mpatocka(a)redhat.com>
[ Upstream commit 7ae04ba36b381bffe2471eff3a93edced843240f]
ARCH_DMA_MINALIGN was defined as 16 - this is too small - it may be
possible that two unrelated 16-byte allocations share a cache line. If
one of these allocations is written using DMA and the other is written
using cached write, the value that was written with DMA may be
corrupted.
This commit changes ARCH_DMA_MINALIGN to be 128 on PA20 and 32 on PA1.1 -
that's the largest possible cache line size.
As different parisc microarchitectures have different cache line size, we
define arch_slab_minalign(), cache_line_size() and
dma_get_cache_alignment() so that the kernel may tune slab cache
parameters dynamically, based on the detected cache line size.
Signed-off-by: Mikulas Patocka <mpatocka(a)redhat.com>
Cc: stable(a)vger.kernel.org
Signed-off-by: Helge Deller <deller(a)gmx.de>
Signed-off-by: Mingli Yu <mingli.yu(a)windriver.com>
---
arch/parisc/Kconfig | 1 +
arch/parisc/include/asm/cache.h | 11 ++++++++++-
2 files changed, 11 insertions(+), 1 deletion(-)
diff --git a/arch/parisc/Kconfig b/arch/parisc/Kconfig
index 6ac0c4b98e28..9888e0b3f675 100644
--- a/arch/parisc/Kconfig
+++ b/arch/parisc/Kconfig
@@ -15,6 +15,7 @@ config PARISC
select ARCH_SPLIT_ARG64 if !64BIT
select ARCH_SUPPORTS_HUGETLBFS if PA20
select ARCH_SUPPORTS_MEMORY_FAILURE
+ select ARCH_HAS_CACHE_LINE_SIZE
select DMA_OPS
select RTC_CLASS
select RTC_DRV_GENERIC
diff --git a/arch/parisc/include/asm/cache.h b/arch/parisc/include/asm/cache.h
index d53e9e27dba0..99e26c686f7f 100644
--- a/arch/parisc/include/asm/cache.h
+++ b/arch/parisc/include/asm/cache.h
@@ -20,7 +20,16 @@
#define SMP_CACHE_BYTES L1_CACHE_BYTES
-#define ARCH_DMA_MINALIGN L1_CACHE_BYTES
+#ifdef CONFIG_PA20
+#define ARCH_DMA_MINALIGN 128
+#else
+#define ARCH_DMA_MINALIGN 32
+#endif
+#define ARCH_KMALLOC_MINALIGN 16 /* ldcw requires 16-byte alignment */
+
+#define arch_slab_minalign() ((unsigned)dcache_stride)
+#define cache_line_size() dcache_stride
+#define dma_get_cache_alignment cache_line_size
#define __read_mostly __section(".data..read_mostly")
--
2.34.1
This patch addresses CVE-2024-49935 [1], a vulnerability in ACPI
subsystem caused by calling cpumask_clear_cpu() with bit to
clear set to 0xffffffff, thus leading to erroneous memory
access. The issue is still present in 5.10.y kernel.
The original commit [2] has been backported to several stable
branches (5.15.y and fresher) and now has been cherry-picked for
5.10.y.
[1] https://nvd.nist.gov/vuln/detail/CVE-2024-49935
[2] https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id…
Seiji Nishikawa (1):
ACPI: PAD: fix crash in exit_round_robin()
drivers/acpi/acpi_pad.c | 7 +++++--
1 file changed, 5 insertions(+), 2 deletions(-)
--
2.25.1
On certain i.MX8 series parts [1], the PPS channel 0
is routed internally to eDMA, and the external PPS
pin is available on channel 1. In addition, on
certain boards, the PPS may be wired on the PCB to
an EVENTOUTn pin other than 0. On these systems
it is necessary that the PPS channel be able
to be configured from the Device Tree.
[1] https://lore.kernel.org/all/ZrPYOWA3FESx197L@lizhi-Precision-Tower-5810/
Francesco Dolcini (3):
dt-bindings: net: fec: add pps channel property
net: fec: refactor PPS channel configuration
net: fec: make PPS channel configurable
Documentation/devicetree/bindings/net/fsl,fec.yaml | 7 +++++++
drivers/net/ethernet/freescale/fec_ptp.c | 11 ++++++-----
2 files changed, 13 insertions(+), 5 deletions(-)
--
2.34.1
On certain i.MX8 series parts [1], the PPS channel 0
is routed internally to eDMA, and the external PPS
pin is available on channel 1. In addition, on
certain boards, the PPS may be wired on the PCB to
an EVENTOUTn pin other than 0. On these systems
it is necessary that the PPS channel be able
to be configured from the Device Tree.
[1] https://lore.kernel.org/all/ZrPYOWA3FESx197L@lizhi-Precision-Tower-5810/
Francesco Dolcini (3):
dt-bindings: net: fec: add pps channel property
net: fec: refactor PPS channel configuration
net: fec: make PPS channel configurable
Documentation/devicetree/bindings/net/fsl,fec.yaml | 7 +++++++
drivers/net/ethernet/freescale/fec_ptp.c | 11 ++++++-----
2 files changed, 13 insertions(+), 5 deletions(-)
--
2.34.1
The first patch completes a previous fix where one error path stayed as
a direct return after the child nodes were acquired, and the second,
completely trivial, updates the function name used in the comment to
indicate where the references are released.
Signed-off-by: Javier Carrasco <javier.carrasco.cruz(a)gmail.com>
---
Javier Carrasco (2):
pinctrl: samsung: fix fwnode refcount cleanup if platform_get_irq_optional() fails
pinctrl: samsung: update child reference drop comment
drivers/pinctrl/samsung/pinctrl-samsung.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
---
base-commit: 5b913f5d7d7fe0f567dea8605f21da6eaa1735fb
change-id: 20241106-samsung-pinctrl-put-cb973a35ef52
Best regards,
--
Javier Carrasco <javier.carrasco.cruz(a)gmail.com>
At the end of __regmap_init(), if dev is not NULL, regmap_attach_dev()
is called, which adds a devres reference to the regmap, to be able to
retrieve a dev's regmap by name using dev_get_regmap().
When calling regmap_exit, the opposite does not happen, and the
reference is kept until the dev is detached.
Add a regmap_detach_dev() function and call it in regmap_exit() to make
sure that the devres reference is not kept.
Cc: stable(a)vger.kernel.org
Fixes: 72b39f6f2b5a ("regmap: Implement dev_get_regmap()")
Signed-off-by: Cosmin Tanislav <demonsingur(a)gmail.com>
---
V2:
* switch to static function
V3:
* move inter-version changelog after ---
V4:
* remove mention of exporting the regmap_detach_dev() function
diff --git a/drivers/base/regmap/regmap.c b/drivers/base/regmap/regmap.c
index 53131a7ede0a6..e3e2afc2c83c6 100644
--- a/drivers/base/regmap/regmap.c
+++ b/drivers/base/regmap/regmap.c
@@ -598,6 +598,17 @@ int regmap_attach_dev(struct device *dev, struct regmap *map,
}
EXPORT_SYMBOL_GPL(regmap_attach_dev);
+static int dev_get_regmap_match(struct device *dev, void *res, void *data);
+
+static int regmap_detach_dev(struct device *dev, struct regmap *map)
+{
+ if (!dev)
+ return 0;
+
+ return devres_release(dev, dev_get_regmap_release,
+ dev_get_regmap_match, (void *)map->name);
+}
+
static enum regmap_endian regmap_get_reg_endian(const struct regmap_bus *bus,
const struct regmap_config *config)
{
@@ -1445,6 +1456,7 @@ void regmap_exit(struct regmap *map)
{
struct regmap_async *async;
+ regmap_detach_dev(map->dev, map);
regcache_exit(map);
regmap_debugfs_exit(map);
--
2.47.0
At the end of __regmap_init(), if dev is not NULL, regmap_attach_dev()
is called, which adds a devres reference to the regmap, to be able to
retrieve a dev's regmap by name using dev_get_regmap().
When calling regmap_exit, the opposite does not happen, and the
reference is kept until the dev is detached.
Add a regmap_detach_dev() function, export it and call it in
regmap_exit(), to make sure that the devres reference is not kept.
Fixes: 72b39f6f2b5a ("regmap: Implement dev_get_regmap()")
Signed-off-by: Cosmin Tanislav <demonsingur(a)gmail.com>
---
V2:
* switch to static function
V3:
* move inter-version changelog after ---
drivers/base/regmap/regmap.c | 12 ++++++++++++
1 file changed, 12 insertions(+)
diff --git a/drivers/base/regmap/regmap.c b/drivers/base/regmap/regmap.c
index 53131a7ede0a6..e3e2afc2c83c6 100644
--- a/drivers/base/regmap/regmap.c
+++ b/drivers/base/regmap/regmap.c
@@ -598,6 +598,17 @@ int regmap_attach_dev(struct device *dev, struct regmap *map,
}
EXPORT_SYMBOL_GPL(regmap_attach_dev);
+static int dev_get_regmap_match(struct device *dev, void *res, void *data);
+
+static int regmap_detach_dev(struct device *dev, struct regmap *map)
+{
+ if (!dev)
+ return 0;
+
+ return devres_release(dev, dev_get_regmap_release,
+ dev_get_regmap_match, (void *)map->name);
+}
+
static enum regmap_endian regmap_get_reg_endian(const struct regmap_bus *bus,
const struct regmap_config *config)
{
@@ -1445,6 +1456,7 @@ void regmap_exit(struct regmap *map)
{
struct regmap_async *async;
+ regmap_detach_dev(map->dev, map);
regcache_exit(map);
regmap_debugfs_exit(map);
--
2.47.0
From: Igor Artemiev <Igor.A.Artemiev(a)mcst.ru>
[ Upstream commit a1e2da6a5072f8abe5b0feaa91a5bcd9dc544a04 ]
It is possible, although unlikely, that an integer overflow will occur
when the result of radeon_get_ib_value() is shifted to the left.
Avoid it by casting one of the operands to larger data type (u64).
Found by Linux Verification Center (linuxtesting.org) with static
analysis tool SVACE.
Signed-off-by: Igor Artemiev <Igor.A.Artemiev(a)mcst.ru>
Signed-off-by: Alex Deucher <alexander.deucher(a)amd.com>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
drivers/gpu/drm/radeon/r600_cs.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/radeon/r600_cs.c b/drivers/gpu/drm/radeon/r600_cs.c
index b6bdfb3f4a7f7..580ca4f753531 100644
--- a/drivers/gpu/drm/radeon/r600_cs.c
+++ b/drivers/gpu/drm/radeon/r600_cs.c
@@ -2104,7 +2104,7 @@ static int r600_packet3_check(struct radeon_cs_parser *p,
return -EINVAL;
}
- offset = radeon_get_ib_value(p, idx+1) << 8;
+ offset = (u64)radeon_get_ib_value(p, idx+1) << 8;
if (offset != track->vgt_strmout_bo_offset[idx_value]) {
DRM_ERROR("bad STRMOUT_BASE_UPDATE, bo offset does not match: 0x%llx, 0x%x\n",
offset, track->vgt_strmout_bo_offset[idx_value]);
--
2.43.0
Right now we power-up the device when a user open() the device and we
power it off when the last user close() the first video node.
This behaviour affects the power consumption of the device is multiple
use cases, such as:
- Polling the privacy gpio
- udev probing the device
This patchset introduces a more granular power saving behaviour where
the camera is only awaken when needed. It is compatible with
asynchronous controls.
While developing this patchset, two bugs were found. The patchset has
been developed so these fixes can be taken independently.
Signed-off-by: Ricardo Ribalda <ribalda(a)chromium.org>
---
Ricardo Ribalda (9):
media: uvcvideo: Do not set an async control owned by other fh
media: uvcvideo: Remove dangling pointers
media: uvcvideo: Keep streaming state in the file handle
media: uvcvideo: Move usb_autopm_(get|put)_interface to status_get
media: uvcvideo: Add a uvc_status guard
media: uvcvideo: Increase/decrease the PM counter per IOCTL
media: uvcvideo: Make power management granular
media: uvcvideo: Do not turn on the camera for some ioctls
media: uvcvideo: Remove duplicated cap/out code
drivers/media/usb/uvc/uvc_ctrl.c | 52 +++++++++-
drivers/media/usb/uvc/uvc_status.c | 38 +++++++-
drivers/media/usb/uvc/uvc_v4l2.c | 190 +++++++++++++++----------------------
drivers/media/usb/uvc/uvcvideo.h | 6 ++
4 files changed, 166 insertions(+), 120 deletions(-)
---
base-commit: 72ad4ff638047bbbdf3232178fea4bec1f429319
change-id: 20241126-uvc-granpower-ng-069185a6d474
Best regards,
--
Ricardo Ribalda <ribalda(a)chromium.org>
Some notebooks have a button to disable the camera (not to be mistaken
with the mechanical cover). This is a standard GPIO linked to the
camera via the ACPI table.
4 years ago we added support for this button in UVC via the Privacy control.
This has three issues:
- If the camera has its own privacy control, it will be masked.
- We need to power-up the camera to read the privacy control gpio.
- Other drivers have not followed this approach and have used evdev.
We tried to fix the power-up issues implementing "granular power
saving" but it has been more complicated than anticipated...
This patchset implements the Privacy GPIO as a evdev.
The first patch of this set is already in Laurent's tree... but I
include it to get some CI coverage.
Signed-off-by: Ricardo Ribalda <ribalda(a)chromium.org>
---
Changes in v3:
- CodeStyle (Thanks Sakari)
- Re-implement as input device
- Make the code depend on UVC_INPUT_EVDEV
- Link to v2: https://lore.kernel.org/r/20241108-uvc-subdev-v2-0-85d8a051a3d3@chromium.org
Changes in v2:
- Rebase on top of https://patchwork.linuxtv.org/project/linux-media/patch/20241106-uvc-crashr…
- Create uvc_gpio_cleanup and uvc_gpio_deinit
- Refactor quirk: do not disable irq
- Change define number for MEDIA_ENT_F_GPIO
- Link to v1: https://lore.kernel.org/r/20241031-uvc-subdev-v1-0-a68331cedd72@chromium.org
---
Ricardo Ribalda (8):
media: uvcvideo: Fix crash during unbind if gpio unit is in use
media: uvcvideo: Factor out gpio functions to its own file
media: uvcvideo: Re-implement privacy GPIO as an input device
Revert "media: uvcvideo: Allow entity-defined get_info and get_cur"
media: uvcvideo: Create ancillary link for GPIO subdevice
media: v4l2-core: Add new MEDIA_ENT_F_GPIO
media: uvcvideo: Use MEDIA_ENT_F_GPIO for the GPIO entity
media: uvcvideo: Introduce UVC_QUIRK_PRIVACY_DURING_STREAM
.../userspace-api/media/mediactl/media-types.rst | 4 +
drivers/media/usb/uvc/Kconfig | 2 +-
drivers/media/usb/uvc/Makefile | 3 +
drivers/media/usb/uvc/uvc_ctrl.c | 40 +-----
drivers/media/usb/uvc/uvc_driver.c | 112 +---------------
drivers/media/usb/uvc/uvc_entity.c | 21 ++-
drivers/media/usb/uvc/uvc_gpio.c | 144 +++++++++++++++++++++
drivers/media/usb/uvc/uvc_status.c | 13 +-
drivers/media/usb/uvc/uvc_video.c | 4 +
drivers/media/usb/uvc/uvcvideo.h | 31 +++--
drivers/media/v4l2-core/v4l2-async.c | 3 +-
include/uapi/linux/media.h | 1 +
12 files changed, 223 insertions(+), 155 deletions(-)
---
base-commit: 1b3bb4d69f20be5931abc18a6dbc24ff687fa780
change-id: 20241030-uvc-subdev-89f4467a00b5
Best regards,
--
Ricardo Ribalda <ribalda(a)chromium.org>
From: Oleg Nesterov <oleg(a)redhat.com>
[ Upstream commit c7b4133c48445dde789ed30b19ccb0448c7593f7 ]
1. Clear utask->xol_vaddr unconditionally, even if this addr is not valid,
xol_free_insn_slot() should never return with utask->xol_vaddr != NULL.
2. Add a comment to explain why do we need to validate slot_addr.
3. Simplify the validation above. We can simply check offset < PAGE_SIZE,
unsigned underflows are fine, it should work if slot_addr < area->vaddr.
4. Kill the unnecessary "slot_nr >= UINSNS_PER_PAGE" check, slot_nr must
be valid if offset < PAGE_SIZE.
The next patches will cleanup this function even more.
Signed-off-by: Oleg Nesterov <oleg(a)redhat.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz(a)infradead.org>
Link: https://lore.kernel.org/r/20240929144235.GA9471@redhat.com
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
kernel/events/uprobes.c | 21 +++++++++------------
1 file changed, 9 insertions(+), 12 deletions(-)
diff --git a/kernel/events/uprobes.c b/kernel/events/uprobes.c
index 1ea2c1f311261..220d5f4a57e6b 100644
--- a/kernel/events/uprobes.c
+++ b/kernel/events/uprobes.c
@@ -1634,8 +1634,8 @@ static unsigned long xol_get_insn_slot(struct uprobe *uprobe)
static void xol_free_insn_slot(struct task_struct *tsk)
{
struct xol_area *area;
- unsigned long vma_end;
unsigned long slot_addr;
+ unsigned long offset;
if (!tsk->mm || !tsk->mm->uprobes_state.xol_area || !tsk->utask)
return;
@@ -1644,24 +1644,21 @@ static void xol_free_insn_slot(struct task_struct *tsk)
if (unlikely(!slot_addr))
return;
+ tsk->utask->xol_vaddr = 0;
area = tsk->mm->uprobes_state.xol_area;
- vma_end = area->vaddr + PAGE_SIZE;
- if (area->vaddr <= slot_addr && slot_addr < vma_end) {
- unsigned long offset;
- int slot_nr;
-
- offset = slot_addr - area->vaddr;
- slot_nr = offset / UPROBE_XOL_SLOT_BYTES;
- if (slot_nr >= UINSNS_PER_PAGE)
- return;
+ offset = slot_addr - area->vaddr;
+ /*
+ * slot_addr must fit into [area->vaddr, area->vaddr + PAGE_SIZE).
+ * This check can only fail if the "[uprobes]" vma was mremap'ed.
+ */
+ if (offset < PAGE_SIZE) {
+ int slot_nr = offset / UPROBE_XOL_SLOT_BYTES;
clear_bit(slot_nr, area->bitmap);
atomic_dec(&area->slot_count);
smp_mb__after_atomic(); /* pairs with prepare_to_wait() */
if (waitqueue_active(&area->wq))
wake_up(&area->wq);
-
- tsk->utask->xol_vaddr = 0;
}
}
--
2.43.0
From: Thomas Richter <tmricht(a)linux.ibm.com>
[ Upstream commit a0bd7dacbd51c632b8e2c0500b479af564afadf3 ]
CPU hotplug remove handling triggers the following function
call sequence:
CPUHP_AP_PERF_S390_SF_ONLINE --> s390_pmu_sf_offline_cpu()
...
CPUHP_AP_PERF_ONLINE --> perf_event_exit_cpu()
The s390 CPUMF sampling CPU hotplug handler invokes:
s390_pmu_sf_offline_cpu()
+--> cpusf_pmu_setup()
+--> setup_pmc_cpu()
+--> deallocate_buffers()
This function de-allocates all sampling data buffers (SDBs) allocated
for that CPU at event initialization. It also clears the
PMU_F_RESERVED bit. The CPU is gone and can not be sampled.
With the event still being active on the removed CPU, the CPU event
hotplug support in kernel performance subsystem triggers the
following function calls on the removed CPU:
perf_event_exit_cpu()
+--> perf_event_exit_cpu_context()
+--> __perf_event_exit_context()
+--> __perf_remove_from_context()
+--> event_sched_out()
+--> cpumsf_pmu_del()
+--> cpumsf_pmu_stop()
+--> hw_perf_event_update()
to stop and remove the event. During removal of the event, the
sampling device driver tries to read out the remaining samples from
the sample data buffers (SDBs). But they have already been freed
(and may have been re-assigned). This may lead to a use after free
situation in which case the samples are most likely invalid. In the
best case the memory has not been reassigned and still contains
valid data.
Remedy this situation and check if the CPU is still in reserved
state (bit PMU_F_RESERVED set). In this case the SDBs have not been
released an contain valid data. This is always the case when
the event is removed (and no CPU hotplug off occured).
If the PMU_F_RESERVED bit is not set, the SDB buffers are gone.
Signed-off-by: Thomas Richter <tmricht(a)linux.ibm.com>
Reviewed-by: Hendrik Brueckner <brueckner(a)linux.ibm.com>
Signed-off-by: Heiko Carstens <hca(a)linux.ibm.com>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
arch/s390/kernel/perf_cpum_sf.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/arch/s390/kernel/perf_cpum_sf.c b/arch/s390/kernel/perf_cpum_sf.c
index a9e05f4d0a483..fc45f123f3bdc 100644
--- a/arch/s390/kernel/perf_cpum_sf.c
+++ b/arch/s390/kernel/perf_cpum_sf.c
@@ -1896,7 +1896,9 @@ static void cpumsf_pmu_stop(struct perf_event *event, int flags)
event->hw.state |= PERF_HES_STOPPED;
if ((flags & PERF_EF_UPDATE) && !(event->hw.state & PERF_HES_UPTODATE)) {
- hw_perf_event_update(event, 1);
+ /* CPU hotplug off removes SDBs. No samples to extract. */
+ if (cpuhw->flags & PMU_F_RESERVED)
+ hw_perf_event_update(event, 1);
event->hw.state |= PERF_HES_UPTODATE;
}
perf_pmu_enable(event->pmu);
--
2.43.0
From: Thomas Richter <tmricht(a)linux.ibm.com>
[ Upstream commit a0bd7dacbd51c632b8e2c0500b479af564afadf3 ]
CPU hotplug remove handling triggers the following function
call sequence:
CPUHP_AP_PERF_S390_SF_ONLINE --> s390_pmu_sf_offline_cpu()
...
CPUHP_AP_PERF_ONLINE --> perf_event_exit_cpu()
The s390 CPUMF sampling CPU hotplug handler invokes:
s390_pmu_sf_offline_cpu()
+--> cpusf_pmu_setup()
+--> setup_pmc_cpu()
+--> deallocate_buffers()
This function de-allocates all sampling data buffers (SDBs) allocated
for that CPU at event initialization. It also clears the
PMU_F_RESERVED bit. The CPU is gone and can not be sampled.
With the event still being active on the removed CPU, the CPU event
hotplug support in kernel performance subsystem triggers the
following function calls on the removed CPU:
perf_event_exit_cpu()
+--> perf_event_exit_cpu_context()
+--> __perf_event_exit_context()
+--> __perf_remove_from_context()
+--> event_sched_out()
+--> cpumsf_pmu_del()
+--> cpumsf_pmu_stop()
+--> hw_perf_event_update()
to stop and remove the event. During removal of the event, the
sampling device driver tries to read out the remaining samples from
the sample data buffers (SDBs). But they have already been freed
(and may have been re-assigned). This may lead to a use after free
situation in which case the samples are most likely invalid. In the
best case the memory has not been reassigned and still contains
valid data.
Remedy this situation and check if the CPU is still in reserved
state (bit PMU_F_RESERVED set). In this case the SDBs have not been
released an contain valid data. This is always the case when
the event is removed (and no CPU hotplug off occured).
If the PMU_F_RESERVED bit is not set, the SDB buffers are gone.
Signed-off-by: Thomas Richter <tmricht(a)linux.ibm.com>
Reviewed-by: Hendrik Brueckner <brueckner(a)linux.ibm.com>
Signed-off-by: Heiko Carstens <hca(a)linux.ibm.com>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
arch/s390/kernel/perf_cpum_sf.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/arch/s390/kernel/perf_cpum_sf.c b/arch/s390/kernel/perf_cpum_sf.c
index c7f94b5d93968..e51babd0bbc10 100644
--- a/arch/s390/kernel/perf_cpum_sf.c
+++ b/arch/s390/kernel/perf_cpum_sf.c
@@ -1772,7 +1772,9 @@ static void cpumsf_pmu_stop(struct perf_event *event, int flags)
event->hw.state |= PERF_HES_STOPPED;
if ((flags & PERF_EF_UPDATE) && !(event->hw.state & PERF_HES_UPTODATE)) {
- hw_perf_event_update(event, 1);
+ /* CPU hotplug off removes SDBs. No samples to extract. */
+ if (cpuhw->flags & PMU_F_RESERVED)
+ hw_perf_event_update(event, 1);
event->hw.state |= PERF_HES_UPTODATE;
}
perf_pmu_enable(event->pmu);
--
2.43.0
From: John Watts <contact(a)jookia.org>
[ Upstream commit f8da001ae7af0abd9f6250c02c01a1121074ca60 ]
The audio graph card doesn't mark its subnodes such as multi {}, dpcm {}
and c2c {} as not requiring any suppliers. This causes a hang as Linux
waits for these phantom suppliers to show up on boot.
Make it clear these nodes have no suppliers.
Example error message:
[ 15.208558] platform 2034000.i2s: deferred probe pending: platform: wait for supplier /sound/multi
[ 15.208584] platform sound: deferred probe pending: asoc-audio-graph-card2: parse error
Signed-off-by: John Watts <contact(a)jookia.org>
Acked-by: Kuninori Morimoto <kuninori.morimoto.gx(a)renesas.com>
Link: https://patch.msgid.link/20241108-graph_dt_fix-v1-1-173e2f9603d6@jookia.org
Signed-off-by: Mark Brown <broonie(a)kernel.org>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
sound/soc/generic/audio-graph-card2.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/sound/soc/generic/audio-graph-card2.c b/sound/soc/generic/audio-graph-card2.c
index 8ac6df645ee6c..33f35eaa76a8b 100644
--- a/sound/soc/generic/audio-graph-card2.c
+++ b/sound/soc/generic/audio-graph-card2.c
@@ -249,16 +249,19 @@ static enum graph_type __graph_get_type(struct device_node *lnk)
if (of_node_name_eq(np, GRAPH_NODENAME_MULTI)) {
ret = GRAPH_MULTI;
+ fw_devlink_purge_absent_suppliers(&np->fwnode);
goto out_put;
}
if (of_node_name_eq(np, GRAPH_NODENAME_DPCM)) {
ret = GRAPH_DPCM;
+ fw_devlink_purge_absent_suppliers(&np->fwnode);
goto out_put;
}
if (of_node_name_eq(np, GRAPH_NODENAME_C2C)) {
ret = GRAPH_C2C;
+ fw_devlink_purge_absent_suppliers(&np->fwnode);
goto out_put;
}
--
2.43.0
The condition in replenish_dl_new_period() that checks if a reservation
(dl_server) is deferred and is not handling a starvation case is
obviously wrong.
Fix it.
Cc: stable(a)vger.kernel.org
Cc: Ingo Molnar <mingo(a)redhat.com>
Cc: Peter Zijlstra <peterz(a)infradead.org>
Cc: Vincent Guittot <vincent.guittot(a)linaro.org>
Cc: Dietmar Eggemann <dietmar.eggemann(a)arm.com>
Cc: Steven Rostedt <rostedt(a)goodmis.org>
Cc: Ben Segall <bsegall(a)google.com>
Cc: Mel Gorman <mgorman(a)suse.de>
Cc: Valentin Schneider <vschneid(a)redhat.com>
Fixes: a110a81c52a9 ("sched/deadline: Deferrable dl server")
Signed-off-by: Juri Lelli <juri.lelli(a)redhat.com>
---
kernel/sched/deadline.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c
index d9d5a702f1a6..206691d35b7d 100644
--- a/kernel/sched/deadline.c
+++ b/kernel/sched/deadline.c
@@ -781,7 +781,7 @@ static inline void replenish_dl_new_period(struct sched_dl_entity *dl_se,
* If it is a deferred reservation, and the server
* is not handling an starvation case, defer it.
*/
- if (dl_se->dl_defer & !dl_se->dl_defer_running) {
+ if (dl_se->dl_defer && !dl_se->dl_defer_running) {
dl_se->dl_throttled = 1;
dl_se->dl_defer_armed = 1;
}
--
2.47.0
From: Kan Liang <kan.liang(a)linux.intel.com>
The PEBS kernel warnings can still be observed with the below case.
when the below commands are running in parallel for a while.
while true;
do
perf record --no-buildid -a --intr-regs=AX \
-e cpu/event=0xd0,umask=0x81/pp \
-c 10003 -o /dev/null ./triad;
done &
while true;
do
perf record -e 'cpu/mem-loads,ldlat=3/uP' -W -d -- ./dtlb
done
The commit b752ea0c28e3 ("perf/x86/intel/ds: Flush PEBS DS when changing
PEBS_DATA_CFG") intends to flush the entire PEBS buffer before the
hardware is reprogrammed. However, it fails in the above case.
The first perf command utilizes the large PEBS, while the second perf
command only utilizes a single PEBS. When the second perf event is
added, only the n_pebs++. The intel_pmu_pebs_enable() is invoked after
intel_pmu_pebs_add(). So the cpuc->n_pebs == cpuc->n_large_pebs check in
the intel_pmu_drain_large_pebs() fails. The PEBS DS is not flushed.
The new PEBS event should not be taken into account when flushing the
existing PEBS DS.
The check is unnecessary here. Before the hardware is reprogrammed, all
the stale records must be drained unconditionally.
For single PEBS or PEBS-vi-pt, the DS must be empty. The drain_pebs()
can handle the empty case. There is no harm to unconditionally drain the
PEBS DS.
Fixes: b752ea0c28e3 ("perf/x86/intel/ds: Flush PEBS DS when changing PEBS_DATA_CFG")
Signed-off-by: Kan Liang <kan.liang(a)linux.intel.com>
Cc: stable(a)vger.kernel.org
---
arch/x86/events/intel/ds.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/x86/events/intel/ds.c b/arch/x86/events/intel/ds.c
index 8afc4ad3cd16..1a4b326ca2ce 100644
--- a/arch/x86/events/intel/ds.c
+++ b/arch/x86/events/intel/ds.c
@@ -1489,7 +1489,7 @@ void intel_pmu_pebs_enable(struct perf_event *event)
* hence we need to drain when changing said
* size.
*/
- intel_pmu_drain_large_pebs(cpuc);
+ intel_pmu_drain_pebs_buffer();
adaptive_pebs_record_size_update();
wrmsrl(MSR_PEBS_DATA_CFG, pebs_data_cfg);
cpuc->active_pebs_data_cfg = pebs_data_cfg;
--
2.38.1
Added a check for ubi_num for negative numbers
If the variable ubi_num takes negative values then we get:
qemu-system-arm ... -append "ubi.mtd=0,0,0,-22222345" ...
[ 0.745065] ubi_attach_mtd_dev from ubi_init+0x178/0x218
[ 0.745230] ubi_init from do_one_initcall+0x70/0x1ac
[ 0.745344] do_one_initcall from kernel_init_freeable+0x198/0x224
[ 0.745474] kernel_init_freeable from kernel_init+0x18/0x134
[ 0.745600] kernel_init from ret_from_fork+0x14/0x28
[ 0.745727] Exception stack(0x90015fb0 to 0x90015ff8)
Found by Linux Verification Center (linuxtesting.org) with SVACE.
Fixes: 83ff59a06663 ("UBI: support ubi_num on mtd.ubi command line")
Cc: stable(a)vger.kernel.org
Signed-off-by: Denis Arefev <arefev(a)swemel.ru>
---
V1 -> V2: changed the tag Fixes and moved the check to ubi_mtd_param_parse()
drivers/mtd/ubi/build.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/mtd/ubi/build.c b/drivers/mtd/ubi/build.c
index 30be4ed68fad..ef6a22f372f9 100644
--- a/drivers/mtd/ubi/build.c
+++ b/drivers/mtd/ubi/build.c
@@ -1537,7 +1537,7 @@ static int ubi_mtd_param_parse(const char *val, const struct kernel_param *kp)
if (token) {
int err = kstrtoint(token, 10, &p->ubi_num);
- if (err) {
+ if (err || p->ubi_num < UBI_DEV_NUM_AUTO) {
pr_err("UBI error: bad value for ubi_num parameter: %s\n",
token);
return -EINVAL;
--
2.25.1
The patch below does not apply to the 4.19-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-4.19.y
git checkout FETCH_HEAD
git cherry-pick -x afc545da381ba0c651b2658966ac737032676f01
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024120221-serving-certainly-75b1@gregkh' --subject-prefix 'PATCH 4.19.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From afc545da381ba0c651b2658966ac737032676f01 Mon Sep 17 00:00:00 2001
From: Qiu-ji Chen <chenqiuji666(a)gmail.com>
Date: Tue, 5 Nov 2024 21:09:19 +0800
Subject: [PATCH] xen: Fix the issue of resource not being properly released in
xenbus_dev_probe()
This patch fixes an issue in the function xenbus_dev_probe(). In the
xenbus_dev_probe() function, within the if (err) branch at line 313, the
program incorrectly returns err directly without releasing the resources
allocated by err = drv->probe(dev, id). As the return value is non-zero,
the upper layers assume the processing logic has failed. However, the probe
operation was performed earlier without a corresponding remove operation.
Since the probe actually allocates resources, failing to perform the remove
operation could lead to problems.
To fix this issue, we followed the resource release logic of the
xenbus_dev_remove() function by adding a new block fail_remove before the
fail_put block. After entering the branch if (err) at line 313, the
function will use a goto statement to jump to the fail_remove block,
ensuring that the previously acquired resources are correctly released,
thus preventing the reference count leak.
This bug was identified by an experimental static analysis tool developed
by our team. The tool specializes in analyzing reference count operations
and detecting potential issues where resources are not properly managed.
In this case, the tool flagged the missing release operation as a
potential problem, which led to the development of this patch.
Fixes: 4bac07c993d0 ("xen: add the Xenbus sysfs and virtual device hotplug driver")
Cc: stable(a)vger.kernel.org
Signed-off-by: Qiu-ji Chen <chenqiuji666(a)gmail.com>
Reviewed-by: Juergen Gross <jgross(a)suse.com>
Message-ID: <20241105130919.4621-1-chenqiuji666(a)gmail.com>
Signed-off-by: Juergen Gross <jgross(a)suse.com>
diff --git a/drivers/xen/xenbus/xenbus_probe.c b/drivers/xen/xenbus/xenbus_probe.c
index 9f097f1f4a4c..6d32ffb01136 100644
--- a/drivers/xen/xenbus/xenbus_probe.c
+++ b/drivers/xen/xenbus/xenbus_probe.c
@@ -313,7 +313,7 @@ int xenbus_dev_probe(struct device *_dev)
if (err) {
dev_warn(&dev->dev, "watch_otherend on %s failed.\n",
dev->nodename);
- return err;
+ goto fail_remove;
}
dev->spurious_threshold = 1;
@@ -322,6 +322,12 @@ int xenbus_dev_probe(struct device *_dev)
dev->nodename);
return 0;
+fail_remove:
+ if (drv->remove) {
+ down(&dev->reclaim_sem);
+ drv->remove(dev);
+ up(&dev->reclaim_sem);
+ }
fail_put:
module_put(drv->driver.owner);
fail:
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.4.y
git checkout FETCH_HEAD
git cherry-pick -x afc545da381ba0c651b2658966ac737032676f01
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024120220-amuck-esophagus-6542@gregkh' --subject-prefix 'PATCH 5.4.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From afc545da381ba0c651b2658966ac737032676f01 Mon Sep 17 00:00:00 2001
From: Qiu-ji Chen <chenqiuji666(a)gmail.com>
Date: Tue, 5 Nov 2024 21:09:19 +0800
Subject: [PATCH] xen: Fix the issue of resource not being properly released in
xenbus_dev_probe()
This patch fixes an issue in the function xenbus_dev_probe(). In the
xenbus_dev_probe() function, within the if (err) branch at line 313, the
program incorrectly returns err directly without releasing the resources
allocated by err = drv->probe(dev, id). As the return value is non-zero,
the upper layers assume the processing logic has failed. However, the probe
operation was performed earlier without a corresponding remove operation.
Since the probe actually allocates resources, failing to perform the remove
operation could lead to problems.
To fix this issue, we followed the resource release logic of the
xenbus_dev_remove() function by adding a new block fail_remove before the
fail_put block. After entering the branch if (err) at line 313, the
function will use a goto statement to jump to the fail_remove block,
ensuring that the previously acquired resources are correctly released,
thus preventing the reference count leak.
This bug was identified by an experimental static analysis tool developed
by our team. The tool specializes in analyzing reference count operations
and detecting potential issues where resources are not properly managed.
In this case, the tool flagged the missing release operation as a
potential problem, which led to the development of this patch.
Fixes: 4bac07c993d0 ("xen: add the Xenbus sysfs and virtual device hotplug driver")
Cc: stable(a)vger.kernel.org
Signed-off-by: Qiu-ji Chen <chenqiuji666(a)gmail.com>
Reviewed-by: Juergen Gross <jgross(a)suse.com>
Message-ID: <20241105130919.4621-1-chenqiuji666(a)gmail.com>
Signed-off-by: Juergen Gross <jgross(a)suse.com>
diff --git a/drivers/xen/xenbus/xenbus_probe.c b/drivers/xen/xenbus/xenbus_probe.c
index 9f097f1f4a4c..6d32ffb01136 100644
--- a/drivers/xen/xenbus/xenbus_probe.c
+++ b/drivers/xen/xenbus/xenbus_probe.c
@@ -313,7 +313,7 @@ int xenbus_dev_probe(struct device *_dev)
if (err) {
dev_warn(&dev->dev, "watch_otherend on %s failed.\n",
dev->nodename);
- return err;
+ goto fail_remove;
}
dev->spurious_threshold = 1;
@@ -322,6 +322,12 @@ int xenbus_dev_probe(struct device *_dev)
dev->nodename);
return 0;
+fail_remove:
+ if (drv->remove) {
+ down(&dev->reclaim_sem);
+ drv->remove(dev);
+ up(&dev->reclaim_sem);
+ }
fail_put:
module_put(drv->driver.owner);
fail:
During mass manufacturing, we noticed the mmc_rx_crc_error counter,
as reported by "ethtool -S eth0 | grep mmc_rx_crc_error" to increase
above zero during nuttcp speedtests.
Cycling through the rx_delay range on two boards shows that there is a
large "good" region from 0x11 to 0x35 (see below for details).
This commit increases rx_delay to 0x11, which is the smallest
possible change that fixes the issue we are seeing on the KSZ9031 PHY.
This also matches what most other rk3399 boards do.
Tests for Puma PCBA S/N TT0069903:
rx_delay mmc_rx_crc_error
-------- ----------------
0x09 (dhcp broken)
0x10 897
0x11 0
0x20 0
0x30 0
0x35 0
0x3a 745
0x3b 11375
0x3c 36680
0x40 (dhcp broken)
0x7f (dhcp broken)
Tests for Puma PCBA S/N TT0157733:
rx_delay mmc_rx_crc_error
-------- ----------------
0x10 59
0x11 0
0x35 0
Fixes: 2c66fc34e945 ("arm64: dts: rockchip: add RK3399-Q7 (Puma) SoM")
Cc: <stable(a)vger.kernel.org>
Reviewed-by: Quentin Schulz <quentin.schulz(a)cherry.de>
Signed-off-by: Jakob Unterwurzacher <jakob.unterwurzacher(a)cherry.de>
---
v2: cc stable, add "Fixes:", add omitted "there" to commit msg,
add Reviewed-by.
arch/arm64/boot/dts/rockchip/rk3399-puma.dtsi | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/arm64/boot/dts/rockchip/rk3399-puma.dtsi b/arch/arm64/boot/dts/rockchip/rk3399-puma.dtsi
index 9efcdce0f593..13d0c511046b 100644
--- a/arch/arm64/boot/dts/rockchip/rk3399-puma.dtsi
+++ b/arch/arm64/boot/dts/rockchip/rk3399-puma.dtsi
@@ -181,7 +181,7 @@ &gmac {
snps,reset-active-low;
snps,reset-delays-us = <0 10000 50000>;
tx_delay = <0x10>;
- rx_delay = <0x10>;
+ rx_delay = <0x11>;
status = "okay";
};
--
2.39.5
On certain i.MX8 series parts [1], the PPS channel 0
is routed internally to eDMA, and the external PPS
pin is available on channel 1. In addition, on
certain boards, the PPS may be wired on the PCB to
an EVENTOUTn pin other than 0. On these systems
it is necessary that the PPS channel be able
to be configured from the Device Tree.
[1] https://lore.kernel.org/all/ZrPYOWA3FESx197L@lizhi-Precision-Tower-5810/
Francesco Dolcini (3):
dt-bindings: net: fec: add pps channel property
net: fec: refactor PPS channel configuration
net: fec: make PPS channel configurable
Documentation/devicetree/bindings/net/fsl,fec.yaml | 7 +++++++
drivers/net/ethernet/freescale/fec_ptp.c | 11 ++++++-----
2 files changed, 13 insertions(+), 5 deletions(-)
--
2.34.1
Netpoll will explicitly pass the polling call with a budget of 0 to
indicate it's clearing the Tx path only. For the gve_rx_poll and
gve_xdp_poll, they were mistakenly taking the 0 budget as the indication
to do all the work. Add check to avoid the rx path and xdp path being
called when budget is 0. And also avoid napi_complete_done being called
when budget is 0 for netpoll.
The original fix was merged here:
https://lore.kernel.org/r/20231114004144.2022268-1-ziweixiao@google.com
Resend it since the original one was not cleanly applied to 6.1 kernel.
Fixes: f5cedc84a30d ("gve: Add transmit and receive support")
Signed-off-by: Ziwei Xiao <ziweixiao(a)google.com>
Reviewed-by: Praveen Kaligineedi <pkaligineedi(a)google.com>
Signed-off-by: Praveen Kaligineedi <pkaligineedi(a)google.com>
---
drivers/net/ethernet/google/gve/gve_main.c | 7 +++++++
drivers/net/ethernet/google/gve/gve_rx.c | 4 ----
drivers/net/ethernet/google/gve/gve_tx.c | 4 ----
3 files changed, 7 insertions(+), 8 deletions(-)
diff --git a/drivers/net/ethernet/google/gve/gve_main.c b/drivers/net/ethernet/google/gve/gve_main.c
index d3f6ad586ba1..8771ccfc69b4 100644
--- a/drivers/net/ethernet/google/gve/gve_main.c
+++ b/drivers/net/ethernet/google/gve/gve_main.c
@@ -202,6 +202,10 @@ static int gve_napi_poll(struct napi_struct *napi, int budget)
if (block->tx)
reschedule |= gve_tx_poll(block, budget);
+
+ if (!budget)
+ return 0;
+
if (block->rx) {
work_done = gve_rx_poll(block, budget);
reschedule |= work_done == budget;
@@ -242,6 +246,9 @@ static int gve_napi_poll_dqo(struct napi_struct *napi, int budget)
if (block->tx)
reschedule |= gve_tx_poll_dqo(block, /*do_clean=*/true);
+ if (!budget)
+ return 0;
+
if (block->rx) {
work_done = gve_rx_poll_dqo(block, budget);
reschedule |= work_done == budget;
diff --git a/drivers/net/ethernet/google/gve/gve_rx.c b/drivers/net/ethernet/google/gve/gve_rx.c
index 021bbf308d68..639eb6848c7d 100644
--- a/drivers/net/ethernet/google/gve/gve_rx.c
+++ b/drivers/net/ethernet/google/gve/gve_rx.c
@@ -778,10 +778,6 @@ int gve_rx_poll(struct gve_notify_block *block, int budget)
feat = block->napi.dev->features;
- /* If budget is 0, do all the work */
- if (budget == 0)
- budget = INT_MAX;
-
if (budget > 0)
work_done = gve_clean_rx_done(rx, budget, feat);
diff --git a/drivers/net/ethernet/google/gve/gve_tx.c b/drivers/net/ethernet/google/gve/gve_tx.c
index 5e11b8236754..bf1ac0d1dc6f 100644
--- a/drivers/net/ethernet/google/gve/gve_tx.c
+++ b/drivers/net/ethernet/google/gve/gve_tx.c
@@ -725,10 +725,6 @@ bool gve_tx_poll(struct gve_notify_block *block, int budget)
u32 nic_done;
u32 to_do;
- /* If budget is 0, do all the work */
- if (budget == 0)
- budget = INT_MAX;
-
/* In TX path, it may try to clean completed pkts in order to xmit,
* to avoid cleaning conflict, use spin_lock(), it yields better
* concurrency between xmit/clean than netif's lock.
--
2.47.0.338.g60cca15819-goog
Commit b8e0ddd36ce9 ("can: mcp251xfd: tef: prepare to workaround
broken TEF FIFO tail index erratum") introduced
mcp251xfd_get_tef_len() to get the number of unhandled transmit events
from the Transmit Event FIFO (TEF).
As the TEF has no head index, the driver uses the TX-FIFO's tail index
instead, assuming that send frames are completed.
When calculating the number of unhandled TEF events, that commit
didn't take mcp2518fd erratum DS80000789E 6. into account. According
to that erratum, the FIFOCI bits of a FIFOSTA register, here the
TX-FIFO tail index might be corrupted.
However here it seems the bit indicating that the TX-FIFO is
empty (MCP251XFD_REG_FIFOSTA_TFERFFIF) is not correct while the
TX-FIFO tail index is.
Assume that the TX-FIFO is indeed empty if:
- Chip's head and tail index are equal (len == 0).
- The TX-FIFO is less than half full.
(The TX-FIFO empty case has already been checked at the
beginning of this function.)
- No free buffers in the TX ring.
If the TX-FIFO is assumed to be empty, assume that the TEF is full and
return the number of elements in the TX-FIFO (which equals the number
of TEF elements).
If these assumptions are false, the driver might read to many objects
from the TEF. mcp251xfd_handle_tefif_one() checks the sequence numbers
and will refuse to process old events.
Reported-by: Renjaya Raga Zenta <renjaya.zenta(a)formulatrix.com>
Closes: https://patch.msgid.link/CAJ7t6HgaeQ3a_OtfszezU=zB-FqiZXqrnATJ3UujNoQJJf7Gg…
Fixes: b8e0ddd36ce9 ("can: mcp251xfd: tef: prepare to workaround broken TEF FIFO tail index erratum")
Tested-by: Renjaya Raga Zenta <renjaya.zenta(a)formulatrix.com>
Cc: stable(a)vger.kernel.org
Link: https://patch.msgid.link/20241126-mcp251xfd-fix-length-calculation-v2-1-c2e…
Signed-off-by: Marc Kleine-Budde <mkl(a)pengutronix.de>
---
drivers/net/can/spi/mcp251xfd/mcp251xfd-tef.c | 29 ++++++++++++++++++-
1 file changed, 28 insertions(+), 1 deletion(-)
diff --git a/drivers/net/can/spi/mcp251xfd/mcp251xfd-tef.c b/drivers/net/can/spi/mcp251xfd/mcp251xfd-tef.c
index d3ac865933fd..e94321849fd7 100644
--- a/drivers/net/can/spi/mcp251xfd/mcp251xfd-tef.c
+++ b/drivers/net/can/spi/mcp251xfd/mcp251xfd-tef.c
@@ -21,6 +21,11 @@ static inline bool mcp251xfd_tx_fifo_sta_empty(u32 fifo_sta)
return fifo_sta & MCP251XFD_REG_FIFOSTA_TFERFFIF;
}
+static inline bool mcp251xfd_tx_fifo_sta_less_than_half_full(u32 fifo_sta)
+{
+ return fifo_sta & MCP251XFD_REG_FIFOSTA_TFHRFHIF;
+}
+
static inline int
mcp251xfd_tef_tail_get_from_chip(const struct mcp251xfd_priv *priv,
u8 *tef_tail)
@@ -147,7 +152,29 @@ mcp251xfd_get_tef_len(struct mcp251xfd_priv *priv, u8 *len_p)
BUILD_BUG_ON(sizeof(tx_ring->obj_num) != sizeof(len));
len = (chip_tx_tail << shift) - (tail << shift);
- *len_p = len >> shift;
+ len >>= shift;
+
+ /* According to mcp2518fd erratum DS80000789E 6. the FIFOCI
+ * bits of a FIFOSTA register, here the TX-FIFO tail index
+ * might be corrupted.
+ *
+ * However here it seems the bit indicating that the TX-FIFO
+ * is empty (MCP251XFD_REG_FIFOSTA_TFERFFIF) is not correct
+ * while the TX-FIFO tail index is.
+ *
+ * We assume the TX-FIFO is empty, i.e. all pending CAN frames
+ * haven been send, if:
+ * - Chip's head and tail index are equal (len == 0).
+ * - The TX-FIFO is less than half full.
+ * (The TX-FIFO empty case has already been checked at the
+ * beginning of this function.)
+ * - No free buffers in the TX ring.
+ */
+ if (len == 0 && mcp251xfd_tx_fifo_sta_less_than_half_full(fifo_sta) &&
+ mcp251xfd_get_tx_free(tx_ring) == 0)
+ len = tx_ring->obj_num;
+
+ *len_p = len;
return 0;
}
--
2.45.2
Commit a7a7c1d423a6 ("f2fs: fix fiemap failure issue when page size is 16KB")
It resolves an infinite loop in fiemap when using 16k f2fs filesystems.
Please apply to stable 6.7-6.12
-Daniel
This patchset fixes two bugs with the async controls for the uvc driver.
They were found while implementing the granular PM, but I am sending
them as a separate patches, so they can be reviewed sooner. They fix
real issues in the driver that need to be taken care.
Signed-off-by: Ricardo Ribalda <ribalda(a)chromium.org>
---
Changes in v4:
- Fix implementation of uvc_ctrl_set_handle.
- Link to v3: https://lore.kernel.org/r/20241129-uvc-fix-async-v3-0-ab675ce66db7@chromium…
Changes in v3:
- change again! order of patches.
- Introduce uvc_ctrl_set_handle.
- Do not change ctrl->handle if it is not NULL.
Changes in v2:
- Annotate lockdep
- ctrl->handle != handle
- Change order of patches
- Move documentation of mutex
- Link to v1: https://lore.kernel.org/r/20241127-uvc-fix-async-v1-0-eb8722531b8c@chromium…
---
Ricardo Ribalda (4):
media: uvcvideo: Do not replace the handler of an async ctrl
media: uvcvideo: Remove dangling pointers
media: uvcvideo: Annotate lock requirements for uvc_ctrl_set
media: uvcvideo: Remove redundant NULL assignment
drivers/media/usb/uvc/uvc_ctrl.c | 62 ++++++++++++++++++++++++++++++++++++----
drivers/media/usb/uvc/uvc_v4l2.c | 2 ++
drivers/media/usb/uvc/uvcvideo.h | 14 +++++++--
3 files changed, 70 insertions(+), 8 deletions(-)
---
base-commit: 72ad4ff638047bbbdf3232178fea4bec1f429319
change-id: 20241127-uvc-fix-async-2c9d40413ad8
Best regards,
--
Ricardo Ribalda <ribalda(a)chromium.org>
Unplugging a USB3.0 webcam while streaming results in errors like this:
[ 132.646387] xhci_hcd 0000:03:00.0: ERROR Transfer event TRB DMA ptr not part of current TD ep_index 18 comp_code 13
[ 132.646446] xhci_hcd 0000:03:00.0: Looking for event-dma 000000002fdf8630 trb-start 000000002fdf8640 trb-end 000000002fdf8650 seg-start 000000002fdf8000 seg-end 000000002fdf8ff0
[ 132.646560] xhci_hcd 0000:03:00.0: ERROR Transfer event TRB DMA ptr not part of current TD ep_index 18 comp_code 13
[ 132.646568] xhci_hcd 0000:03:00.0: Looking for event-dma 000000002fdf8660 trb-start 000000002fdf8670 trb-end 000000002fdf8670 seg-start 000000002fdf8000 seg-end 000000002fdf8ff0
If an error is detected while processing the last TRB of an isoc TD,
the Etron xHC generates two transfer events for the TRB where the
error was detected. The first event can be any sort of error (like
USB Transaction or Babble Detected, etc), and the final event is
Success.
The xHCI driver will handle the TD after the first event and remove it
from its internal list, and then print an "Transfer event TRB DMA ptr
not part of current TD" error message after the final event.
Commit 5372c65e1311 ("xhci: process isoc TD properly when there was a
transaction error mid TD.") is designed to address isoc transaction
errors, but unfortunately it doesn't account for this scenario.
To work around this by reusing the logic that handles isoc transaction
errors, but continuing to wait for the final event when this condition
occurs. Sometimes we see the Stopped event after an error mid TD, this
is a normal event for a pending TD and we can think of it as the final
event we are waiting for.
Check if the XHCI_ETRON_HOST quirk flag is set before invoking the
workaround in process_isoc_td().
Fixes: 5372c65e1311 ("xhci: process isoc TD properly when there was a transaction error mid TD.")
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Kuangyi Chiang <ki.chiang65(a)gmail.com>
---
drivers/usb/host/xhci-ring.c | 29 +++++++++++++++++++++--------
1 file changed, 21 insertions(+), 8 deletions(-)
diff --git a/drivers/usb/host/xhci-ring.c b/drivers/usb/host/xhci-ring.c
index 4cf5363875c7..a51eb3526ae3 100644
--- a/drivers/usb/host/xhci-ring.c
+++ b/drivers/usb/host/xhci-ring.c
@@ -2450,8 +2450,10 @@ static void process_isoc_td(struct xhci_hcd *xhci, struct xhci_virt_ep *ep,
switch (trb_comp_code) {
case COMP_SUCCESS:
/* Don't overwrite status if TD had an error, see xHCI 4.9.1 */
- if (td->error_mid_td)
+ if (td->error_mid_td) {
+ td->error_mid_td = false;
break;
+ }
if (remaining) {
frame->status = short_framestatus;
sum_trbs_for_length = true;
@@ -2466,25 +2468,36 @@ static void process_isoc_td(struct xhci_hcd *xhci, struct xhci_virt_ep *ep,
case COMP_BANDWIDTH_OVERRUN_ERROR:
frame->status = -ECOMM;
break;
+ case COMP_USB_TRANSACTION_ERROR:
case COMP_BABBLE_DETECTED_ERROR:
sum_trbs_for_length = true;
fallthrough;
case COMP_ISOCH_BUFFER_OVERRUN:
frame->status = -EOVERFLOW;
+ if (trb_comp_code == COMP_USB_TRANSACTION_ERROR)
+ frame->status = -EPROTO;
if (ep_trb != td->end_trb)
td->error_mid_td = true;
+ else
+ td->error_mid_td = false;
+
+ /*
+ * If an error is detected on the last TRB of the TD,
+ * wait for the final event.
+ */
+ if ((xhci->quirks & XHCI_ETRON_HOST) &&
+ td->urb->dev->speed >= USB_SPEED_SUPER &&
+ ep_trb == td->end_trb)
+ td->error_mid_td = true;
break;
case COMP_INCOMPATIBLE_DEVICE_ERROR:
case COMP_STALL_ERROR:
frame->status = -EPROTO;
break;
- case COMP_USB_TRANSACTION_ERROR:
- frame->status = -EPROTO;
- sum_trbs_for_length = true;
- if (ep_trb != td->end_trb)
- td->error_mid_td = true;
- break;
case COMP_STOPPED:
+ /* Think of it as the final event if TD had an error */
+ if (td->error_mid_td)
+ td->error_mid_td = false;
sum_trbs_for_length = true;
break;
case COMP_STOPPED_SHORT_PACKET:
@@ -2517,7 +2530,7 @@ static void process_isoc_td(struct xhci_hcd *xhci, struct xhci_virt_ep *ep,
finish_td:
/* Don't give back TD yet if we encountered an error mid TD */
- if (td->error_mid_td && ep_trb != td->end_trb) {
+ if (td->error_mid_td) {
xhci_dbg(xhci, "Error mid isoc TD, wait for final completion event\n");
td->urb_length_set = true;
return;
--
2.25.1
From: Pali Rohár <pali(a)kernel.org>
upstream e2a8910af01653c1c268984855629d71fb81f404 commit.
ReparseDataLength is sum of the InodeType size and DataBuffer size.
So to get DataBuffer size it is needed to subtract InodeType's size from
ReparseDataLength.
Function cifs_strndup_from_utf16() is currentlly accessing buf->DataBuffer
at position after the end of the buffer because it does not subtract
InodeType size from the length. Fix this problem and correctly subtract
variable len.
Member InodeType is present only when reparse buffer is large enough. Check
for ReparseDataLength before accessing InodeType to prevent another invalid
memory access.
Major and minor rdev values are present also only when reparse buffer is
large enough. Check for reparse buffer size before calling reparse_mkdev().
Fixes: d5ecebc4900d ("smb3: Allow query of symlinks stored as reparse points")
Reviewed-by: Paulo Alcantara (Red Hat) <pc(a)manguebit.com>
Signed-off-by: Pali Rohár <pali(a)kernel.org>
Signed-off-by: Steve French <stfrench(a)microsoft.com>
[use variable name symlink_buf, the other buf->InodeType accesses are
not used in current version so skip]
Signed-off-by: Mahmoud Adam <mngyadam(a)amazon.com>
---
This fixes CVE-2024-49996, and applies cleanly on 5.4->6.1, 6.6 and
later already has the fix.
fs/smb/client/smb2ops.c | 6 ++++++
1 file changed, 6 insertions(+)
diff --git a/fs/smb/client/smb2ops.c b/fs/smb/client/smb2ops.c
index d1e5ff9a3cd39..fcfbc096924a8 100644
--- a/fs/smb/client/smb2ops.c
+++ b/fs/smb/client/smb2ops.c
@@ -2897,6 +2897,12 @@ parse_reparse_posix(struct reparse_posix_data *symlink_buf,
/* See MS-FSCC 2.1.2.6 for the 'NFS' style reparse tags */
len = le16_to_cpu(symlink_buf->ReparseDataLength);
+ if (len < sizeof(symlink_buf->InodeType)) {
+ cifs_dbg(VFS, "srv returned malformed nfs buffer\n");
+ return -EIO;
+ }
+
+ len -= sizeof(symlink_buf->InodeType);
if (le64_to_cpu(symlink_buf->InodeType) != NFS_SPECFILE_LNK) {
cifs_dbg(VFS, "%lld not a supported symlink type\n",
--
2.40.1
CC stable.
This needs picking up for 6.12
Head commit 573f45a9f9a47 applied by Linus with a modified commit message.
David
> -----Original Message-----
> From: David Laight
> Sent: 24 November 2024 15:39
> To: 'Linus Torvalds' <torvalds(a)linux-foundation.org>; 'Andrew Cooper' <andrew.cooper3(a)citrix.com>;
> 'bp(a)alien8.de' <bp(a)alien8.de>; 'Josh Poimboeuf' <jpoimboe(a)kernel.org>
> Cc: 'x86(a)kernel.org' <x86(a)kernel.org>; 'linux-kernel(a)vger.kernel.org' <linux-kernel(a)vger.kernel.org>;
> 'Arnd Bergmann' <arnd(a)kernel.org>; 'Mikel Rychliski' <mikel(a)mikelr.com>; 'Thomas Gleixner'
> <tglx(a)linutronix.de>; 'Ingo Molnar' <mingo(a)redhat.com>; 'Borislav Petkov' <bp(a)alien8.de>; 'Dave
> Hansen' <dave.hansen(a)linux.intel.com>; 'H. Peter Anvin' <hpa(a)zytor.com>
> Subject: [PATCH v2] x86: Allow user accesses to the base of the guard page
>
> __access_ok() calls valid_user_address() with the address after
> the last byte of the user buffer.
> It is valid for a buffer to end with the last valid user address
> so valid_user_address() must allow accesses to the base of the
> guard page.
>
> Fixes: 86e6b1547b3d0 ("x86: fix user address masking non-canonical speculation issue")
> Signed-off-by: David Laight <david.laight(a)aculab.com>
> ---
>
> v2: Rewritten commit message.
>
> arch/x86/kernel/cpu/common.c | 4 ++--
> 1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
> index 06a516f6795b..ca327cfa42ae 100644
> --- a/arch/x86/kernel/cpu/common.c
> +++ b/arch/x86/kernel/cpu/common.c
> @@ -2389,12 +2389,12 @@ void __init arch_cpu_finalize_init(void)
> alternative_instructions();
>
> if (IS_ENABLED(CONFIG_X86_64)) {
> - unsigned long USER_PTR_MAX = TASK_SIZE_MAX-1;
> + unsigned long USER_PTR_MAX = TASK_SIZE_MAX;
>
> /*
> * Enable this when LAM is gated on LASS support
> if (cpu_feature_enabled(X86_FEATURE_LAM))
> - USER_PTR_MAX = (1ul << 63) - PAGE_SIZE - 1;
> + USER_PTR_MAX = (1ul << 63) - PAGE_SIZE;
> */
> runtime_const_init(ptr, USER_PTR_MAX);
>
> --
> 2.17.1
-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)
The quilt patch titled
Subject: mm: vmscan: ensure kswapd is woken up if the wait queue is active
has been removed from the -mm tree. Its filename was
mm-vmscan-ensure-kswapd-is-woken-up-if-the-wait-queue-is-active.patch
This patch was dropped because an updated version will be issued
------------------------------------------------------
From: Seiji Nishikawa <snishika(a)redhat.com>
Subject: mm: vmscan: ensure kswapd is woken up if the wait queue is active
Date: Wed, 27 Nov 2024 00:06:12 +0900
Even after commit 501b26510ae3 ("vmstat: allow_direct_reclaim should use
zone_page_state_snapshot"), a task may remain indefinitely stuck in
throttle_direct_reclaim() while holding mm->rwsem.
__alloc_pages_nodemask
try_to_free_pages
throttle_direct_reclaim
This can cause numerous other tasks to wait on the same rwsem, leading
to severe system hangups:
[1088963.358712] INFO: task python3:1670971 blocked for more than 120 seconds.
[1088963.365653] Tainted: G OE -------- - - 4.18.0-553.el8_10.aarch64 #1
[1088963.373887] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[1088963.381862] task:python3 state:D stack:0 pid:1670971 ppid:1667117 flags:0x00800080
[1088963.381869] Call trace:
[1088963.381872] __switch_to+0xd0/0x120
[1088963.381877] __schedule+0x340/0xac8
[1088963.381881] schedule+0x68/0x118
[1088963.381886] rwsem_down_read_slowpath+0x2d4/0x4b8
The issue arises when allow_direct_reclaim(pgdat) returns false,
preventing progress even when the pgdat->pfmemalloc_wait wait queue is
empty. Despite the wait queue being empty, the condition,
allow_direct_reclaim(pgdat), may still be returning false, causing it to
continue looping.
In some cases, reclaimable pages exist (zone_reclaimable_pages() returns
> 0), but calculations of pfmemalloc_reserve and free_pages result in
wmark_ok being false.
And then, despite the pgdat->kswapd_wait queue being non-empty, kswapd
is not woken up, further exacerbating the problem:
crash> px ((struct pglist_data *) 0xffff00817fffe540)->kswapd_highest_zoneidx
$775 = __MAX_NR_ZONES
The issue likely occurs under specific conditions: high memory pressure
with frequent direct reclaim, contention on mmap_sem from concurrent
memory allocations, reclaimable pages exist, but zone states cause
wmark_ok to return false.
Modern workloads (e.g., Python multiprocessing) and changes in kernel
reclaim logic may have surfaced such edge cases more prominently than
before.
The workload involves concurrent Python processes under high memory
pressure, leading to contention on mmap_sem. While not unusual, this
workload may trigger a rare combination of conditions that expose the
issue.
This patch modifies allow_direct_reclaim() to wake kswapd if the
pgdat->kswapd_wait queue is active, regardless of whether wmark_ok is true
or false. This change ensures kswapd does not miss wake-ups under high
memory pressure, reducing the risk of task stalls in the throttled reclaim
path.
Link: https://lkml.kernel.org/r/20241126150612.114561-1-snishika@redhat.com
Signed-off-by: Seiji Nishikawa <snishika(a)redhat.com>
Cc: Mel Gorman <mgorman(a)techsingularity.net>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/vmscan.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
--- a/mm/vmscan.c~mm-vmscan-ensure-kswapd-is-woken-up-if-the-wait-queue-is-active
+++ a/mm/vmscan.c
@@ -6389,8 +6389,8 @@ static bool allow_direct_reclaim(pg_data
wmark_ok = free_pages > pfmemalloc_reserve / 2;
- /* kswapd must be awake if processes are being throttled */
- if (!wmark_ok && waitqueue_active(&pgdat->kswapd_wait)) {
+ /* Always wake up kswapd if the wait queue is not empty */
+ if (waitqueue_active(&pgdat->kswapd_wait)) {
if (READ_ONCE(pgdat->kswapd_highest_zoneidx) > ZONE_NORMAL)
WRITE_ONCE(pgdat->kswapd_highest_zoneidx, ZONE_NORMAL);
_
Patches currently in -mm which might be from snishika(a)redhat.com are
mm-vmscan-account-for-free-pages-to-prevent-infinite-loop-in-throttle_direct_reclaim.patch
The patch titled
Subject: mm/hugetlb: change ENOSPC to ENOMEM in alloc_hugetlb_folio
has been added to the -mm mm-hotfixes-unstable branch. Its filename is
mm-hugetlb-change-enospc-to-enomem-in-alloc_hugetlb_folio.patch
This patch will shortly appear at
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patche…
This patch will later appear in the mm-hotfixes-unstable branch at
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next via the mm-everything
branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there every 2-3 working days
------------------------------------------------------
From: Dafna Hirschfeld <dafna.hirschfeld(a)intel.com>
Subject: mm/hugetlb: change ENOSPC to ENOMEM in alloc_hugetlb_folio
Date: Sun, 1 Dec 2024 03:03:41 +0200
The error ENOSPC is translated in vmf_error to VM_FAULT_SIGBUS which is
further translated in EFAULT in i.e. pin/get_user_pages. But when
running out of pages/hugepages we expect to see ENOMEM and not EFAULT.
Link: https://lkml.kernel.org/r/20241201010341.1382431-1-dafna.hirschfeld@intel.c…
Fixes: 8f34af6f93ae ("mm, hugetlb: move the error handle logic out of normal code path")
Signed-off-by: Dafna Hirschfeld <dafna.hirschfeld(a)intel.com>
Cc: Muchun Song <muchun.song(a)linux.dev>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/hugetlb.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
--- a/mm/hugetlb.c~mm-hugetlb-change-enospc-to-enomem-in-alloc_hugetlb_folio
+++ a/mm/hugetlb.c
@@ -3113,7 +3113,7 @@ out_end_reservation:
if (!memcg_charge_ret)
mem_cgroup_cancel_charge(memcg, nr_pages);
mem_cgroup_put(memcg);
- return ERR_PTR(-ENOSPC);
+ return ERR_PTR(-ENOMEM);
}
int alloc_bootmem_huge_page(struct hstate *h, int nid)
_
Patches currently in -mm which might be from dafna.hirschfeld(a)intel.com are
mm-hugetlb-change-enospc-to-enomem-in-alloc_hugetlb_folio.patch
The patch titled
Subject: mm: vmscan: account for free pages to prevent infinite Loop in throttle_direct_reclaim()
has been added to the -mm mm-hotfixes-unstable branch. Its filename is
mm-vmscan-account-for-free-pages-to-prevent-infinite-loop-in-throttle_direct_reclaim.patch
This patch will shortly appear at
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patche…
This patch will later appear in the mm-hotfixes-unstable branch at
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next via the mm-everything
branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there every 2-3 working days
------------------------------------------------------
From: Seiji Nishikawa <snishika(a)redhat.com>
Subject: mm: vmscan: account for free pages to prevent infinite Loop in throttle_direct_reclaim()
Date: Sun, 1 Dec 2024 01:12:34 +0900
The task sometimes continues looping in throttle_direct_reclaim() because
allow_direct_reclaim(pgdat) keeps returning false.
#0 [ffff80002cb6f8d0] __switch_to at ffff8000080095ac
#1 [ffff80002cb6f900] __schedule at ffff800008abbd1c
#2 [ffff80002cb6f990] schedule at ffff800008abc50c
#3 [ffff80002cb6f9b0] throttle_direct_reclaim at ffff800008273550
#4 [ffff80002cb6fa20] try_to_free_pages at ffff800008277b68
#5 [ffff80002cb6fae0] __alloc_pages_nodemask at ffff8000082c4660
#6 [ffff80002cb6fc50] alloc_pages_vma at ffff8000082e4a98
#7 [ffff80002cb6fca0] do_anonymous_page at ffff80000829f5a8
#8 [ffff80002cb6fce0] __handle_mm_fault at ffff8000082a5974
#9 [ffff80002cb6fd90] handle_mm_fault at ffff8000082a5bd4
At this point, the pgdat contains the following two zones:
NODE: 4 ZONE: 0 ADDR: ffff00817fffe540 NAME: "DMA32"
SIZE: 20480 MIN/LOW/HIGH: 11/28/45
VM_STAT:
NR_FREE_PAGES: 359
NR_ZONE_INACTIVE_ANON: 18813
NR_ZONE_ACTIVE_ANON: 0
NR_ZONE_INACTIVE_FILE: 50
NR_ZONE_ACTIVE_FILE: 0
NR_ZONE_UNEVICTABLE: 0
NR_ZONE_WRITE_PENDING: 0
NR_MLOCK: 0
NR_BOUNCE: 0
NR_ZSPAGES: 0
NR_FREE_CMA_PAGES: 0
NODE: 4 ZONE: 1 ADDR: ffff00817fffec00 NAME: "Normal"
SIZE: 8454144 PRESENT: 98304 MIN/LOW/HIGH: 68/166/264
VM_STAT:
NR_FREE_PAGES: 146
NR_ZONE_INACTIVE_ANON: 94668
NR_ZONE_ACTIVE_ANON: 3
NR_ZONE_INACTIVE_FILE: 735
NR_ZONE_ACTIVE_FILE: 78
NR_ZONE_UNEVICTABLE: 0
NR_ZONE_WRITE_PENDING: 0
NR_MLOCK: 0
NR_BOUNCE: 0
NR_ZSPAGES: 0
NR_FREE_CMA_PAGES: 0
In allow_direct_reclaim(), while processing ZONE_DMA32, the sum of
inactive/active file-backed pages calculated in zone_reclaimable_pages()
based on the result of zone_page_state_snapshot() is zero.
Additionally, since this system lacks swap, the calculation of inactive/
active anonymous pages is skipped.
crash> p nr_swap_pages
nr_swap_pages = $1937 = {
counter = 0
}
As a result, ZONE_DMA32 is deemed unreclaimable and skipped, moving on to
the processing of the next zone, ZONE_NORMAL, despite ZONE_DMA32 having
free pages significantly exceeding the high watermark.
The problem is that the pgdat->kswapd_failures hasn't been incremented.
crash> px ((struct pglist_data *) 0xffff00817fffe540)->kswapd_failures
$1935 = 0x0
This is because the node deemed balanced. The node balancing logic in
balance_pgdat() evaluates all zones collectively. If one or more zones
(e.g., ZONE_DMA32) have enough free pages to meet their watermarks, the
entire node is deemed balanced. This causes balance_pgdat() to exit early
before incrementing the kswapd_failures, as it considers the overall
memory state acceptable, even though some zones (like ZONE_NORMAL) remain
under significant pressure.
The patch ensures that zone_reclaimable_pages() includes free pages
(NR_FREE_PAGES) in its calculation when no other reclaimable pages are
available (e.g., file-backed or anonymous pages). This change prevents
zones like ZONE_DMA32, which have sufficient free pages, from being
mistakenly deemed unreclaimable. By doing so, the patch ensures proper
node balancing, avoids masking pressure on other zones like ZONE_NORMAL,
and prevents infinite loops in throttle_direct_reclaim() caused by
allow_direct_reclaim(pgdat) repeatedly returning false.
The kernel hangs due to a task stuck in throttle_direct_reclaim(), caused
by a node being incorrectly deemed balanced despite pressure in certain
zones, such as ZONE_NORMAL. This issue arises from
zone_reclaimable_pages() returning 0 for zones without reclaimable file-
backed or anonymous pages, causing zones like ZONE_DMA32 with sufficient
free pages to be skipped.
The lack of swap or reclaimable pages results in ZONE_DMA32 being ignored
during reclaim, masking pressure in other zones. Consequently,
pgdat->kswapd_failures remains 0 in balance_pgdat(), preventing fallback
mechanisms in allow_direct_reclaim() from being triggered, leading to an
infinite loop in throttle_direct_reclaim().
This patch modifies zone_reclaimable_pages() to account for free pages
(NR_FREE_PAGES) when no other reclaimable pages exist. This ensures zones
with sufficient free pages are not skipped, enabling proper balancing and
reclaim behavior.
Link: https://lkml.kernel.org/r/20241130164346.436469-1-snishika@redhat.com
Link: https://lkml.kernel.org/r/20241130161236.433747-2-snishika@redhat.com
Signed-off-by: Seiji Nishikawa <snishika(a)redhat.com>
Cc: Mel Gorman <mgorman(a)techsingularity.net>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/vmscan.c | 9 ++++++++-
1 file changed, 8 insertions(+), 1 deletion(-)
--- a/mm/vmscan.c~mm-vmscan-account-for-free-pages-to-prevent-infinite-loop-in-throttle_direct_reclaim
+++ a/mm/vmscan.c
@@ -374,7 +374,14 @@ unsigned long zone_reclaimable_pages(str
if (can_reclaim_anon_pages(NULL, zone_to_nid(zone), NULL))
nr += zone_page_state_snapshot(zone, NR_ZONE_INACTIVE_ANON) +
zone_page_state_snapshot(zone, NR_ZONE_ACTIVE_ANON);
-
+ /*
+ * If there are no reclaimable file-backed or anonymous pages,
+ * ensure zones with sufficient free pages are not skipped.
+ * This prevents zones like DMA32 from being ignored in reclaim
+ * scenarios where they can still help alleviate memory pressure.
+ */
+ if (nr == 0)
+ nr = zone_page_state_snapshot(zone, NR_FREE_PAGES);
return nr;
}
_
Patches currently in -mm which might be from snishika(a)redhat.com are
mm-vmscan-ensure-kswapd-is-woken-up-if-the-wait-queue-is-active.patch
mm-vmscan-account-for-free-pages-to-prevent-infinite-loop-in-throttle_direct_reclaim.patch
The patch titled
Subject: mm: vmscan: account for free pages to prevent infinite Loop in throttle_direct_reclaim()
has been added to the -mm mm-hotfixes-unstable branch. Its filename is
mm-vmscan-account-for-free-pages-to-prevent-infinite-loop-in-throttle_direct_reclaim.patch
This patch will shortly appear at
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patche…
This patch will later appear in the mm-hotfixes-unstable branch at
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next via the mm-everything
branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there every 2-3 working days
------------------------------------------------------
From: Seiji Nishikawa <snishika(a)redhat.com>
Subject: mm: vmscan: account for free pages to prevent infinite Loop in throttle_direct_reclaim()
Date: Sun, 1 Dec 2024 01:12:34 +0900
The kernel hangs due to a task stuck in throttle_direct_reclaim(), caused
by a node being incorrectly deemed balanced despite pressure in certain
zones, such as ZONE_NORMAL. This issue arises from
zone_reclaimable_pages() returning 0 for zones without reclaimable file-
backed or anonymous pages, causing zones like ZONE_DMA32 with sufficient
free pages to be skipped.
The lack of swap or reclaimable pages results in ZONE_DMA32 being ignored
during reclaim, masking pressure in other zones. Consequently,
pgdat->kswapd_failures remains 0 in balance_pgdat(), preventing fallback
mechanisms in allow_direct_reclaim() from being triggered, leading to an
infinite loop in throttle_direct_reclaim().
This patch modifies zone_reclaimable_pages() to account for free pages
(NR_FREE_PAGES) when no other reclaimable pages exist. This ensures zones
with sufficient free pages are not skipped, enabling proper balancing and
reclaim behavior.
Link: https://lkml.kernel.org/r/20241130161236.433747-2-snishika@redhat.com
Signed-off-by: Seiji Nishikawa <snishika(a)redhat.com>
Cc: Mel Gorman <mgorman(a)techsingularity.net>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/vmscan.c | 9 ++++++++-
1 file changed, 8 insertions(+), 1 deletion(-)
--- a/mm/vmscan.c~mm-vmscan-account-for-free-pages-to-prevent-infinite-loop-in-throttle_direct_reclaim
+++ a/mm/vmscan.c
@@ -374,7 +374,14 @@ unsigned long zone_reclaimable_pages(str
if (can_reclaim_anon_pages(NULL, zone_to_nid(zone), NULL))
nr += zone_page_state_snapshot(zone, NR_ZONE_INACTIVE_ANON) +
zone_page_state_snapshot(zone, NR_ZONE_ACTIVE_ANON);
-
+ /*
+ * If there are no reclaimable file-backed or anonymous pages,
+ * ensure zones with sufficient free pages are not skipped.
+ * This prevents zones like DMA32 from being ignored in reclaim
+ * scenarios where they can still help alleviate memory pressure.
+ */
+ if (nr == 0)
+ nr = zone_page_state_snapshot(zone, NR_FREE_PAGES);
return nr;
}
_
Patches currently in -mm which might be from snishika(a)redhat.com are
mm-vmscan-ensure-kswapd-is-woken-up-if-the-wait-queue-is-active.patch
mm-vmscan-account-for-free-pages-to-prevent-infinite-loop-in-throttle_direct_reclaim.patch
[ Upstream commit 122aba8c80618eca904490b1733af27fb8f07528 ]
Recent kernels cause a lot of TCP retransmissions
[ ID] Interval Transfer Bitrate Retr Cwnd
[ 5] 0.00-1.00 sec 2.24 GBytes 19.2 Gbits/sec 2767 442 KBytes
[ 5] 1.00-2.00 sec 2.23 GBytes 19.1 Gbits/sec 2312 350 KBytes
^^^^
Replacing the qdisc with pfifo makes retransmissions go away.
It appears that a flow may have a delayed packet with a very near
Tx time. Later, we may get busy processing Rx and the target Tx time
will pass, but we won't service Tx since the CPU is busy with Rx.
If Rx sees an ACK and we try to push more data for the delayed flow
we may fastpath the skb, not realizing that there are already "ready
to send" packets for this flow sitting in the qdisc.
Don't trust the fastpath if we are "behind" according to the projected
Tx time for next flow waiting in the Qdisc. Because we consider anything
within the offload window to be okay for fastpath we must consider
the entire offload window as "now".
Qdisc config:
qdisc fq 8001: dev eth0 parent 1234:1 limit 10000p flow_limit 100p \
buckets 32768 orphan_mask 1023 bands 3 \
priomap 1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1 \
weights 589824 196608 65536 quantum 3028b initial_quantum 15140b \
low_rate_threshold 550Kbit \
refill_delay 40ms timer_slack 10us horizon 10s horizon_drop
For iperf this change seems to do fine, the reordering is gone.
The fastpath still gets used most of the time:
gc 0 highprio 0 fastpath 142614 throttled 418309 latency 19.1us
xx_behind 2731
where "xx_behind" counts how many times we hit the new "return false".
CC: stable(a)vger.kernel.org
Fixes: 076433bd78d7 ("net_sched: sch_fq: add fast path for mostly idle qdisc")
Signed-off-by: Jakub Kicinski <kuba(a)kernel.org>
Reviewed-by: Eric Dumazet <edumazet(a)google.com>
Link: https://patch.msgid.link/20241124022148.3126719-1-kuba@kernel.org
Signed-off-by: Paolo Abeni <pabeni(a)redhat.com>
[stable: drop the offload horizon, it's not supported / 0]
Signed-off-by: Jakub Kicinski <kuba(a)kernel.org>
---
Per Fixes tag 6.7+, so the two non-longterm branches.
---
net/sched/sch_fq.c | 6 ++++++
1 file changed, 6 insertions(+)
diff --git a/net/sched/sch_fq.c b/net/sched/sch_fq.c
index 19a49af5a9e5..afefe124d903 100644
--- a/net/sched/sch_fq.c
+++ b/net/sched/sch_fq.c
@@ -331,6 +331,12 @@ static bool fq_fastpath_check(const struct Qdisc *sch, struct sk_buff *skb,
*/
if (q->internal.qlen >= 8)
return false;
+
+ /* Ordering invariants fall apart if some delayed flows
+ * are ready but we haven't serviced them, yet.
+ */
+ if (q->time_next_delayed_flow <= now)
+ return false;
}
sk = skb->sk;
--
2.47.0
From: Celeste Liu <CoelacanthusHex(a)gmail.com>
The return value of syscall_enter_from_user_mode() is always -1 when the
syscall was filtered. We can't know whether syscall_nr is -1 when we get -1
from syscall_enter_from_user_mode(). And the old syscall variable is
unusable because syscall_enter_from_user_mode() may change a7 register.
So get correct syscall number from syscall_get_nr().
So syscall number part of return value of syscall_enter_from_user_mode()
is completely useless. We can remove it from API and require caller to
get syscall number from syscall_get_nr(). But this change affect more
architectures and will block more time. So we split it into another
patchset to avoid block this fix. (Other architectures can works
without this change but riscv need it, see Link: tag below)
Fixes: 61119394631f ("riscv: entry: always initialize regs->a0 to -ENOSYS")
Reported-by: Andrea Bolognani <abologna(a)redhat.com>
Closes: https://github.com/strace/strace/issues/315
Link: https://lore.kernel.org/all/59505464-c84a-403d-972f-d4b2055eeaac@gmail.com/
Signed-off-by: Celeste Liu <CoelacanthusHex(a)gmail.com>
---
arch/riscv/kernel/traps.c | 13 ++++++++++---
1 file changed, 10 insertions(+), 3 deletions(-)
diff --git a/arch/riscv/kernel/traps.c b/arch/riscv/kernel/traps.c
index 51ebfd23e0076447518081d137102a9a11ff2e45..3125fab8ee4af468ace9f692dd34e1797555cce3 100644
--- a/arch/riscv/kernel/traps.c
+++ b/arch/riscv/kernel/traps.c
@@ -316,18 +316,25 @@ void do_trap_ecall_u(struct pt_regs *regs)
{
if (user_mode(regs)) {
long syscall = regs->a7;
+ long res;
regs->epc += 4;
regs->orig_a0 = regs->a0;
- regs->a0 = -ENOSYS;
riscv_v_vstate_discard(regs);
- syscall = syscall_enter_from_user_mode(regs, syscall);
+ res = syscall_enter_from_user_mode(regs, syscall);
+ /*
+ * Call syscall_get_nr() again because syscall_enter_from_user_mode()
+ * may change a7 register.
+ */
+ syscall = syscall_get_nr(current, regs);
add_random_kstack_offset();
- if (syscall >= 0 && syscall < NR_syscalls)
+ if (syscall < 0 || syscall >= NR_syscalls)
+ regs->a0 = -ENOSYS;
+ else if (res != -1)
syscall_handler(regs, syscall);
/*
---
base-commit: 2f87d0916ce0d2925cedbc9e8f5d6291ba2ac7b2
change-id: 20241016-fix-riscv-syscall-nr-917b566f97f3
Best regards,
--
Celeste Liu <CoelacanthusHex(a)gmail.com>
Respected Partners,
Thank you for being patient, and we regret the delay in replying to your last message. We acknowledge your inquiry and are delighted to offer you the information you need.
This email contains an attached screenshot with essential information about your request. Open the attachment to explore the relevant details and gain a full understanding of the data included.
If you have any inquiries or need further assistance, please do not hesitate to reach out. We are ready and willing to assist you, providing all the help you require.
With appreciation,
Diann Gibbs
Sapphire Strategies, LLC
+1 (212) 586-44-37
Hi,
Jerry has been working on getting a lot of testing for these two commits:
commit 9afeda049642 ("drm/amd/display: Skip Invalid Streams from DSC
Policy")
commit 4641169a8c95 ("drm/amd/display: Fix incorrect DSC recompute trigger")
They fix a ton of MST issues reported in the drm/amd tracker over the
last few kernel releases.
Can you please apply to 6.11.y and 6.12.y?
Thanks,
From: Luiz Augusto von Dentz <luiz.von.dentz(a)intel.com>
commit b25e11f978b63cb7857890edb3a698599cddb10e upstream.
This aligned BR/EDR JUST_WORKS method with LE which since 92516cd97fd4
("Bluetooth: Always request for user confirmation for Just Works")
always request user confirmation with confirm_hint set since the
likes of bluetoothd have dedicated policy around JUST_WORKS method
(e.g. main.conf:JustWorksRepairing).
CVE: CVE-2024-8805
Cc: stable(a)vger.kernel.org
Fixes: ba15a58b179e ("Bluetooth: Fix SSP acceptor just-works confirmation without MITM")
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz(a)intel.com>
Tested-by: Kiran K <kiran.k(a)intel.com>
[Nikita: minor fix to resolve a conflict caused by different debug
print macros used around the change: keep BT_DBG() instead of
bt_dev_dbg().]
Signed-off-by: Nikita Zhandarovich <n.zhandarovich(a)fintech.ru>
---
net/bluetooth/hci_event.c | 13 +++++--------
1 file changed, 5 insertions(+), 8 deletions(-)
diff --git a/net/bluetooth/hci_event.c b/net/bluetooth/hci_event.c
index 58c029958759..546795425119 100644
--- a/net/bluetooth/hci_event.c
+++ b/net/bluetooth/hci_event.c
@@ -4751,19 +4751,16 @@ static void hci_user_confirm_request_evt(struct hci_dev *hdev,
goto unlock;
}
- /* If no side requires MITM protection; auto-accept */
+ /* If no side requires MITM protection; use JUST_CFM method */
if ((!loc_mitm || conn->remote_cap == HCI_IO_NO_INPUT_OUTPUT) &&
(!rem_mitm || conn->io_capability == HCI_IO_NO_INPUT_OUTPUT)) {
- /* If we're not the initiators request authorization to
- * proceed from user space (mgmt_user_confirm with
- * confirm_hint set to 1). The exception is if neither
- * side had MITM or if the local IO capability is
- * NoInputNoOutput, in which case we do auto-accept
+ /* If we're not the initiator of request authorization and the
+ * local IO capability is not NoInputNoOutput, use JUST_WORKS
+ * method (mgmt_user_confirm with confirm_hint set to 1).
*/
if (!test_bit(HCI_CONN_AUTH_PEND, &conn->flags) &&
- conn->io_capability != HCI_IO_NO_INPUT_OUTPUT &&
- (loc_mitm || rem_mitm)) {
+ conn->io_capability != HCI_IO_NO_INPUT_OUTPUT) {
BT_DBG("Confirming auto-accept as acceptor");
confirm_hint = 1;
goto confirm;
--
2.25.1
sn65dsi83.c: fix dual-channel LVDS output also divide porches
When generating dual-channel LVDS to a single display, the
horizontal part has to be divided in halves for each channel.
This was done correctly for hactive, but not for the porches.
Of course this does only apply to sn65dsi84, which is also covered
by this driver.
Cc: stable(a)vger.kernel.org
Signed-off-by: Markus Bauer <markus.bauer2(a)avnet.com>
---
drivers/gpu/drm/bridge/ti-sn65dsi83.c | 16 +++++++++++++---
1 file changed, 13 insertions(+), 3 deletions(-)
diff --git a/drivers/gpu/drm/bridge/ti-sn65dsi83.c b/drivers/gpu/drm/bridge/ti-sn65dsi83.c
index ad73f69d768d..d71f752e79ec 100644
--- a/drivers/gpu/drm/bridge/ti-sn65dsi83.c
+++ b/drivers/gpu/drm/bridge/ti-sn65dsi83.c
@@ -399,7 +399,7 @@ static void sn65dsi83_atomic_pre_enable(struct drm_bridge *bridge,
unsigned int pval;
__le16 le16val;
u16 val;
- int ret;
+ int ret, hfront, hback;
ret = regulator_enable(ctx->vcc);
if (ret) {
@@ -521,12 +521,22 @@ static void sn65dsi83_atomic_pre_enable(struct drm_bridge *bridge,
le16val = cpu_to_le16(mode->vsync_end - mode->vsync_start);
regmap_bulk_write(ctx->regmap, REG_VID_CHA_VSYNC_PULSE_WIDTH_LOW,
&le16val, 2);
+
+ hback = mode->htotal - mode->hsync_end;
+ if (ctx->lvds_dual_link)
+ hback /= 2;
+
regmap_write(ctx->regmap, REG_VID_CHA_HORIZONTAL_BACK_PORCH,
- mode->htotal - mode->hsync_end);
+ hback);
regmap_write(ctx->regmap, REG_VID_CHA_VERTICAL_BACK_PORCH,
mode->vtotal - mode->vsync_end);
+
+ hfront = mode->hsync_start - mode->hdisplay;
+ if (ctx->lvds_dual_link)
+ hfront /= 2;
+
regmap_write(ctx->regmap, REG_VID_CHA_HORIZONTAL_FRONT_PORCH,
- mode->hsync_start - mode->hdisplay);
+ hfront);
regmap_write(ctx->regmap, REG_VID_CHA_VERTICAL_FRONT_PORCH,
mode->vsync_start - mode->vdisplay);
regmap_write(ctx->regmap, REG_VID_CHA_TEST_PATTERN, 0x00);
--
2.34.1
--
Markus Bauer
Avnet Embedded is becoming TRIA:
www.tria-technologies.com
We continuously commit to comply with the applicable data protection laws and ensure fair and transparent processing of your personal data.
Please read our privacy statement including an information notice and data protection policy for detailed information on our website.
The patch titled
Subject: maple_tree: simplify split calculation
has been added to the -mm mm-unstable branch. Its filename is
maple_tree-simplify-split-calculation.patch
This patch will shortly appear at
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patche…
This patch will later appear in the mm-unstable branch at
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next via the mm-everything
branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there every 2-3 working days
------------------------------------------------------
From: Wei Yang <richard.weiyang(a)gmail.com>
Subject: maple_tree: simplify split calculation
Date: Wed, 13 Nov 2024 03:16:14 +0000
Patch series "simplify split calculation", v3.
This patch (of 3):
The current calculation for splitting nodes tries to enforce a minimum
span on the leaf nodes. This code is complex and never worked correctly
to begin with, due to the min value being passed as 0 for all leaves.
The calculation should just split the data as equally as possible
between the new nodes. Note that b_end will be one more than the data,
so the left side is still favoured in the calculation.
The current code may also lead to a deficient node by not leaving enough
data for the right side of the split. This issue is also addressed with
the split calculation change.
[Liam.Howlett(a)Oracle.com: rephrase the change log]
Link: https://lkml.kernel.org/r/20241113031616.10530-1-richard.weiyang@gmail.com
Link: https://lkml.kernel.org/r/20241113031616.10530-2-richard.weiyang@gmail.com
Fixes: 54a611b60590 ("Maple Tree: add new data structure")
Signed-off-by: Wei Yang <richard.weiyang(a)gmail.com>
Reviewed-by: Liam R. Howlett <Liam.Howlett(a)Oracle.com>
Cc: Sidhartha Kumar <sidhartha.kumar(a)oracle.com>
Cc: Lorenzo Stoakes <lorenzo.stoakes(a)oracle.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
lib/maple_tree.c | 23 ++++++-----------------
1 file changed, 6 insertions(+), 17 deletions(-)
--- a/lib/maple_tree.c~maple_tree-simplify-split-calculation
+++ a/lib/maple_tree.c
@@ -1863,11 +1863,11 @@ static inline int mab_no_null_split(stru
* Return: The first split location. The middle split is set in @mid_split.
*/
static inline int mab_calc_split(struct ma_state *mas,
- struct maple_big_node *bn, unsigned char *mid_split, unsigned long min)
+ struct maple_big_node *bn, unsigned char *mid_split)
{
unsigned char b_end = bn->b_end;
int split = b_end / 2; /* Assume equal split. */
- unsigned char slot_min, slot_count = mt_slots[bn->type];
+ unsigned char slot_count = mt_slots[bn->type];
/*
* To support gap tracking, all NULL entries are kept together and a node cannot
@@ -1900,18 +1900,7 @@ static inline int mab_calc_split(struct
split = b_end / 3;
*mid_split = split * 2;
} else {
- slot_min = mt_min_slots[bn->type];
-
*mid_split = 0;
- /*
- * Avoid having a range less than the slot count unless it
- * causes one node to be deficient.
- * NOTE: mt_min_slots is 1 based, b_end and split are zero.
- */
- while ((split < slot_count - 1) &&
- ((bn->pivot[split] - min) < slot_count - 1) &&
- (b_end - split > slot_min))
- split++;
}
/* Avoid ending a node on a NULL entry */
@@ -2377,7 +2366,7 @@ static inline struct maple_enode
static inline unsigned char mas_mab_to_node(struct ma_state *mas,
struct maple_big_node *b_node, struct maple_enode **left,
struct maple_enode **right, struct maple_enode **middle,
- unsigned char *mid_split, unsigned long min)
+ unsigned char *mid_split)
{
unsigned char split = 0;
unsigned char slot_count = mt_slots[b_node->type];
@@ -2390,7 +2379,7 @@ static inline unsigned char mas_mab_to_n
if (b_node->b_end < slot_count) {
split = b_node->b_end;
} else {
- split = mab_calc_split(mas, b_node, mid_split, min);
+ split = mab_calc_split(mas, b_node, mid_split);
*right = mas_new_ma_node(mas, b_node);
}
@@ -2877,7 +2866,7 @@ static void mas_spanning_rebalance(struc
mast->bn->b_end--;
mast->bn->type = mte_node_type(mast->orig_l->node);
split = mas_mab_to_node(mas, mast->bn, &left, &right, &middle,
- &mid_split, mast->orig_l->min);
+ &mid_split);
mast_set_split_parents(mast, left, middle, right, split,
mid_split);
mast_cp_to_nodes(mast, left, middle, right, split, mid_split);
@@ -3365,7 +3354,7 @@ static void mas_split(struct ma_state *m
if (mas_push_data(mas, height, &mast, false))
break;
- split = mab_calc_split(mas, b_node, &mid_split, prev_l_mas.min);
+ split = mab_calc_split(mas, b_node, &mid_split);
mast_split_data(&mast, mas, split);
/*
* Usually correct, mab_mas_cp in the above call overwrites
_
Patches currently in -mm which might be from richard.weiyang(a)gmail.com are
maple_tree-use-mas_next_slot-directly.patch
maple_tree-index-has-been-checked-to-be-smaller-than-pivot.patch
maple_tree-not-possible-to-be-a-root-node-after-loop.patch
maple_tree-we-dont-set-offset-to-maple_node_slots-on-error.patch
maple_tree-simplify-split-calculation.patch
maple_tree-add-a-test-check-deficient-node.patch
maple_tree-only-root-node-could-be-deficient.patch
The patch titled
Subject: sched/numa: fix memory leak due to the overwritten vma->numab_state
has been added to the -mm mm-hotfixes-unstable branch. Its filename is
sched-numa-fix-memory-leak-due-to-the-overwritten-vma-numab_state.patch
This patch will shortly appear at
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patche…
This patch will later appear in the mm-hotfixes-unstable branch at
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next via the mm-everything
branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there every 2-3 working days
------------------------------------------------------
From: Adrian Huang <ahuang12(a)lenovo.com>
Subject: sched/numa: fix memory leak due to the overwritten vma->numab_state
Date: Wed, 13 Nov 2024 18:21:46 +0800
[Problem Description]
When running the hackbench program of LTP, the following memory leak is
reported by kmemleak.
# /opt/ltp/testcases/bin/hackbench 20 thread 1000
Running with 20*40 (== 800) tasks.
# dmesg | grep kmemleak
...
kmemleak: 480 new suspected memory leaks (see /sys/kernel/debug/kmemleak)
kmemleak: 665 new suspected memory leaks (see /sys/kernel/debug/kmemleak)
# cat /sys/kernel/debug/kmemleak
unreferenced object 0xffff888cd8ca2c40 (size 64):
comm "hackbench", pid 17142, jiffies 4299780315
hex dump (first 32 bytes):
ac 74 49 00 01 00 00 00 4c 84 49 00 01 00 00 00 .tI.....L.I.....
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
backtrace (crc bff18fd4):
[<ffffffff81419a89>] __kmalloc_cache_noprof+0x2f9/0x3f0
[<ffffffff8113f715>] task_numa_work+0x725/0xa00
[<ffffffff8110f878>] task_work_run+0x58/0x90
[<ffffffff81ddd9f8>] syscall_exit_to_user_mode+0x1c8/0x1e0
[<ffffffff81dd78d5>] do_syscall_64+0x85/0x150
[<ffffffff81e0012b>] entry_SYSCALL_64_after_hwframe+0x76/0x7e
...
This issue can be consistently reproduced on three different servers:
* a 448-core server
* a 256-core server
* a 192-core server
[Root Cause]
Since multiple threads are created by the hackbench program (along with
the command argument 'thread'), a shared vma might be accessed by two or
more cores simultaneously. When two or more cores observe that
vma->numab_state is NULL at the same time, vma->numab_state will be
overwritten.
Although current code ensures that only one thread scans the VMAs in a
single 'numa_scan_period', there might be a chance for another thread
to enter in the next 'numa_scan_period' while we have not gotten till
numab_state allocation [1].
Note that the command `/opt/ltp/testcases/bin/hackbench 50 process 1000`
cannot the reproduce the issue. It is verified with 200+ test runs.
[Solution]
Use the cmpxchg atomic operation to ensure that only one thread executes
the vma->numab_state assignment.
[1] https://lore.kernel.org/lkml/1794be3c-358c-4cdc-a43d-a1f841d91ef7@amd.com/
Link: https://lkml.kernel.org/r/20241113102146.2384-1-ahuang12@lenovo.com
Fixes: ef6a22b70f6d ("sched/numa: apply the scan delay to every new vma")
Signed-off-by: Adrian Huang <ahuang12(a)lenovo.com>
Reported-by: Jiwei Sun <sunjw10(a)lenovo.com>
Reviewed-by: Raghavendra K T <raghavendra.kt(a)amd.com>
Reviewed-by: Vlastimil Babka <vbabka(a)suse.cz>
Cc: Ben Segall <bsegall(a)google.com>
Cc: Dietmar Eggemann <dietmar.eggemann(a)arm.com>
Cc: Ingo Molnar <mingo(a)redhat.com>
Cc: Juri Lelli <juri.lelli(a)redhat.com>
Cc: Mel Gorman <mgorman(a)suse.de>
Cc: Peter Zijlstra <peterz(a)infradead.org>
Cc: Steven Rostedt <rostedt(a)goodmis.org>
Cc: Valentin Schneider <vschneid(a)redhat.com>
Cc: Vincent Guittot <vincent.guittot(a)linaro.org>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
kernel/sched/fair.c | 12 +++++++++---
1 file changed, 9 insertions(+), 3 deletions(-)
--- a/kernel/sched/fair.c~sched-numa-fix-memory-leak-due-to-the-overwritten-vma-numab_state
+++ a/kernel/sched/fair.c
@@ -3399,10 +3399,16 @@ retry_pids:
/* Initialise new per-VMA NUMAB state. */
if (!vma->numab_state) {
- vma->numab_state = kzalloc(sizeof(struct vma_numab_state),
- GFP_KERNEL);
- if (!vma->numab_state)
+ struct vma_numab_state *ptr;
+
+ ptr = kzalloc(sizeof(*ptr), GFP_KERNEL);
+ if (!ptr)
+ continue;
+
+ if (cmpxchg(&vma->numab_state, NULL, ptr)) {
+ kfree(ptr);
continue;
+ }
vma->numab_state->start_scan_seq = mm->numa_scan_seq;
_
Patches currently in -mm which might be from ahuang12(a)lenovo.com are
sched-numa-fix-memory-leak-due-to-the-overwritten-vma-numab_state.patch
The patch titled
Subject: mm/damon: fix order of arguments in damos_before_apply tracepoint
has been added to the -mm mm-hotfixes-unstable branch. Its filename is
mm-damon-fix-order-of-arguments-in-damos_before_apply-tracepoint.patch
This patch will shortly appear at
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patche…
This patch will later appear in the mm-hotfixes-unstable branch at
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next via the mm-everything
branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there every 2-3 working days
------------------------------------------------------
From: Akinobu Mita <akinobu.mita(a)gmail.com>
Subject: mm/damon: fix order of arguments in damos_before_apply tracepoint
Date: Fri, 15 Nov 2024 10:20:23 -0800
Since the order of the scheme_idx and target_idx arguments in TP_ARGS is
reversed, they are stored in the trace record in reverse.
Link: https://lkml.kernel.org/r/20241115182023.43118-1-sj@kernel.org
Link: https://patch.msgid.link/20241112154828.40307-1-akinobu.mita@gmail.com
Fixes: c603c630b509 ("mm/damon/core: add a tracepoint for damos apply target regions")
Signed-off-by: Akinobu Mita <akinobu.mita(a)gmail.com>
Signed-off-by: SeongJae Park <sj(a)kernel.org>
Cc: Masami Hiramatsu <mhiramat(a)kernel.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers(a)efficios.com>
Cc: Steven Rostedt <rostedt(a)goodmis.org>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
include/trace/events/damon.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
--- a/include/trace/events/damon.h~mm-damon-fix-order-of-arguments-in-damos_before_apply-tracepoint
+++ a/include/trace/events/damon.h
@@ -15,7 +15,7 @@ TRACE_EVENT_CONDITION(damos_before_apply
unsigned int target_idx, struct damon_region *r,
unsigned int nr_regions, bool do_trace),
- TP_ARGS(context_idx, target_idx, scheme_idx, r, nr_regions, do_trace),
+ TP_ARGS(context_idx, scheme_idx, target_idx, r, nr_regions, do_trace),
TP_CONDITION(do_trace),
_
Patches currently in -mm which might be from akinobu.mita(a)gmail.com are
mm-damon-fix-order-of-arguments-in-damos_before_apply-tracepoint.patch
The patch titled
Subject: lib: stackinit: hide never-taken branch from compiler
has been added to the -mm mm-hotfixes-unstable branch. Its filename is
lib-stackinit-hide-never-taken-branch-from-compiler.patch
This patch will shortly appear at
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patche…
This patch will later appear in the mm-hotfixes-unstable branch at
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next via the mm-everything
branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there every 2-3 working days
------------------------------------------------------
From: Kees Cook <kees(a)kernel.org>
Subject: lib: stackinit: hide never-taken branch from compiler
Date: Sun, 17 Nov 2024 03:38:13 -0800
The never-taken branch leads to an invalid bounds condition, which is by
design. To avoid the unwanted warning from the compiler, hide the
variable from the optimizer.
../lib/stackinit_kunit.c: In function 'do_nothing_u16_zero':
../lib/stackinit_kunit.c:51:49: error: array subscript 1 is outside array bounds of 'u16[0]' {aka 'short unsigned int[]'} [-Werror=array-bounds=]
51 | #define DO_NOTHING_RETURN_SCALAR(ptr) *(ptr)
| ^~~~~~
../lib/stackinit_kunit.c:219:24: note: in expansion of macro 'DO_NOTHING_RETURN_SCALAR'
219 | return DO_NOTHING_RETURN_ ## which(ptr + 1); \
| ^~~~~~~~~~~~~~~~~~
Link: https://lkml.kernel.org/r/20241117113813.work.735-kees@kernel.org
Signed-off-by: Kees Cook <kees(a)kernel.org>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
lib/stackinit_kunit.c | 1 +
1 file changed, 1 insertion(+)
--- a/lib/stackinit_kunit.c~lib-stackinit-hide-never-taken-branch-from-compiler
+++ a/lib/stackinit_kunit.c
@@ -212,6 +212,7 @@ static noinline void test_ ## name (stru
static noinline DO_NOTHING_TYPE_ ## which(var_type) \
do_nothing_ ## name(var_type *ptr) \
{ \
+ OPTIMIZER_HIDE_VAR(ptr); \
/* Will always be true, but compiler doesn't know. */ \
if ((unsigned long)ptr > 0x2) \
return DO_NOTHING_RETURN_ ## which(ptr); \
_
Patches currently in -mm which might be from kees(a)kernel.org are
lib-stackinit-hide-never-taken-branch-from-compiler.patch
The patch titled
Subject: alloc_tag: fix set_codetag_empty() when !CONFIG_MEM_ALLOC_PROFILING_DEBUG
has been added to the -mm mm-hotfixes-unstable branch. Its filename is
alloc_tag-fix-set_codetag_empty-when-config_mem_alloc_profiling_debug.patch
This patch will shortly appear at
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patche…
This patch will later appear in the mm-hotfixes-unstable branch at
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next via the mm-everything
branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there every 2-3 working days
------------------------------------------------------
From: Suren Baghdasaryan <surenb(a)google.com>
Subject: alloc_tag: fix set_codetag_empty() when !CONFIG_MEM_ALLOC_PROFILING_DEBUG
Date: Fri, 29 Nov 2024 16:14:23 -0800
It was recently noticed that set_codetag_empty() might be used not only to
mark NULL alloctag references as empty to avoid warnings but also to reset
valid tags (in clear_page_tag_ref()). Since set_codetag_empty() is
defined as NOOP for CONFIG_MEM_ALLOC_PROFILING_DEBUG=n, such use of
set_codetag_empty() leads to subtle bugs. Fix set_codetag_empty() for
CONFIG_MEM_ALLOC_PROFILING_DEBUG=n to reset the tag reference.
Link: https://lkml.kernel.org/r/20241130001423.1114965-2-surenb@google.com
Fixes: a8fc28dad6d5 ("alloc_tag: introduce clear_page_tag_ref() helper function")
Signed-off-by: Suren Baghdasaryan <surenb(a)google.com>
Reported-by: David Wang <00107082(a)163.com>
Closes: https://lore.kernel.org/lkml/20241124074318.399027-1-00107082@163.com/
Cc: David Wang <00107082(a)163.com>
Cc: Kent Overstreet <kent.overstreet(a)linux.dev>
Cc: Mike Rapoport (Microsoft) <rppt(a)kernel.org>
Cc: Pasha Tatashin <pasha.tatashin(a)soleen.com>
Cc: Sourav Panda <souravpanda(a)google.com>
Cc: Yu Zhao <yuzhao(a)google.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
include/linux/alloc_tag.h | 7 ++++++-
1 file changed, 6 insertions(+), 1 deletion(-)
--- a/include/linux/alloc_tag.h~alloc_tag-fix-set_codetag_empty-when-config_mem_alloc_profiling_debug
+++ a/include/linux/alloc_tag.h
@@ -63,7 +63,12 @@ static inline void set_codetag_empty(uni
#else /* CONFIG_MEM_ALLOC_PROFILING_DEBUG */
static inline bool is_codetag_empty(union codetag_ref *ref) { return false; }
-static inline void set_codetag_empty(union codetag_ref *ref) {}
+
+static inline void set_codetag_empty(union codetag_ref *ref)
+{
+ if (ref)
+ ref->ct = NULL;
+}
#endif /* CONFIG_MEM_ALLOC_PROFILING_DEBUG */
_
Patches currently in -mm which might be from surenb(a)google.com are
alloc_tag-fix-module-allocation-tags-populated-area-calculation.patch
alloc_tag-fix-set_codetag_empty-when-config_mem_alloc_profiling_debug.patch
mm-convert-mm_lock_seq-to-a-proper-seqcount.patch
mm-introduce-mmap_lock_speculation_beginend.patch
The patch titled
Subject: alloc_tag: fix module allocation tags populated area calculation
has been added to the -mm mm-hotfixes-unstable branch. Its filename is
alloc_tag-fix-module-allocation-tags-populated-area-calculation.patch
This patch will shortly appear at
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patche…
This patch will later appear in the mm-hotfixes-unstable branch at
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next via the mm-everything
branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there every 2-3 working days
------------------------------------------------------
From: Suren Baghdasaryan <surenb(a)google.com>
Subject: alloc_tag: fix module allocation tags populated area calculation
Date: Fri, 29 Nov 2024 16:14:22 -0800
vm_module_tags_populate() calculation of the populated area assumes that
area starts at a page boundary and therefore when new pages are allocation,
the end of the area is page-aligned as well. If the start of the area is
not page-aligned then allocating a page and incrementing the end of the
area by PAGE_SIZE leads to an area at the end but within the area boundary
which is not populated. Accessing this are will lead to a kernel panic.
Fix the calculation by down-aligning the start of the area and using that
as the location allocated pages are mapped to.
Link: https://lkml.kernel.org/r/20241130001423.1114965-1-surenb@google.com
Fixes: 0f9b685626da ("alloc_tag: populate memory for module tags as needed")
Signed-off-by: Suren Baghdasaryan <surenb(a)google.com>
Reported-by: kernel test robot <oliver.sang(a)intel.com>
Closes: https://lore.kernel.org/oe-lkp/202411132111.6a221562-lkp@intel.com
Cc: David Wang <00107082(a)163.com>
Cc: Kent Overstreet <kent.overstreet(a)linux.dev>
Cc: Mike Rapoport (Microsoft) <rppt(a)kernel.org>
Cc: Pasha Tatashin <pasha.tatashin(a)soleen.com>
Cc: Sourav Panda <souravpanda(a)google.com>
Cc: Yu Zhao <yuzhao(a)google.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
lib/alloc_tag.c | 11 ++++++-----
1 file changed, 6 insertions(+), 5 deletions(-)
--- a/lib/alloc_tag.c~alloc_tag-fix-module-allocation-tags-populated-area-calculation
+++ a/lib/alloc_tag.c
@@ -401,19 +401,20 @@ repeat:
static int vm_module_tags_populate(void)
{
- unsigned long phys_size = vm_module_tags->nr_pages << PAGE_SHIFT;
+ unsigned long phys_end = ALIGN_DOWN(module_tags.start_addr, PAGE_SIZE) +
+ (vm_module_tags->nr_pages << PAGE_SHIFT);
+ unsigned long new_end = module_tags.start_addr + module_tags.size;
- if (phys_size < module_tags.size) {
+ if (phys_end < new_end) {
struct page **next_page = vm_module_tags->pages + vm_module_tags->nr_pages;
- unsigned long addr = module_tags.start_addr + phys_size;
unsigned long more_pages;
unsigned long nr;
- more_pages = ALIGN(module_tags.size - phys_size, PAGE_SIZE) >> PAGE_SHIFT;
+ more_pages = ALIGN(new_end - phys_end, PAGE_SIZE) >> PAGE_SHIFT;
nr = alloc_pages_bulk_array_node(GFP_KERNEL | __GFP_NOWARN,
NUMA_NO_NODE, more_pages, next_page);
if (nr < more_pages ||
- vmap_pages_range(addr, addr + (nr << PAGE_SHIFT), PAGE_KERNEL,
+ vmap_pages_range(phys_end, phys_end + (nr << PAGE_SHIFT), PAGE_KERNEL,
next_page, PAGE_SHIFT) < 0) {
/* Clean up and error out */
for (int i = 0; i < nr; i++)
_
Patches currently in -mm which might be from surenb(a)google.com are
alloc_tag-fix-module-allocation-tags-populated-area-calculation.patch
alloc_tag-fix-set_codetag_empty-when-config_mem_alloc_profiling_debug.patch
mm-convert-mm_lock_seq-to-a-proper-seqcount.patch
mm-introduce-mmap_lock_speculation_beginend.patch
Hi
Hope you are doing well.
Did you get a chance to see my previous email?? If you are
interested, Please reply so that I can provide details in
accordance.
Best regards
Jim Bertles
From: Dmitry Antipov <dmantipov(a)yandex.ru>
[ Upstream commit 1bfc466b13cf6652ba227c282c27a30ffede69a5 ]
When compiling with gcc version 14.0.0 20231220 (experimental)
and W=1, I've noticed the following warning:
kernel/watch_queue.c: In function 'watch_queue_set_size':
kernel/watch_queue.c:273:32: warning: 'kcalloc' sizes specified with 'sizeof'
in the earlier argument and not in the later argument [-Wcalloc-transposed-args]
273 | pages = kcalloc(sizeof(struct page *), nr_pages, GFP_KERNEL);
| ^~~~~~
Since 'n' and 'size' arguments of 'kcalloc()' are multiplied to
calculate the final size, their actual order doesn't affect the
result and so this is not a bug. But it's still worth to fix it.
Signed-off-by: Dmitry Antipov <dmantipov(a)yandex.ru>
Link: https://lore.kernel.org/r/20231221090139.12579-1-dmantipov@yandex.ru
Signed-off-by: Christian Brauner <brauner(a)kernel.org>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
kernel/watch_queue.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/kernel/watch_queue.c b/kernel/watch_queue.c
index ae31bf8d2feb..bf86e1d71cd3 100644
--- a/kernel/watch_queue.c
+++ b/kernel/watch_queue.c
@@ -275,7 +275,7 @@ long watch_queue_set_size(struct pipe_inode_info *pipe, unsigned int nr_notes)
goto error;
ret = -ENOMEM;
- pages = kcalloc(sizeof(struct page *), nr_pages, GFP_KERNEL);
+ pages = kcalloc(nr_pages, sizeof(struct page *), GFP_KERNEL);
if (!pages)
goto error;
--
2.43.0
This patchset fixes two bugs with the async controls for the uvc driver.
They were found while implementing the granular PM, but I am sending
them as a separate patches, so they can be reviewed sooner. They fix
real issues in the driver that need to be taken care.
Signed-off-by: Ricardo Ribalda <ribalda(a)chromium.org>
---
Changes in v3:
- change again! order of patches.
- Introduce uvc_ctrl_set_handle.
- Do not change ctrl->handle if it is not NULL.
Changes in v2:
- Annotate lockdep
- ctrl->handle != handle
- Change order of patches
- Move documentation of mutex
- Link to v1: https://lore.kernel.org/r/20241127-uvc-fix-async-v1-0-eb8722531b8c@chromium…
---
Ricardo Ribalda (4):
media: uvcvideo: Do not replace the handler of an async ctrl
media: uvcvideo: Remove dangling pointers
media: uvcvideo: Annotate lock requirements for uvc_ctrl_set
media: uvcvideo: Remove redundant NULL assignment
drivers/media/usb/uvc/uvc_ctrl.c | 52 +++++++++++++++++++++++++++++++++++-----
drivers/media/usb/uvc/uvc_v4l2.c | 2 ++
drivers/media/usb/uvc/uvcvideo.h | 14 +++++++++--
3 files changed, 60 insertions(+), 8 deletions(-)
---
base-commit: 72ad4ff638047bbbdf3232178fea4bec1f429319
change-id: 20241127-uvc-fix-async-2c9d40413ad8
Best regards,
--
Ricardo Ribalda <ribalda(a)chromium.org>
From: Boris Burkov <boris(a)bur.io>
commit 74e97958121aa1f5854da6effba70143f051b0cd upstream.
Create subvolume, create snapshot and delete subvolume all use
btrfs_subvolume_reserve_metadata() to reserve metadata for the changes
done to the parent subvolume's fs tree, which cannot be mediated in the
normal way via start_transaction. When quota groups (squota or qgroups)
are enabled, this reserves qgroup metadata of type PREALLOC. Once the
operation is associated to a transaction, we convert PREALLOC to
PERTRANS, which gets cleared in bulk at the end of the transaction.
However, the error paths of these three operations were not implementing
this lifecycle correctly. They unconditionally converted the PREALLOC to
PERTRANS in a generic cleanup step regardless of errors or whether the
operation was fully associated to a transaction or not. This resulted in
error paths occasionally converting this rsv to PERTRANS without calling
record_root_in_trans successfully, which meant that unless that root got
recorded in the transaction by some other thread, the end of the
transaction would not free that root's PERTRANS, leaking it. Ultimately,
this resulted in hitting a WARN in CONFIG_BTRFS_DEBUG builds at unmount
for the leaked reservation.
The fix is to ensure that every qgroup PREALLOC reservation observes the
following properties:
1. any failure before record_root_in_trans is called successfully
results in freeing the PREALLOC reservation.
2. after record_root_in_trans, we convert to PERTRANS, and now the
transaction owns freeing the reservation.
This patch enforces those properties on the three operations. Without
it, generic/269 with squotas enabled at mkfs time would fail in ~5-10
runs on my system. With this patch, it ran successfully 1000 times in a
row.
Fixes: e85fde5162bf ("btrfs: qgroup: fix qgroup meta rsv leak for subvolume operations")
CC: stable(a)vger.kernel.org # 6.1+
Reviewed-by: Qu Wenruo <wqu(a)suse.com>
Signed-off-by: Boris Burkov <boris(a)bur.io>
Signed-off-by: David Sterba <dsterba(a)suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
[Xiangyu: BP to fix CVE-2024-35956, due to 6.1 btrfs_subvolume_release_metadata()
defined in ctree.h, modified the header file name from root-tree.h to ctree.h]
Signed-off-by: Xiangyu Chen <xiangyu.chen(a)windriver.com>
---
fs/btrfs/ctree.h | 2 --
fs/btrfs/inode.c | 13 ++++++++++++-
fs/btrfs/ioctl.c | 36 ++++++++++++++++++++++++++++--------
fs/btrfs/root-tree.c | 10 ----------
4 files changed, 40 insertions(+), 21 deletions(-)
diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h
index cca1acf2e037..cab023927b43 100644
--- a/fs/btrfs/ctree.h
+++ b/fs/btrfs/ctree.h
@@ -2987,8 +2987,6 @@ enum btrfs_flush_state {
int btrfs_subvolume_reserve_metadata(struct btrfs_root *root,
struct btrfs_block_rsv *rsv,
int nitems, bool use_global_rsv);
-void btrfs_subvolume_release_metadata(struct btrfs_root *root,
- struct btrfs_block_rsv *rsv);
void btrfs_delalloc_release_extents(struct btrfs_inode *inode, u64 num_bytes);
int btrfs_delalloc_reserve_metadata(struct btrfs_inode *inode, u64 num_bytes,
diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index a79da940f5b2..8fc8a24a1afe 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -4707,6 +4707,7 @@ int btrfs_delete_subvolume(struct inode *dir, struct dentry *dentry)
struct btrfs_trans_handle *trans;
struct btrfs_block_rsv block_rsv;
u64 root_flags;
+ u64 qgroup_reserved = 0;
int ret;
down_write(&fs_info->subvol_sem);
@@ -4751,12 +4752,20 @@ int btrfs_delete_subvolume(struct inode *dir, struct dentry *dentry)
ret = btrfs_subvolume_reserve_metadata(root, &block_rsv, 5, true);
if (ret)
goto out_undead;
+ qgroup_reserved = block_rsv.qgroup_rsv_reserved;
trans = btrfs_start_transaction(root, 0);
if (IS_ERR(trans)) {
ret = PTR_ERR(trans);
goto out_release;
}
+ ret = btrfs_record_root_in_trans(trans, root);
+ if (ret) {
+ btrfs_abort_transaction(trans, ret);
+ goto out_end_trans;
+ }
+ btrfs_qgroup_convert_reserved_meta(root, qgroup_reserved);
+ qgroup_reserved = 0;
trans->block_rsv = &block_rsv;
trans->bytes_reserved = block_rsv.size;
@@ -4815,7 +4824,9 @@ int btrfs_delete_subvolume(struct inode *dir, struct dentry *dentry)
ret = btrfs_end_transaction(trans);
inode->i_flags |= S_DEAD;
out_release:
- btrfs_subvolume_release_metadata(root, &block_rsv);
+ btrfs_block_rsv_release(fs_info, &block_rsv, (u64)-1, NULL);
+ if (qgroup_reserved)
+ btrfs_qgroup_free_meta_prealloc(root, qgroup_reserved);
out_undead:
if (ret) {
spin_lock(&dest->root_item_lock);
diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c
index 31f7fe31b607..a30379936af5 100644
--- a/fs/btrfs/ioctl.c
+++ b/fs/btrfs/ioctl.c
@@ -592,6 +592,7 @@ static noinline int create_subvol(struct user_namespace *mnt_userns,
int ret;
dev_t anon_dev;
u64 objectid;
+ u64 qgroup_reserved = 0;
root_item = kzalloc(sizeof(*root_item), GFP_KERNEL);
if (!root_item)
@@ -629,13 +630,18 @@ static noinline int create_subvol(struct user_namespace *mnt_userns,
trans_num_items, false);
if (ret)
goto out_new_inode_args;
+ qgroup_reserved = block_rsv.qgroup_rsv_reserved;
trans = btrfs_start_transaction(root, 0);
if (IS_ERR(trans)) {
ret = PTR_ERR(trans);
- btrfs_subvolume_release_metadata(root, &block_rsv);
- goto out_new_inode_args;
+ goto out_release_rsv;
}
+ ret = btrfs_record_root_in_trans(trans, BTRFS_I(dir)->root);
+ if (ret)
+ goto out;
+ btrfs_qgroup_convert_reserved_meta(root, qgroup_reserved);
+ qgroup_reserved = 0;
trans->block_rsv = &block_rsv;
trans->bytes_reserved = block_rsv.size;
@@ -744,12 +750,15 @@ static noinline int create_subvol(struct user_namespace *mnt_userns,
out:
trans->block_rsv = NULL;
trans->bytes_reserved = 0;
- btrfs_subvolume_release_metadata(root, &block_rsv);
if (ret)
btrfs_end_transaction(trans);
else
ret = btrfs_commit_transaction(trans);
+out_release_rsv:
+ btrfs_block_rsv_release(fs_info, &block_rsv, (u64)-1, NULL);
+ if (qgroup_reserved)
+ btrfs_qgroup_free_meta_prealloc(root, qgroup_reserved);
out_new_inode_args:
btrfs_new_inode_args_destroy(&new_inode_args);
out_inode:
@@ -771,6 +780,8 @@ static int create_snapshot(struct btrfs_root *root, struct inode *dir,
struct btrfs_pending_snapshot *pending_snapshot;
unsigned int trans_num_items;
struct btrfs_trans_handle *trans;
+ struct btrfs_block_rsv *block_rsv;
+ u64 qgroup_reserved = 0;
int ret;
/* We do not support snapshotting right now. */
@@ -807,19 +818,19 @@ static int create_snapshot(struct btrfs_root *root, struct inode *dir,
goto free_pending;
}
- btrfs_init_block_rsv(&pending_snapshot->block_rsv,
- BTRFS_BLOCK_RSV_TEMP);
+ block_rsv = &pending_snapshot->block_rsv;
+ btrfs_init_block_rsv(block_rsv, BTRFS_BLOCK_RSV_TEMP);
/*
* 1 to add dir item
* 1 to add dir index
* 1 to update parent inode item
*/
trans_num_items = create_subvol_num_items(inherit) + 3;
- ret = btrfs_subvolume_reserve_metadata(BTRFS_I(dir)->root,
- &pending_snapshot->block_rsv,
+ ret = btrfs_subvolume_reserve_metadata(BTRFS_I(dir)->root, block_rsv,
trans_num_items, false);
if (ret)
goto free_pending;
+ qgroup_reserved = block_rsv->qgroup_rsv_reserved;
pending_snapshot->dentry = dentry;
pending_snapshot->root = root;
@@ -832,6 +843,13 @@ static int create_snapshot(struct btrfs_root *root, struct inode *dir,
ret = PTR_ERR(trans);
goto fail;
}
+ ret = btrfs_record_root_in_trans(trans, BTRFS_I(dir)->root);
+ if (ret) {
+ btrfs_end_transaction(trans);
+ goto fail;
+ }
+ btrfs_qgroup_convert_reserved_meta(root, qgroup_reserved);
+ qgroup_reserved = 0;
trans->pending_snapshot = pending_snapshot;
@@ -861,7 +879,9 @@ static int create_snapshot(struct btrfs_root *root, struct inode *dir,
if (ret && pending_snapshot->snap)
pending_snapshot->snap->anon_dev = 0;
btrfs_put_root(pending_snapshot->snap);
- btrfs_subvolume_release_metadata(root, &pending_snapshot->block_rsv);
+ btrfs_block_rsv_release(fs_info, block_rsv, (u64)-1, NULL);
+ if (qgroup_reserved)
+ btrfs_qgroup_free_meta_prealloc(root, qgroup_reserved);
free_pending:
if (pending_snapshot->anon_dev)
free_anon_bdev(pending_snapshot->anon_dev);
diff --git a/fs/btrfs/root-tree.c b/fs/btrfs/root-tree.c
index 7d783f094306..37780ede89ba 100644
--- a/fs/btrfs/root-tree.c
+++ b/fs/btrfs/root-tree.c
@@ -532,13 +532,3 @@ int btrfs_subvolume_reserve_metadata(struct btrfs_root *root,
}
return ret;
}
-
-void btrfs_subvolume_release_metadata(struct btrfs_root *root,
- struct btrfs_block_rsv *rsv)
-{
- struct btrfs_fs_info *fs_info = root->fs_info;
- u64 qgroup_to_release;
-
- btrfs_block_rsv_release(fs_info, rsv, (u64)-1, &qgroup_to_release);
- btrfs_qgroup_convert_reserved_meta(root, qgroup_to_release);
-}
--
2.25.1
Kexec bypasses EFI's switch to virtual mode. In exchange, it has its own
routine, kexec_enter_virtual_mode(), which replays the mappings made by
the original kernel. Unfortunately, that function fails to reinstate
EFI's memory attributes, which would've otherwise been set after
entering virtual mode. Remediate this by calling
efi_runtime_update_mappings() within kexec's routine.
Cc: stable(a)vger.kernel.org
Fixes: 18141e89a76c ("x86/efi: Add support for EFI_MEMORY_ATTRIBUTES_TABLE")
Signed-off-by: Nicolas Saenz Julienne <nsaenz(a)amazon.com>
---
Notes:
- Tested with QEMU/OVMF.
arch/x86/platform/efi/efi.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/arch/x86/platform/efi/efi.c b/arch/x86/platform/efi/efi.c
index 375ebd78296a..a7ff189421c3 100644
--- a/arch/x86/platform/efi/efi.c
+++ b/arch/x86/platform/efi/efi.c
@@ -765,6 +765,7 @@ static void __init kexec_enter_virtual_mode(void)
efi_sync_low_kernel_mappings();
efi_native_runtime_setup();
+ efi_runtime_update_mappings();
#endif
}
--
2.40.1
From: Christian Brauner <brauner(a)kernel.org>
commit 7af2ae1b1531feab5d38ec9c8f472dc6cceb4606 upstream.
When erofs_kill_sb() is called in block dev based mode, s_bdev may not
have been initialised yet, and if CONFIG_EROFS_FS_ONDEMAND is enabled,
it will be mistaken for fscache mode, and then attempt to free an anon_dev
that has never been allocated, triggering the following warning:
============================================
ida_free called for id=0 which is not allocated.
WARNING: CPU: 14 PID: 926 at lib/idr.c:525 ida_free+0x134/0x140
Modules linked in:
CPU: 14 PID: 926 Comm: mount Not tainted 6.9.0-rc3-dirty #630
RIP: 0010:ida_free+0x134/0x140
Call Trace:
<TASK>
erofs_kill_sb+0x81/0x90
deactivate_locked_super+0x35/0x80
get_tree_bdev+0x136/0x1e0
vfs_get_tree+0x2c/0xf0
do_new_mount+0x190/0x2f0
[...]
============================================
Now when erofs_kill_sb() is called, erofs_sb_info must have been
initialised, so use sbi->fsid to distinguish between the two modes.
Signed-off-by: Christian Brauner <brauner(a)kernel.org>
Signed-off-by: Baokun Li <libaokun1(a)huawei.com>
Reviewed-by: Jingbo Xu <jefflexu(a)linux.alibaba.com>
Reviewed-by: Gao Xiang <hsiangkao(a)linux.alibaba.com>
Reviewed-by: Chao Yu <chao(a)kernel.org>
Link: https://lore.kernel.org/r/20240419123611.947084-3-libaokun1@huawei.com
Signed-off-by: Gao Xiang <hsiangkao(a)linux.alibaba.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Signed-off-by: Xiangyu Chen <xiangyu.chen(a)windriver.com>
---
fs/erofs/super.c | 8 ++------
1 file changed, 2 insertions(+), 6 deletions(-)
diff --git a/fs/erofs/super.c b/fs/erofs/super.c
index 25cd66e487e8..5bb194558da5 100644
--- a/fs/erofs/super.c
+++ b/fs/erofs/super.c
@@ -892,7 +892,7 @@ static int erofs_init_fs_context(struct fs_context *fc)
*/
static void erofs_kill_sb(struct super_block *sb)
{
- struct erofs_sb_info *sbi;
+ struct erofs_sb_info *sbi = EROFS_SB(sb);
WARN_ON(sb->s_magic != EROFS_SUPER_MAGIC);
@@ -902,15 +902,11 @@ static void erofs_kill_sb(struct super_block *sb)
return;
}
- if (erofs_is_fscache_mode(sb))
+ if (IS_ENABLED(CONFIG_EROFS_FS_ONDEMAND) && sbi->fsid)
kill_anon_super(sb);
else
kill_block_super(sb);
- sbi = EROFS_SB(sb);
- if (!sbi)
- return;
-
erofs_free_dev_context(sbi->devs);
fs_put_dax(sbi->dax_dev, NULL);
erofs_fscache_unregister_fs(sb);
--
2.25.1
From: Ville Syrjälä <ville.syrjala(a)linux.intel.com>
Mesa changed its clear color alignment from 4k to 64 bytes
without informing the kernel side about the change. This
is now likely to cause framebuffer creation to fail.
The only thing we do with the clear color buffer in i915 is:
1. map a single page
2. read out bytes 16-23 from said page
3. unmap the page
So the only requirement we really have is that those 8 bytes
are all contained within one page. Thus we can deal with the
Mesa regression by reducing the alignment requiment from 4k
to the same 64 bytes in the kernel. We could even go as low as
32 bytes, but IIRC 64 bytes is the hardware requirement on
the 3D engine side so matching that seems sensible.
Cc: stable(a)vger.kernel.org
Cc: Sagar Ghuge <sagar.ghuge(a)intel.com>
Cc: Nanley Chery <nanley.g.chery(a)intel.com>
Reported-by: Xi Ruoyao <xry111(a)xry111.site>
Closes: https://gitlab.freedesktop.org/drm/i915/kernel/-/issues/13057
Closes: https://lore.kernel.org/all/45a5bba8de009347262d86a4acb27169d9ae0d9f.camel@…
Link: https://gitlab.freedesktop.org/mesa/mesa/-/commit/17f97a69c13832a6c1b0b3aad…
Signed-off-by: Ville Syrjälä <ville.syrjala(a)linux.intel.com>
---
drivers/gpu/drm/i915/display/intel_fb.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/i915/display/intel_fb.c b/drivers/gpu/drm/i915/display/intel_fb.c
index 6a7060889f40..223c4218c019 100644
--- a/drivers/gpu/drm/i915/display/intel_fb.c
+++ b/drivers/gpu/drm/i915/display/intel_fb.c
@@ -1694,7 +1694,7 @@ int intel_fill_fb_info(struct drm_i915_private *i915, struct intel_framebuffer *
* arithmetic related to alignment and offset calculation.
*/
if (is_gen12_ccs_cc_plane(&fb->base, i)) {
- if (IS_ALIGNED(fb->base.offsets[i], PAGE_SIZE))
+ if (IS_ALIGNED(fb->base.offsets[i], 64))
continue;
else
return -EINVAL;
--
2.45.2
From: Ville Syrjälä <ville.syrjala(a)linux.intel.com>
drm_mode_vrefresh() is trying to avoid divide by zero
by checking whether htotal or vtotal are zero. But we may
still end up with a div-by-zero of vtotal*htotal*...
Cc: stable(a)vger.kernel.org
Reported-by: syzbot+622bba18029bcde672e1(a)syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=622bba18029bcde672e1
Signed-off-by: Ville Syrjälä <ville.syrjala(a)linux.intel.com>
---
drivers/gpu/drm/drm_modes.c | 11 +++++++----
1 file changed, 7 insertions(+), 4 deletions(-)
diff --git a/drivers/gpu/drm/drm_modes.c b/drivers/gpu/drm/drm_modes.c
index 6ba167a33461..71573b85d924 100644
--- a/drivers/gpu/drm/drm_modes.c
+++ b/drivers/gpu/drm/drm_modes.c
@@ -1287,14 +1287,11 @@ EXPORT_SYMBOL(drm_mode_set_name);
*/
int drm_mode_vrefresh(const struct drm_display_mode *mode)
{
- unsigned int num, den;
+ unsigned int num = 1, den = 1;
if (mode->htotal == 0 || mode->vtotal == 0)
return 0;
- num = mode->clock;
- den = mode->htotal * mode->vtotal;
-
if (mode->flags & DRM_MODE_FLAG_INTERLACE)
num *= 2;
if (mode->flags & DRM_MODE_FLAG_DBLSCAN)
@@ -1302,6 +1299,12 @@ int drm_mode_vrefresh(const struct drm_display_mode *mode)
if (mode->vscan > 1)
den *= mode->vscan;
+ if (check_mul_overflow(mode->clock, num, &num))
+ return 0;
+
+ if (check_mul_overflow(mode->htotal * mode->vtotal, den, &den))
+ return 0;
+
return DIV_ROUND_CLOSEST_ULL(mul_u32_u32(num, 1000), den);
}
EXPORT_SYMBOL(drm_mode_vrefresh);
--
2.45.2
Changes in v6:
- Passes NULL to second parameter of devm_pm_domain_attach_list - Vlad
- Link to v5: https://lore.kernel.org/r/20241128-b4-linux-next-24-11-18-clock-multiple-po…
Changes in v5:
- In-lines devm_pm_domain_attach_list() in probe() directly - Vlad
- Link to v4: https://lore.kernel.org/r/20241127-b4-linux-next-24-11-18-clock-multiple-po…
v4:
- Adds Bjorn's RB to first patch - Bjorn
- Drops the 'd' in "and int" - Bjorn
- Amends commit log of patch 3 to capture a number of open questions -
Bjorn
- Link to v3: https://lore.kernel.org/r/20241126-b4-linux-next-24-11-18-clock-multiple-po…
v3:
- Fixes commit log "per which" - Bryan
- Link to v2: https://lore.kernel.org/r/20241125-b4-linux-next-24-11-18-clock-multiple-po…
v2:
The main change in this version is Bjorn's pointing out that pm_runtime_*
inside of the gdsc_enable/gdsc_disable path would be recursive and cause a
lockdep splat. Dmitry alluded to this too.
Bjorn pointed to stuff being done lower in the gdsc_register() routine that
might be a starting point.
I iterated around that idea and came up with patch #3. When a gdsc has no
parent and the pd_list is non-NULL then attach that orphan GDSC to the
clock controller power-domain list.
Existing subdomain code in gdsc_register() will connect the parent GDSCs in
the clock-controller to the clock-controller subdomain, the new code here
does that same job for a list of power-domains the clock controller depends
on.
To Dmitry's point about MMCX and MCX dependencies for the registers inside
of the clock controller, I have switched off all references in a test dtsi
and confirmed that accessing the clock-controller regs themselves isn't
required.
On the second point I also verified my test branch with lockdep on which
was a concern with the pm_domain version of this solution but I wanted to
cover it anyway with the new approach for completeness sake.
Here's the item-by-item list of changes:
- Adds a patch to capture pm_genpd_add_subdomain() result code - Bryan
- Changes changelog of second patch to remove singleton and generally
to make the commit log easier to understand - Bjorn
- Uses demv_pm_domain_attach_list - Vlad
- Changes error check to if (ret < 0 && ret != -EEXIST) - Vlad
- Retains passing &pd_data instead of NULL - because NULL doesn't do
the same thing - Bryan/Vlad
- Retains standalone function qcom_cc_pds_attach() because the pd_data
enumeration looks neater in a standalone function - Bryan/Vlad
- Drops pm_runtime in favour of gdsc_add_subdomain_list() for each
power-domain in the pd_list.
The pd_list will be whatever is pointed to by power-domains = <>
in the dtsi - Bjorn
- Link to v1: https://lore.kernel.org/r/20241118-b4-linux-next-24-11-18-clock-multiple-po…
v1:
On x1e80100 and it's SKUs the Camera Clock Controller - CAMCC has
multiple power-domains which power it. Usually with a single power-domain
the core platform code will automatically switch on the singleton
power-domain for you. If you have multiple power-domains for a device, in
this case the clock controller, you need to switch those power-domains
on/off yourself.
The clock controllers can also contain Global Distributed
Switch Controllers - GDSCs which themselves can be referenced from dtsi
nodes ultimately triggering a gdsc_en() in drivers/clk/qcom/gdsc.c.
As an example:
cci0: cci@ac4a000 {
power-domains = <&camcc TITAN_TOP_GDSC>;
};
This series adds the support to attach a power-domain list to the
clock-controllers and the GDSCs those controllers provide so that in the
case of the above example gdsc_toggle_logic() will trigger the power-domain
list with pm_runtime_resume_and_get() and pm_runtime_put_sync()
respectively.
Signed-off-by: Bryan O'Donoghue <bryan.odonoghue(a)linaro.org>
---
Bryan O'Donoghue (3):
clk: qcom: gdsc: Capture pm_genpd_add_subdomain result code
clk: qcom: common: Add support for power-domain attachment
clk: qcom: Support attaching GDSCs to multiple parents
drivers/clk/qcom/common.c | 6 ++++++
drivers/clk/qcom/gdsc.c | 41 +++++++++++++++++++++++++++++++++++++++--
drivers/clk/qcom/gdsc.h | 1 +
3 files changed, 46 insertions(+), 2 deletions(-)
---
base-commit: 744cf71b8bdfcdd77aaf58395e068b7457634b2c
change-id: 20241118-b4-linux-next-24-11-18-clock-multiple-power-domains-a5f994dc452a
Best regards,
--
Bryan O'Donoghue <bryan.odonoghue(a)linaro.org>
This patch fixes CVE-2023-52531 [1] present in 5.4 and 5.10 stable
kernels. The vulnerability concerns flawed pointer arithmetic in
iwlwifi driver caused by use of spurious casting to (u8 *). Original
upstream commit [3] removed that cast but kept a change to increment
a pointer first and only then cast it to (void *) or other type.
However, as older branches did not receive commit 3827cb59b3b8
("iwlwifi: avoid void pointer arithmetic") [2], the aforementioned
kept change is also missing, which should be corrected and applied
to other vulnerable versions. This backport ensures that correction
and keeps away from dangerous void pointer arithmetic.
[PATCH 5.4/5.10 1/1] wifi: iwlwifi: mvm: Fix a memory corruption issue
Change 'channels' pointer before casting it to (void *).
Fixes [1].
[1] https://nvd.nist.gov/vuln/detail/cve-2023-52531
[2] https://github.com/torvalds/linux/commit/3827cb59b3b8ce4b1687385d35034dadcd…
[3] https://github.com/torvalds/linux/commit/8ba438ef3cacc4808a63ed0ce24d4f0942…
The following commit has been merged into the timers/urgent branch of tip:
Commit-ID: 63dffecfba3eddcf67a8f76d80e0c141f93d44a5
Gitweb: https://git.kernel.org/tip/63dffecfba3eddcf67a8f76d80e0c141f93d44a5
Author: Frederic Weisbecker <frederic(a)kernel.org>
AuthorDate: Sat, 23 Nov 2024 00:48:11 +01:00
Committer: Thomas Gleixner <tglx(a)linutronix.de>
CommitterDate: Fri, 29 Nov 2024 13:19:09 +01:00
posix-timers: Target group sigqueue to current task only if not exiting
A sigqueue belonging to a posix timer, which target is not a specific
thread but a whole thread group, is preferrably targeted to the current
task if it is part of that thread group.
However nothing prevents a posix timer event from queueing such a
sigqueue from a reaped yet running task. The interruptible code space
between exit_notify() and the final call to schedule() is enough for
posix_timer_fn() hrtimer to fire.
If that happens while the current task is part of the thread group
target, it is proposed to handle it but since its sighand pointer may
have been cleared already, the sigqueue is dropped even if there are
other tasks running within the group that could handle it.
As a result posix timers with thread group wide target may miss signals
when some of their threads are exiting.
Fix this with verifying that the current task hasn't been through
exit_notify() before proposing it as a preferred target so as to ensure
that its sighand is still here and stable.
complete_signal() might still reconsider the choice and find a better
target within the group if current has passed retarget_shared_pending()
already.
Fixes: bcb7ee79029d ("posix-timers: Prefer delivery of signals to the current thread")
Reported-by: Anthony Mallet <anthony.mallet(a)laas.fr>
Suggested-by: Oleg Nesterov <oleg(a)redhat.com>
Signed-off-by: Frederic Weisbecker <frederic(a)kernel.org>
Signed-off-by: Thomas Gleixner <tglx(a)linutronix.de>
Acked-by: Oleg Nesterov <oleg(a)redhat.com>
Cc: stable(a)vger.kernel.org
Link: https://lore.kernel.org/all/20241122234811.60455-1-frederic@kernel.org
Closes: https://lore.kernel.org/all/26411.57288.238690.681680@gargle.gargle.HOWL
---
kernel/signal.c | 7 ++++---
1 file changed, 4 insertions(+), 3 deletions(-)
diff --git a/kernel/signal.c b/kernel/signal.c
index 98b65cb..989b1cc 100644
--- a/kernel/signal.c
+++ b/kernel/signal.c
@@ -1959,14 +1959,15 @@ static void posixtimer_queue_sigqueue(struct sigqueue *q, struct task_struct *t,
*
* Where type is not PIDTYPE_PID, signals must be delivered to the
* process. In this case, prefer to deliver to current if it is in
- * the same thread group as the target process, which avoids
- * unnecessarily waking up a potentially idle task.
+ * the same thread group as the target process and its sighand is
+ * stable, which avoids unnecessarily waking up a potentially idle task.
*/
static inline struct task_struct *posixtimer_get_target(struct k_itimer *tmr)
{
struct task_struct *t = pid_task(tmr->it_pid, tmr->it_pid_type);
- if (t && tmr->it_pid_type != PIDTYPE_PID && same_thread_group(t, current))
+ if (t && tmr->it_pid_type != PIDTYPE_PID &&
+ same_thread_group(t, current) && !current->exit_state)
t = current;
return t;
}
Commit b8e0ddd36ce9 ("can: mcp251xfd: tef: prepare to workaround
broken TEF FIFO tail index erratum") introduced
mcp251xfd_get_tef_len() to get the number of unhandled transmit events
from the Transmit Event FIFO (TEF).
As the TEF has no head index, the driver uses the TX-FIFO's tail index
instead, assuming that send frames are completed.
When calculating the number of unhandled TEF events, that commit
didn't take mcp2518fd erratum DS80000789E 6. into account. According
to that erratum, the FIFOCI bits of a FIFOSTA register, here the
TX-FIFO tail index might be corrupted.
However here it seems the bit indicating that the TX-FIFO is
empty (MCP251XFD_REG_FIFOSTA_TFERFFIF) is not correct while the
TX-FIFO tail index is.
Assume that the TX-FIFO is indeed empty if:
- Chip's head and tail index are equal (len == 0).
- The TX-FIFO is less than half full.
(The TX-FIFO empty case has already been checked at the
beginning of this function.)
- No free buffers in the TX ring.
If the TX-FIFO is assumed to be empty, assume that the TEF is full and
return the number of elements in the TX-FIFO (which equals the number
of TEF elements).
If these assumptions are false, the driver might read to many objects
from the TEF. mcp251xfd_handle_tefif_one() checks the sequence numbers
and will refuse to process old events.
Reported-by: Renjaya Raga Zenta <renjaya.zenta(a)formulatrix.com>
Closes: https://patch.msgid.link/CAJ7t6HgaeQ3a_OtfszezU=zB-FqiZXqrnATJ3UujNoQJJf7Gg…
Fixes: b8e0ddd36ce9 ("can: mcp251xfd: tef: prepare to workaround broken TEF FIFO tail index erratum")
Tested-by: Renjaya Raga Zenta <renjaya.zenta(a)formulatrix.com>
Cc: stable(a)vger.kernel.org
Link: https://patch.msgid.link/20241126-mcp251xfd-fix-length-calculation-v2-1-c2e…
Signed-off-by: Marc Kleine-Budde <mkl(a)pengutronix.de>
---
drivers/net/can/spi/mcp251xfd/mcp251xfd-tef.c | 29 ++++++++++++++++++-
1 file changed, 28 insertions(+), 1 deletion(-)
diff --git a/drivers/net/can/spi/mcp251xfd/mcp251xfd-tef.c b/drivers/net/can/spi/mcp251xfd/mcp251xfd-tef.c
index d3ac865933fd..e94321849fd7 100644
--- a/drivers/net/can/spi/mcp251xfd/mcp251xfd-tef.c
+++ b/drivers/net/can/spi/mcp251xfd/mcp251xfd-tef.c
@@ -21,6 +21,11 @@ static inline bool mcp251xfd_tx_fifo_sta_empty(u32 fifo_sta)
return fifo_sta & MCP251XFD_REG_FIFOSTA_TFERFFIF;
}
+static inline bool mcp251xfd_tx_fifo_sta_less_than_half_full(u32 fifo_sta)
+{
+ return fifo_sta & MCP251XFD_REG_FIFOSTA_TFHRFHIF;
+}
+
static inline int
mcp251xfd_tef_tail_get_from_chip(const struct mcp251xfd_priv *priv,
u8 *tef_tail)
@@ -147,7 +152,29 @@ mcp251xfd_get_tef_len(struct mcp251xfd_priv *priv, u8 *len_p)
BUILD_BUG_ON(sizeof(tx_ring->obj_num) != sizeof(len));
len = (chip_tx_tail << shift) - (tail << shift);
- *len_p = len >> shift;
+ len >>= shift;
+
+ /* According to mcp2518fd erratum DS80000789E 6. the FIFOCI
+ * bits of a FIFOSTA register, here the TX-FIFO tail index
+ * might be corrupted.
+ *
+ * However here it seems the bit indicating that the TX-FIFO
+ * is empty (MCP251XFD_REG_FIFOSTA_TFERFFIF) is not correct
+ * while the TX-FIFO tail index is.
+ *
+ * We assume the TX-FIFO is empty, i.e. all pending CAN frames
+ * haven been send, if:
+ * - Chip's head and tail index are equal (len == 0).
+ * - The TX-FIFO is less than half full.
+ * (The TX-FIFO empty case has already been checked at the
+ * beginning of this function.)
+ * - No free buffers in the TX ring.
+ */
+ if (len == 0 && mcp251xfd_tx_fifo_sta_less_than_half_full(fifo_sta) &&
+ mcp251xfd_get_tx_free(tx_ring) == 0)
+ len = tx_ring->obj_num;
+
+ *len_p = len;
return 0;
}
--
2.45.2
In commit 6e86a1543c37 ("can: dev: provide optional GPIO based
termination support") GPIO based termination support was added.
For no particular reason that patch uses gpiod_set_value() to set the
GPIO. This leads to the following warning, if the systems uses a
sleeping GPIO, i.e. behind an I2C port expander:
| WARNING: CPU: 0 PID: 379 at /drivers/gpio/gpiolib.c:3496 gpiod_set_value+0x50/0x6c
| CPU: 0 UID: 0 PID: 379 Comm: ip Not tainted 6.11.0-20241016-1 #1 823affae360cc91126e4d316d7a614a8bf86236c
Replace gpiod_set_value() by gpiod_set_value_cansleep() to allow the
use of sleeping GPIOs.
Cc: Nicolai Buchwitz <nb(a)tipi-net.de>
Cc: Lino Sanfilippo <l.sanfilippo(a)kunbus.com>
Cc: stable(a)vger.kernel.org
Reported-by: Leonard Göhrs <l.goehrs(a)pengutronix.de>
Tested-by: Leonard Göhrs <l.goehrs(a)pengutronix.de>
Fixes: 6e86a1543c37 ("can: dev: provide optional GPIO based termination support")
Link: https://patch.msgid.link/20241121-dev-fix-can_set_termination-v1-1-41fa6e29…
Signed-off-by: Marc Kleine-Budde <mkl(a)pengutronix.de>
---
drivers/net/can/dev/dev.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/net/can/dev/dev.c b/drivers/net/can/dev/dev.c
index 6792c14fd7eb..681643ab3780 100644
--- a/drivers/net/can/dev/dev.c
+++ b/drivers/net/can/dev/dev.c
@@ -468,7 +468,7 @@ static int can_set_termination(struct net_device *ndev, u16 term)
else
set = 0;
- gpiod_set_value(priv->termination_gpio, set);
+ gpiod_set_value_cansleep(priv->termination_gpio, set);
return 0;
}
base-commit: 9bb88c659673003453fd42e0ddf95c9628409094
--
2.45.2
The patch titled
Subject: mm: Respect mmap hint address when aligning for THP
has been added to the -mm mm-hotfixes-unstable branch. Its filename is
mm-respect-mmap-hint-address-when-aligning-for-thp.patch
This patch will shortly appear at
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patche…
This patch will later appear in the mm-hotfixes-unstable branch at
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next via the mm-everything
branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there every 2-3 working days
------------------------------------------------------
From: Kalesh Singh <kaleshsingh(a)google.com>
Subject: mm: Respect mmap hint address when aligning for THP
Date: Mon, 18 Nov 2024 13:46:48 -0800
Commit efa7df3e3bb5 ("mm: align larger anonymous mappings on THP
boundaries") updated __get_unmapped_area() to align the start address for
the VMA to a PMD boundary if CONFIG_TRANSPARENT_HUGEPAGE=y.
It does this by effectively looking up a region that is of size,
request_size + PMD_SIZE, and aligning up the start to a PMD boundary.
Commit 4ef9ad19e176 ("mm: huge_memory: don't force huge page alignment on
32 bit") opted out of this for 32bit due to regressions in mmap base
randomization.
Commit d4148aeab412 ("mm, mmap: limit THP alignment of anonymous mappings
to PMD-aligned sizes") restricted this to only mmap sizes that are
multiples of the PMD_SIZE due to reported regressions in some performance
benchmarks -- which seemed mostly due to the reduced spatial locality of
related mappings due to the forced PMD-alignment.
Another unintended side effect has emerged: When a user specifies an mmap
hint address, the THP alignment logic modifies the behavior, potentially
ignoring the hint even if a sufficiently large gap exists at the requested
hint location.
Example Scenario:
Consider the following simplified virtual address (VA) space:
...
0x200000-0x400000 --- VMA A
0x400000-0x600000 --- Hole
0x600000-0x800000 --- VMA B
...
A call to mmap() with hint=0x400000 and len=0x200000 behaves differently:
- Before THP alignment: The requested region (size 0x200000) fits into
the gap at 0x400000, so the hint is respected.
- After alignment: The logic searches for a region of size
0x400000 (len + PMD_SIZE) starting at 0x400000.
This search fails due to the mapping at 0x600000 (VMA B), and the hint
is ignored, falling back to arch_get_unmapped_area[_topdown]().
In general the hint is effectively ignored, if there is any existing
mapping in the below range:
[mmap_hint + mmap_size, mmap_hint + mmap_size + PMD_SIZE)
This changes the semantics of mmap hint; from ""Respect the hint if a
sufficiently large gap exists at the requested location" to "Respect the
hint only if an additional PMD-sized gap exists beyond the requested
size".
This has performance implications for allocators that allocate their heap
using mmap but try to keep it "as contiguous as possible" by using the end
of the exisiting heap as the address hint. With the new behavior it's
more likely to get a much less contiguous heap, adding extra fragmentation
and performance overhead.
To restore the expected behavior; don't use
thp_get_unmapped_area_vmflags() when the user provided a hint address, for
anonymous mappings.
Note: As Yang Shi pointed out: the issue still remains for filesystems
which are using thp_get_unmapped_area() for their get_unmapped_area() op.
It is unclear what worklaods will regress for if we ignore THP alignment
when the hint address is provided for such file backed mappings -- so this
fix will be handled separately.
Link: https://lkml.kernel.org/r/20241118214650.3667577-1-kaleshsingh@google.com
Fixes: efa7df3e3bb5 ("mm: align larger anonymous mappings on THP boundaries")
Signed-off-by: Kalesh Singh <kaleshsingh(a)google.com>
Reviewed-by: Rik van Riel <riel(a)surriel.com>
Reviewed-by: Vlastimil Babka <vbabka(a)suse.cz>
Reviewed-by: David Hildenbrand <david(a)redhat.com>
Cc: Vlastimil Babka <vbabka(a)suse.cz>
Cc: Yang Shi <yang(a)os.amperecomputing.com>
Cc: Rik van Riel <riel(a)surriel.com>
Cc: Ryan Roberts <ryan.roberts(a)arm.com>
Cc: Suren Baghdasaryan <surenb(a)google.com>
Cc: Minchan Kim <minchan(a)kernel.org>
Cc: Hans Boehm <hboehm(a)google.com>
Cc: Lokesh Gidra <lokeshgidra(a)google.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/mmap.c | 1 +
1 file changed, 1 insertion(+)
--- a/mm/mmap.c~mm-respect-mmap-hint-address-when-aligning-for-thp
+++ a/mm/mmap.c
@@ -893,6 +893,7 @@ __get_unmapped_area(struct file *file, u
if (get_area) {
addr = get_area(file, addr, len, pgoff, flags);
} else if (IS_ENABLED(CONFIG_TRANSPARENT_HUGEPAGE)
+ && !addr /* no hint */
&& IS_ALIGNED(len, PMD_SIZE)) {
/* Ensures that larger anonymous mappings are THP aligned. */
addr = thp_get_unmapped_area_vmflags(file, addr, len,
_
Patches currently in -mm which might be from kaleshsingh(a)google.com are
mm-respect-mmap-hint-address-when-aligning-for-thp.patch
The patch titled
Subject: mm: reinstate ability to map write-sealed memfd mappings read-only
has been added to the -mm mm-hotfixes-unstable branch. Its filename is
mm-reinstate-ability-to-map-write-sealed-memfd-mappings-read-only.patch
This patch will shortly appear at
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patche…
This patch will later appear in the mm-hotfixes-unstable branch at
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next via the mm-everything
branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there every 2-3 working days
------------------------------------------------------
From: Lorenzo Stoakes <lorenzo.stoakes(a)oracle.com>
Subject: mm: reinstate ability to map write-sealed memfd mappings read-only
Date: Thu, 28 Nov 2024 15:06:17 +0000
Patch series "mm: reinstate ability to map write-sealed memfd mappings
read-only".
In commit 158978945f31 ("mm: perform the mapping_map_writable() check
after call_mmap()") (and preceding changes in the same series) it became
possible to mmap() F_SEAL_WRITE sealed memfd mappings read-only.
Commit 5de195060b2e ("mm: resolve faulty mmap_region() error path
behaviour") unintentionally undid this logic by moving the
mapping_map_writable() check before the shmem_mmap() hook is invoked,
thereby regressing this change.
This series reworks how we both permit write-sealed mappings being mapped
read-only and disallow mprotect() from undoing the write-seal, fixing this
regression.
We also add a regression test to ensure that we do not accidentally
regress this in future.
Thanks to Julian Orth for reporting this regression.
This patch (of 2):
In commit 158978945f31 ("mm: perform the mapping_map_writable() check
after call_mmap()") (and preceding changes in the same series) it became
possible to mmap() F_SEAL_WRITE sealed memfd mappings read-only.
This was previously unnecessarily disallowed, despite the man page
documentation indicating that it would be, thereby limiting the usefulness
of F_SEAL_WRITE logic.
We fixed this by adapting logic that existed for the F_SEAL_FUTURE_WRITE
seal (one which disallows future writes to the memfd) to also be used for
F_SEAL_WRITE.
For background - the F_SEAL_FUTURE_WRITE seal clears VM_MAYWRITE for a
read-only mapping to disallow mprotect() from overriding the seal - an
operation performed by seal_check_write(), invoked from shmem_mmap(), the
f_op->mmap() hook used by shmem mappings.
By extending this to F_SEAL_WRITE and critically - checking
mapping_map_writable() to determine if we may map the memfd AFTER we
invoke shmem_mmap() - the desired logic becomes possible. This is because
mapping_map_writable() explicitly checks for VM_MAYWRITE, which we will
have cleared.
Commit 5de195060b2e ("mm: resolve faulty mmap_region() error path
behaviour") unintentionally undid this logic by moving the
mapping_map_writable() check before the shmem_mmap() hook is invoked,
thereby regressing this change.
We reinstate this functionality by moving the check out of shmem_mmap()
and instead performing it in do_mmap() at the point at which VMA flags are
being determined, which seems in any case to be a more appropriate place
in which to make this determination.
In order to achieve this we rework memfd seal logic to allow us access to
this information using existing logic and eliminate the clearing of
VM_MAYWRITE from seal_check_write() which we are performing in do_mmap()
instead.
Link: https://lkml.kernel.org/r/99fc35d2c62bd2e05571cf60d9f8b843c56069e0.17328047…
Fixes: 5de195060b2e ("mm: resolve faulty mmap_region() error path behaviour")
Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes(a)oracle.com>
Reported-by: Julian Orth <ju.orth(a)gmail.com>
Closes: https://lore.kernel.org/all/CAHijbEUMhvJTN9Xw1GmbM266FXXv=U7s4L_Jem5x3AaPZx…
Cc: Jann Horn <jannh(a)google.com>
Cc: Liam R. Howlett <Liam.Howlett(a)Oracle.com>
Cc: Linus Torvalds <torvalds(a)linux-foundation.org>
Cc: Shuah Khan <shuah(a)kernel.org>
Cc: Vlastimil Babka <vbabka(a)suse.cz>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
include/linux/memfd.h | 14 +++++++++
include/linux/mm.h | 58 +++++++++++++++++++++++++++-------------
mm/memfd.c | 2 -
mm/mmap.c | 4 ++
4 files changed, 59 insertions(+), 19 deletions(-)
--- a/include/linux/memfd.h~mm-reinstate-ability-to-map-write-sealed-memfd-mappings-read-only
+++ a/include/linux/memfd.h
@@ -7,6 +7,7 @@
#ifdef CONFIG_MEMFD_CREATE
extern long memfd_fcntl(struct file *file, unsigned int cmd, unsigned int arg);
struct folio *memfd_alloc_folio(struct file *memfd, pgoff_t idx);
+unsigned int *memfd_file_seals_ptr(struct file *file);
#else
static inline long memfd_fcntl(struct file *f, unsigned int c, unsigned int a)
{
@@ -16,6 +17,19 @@ static inline struct folio *memfd_alloc_
{
return ERR_PTR(-EINVAL);
}
+
+static inline unsigned int *memfd_file_seals_ptr(struct file *file)
+{
+ return NULL;
+}
#endif
+/* Retrieve memfd seals associated with the file, if any. */
+static inline unsigned int memfd_file_seals(struct file *file)
+{
+ unsigned int *sealsp = memfd_file_seals_ptr(file);
+
+ return sealsp ? *sealsp : 0;
+}
+
#endif /* __LINUX_MEMFD_H */
--- a/include/linux/mm.h~mm-reinstate-ability-to-map-write-sealed-memfd-mappings-read-only
+++ a/include/linux/mm.h
@@ -4091,6 +4091,37 @@ void mem_dump_obj(void *object);
static inline void mem_dump_obj(void *object) {}
#endif
+static inline bool is_write_sealed(int seals)
+{
+ return seals & (F_SEAL_WRITE | F_SEAL_FUTURE_WRITE);
+}
+
+/**
+ * is_readonly_sealed - Checks whether write-sealed but mapped read-only,
+ * in which case writes should be disallowing moving
+ * forwards.
+ * @seals: the seals to check
+ * @vm_flags: the VMA flags to check
+ *
+ * Returns whether readonly sealed, in which case writess should be disallowed
+ * going forward.
+ */
+static inline bool is_readonly_sealed(int seals, vm_flags_t vm_flags)
+{
+ /*
+ * Since an F_SEAL_[FUTURE_]WRITE sealed memfd can be mapped as
+ * MAP_SHARED and read-only, take care to not allow mprotect to
+ * revert protections on such mappings. Do this only for shared
+ * mappings. For private mappings, don't need to mask
+ * VM_MAYWRITE as we still want them to be COW-writable.
+ */
+ if (is_write_sealed(seals) &&
+ ((vm_flags & (VM_SHARED | VM_WRITE)) == VM_SHARED))
+ return true;
+
+ return false;
+}
+
/**
* seal_check_write - Check for F_SEAL_WRITE or F_SEAL_FUTURE_WRITE flags and
* handle them.
@@ -4102,24 +4133,15 @@ static inline void mem_dump_obj(void *ob
*/
static inline int seal_check_write(int seals, struct vm_area_struct *vma)
{
- if (seals & (F_SEAL_WRITE | F_SEAL_FUTURE_WRITE)) {
- /*
- * New PROT_WRITE and MAP_SHARED mmaps are not allowed when
- * write seals are active.
- */
- if ((vma->vm_flags & VM_SHARED) && (vma->vm_flags & VM_WRITE))
- return -EPERM;
-
- /*
- * Since an F_SEAL_[FUTURE_]WRITE sealed memfd can be mapped as
- * MAP_SHARED and read-only, take care to not allow mprotect to
- * revert protections on such mappings. Do this only for shared
- * mappings. For private mappings, don't need to mask
- * VM_MAYWRITE as we still want them to be COW-writable.
- */
- if (vma->vm_flags & VM_SHARED)
- vm_flags_clear(vma, VM_MAYWRITE);
- }
+ if (!is_write_sealed(seals))
+ return 0;
+
+ /*
+ * New PROT_WRITE and MAP_SHARED mmaps are not allowed when
+ * write seals are active.
+ */
+ if ((vma->vm_flags & VM_SHARED) && (vma->vm_flags & VM_WRITE))
+ return -EPERM;
return 0;
}
--- a/mm/memfd.c~mm-reinstate-ability-to-map-write-sealed-memfd-mappings-read-only
+++ a/mm/memfd.c
@@ -170,7 +170,7 @@ static int memfd_wait_for_pins(struct ad
return error;
}
-static unsigned int *memfd_file_seals_ptr(struct file *file)
+unsigned int *memfd_file_seals_ptr(struct file *file)
{
if (shmem_file(file))
return &SHMEM_I(file_inode(file))->seals;
--- a/mm/mmap.c~mm-reinstate-ability-to-map-write-sealed-memfd-mappings-read-only
+++ a/mm/mmap.c
@@ -47,6 +47,7 @@
#include <linux/oom.h>
#include <linux/sched/mm.h>
#include <linux/ksm.h>
+#include <linux/memfd.h>
#include <linux/uaccess.h>
#include <asm/cacheflush.h>
@@ -368,6 +369,7 @@ unsigned long do_mmap(struct file *file,
if (file) {
struct inode *inode = file_inode(file);
+ unsigned int seals = memfd_file_seals(file);
unsigned long flags_mask;
if (!file_mmap_ok(file, inode, pgoff, len))
@@ -408,6 +410,8 @@ unsigned long do_mmap(struct file *file,
vm_flags |= VM_SHARED | VM_MAYSHARE;
if (!(file->f_mode & FMODE_WRITE))
vm_flags &= ~(VM_MAYWRITE | VM_SHARED);
+ else if (is_readonly_sealed(seals, vm_flags))
+ vm_flags &= ~VM_MAYWRITE;
fallthrough;
case MAP_PRIVATE:
if (!(file->f_mode & FMODE_READ))
_
Patches currently in -mm which might be from lorenzo.stoakes(a)oracle.com are
mm-reinstate-ability-to-map-write-sealed-memfd-mappings-read-only.patch
selftests-memfd-add-test-for-mapping-write-sealed-memfd-read-only.patch
docs-mm-add-vma-locks-documentation.patch
docs-mm-add-vma-locks-documentation-v3.patch
docs-mm-add-vma-locks-documentation-fix.patch
The patch titled
Subject: mm: memcg: declare do_memsw_account inline
has been added to the -mm mm-hotfixes-unstable branch. Its filename is
mm-memcg-declare-do_memsw_account-inline.patch
This patch will shortly appear at
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patche…
This patch will later appear in the mm-hotfixes-unstable branch at
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next via the mm-everything
branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there every 2-3 working days
------------------------------------------------------
From: John Sperbeck <jsperbeck(a)google.com>
Subject: mm: memcg: declare do_memsw_account inline
Date: Thu, 28 Nov 2024 12:39:59 -0800
In commit 66d60c428b23 ("mm: memcg: move legacy memcg event code into
memcontrol-v1.c"), the static do_memsw_account() function was moved from a
.c file to a .h file. Unfortunately, the traditional inline keyword
wasn't added. If a file (e.g., a unit test) includes the .h file, but
doesn't refer to do_memsw_account(), it will get a warning like:
mm/memcontrol-v1.h:41:13: warning: unused function 'do_memsw_account' [-Wunused-function]
41 | static bool do_memsw_account(void)
| ^~~~~~~~~~~~~~~~
Link: https://lkml.kernel.org/r/20241128203959.726527-1-jsperbeck@google.com
Fixes: 66d60c428b23 ("mm: memcg: move legacy memcg event code into memcontrol-v1.c")
Signed-off-by: John Sperbeck <jsperbeck(a)google.com>
Acked-by: Roman Gushchin <roman.gushchin(a)linux.dev>
Cc: Johannes Weiner <hannes(a)cmpxchg.org>
Cc: Michal Hocko <mhocko(a)kernel.org>
Cc: Muchun Song <muchun.song(a)linux.dev>
Cc: Shakeel Butt <shakeel.butt(a)linux.dev>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/memcontrol-v1.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
--- a/mm/memcontrol-v1.h~mm-memcg-declare-do_memsw_account-inline
+++ a/mm/memcontrol-v1.h
@@ -38,7 +38,7 @@ void mem_cgroup_id_put_many(struct mem_c
iter = mem_cgroup_iter(NULL, iter, NULL))
/* Whether legacy memory+swap accounting is active */
-static bool do_memsw_account(void)
+static inline bool do_memsw_account(void)
{
return !cgroup_subsys_on_dfl(memory_cgrp_subsys);
}
_
Patches currently in -mm which might be from jsperbeck(a)google.com are
mm-memcg-declare-do_memsw_account-inline.patch
[BUG]
When testing with COW fixup marked as BUG_ON() (this is involved with the
new pin_user_pages*() change, which should not result new out-of-band
dirty pages), I hit a crash triggered by the BUG_ON() from hitting COW
fixup path.
This BUG_ON() happens just after a failed btrfs_run_delalloc_range():
BTRFS error (device dm-2): failed to run delalloc range, root 348 ino 405 folio 65536 submit_bitmap 6-15 start 90112 len 106496: -28
------------[ cut here ]------------
kernel BUG at fs/btrfs/extent_io.c:1444!
Internal error: Oops - BUG: 00000000f2000800 [#1] SMP
CPU: 0 UID: 0 PID: 434621 Comm: kworker/u24:8 Tainted: G OE 6.12.0-rc7-custom+ #86
Hardware name: QEMU KVM Virtual Machine, BIOS unknown 2/2/2022
Workqueue: events_unbound btrfs_async_reclaim_data_space [btrfs]
pc : extent_writepage_io+0x2d4/0x308 [btrfs]
lr : extent_writepage_io+0x2d4/0x308 [btrfs]
Call trace:
extent_writepage_io+0x2d4/0x308 [btrfs]
extent_writepage+0x218/0x330 [btrfs]
extent_write_cache_pages+0x1d4/0x4b0 [btrfs]
btrfs_writepages+0x94/0x150 [btrfs]
do_writepages+0x74/0x190
filemap_fdatawrite_wbc+0x88/0xc8
start_delalloc_inodes+0x180/0x3b0 [btrfs]
btrfs_start_delalloc_roots+0x174/0x280 [btrfs]
shrink_delalloc+0x114/0x280 [btrfs]
flush_space+0x250/0x2f8 [btrfs]
btrfs_async_reclaim_data_space+0x180/0x228 [btrfs]
process_one_work+0x164/0x408
worker_thread+0x25c/0x388
kthread+0x100/0x118
ret_from_fork+0x10/0x20
Code: aa1403e1 9402f3ef aa1403e0 9402f36f (d4210000)
---[ end trace 0000000000000000 ]---
[CAUSE]
That failure is mostly from cow_file_range(), where we can hit -ENOSPC.
Although the -ENOSPC is already a bug related to our space reservation
code, let's just focus on the error handling.
For example, we have the following dirty range [0, 64K) of an inode,
with 4K sector size and 4K page size:
0 16K 32K 48K 64K
|///////////////////////////////////////|
|#######################################|
Where |///| means page are still dirty, and |###| means the extent io
tree has EXTENT_DELALLOC flag.
- Enter extent_writepage() for page 0
- Enter btrfs_run_delalloc_range() for range [0, 64K)
- Enter cow_file_range() for range [0, 64K)
- Function btrfs_reserve_extent() only reserved one 16K extent
So we created extent map and ordered extent for range [0, 16K)
0 16K 32K 48K 64K
|////////|//////////////////////////////|
|<- OE ->|##############################|
And range [0, 16K) has its delalloc flag cleared.
But since we haven't yet submit any bio, involved 4 pages are still
dirty.
- Function btrfs_reserve_extent() return with -ENOSPC
Now we have to run error cleanup, which will clear all
EXTENT_DELALLOC* flags and clear the dirty flags for the remaining
ranges:
0 16K 32K 48K 64K
|////////| |
| | |
Note that range [0, 16K) still has their pages dirty.
- Some time later, writeback are triggered again for the range [0, 16K)
since the page range still have dirty flags.
- btrfs_run_delalloc_range() will do nothing because there is no
EXTENT_DELALLOC flag.
- extent_writepage_io() find page 0 has no ordered flag
Which falls into the COW fixup path, triggering the BUG_ON().
Unfortunately this error handling bug dates back to the introduction of btrfs.
Thankfully with the abuse of cow fixup, at least it won't crash the
kernel.
[FIX]
Instead of immediately unlock the extent and folios, we keep the extent
and folios locked until either erroring out or the whole delalloc range
finished.
When the whole delalloc range finished without error, we just unlock the
whole range with PAGE_SET_ORDERED (and PAGE_UNLOCK for !keep_locked
cases), with EXTENT_DELALLOC and EXTENT_LOCKED cleared.
And those involved folios will be properly submitted, with their dirty
flags cleared during submission.
For the error path, it will be a little more complex:
- The range with ordered extent allocated (range (1))
We only clear the EXTENT_DELALLOC and EXTENT_LOCKED, as the remaining
flags are cleaned up by
btrfs_mark_ordered_io_finished()->btrfs_finish_one_ordered().
For folios we finish the IO (clear dirty, start writeback and
immediately finish the writeback) and unlock the folios.
- The range with reserved extent but no ordered extent (range(2))
- The range we never touched (range(3))
For both range (2) and range(3) the behavior is not changed.
Now even if cow_file_range() failed halfway with some successfully
reserved extents/ordered extents, we will keep all folios clean, so
there will be no future writeback triggered on them.
Cc: stable(a)vger.kernel.org
Signed-off-by: Qu Wenruo <wqu(a)suse.com>
---
fs/btrfs/inode.c | 63 ++++++++++++++++++++++++------------------------
1 file changed, 31 insertions(+), 32 deletions(-)
---
The similar bug exists for nocow path too (and other routines like
zoned), the fix for nocow will come later after the patch get reviewed.
diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index 9267861f8ab0..e8232ac7917f 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -1372,6 +1372,17 @@ static noinline int cow_file_range(struct btrfs_inode *inode,
alloc_hint = btrfs_get_extent_allocation_hint(inode, start, num_bytes);
+ /*
+ * We're not doing compressed IO, don't unlock the first page
+ * (which the caller expects to stay locked), don't clear any
+ * dirty bits and don't set any writeback bits
+ *
+ * Do set the Ordered (Private2) bit so we know this page was
+ * properly setup for writepage.
+ */
+ page_ops = (keep_locked ? 0 : PAGE_UNLOCK);
+ page_ops |= PAGE_SET_ORDERED;
+
/*
* Relocation relies on the relocated extents to have exactly the same
* size as the original extents. Normally writeback for relocation data
@@ -1431,6 +1442,10 @@ static noinline int cow_file_range(struct btrfs_inode *inode,
file_extent.offset = 0;
file_extent.compression = BTRFS_COMPRESS_NONE;
+ /*
+ * Locked range will be released either during error clean up or
+ * after the whole range is finished.
+ */
lock_extent(&inode->io_tree, start, start + cur_alloc_size - 1,
&cached);
@@ -1476,21 +1491,6 @@ static noinline int cow_file_range(struct btrfs_inode *inode,
btrfs_dec_block_group_reservations(fs_info, ins.objectid);
- /*
- * We're not doing compressed IO, don't unlock the first page
- * (which the caller expects to stay locked), don't clear any
- * dirty bits and don't set any writeback bits
- *
- * Do set the Ordered (Private2) bit so we know this page was
- * properly setup for writepage.
- */
- page_ops = (keep_locked ? 0 : PAGE_UNLOCK);
- page_ops |= PAGE_SET_ORDERED;
-
- extent_clear_unlock_delalloc(inode, start, start + cur_alloc_size - 1,
- locked_folio, &cached,
- EXTENT_LOCKED | EXTENT_DELALLOC,
- page_ops);
if (num_bytes < cur_alloc_size)
num_bytes = 0;
else
@@ -1507,6 +1507,9 @@ static noinline int cow_file_range(struct btrfs_inode *inode,
if (ret)
goto out_unlock;
}
+ extent_clear_unlock_delalloc(inode, orig_start, end, locked_folio, &cached,
+ EXTENT_LOCKED | EXTENT_DELALLOC,
+ page_ops);
done:
if (done_offset)
*done_offset = end;
@@ -1527,35 +1530,31 @@ static noinline int cow_file_range(struct btrfs_inode *inode,
* We process each region below.
*/
- clear_bits = EXTENT_LOCKED | EXTENT_DELALLOC | EXTENT_DELALLOC_NEW |
- EXTENT_DEFRAG | EXTENT_CLEAR_META_RESV;
- page_ops = PAGE_UNLOCK | PAGE_START_WRITEBACK | PAGE_END_WRITEBACK;
-
/*
* For the range (1). We have already instantiated the ordered extents
* for this region. They are cleaned up by
* btrfs_cleanup_ordered_extents() in e.g,
- * btrfs_run_delalloc_range(). EXTENT_LOCKED | EXTENT_DELALLOC are
- * already cleared in the above loop. And, EXTENT_DELALLOC_NEW |
- * EXTENT_DEFRAG | EXTENT_CLEAR_META_RESV are handled by the cleanup
- * function.
+ * btrfs_run_delalloc_range().
+ * EXTENT_DELALLOC_NEW | EXTENT_DEFRAG | EXTENT_CLEAR_META_RESV
+ * are also handled by the cleanup function.
*
- * However, in case of @keep_locked, we still need to unlock the pages
- * (except @locked_folio) to ensure all the pages are unlocked.
+ * So here we only clear EXTENT_LOCKED and EXTENT_DELALLOC flag,
+ * and finish the writeback of the involved folios, which will be
+ * never submitted.
*/
- if (keep_locked && orig_start < start) {
+ if (orig_start < start) {
+ clear_bits = EXTENT_LOCKED | EXTENT_DELALLOC;
+ page_ops = PAGE_UNLOCK | PAGE_START_WRITEBACK | PAGE_END_WRITEBACK;
+
if (!locked_folio)
mapping_set_error(inode->vfs_inode.i_mapping, ret);
extent_clear_unlock_delalloc(inode, orig_start, start - 1,
locked_folio, NULL, 0, page_ops);
}
- /*
- * At this point we're unlocked, we want to make sure we're only
- * clearing these flags under the extent lock, so lock the rest of the
- * range and clear everything up.
- */
- lock_extent(&inode->io_tree, start, end, NULL);
+ clear_bits = EXTENT_LOCKED | EXTENT_DELALLOC | EXTENT_DELALLOC_NEW |
+ EXTENT_DEFRAG | EXTENT_CLEAR_META_RESV;
+ page_ops = PAGE_UNLOCK | PAGE_START_WRITEBACK | PAGE_END_WRITEBACK;
/*
* For the range (2). If we reserved an extent for our delalloc range
--
2.47.0
From: John Harrison <John.C.Harrison(a)Intel.com>
Adding lockdep checking to the coredump code showed that there was an
existing violation. The dev_coredumpm_timeout() call is used to
register the dump with the base coredump subsystem. However, that
makes multiple memory allocations, only some of which use the GFP_
flags passed in. So that also needs to be deferred to the worker
function where it is safe to allocate with arbitrary flags.
In order to not add protoypes for the callback functions, moving the
_timeout call also means moving the worker thread function to later in
the file.
v2: Rebased after other changes to the worker function.
Fixes: e799485044cb ("drm/xe: Introduce the dev_coredump infrastructure.")
Cc: Thomas Hellström <thomas.hellstrom(a)linux.intel.com>
Cc: Matthew Brost <matthew.brost(a)intel.com>
Cc: Jani Nikula <jani.nikula(a)linux.intel.com>
Cc: Daniel Vetter <daniel.vetter(a)ffwll.ch>
Cc: Francois Dugast <francois.dugast(a)intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi(a)intel.com>
Cc: Lucas De Marchi <lucas.demarchi(a)intel.com>
Cc: "Thomas Hellström" <thomas.hellstrom(a)linux.intel.com>
Cc: Sumit Semwal <sumit.semwal(a)linaro.org>
Cc: "Christian König" <christian.koenig(a)amd.com>
Cc: intel-xe(a)lists.freedesktop.org
Cc: linux-media(a)vger.kernel.org
Cc: dri-devel(a)lists.freedesktop.org
Cc: linaro-mm-sig(a)lists.linaro.org
Cc: <stable(a)vger.kernel.org> # v6.8+
Signed-off-by: John Harrison <John.C.Harrison(a)Intel.com>
Reviewed-by: Matthew Brost <matthew.brost(a)intel.com>
---
drivers/gpu/drm/xe/xe_devcoredump.c | 73 +++++++++++++++--------------
1 file changed, 39 insertions(+), 34 deletions(-)
diff --git a/drivers/gpu/drm/xe/xe_devcoredump.c b/drivers/gpu/drm/xe/xe_devcoredump.c
index baac50f6dd7e..d24f1088e298 100644
--- a/drivers/gpu/drm/xe/xe_devcoredump.c
+++ b/drivers/gpu/drm/xe/xe_devcoredump.c
@@ -168,36 +168,6 @@ static void xe_devcoredump_snapshot_free(struct xe_devcoredump_snapshot *ss)
ss->vm = NULL;
}
-static void xe_devcoredump_deferred_snap_work(struct work_struct *work)
-{
- struct xe_devcoredump_snapshot *ss = container_of(work, typeof(*ss), work);
- struct xe_devcoredump *coredump = container_of(ss, typeof(*coredump), snapshot);
- struct xe_device *xe = coredump_to_xe(coredump);
- unsigned int fw_ref;
-
- xe_pm_runtime_get(xe);
-
- /* keep going if fw fails as we still want to save the memory and SW data */
- fw_ref = xe_force_wake_get(gt_to_fw(ss->gt), XE_FORCEWAKE_ALL);
- if (!xe_force_wake_ref_has_domain(fw_ref, XE_FORCEWAKE_ALL))
- xe_gt_info(ss->gt, "failed to get forcewake for coredump capture\n");
- xe_vm_snapshot_capture_delayed(ss->vm);
- xe_guc_exec_queue_snapshot_capture_delayed(ss->ge);
- xe_force_wake_put(gt_to_fw(ss->gt), fw_ref);
-
- xe_pm_runtime_put(xe);
-
- /* Calculate devcoredump size */
- ss->read.size = __xe_devcoredump_read(NULL, INT_MAX, coredump);
-
- ss->read.buffer = kvmalloc(ss->read.size, GFP_USER);
- if (!ss->read.buffer)
- return;
-
- __xe_devcoredump_read(ss->read.buffer, ss->read.size, coredump);
- xe_devcoredump_snapshot_free(ss);
-}
-
static ssize_t xe_devcoredump_read(char *buffer, loff_t offset,
size_t count, void *data, size_t datalen)
{
@@ -246,6 +216,45 @@ static void xe_devcoredump_free(void *data)
"Xe device coredump has been deleted.\n");
}
+static void xe_devcoredump_deferred_snap_work(struct work_struct *work)
+{
+ struct xe_devcoredump_snapshot *ss = container_of(work, typeof(*ss), work);
+ struct xe_devcoredump *coredump = container_of(ss, typeof(*coredump), snapshot);
+ struct xe_device *xe = coredump_to_xe(coredump);
+ unsigned int fw_ref;
+
+ /*
+ * NB: Despite passing a GFP_ flags parameter here, more allocations are done
+ * internally using GFP_KERNEL expliictly. Hence this call must be in the worker
+ * thread and not in the initial capture call.
+ */
+ dev_coredumpm_timeout(gt_to_xe(ss->gt)->drm.dev, THIS_MODULE, coredump, 0, GFP_KERNEL,
+ xe_devcoredump_read, xe_devcoredump_free,
+ XE_COREDUMP_TIMEOUT_JIFFIES);
+
+ xe_pm_runtime_get(xe);
+
+ /* keep going if fw fails as we still want to save the memory and SW data */
+ fw_ref = xe_force_wake_get(gt_to_fw(ss->gt), XE_FORCEWAKE_ALL);
+ if (!xe_force_wake_ref_has_domain(fw_ref, XE_FORCEWAKE_ALL))
+ xe_gt_info(ss->gt, "failed to get forcewake for coredump capture\n");
+ xe_vm_snapshot_capture_delayed(ss->vm);
+ xe_guc_exec_queue_snapshot_capture_delayed(ss->ge);
+ xe_force_wake_put(gt_to_fw(ss->gt), fw_ref);
+
+ xe_pm_runtime_put(xe);
+
+ /* Calculate devcoredump size */
+ ss->read.size = __xe_devcoredump_read(NULL, INT_MAX, coredump);
+
+ ss->read.buffer = kvmalloc(ss->read.size, GFP_USER);
+ if (!ss->read.buffer)
+ return;
+
+ __xe_devcoredump_read(ss->read.buffer, ss->read.size, coredump);
+ xe_devcoredump_snapshot_free(ss);
+}
+
static void devcoredump_snapshot(struct xe_devcoredump *coredump,
struct xe_exec_queue *q,
struct xe_sched_job *job)
@@ -334,10 +343,6 @@ void xe_devcoredump(struct xe_exec_queue *q, struct xe_sched_job *job, const cha
drm_info(&xe->drm, "Xe device coredump has been created\n");
drm_info(&xe->drm, "Check your /sys/class/drm/card%d/device/devcoredump/data\n",
xe->drm.primary->index);
-
- dev_coredumpm_timeout(xe->drm.dev, THIS_MODULE, coredump, 0, GFP_KERNEL,
- xe_devcoredump_read, xe_devcoredump_free,
- XE_COREDUMP_TIMEOUT_JIFFIES);
}
static void xe_driver_devcoredump_fini(void *arg)
--
2.47.0
This patchset fixes two bugs with the async controls for the uvc driver.
They were found while implementing the granular PM, but I am sending
them as a separate patches, so they can be reviewed sooner. They fix
real issues in the driver that need to be taken care.
Signed-off-by: Ricardo Ribalda <ribalda(a)chromium.org>
---
Ricardo Ribalda (2):
media: uvcvideo: Do not set an async control owned by other fh
media: uvcvideo: Remove dangling pointers
drivers/media/usb/uvc/uvc_ctrl.c | 44 ++++++++++++++++++++++++++++++++++++++--
drivers/media/usb/uvc/uvc_v4l2.c | 2 ++
drivers/media/usb/uvc/uvcvideo.h | 3 +++
3 files changed, 47 insertions(+), 2 deletions(-)
---
base-commit: 72ad4ff638047bbbdf3232178fea4bec1f429319
change-id: 20241127-uvc-fix-async-2c9d40413ad8
Best regards,
--
Ricardo Ribalda <ribalda(a)chromium.org>
Some cameras do not return all the bytes requested from a control
if it can fit in less bytes. Eg: returning 0xab instead of 0x00ab.
Support these devices.
Also, now that we are at it, improve uvc_query_ctrl() logging.
Signed-off-by: Ricardo Ribalda <ribalda(a)chromium.org>
---
Changes in v4:
- Improve comment.
- Keep old likely(ret == size)
- Link to v3: https://lore.kernel.org/r/20241118-uvc-readless-v3-0-d97c1a3084d0@chromium.…
Changes in v3:
- Improve documentation.
- Do not change return sequence.
- Use dev_ratelimit and dev_warn_once
- Link to v2: https://lore.kernel.org/r/20241008-uvc-readless-v2-0-04d9d51aee56@chromium.…
Changes in v2:
- Rewrite error handling (Thanks Sakari)
- Discard 2/3. It is not needed after rewriting the error handling.
- Link to v1: https://lore.kernel.org/r/20241008-uvc-readless-v1-0-042ac4581f44@chromium.…
---
Ricardo Ribalda (2):
media: uvcvideo: Support partial control reads
media: uvcvideo: Add more logging to uvc_query_ctrl()
drivers/media/usb/uvc/uvc_video.c | 22 +++++++++++++++++++++-
1 file changed, 21 insertions(+), 1 deletion(-)
---
base-commit: 9852d85ec9d492ebef56dc5f229416c925758edc
change-id: 20241008-uvc-readless-23f9b8cad0b3
Best regards,
--
Ricardo Ribalda <ribalda(a)chromium.org>
This patch addresses an issue of type confusion in tls_is_tx_ready(),
as a check for NULL of list_first_entry() return value is wrong.
This issue has been given a CVE entry CVE-2023-1075 [1] and is still
present in several stable branches.
As the flawed function tls_is_tx_ready() is named is_tx_ready() and
is situated in another file (specifically, include/net/tls.h) in older
kernel versions, fix the error there instead. This adapted backport
can be cleanly applied to 5.4, 5.10 and 5.15 branches.
[PATCH 5.4/5.10/5.15 1/1] net/tls: tls_is_tx_ready() checked list_entry
Use list_first_entry_or_null() instead of list_entry() to properly
check for empty lists.
Fixes [1].
[1] https://nvd.nist.gov/vuln/detail/cve-2023-1075
[2] https://github.com/torvalds/linux/commit/ffe2a22562444720b05bdfeb999c03e810…
Changes in v5:
- In-lines devm_pm_domain_attach_list() in probe() directly - Vlad
- Link to v4: https://lore.kernel.org/r/20241127-b4-linux-next-24-11-18-clock-multiple-po…
v4:
- Adds Bjorn's RB to first patch - Bjorn
- Drops the 'd' in "and int" - Bjorn
- Amends commit log of patch 3 to capture a number of open questions -
Bjorn
- Link to v3: https://lore.kernel.org/r/20241126-b4-linux-next-24-11-18-clock-multiple-po…
v3:
- Fixes commit log "per which" - Bryan
- Link to v2: https://lore.kernel.org/r/20241125-b4-linux-next-24-11-18-clock-multiple-po…
v2:
The main change in this version is Bjorn's pointing out that pm_runtime_*
inside of the gdsc_enable/gdsc_disable path would be recursive and cause a
lockdep splat. Dmitry alluded to this too.
Bjorn pointed to stuff being done lower in the gdsc_register() routine that
might be a starting point.
I iterated around that idea and came up with patch #3. When a gdsc has no
parent and the pd_list is non-NULL then attach that orphan GDSC to the
clock controller power-domain list.
Existing subdomain code in gdsc_register() will connect the parent GDSCs in
the clock-controller to the clock-controller subdomain, the new code here
does that same job for a list of power-domains the clock controller depends
on.
To Dmitry's point about MMCX and MCX dependencies for the registers inside
of the clock controller, I have switched off all references in a test dtsi
and confirmed that accessing the clock-controller regs themselves isn't
required.
On the second point I also verified my test branch with lockdep on which
was a concern with the pm_domain version of this solution but I wanted to
cover it anyway with the new approach for completeness sake.
Here's the item-by-item list of changes:
- Adds a patch to capture pm_genpd_add_subdomain() result code - Bryan
- Changes changelog of second patch to remove singleton and generally
to make the commit log easier to understand - Bjorn
- Uses demv_pm_domain_attach_list - Vlad
- Changes error check to if (ret < 0 && ret != -EEXIST) - Vlad
- Retains passing &pd_data instead of NULL - because NULL doesn't do
the same thing - Bryan/Vlad
- Retains standalone function qcom_cc_pds_attach() because the pd_data
enumeration looks neater in a standalone function - Bryan/Vlad
- Drops pm_runtime in favour of gdsc_add_subdomain_list() for each
power-domain in the pd_list.
The pd_list will be whatever is pointed to by power-domains = <>
in the dtsi - Bjorn
- Link to v1: https://lore.kernel.org/r/20241118-b4-linux-next-24-11-18-clock-multiple-po…
v1:
On x1e80100 and it's SKUs the Camera Clock Controller - CAMCC has
multiple power-domains which power it. Usually with a single power-domain
the core platform code will automatically switch on the singleton
power-domain for you. If you have multiple power-domains for a device, in
this case the clock controller, you need to switch those power-domains
on/off yourself.
The clock controllers can also contain Global Distributed
Switch Controllers - GDSCs which themselves can be referenced from dtsi
nodes ultimately triggering a gdsc_en() in drivers/clk/qcom/gdsc.c.
As an example:
cci0: cci@ac4a000 {
power-domains = <&camcc TITAN_TOP_GDSC>;
};
This series adds the support to attach a power-domain list to the
clock-controllers and the GDSCs those controllers provide so that in the
case of the above example gdsc_toggle_logic() will trigger the power-domain
list with pm_runtime_resume_and_get() and pm_runtime_put_sync()
respectively.
Signed-off-by: Bryan O'Donoghue <bryan.odonoghue(a)linaro.org>
---
Bryan O'Donoghue (3):
clk: qcom: gdsc: Capture pm_genpd_add_subdomain result code
clk: qcom: common: Add support for power-domain attachment
clk: qcom: Support attaching GDSCs to multiple parents
drivers/clk/qcom/common.c | 10 ++++++++++
drivers/clk/qcom/gdsc.c | 41 +++++++++++++++++++++++++++++++++++++++--
drivers/clk/qcom/gdsc.h | 1 +
3 files changed, 50 insertions(+), 2 deletions(-)
---
base-commit: 744cf71b8bdfcdd77aaf58395e068b7457634b2c
change-id: 20241118-b4-linux-next-24-11-18-clock-multiple-power-domains-a5f994dc452a
Best regards,
--
Bryan O'Donoghue <bryan.odonoghue(a)linaro.org>
7d6f065de37c ("HID: i2c-hid: Use address probe to wake on resume")
replaced the retry of power commands with the dummy read "bus probe" we
use on boot which accounts for a necessary delay before retry.
This made at least one Weida device (2575:0910 in an ASUS Vivobook S14)
very unhappy, as the bus probe despite being successful somehow lead to
the following power command failing so hard that the device never lets
go of the bus. This means that even retries of the power command would
fail on a timeout as the bus remains busy.
Remove the bus probe on resume and instead reintroduce retry of the
power command for wake-up purposes while respecting the newly
established wake-up retry timings.
Fixes: 7d6f065de37c ("HID: i2c-hid: Use address probe to wake on resume")
Cc: stable(a)vger.kernel.org
Reported-by: Michael <auslands-kv(a)gmx.de>
Link: https://bugzilla.kernel.org/show_bug.cgi?id=219440
Link: https://lore.kernel.org/r/d5acb485-7377-4139-826d-4df04d21b5ed@leemhuis.inf…
Signed-off-by: Kenny Levinsen <kl(a)kl.wtf>
---
As I don't have access to the hardware in question, a test by the
reporter (Michael) would be preferred to confirm the final patch.
drivers/hid/i2c-hid/i2c-hid-core.c | 20 ++++++++++++--------
1 file changed, 12 insertions(+), 8 deletions(-)
diff --git a/drivers/hid/i2c-hid/i2c-hid-core.c b/drivers/hid/i2c-hid/i2c-hid-core.c
index 43664a24176f..4e87380d3edd 100644
--- a/drivers/hid/i2c-hid/i2c-hid-core.c
+++ b/drivers/hid/i2c-hid/i2c-hid-core.c
@@ -414,7 +414,19 @@ static int i2c_hid_set_power(struct i2c_hid *ihid, int power_state)
i2c_hid_dbg(ihid, "%s\n", __func__);
+ /*
+ * Some STM-based devices need 400µs after a rising clock edge to wake
+ * from deep sleep, in which case the first request will fail due to
+ * the address not being acknowledged. Try after a short sleep to see
+ * if the device came alive on the bus. Certain Weida Tech devices also
+ * need this.
+ */
ret = i2c_hid_set_power_command(ihid, power_state);
+ if (ret && power_state == I2C_HID_PWR_ON) {
+ usleep_range(400, 500);
+ ret = i2c_hid_set_power_command(ihid, I2C_HID_PWR_ON);
+ }
+
if (ret)
dev_err(&ihid->client->dev,
"failed to change power setting.\n");
@@ -976,14 +988,6 @@ static int i2c_hid_core_resume(struct i2c_hid *ihid)
enable_irq(client->irq);
- /* Make sure the device is awake on the bus */
- ret = i2c_hid_probe_address(ihid);
- if (ret < 0) {
- dev_err(&client->dev, "nothing at address after resume: %d\n",
- ret);
- return -ENXIO;
- }
-
/* On Goodix 27c6:0d42 wait extra time before device wakeup.
* It's not clear why but if we send wakeup too early, the device will
* never trigger input interrupts.
--
2.47.0
OPM PPM LPM
| 1.send cmd | |
|-------------------------->| |
| |-- |
| | | 2.set busy bit in CCI |
| |<- |
| 3.notify the OPM | |
|<--------------------------| |
| | 4.send cmd to be executed |
| |-------------------------->|
| | |
| | 5.cmd completed |
| |<--------------------------|
| | |
| |-- |
| | | 6.set cmd completed |
| |<- bit in CCI |
| | |
| 7.handle notification | |
| from point 3, read CCI | |
|<--------------------------| |
| | |
| 8.notify the OPM | |
|<--------------------------| |
| | |
When the PPM receives command from the OPM (p.1) it sets the busy bit
in the CCI (p.2), sends notification to the OPM (p.3) and forwards the
command to be executed by the LPM (p.4). When the PPM receives command
completion from the LPM (p.5) it sets command completion bit in the CCI
(p.6) and sends notification to the OPM (p.8). If command execution by
the LPM is fast enough then when the OPM starts handling the notification
from p.3 in p.7 and reads the CCI value it will see command completion bit
and will call complete(). Then complete() might be called again when the
OPM handles notification from p.8.
This fix replaces test_bit() with test_and_clear_bit()
in ucsi_notify_common() in order to call complete() only
once per request.
Fixes: 584e8df58942 ("usb: typec: ucsi: extract common code for command handling")
Cc: stable(a)vger.kernel.org
Signed-off-by: Łukasz Bartosik <ukaszb(a)chromium.org>
---
drivers/usb/typec/ucsi/ucsi.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/usb/typec/ucsi/ucsi.c b/drivers/usb/typec/ucsi/ucsi.c
index e0f3925e401b..7a9b987ea80c 100644
--- a/drivers/usb/typec/ucsi/ucsi.c
+++ b/drivers/usb/typec/ucsi/ucsi.c
@@ -46,11 +46,11 @@ void ucsi_notify_common(struct ucsi *ucsi, u32 cci)
ucsi_connector_change(ucsi, UCSI_CCI_CONNECTOR(cci));
if (cci & UCSI_CCI_ACK_COMPLETE &&
- test_bit(ACK_PENDING, &ucsi->flags))
+ test_and_clear_bit(ACK_PENDING, &ucsi->flags))
complete(&ucsi->complete);
if (cci & UCSI_CCI_COMMAND_COMPLETE &&
- test_bit(COMMAND_PENDING, &ucsi->flags))
+ test_and_clear_bit(COMMAND_PENDING, &ucsi->flags))
complete(&ucsi->complete);
}
EXPORT_SYMBOL_GPL(ucsi_notify_common);
--
2.47.0.199.ga7371fff76-goog
This series addresses several s390 driver vulnerabilities related to
improper handling of sensitive keys-related material and its lack
of proper disposal in stable kernel branches. These issues have been
announced as CVE-2024-42155 [1], CVE-2024-42156 [2] and
CVE-2024-42158 [4] and fixed in upstream. Another problem named as
CVE-2024-42157 [3] has already been successfully backported.
All patches have been cherry-picked and are ready to be cleanly
applied to 6.1 stable branch. Same series adapted for 6.6 version
will follow separately. Backports for 5.10/5.15 have already been
sent, see [5].
[PATCH 6.1 1/3] s390/pkey: Use kfree_sensitive() to fix Coccinelle warnings
Use kfree_sensitive() instead of kfree() and memzero_explicit().
Fixes CVE-2024-42158.
[PATCH 6.1 2/3] s390/pkey: Wipe copies of clear-key structures on failure
Properly wipe sensitive key material from stack for IOCTLs that
deal with clear-key conversion.
Fixes CVE-2024-42156.
[PATCH 6.1 3/3] s390/pkey: Wipe copies of protected- and secure-keys
Properly wipe key copies from stack for affected IOCTLs.
Fixes CVE-2024-42155.
[1] https://nvd.nist.gov/vuln/detail/CVE-2024-42155
[2] https://nvd.nist.gov/vuln/detail/CVE-2024-42156
[3] https://nvd.nist.gov/vuln/detail/CVE-2024-42157
[4] https://nvd.nist.gov/vuln/detail/CVE-2024-42158
[5] https://lore.kernel.org/all/20241128142245.18136-1-n.zhandarovich@fintech.r…
This series addresses several s390 driver vulnerabilities related to
improper handling of sensitive keys-related material and its lack
of proper disposal in stable kernel branches. These issues have been
announced as CVE-2024-42155 [1], CVE-2024-42156 [2] and
CVE-2024-42158 [4] and fixed in upstream. Another problem named as
CVE-2024-42157 [3] has already been successfully backported.
All patches have been cherry-picked and are ready to be cleanly
applied to 5.10/5.15 stable branches. Same series adapted for 6.1 and
6.6 versions will follow separately.
[PATCH 5.10/5.15 1/3] s390/pkey: Use kfree_sensitive() to fix Coccinelle warnings
Use kfree_sensitive() instead of kfree() and memzero_explicit().
Fixes CVE-2024-42158.
[PATCH 5.10/5.15 2/3] s390/pkey: Wipe copies of clear-key structures on failure
Properly wipe sensitive key material from stack for IOCTLs that
deal with clear-key conversion.
Fixes CVE-2024-42156.
[PATCH 5.10/5.15 3/3] s390/pkey: Wipe copies of protected- and secure-keys
Properly wipe key copies from stack for affected IOCTLs.
Fixes CVE-2024-42155.
[1] https://nvd.nist.gov/vuln/detail/CVE-2024-42155
[2] https://nvd.nist.gov/vuln/detail/CVE-2024-42156
[3] https://nvd.nist.gov/vuln/detail/CVE-2024-42157
[4] https://nvd.nist.gov/vuln/detail/CVE-2024-42158
From: "Jason-JH.Lin" <jason-jh.lin(a)mediatek.com>
[ Upstream commit a8bd68e4329f9a0ad1b878733e0f80be6a971649 ]
When mtk-cmdq unbinds, a WARN_ON message with condition
pm_runtime_get_sync() < 0 occurs.
According to the call tracei below:
cmdq_mbox_shutdown
mbox_free_channel
mbox_controller_unregister
__devm_mbox_controller_unregister
...
The root cause can be deduced to be calling pm_runtime_get_sync() after
calling pm_runtime_disable() as observed below:
1. CMDQ driver uses devm_mbox_controller_register() in cmdq_probe()
to bind the cmdq device to the mbox_controller, so
devm_mbox_controller_unregister() will automatically unregister
the device bound to the mailbox controller when the device-managed
resource is removed. That means devm_mbox_controller_unregister()
and cmdq_mbox_shoutdown() will be called after cmdq_remove().
2. CMDQ driver also uses devm_pm_runtime_enable() in cmdq_probe() after
devm_mbox_controller_register(), so that devm_pm_runtime_disable()
will be called after cmdq_remove(), but before
devm_mbox_controller_unregister().
To fix this problem, cmdq_probe() needs to move
devm_mbox_controller_register() after devm_pm_runtime_enable() to make
devm_pm_runtime_disable() be called after
devm_mbox_controller_unregister().
Fixes: 623a6143a845 ("mailbox: mediatek: Add Mediatek CMDQ driver")
Signed-off-by: Jason-JH.Lin <jason-jh.lin(a)mediatek.com>
Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno(a)collabora.com>
Signed-off-by: Jassi Brar <jassisinghbrar(a)gmail.com>
Signed-off-by: Bin Lan <bin.lan.cn(a)windriver.com>
---
drivers/mailbox/mtk-cmdq-mailbox.c | 12 ++++++------
1 file changed, 6 insertions(+), 6 deletions(-)
diff --git a/drivers/mailbox/mtk-cmdq-mailbox.c b/drivers/mailbox/mtk-cmdq-mailbox.c
index 4d62b07c1411..d5f5606585f4 100644
--- a/drivers/mailbox/mtk-cmdq-mailbox.c
+++ b/drivers/mailbox/mtk-cmdq-mailbox.c
@@ -623,12 +623,6 @@ static int cmdq_probe(struct platform_device *pdev)
cmdq->mbox.chans[i].con_priv = (void *)&cmdq->thread[i];
}
- err = devm_mbox_controller_register(dev, &cmdq->mbox);
- if (err < 0) {
- dev_err(dev, "failed to register mailbox: %d\n", err);
- return err;
- }
-
platform_set_drvdata(pdev, cmdq);
WARN_ON(clk_bulk_prepare(cmdq->pdata->gce_num, cmdq->clocks));
@@ -642,6 +636,12 @@ static int cmdq_probe(struct platform_device *pdev)
return err;
}
+ err = devm_mbox_controller_register(dev, &cmdq->mbox);
+ if (err < 0) {
+ dev_err(dev, "failed to register mailbox: %d\n", err);
+ return err;
+ }
+
return 0;
}
--
2.34.1
From: "Jason-JH.Lin" <jason-jh.lin(a)mediatek.com>
[ Upstream commit a8bd68e4329f9a0ad1b878733e0f80be6a971649 ]
When mtk-cmdq unbinds, a WARN_ON message with condition
pm_runtime_get_sync() < 0 occurs.
According to the call tracei below:
cmdq_mbox_shutdown
mbox_free_channel
mbox_controller_unregister
__devm_mbox_controller_unregister
...
The root cause can be deduced to be calling pm_runtime_get_sync() after
calling pm_runtime_disable() as observed below:
1. CMDQ driver uses devm_mbox_controller_register() in cmdq_probe()
to bind the cmdq device to the mbox_controller, so
devm_mbox_controller_unregister() will automatically unregister
the device bound to the mailbox controller when the device-managed
resource is removed. That means devm_mbox_controller_unregister()
and cmdq_mbox_shoutdown() will be called after cmdq_remove().
2. CMDQ driver also uses devm_pm_runtime_enable() in cmdq_probe() after
devm_mbox_controller_register(), so that devm_pm_runtime_disable()
will be called after cmdq_remove(), but before
devm_mbox_controller_unregister().
To fix this problem, cmdq_probe() needs to move
devm_mbox_controller_register() after devm_pm_runtime_enable() to make
devm_pm_runtime_disable() be called after
devm_mbox_controller_unregister().
Fixes: 623a6143a845 ("mailbox: mediatek: Add Mediatek CMDQ driver")
Signed-off-by: Jason-JH.Lin <jason-jh.lin(a)mediatek.com>
Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno(a)collabora.com>
Signed-off-by: Jassi Brar <jassisinghbrar(a)gmail.com>
[ Resolve minor conflicts ]
Signed-off-by: Bin Lan <bin.lan.cn(a)windriver.com>
---
drivers/mailbox/mtk-cmdq-mailbox.c | 12 ++++++------
1 file changed, 6 insertions(+), 6 deletions(-)
diff --git a/drivers/mailbox/mtk-cmdq-mailbox.c b/drivers/mailbox/mtk-cmdq-mailbox.c
index 9465f9081515..3d369c23970c 100644
--- a/drivers/mailbox/mtk-cmdq-mailbox.c
+++ b/drivers/mailbox/mtk-cmdq-mailbox.c
@@ -605,18 +605,18 @@ static int cmdq_probe(struct platform_device *pdev)
cmdq->mbox.chans[i].con_priv = (void *)&cmdq->thread[i];
}
- err = devm_mbox_controller_register(dev, &cmdq->mbox);
- if (err < 0) {
- dev_err(dev, "failed to register mailbox: %d\n", err);
- return err;
- }
-
platform_set_drvdata(pdev, cmdq);
WARN_ON(clk_bulk_prepare(cmdq->gce_num, cmdq->clocks));
cmdq_init(cmdq);
+ err = devm_mbox_controller_register(dev, &cmdq->mbox);
+ if (err < 0) {
+ dev_err(dev, "failed to register mailbox: %d\n", err);
+ return err;
+ }
+
return 0;
}
--
2.34.1
Good day Sir/Madam,
I am Ethan Allen, Procurement Managerr at MACHINARY&EQUIPMENT Co.
Inc. We have
bulk order requirement for export to our customers in Spain and
India.
kindly confirm if you can supply to Spain and India.
We would greatly appreciate any additional information you can
provide, as well as digital copy of your products catalog (PDF or
Online link),
information on new or featured products, pricing and packaging
details.
I look forward to reviewing your catalog.
Regards,
Ethan Allen
Procurement Manager
Northern California 3401 Bayshore Blvd, Brisbane, CA 94005
+1 415 467-3400
+1 909 599-3916
www.machineryandequipment.com
Incorrect casting is possible in 6.1 stable release using ESR_ELx_EC_*
constants.
The problem has been fixed by the following upstream patch that was adapted
to 6.1. The patch couldn't be applied clearly but the changes made are
minor.
Found by Linux Verification Center (linuxtesting.org) with SVACE.
The following commit has been merged into the timers/urgent branch of tip:
Commit-ID: f5807b0606da7ac7c1b74a386b22134ec7702d05
Gitweb: https://git.kernel.org/tip/f5807b0606da7ac7c1b74a386b22134ec7702d05
Author: Marcelo Dalmas <marcelo.dalmas(a)ge.com>
AuthorDate: Mon, 25 Nov 2024 12:16:09
Committer: Thomas Gleixner <tglx(a)linutronix.de>
CommitterDate: Thu, 28 Nov 2024 12:02:38 +01:00
ntp: Remove invalid cast in time offset math
Due to an unsigned cast, adjtimex() returns the wrong offest when using
ADJ_MICRO and the offset is negative. In this case a small negative offset
returns approximately 4.29 seconds (~ 2^32/1000 milliseconds) due to the
unsigned cast of the negative offset.
This cast was added when the kernel internal struct timex was changed to
use type long long for the time offset value to address the problem of a
64bit/32bit division on 32bit systems.
The correct cast would have been (s32), which is correct as time_offset can
only be in the range of [INT_MIN..INT_MAX] because the shift constant used
for calculating it is 32. But that's non-obvious.
Remove the cast and use div_s64() to cure the issue.
[ tglx: Fix white space damage, use div_s64() and amend the change log ]
Fixes: ead25417f82e ("timex: use __kernel_timex internally")
Signed-off-by: Marcelo Dalmas <marcelo.dalmas(a)ge.com>
Signed-off-by: Thomas Gleixner <tglx(a)linutronix.de>
Cc: stable(a)vger.kernel.org
Link: https://lore.kernel.org/all/SJ0P101MB03687BF7D5A10FD3C49C51E5F42E2@SJ0P101M…
---
kernel/time/ntp.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/kernel/time/ntp.c b/kernel/time/ntp.c
index b550ebe..163e7a2 100644
--- a/kernel/time/ntp.c
+++ b/kernel/time/ntp.c
@@ -798,7 +798,7 @@ int __do_adjtimex(struct __kernel_timex *txc, const struct timespec64 *ts,
txc->offset = shift_right(ntpdata->time_offset * NTP_INTERVAL_FREQ, NTP_SCALE_SHIFT);
if (!(ntpdata->time_status & STA_NANO))
- txc->offset = (u32)txc->offset / NSEC_PER_USEC;
+ txc->offset = div_s64(txc->offset, NSEC_PER_USEC);
}
result = ntpdata->time_state;
Recent kernels cause a lot of TCP retransmissions
[ ID] Interval Transfer Bitrate Retr Cwnd
[ 5] 0.00-1.00 sec 2.24 GBytes 19.2 Gbits/sec 2767 442 KBytes
[ 5] 1.00-2.00 sec 2.23 GBytes 19.1 Gbits/sec 2312 350 KBytes
^^^^
Replacing the qdisc with pfifo makes retransmissions go away.
It appears that a flow may have a delayed packet with a very near
Tx time. Later, we may get busy processing Rx and the target Tx time
will pass, but we won't service Tx since the CPU is busy with Rx.
If Rx sees an ACK and we try to push more data for the delayed flow
we may fastpath the skb, not realizing that there are already "ready
to send" packets for this flow sitting in the qdisc.
Don't trust the fastpath if we are "behind" according to the projected
Tx time for next flow waiting in the Qdisc. Because we consider anything
within the offload window to be okay for fastpath we must consider
the entire offload window as "now".
Qdisc config:
qdisc fq 8001: dev eth0 parent 1234:1 limit 10000p flow_limit 100p \
buckets 32768 orphan_mask 1023 bands 3 \
priomap 1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1 \
weights 589824 196608 65536 quantum 3028b initial_quantum 15140b \
low_rate_threshold 550Kbit \
refill_delay 40ms timer_slack 10us horizon 10s horizon_drop
For iperf this change seems to do fine, the reordering is gone.
The fastpath still gets used most of the time:
gc 0 highprio 0 fastpath 142614 throttled 418309 latency 19.1us
xx_behind 2731
where "xx_behind" counts how many times we hit the new "return false".
CC: stable(a)vger.kernel.org
Fixes: 076433bd78d7 ("net_sched: sch_fq: add fast path for mostly idle qdisc")
Signed-off-by: Jakub Kicinski <kuba(a)kernel.org>
---
v2:
- use Eric's condition (fix offload, don't care about throttled)
- throttled -> delayed
- explicitly CC stable, it won't build on 6.12 because of the offload
horizon, so make sure they don't just drop this
v1: https://lore.kernel.org/20241122162108.2697803-1-kuba@kernel.org
CC: jhs(a)mojatatu.com
CC: xiyou.wangcong(a)gmail.com
CC: jiri(a)resnulli.us
---
net/sched/sch_fq.c | 6 ++++++
1 file changed, 6 insertions(+)
diff --git a/net/sched/sch_fq.c b/net/sched/sch_fq.c
index a97638bef6da..a5e87f9ea986 100644
--- a/net/sched/sch_fq.c
+++ b/net/sched/sch_fq.c
@@ -332,6 +332,12 @@ static bool fq_fastpath_check(const struct Qdisc *sch, struct sk_buff *skb,
*/
if (q->internal.qlen >= 8)
return false;
+
+ /* Ordering invariants fall apart if some delayed flows
+ * are ready but we haven't serviced them, yet.
+ */
+ if (q->time_next_delayed_flow <= now + q->offload_horizon)
+ return false;
}
sk = skb->sk;
--
2.47.0
The patch titled
Subject: ocfs2: update seq_file index in ocfs2_dlm_seq_next
has been added to the -mm mm-hotfixes-unstable branch. Its filename is
ocfs2-update-seq_file-index-in-ocfs2_dlm_seq_next-v2.patch
This patch will shortly appear at
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patche…
This patch will later appear in the mm-hotfixes-unstable branch at
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next via the mm-everything
branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there every 2-3 working days
------------------------------------------------------
From: Wengang Wang <wen.gang.wang(a)oracle.com>
Subject: ocfs2: update seq_file index in ocfs2_dlm_seq_next
Date: Tue, 19 Nov 2024 09:45:00 -0800
The following INFO level message was seen:
seq_file: buggy .next function ocfs2_dlm_seq_next [ocfs2] did not
update position index
Fix:
Update *pos (so m->index) to make seq_read_iter happy though the index its
self makes no sense to ocfs2_dlm_seq_next.
Link: https://lkml.kernel.org/r/20241119174500.9198-1-wen.gang.wang@oracle.com
Signed-off-by: Wengang Wang <wen.gang.wang(a)oracle.com>
Reviewed-by: Joseph Qi <joseph.qi(a)linux.alibaba.com>
Cc: Mark Fasheh <mark(a)fasheh.com>
Cc: Joel Becker <jlbec(a)evilplan.org>
Cc: Junxiao Bi <junxiao.bi(a)oracle.com>
Cc: Changwei Ge <gechangwei(a)live.cn>
Cc: Jun Piao <piaojun(a)huawei.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
fs/ocfs2/dlmglue.c | 1 +
1 file changed, 1 insertion(+)
--- a/fs/ocfs2/dlmglue.c~ocfs2-update-seq_file-index-in-ocfs2_dlm_seq_next-v2
+++ a/fs/ocfs2/dlmglue.c
@@ -3110,6 +3110,7 @@ static void *ocfs2_dlm_seq_next(struct s
struct ocfs2_lock_res *iter = v;
struct ocfs2_lock_res *dummy = &priv->p_iter_res;
+ (*pos)++;
spin_lock(&ocfs2_dlm_tracking_lock);
iter = ocfs2_dlm_next_res(iter, priv);
list_del_init(&dummy->l_debug_list);
_
Patches currently in -mm which might be from wen.gang.wang(a)oracle.com are
ocfs2-update-seq_file-index-in-ocfs2_dlm_seq_next-v2.patch
The patch titled
Subject: stackdepot: fix stack_depot_save_flags() in NMI context
has been added to the -mm mm-hotfixes-unstable branch. Its filename is
stackdepot-fix-stack_depot_save_flags-in-nmi-context.patch
This patch will shortly appear at
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patche…
This patch will later appear in the mm-hotfixes-unstable branch at
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next via the mm-everything
branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there every 2-3 working days
------------------------------------------------------
From: Marco Elver <elver(a)google.com>
Subject: stackdepot: fix stack_depot_save_flags() in NMI context
Date: Fri, 22 Nov 2024 16:39:47 +0100
Per documentation, stack_depot_save_flags() was meant to be usable from
NMI context if STACK_DEPOT_FLAG_CAN_ALLOC is unset. However, it still
would try to take the pool_lock in an attempt to save a stack trace in the
current pool (if space is available).
This could result in deadlock if an NMI is handled while pool_lock is
already held. To avoid deadlock, only try to take the lock in NMI context
and give up if unsuccessful.
The documentation is fixed to clearly convey this.
Link: https://lkml.kernel.org/r/Z0CcyfbPqmxJ9uJH@elver.google.com
Link: https://lkml.kernel.org/r/20241122154051.3914732-1-elver@google.com
Fixes: 4434a56ec209 ("stackdepot: make fast paths lock-less again")
Signed-off-by: Marco Elver <elver(a)google.com>
Reported-by: Sebastian Andrzej Siewior <bigeasy(a)linutronix.de>
Reviewed-by: Sebastian Andrzej Siewior <bigeasy(a)linutronix.de>
Cc: Alexander Potapenko <glider(a)google.com>
Cc: Andrey Konovalov <andreyknvl(a)gmail.com>
Cc: Dmitry Vyukov <dvyukov(a)google.com>
Cc: Oscar Salvador <osalvador(a)suse.de>
Cc: Vlastimil Babka <vbabka(a)suse.cz>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
include/linux/stackdepot.h | 6 +++---
lib/stackdepot.c | 10 +++++++++-
2 files changed, 12 insertions(+), 4 deletions(-)
--- a/include/linux/stackdepot.h~stackdepot-fix-stack_depot_save_flags-in-nmi-context
+++ a/include/linux/stackdepot.h
@@ -147,7 +147,7 @@ static inline int stack_depot_early_init
* If the provided stack trace comes from the interrupt context, only the part
* up to the interrupt entry is saved.
*
- * Context: Any context, but setting STACK_DEPOT_FLAG_CAN_ALLOC is required if
+ * Context: Any context, but unsetting STACK_DEPOT_FLAG_CAN_ALLOC is required if
* alloc_pages() cannot be used from the current context. Currently
* this is the case for contexts where neither %GFP_ATOMIC nor
* %GFP_NOWAIT can be used (NMI, raw_spin_lock).
@@ -156,7 +156,7 @@ static inline int stack_depot_early_init
*/
depot_stack_handle_t stack_depot_save_flags(unsigned long *entries,
unsigned int nr_entries,
- gfp_t gfp_flags,
+ gfp_t alloc_flags,
depot_flags_t depot_flags);
/**
@@ -175,7 +175,7 @@ depot_stack_handle_t stack_depot_save_fl
* Return: Handle of the stack trace stored in depot, 0 on failure
*/
depot_stack_handle_t stack_depot_save(unsigned long *entries,
- unsigned int nr_entries, gfp_t gfp_flags);
+ unsigned int nr_entries, gfp_t alloc_flags);
/**
* __stack_depot_get_stack_record - Get a pointer to a stack_record struct
--- a/lib/stackdepot.c~stackdepot-fix-stack_depot_save_flags-in-nmi-context
+++ a/lib/stackdepot.c
@@ -630,7 +630,15 @@ depot_stack_handle_t stack_depot_save_fl
prealloc = page_address(page);
}
- raw_spin_lock_irqsave(&pool_lock, flags);
+ if (in_nmi()) {
+ /* We can never allocate in NMI context. */
+ WARN_ON_ONCE(can_alloc);
+ /* Best effort; bail if we fail to take the lock. */
+ if (!raw_spin_trylock_irqsave(&pool_lock, flags))
+ goto exit;
+ } else {
+ raw_spin_lock_irqsave(&pool_lock, flags);
+ }
printk_deferred_enter();
/* Try to find again, to avoid concurrently inserting duplicates. */
_
Patches currently in -mm which might be from elver(a)google.com are
stackdepot-fix-stack_depot_save_flags-in-nmi-context.patch
The patch titled
Subject: mm: open-code page_folio() in dump_page()
has been added to the -mm mm-hotfixes-unstable branch. Its filename is
mm-open-code-page_folio-in-dump_page.patch
This patch will shortly appear at
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patche…
This patch will later appear in the mm-hotfixes-unstable branch at
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next via the mm-everything
branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there every 2-3 working days
------------------------------------------------------
From: "Matthew Wilcox (Oracle)" <willy(a)infradead.org>
Subject: mm: open-code page_folio() in dump_page()
Date: Mon, 25 Nov 2024 20:17:19 +0000
page_folio() calls page_fixed_fake_head() which will misidentify this page
as being a fake head and load off the end of 'precise'. We may have a
pointer to a fake head, but that's OK because it contains the right
information for dump_page().
gcc-15 is smart enough to catch this with -Warray-bounds:
In function 'page_fixed_fake_head',
inlined from '_compound_head' at ../include/linux/page-flags.h:251:24,
inlined from '__dump_page' at ../mm/debug.c:123:11:
../include/asm-generic/rwonce.h:44:26: warning: array subscript 9 is outside
+array bounds of 'struct page[1]' [-Warray-bounds=]
Link: https://lkml.kernel.org/r/20241125201721.2963278-2-willy@infradead.org
Fixes: fae7d834c43c ("mm: add __dump_folio()")
Signed-off-by: Matthew Wilcox (Oracle) <willy(a)infradead.org>
Reported-by: Kees Cook <kees(a)kernel.org>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/debug.c | 7 +++++--
1 file changed, 5 insertions(+), 2 deletions(-)
--- a/mm/debug.c~mm-open-code-page_folio-in-dump_page
+++ a/mm/debug.c
@@ -124,19 +124,22 @@ static void __dump_page(const struct pag
{
struct folio *foliop, folio;
struct page precise;
+ unsigned long head;
unsigned long pfn = page_to_pfn(page);
unsigned long idx, nr_pages = 1;
int loops = 5;
again:
memcpy(&precise, page, sizeof(*page));
- foliop = page_folio(&precise);
- if (foliop == (struct folio *)&precise) {
+ head = precise.compound_head;
+ if ((head & 1) == 0) {
+ foliop = (struct folio *)&precise;
idx = 0;
if (!folio_test_large(foliop))
goto dump;
foliop = (struct folio *)page;
} else {
+ foliop = (struct folio *)(head - 1);
idx = folio_page_idx(foliop, page);
}
_
Patches currently in -mm which might be from willy(a)infradead.org are
mm-open-code-pagetail-in-folio_flags-and-const_folio_flags.patch
mm-open-code-page_folio-in-dump_page.patch
mm-page_alloc-cache-page_zone-result-in-free_unref_page.patch
mm-make-alloc_pages_mpol-static.patch
mm-page_alloc-export-free_frozen_pages-instead-of-free_unref_page.patch
mm-page_alloc-move-set_page_refcounted-to-callers-of-post_alloc_hook.patch
mm-page_alloc-move-set_page_refcounted-to-callers-of-prep_new_page.patch
mm-page_alloc-move-set_page_refcounted-to-callers-of-get_page_from_freelist.patch
mm-page_alloc-move-set_page_refcounted-to-callers-of-__alloc_pages_cpuset_fallback.patch
mm-page_alloc-move-set_page_refcounted-to-callers-of-__alloc_pages_may_oom.patch
mm-page_alloc-move-set_page_refcounted-to-callers-of-__alloc_pages_direct_compact.patch
mm-page_alloc-move-set_page_refcounted-to-callers-of-__alloc_pages_direct_reclaim.patch
mm-page_alloc-move-set_page_refcounted-to-callers-of-__alloc_pages_slowpath.patch
mm-page_alloc-move-set_page_refcounted-to-end-of-__alloc_pages.patch
mm-page_alloc-add-__alloc_frozen_pages.patch
mm-mempolicy-add-alloc_frozen_pages.patch
slab-allocate-frozen-pages.patch
The patch titled
Subject: mm: open-code PageTail in folio_flags() and const_folio_flags()
has been added to the -mm mm-hotfixes-unstable branch. Its filename is
mm-open-code-pagetail-in-folio_flags-and-const_folio_flags.patch
This patch will shortly appear at
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patche…
This patch will later appear in the mm-hotfixes-unstable branch at
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next via the mm-everything
branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there every 2-3 working days
------------------------------------------------------
From: "Matthew Wilcox (Oracle)" <willy(a)infradead.org>
Subject: mm: open-code PageTail in folio_flags() and const_folio_flags()
Date: Mon, 25 Nov 2024 20:17:18 +0000
It is unsafe to call PageTail() in dump_page() as page_is_fake_head() will
almost certainly return true when called on a head page that is copied to
the stack. That will cause the VM_BUG_ON_PGFLAGS() in const_folio_flags()
to trigger when it shouldn't. Fortunately, we don't need to call
PageTail() here; it's fine to have a pointer to a virtual alias of the
page's flag word rather than the real page's flag word.
Link: https://lkml.kernel.org/r/20241125201721.2963278-1-willy@infradead.org
Fixes: fae7d834c43c ("mm: add __dump_folio()")
Signed-off-by: Matthew Wilcox (Oracle) <willy(a)infradead.org>
Cc: <stable(a)vger.kernel.org>
Cc: Kees Cook <kees(a)kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
include/linux/page-flags.h | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
--- a/include/linux/page-flags.h~mm-open-code-pagetail-in-folio_flags-and-const_folio_flags
+++ a/include/linux/page-flags.h
@@ -306,7 +306,7 @@ static const unsigned long *const_folio_
{
const struct page *page = &folio->page;
- VM_BUG_ON_PGFLAGS(PageTail(page), page);
+ VM_BUG_ON_PGFLAGS(page->compound_head & 1, page);
VM_BUG_ON_PGFLAGS(n > 0 && !test_bit(PG_head, &page->flags), page);
return &page[n].flags;
}
@@ -315,7 +315,7 @@ static unsigned long *folio_flags(struct
{
struct page *page = &folio->page;
- VM_BUG_ON_PGFLAGS(PageTail(page), page);
+ VM_BUG_ON_PGFLAGS(page->compound_head & 1, page);
VM_BUG_ON_PGFLAGS(n > 0 && !test_bit(PG_head, &page->flags), page);
return &page[n].flags;
}
_
Patches currently in -mm which might be from willy(a)infradead.org are
mm-open-code-pagetail-in-folio_flags-and-const_folio_flags.patch
mm-open-code-page_folio-in-dump_page.patch
mm-page_alloc-cache-page_zone-result-in-free_unref_page.patch
mm-make-alloc_pages_mpol-static.patch
mm-page_alloc-export-free_frozen_pages-instead-of-free_unref_page.patch
mm-page_alloc-move-set_page_refcounted-to-callers-of-post_alloc_hook.patch
mm-page_alloc-move-set_page_refcounted-to-callers-of-prep_new_page.patch
mm-page_alloc-move-set_page_refcounted-to-callers-of-get_page_from_freelist.patch
mm-page_alloc-move-set_page_refcounted-to-callers-of-__alloc_pages_cpuset_fallback.patch
mm-page_alloc-move-set_page_refcounted-to-callers-of-__alloc_pages_may_oom.patch
mm-page_alloc-move-set_page_refcounted-to-callers-of-__alloc_pages_direct_compact.patch
mm-page_alloc-move-set_page_refcounted-to-callers-of-__alloc_pages_direct_reclaim.patch
mm-page_alloc-move-set_page_refcounted-to-callers-of-__alloc_pages_slowpath.patch
mm-page_alloc-move-set_page_refcounted-to-end-of-__alloc_pages.patch
mm-page_alloc-add-__alloc_frozen_pages.patch
mm-mempolicy-add-alloc_frozen_pages.patch
slab-allocate-frozen-pages.patch
The patch titled
Subject: mm: fix vrealloc()'s KASAN poisoning logic
has been added to the -mm mm-hotfixes-unstable branch. Its filename is
mm-fix-vreallocs-kasan-poisoning-logic.patch
This patch will shortly appear at
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patche…
This patch will later appear in the mm-hotfixes-unstable branch at
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next via the mm-everything
branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there every 2-3 working days
------------------------------------------------------
From: Andrii Nakryiko <andrii(a)kernel.org>
Subject: mm: fix vrealloc()'s KASAN poisoning logic
Date: Mon, 25 Nov 2024 16:52:06 -0800
When vrealloc() reuses already allocated vmap_area, we need to re-annotate
poisoned and unpoisoned portions of underlying memory according to the new
size.
Note, hard-coding KASAN_VMALLOC_PROT_NORMAL might not be exactly correct,
but KASAN flag logic is pretty involved and spread out throughout
__vmalloc_node_range_noprof(), so I'm using the bare minimum flag here and
leaving the rest to mm people to refactor this logic and reuse it here.
Link: https://lkml.kernel.org/r/20241126005206.3457974-1-andrii@kernel.org
Fixes: 3ddc2fefe6f3 ("mm: vmalloc: implement vrealloc()")
Signed-off-by: Andrii Nakryiko <andrii(a)kernel.org>
Cc: Alexei Starovoitov <ast(a)kernel.org>
Cc: Christoph Hellwig <hch(a)infradead.org>
Cc: Michal Hocko <mhocko(a)suse.com>
Cc: Uladzislau Rezki (Sony) <urezki(a)gmail.com>
Cc: Vlastimil Babka <vbabka(a)suse.cz>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/vmalloc.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
--- a/mm/vmalloc.c~mm-fix-vreallocs-kasan-poisoning-logic
+++ a/mm/vmalloc.c
@@ -4093,7 +4093,8 @@ void *vrealloc_noprof(const void *p, siz
/* Zero out spare memory. */
if (want_init_on_alloc(flags))
memset((void *)p + size, 0, old_size - size);
-
+ kasan_poison_vmalloc(p + size, old_size - size);
+ kasan_unpoison_vmalloc(p, size, KASAN_VMALLOC_PROT_NORMAL);
return (void *)p;
}
_
Patches currently in -mm which might be from andrii(a)kernel.org are
mm-fix-vreallocs-kasan-poisoning-logic.patch
The patch titled
Subject: Revert "readahead: properly shorten readahead when falling back to do_page_cache_ra()"
has been added to the -mm mm-hotfixes-unstable branch. Its filename is
revert-readahead-properly-shorten-readahead-when-falling-back-to-do_page_cache_ra.patch
This patch will shortly appear at
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patche…
This patch will later appear in the mm-hotfixes-unstable branch at
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next via the mm-everything
branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there every 2-3 working days
------------------------------------------------------
From: Jan Kara <jack(a)suse.cz>
Subject: Revert "readahead: properly shorten readahead when falling back to do_page_cache_ra()"
Date: Tue, 26 Nov 2024 15:52:08 +0100
This reverts commit 7c877586da3178974a8a94577b6045a48377ff25.
Anders and Philippe have reported that recent kernels occasionally hang
when used with NFS in readahead code. The problem has been bisected to
7c877586da3 ("readahead: properly shorten readahead when falling back to
do_page_cache_ra()"). The cause of the problem is that ra->size can be
shrunk by read_pages() call and subsequently we end up calling
do_page_cache_ra() with negative (read huge positive) number of pages.
Let's revert 7c877586da3 for now until we can find a proper way how the
logic in read_pages() and page_cache_ra_order() can coexist. This can
lead to reduced readahead throughput due to readahead window confusion but
that's better than outright hangs.
Link: https://lkml.kernel.org/r/20241126145208.985-1-jack@suse.cz
Fixes: 7c877586da31 ("readahead: properly shorten readahead when falling back to do_page_cache_ra()")
Reported-by: Anders Blomdell <anders.blomdell(a)gmail.com>
Reported-by: Philippe Troin <phil(a)fifi.org>
Signed-off-by: Jan Kara <jack(a)suse.cz>
Tested-by: Philippe Troin <phil(a)fifi.org>
Cc: Matthew Wilcox <willy(a)infradead.org>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/readahead.c | 5 ++---
1 file changed, 2 insertions(+), 3 deletions(-)
--- a/mm/readahead.c~revert-readahead-properly-shorten-readahead-when-falling-back-to-do_page_cache_ra
+++ a/mm/readahead.c
@@ -460,8 +460,7 @@ void page_cache_ra_order(struct readahea
struct file_ra_state *ra, unsigned int new_order)
{
struct address_space *mapping = ractl->mapping;
- pgoff_t start = readahead_index(ractl);
- pgoff_t index = start;
+ pgoff_t index = readahead_index(ractl);
unsigned int min_order = mapping_min_folio_order(mapping);
pgoff_t limit = (i_size_read(mapping->host) - 1) >> PAGE_SHIFT;
pgoff_t mark = index + ra->size - ra->async_size;
@@ -524,7 +523,7 @@ void page_cache_ra_order(struct readahea
if (!err)
return;
fallback:
- do_page_cache_ra(ractl, ra->size - (index - start), ra->async_size);
+ do_page_cache_ra(ractl, ra->size, ra->async_size);
}
static unsigned long ractl_max_pages(struct readahead_control *ractl,
_
Patches currently in -mm which might be from jack(a)suse.cz are
revert-readahead-properly-shorten-readahead-when-falling-back-to-do_page_cache_ra.patch
The patch titled
Subject: mm: vmscan: ensure kswapd is woken up if the wait queue is active
has been added to the -mm mm-hotfixes-unstable branch. Its filename is
mm-vmscan-ensure-kswapd-is-woken-up-if-the-wait-queue-is-active.patch
This patch will shortly appear at
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patche…
This patch will later appear in the mm-hotfixes-unstable branch at
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next via the mm-everything
branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there every 2-3 working days
------------------------------------------------------
From: Seiji Nishikawa <snishika(a)redhat.com>
Subject: mm: vmscan: ensure kswapd is woken up if the wait queue is active
Date: Wed, 27 Nov 2024 00:06:12 +0900
Even after commit 501b26510ae3 ("vmstat: allow_direct_reclaim should use
zone_page_state_snapshot"), a task may remain indefinitely stuck in
throttle_direct_reclaim() while holding mm->rwsem.
__alloc_pages_nodemask
try_to_free_pages
throttle_direct_reclaim
This can cause numerous other tasks to wait on the same rwsem, leading
to severe system hangups:
[1088963.358712] INFO: task python3:1670971 blocked for more than 120 seconds.
[1088963.365653] Tainted: G OE -------- - - 4.18.0-553.el8_10.aarch64 #1
[1088963.373887] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[1088963.381862] task:python3 state:D stack:0 pid:1670971 ppid:1667117 flags:0x00800080
[1088963.381869] Call trace:
[1088963.381872] __switch_to+0xd0/0x120
[1088963.381877] __schedule+0x340/0xac8
[1088963.381881] schedule+0x68/0x118
[1088963.381886] rwsem_down_read_slowpath+0x2d4/0x4b8
The issue arises when allow_direct_reclaim(pgdat) returns false,
preventing progress even when the pgdat->pfmemalloc_wait wait queue is
empty. Despite the wait queue being empty, the condition,
allow_direct_reclaim(pgdat), may still be returning false, causing it to
continue looping.
In some cases, reclaimable pages exist (zone_reclaimable_pages() returns
> 0), but calculations of pfmemalloc_reserve and free_pages result in
wmark_ok being false.
And then, despite the pgdat->kswapd_wait queue being non-empty, kswapd
is not woken up, further exacerbating the problem:
crash> px ((struct pglist_data *) 0xffff00817fffe540)->kswapd_highest_zoneidx
$775 = __MAX_NR_ZONES
This patch modifies allow_direct_reclaim() to wake kswapd if the
pgdat->kswapd_wait queue is active, regardless of whether wmark_ok is true
or false. This change ensures kswapd does not miss wake-ups under high
memory pressure, reducing the risk of task stalls in the throttled reclaim
path.
Link: https://lkml.kernel.org/r/20241126150612.114561-1-snishika@redhat.com
Signed-off-by: Seiji Nishikawa <snishika(a)redhat.com>
Cc: Mel Gorman <mgorman(a)techsingularity.net>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/vmscan.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
--- a/mm/vmscan.c~mm-vmscan-ensure-kswapd-is-woken-up-if-the-wait-queue-is-active
+++ a/mm/vmscan.c
@@ -6389,8 +6389,8 @@ static bool allow_direct_reclaim(pg_data
wmark_ok = free_pages > pfmemalloc_reserve / 2;
- /* kswapd must be awake if processes are being throttled */
- if (!wmark_ok && waitqueue_active(&pgdat->kswapd_wait)) {
+ /* Always wake up kswapd if the wait queue is not empty */
+ if (waitqueue_active(&pgdat->kswapd_wait)) {
if (READ_ONCE(pgdat->kswapd_highest_zoneidx) > ZONE_NORMAL)
WRITE_ONCE(pgdat->kswapd_highest_zoneidx, ZONE_NORMAL);
_
Patches currently in -mm which might be from snishika(a)redhat.com are
mm-vmscan-ensure-kswapd-is-woken-up-if-the-wait-queue-is-active.patch
The patch titled
Subject: selftests/damon: add _damon_sysfs.py to TEST_FILES
has been added to the -mm mm-hotfixes-unstable branch. Its filename is
selftests-damon-add-_damon_sysfspy-to-test_files.patch
This patch will shortly appear at
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patche…
This patch will later appear in the mm-hotfixes-unstable branch at
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next via the mm-everything
branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there every 2-3 working days
------------------------------------------------------
From: Maximilian Heyne <mheyne(a)amazon.de>
Subject: selftests/damon: add _damon_sysfs.py to TEST_FILES
Date: Wed, 27 Nov 2024 12:08:53 +0000
When running selftests I encountered the following error message with
some damon tests:
# Traceback (most recent call last):
# File "[...]/damon/./damos_quota.py", line 7, in <module>
# import _damon_sysfs
# ModuleNotFoundError: No module named '_damon_sysfs'
Fix this by adding the _damon_sysfs.py file to TEST_FILES so that it
will be available when running the respective damon selftests.
Link: https://lkml.kernel.org/r/20241127-picks-visitor-7416685b-mheyne@amazon.de
Fixes: 306abb63a8ca ("selftests/damon: implement a python module for test-purpose DAMON sysfs controls")
Signed-off-by: Maximilian Heyne <mheyne(a)amazon.de>
Reviewed-by: SeongJae Park <sj(a)kernel.org>
Cc: Shuah Khan <shuah(a)kernel.org>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
tools/testing/selftests/damon/Makefile | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
--- a/tools/testing/selftests/damon/Makefile~selftests-damon-add-_damon_sysfspy-to-test_files
+++ a/tools/testing/selftests/damon/Makefile
@@ -6,7 +6,7 @@ TEST_GEN_FILES += debugfs_target_ids_rea
TEST_GEN_FILES += debugfs_target_ids_pid_leak
TEST_GEN_FILES += access_memory access_memory_even
-TEST_FILES = _chk_dependency.sh _debugfs_common.sh
+TEST_FILES = _chk_dependency.sh _debugfs_common.sh _damon_sysfs.py
# functionality tests
TEST_PROGS = debugfs_attrs.sh debugfs_schemes.sh debugfs_target_ids.sh
_
Patches currently in -mm which might be from mheyne(a)amazon.de are
selftests-damon-add-_damon_sysfspy-to-test_files.patch
The patch titled
Subject: selftest: hugetlb_dio: fix test naming
has been added to the -mm mm-hotfixes-unstable branch. Its filename is
selftest-hugetlb_dio-fix-test-naming.patch
This patch will shortly appear at
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patche…
This patch will later appear in the mm-hotfixes-unstable branch at
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next via the mm-everything
branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there every 2-3 working days
------------------------------------------------------
From: Mark Brown <broonie(a)kernel.org>
Subject: selftest: hugetlb_dio: fix test naming
Date: Wed, 27 Nov 2024 16:14:22 +0000
The string logged when a test passes or fails is used by the selftest
framework to identify which test is being reported. The hugetlb_dio test
not only uses the same strings for every test that is run but it also uses
different strings for test passes and failures which means that test
automation is unable to follow what the test is doing at all.
Pull the existing duplicated logging of the number of free huge pages
before and after the test out of the conditional and replace that and the
logging of the result with a single ksft_print_result() which incorporates
the parameters passed into the test into the output.
Link: https://lkml.kernel.org/r/20241127-kselftest-mm-hugetlb-dio-names-v1-1-22aa…
Fixes: fae1980347bf ("selftests: hugetlb_dio: fixup check for initial conditions to skip in the start")
Signed-off-by: Mark Brown <broonie(a)kernel.org>
Cc: Donet Tom <donettom(a)linux.ibm.com>
Cc: Muhammad Usama Anjum <usama.anjum(a)collabora.com>
Cc: Ritesh Harjani (IBM) <ritesh.list(a)gmail.com>
Cc: Shuah Khan <shuah(a)kernel.org>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
tools/testing/selftests/mm/hugetlb_dio.c | 14 +++++---------
1 file changed, 5 insertions(+), 9 deletions(-)
--- a/tools/testing/selftests/mm/hugetlb_dio.c~selftest-hugetlb_dio-fix-test-naming
+++ a/tools/testing/selftests/mm/hugetlb_dio.c
@@ -76,19 +76,15 @@ void run_dio_using_hugetlb(unsigned int
/* Get the free huge pages after unmap*/
free_hpage_a = get_free_hugepages();
+ ksft_print_msg("No. Free pages before allocation : %d\n", free_hpage_b);
+ ksft_print_msg("No. Free pages after munmap : %d\n", free_hpage_a);
+
/*
* If the no. of free hugepages before allocation and after unmap does
* not match - that means there could still be a page which is pinned.
*/
- if (free_hpage_a != free_hpage_b) {
- ksft_print_msg("No. Free pages before allocation : %d\n", free_hpage_b);
- ksft_print_msg("No. Free pages after munmap : %d\n", free_hpage_a);
- ksft_test_result_fail(": Huge pages not freed!\n");
- } else {
- ksft_print_msg("No. Free pages before allocation : %d\n", free_hpage_b);
- ksft_print_msg("No. Free pages after munmap : %d\n", free_hpage_a);
- ksft_test_result_pass(": Huge pages freed successfully !\n");
- }
+ ksft_test_result(free_hpage_a == free_hpage_b,
+ "free huge pages from %u-%u\n", start_off, end_off);
}
int main(void)
_
Patches currently in -mm which might be from broonie(a)kernel.org are
selftest-hugetlb_dio-fix-test-naming.patch
The patch titled
Subject: arch_numa: restore nid checks before registering a memblock with a node
has been added to the -mm mm-hotfixes-unstable branch. Its filename is
arch_numa-restore-nid-checks-before-registering-a-memblock-with-a-node.patch
This patch will shortly appear at
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patche…
This patch will later appear in the mm-hotfixes-unstable branch at
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next via the mm-everything
branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there every 2-3 working days
------------------------------------------------------
From: Marc Zyngier <maz(a)kernel.org>
Subject: arch_numa: restore nid checks before registering a memblock with a node
Date: Wed, 27 Nov 2024 19:30:00 +0000
Commit 767507654c22 ("arch_numa: switch over to numa_memblks")
significantly cleaned up the NUMA registration code, but also dropped a
significant check that was refusing to accept to configure a memblock with
an invalid nid.
On "quality hardware" such as my ThunderX machine, this results
in a kernel that dies immediately:
[ 0.000000] Booting Linux on physical CPU 0x0000000000 [0x431f0a10]
[ 0.000000] Linux version 6.12.0-00013-g8920d74cf8db (maz@valley-girl) (gcc (Debian 12.2.0-14) 12.2.0, GNU ld (GNU Binutils for Debian) 2.40) #3872 SMP PREEMPT Wed Nov 27 15:25:49 GMT 2024
[ 0.000000] KASLR disabled due to lack of seed
[ 0.000000] Machine model: Cavium ThunderX CN88XX board
[ 0.000000] efi: EFI v2.4 by American Megatrends
[ 0.000000] efi: ESRT=0xffce0ff18 SMBIOS 3.0=0xfffb0000 ACPI 2.0=0xffec60000 MEMRESERVE=0xffc905d98
[ 0.000000] esrt: Reserving ESRT space from 0x0000000ffce0ff18 to 0x0000000ffce0ff50.
[ 0.000000] earlycon: pl11 at MMIO 0x000087e024000000 (options '115200n8')
[ 0.000000] printk: legacy bootconsole [pl11] enabled
[ 0.000000] NODE_DATA(0) allocated [mem 0xff6754580-0xff67566bf]
[ 0.000000] Unable to handle kernel paging request at virtual address 0000000000001d40
[ 0.000000] Mem abort info:
[ 0.000000] ESR = 0x0000000096000004
[ 0.000000] EC = 0x25: DABT (current EL), IL = 32 bits
[ 0.000000] SET = 0, FnV = 0
[ 0.000000] EA = 0, S1PTW = 0
[ 0.000000] FSC = 0x04: level 0 translation fault
[ 0.000000] Data abort info:
[ 0.000000] ISV = 0, ISS = 0x00000004, ISS2 = 0x00000000
[ 0.000000] CM = 0, WnR = 0, TnD = 0, TagAccess = 0
[ 0.000000] GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
[ 0.000000] [0000000000001d40] user address but active_mm is swapper
[ 0.000000] Internal error: Oops: 0000000096000004 [#1] PREEMPT SMP
[ 0.000000] Modules linked in:
[ 0.000000] CPU: 0 UID: 0 PID: 0 Comm: swapper Not tainted 6.12.0-00013-g8920d74cf8db #3872
[ 0.000000] Hardware name: Cavium ThunderX CN88XX board (DT)
[ 0.000000] pstate: a00000c5 (NzCv daIF -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[ 0.000000] pc : sparse_init_nid+0x54/0x428
[ 0.000000] lr : sparse_init+0x118/0x240
[ 0.000000] sp : ffff800081da3cb0
[ 0.000000] x29: ffff800081da3cb0 x28: 0000000fedbab10c x27: 0000000000000001
[ 0.000000] x26: 0000000ffee250f8 x25: 0000000000000001 x24: ffff800082102cd0
[ 0.000000] x23: 0000000000000001 x22: 0000000000000000 x21: 00000000001fffff
[ 0.000000] x20: 0000000000000001 x19: 0000000000000000 x18: ffffffffffffffff
[ 0.000000] x17: 0000000001b00000 x16: 0000000ffd130000 x15: 0000000000000000
[ 0.000000] x14: 00000000003e0000 x13: 00000000000001c8 x12: 0000000000000014
[ 0.000000] x11: ffff800081e82860 x10: ffff8000820fb2c8 x9 : ffff8000820fb490
[ 0.000000] x8 : 0000000000ffed20 x7 : 0000000000000014 x6 : 00000000001fffff
[ 0.000000] x5 : 00000000ffffffff x4 : 0000000000000000 x3 : 0000000000000000
[ 0.000000] x2 : 0000000000000000 x1 : 0000000000000040 x0 : 0000000000000007
[ 0.000000] Call trace:
[ 0.000000] sparse_init_nid+0x54/0x428
[ 0.000000] sparse_init+0x118/0x240
[ 0.000000] bootmem_init+0x70/0x1c8
[ 0.000000] setup_arch+0x184/0x270
[ 0.000000] start_kernel+0x74/0x670
[ 0.000000] __primary_switched+0x80/0x90
[ 0.000000] Code: f865d804 d37df060 cb030000 d2800003 (b95d4084)
[ 0.000000] ---[ end trace 0000000000000000 ]---
[ 0.000000] Kernel panic - not syncing: Attempted to kill the idle task!
[ 0.000000] ---[ end Kernel panic - not syncing: Attempted to kill the idle task! ]---
while previous kernel versions were able to recognise how brain-damaged
the machine is, and only build a fake node.
Restoring the check brings back some sanity and a "working" system.
Link: https://lkml.kernel.org/r/20241127193000.3702637-1-maz@kernel.org
Fixes: 767507654c22 ("arch_numa: switch over to numa_memblks")
Signed-off-by: Marc Zyngier <maz(a)kernel.org>
Cc: Mike Rapoport <rppt(a)kernel.org>
Cc: Catalin Marinas <catalin.marinas(a)arm.com>
Cc: Will Deacon <will(a)kernel.org>
Cc: Zi Yan <ziy(a)nvidia.com>
Cc: Dan Williams <dan.j.williams(a)intel.com>
Cc: David Hildenbrand <david(a)redhat.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
drivers/base/arch_numa.c | 14 ++++++++++++++
1 file changed, 14 insertions(+)
--- a/drivers/base/arch_numa.c~arch_numa-restore-nid-checks-before-registering-a-memblock-with-a-node
+++ a/drivers/base/arch_numa.c
@@ -207,7 +207,21 @@ static void __init setup_node_data(int n
static int __init numa_register_nodes(void)
{
int nid;
+ struct memblock_region *mblk;
+ /* Check that valid nid is set to memblks */
+ for_each_mem_region(mblk) {
+ int mblk_nid = memblock_get_region_node(mblk);
+ phys_addr_t start = mblk->base;
+ phys_addr_t end = mblk->base + mblk->size - 1;
+
+ if (mblk_nid == NUMA_NO_NODE || mblk_nid >= MAX_NUMNODES) {
+ pr_warn("Warning: invalid memblk node %d [mem %pap-%pap]\n",
+ mblk_nid, &start, &end);
+ return -EINVAL;
+ }
+ }
+
/* Finally register nodes. */
for_each_node_mask(nid, numa_nodes_parsed) {
unsigned long start_pfn, end_pfn;
_
Patches currently in -mm which might be from maz(a)kernel.org are
arch_numa-restore-nid-checks-before-registering-a-memblock-with-a-node.patch
The quilt patch titled
Subject: fs/proc/kcore.c: clear ret value in read_kcore_iter after successful iov_iter_zero
has been removed from the -mm tree. Its filename was
fs-proc-kcorec-clear-ret-value-in-read_kcore_iter-after-successful-iov_iter_zero.patch
This patch was dropped because it was merged into mainline or a subsystem tree
------------------------------------------------------
From: Jiri Olsa <jolsa(a)kernel.org>
Subject: fs/proc/kcore.c: clear ret value in read_kcore_iter after successful iov_iter_zero
Date: Fri, 22 Nov 2024 00:11:18 +0100
If iov_iter_zero succeeds after failed copy_from_kernel_nofault, we need
to reset the ret value to zero otherwise it will be returned as final
return value of read_kcore_iter.
This fixes objdump -d dump over /proc/kcore for me.
Link: https://lkml.kernel.org/r/20241121231118.3212000-1-jolsa@kernel.org
Fixes: 3d5854d75e31 ("fs/proc/kcore.c: allow translation of physical memory addresses")
Signed-off-by: Jiri Olsa <jolsa(a)kernel.org>
Cc: Alexander Gordeev <agordeev(a)linux.ibm.com>
Cc: Christian Brauner <brauner(a)kernel.org>
Cc: <hca(a)linux.ibm.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
fs/proc/kcore.c | 1 +
1 file changed, 1 insertion(+)
--- a/fs/proc/kcore.c~fs-proc-kcorec-clear-ret-value-in-read_kcore_iter-after-successful-iov_iter_zero
+++ a/fs/proc/kcore.c
@@ -600,6 +600,7 @@ static ssize_t read_kcore_iter(struct ki
ret = -EFAULT;
goto out;
}
+ ret = 0;
/*
* We know the bounce buffer is safe to copy from, so
* use _copy_to_iter() directly.
_
Patches currently in -mm which might be from jolsa(a)kernel.org are
Despite CM_IDLEST1_CORE and CM_FCLKEN1_CORE behaving normal,
disabling SPI leads to messages like:
Powerdomain (core_pwrdm) didn't enter target state 0
and according to /sys/kernel/debug/pm_debug/count off state is not
entered. That was not connected to SPI during the discussion
of disabling SPI. See:
https://lore.kernel.org/linux-omap/20230122100852.32ae082c@aktux/
The reason is that SPI is per default in slave mode. Linux driver
will turn it to master per default. It slave mode, the powerdomain seems to
be kept active if active chip select input is sensed.
Fix that by explicitly disabling the SPI3 pins which are muxed by
the bootloader since they are available on an optionally fitted header
which would require dtb overlays anyways.
Fixes: a622310f7f01 ("ARM: dts: gta04: fix excess dma channel usage")
CC: stable(a)vger.kernel.org
Signed-off-by: Andreas Kemnade <andreas(a)kemnade.info>
---
arch/arm/boot/dts/ti/omap/omap3-gta04.dtsi | 10 ++++++++++
1 file changed, 10 insertions(+)
diff --git a/arch/arm/boot/dts/ti/omap/omap3-gta04.dtsi b/arch/arm/boot/dts/ti/omap/omap3-gta04.dtsi
index 3661340009e7a..3940909a5aac7 100644
--- a/arch/arm/boot/dts/ti/omap/omap3-gta04.dtsi
+++ b/arch/arm/boot/dts/ti/omap/omap3-gta04.dtsi
@@ -446,6 +446,7 @@ &omap3_pmx_core2 {
pinctrl-names = "default";
pinctrl-0 = <
&hsusb2_2_pins
+ &mcspi3hog_pins
>;
hsusb2_2_pins: hsusb2-2-pins {
@@ -459,6 +460,15 @@ OMAP3630_CORE2_IOPAD(0x25fa, PIN_INPUT_PULLDOWN | MUX_MODE3) /* etk_d15.hsusb2_d
>;
};
+ mcspi3hog_pins: mcspi3hog-pins {
+ pinctrl-single,pins = <
+ OMAP3630_CORE2_IOPAD(0x25dc, PIN_OUTPUT_PULLDOWN | MUX_MODE7) /* etk_d0 */
+ OMAP3630_CORE2_IOPAD(0x25de, PIN_OUTPUT_PULLDOWN | MUX_MODE7) /* etk_d1 */
+ OMAP3630_CORE2_IOPAD(0x25e0, PIN_OUTPUT_PULLDOWN | MUX_MODE7) /* etk_d2 */
+ OMAP3630_CORE2_IOPAD(0x25e2, PIN_OUTPUT_PULLDOWN | MUX_MODE7) /* etk_d3 */
+ >;
+ };
+
spi_gpio_pins: spi-gpio-pinmux-pins {
pinctrl-single,pins = <
OMAP3630_CORE2_IOPAD(0x25d8, PIN_OUTPUT | MUX_MODE4) /* clk */
--
2.39.2
From: Ville Syrjälä <ville.syrjala(a)linux.intel.com>
DSB LUT register writes vs. palette anti-collision logic
appear to interact in interesting ways:
- posted DSB writes simply vanish into thin air while
anti-collision is active
- non-posted DSB writes actually get blocked by the anti-collision
logic, but unfortunately this ends up hogging the bus for
long enough that unrelated parallel CPU MMIO accesses start
to disappear instead
Even though we are updating the LUT during vblank we aren't
immune to the anti-collision logic because it kicks in brifly
for pipe prefill (initiated at frame start). The safe time
window for performing the LUT update is thus between the
undelayed vblank and frame start. Turns out that with low
enough CDCLK frequency (DSB execution speed depends on CDCLK)
we can exceed that.
As we are currently using non-posted writes for the legacy LUT
updates, in which case we can hit the far more severe failure
mode. The problem is exacerbated by the fact that non-posted
writes are much slower than posted writes (~4x it seems).
To mititage the problem let's switch to using posted DSB
writes for legacy LUT updates (which will involve using the
double write approach to avoid other problems with DSB
vs. legacy LUT writes). Despite writing each register twice
this will in fact make the legacy LUT update faster when
compared to the non-posted write approach, making the
problem less likely to appear. The failure mode is also
less severe.
This isn't the 100% solution we need though. That will involve
estimating how long the LUT update will take, and pushing
frame start and/or delayed vblank forward to guarantee that
the update will have finished by the time the pipe prefill
starts...
Cc: stable(a)vger.kernel.org
Fixes: 34d8311f4a1c ("drm/i915/dsb: Re-instate DSB for LUT updates")
Fixes: 25ea3411bd23 ("drm/i915/dsb: Use non-posted register writes for legacy LUT")
Closes: https://gitlab.freedesktop.org/drm/i915/kernel/-/issues/12494
Signed-off-by: Ville Syrjälä <ville.syrjala(a)linux.intel.com>
---
drivers/gpu/drm/i915/display/intel_color.c | 30 ++++++++++++++--------
1 file changed, 20 insertions(+), 10 deletions(-)
diff --git a/drivers/gpu/drm/i915/display/intel_color.c b/drivers/gpu/drm/i915/display/intel_color.c
index 6ea3d5c58cb1..7cd902bbd244 100644
--- a/drivers/gpu/drm/i915/display/intel_color.c
+++ b/drivers/gpu/drm/i915/display/intel_color.c
@@ -1368,19 +1368,29 @@ static void ilk_load_lut_8(const struct intel_crtc_state *crtc_state,
lut = blob->data;
/*
- * DSB fails to correctly load the legacy LUT
- * unless we either write each entry twice,
- * or use non-posted writes
+ * DSB fails to correctly load the legacy LUT unless
+ * we either write each entry twice when using posted
+ * writes, or we use non-posted writes.
+ *
+ * If palette anti-collision is active during LUT
+ * register writes:
+ * - posted writes simply get dropped and thus the LUT
+ * contents may not be correctly updated
+ * - non-posted writes are blocked and thus the LUT
+ * contents are always correct, but simultaneous CPU
+ * MMIO access will start to fail
+ *
+ * Choose the lesser of two evils and use posted writes.
+ * Using posted writes is also faster, even when having
+ * to write each register twice.
*/
- if (crtc_state->dsb_color_vblank)
- intel_dsb_nonpost_start(crtc_state->dsb_color_vblank);
-
- for (i = 0; i < 256; i++)
+ for (i = 0; i < 256; i++) {
ilk_lut_write(crtc_state, LGC_PALETTE(pipe, i),
i9xx_lut_8(&lut[i]));
-
- if (crtc_state->dsb_color_vblank)
- intel_dsb_nonpost_end(crtc_state->dsb_color_vblank);
+ if (crtc_state->dsb_color_vblank)
+ ilk_lut_write(crtc_state, LGC_PALETTE(pipe, i),
+ i9xx_lut_8(&lut[i]));
+ }
}
static void ilk_load_lut_10(const struct intel_crtc_state *crtc_state,
--
2.45.2
From: Claudiu Beznea <claudiu.beznea.uj(a)bp.renesas.com>
The early_console_setup() function initializes the sci_ports[0].port with
an object of type struct uart_port obtained from the object of type
struct earlycon_device received as argument by the early_console_setup().
It may happen that later, when the rest of the serial ports are probed,
the serial port that was used as earlycon (e.g., port A) to be mapped to a
different position in sci_ports[] and the slot 0 to be used by a different
serial port (e.g., port B), as follows:
sci_ports[0] = port A
sci_ports[X] = port B
In this case, the new port mapped at index zero will have associated data
that was used for earlycon.
In case this happens, after Linux boot, any access to the serial port that
maps on sci_ports[0] (port A) will block the serial port that was used as
earlycon (port B).
To fix this, add early_console_exit() that clean the sci_ports[0] at
earlycon exit time.
Fixes: 0b0cced19ab1 ("serial: sh-sci: Add CONFIG_SERIAL_EARLYCON support")
Cc: stable(a)vger.kernel.org
Signed-off-by: Claudiu Beznea <claudiu.beznea.uj(a)bp.renesas.com>
---
drivers/tty/serial/sh-sci.c | 28 ++++++++++++++++++++++++++++
1 file changed, 28 insertions(+)
diff --git a/drivers/tty/serial/sh-sci.c b/drivers/tty/serial/sh-sci.c
index 8e2d534401fa..2f8188bdb251 100644
--- a/drivers/tty/serial/sh-sci.c
+++ b/drivers/tty/serial/sh-sci.c
@@ -3546,6 +3546,32 @@ sh_early_platform_init_buffer("earlyprintk", &sci_driver,
#ifdef CONFIG_SERIAL_SH_SCI_EARLYCON
static struct plat_sci_port port_cfg __initdata;
+static int early_console_exit(struct console *co)
+{
+ struct sci_port *sci_port = &sci_ports[0];
+ struct uart_port *port = &sci_port->port;
+ unsigned long flags;
+ int locked = 1;
+
+ if (port->sysrq)
+ locked = 0;
+ else if (oops_in_progress)
+ locked = uart_port_trylock_irqsave(port, &flags);
+ else
+ uart_port_lock_irqsave(port, &flags);
+
+ /*
+ * Clean the slot used by earlycon. A new SCI device might
+ * map to this slot.
+ */
+ memset(sci_ports, 0, sizeof(*sci_port));
+
+ if (locked)
+ uart_port_unlock_irqrestore(port, flags);
+
+ return 0;
+}
+
static int __init early_console_setup(struct earlycon_device *device,
int type)
{
@@ -3562,6 +3588,8 @@ static int __init early_console_setup(struct earlycon_device *device,
SCSCR_RE | SCSCR_TE | port_cfg.scscr);
device->con->write = serial_console_write;
+ device->con->exit = early_console_exit;
+
return 0;
}
static int __init sci_early_console_setup(struct earlycon_device *device,
--
2.39.2
From: lei lu <llfamsec(a)gmail.com>
[ Upstream commit fb63435b7c7dc112b1ae1baea5486e0a6e27b196 ]
There is a lack of verification of the space occupied by fixed members
of xlog_op_header in the xlog_recover_process_data.
We can create a crafted image to trigger an out of bounds read by
following these steps:
1) Mount an image of xfs, and do some file operations to leave records
2) Before umounting, copy the image for subsequent steps to simulate
abnormal exit. Because umount will ensure that tail_blk and
head_blk are the same, which will result in the inability to enter
xlog_recover_process_data
3) Write a tool to parse and modify the copied image in step 2
4) Make the end of the xlog_op_header entries only 1 byte away from
xlog_rec_header->h_size
5) xlog_rec_header->h_num_logops++
6) Modify xlog_rec_header->h_crc
Fix:
Add a check to make sure there is sufficient space to access fixed members
of xlog_op_header.
Signed-off-by: lei lu <llfamsec(a)gmail.com>
Reviewed-by: Dave Chinner <dchinner(a)redhat.com>
Reviewed-by: Darrick J. Wong <djwong(a)kernel.org>
Signed-off-by: Chandan Babu R <chandanbabu(a)kernel.org>
Signed-off-by: Bin Lan <bin.lan.cn(a)windriver.com>
---
fs/xfs/xfs_log_recover.c | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/fs/xfs/xfs_log_recover.c b/fs/xfs/xfs_log_recover.c
index 9f9d3abad2cf..d11de0fa5c5f 100644
--- a/fs/xfs/xfs_log_recover.c
+++ b/fs/xfs/xfs_log_recover.c
@@ -2456,7 +2456,10 @@ xlog_recover_process_data(
ohead = (struct xlog_op_header *)dp;
dp += sizeof(*ohead);
- ASSERT(dp <= end);
+ if (dp > end) {
+ xfs_warn(log->l_mp, "%s: op header overrun", __func__);
+ return -EFSCORRUPTED;
+ }
/* errors will abort recovery */
error = xlog_recover_process_ophdr(log, rhash, rhead, ohead,
--
2.34.1
From: Alex Hung <alex.hung(a)amd.com>
[ Upstream commit 3718a619a8c0a53152e76bb6769b6c414e1e83f4 ]
dcn32_enable_phantom_stream can return null, so returned value
must be checked before used.
This fixes 1 NULL_RETURNS issue reported by Coverity.
Reviewed-by: Rodrigo Siqueira <rodrigo.siqueira(a)amd.com>
Signed-off-by: Jerry Zuo <jerry.zuo(a)amd.com>
Signed-off-by: Alex Hung <alex.hung(a)amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler(a)amd.com>
Signed-off-by: Alex Deucher <alexander.deucher(a)amd.com>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
[Xiangyu: BP to fix CVE: CVE-2024-49897, modified the source path]
Signed-off-by: Xiangyu Chen <xiangyu.chen(a)windriver.com>
---
drivers/gpu/drm/amd/display/dc/dcn32/dcn32_resource.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/drivers/gpu/drm/amd/display/dc/dcn32/dcn32_resource.c b/drivers/gpu/drm/amd/display/dc/dcn32/dcn32_resource.c
index 2b8700b291a4..ef47fb2f6905 100644
--- a/drivers/gpu/drm/amd/display/dc/dcn32/dcn32_resource.c
+++ b/drivers/gpu/drm/amd/display/dc/dcn32/dcn32_resource.c
@@ -1796,6 +1796,9 @@ void dcn32_add_phantom_pipes(struct dc *dc, struct dc_state *context,
// be a valid candidate for SubVP (i.e. has a plane, stream, doesn't
// already have phantom pipe assigned, etc.) by previous checks.
phantom_stream = dcn32_enable_phantom_stream(dc, context, pipes, pipe_cnt, index);
+ if (!phantom_stream)
+ return;
+
dcn32_enable_phantom_plane(dc, context, phantom_stream, index);
for (i = 0; i < dc->res_pool->pipe_count; i++) {
--
2.25.1
From: Srinivasan Shanmugam <srinivasan.shanmugam(a)amd.com>
[ Upstream commit 62ed6f0f198da04e884062264df308277628004f ]
This commit adds a null check for the set_output_gamma function pointer
in the dcn20_set_output_transfer_func function. Previously,
set_output_gamma was being checked for null at line 1030, but then it
was being dereferenced without any null check at line 1048. This could
potentially lead to a null pointer dereference error if set_output_gamma
is null.
To fix this, we now ensure that set_output_gamma is not null before
dereferencing it. We do this by adding a null check for set_output_gamma
before the call to set_output_gamma at line 1048.
Cc: Tom Chung <chiahsuan.chung(a)amd.com>
Cc: Rodrigo Siqueira <Rodrigo.Siqueira(a)amd.com>
Cc: Roman Li <roman.li(a)amd.com>
Cc: Alex Hung <alex.hung(a)amd.com>
Cc: Aurabindo Pillai <aurabindo.pillai(a)amd.com>
Cc: Harry Wentland <harry.wentland(a)amd.com>
Cc: Hamza Mahfooz <hamza.mahfooz(a)amd.com>
Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam(a)amd.com>
Reviewed-by: Tom Chung <chiahsuan.chung(a)amd.com>
Signed-off-by: Alex Deucher <alexander.deucher(a)amd.com>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
Signed-off-by: Xiangyu Chen <xiangyu.chen(a)windriver.com>
---
drivers/gpu/drm/amd/display/dc/dcn20/dcn20_hwseq.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_hwseq.c b/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_hwseq.c
index 9bd6a5716cdc..81b1ab55338a 100644
--- a/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_hwseq.c
+++ b/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_hwseq.c
@@ -856,7 +856,8 @@ bool dcn20_set_output_transfer_func(struct dc *dc, struct pipe_ctx *pipe_ctx,
/*
* if above if is not executed then 'params' equal to 0 and set in bypass
*/
- mpc->funcs->set_output_gamma(mpc, mpcc_id, params);
+ if (mpc->funcs->set_output_gamma)
+ mpc->funcs->set_output_gamma(mpc, mpcc_id, params);
return true;
}
--
2.43.0
From: lei lu <llfamsec(a)gmail.com>
[ Upstream commit fb63435b7c7dc112b1ae1baea5486e0a6e27b196 ]
There is a lack of verification of the space occupied by fixed members
of xlog_op_header in the xlog_recover_process_data.
We can create a crafted image to trigger an out of bounds read by
following these steps:
1) Mount an image of xfs, and do some file operations to leave records
2) Before umounting, copy the image for subsequent steps to simulate
abnormal exit. Because umount will ensure that tail_blk and
head_blk are the same, which will result in the inability to enter
xlog_recover_process_data
3) Write a tool to parse and modify the copied image in step 2
4) Make the end of the xlog_op_header entries only 1 byte away from
xlog_rec_header->h_size
5) xlog_rec_header->h_num_logops++
6) Modify xlog_rec_header->h_crc
Fix:
Add a check to make sure there is sufficient space to access fixed members
of xlog_op_header.
Signed-off-by: lei lu <llfamsec(a)gmail.com>
Reviewed-by: Dave Chinner <dchinner(a)redhat.com>
Reviewed-by: Darrick J. Wong <djwong(a)kernel.org>
Signed-off-by: Chandan Babu R <chandanbabu(a)kernel.org>
Signed-off-by: Bin Lan <bin.lan.cn(a)windriver.com>
---
fs/xfs/xfs_log_recover.c | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/fs/xfs/xfs_log_recover.c b/fs/xfs/xfs_log_recover.c
index affe94356ed1..006a376c34b2 100644
--- a/fs/xfs/xfs_log_recover.c
+++ b/fs/xfs/xfs_log_recover.c
@@ -2439,7 +2439,10 @@ xlog_recover_process_data(
ohead = (struct xlog_op_header *)dp;
dp += sizeof(*ohead);
- ASSERT(dp <= end);
+ if (dp > end) {
+ xfs_warn(log->l_mp, "%s: op header overrun", __func__);
+ return -EFSCORRUPTED;
+ }
/* errors will abort recovery */
error = xlog_recover_process_ophdr(log, rhash, rhead, ohead,
--
2.34.1
From: Srinivasan Shanmugam <srinivasan.shanmugam(a)amd.com>
[ Upstream commit c395fd47d1565bd67671f45cca281b3acc2c31ef ]
This commit addresses a potential null pointer dereference issue in the
`dcn32_init_hw` function. The issue could occur when `dc->clk_mgr` is
null.
The fix adds a check to ensure `dc->clk_mgr` is not null before
accessing its functions. This prevents a potential null pointer
dereference.
Reported by smatch:
drivers/gpu/drm/amd/amdgpu/../display/dc/hwss/dcn32/dcn32_hwseq.c:961 dcn32_init_hw() error: we previously assumed 'dc->clk_mgr' could be null (see line 782)
Cc: Tom Chung <chiahsuan.chung(a)amd.com>
Cc: Rodrigo Siqueira <Rodrigo.Siqueira(a)amd.com>
Cc: Roman Li <roman.li(a)amd.com>
Cc: Alex Hung <alex.hung(a)amd.com>
Cc: Aurabindo Pillai <aurabindo.pillai(a)amd.com>
Cc: Harry Wentland <harry.wentland(a)amd.com>
Cc: Hamza Mahfooz <hamza.mahfooz(a)amd.com>
Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam(a)amd.com>
Reviewed-by: Alex Hung <alex.hung(a)amd.com>
Signed-off-by: Alex Deucher <alexander.deucher(a)amd.com>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
[Xiangyu: BP to fix CVE: CVE-2024-49915, modified the source path]
Signed-off-by: Xiangyu Chen <xiangyu.chen(a)windriver.com>
---
drivers/gpu/drm/amd/display/dc/dcn32/dcn32_hwseq.c | 7 ++++---
1 file changed, 4 insertions(+), 3 deletions(-)
diff --git a/drivers/gpu/drm/amd/display/dc/dcn32/dcn32_hwseq.c b/drivers/gpu/drm/amd/display/dc/dcn32/dcn32_hwseq.c
index d3ad13bf35c8..55a24d9f5b14 100644
--- a/drivers/gpu/drm/amd/display/dc/dcn32/dcn32_hwseq.c
+++ b/drivers/gpu/drm/amd/display/dc/dcn32/dcn32_hwseq.c
@@ -811,7 +811,7 @@ void dcn32_init_hw(struct dc *dc)
int edp_num;
uint32_t backlight = MAX_BACKLIGHT_LEVEL;
- if (dc->clk_mgr && dc->clk_mgr->funcs->init_clocks)
+ if (dc->clk_mgr && dc->clk_mgr->funcs && dc->clk_mgr->funcs->init_clocks)
dc->clk_mgr->funcs->init_clocks(dc->clk_mgr);
// Initialize the dccg
@@ -970,10 +970,11 @@ void dcn32_init_hw(struct dc *dc)
if (!dcb->funcs->is_accelerated_mode(dcb) && dc->res_pool->hubbub->funcs->init_watermarks)
dc->res_pool->hubbub->funcs->init_watermarks(dc->res_pool->hubbub);
- if (dc->clk_mgr->funcs->notify_wm_ranges)
+ if (dc->clk_mgr && dc->clk_mgr->funcs && dc->clk_mgr->funcs->notify_wm_ranges)
dc->clk_mgr->funcs->notify_wm_ranges(dc->clk_mgr);
- if (dc->clk_mgr->funcs->set_hard_max_memclk && !dc->clk_mgr->dc_mode_softmax_enabled)
+ if (dc->clk_mgr && dc->clk_mgr->funcs && dc->clk_mgr->funcs->set_hard_max_memclk &&
+ !dc->clk_mgr->dc_mode_softmax_enabled)
dc->clk_mgr->funcs->set_hard_max_memclk(dc->clk_mgr);
if (dc->res_pool->hubbub->funcs->force_pstate_change_control)
--
2.25.1
From: Srinivasan Shanmugam <srinivasan.shanmugam(a)amd.com>
[ Upstream commit cba7fec864172dadd953daefdd26e01742b71a6a ]
This commit addresses a potential null pointer dereference issue in the
`dcn30_init_hw` function. The issue could occur when `dc->clk_mgr` or
`dc->clk_mgr->funcs` is null.
The fix adds a check to ensure `dc->clk_mgr` and `dc->clk_mgr->funcs` is
not null before accessing its functions. This prevents a potential null
pointer dereference.
Reported by smatch:
drivers/gpu/drm/amd/amdgpu/../display/dc/hwss/dcn30/dcn30_hwseq.c:789 dcn30_init_hw() error: we previously assumed 'dc->clk_mgr' could be null (see line 628)
Cc: Tom Chung <chiahsuan.chung(a)amd.com>
Cc: Rodrigo Siqueira <Rodrigo.Siqueira(a)amd.com>
Cc: Roman Li <roman.li(a)amd.com>
Cc: Alex Hung <alex.hung(a)amd.com>
Cc: Aurabindo Pillai <aurabindo.pillai(a)amd.com>
Cc: Harry Wentland <harry.wentland(a)amd.com>
Cc: Hamza Mahfooz <hamza.mahfooz(a)amd.com>
Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam(a)amd.com>
Reviewed-by: Alex Hung <alex.hung(a)amd.com>
Signed-off-by: Alex Deucher <alexander.deucher(a)amd.com>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
[Xiangyu: BP to fix CVE: CVE-2024-49917, modified the source path]
Signed-off-by: Xiangyu Chen <xiangyu.chen(a)windriver.com>
---
drivers/gpu/drm/amd/display/dc/dcn30/dcn30_hwseq.c | 7 ++++---
1 file changed, 4 insertions(+), 3 deletions(-)
diff --git a/drivers/gpu/drm/amd/display/dc/dcn30/dcn30_hwseq.c b/drivers/gpu/drm/amd/display/dc/dcn30/dcn30_hwseq.c
index ba4a1e7f196d..b8653bdfc40f 100644
--- a/drivers/gpu/drm/amd/display/dc/dcn30/dcn30_hwseq.c
+++ b/drivers/gpu/drm/amd/display/dc/dcn30/dcn30_hwseq.c
@@ -440,7 +440,7 @@ void dcn30_init_hw(struct dc *dc)
int edp_num;
uint32_t backlight = MAX_BACKLIGHT_LEVEL;
- if (dc->clk_mgr && dc->clk_mgr->funcs->init_clocks)
+ if (dc->clk_mgr && dc->clk_mgr->funcs && dc->clk_mgr->funcs->init_clocks)
dc->clk_mgr->funcs->init_clocks(dc->clk_mgr);
// Initialize the dccg
@@ -599,11 +599,12 @@ void dcn30_init_hw(struct dc *dc)
if (!dcb->funcs->is_accelerated_mode(dcb) && dc->res_pool->hubbub->funcs->init_watermarks)
dc->res_pool->hubbub->funcs->init_watermarks(dc->res_pool->hubbub);
- if (dc->clk_mgr->funcs->notify_wm_ranges)
+ if (dc->clk_mgr && dc->clk_mgr->funcs && dc->clk_mgr->funcs->notify_wm_ranges)
dc->clk_mgr->funcs->notify_wm_ranges(dc->clk_mgr);
//if softmax is enabled then hardmax will be set by a different call
- if (dc->clk_mgr->funcs->set_hard_max_memclk && !dc->clk_mgr->dc_mode_softmax_enabled)
+ if (dc->clk_mgr && dc->clk_mgr->funcs && dc->clk_mgr->funcs->set_hard_max_memclk &&
+ !dc->clk_mgr->dc_mode_softmax_enabled)
dc->clk_mgr->funcs->set_hard_max_memclk(dc->clk_mgr);
if (dc->res_pool->hubbub->funcs->force_pstate_change_control)
--
2.25.1
v4:
- Adds Bjorn's RB to first patch - Bjorn
- Drops the 'd' in "and int" - Bjorn
- Amends commit log of patch 3 to capture a number of open questions -
Bjorn
- Link to v3: https://lore.kernel.org/r/20241126-b4-linux-next-24-11-18-clock-multiple-po…
v3:
- Fixes commit log "per which" - Bryan
- Link to v2: https://lore.kernel.org/r/20241125-b4-linux-next-24-11-18-clock-multiple-po…
v2:
The main change in this version is Bjorn's pointing out that pm_runtime_*
inside of the gdsc_enable/gdsc_disable path would be recursive and cause a
lockdep splat. Dmitry alluded to this too.
Bjorn pointed to stuff being done lower in the gdsc_register() routine that
might be a starting point.
I iterated around that idea and came up with patch #3. When a gdsc has no
parent and the pd_list is non-NULL then attach that orphan GDSC to the
clock controller power-domain list.
Existing subdomain code in gdsc_register() will connect the parent GDSCs in
the clock-controller to the clock-controller subdomain, the new code here
does that same job for a list of power-domains the clock controller depends
on.
To Dmitry's point about MMCX and MCX dependencies for the registers inside
of the clock controller, I have switched off all references in a test dtsi
and confirmed that accessing the clock-controller regs themselves isn't
required.
On the second point I also verified my test branch with lockdep on which
was a concern with the pm_domain version of this solution but I wanted to
cover it anyway with the new approach for completeness sake.
Here's the item-by-item list of changes:
- Adds a patch to capture pm_genpd_add_subdomain() result code - Bryan
- Changes changelog of second patch to remove singleton and generally
to make the commit log easier to understand - Bjorn
- Uses demv_pm_domain_attach_list - Vlad
- Changes error check to if (ret < 0 && ret != -EEXIST) - Vlad
- Retains passing &pd_data instead of NULL - because NULL doesn't do
the same thing - Bryan/Vlad
- Retains standalone function qcom_cc_pds_attach() because the pd_data
enumeration looks neater in a standalone function - Bryan/Vlad
- Drops pm_runtime in favour of gdsc_add_subdomain_list() for each
power-domain in the pd_list.
The pd_list will be whatever is pointed to by power-domains = <>
in the dtsi - Bjorn
- Link to v1: https://lore.kernel.org/r/20241118-b4-linux-next-24-11-18-clock-multiple-po…
v1:
On x1e80100 and it's SKUs the Camera Clock Controller - CAMCC has
multiple power-domains which power it. Usually with a single power-domain
the core platform code will automatically switch on the singleton
power-domain for you. If you have multiple power-domains for a device, in
this case the clock controller, you need to switch those power-domains
on/off yourself.
The clock controllers can also contain Global Distributed
Switch Controllers - GDSCs which themselves can be referenced from dtsi
nodes ultimately triggering a gdsc_en() in drivers/clk/qcom/gdsc.c.
As an example:
cci0: cci@ac4a000 {
power-domains = <&camcc TITAN_TOP_GDSC>;
};
This series adds the support to attach a power-domain list to the
clock-controllers and the GDSCs those controllers provide so that in the
case of the above example gdsc_toggle_logic() will trigger the power-domain
list with pm_runtime_resume_and_get() and pm_runtime_put_sync()
respectively.
Signed-off-by: Bryan O'Donoghue <bryan.odonoghue(a)linaro.org>
---
Bryan O'Donoghue (3):
clk: qcom: gdsc: Capture pm_genpd_add_subdomain result code
clk: qcom: common: Add support for power-domain attachment
clk: qcom: Support attaching GDSCs to multiple parents
drivers/clk/qcom/common.c | 21 +++++++++++++++++++++
drivers/clk/qcom/gdsc.c | 41 +++++++++++++++++++++++++++++++++++++++--
drivers/clk/qcom/gdsc.h | 1 +
3 files changed, 61 insertions(+), 2 deletions(-)
---
base-commit: 744cf71b8bdfcdd77aaf58395e068b7457634b2c
change-id: 20241118-b4-linux-next-24-11-18-clock-multiple-power-domains-a5f994dc452a
Best regards,
--
Bryan O'Donoghue <bryan.odonoghue(a)linaro.org>
From: Chuck Lever <chuck.lever(a)oracle.com>
Testing shows that the EBUSY error return from mtree_alloc_cyclic()
leaks into user space. The ERRORS section of "man creat(2)" says:
> EBUSY O_EXCL was specified in flags and pathname refers
> to a block device that is in use by the system
> (e.g., it is mounted).
ENOSPC is closer to what applications expect in this situation.
Note that the normal range of simple directory offset values is
2..2^63, so hitting this error is going to be rare to impossible.
Fixes: 6faddda69f62 ("libfs: Add directory operations for stable offsets")
Cc: <stable(a)vger.kernel.org> # v6.9+
Reviewed-by: Jeff Layton <jlayton(a)kernel.org>
Reviewed-by: Yang Erkun <yangerkun(a)huawei.com>
Signed-off-by: Chuck Lever <chuck.lever(a)oracle.com>
---
fs/libfs.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/fs/libfs.c b/fs/libfs.c
index 46966fd8bcf9..bf67954b525b 100644
--- a/fs/libfs.c
+++ b/fs/libfs.c
@@ -288,7 +288,9 @@ int simple_offset_add(struct offset_ctx *octx, struct dentry *dentry)
ret = mtree_alloc_cyclic(&octx->mt, &offset, dentry, DIR_OFFSET_MIN,
LONG_MAX, &octx->next_offset, GFP_KERNEL);
- if (ret < 0)
+ if (unlikely(ret == -EBUSY))
+ return -ENOSPC;
+ if (unlikely(ret < 0))
return ret;
offset_set(dentry, offset);
--
2.47.0
The following commit has been merged into the timers/urgent branch of tip:
Commit-ID: 299130166e70124956c865a66a3669a61db1c212
Gitweb: https://git.kernel.org/tip/299130166e70124956c865a66a3669a61db1c212
Author: Marcelo Dalmas <marcelo.dalmas(a)ge.com>
AuthorDate: Mon, 25 Nov 2024 12:16:09
Committer: Thomas Gleixner <tglx(a)linutronix.de>
CommitterDate: Wed, 27 Nov 2024 15:18:45 +01:00
ntp: Remove invalid cast in time offset math
Due to an unsigned cast, adjtimex() returns the wrong offest when using
ADJ_MICRO and the offset is negative. In this case a small negative offset
returns approximately 4.29 seconds (~ 2^32/1000 milliseconds) due to the
unsigned cast of the negative offset.
Remove the cast and restore the signed division to cure that issue.
Fixes: ead25417f82e ("timex: use __kernel_timex internally")
Signed-off-by: Marcelo Dalmas <marcelo.dalmas(a)ge.com>
Signed-off-by: Thomas Gleixner <tglx(a)linutronix.de>
Cc: stable(a)vger.kernel.org
Link: https://lore.kernel.org/all/SJ0P101MB03687BF7D5A10FD3C49C51E5F42E2@SJ0P101M…
---
kernel/time/ntp.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/kernel/time/ntp.c b/kernel/time/ntp.c
index b550ebe..02e7fe6 100644
--- a/kernel/time/ntp.c
+++ b/kernel/time/ntp.c
@@ -798,7 +798,7 @@ int __do_adjtimex(struct __kernel_timex *txc, const struct timespec64 *ts,
txc->offset = shift_right(ntpdata->time_offset * NTP_INTERVAL_FREQ, NTP_SCALE_SHIFT);
if (!(ntpdata->time_status & STA_NANO))
- txc->offset = (u32)txc->offset / NSEC_PER_USEC;
+ txc->offset /= NSEC_PER_USEC;
}
result = ntpdata->time_state;
From: Claudiu Beznea <claudiu.beznea.uj(a)bp.renesas.com>
On the Renesas RZ/G3S, when doing suspend to RAM, the uart_suspend_port()
is called. The uart_suspend_port() calls 3 times the
struct uart_port::ops::tx_empty() before shutting down the port.
According to the documentation, the struct uart_port::ops::tx_empty()
API tests whether the transmitter FIFO and shifter for the port is
empty.
The Renesas RZ/G3S SCIFA IP reports the number of data units stored in the
transmit FIFO through the FDR (FIFO Data Count Register). The data units
in the FIFOs are written in the shift register and transmitted from there.
The TEND bit in the Serial Status Register reports if the data was
transmitted from the shift register.
In the previous code, in the tx_empty() API implemented by the sh-sci
driver, it is considered that the TX is empty if the hardware reports the
TEND bit set and the number of data units in the FIFO is zero.
According to the HW manual, the TEND bit has the following meaning:
0: Transmission is in the waiting state or in progress.
1: Transmission is completed.
It has been noticed that when opening the serial device w/o using it and
then switch to a power saving mode, the tx_empty() call in the
uart_port_suspend() function fails, leading to the "Unable to drain
transmitter" message being printed on the console. This is because the
TEND=0 if nothing has been transmitted and the FIFOs are empty. As the
TEND=0 has double meaning (waiting state, in progress) we can't
determined the scenario described above.
Add a software workaround for this. This sets a variable if any data has
been sent on the serial console (when using PIO) or if the DMA callback has
been called (meaning something has been transmitted). In the tx_empty()
API the status of the DMA transaction is also checked and if it is
completed or in progress the code falls back in checking the hardware
registers instead of relying on the software variable.
Fixes: 73a19e4c0301 ("serial: sh-sci: Add DMA support.")
Cc: stable(a)vger.kernel.org
Signed-off-by: Claudiu Beznea <claudiu.beznea.uj(a)bp.renesas.com>
---
Changes in v3:
- s/first_time_tx/tx_occurred/g
- checked the DMA status in sci_tx_empty() through sci_dma_check_tx_occurred()
function; added this new function as the DMA support is conditioned by
the CONFIG_SERIAL_SH_SCI_DMA flag
- dropped the tx_occurred initialization in sci_shutdown() as it is already
initialized in sci_startup()
- adjusted the commit message to reflect latest changes
Changes in v2:
- use bool type instead of atomic_t
drivers/tty/serial/sh-sci.c | 26 ++++++++++++++++++++++++++
1 file changed, 26 insertions(+)
diff --git a/drivers/tty/serial/sh-sci.c b/drivers/tty/serial/sh-sci.c
index 136e0c257af1..ade151ff39d2 100644
--- a/drivers/tty/serial/sh-sci.c
+++ b/drivers/tty/serial/sh-sci.c
@@ -157,6 +157,7 @@ struct sci_port {
bool has_rtscts;
bool autorts;
+ bool tx_occurred;
};
#define SCI_NPORTS CONFIG_SERIAL_SH_SCI_NR_UARTS
@@ -850,6 +851,7 @@ static void sci_transmit_chars(struct uart_port *port)
{
struct tty_port *tport = &port->state->port;
unsigned int stopped = uart_tx_stopped(port);
+ struct sci_port *s = to_sci_port(port);
unsigned short status;
unsigned short ctrl;
int count;
@@ -885,6 +887,7 @@ static void sci_transmit_chars(struct uart_port *port)
}
sci_serial_out(port, SCxTDR, c);
+ s->tx_occurred = true;
port->icount.tx++;
} while (--count > 0);
@@ -1241,6 +1244,8 @@ static void sci_dma_tx_complete(void *arg)
if (kfifo_len(&tport->xmit_fifo) < WAKEUP_CHARS)
uart_write_wakeup(port);
+ s->tx_occurred = true;
+
if (!kfifo_is_empty(&tport->xmit_fifo)) {
s->cookie_tx = 0;
schedule_work(&s->work_tx);
@@ -1731,6 +1736,16 @@ static void sci_flush_buffer(struct uart_port *port)
s->cookie_tx = -EINVAL;
}
}
+
+static void sci_dma_check_tx_occurred(struct sci_port *s)
+{
+ struct dma_tx_state state;
+ enum dma_status status;
+
+ status = dmaengine_tx_status(s->chan_tx, s->cookie_tx, &state);
+ if (status == DMA_COMPLETE || status == DMA_IN_PROGRESS)
+ s->tx_occurred = true;
+}
#else /* !CONFIG_SERIAL_SH_SCI_DMA */
static inline void sci_request_dma(struct uart_port *port)
{
@@ -1740,6 +1755,10 @@ static inline void sci_free_dma(struct uart_port *port)
{
}
+static void sci_dma_check_tx_occurred(struct sci_port *s)
+{
+}
+
#define sci_flush_buffer NULL
#endif /* !CONFIG_SERIAL_SH_SCI_DMA */
@@ -2076,6 +2095,12 @@ static unsigned int sci_tx_empty(struct uart_port *port)
{
unsigned short status = sci_serial_in(port, SCxSR);
unsigned short in_tx_fifo = sci_txfill(port);
+ struct sci_port *s = to_sci_port(port);
+
+ sci_dma_check_tx_occurred(s);
+
+ if (!s->tx_occurred)
+ return TIOCSER_TEMT;
return (status & SCxSR_TEND(port)) && !in_tx_fifo ? TIOCSER_TEMT : 0;
}
@@ -2247,6 +2272,7 @@ static int sci_startup(struct uart_port *port)
dev_dbg(port->dev, "%s(%d)\n", __func__, port->line);
+ s->tx_occurred = false;
sci_request_dma(port);
ret = sci_request_irq(s);
--
2.39.2
From: Alexander Sverdlin <alexander.sverdlin(a)siemens.com>
The problem apparetly has been known since the conversion to
raw_spin_lock() (commit 4dbada2be460
("gpio: omap: use raw locks for locking")).
Symptom:
[ BUG: Invalid wait context ]
5.10.214
-----------------------------
swapper/1 is trying to lock:
(enable_lock){....}-{3:3}, at: clk_enable_lock
other info that might help us debug this:
context-{5:5}
2 locks held by swapper/1:
#0: (&dev->mutex){....}-{4:4}, at: device_driver_attach
#1: (&bank->lock){....}-{2:2}, at: omap_gpio_set_config
stack backtrace:
CPU: 0 PID: 1 Comm: swapper Not tainted 5.10.214
Hardware name: Generic AM33XX (Flattened Device Tree)
unwind_backtrace
show_stack
__lock_acquire
lock_acquire.part.0
_raw_spin_lock_irqsave
clk_enable_lock
clk_enable
omap_gpio_set_config
gpio_keys_setup_key
gpio_keys_probe
platform_drv_probe
really_probe
driver_probe_device
device_driver_attach
__driver_attach
bus_for_each_dev
bus_add_driver
driver_register
do_one_initcall
do_initcalls
kernel_init_freeable
kernel_init
Problematic spin_lock_irqsave(&enable_lock, ...) is being called by
clk_enable()/clk_disable() in omap2_set_gpio_debounce() and
omap_clear_gpio_debounce().
For omap2_set_gpio_debounce() it's possible to move
raw_spin_lock_irqsave(&bank->lock, ...) inside omap2_set_gpio_debounce()
so that the locks nest as follows:
clk_enable(bank->dbck)
raw_spin_lock_irqsave(&bank->lock, ...)
raw_spin_unlock_irqrestore()
clk_disable()
Two call-sites of omap_clear_gpio_debounce() are more convoluted, but one
can take the advantage of the nesting nature of clk_enable()/clk_disable(),
so that the inner clk_disable() becomes lockless:
clk_enable(bank->dbck) <-- only to clk_enable_lock()
raw_spin_lock_irqsave(&bank->lock, ...)
omap_clear_gpio_debounce()
clk_disable() <-- becomes lockless
raw_spin_unlock_irqrestore()
clk_disable()
Cc: stable(a)vger.kernel.org
Fixes: 4dbada2be460 ("gpio: omap: use raw locks for locking")
Signed-off-by: Alexander Sverdlin <alexander.sverdlin(a)siemens.com>
---
drivers/gpio/gpio-omap.c | 35 ++++++++++++++++++++++++++++++-----
1 file changed, 30 insertions(+), 5 deletions(-)
diff --git a/drivers/gpio/gpio-omap.c b/drivers/gpio/gpio-omap.c
index 7ad4534054962..f9e502aa57753 100644
--- a/drivers/gpio/gpio-omap.c
+++ b/drivers/gpio/gpio-omap.c
@@ -181,6 +181,7 @@ static inline void omap_gpio_dbck_disable(struct gpio_bank *bank)
static int omap2_set_gpio_debounce(struct gpio_bank *bank, unsigned offset,
unsigned debounce)
{
+ unsigned long flags;
u32 val;
u32 l;
bool enable = !!debounce;
@@ -196,13 +197,18 @@ static int omap2_set_gpio_debounce(struct gpio_bank *bank, unsigned offset,
l = BIT(offset);
+ /*
+ * Ordering is important here: clk_enable() calls spin_lock_irqsave(),
+ * therefore it must be outside of the following raw_spin_lock_irqsave()
+ */
clk_enable(bank->dbck);
+ raw_spin_lock_irqsave(&bank->lock, flags);
+
writel_relaxed(debounce, bank->base + bank->regs->debounce);
val = omap_gpio_rmw(bank->base + bank->regs->debounce_en, l, enable);
bank->dbck_enable_mask = val;
- clk_disable(bank->dbck);
/*
* Enable debounce clock per module.
* This call is mandatory because in omap_gpio_request() when
@@ -217,6 +223,9 @@ static int omap2_set_gpio_debounce(struct gpio_bank *bank, unsigned offset,
bank->context.debounce_en = val;
}
+ raw_spin_unlock_irqrestore(&bank->lock, flags);
+ clk_disable(bank->dbck);
+
return 0;
}
@@ -647,6 +656,13 @@ static void omap_gpio_irq_shutdown(struct irq_data *d)
unsigned long flags;
unsigned offset = d->hwirq;
+ /*
+ * Enable the clock here so that the nested clk_disable() in the
+ * following omap_clear_gpio_debounce() is lockless
+ */
+ if (bank->dbck_flag)
+ clk_enable(bank->dbck);
+
raw_spin_lock_irqsave(&bank->lock, flags);
bank->irq_usage &= ~(BIT(offset));
omap_set_gpio_triggering(bank, offset, IRQ_TYPE_NONE);
@@ -656,6 +672,9 @@ static void omap_gpio_irq_shutdown(struct irq_data *d)
omap_clear_gpio_debounce(bank, offset);
omap_disable_gpio_module(bank, offset);
raw_spin_unlock_irqrestore(&bank->lock, flags);
+
+ if (bank->dbck_flag)
+ clk_disable(bank->dbck);
}
static void omap_gpio_irq_bus_lock(struct irq_data *data)
@@ -827,6 +846,13 @@ static void omap_gpio_free(struct gpio_chip *chip, unsigned offset)
struct gpio_bank *bank = gpiochip_get_data(chip);
unsigned long flags;
+ /*
+ * Enable the clock here so that the nested clk_disable() in the
+ * following omap_clear_gpio_debounce() is lockless
+ */
+ if (bank->dbck_flag)
+ clk_enable(bank->dbck);
+
raw_spin_lock_irqsave(&bank->lock, flags);
bank->mod_usage &= ~(BIT(offset));
if (!LINE_USED(bank->irq_usage, offset)) {
@@ -836,6 +862,9 @@ static void omap_gpio_free(struct gpio_chip *chip, unsigned offset)
omap_disable_gpio_module(bank, offset);
raw_spin_unlock_irqrestore(&bank->lock, flags);
+ if (bank->dbck_flag)
+ clk_disable(bank->dbck);
+
pm_runtime_put(chip->parent);
}
@@ -913,15 +942,11 @@ static int omap_gpio_debounce(struct gpio_chip *chip, unsigned offset,
unsigned debounce)
{
struct gpio_bank *bank;
- unsigned long flags;
int ret;
bank = gpiochip_get_data(chip);
- raw_spin_lock_irqsave(&bank->lock, flags);
ret = omap2_set_gpio_debounce(bank, offset, debounce);
- raw_spin_unlock_irqrestore(&bank->lock, flags);
-
if (ret)
dev_info(chip->parent,
"Could not set line %u debounce to %u microseconds (%d)",
--
2.47.0
[BUG]
If submit_one_sector() failed inside extent_writepage_io() for sector
size < page size cases (e.g. 4K sector size and 64K page size), then
we can hit double ordered extent accounting error.
This should be very rare, as submit_one_sector() only fails when we
failed to grab the extent map, and such extent map should exist inside
the memory and have been pinned.
[CAUSE]
For example we have the following folio layout:
0 4K 32K 48K 60K 64K
|//| |//////| |///|
Where |///| is the dirty range we need to writeback. The 3 different
dirty ranges are submitted for regular COW.
Now we hit the following sequence:
- submit_one_sector() returned 0 for [0, 4K)
- submit_one_sector() returned 0 for [32K, 48K)
- submit_one_sector() returned error for [60K, 64K)
- btrfs_mark_ordered_io_finished() called for the whole folio
This will mark the following ranges as finished:
* [0, 4K)
* [32K, 48K)
Both ranges have their IO already submitted, this cleanup will
lead to double accounting.
* [60K, 64K)
That's the correct cleanup.
The only good news is, this error is only theoretical, as the target
extent map is always pinned, thus we should directly grab it from
memory, other than reading it from the disk.
[FIX]
Instead of calling btrfs_mark_ordered_io_finished() for the whole folio
range, which can touch ranges we should not touch, instead
move the error handling inside extent_writepage_io().
So that we can cleanup exact sectors that are ought to be submitted but
failed.
This provide much more accurate cleanup, avoiding the double accounting.
Cc: stable(a)vger.kernel.org # 5.15+
Signed-off-by: Qu Wenruo <wqu(a)suse.com>
---
fs/btrfs/extent_io.c | 32 +++++++++++++++++++-------------
1 file changed, 19 insertions(+), 13 deletions(-)
diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
index d619c4e148be..b74298c2c24f 100644
--- a/fs/btrfs/extent_io.c
+++ b/fs/btrfs/extent_io.c
@@ -1418,6 +1418,7 @@ static noinline_for_stack int extent_writepage_io(struct btrfs_inode *inode,
struct btrfs_fs_info *fs_info = inode->root->fs_info;
unsigned long range_bitmap = 0;
bool submitted_io = false;
+ bool error = false;
const u64 folio_start = folio_pos(folio);
u64 cur;
int bit;
@@ -1460,11 +1461,21 @@ static noinline_for_stack int extent_writepage_io(struct btrfs_inode *inode,
break;
}
ret = submit_one_sector(inode, folio, cur, bio_ctrl, i_size);
- if (ret < 0)
- goto out;
+ if (unlikely(ret < 0)) {
+ submit_one_bio(bio_ctrl);
+ /*
+ * Failed to grab the extent map which should be very rare.
+ * Since there is no bio submitted to finish the ordered
+ * extent, we have to manually finish this sector.
+ */
+ btrfs_mark_ordered_io_finished(inode, folio, cur,
+ fs_info->sectorsize, false);
+ error = true;
+ continue;
+ }
submitted_io = true;
}
-out:
+
/*
* If we didn't submitted any sector (>= i_size), folio dirty get
* cleared but PAGECACHE_TAG_DIRTY is not cleared (only cleared
@@ -1472,8 +1483,11 @@ static noinline_for_stack int extent_writepage_io(struct btrfs_inode *inode,
*
* Here we set writeback and clear for the range. If the full folio
* is no longer dirty then we clear the PAGECACHE_TAG_DIRTY tag.
+ *
+ * If we hit any error, the corresponding sector will still be dirty
+ * thus no need to clear PAGECACHE_TAG_DIRTY.
*/
- if (!submitted_io) {
+ if (!submitted_io && !error) {
btrfs_folio_set_writeback(fs_info, folio, start, len);
btrfs_folio_clear_writeback(fs_info, folio, start, len);
}
@@ -1493,7 +1507,6 @@ static int extent_writepage(struct folio *folio, struct btrfs_bio_ctrl *bio_ctrl
{
struct inode *inode = folio->mapping->host;
struct btrfs_fs_info *fs_info = inode_to_fs_info(inode);
- const u64 page_start = folio_pos(folio);
int ret;
size_t pg_offset;
loff_t i_size = i_size_read(inode);
@@ -1536,10 +1549,6 @@ static int extent_writepage(struct folio *folio, struct btrfs_bio_ctrl *bio_ctrl
bio_ctrl->wbc->nr_to_write--;
- if (ret)
- btrfs_mark_ordered_io_finished(BTRFS_I(inode), folio,
- page_start, PAGE_SIZE, !ret);
-
done:
if (ret < 0)
mapping_set_error(folio->mapping, ret);
@@ -2320,11 +2329,8 @@ void extent_write_locked_range(struct inode *inode, const struct folio *locked_f
if (ret == 1)
goto next_page;
- if (ret) {
- btrfs_mark_ordered_io_finished(BTRFS_I(inode), folio,
- cur, cur_len, !ret);
+ if (ret)
mapping_set_error(mapping, ret);
- }
btrfs_folio_end_lock(fs_info, folio, cur, cur_len);
if (ret < 0)
found_error = true;
--
2.47.0
[BUG]
There are several double accounting case, where the WARN_ON_ONCE() is
triggered inside can_finish_ordered_extent().
And all such cases points back to the btrfs_mark_ordered_io_finished()
call inside extent_writepage() when it hits some error.
[CAUSE]
With extra debug patches to show where the error is from, it turns out
to be btrfs_run_delalloc_range() can fail with -ENOSPC.
Such failure itself is already a symptom of some bad data/metadata space
reservation, but here we need to focus on the error handling part.
For example, we have the following dirty page layout (4K sector size and
4K page size):
0 16K 32K
|/////|/////|/////|/////|/////|/////|/////|/////|
Where the range [0, 32K) is dirty and we need to write all the 8 pages
back.
When handling the first page 0, we go the following sequence:
- btrfs_run_delalloc_range() for range [0, 32k)
We enter cow_file_range() for [0, 32K)
- btrfs_reserve_extent() only returned a 16K data extent.
This can be caused by fragmentation, and it's already an indication
we're almost running of space.
Now we have the following layout:
0 16K 32K
|<----- Reserved ------>|/////|/////|/////|/////|
The range [0, 16K) has ordered extent allocated.
- btrfs_reserve_extent() returned -ENOSPC
We really run out of space. But since we have reserved space
for range [0, 16K) we need to clean them up.
But that cleanup for ordered extent only happens inside
btrfs_run_delalloc_range().
- btrfs_run_delalloc_range() cleanup the reserved ordered extent
By calling btrfs_mark_ordered_io_finished() for range [0, 32K).
It will locate the ordered extent [0, 16K) and mark it as IOERR.
Also since the ordered extent is only 16K, we're finishing the whole
ordered extent.
Thus we call btrfs_queue_ordered_fn() to queue to finish the ordered
extent.
But still, the ordered extent [0, 16K) is still in the
btrfs_inode::ordered_tree.
- extent_writepage() cleanup the ordered extent inside the folio
We call btrfs_mark_ordered_io_finished() for range [0, 4K).
Since the finished ordered extent [0, 16K) is not yet removed (racy,
depends on when btrfs_finish_one_ordered() is called), if
btrfs_mark_ordered_io_finished() is called before
btrfs_finish_one_ordered(), we will double account and trigger the
warning inside can_finish_ordered_extent().
So the root cause is, we're relying on btrfs_mark_ordered_io_finished()
to handle ranges which is already cleaned up.
Unfortunately the bug dates back to the early days when
btrfs_mark_ordered_io_finished() is introduced as a no-brain choice for
error paths, but such no-brain solution just hides all the race and make
us less cautious when handling errors.
[FIX]
Instead of relying on the btrfs_mark_ordered_io_finished() call to
cleanup the whole folio range, record the last successfully ran delalloc
range.
And combined with bio_ctrl->submit_bitmap to properly clean up any newly
created ordered extents.
Since we have cleaned up the ordered extents in range, we should not
rely on the btrfs_mark_ordered_io_finished() inside extent_writepage()
anymore.
By this, we ensure btrfs_mark_ordered_io_finished() is only called once
when writepage_delalloc() failed.
Cc: stable(a)vger.kernel.org # 5.15+
Fixes: e65f152e4348 ("btrfs: refactor how we finish ordered extent io for endio functions")
Signed-off-by: Qu Wenruo <wqu(a)suse.com>
---
fs/btrfs/extent_io.c | 37 ++++++++++++++++++++++++++++++++-----
1 file changed, 32 insertions(+), 5 deletions(-)
diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
index 438974d4def4..d619c4e148be 100644
--- a/fs/btrfs/extent_io.c
+++ b/fs/btrfs/extent_io.c
@@ -1167,6 +1167,12 @@ static noinline_for_stack int writepage_delalloc(struct btrfs_inode *inode,
* last delalloc end.
*/
u64 last_delalloc_end = 0;
+ /*
+ * Save the last successfully ran delalloc range end (exclusive).
+ * This is for error handling to avoid ranges with ordered extent created
+ * but no IO will be submitted due to error.
+ */
+ u64 last_finished = page_start;
u64 delalloc_start = page_start;
u64 delalloc_end = page_end;
u64 delalloc_to_write = 0;
@@ -1235,11 +1241,19 @@ static noinline_for_stack int writepage_delalloc(struct btrfs_inode *inode,
found_len = last_delalloc_end + 1 - found_start;
if (ret >= 0) {
+ /*
+ * Some delalloc range may be created by previous folios.
+ * Thus we still need to clean those range up during error
+ * handling.
+ */
+ last_finished = found_start;
/* No errors hit so far, run the current delalloc range. */
ret = btrfs_run_delalloc_range(inode, folio,
found_start,
found_start + found_len - 1,
wbc);
+ if (ret >= 0)
+ last_finished = found_start + found_len;
} else {
/*
* We've hit an error during previous delalloc range,
@@ -1274,8 +1288,21 @@ static noinline_for_stack int writepage_delalloc(struct btrfs_inode *inode,
delalloc_start = found_start + found_len;
}
- if (ret < 0)
+ /*
+ * It's possible we have some ordered extents created before we hit
+ * an error, cleanup non-async successfully created delalloc ranges.
+ */
+ if (unlikely(ret < 0)) {
+ unsigned int bitmap_size = min(
+ (last_finished - page_start) >> fs_info->sectorsize_bits,
+ fs_info->sectors_per_page);
+
+ for_each_set_bit(bit, &bio_ctrl->submit_bitmap, bitmap_size)
+ btrfs_mark_ordered_io_finished(inode, folio,
+ page_start + (bit << fs_info->sectorsize_bits),
+ fs_info->sectorsize, false);
return ret;
+ }
out:
if (last_delalloc_end)
delalloc_end = last_delalloc_end;
@@ -1509,13 +1536,13 @@ static int extent_writepage(struct folio *folio, struct btrfs_bio_ctrl *bio_ctrl
bio_ctrl->wbc->nr_to_write--;
-done:
- if (ret) {
+ if (ret)
btrfs_mark_ordered_io_finished(BTRFS_I(inode), folio,
page_start, PAGE_SIZE, !ret);
- mapping_set_error(folio->mapping, ret);
- }
+done:
+ if (ret < 0)
+ mapping_set_error(folio->mapping, ret);
/*
* Only unlock ranges that are submitted. As there can be some async
* submitted ranges inside the folio.
--
2.47.0
Hi,
This series fixes the UFS resume from suspend issue by marking the UFS PHY GDSCs
as ALWAYS_ON. Starting from SM8550, UFS PHY GDSCs doesn't support retention
state. So we should keep them always on so that they don't loose the state
during suspend.
Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam(a)linaro.org>
---
Manivannan Sadhasivam (2):
clk: qcom: gcc-sm8550: Keep UFS PHY GDSCs ALWAYS_ON
clk: qcom: gcc-sm8650: Keep UFS PHY GDSCs ALWAYS_ON
drivers/clk/qcom/gcc-sm8550.c | 4 ++--
drivers/clk/qcom/gcc-sm8650.c | 4 ++--
2 files changed, 4 insertions(+), 4 deletions(-)
---
base-commit: 9852d85ec9d492ebef56dc5f229416c925758edc
change-id: 20241107-ufs-clk-fix-e49ee2097594
Best regards,
--
Manivannan Sadhasivam <manivannan.sadhasivam(a)linaro.org>
Patches 1 and 2 of this series fix the issue reported by Hsin-Te Yuan
[1] where MT8192-based Chromebooks are not able to suspend/resume 10
times in a row. Either one of those patches on its own is enough to fix
the issue, but I believe both are desirable, so I've included them both
here.
Patches 3-5 fix unrelated issues that I've noticed while debugging.
Patch 3 fixes IRQ storms when the temperature sensors drop to 20
Celsius. Patches 4 and 5 are cleanups to prevent future issues.
To test this series, I've run 'rtcwake -m mem -d 60' 10 times in a row
on a MT8192-Asurada-Spherion-rev3 Chromebook and checked that the wakeup
happened 60 seconds later (+-5 seconds). I've repeated that test on 10
separate runs. Not once did the chromebook wake up early with the series
applied.
I've also checked that during those runs, the LVTS interrupt didn't
trigger even once, while before the series it would trigger a few times
per run, generally during boot or resume.
Finally, as a sanity check I've verified that the interrupts still work
by lowering the thermal trip point to 45 Celsius and running 'stress -c
8'. Indeed they still do, and the temperature showed by the
thermal_temperature ftrace event matched the expected value.
[1] https://lore.kernel.org/all/20241108-lvts-v1-1-eee339c6ca20@chromium.org/
Signed-off-by: Nícolas F. R. A. Prado <nfraprado(a)collabora.com>
---
Nícolas F. R. A. Prado (5):
thermal/drivers/mediatek/lvts: Disable monitor mode during suspend
thermal/drivers/mediatek/lvts: Disable Stage 3 thermal threshold
thermal/drivers/mediatek/lvts: Disable low offset IRQ for minimum threshold
thermal/drivers/mediatek/lvts: Start sensor interrupts disabled
thermal/drivers/mediatek/lvts: Only update IRQ enable for valid sensors
drivers/thermal/mediatek/lvts_thermal.c | 103 ++++++++++++++++++++++----------
1 file changed, 72 insertions(+), 31 deletions(-)
---
base-commit: b852e1e7a0389ed6168ef1d38eb0bad71a6b11e8
change-id: 20241121-mt8192-lvts-filtered-suspend-fix-a5032ca8eceb
Best regards,
--
Nícolas F. R. A. Prado <nfraprado(a)collabora.com>
v3:
- Fixes commit log "per which" - Bryan
- Link to v2: https://lore.kernel.org/r/20241125-b4-linux-next-24-11-18-clock-multiple-po…
v2:
The main change in this version is Bjorn's pointing out that pm_runtime_*
inside of the gdsc_enable/gdsc_disable path would be recursive and cause a
lockdep splat. Dmitry alluded to this too.
Bjorn pointed to stuff being done lower in the gdsc_register() routine that
might be a starting point.
I iterated around that idea and came up with patch #3. When a gdsc has no
parent and the pd_list is non-NULL then attach that orphan GDSC to the
clock controller power-domain list.
Existing subdomain code in gdsc_register() will connect the parent GDSCs in
the clock-controller to the clock-controller subdomain, the new code here
does that same job for a list of power-domains the clock controller depends
on.
To Dmitry's point about MMCX and MCX dependencies for the registers inside
of the clock controller, I have switched off all references in a test dtsi
and confirmed that accessing the clock-controller regs themselves isn't
required.
On the second point I also verified my test branch with lockdep on which
was a concern with the pm_domain version of this solution but I wanted to
cover it anyway with the new approach for completeness sake.
Here's the item-by-item list of changes:
- Adds a patch to capture pm_genpd_add_subdomain() result code - Bryan
- Changes changelog of second patch to remove singleton and generally
to make the commit log easier to understand - Bjorn
- Uses demv_pm_domain_attach_list - Vlad
- Changes error check to if (ret < 0 && ret != -EEXIST) - Vlad
- Retains passing &pd_data instead of NULL - because NULL doesn't do
the same thing - Bryan/Vlad
- Retains standalone function qcom_cc_pds_attach() because the pd_data
enumeration looks neater in a standalone function - Bryan/Vlad
- Drops pm_runtime in favour of gdsc_add_subdomain_list() for each
power-domain in the pd_list.
The pd_list will be whatever is pointed to by power-domains = <>
in the dtsi - Bjorn
- Link to v1: https://lore.kernel.org/r/20241118-b4-linux-next-24-11-18-clock-multiple-po…
v1:
On x1e80100 and it's SKUs the Camera Clock Controller - CAMCC has
multiple power-domains which power it. Usually with a single power-domain
the core platform code will automatically switch on the singleton
power-domain for you. If you have multiple power-domains for a device, in
this case the clock controller, you need to switch those power-domains
on/off yourself.
The clock controllers can also contain Global Distributed
Switch Controllers - GDSCs which themselves can be referenced from dtsi
nodes ultimately triggering a gdsc_en() in drivers/clk/qcom/gdsc.c.
As an example:
cci0: cci@ac4a000 {
power-domains = <&camcc TITAN_TOP_GDSC>;
};
This series adds the support to attach a power-domain list to the
clock-controllers and the GDSCs those controllers provide so that in the
case of the above example gdsc_toggle_logic() will trigger the power-domain
list with pm_runtime_resume_and_get() and pm_runtime_put_sync()
respectively.
Signed-off-by: Bryan O'Donoghue <bryan.odonoghue(a)linaro.org>
---
Bryan O'Donoghue (3):
clk: qcom: gdsc: Capture pm_genpd_add_subdomain result code
clk: qcom: common: Add support for power-domain attachment
driver: clk: qcom: Support attaching subdomain list to multiple parents
drivers/clk/qcom/common.c | 21 +++++++++++++++++++++
drivers/clk/qcom/gdsc.c | 41 +++++++++++++++++++++++++++++++++++++++--
drivers/clk/qcom/gdsc.h | 1 +
3 files changed, 61 insertions(+), 2 deletions(-)
---
base-commit: 744cf71b8bdfcdd77aaf58395e068b7457634b2c
change-id: 20241118-b4-linux-next-24-11-18-clock-multiple-power-domains-a5f994dc452a
Best regards,
--
Bryan O'Donoghue <bryan.odonoghue(a)linaro.org>
From: Chuck Lever <chuck.lever(a)oracle.com>
Testing shows that the EBUSY error return from mtree_alloc_cyclic()
leaks into user space. The ERRORS section of "man creat(2)" says:
> EBUSY O_EXCL was specified in flags and pathname refers
> to a block device that is in use by the system
> (e.g., it is mounted).
ENOSPC is closer to what applications expect in this situation.
Note that the normal range of simple directory offset values is
2..2^63, so hitting this error is going to be rare to impossible.
Fixes: 6faddda69f62 ("libfs: Add directory operations for stable offsets")
Cc: <stable(a)vger.kernel.org> # v6.9+
Signed-off-by: Chuck Lever <chuck.lever(a)oracle.com>
---
fs/libfs.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/fs/libfs.c b/fs/libfs.c
index 46966fd8bcf9..bf67954b525b 100644
--- a/fs/libfs.c
+++ b/fs/libfs.c
@@ -288,7 +288,9 @@ int simple_offset_add(struct offset_ctx *octx, struct dentry *dentry)
ret = mtree_alloc_cyclic(&octx->mt, &offset, dentry, DIR_OFFSET_MIN,
LONG_MAX, &octx->next_offset, GFP_KERNEL);
- if (ret < 0)
+ if (unlikely(ret == -EBUSY))
+ return -ENOSPC;
+ if (unlikely(ret < 0))
return ret;
offset_set(dentry, offset);
--
2.47.0
XE_CACHE_WB must be converted into the per-platform pat index for that
particular caching mode, otherwise we are just encoding whatever happens
to be the value of that enum.
Fixes: e8babb280b5e ("drm/xe: Convert multiple bind ops into single job")
Signed-off-by: Matthew Auld <matthew.auld(a)intel.com>
Cc: Matthew Brost <matthew.brost(a)intel.com>
Cc: Nirmoy Das <nirmoy.das(a)intel.com>
Cc: <stable(a)vger.kernel.org> # v6.12+
---
drivers/gpu/drm/xe/xe_migrate.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/xe/xe_migrate.c b/drivers/gpu/drm/xe/xe_migrate.c
index cfd31ae49cc1..48e205a40fd2 100644
--- a/drivers/gpu/drm/xe/xe_migrate.c
+++ b/drivers/gpu/drm/xe/xe_migrate.c
@@ -1350,6 +1350,7 @@ __xe_migrate_update_pgtables(struct xe_migrate *m,
/* For sysmem PTE's, need to map them in our hole.. */
if (!IS_DGFX(xe)) {
+ u16 pat_index = xe->pat.idx[XE_CACHE_WB];
u32 ptes, ofs;
ppgtt_ofs = NUM_KERNEL_PDE - 1;
@@ -1409,7 +1410,7 @@ __xe_migrate_update_pgtables(struct xe_migrate *m,
pt_bo->update_index = current_update;
addr = vm->pt_ops->pte_encode_bo(pt_bo, 0,
- XE_CACHE_WB, 0);
+ pat_index, 0);
bb->cs[bb->len++] = lower_32_bits(addr);
bb->cs[bb->len++] = upper_32_bits(addr);
}
--
2.47.0
The following commit has been merged into the irq/urgent branch of tip:
Commit-ID: 12aaf67584cf19dc84615b7aba272fe642c35b8b
Gitweb: https://git.kernel.org/tip/12aaf67584cf19dc84615b7aba272fe642c35b8b
Author: Russell King (Oracle) <rmk+kernel(a)armlinux.org.uk>
AuthorDate: Thu, 21 Nov 2024 12:48:25
Committer: Thomas Gleixner <tglx(a)linutronix.de>
CommitterDate: Tue, 26 Nov 2024 19:58:27 +01:00
irqchip/irq-mvebu-sei: Move misplaced select() callback to SEI CP domain
Commit fbdf14e90ce4 ("irqchip/irq-mvebu-sei: Switch to MSI parent")
introduced in v6.11-rc1 broke Mavell Armada platforms (and possibly others)
by incorrectly switching irq-mvebu-sei to MSI parent.
In the above commit, msi_parent_ops is set for the sei->cp_domain, but
rather than adding a .select method to mvebu_sei_cp_domain_ops (which is
associated with sei->cp_domain), it was added to mvebu_sei_domain_ops which
is associated with sei->sei_domain, which doesn't have any
msi_parent_ops. This makes the call to msi_lib_irq_domain_select() always
fail.
This bug manifests itself with the following kernel messages on Armada 8040
based systems:
platform f21e0000.interrupt-controller:interrupt-controller@50: deferred probe pending: (reason unknown)
platform f41e0000.interrupt-controller:interrupt-controller@50: deferred probe pending: (reason unknown)
Move the select callback to mvebu_sei_cp_domain_ops to cure it.
Fixes: fbdf14e90ce4 ("irqchip/irq-mvebu-sei: Switch to MSI parent")
Signed-off-by: Russell King (Oracle) <rmk+kernel(a)armlinux.org.uk>
Signed-off-by: Thomas Gleixner <tglx(a)linutronix.de>
Cc: stable(a)vger.kernel.org
Link: https://lore.kernel.org/all/E1tE6bh-004CmX-QU@rmk-PC.armlinux.org.uk
---
drivers/irqchip/irq-mvebu-sei.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/irqchip/irq-mvebu-sei.c b/drivers/irqchip/irq-mvebu-sei.c
index f8c70f2..065166a 100644
--- a/drivers/irqchip/irq-mvebu-sei.c
+++ b/drivers/irqchip/irq-mvebu-sei.c
@@ -192,7 +192,6 @@ static void mvebu_sei_domain_free(struct irq_domain *domain, unsigned int virq,
}
static const struct irq_domain_ops mvebu_sei_domain_ops = {
- .select = msi_lib_irq_domain_select,
.alloc = mvebu_sei_domain_alloc,
.free = mvebu_sei_domain_free,
};
@@ -306,6 +305,7 @@ static void mvebu_sei_cp_domain_free(struct irq_domain *domain,
}
static const struct irq_domain_ops mvebu_sei_cp_domain_ops = {
+ .select = msi_lib_irq_domain_select,
.alloc = mvebu_sei_cp_domain_alloc,
.free = mvebu_sei_cp_domain_free,
};
The following commit has been merged into the irq/urgent branch of tip:
Commit-ID: 81b9e4c6910fd779b679d7674ec7d3730c7f0e2c
Gitweb: https://git.kernel.org/tip/81b9e4c6910fd779b679d7674ec7d3730c7f0e2c
Author: Russell King (Oracle) <rmk+kernel(a)armlinux.org.uk>
AuthorDate: Thu, 21 Nov 2024 12:48:25
Committer: Thomas Gleixner <tglx(a)linutronix.de>
CommitterDate: Tue, 26 Nov 2024 19:50:42 +01:00
irqchip/irq-mvebu-sei: Move misplaced select() callback to SEI CP domain
Commit fbdf14e90ce4 ("irqchip/irq-mvebu-sei: Switch to MSI parent")
introduced in v6.11-rc1 broke Mavell Armada platforms (and possibly others)
by incorrectly switching irq-mvebu-sei to MSI parent.
In the above commit, msi_parent_ops is set for the sei->cp_domain, but
rather than adding a .select method to mvebu_sei_cp_domain_ops (which is
associated with sei->cp_domain), it was added to mvebu_sei_domain_ops which
is associated with sei->sei_domain, which doesn't have any
msi_parent_ops. This makes the call to msi_lib_irq_domain_select() always
fail.
This bug manifests itself with the following kernel messages on Armada 8040
based systems:
platform f21e0000.interrupt-controller:interrupt-controller@50: deferred probe pending: (reason unknown)
platform f41e0000.interrupt-controller:interrupt-controller@50: deferred probe pending: (reason unknown)
Move the select callback to mvebu_sei_cp_domain_ops to cure it.
Fixes: fbdf14e90ce4 ("irqchip/irq-mvebu-sei: Switch to MSI parent")
Signed-off-by: Russell King (Oracle) <rmk+kernel(a)armlinux.org.uk>
Signed-off-by: Thomas Gleixner <tglx(a)linutronix.de>
Cc: stable(a)vger.kernel.org
---
drivers/irqchip/irq-mvebu-sei.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/irqchip/irq-mvebu-sei.c b/drivers/irqchip/irq-mvebu-sei.c
index f8c70f2..065166a 100644
--- a/drivers/irqchip/irq-mvebu-sei.c
+++ b/drivers/irqchip/irq-mvebu-sei.c
@@ -192,7 +192,6 @@ static void mvebu_sei_domain_free(struct irq_domain *domain, unsigned int virq,
}
static const struct irq_domain_ops mvebu_sei_domain_ops = {
- .select = msi_lib_irq_domain_select,
.alloc = mvebu_sei_domain_alloc,
.free = mvebu_sei_domain_free,
};
@@ -306,6 +305,7 @@ static void mvebu_sei_cp_domain_free(struct irq_domain *domain,
}
static const struct irq_domain_ops mvebu_sei_cp_domain_ops = {
+ .select = msi_lib_irq_domain_select,
.alloc = mvebu_sei_cp_domain_alloc,
.free = mvebu_sei_cp_domain_free,
};
From: Alex Hung <alex.hung(a)amd.com>
[ Upstream commit b995c0a6de6c74656a0c39cd57a0626351b13e3c ]
[WHAT & HOW]
Variables used as denominators and maybe not assigned to other values,
should not be 0. Change their default to 1 so they are never 0.
This fixes 10 DIVIDE_BY_ZERO issues reported by Coverity.
Reviewed-by: Harry Wentland <harry.wentland(a)amd.com>
Signed-off-by: Jerry Zuo <jerry.zuo(a)amd.com>
Signed-off-by: Alex Hung <alex.hung(a)amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler(a)amd.com>
Signed-off-by: Alex Deucher <alexander.deucher(a)amd.com>
[Xiangyu: Bp to fix CVE: CVE-2024-49899
Discard the dml2_core/dml2_core_shared.c due to this file no exists]
Signed-off-by: Xiangyu Chen <xiangyu.chen(a)windriver.com>
---
.../gpu/drm/amd/display/dc/dml/dcn20/display_rq_dlg_calc_20.c | 2 +-
drivers/gpu/drm/amd/display/dc/dml/dml1_display_rq_dlg_calc.c | 2 +-
2 files changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/gpu/drm/amd/display/dc/dml/dcn20/display_rq_dlg_calc_20.c b/drivers/gpu/drm/amd/display/dc/dml/dcn20/display_rq_dlg_calc_20.c
index 548cdef8a8ad..543ce9a08cfd 100644
--- a/drivers/gpu/drm/amd/display/dc/dml/dcn20/display_rq_dlg_calc_20.c
+++ b/drivers/gpu/drm/amd/display/dc/dml/dcn20/display_rq_dlg_calc_20.c
@@ -78,7 +78,7 @@ static void calculate_ttu_cursor(struct display_mode_lib *mode_lib,
static unsigned int get_bytes_per_element(enum source_format_class source_format, bool is_chroma)
{
- unsigned int ret_val = 0;
+ unsigned int ret_val = 1;
if (source_format == dm_444_16) {
if (!is_chroma)
diff --git a/drivers/gpu/drm/amd/display/dc/dml/dml1_display_rq_dlg_calc.c b/drivers/gpu/drm/amd/display/dc/dml/dml1_display_rq_dlg_calc.c
index 3df559c591f8..70df992f859d 100644
--- a/drivers/gpu/drm/amd/display/dc/dml/dml1_display_rq_dlg_calc.c
+++ b/drivers/gpu/drm/amd/display/dc/dml/dml1_display_rq_dlg_calc.c
@@ -39,7 +39,7 @@
static unsigned int get_bytes_per_element(enum source_format_class source_format, bool is_chroma)
{
- unsigned int ret_val = 0;
+ unsigned int ret_val = 1;
if (source_format == dm_444_16) {
if (!is_chroma)
--
2.43.0
From: Srinivasan Shanmugam <srinivasan.shanmugam(a)amd.com>
[ Upstream commit 28574b08c70e56d34d6f6379326a860b96749051 ]
This commit adds a null check for the set_output_gamma function pointer
in the dcn32_set_output_transfer_func function. Previously,
set_output_gamma was being checked for null, but then it was being
dereferenced without any null check. This could lead to a null pointer
dereference if set_output_gamma is null.
To fix this, we now ensure that set_output_gamma is not null before
dereferencing it. We do this by adding a null check for set_output_gamma
before the call to set_output_gamma.
Cc: Tom Chung <chiahsuan.chung(a)amd.com>
Cc: Rodrigo Siqueira <Rodrigo.Siqueira(a)amd.com>
Cc: Roman Li <roman.li(a)amd.com>
Cc: Alex Hung <alex.hung(a)amd.com>
Cc: Aurabindo Pillai <aurabindo.pillai(a)amd.com>
Cc: Harry Wentland <harry.wentland(a)amd.com>
Cc: Hamza Mahfooz <hamza.mahfooz(a)amd.com>
Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam(a)amd.com>
Reviewed-by: Tom Chung <chiahsuan.chung(a)amd.com>
Signed-off-by: Alex Deucher <alexander.deucher(a)amd.com>
Signed-off-by: Xiangyu Chen <xiangyu.chen(a)windriver.com>
---
drivers/gpu/drm/amd/display/dc/dcn32/dcn32_hwseq.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/amd/display/dc/dcn32/dcn32_hwseq.c b/drivers/gpu/drm/amd/display/dc/dcn32/dcn32_hwseq.c
index bd75d3cba098..d3ad13bf35c8 100644
--- a/drivers/gpu/drm/amd/display/dc/dcn32/dcn32_hwseq.c
+++ b/drivers/gpu/drm/amd/display/dc/dcn32/dcn32_hwseq.c
@@ -667,7 +667,9 @@ bool dcn32_set_output_transfer_func(struct dc *dc,
}
}
- mpc->funcs->set_output_gamma(mpc, mpcc_id, params);
+ if (mpc->funcs->set_output_gamma)
+ mpc->funcs->set_output_gamma(mpc, mpcc_id, params);
+
return ret;
}
--
2.43.0