The patch below does not apply to the 6.1-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.1.y
git checkout FETCH_HEAD
git cherry-pick -x f755be0b1ff429a2ecf709beeb1bcd7abc111c2b
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable@vger.kernel.org>' --in-reply-to '2025092142-stingily-broadside-f761@gregkh' --subject-prefix 'PATCH 6.1.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From f755be0b1ff429a2ecf709beeb1bcd7abc111c2b Mon Sep 17 00:00:00 2001
From: "Matthieu Baerts (NGI0)" <matttbe(a)kernel.org>
Date: Fri, 12 Sep 2025 14:25:50 +0200
Subject: [PATCH] mptcp: propagate shutdown to subflows when possible
When the MPTCP DATA FIN has been ACKed, there is no more MPTCP related
metadata to exchange, and all subflows can be safely shut down.
Before this patch, the subflows were actually terminated at 'close()'
time. That's certainly fine most of the time, but not when userspace
calls 'shutdown()' on a connection without close()ing it. When doing so,
the subflows were staying in LAST_ACK state on one side -- and
consequently in FIN_WAIT2 on the other side -- until the 'close()' of the
MPTCP socket.
Now, when the DATA FIN has been ACKed, all subflows are shut down. A
consequence of this is that the TCP 'FIN' flag can be set earlier now,
but the end result is the same. This affects the packetdrill tests
looking at the end of the MPTCP connections, but for a good reason.
Note that tcp_shutdown() will check the subflow state, so no need to do
that again before calling it.
Fixes: 3721b9b64676 ("mptcp: Track received DATA_FIN sequence number and add related helpers")
Cc: stable(a)vger.kernel.org
Fixes: 16a9a9da1723 ("mptcp: Add helper to process acks of DATA_FIN")
Reviewed-by: Mat Martineau <martineau(a)kernel.org>
Reviewed-by: Geliang Tang <geliang(a)kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe(a)kernel.org>
Link: https://patch.msgid.link/20250912-net-mptcp-fix-sft-connect-v1-1-d40e77cbbf…
Signed-off-by: Jakub Kicinski <kuba(a)kernel.org>
diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c
index e6fd97b21e9e..5e497a83e967 100644
--- a/net/mptcp/protocol.c
+++ b/net/mptcp/protocol.c
@@ -371,6 +371,20 @@ static void mptcp_close_wake_up(struct sock *sk)
sk_wake_async(sk, SOCK_WAKE_WAITD, POLL_IN);
}
+static void mptcp_shutdown_subflows(struct mptcp_sock *msk)
+{
+ struct mptcp_subflow_context *subflow;
+
+ mptcp_for_each_subflow(msk, subflow) {
+ struct sock *ssk = mptcp_subflow_tcp_sock(subflow);
+ bool slow;
+
+ slow = lock_sock_fast(ssk);
+ tcp_shutdown(ssk, SEND_SHUTDOWN);
+ unlock_sock_fast(ssk, slow);
+ }
+}
+
/* called under the msk socket lock */
static bool mptcp_pending_data_fin_ack(struct sock *sk)
{
@@ -395,6 +409,7 @@ static void mptcp_check_data_fin_ack(struct sock *sk)
break;
case TCP_CLOSING:
case TCP_LAST_ACK:
+ mptcp_shutdown_subflows(msk);
mptcp_set_state(sk, TCP_CLOSE);
break;
}
@@ -563,6 +578,7 @@ static bool mptcp_check_data_fin(struct sock *sk)
mptcp_set_state(sk, TCP_CLOSING);
break;
case TCP_FIN_WAIT2:
+ mptcp_shutdown_subflows(msk);
mptcp_set_state(sk, TCP_CLOSE);
break;
default:
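For readers following the state handling touched by this patch, here is a minimal userspace model, not kernel code. The model_* names, the reduced state enum and the send_shutdown flag are assumptions made for illustration; only the two call sites (DATA FIN ACKed while in CLOSING/LAST_ACK, DATA FIN received while in FIN_WAIT2) follow the diff above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, reduced set of MPTCP-level states (illustration only). */
enum model_state { MODEL_FIN_WAIT2, MODEL_CLOSING, MODEL_LAST_ACK, MODEL_CLOSE };

struct model_subflow {
	bool send_shutdown;	/* set once the subflow was asked to send its FIN */
};

struct model_msk {
	enum model_state state;
	struct model_subflow *subflows;
	size_t nr_subflows;
};

/* Plays the role of mptcp_shutdown_subflows(): shut down every subflow. */
static void model_shutdown_subflows(struct model_msk *msk)
{
	for (size_t i = 0; i < msk->nr_subflows; i++)
		msk->subflows[i].send_shutdown = true;
}

/* The peer ACKed our DATA FIN. */
static void model_data_fin_acked(struct model_msk *msk)
{
	assert(msk != NULL);

	switch (msk->state) {
	case MODEL_CLOSING:
	case MODEL_LAST_ACK:
		model_shutdown_subflows(msk);
		msk->state = MODEL_CLOSE;
		break;
	default:
		break;
	}
}

/* We received the peer's DATA FIN. */
static void model_data_fin_received(struct model_msk *msk)
{
	assert(msk != NULL);

	if (msk->state == MODEL_FIN_WAIT2) {
		model_shutdown_subflows(msk);
		msk->state = MODEL_CLOSE;
	}
}
```

In this model, the subflows are shut down at the moment the MPTCP-level state reaches CLOSE, rather than waiting for a later close() on the socket.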
The patch below does not apply to the 6.1-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.1.y
git checkout FETCH_HEAD
git cherry-pick -x 14e22b43df25dbd4301351b882486ea38892ae4f
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable@vger.kernel.org>' --in-reply-to '2025092157-mullets-tweed-dee4@gregkh' --subject-prefix 'PATCH 6.1.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 14e22b43df25dbd4301351b882486ea38892ae4f Mon Sep 17 00:00:00 2001
From: "Matthieu Baerts (NGI0)" <matttbe(a)kernel.org>
Date: Fri, 12 Sep 2025 14:25:51 +0200
Subject: [PATCH] selftests: mptcp: connect: catch IO errors on listen side
IO errors were correctly printed to stderr, and propagated up to the
main loop for the server side, but the returned value was ignored. As a
consequence, the program for the listener side was no longer exiting
with an error code in case of IO issues.
Because of that, some issues might not have been seen. But very likely,
most issues either had an effect on the client side, or the file
transfer was not the expected one, e.g. the connection got reset before
the end. Still, it is better to fix this.
The main consequence of this issue is the error that was reported by the
selftests: the received and sent files were different, and the MIB
counters were not printed. Also, when such errors happened during the
'disconnect' tests, the program tried to continue until the timeout.
Now when an IO error is detected, the program exits directly with an
error.
Fixes: 05be5e273c84 ("selftests: mptcp: add disconnect tests")
Cc: stable(a)vger.kernel.org
Reviewed-by: Mat Martineau <martineau(a)kernel.org>
Reviewed-by: Geliang Tang <geliang(a)kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe(a)kernel.org>
Link: https://patch.msgid.link/20250912-net-mptcp-fix-sft-connect-v1-2-d40e77cbbf…
Signed-off-by: Jakub Kicinski <kuba(a)kernel.org>
diff --git a/tools/testing/selftests/net/mptcp/mptcp_connect.c b/tools/testing/selftests/net/mptcp/mptcp_connect.c
index 4f07ac9fa207..1408698df099 100644
--- a/tools/testing/selftests/net/mptcp/mptcp_connect.c
+++ b/tools/testing/selftests/net/mptcp/mptcp_connect.c
@@ -1093,6 +1093,7 @@ int main_loop_s(int listensock)
struct pollfd polls;
socklen_t salen;
int remotesock;
+ int err = 0;
int fd = 0;
again:
@@ -1125,7 +1126,7 @@ int main_loop_s(int listensock)
SOCK_TEST_TCPULP(remotesock, 0);
memset(&winfo, 0, sizeof(winfo));
- copyfd_io(fd, remotesock, 1, true, &winfo);
+ err = copyfd_io(fd, remotesock, 1, true, &winfo);
} else {
perror("accept");
return 1;
@@ -1134,10 +1135,10 @@ int main_loop_s(int listensock)
if (cfg_input)
close(fd);
- if (--cfg_repeat > 0)
+ if (!err && --cfg_repeat > 0)
goto again;
- return 0;
+ return err;
}
static void init_rng(void)
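The control-flow change in main_loop_s() can be sketched in isolation as follows. This is a standalone illustration, not the selftest code: serve_loop(), the transfer_fn callback and the two stub transfers are hypothetical names standing in for the accept/copyfd_io() sequence.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical stand-in for one accepted connection plus copyfd_io():
 * returns 0 on success, non-zero on an I/O error.
 */
typedef int (*transfer_fn)(int attempt);

/*
 * Run up to 'repeat' transfers. Stop at the first error and return it,
 * mirroring "if (!err && --cfg_repeat > 0) goto again; return err;".
 */
static int serve_loop(transfer_fn transfer, int repeat)
{
	int attempt = 0;
	int err;

	assert(transfer != NULL);

	do {
		err = transfer(attempt++);
	} while (!err && --repeat > 0);

	return err;
}

/* Stub: every transfer succeeds. */
static int always_ok(int attempt)
{
	(void)attempt;
	return 0;
}

/* Stub: the second transfer (attempt index 1) fails with code 3. */
static int fail_on_second(int attempt)
{
	return attempt == 1 ? 3 : 0;
}
```

With these stubs, serve_loop(always_ok, 3) runs three transfers and returns 0, while serve_loop(fail_on_second, 5) stops after the second transfer and returns its error code instead of continuing.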
The patch below does not apply to the 6.16-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.16.y
git checkout FETCH_HEAD
git cherry-pick -x 225d1ee0f5ba3218d1814d36564fdb5f37b50474
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable@vger.kernel.org>' --in-reply-to '2025092125-thigh-immerse-6abd@gregkh' --subject-prefix 'PATCH 6.16.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 225d1ee0f5ba3218d1814d36564fdb5f37b50474 Mon Sep 17 00:00:00 2001
From: Antheas Kapenekakis <lkml(a)antheas.dev>
Date: Tue, 16 Sep 2025 09:28:18 +0200
Subject: [PATCH] platform/x86: asus-wmi: Re-add extra keys to ignore_key_wlan
quirk
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
It turns out that the dual screen models use 0x5E for attaching and
detaching the keyboard instead of 0x5F. So, re-add the codes by
reverting commit cf3940ac737d ("platform/x86: asus-wmi: Remove extra
keys from ignore_key_wlan quirk"). For our future reference, add a
comment next to 0x5E indicating that it is used for that purpose.
Fixes: cf3940ac737d ("platform/x86: asus-wmi: Remove extra keys from ignore_key_wlan quirk")
Reported-by: Rahul Chandra <rahul(a)chandra.net>
Closes: https://lore.kernel.org/all/10020-68c90c80-d-4ac6c580@106290038/
Cc: stable(a)kernel.org
Signed-off-by: Antheas Kapenekakis <lkml(a)antheas.dev>
Link: https://patch.msgid.link/20250916072818.196462-1-lkml@antheas.dev
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen(a)linux.intel.com>
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen(a)linux.intel.com>
diff --git a/drivers/platform/x86/asus-nb-wmi.c b/drivers/platform/x86/asus-nb-wmi.c
index 3a488cf9ca06..6a62bc5b02fd 100644
--- a/drivers/platform/x86/asus-nb-wmi.c
+++ b/drivers/platform/x86/asus-nb-wmi.c
@@ -673,6 +673,8 @@ static void asus_nb_wmi_key_filter(struct asus_wmi_driver *asus_wmi, int *code,
if (atkbd_reports_vol_keys)
*code = ASUS_WMI_KEY_IGNORE;
break;
+ case 0x5D: /* Wireless console Toggle */
+ case 0x5E: /* Wireless console Enable / Keyboard Attach, Detach */
case 0x5F: /* Wireless console Disable / Special Key */
if (quirks->key_wlan_event)
*code = quirks->key_wlan_event;
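As a rough illustration of the restored fall-through, here is a userspace sketch. It is not the driver code: model_key_filter() and MODEL_KEY_IGNORE are hypothetical stand-ins, and the quirk is reduced to a single integer where 0 means "no quirk set".

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the driver's "ignore this key" value. */
#define MODEL_KEY_IGNORE (-1)

/*
 * Rewrite *code when it is one of the three wireless-console codes and
 * a quirk value is set; leave every other code untouched.
 */
static void model_key_filter(int *code, int key_wlan_event)
{
	assert(code != NULL);

	switch (*code) {
	case 0x5D: /* Wireless console Toggle */
	case 0x5E: /* Wireless console Enable / Keyboard Attach, Detach */
	case 0x5F: /* Wireless console Disable / Special Key */
		if (key_wlan_event)
			*code = key_wlan_event;
		break;
	default:
		break;
	}
}
```

With only the 0x5F case present, a 0x5E event on a dual screen model would pass through unmodified in this sketch; with all three cases it is rewritten to the quirk value.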
The patch below does not apply to the 6.6-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.6.y
git checkout FETCH_HEAD
git cherry-pick -x f755be0b1ff429a2ecf709beeb1bcd7abc111c2b
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable@vger.kernel.org>' --in-reply-to '2025092141-slashing-postal-be15@gregkh' --subject-prefix 'PATCH 6.6.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From f755be0b1ff429a2ecf709beeb1bcd7abc111c2b Mon Sep 17 00:00:00 2001
From: "Matthieu Baerts (NGI0)" <matttbe(a)kernel.org>
Date: Fri, 12 Sep 2025 14:25:50 +0200
Subject: [PATCH] mptcp: propagate shutdown to subflows when possible
When the MPTCP DATA FIN has been ACKed, there is no more MPTCP related
metadata to exchange, and all subflows can be safely shut down.
Before this patch, the subflows were actually terminated at 'close()'
time. That's certainly fine most of the time, but not when userspace
calls 'shutdown()' on a connection without close()ing it. When doing so,
the subflows were staying in LAST_ACK state on one side -- and
consequently in FIN_WAIT2 on the other side -- until the 'close()' of the
MPTCP socket.
Now, when the DATA FIN has been ACKed, all subflows are shut down. A
consequence of this is that the TCP 'FIN' flag can be set earlier now,
but the end result is the same. This affects the packetdrill tests
looking at the end of the MPTCP connections, but for a good reason.
Note that tcp_shutdown() will check the subflow state, so no need to do
that again before calling it.
Fixes: 3721b9b64676 ("mptcp: Track received DATA_FIN sequence number and add related helpers")
Cc: stable(a)vger.kernel.org
Fixes: 16a9a9da1723 ("mptcp: Add helper to process acks of DATA_FIN")
Reviewed-by: Mat Martineau <martineau(a)kernel.org>
Reviewed-by: Geliang Tang <geliang(a)kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe(a)kernel.org>
Link: https://patch.msgid.link/20250912-net-mptcp-fix-sft-connect-v1-1-d40e77cbbf…
Signed-off-by: Jakub Kicinski <kuba(a)kernel.org>
diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c
index e6fd97b21e9e..5e497a83e967 100644
--- a/net/mptcp/protocol.c
+++ b/net/mptcp/protocol.c
@@ -371,6 +371,20 @@ static void mptcp_close_wake_up(struct sock *sk)
sk_wake_async(sk, SOCK_WAKE_WAITD, POLL_IN);
}
+static void mptcp_shutdown_subflows(struct mptcp_sock *msk)
+{
+ struct mptcp_subflow_context *subflow;
+
+ mptcp_for_each_subflow(msk, subflow) {
+ struct sock *ssk = mptcp_subflow_tcp_sock(subflow);
+ bool slow;
+
+ slow = lock_sock_fast(ssk);
+ tcp_shutdown(ssk, SEND_SHUTDOWN);
+ unlock_sock_fast(ssk, slow);
+ }
+}
+
/* called under the msk socket lock */
static bool mptcp_pending_data_fin_ack(struct sock *sk)
{
@@ -395,6 +409,7 @@ static void mptcp_check_data_fin_ack(struct sock *sk)
break;
case TCP_CLOSING:
case TCP_LAST_ACK:
+ mptcp_shutdown_subflows(msk);
mptcp_set_state(sk, TCP_CLOSE);
break;
}
@@ -563,6 +578,7 @@ static bool mptcp_check_data_fin(struct sock *sk)
mptcp_set_state(sk, TCP_CLOSING);
break;
case TCP_FIN_WAIT2:
+ mptcp_shutdown_subflows(msk);
mptcp_set_state(sk, TCP_CLOSE);
break;
default:
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.4.y
git checkout FETCH_HEAD
git cherry-pick -x d02e48830e3fce9701265f6c5a58d9bdaf906a76
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable@vger.kernel.org>' --in-reply-to '2025092122-popper-small-d970@gregkh' --subject-prefix 'PATCH 5.4.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From d02e48830e3fce9701265f6c5a58d9bdaf906a76 Mon Sep 17 00:00:00 2001
From: "Maciej S. Szmigiero" <maciej.szmigiero(a)oracle.com>
Date: Mon, 25 Aug 2025 18:44:28 +0200
Subject: [PATCH] KVM: SVM: Sync TPR from LAPIC into VMCB::V_TPR even if AVIC
is active
Commit 3bbf3565f48c ("svm: Do not intercept CR8 when enable AVIC")
inhibited pre-VMRUN sync of TPR from LAPIC into VMCB::V_TPR in
sync_lapic_to_cr8() when AVIC is active.
AVIC does automatically sync between these two fields; however, it does
so only on explicit guest writes to one of these fields, not on a bare
VMRUN.
This meant that when AVIC is enabled, host changes to TPR in the LAPIC
state might not get automatically copied into the V_TPR field of VMCB.
This is especially true when it is userspace setting LAPIC state via
the KVM_SET_LAPIC ioctl(), since userspace does not have access to the guest
VMCB.
Practice shows that it is the V_TPR that is actually used by the AVIC to
decide whether to issue pending interrupts to the CPU (not TPR in TASKPRI),
so any leftover value in V_TPR will cause serious interrupt delivery issues
in the guest when AVIC is enabled.
Fix this issue by doing pre-VMRUN TPR sync from LAPIC into VMCB::V_TPR
even when AVIC is enabled.
Fixes: 3bbf3565f48c ("svm: Do not intercept CR8 when enable AVIC")
Cc: stable(a)vger.kernel.org
Signed-off-by: Maciej S. Szmigiero <maciej.szmigiero(a)oracle.com>
Reviewed-by: Naveen N Rao (AMD) <naveen(a)kernel.org>
Link: https://lore.kernel.org/r/c231be64280b1461e854e1ce3595d70cde3a2e9d.17561396…
[sean: tag for stable@]
Signed-off-by: Sean Christopherson <seanjc(a)google.com>
diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index d9931c6c4bc6..1bfebe40854f 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -4046,8 +4046,7 @@ static inline void sync_lapic_to_cr8(struct kvm_vcpu *vcpu)
struct vcpu_svm *svm = to_svm(vcpu);
u64 cr8;
- if (nested_svm_virtualize_tpr(vcpu) ||
- kvm_vcpu_apicv_active(vcpu))
+ if (nested_svm_virtualize_tpr(vcpu))
return;
cr8 = kvm_get_cr8(vcpu);
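The effect of dropping the AVIC check can be shown with a small userspace model. This is a sketch under assumptions, not KVM code: struct model_vcpu, its fields and MODEL_V_TPR_MASK (taken here as the low four bits) are hypothetical names chosen for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed width of the V_TPR field for this model: the low four bits. */
#define MODEL_V_TPR_MASK 0x0fu

struct model_vcpu {
	uint64_t lapic_cr8;		/* TPR as held in the LAPIC state */
	uint32_t vmcb_int_ctl;		/* control word carrying V_TPR */
	bool nested_virtualizes_tpr;
	bool avic_active;
};

/* Copy the LAPIC TPR into V_TPR before "VMRUN", with the fix applied. */
static void model_sync_lapic_to_cr8(struct model_vcpu *vcpu)
{
	assert(vcpu != NULL);

	/* avic_active is deliberately not part of this check anymore. */
	if (vcpu->nested_virtualizes_tpr)
		return;

	vcpu->vmcb_int_ctl &= ~MODEL_V_TPR_MASK;
	vcpu->vmcb_int_ctl |= (uint32_t)(vcpu->lapic_cr8 & MODEL_V_TPR_MASK);
}
```

In this model, a stale V_TPR left in vmcb_int_ctl is overwritten on every sync even when avic_active is true, which is the behaviour the commit message asks for.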
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.4.y
git checkout FETCH_HEAD
git cherry-pick -x b6f56a44e4c1014b08859dcf04ed246500e310e5
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable@vger.kernel.org>' --in-reply-to '2025092157-imagines-darkroom-e5c5@gregkh' --subject-prefix 'PATCH 5.4.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From b6f56a44e4c1014b08859dcf04ed246500e310e5 Mon Sep 17 00:00:00 2001
From: Hans de Goede <hansg(a)kernel.org>
Date: Sat, 13 Sep 2025 13:35:15 +0200
Subject: [PATCH] net: rfkill: gpio: Fix crash due to dereferencering
uninitialized pointer
Since commit 7d5e9737efda ("net: rfkill: gpio: get the name and type from
device property") rfkill_find_type() gets called with the possibly
uninitialized "const char *type_name;" local variable.
On x86 systems when rfkill-gpio binds to a "BCM4752" or "LNV4752"
acpi_device, the rfkill->type is set based on the ACPI acpi_device_id:
rfkill->type = (unsigned)id->driver_data;
and there is no "type" property so device_property_read_string() will fail
and leave type_name uninitialized, leading to a potential crash.
rfkill_find_type() does accept a NULL pointer, so fix the potential crash
by initializing type_name to NULL.
Note that this has likely not been caught so far because:
1. Not many x86 machines actually have a "BCM4752"/"LNV4752" acpi_device
2. The stack happened to contain NULL where type_name is stored
Fixes: 7d5e9737efda ("net: rfkill: gpio: get the name and type from device property")
Cc: stable(a)vger.kernel.org
Cc: Heikki Krogerus <heikki.krogerus(a)linux.intel.com>
Signed-off-by: Hans de Goede <hansg(a)kernel.org>
Reviewed-by: Heikki Krogerus <heikki.krogerus(a)linux.intel.com>
Link: https://patch.msgid.link/20250913113515.21698-1-hansg@kernel.org
Signed-off-by: Johannes Berg <johannes.berg(a)intel.com>
diff --git a/net/rfkill/rfkill-gpio.c b/net/rfkill/rfkill-gpio.c
index 41e657e97761..cf2dcec6ce5a 100644
--- a/net/rfkill/rfkill-gpio.c
+++ b/net/rfkill/rfkill-gpio.c
@@ -94,10 +94,10 @@ static const struct dmi_system_id rfkill_gpio_deny_table[] = {
static int rfkill_gpio_probe(struct platform_device *pdev)
{
struct rfkill_gpio_data *rfkill;
- struct gpio_desc *gpio;
+ const char *type_name = NULL;
const char *name_property;
const char *type_property;
- const char *type_name;
+ struct gpio_desc *gpio;
int ret;
if (dmi_check_system(rfkill_gpio_deny_table))
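The failure mode and the fix can be reproduced in miniature in userspace. The sketch below is not the driver: model_read_string() and model_find_type() are hypothetical stand-ins for device_property_read_string() and rfkill_find_type(), and the two-value enum is an assumption for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum model_type { MODEL_TYPE_ALL, MODEL_TYPE_GPS };

/*
 * Stand-in for a property read: on failure it returns a negative value
 * and leaves *val untouched, which is what makes an uninitialized
 * output variable dangerous.
 */
static int model_read_string(int present, const char **val)
{
	assert(val != NULL);

	if (!present)
		return -1;
	*val = "gps";
	return 0;
}

/* Stand-in for the type lookup: explicitly tolerates a NULL name. */
static enum model_type model_find_type(const char *name)
{
	if (!name)
		return MODEL_TYPE_ALL;
	return strcmp(name, "gps") == 0 ? MODEL_TYPE_GPS : MODEL_TYPE_ALL;
}

static enum model_type model_probe(int has_type_property)
{
	const char *type_name = NULL;	/* the fix: defined value on failure */

	/* Return value ignored on purpose, as the property is optional. */
	(void)model_read_string(has_type_property, &type_name);
	return model_find_type(type_name);
}
```

Without the "= NULL" initializer, model_probe(0) would pass an indeterminate pointer to strcmp(); with it, the lookup falls back to MODEL_TYPE_ALL.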
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.10.y
git checkout FETCH_HEAD
git cherry-pick -x b6f56a44e4c1014b08859dcf04ed246500e310e5
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable@vger.kernel.org>' --in-reply-to '2025092156-postal-sappiness-e1ac@gregkh' --subject-prefix 'PATCH 5.10.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From b6f56a44e4c1014b08859dcf04ed246500e310e5 Mon Sep 17 00:00:00 2001
From: Hans de Goede <hansg(a)kernel.org>
Date: Sat, 13 Sep 2025 13:35:15 +0200
Subject: [PATCH] net: rfkill: gpio: Fix crash due to dereferencering
uninitialized pointer
Since commit 7d5e9737efda ("net: rfkill: gpio: get the name and type from
device property") rfkill_find_type() gets called with the possibly
uninitialized "const char *type_name;" local variable.
On x86 systems when rfkill-gpio binds to a "BCM4752" or "LNV4752"
acpi_device, the rfkill->type is set based on the ACPI acpi_device_id:
rfkill->type = (unsigned)id->driver_data;
and there is no "type" property so device_property_read_string() will fail
and leave type_name uninitialized, leading to a potential crash.
rfkill_find_type() does accept a NULL pointer, so fix the potential crash
by initializing type_name to NULL.
Note that this has likely not been caught so far because:
1. Not many x86 machines actually have a "BCM4752"/"LNV4752" acpi_device
2. The stack happened to contain NULL where type_name is stored
Fixes: 7d5e9737efda ("net: rfkill: gpio: get the name and type from device property")
Cc: stable(a)vger.kernel.org
Cc: Heikki Krogerus <heikki.krogerus(a)linux.intel.com>
Signed-off-by: Hans de Goede <hansg(a)kernel.org>
Reviewed-by: Heikki Krogerus <heikki.krogerus(a)linux.intel.com>
Link: https://patch.msgid.link/20250913113515.21698-1-hansg@kernel.org
Signed-off-by: Johannes Berg <johannes.berg(a)intel.com>
diff --git a/net/rfkill/rfkill-gpio.c b/net/rfkill/rfkill-gpio.c
index 41e657e97761..cf2dcec6ce5a 100644
--- a/net/rfkill/rfkill-gpio.c
+++ b/net/rfkill/rfkill-gpio.c
@@ -94,10 +94,10 @@ static const struct dmi_system_id rfkill_gpio_deny_table[] = {
static int rfkill_gpio_probe(struct platform_device *pdev)
{
struct rfkill_gpio_data *rfkill;
- struct gpio_desc *gpio;
+ const char *type_name = NULL;
const char *name_property;
const char *type_property;
- const char *type_name;
+ struct gpio_desc *gpio;
int ret;
if (dmi_check_system(rfkill_gpio_deny_table))
The patch below does not apply to the 5.15-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.15.y
git checkout FETCH_HEAD
git cherry-pick -x b6f56a44e4c1014b08859dcf04ed246500e310e5
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable@vger.kernel.org>' --in-reply-to '2025092155-familiar-divisible-9535@gregkh' --subject-prefix 'PATCH 5.15.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From b6f56a44e4c1014b08859dcf04ed246500e310e5 Mon Sep 17 00:00:00 2001
From: Hans de Goede <hansg(a)kernel.org>
Date: Sat, 13 Sep 2025 13:35:15 +0200
Subject: [PATCH] net: rfkill: gpio: Fix crash due to dereferencering
uninitialized pointer
Since commit 7d5e9737efda ("net: rfkill: gpio: get the name and type from
device property") rfkill_find_type() gets called with the possibly
uninitialized "const char *type_name;" local variable.
On x86 systems when rfkill-gpio binds to a "BCM4752" or "LNV4752"
acpi_device, the rfkill->type is set based on the ACPI acpi_device_id:
rfkill->type = (unsigned)id->driver_data;
and there is no "type" property so device_property_read_string() will fail
and leave type_name uninitialized, leading to a potential crash.
rfkill_find_type() does accept a NULL pointer, so fix the potential crash
by initializing type_name to NULL.
Note that this has likely not been caught so far because:
1. Not many x86 machines actually have a "BCM4752"/"LNV4752" acpi_device
2. The stack happened to contain NULL where type_name is stored
Fixes: 7d5e9737efda ("net: rfkill: gpio: get the name and type from device property")
Cc: stable(a)vger.kernel.org
Cc: Heikki Krogerus <heikki.krogerus(a)linux.intel.com>
Signed-off-by: Hans de Goede <hansg(a)kernel.org>
Reviewed-by: Heikki Krogerus <heikki.krogerus(a)linux.intel.com>
Link: https://patch.msgid.link/20250913113515.21698-1-hansg@kernel.org
Signed-off-by: Johannes Berg <johannes.berg(a)intel.com>
diff --git a/net/rfkill/rfkill-gpio.c b/net/rfkill/rfkill-gpio.c
index 41e657e97761..cf2dcec6ce5a 100644
--- a/net/rfkill/rfkill-gpio.c
+++ b/net/rfkill/rfkill-gpio.c
@@ -94,10 +94,10 @@ static const struct dmi_system_id rfkill_gpio_deny_table[] = {
static int rfkill_gpio_probe(struct platform_device *pdev)
{
struct rfkill_gpio_data *rfkill;
- struct gpio_desc *gpio;
+ const char *type_name = NULL;
const char *name_property;
const char *type_property;
- const char *type_name;
+ struct gpio_desc *gpio;
int ret;
if (dmi_check_system(rfkill_gpio_deny_table))
The patch below does not apply to the 6.1-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.1.y
git checkout FETCH_HEAD
git cherry-pick -x b6f56a44e4c1014b08859dcf04ed246500e310e5
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable@vger.kernel.org>' --in-reply-to '2025092155-evacuate-condition-525e@gregkh' --subject-prefix 'PATCH 6.1.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From b6f56a44e4c1014b08859dcf04ed246500e310e5 Mon Sep 17 00:00:00 2001
From: Hans de Goede <hansg(a)kernel.org>
Date: Sat, 13 Sep 2025 13:35:15 +0200
Subject: [PATCH] net: rfkill: gpio: Fix crash due to dereferencering
uninitialized pointer
Since commit 7d5e9737efda ("net: rfkill: gpio: get the name and type from
device property") rfkill_find_type() gets called with the possibly
uninitialized "const char *type_name;" local variable.
On x86 systems when rfkill-gpio binds to a "BCM4752" or "LNV4752"
acpi_device, the rfkill->type is set based on the ACPI acpi_device_id:
rfkill->type = (unsigned)id->driver_data;
and there is no "type" property so device_property_read_string() will fail
and leave type_name uninitialized, leading to a potential crash.
rfkill_find_type() does accept a NULL pointer, so fix the potential crash
by initializing type_name to NULL.
Note that this has likely not been caught so far because:
1. Not many x86 machines actually have a "BCM4752"/"LNV4752" acpi_device
2. The stack happened to contain NULL where type_name is stored
Fixes: 7d5e9737efda ("net: rfkill: gpio: get the name and type from device property")
Cc: stable(a)vger.kernel.org
Cc: Heikki Krogerus <heikki.krogerus(a)linux.intel.com>
Signed-off-by: Hans de Goede <hansg(a)kernel.org>
Reviewed-by: Heikki Krogerus <heikki.krogerus(a)linux.intel.com>
Link: https://patch.msgid.link/20250913113515.21698-1-hansg@kernel.org
Signed-off-by: Johannes Berg <johannes.berg(a)intel.com>
diff --git a/net/rfkill/rfkill-gpio.c b/net/rfkill/rfkill-gpio.c
index 41e657e97761..cf2dcec6ce5a 100644
--- a/net/rfkill/rfkill-gpio.c
+++ b/net/rfkill/rfkill-gpio.c
@@ -94,10 +94,10 @@ static const struct dmi_system_id rfkill_gpio_deny_table[] = {
static int rfkill_gpio_probe(struct platform_device *pdev)
{
struct rfkill_gpio_data *rfkill;
- struct gpio_desc *gpio;
+ const char *type_name = NULL;
const char *name_property;
const char *type_property;
- const char *type_name;
+ struct gpio_desc *gpio;
int ret;
if (dmi_check_system(rfkill_gpio_deny_table))
The patch below does not apply to the 6.12-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.12.y
git checkout FETCH_HEAD
git cherry-pick -x 7f830e126dc357fc086905ce9730140fd4528d66
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable@vger.kernel.org>' --in-reply-to '2025092125-resurface-hypertext-5ca5@gregkh' --subject-prefix 'PATCH 6.12.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 7f830e126dc357fc086905ce9730140fd4528d66 Mon Sep 17 00:00:00 2001
From: Tom Lendacky <thomas.lendacky(a)amd.com>
Date: Mon, 15 Sep 2025 11:04:12 -0500
Subject: [PATCH] x86/sev: Guard sev_evict_cache() with CONFIG_AMD_MEM_ENCRYPT
sev_evict_cache() is guest-related code and should be guarded by
CONFIG_AMD_MEM_ENCRYPT, not CONFIG_KVM_AMD_SEV.
CONFIG_AMD_MEM_ENCRYPT=y is required for a guest to run properly as an SEV-SNP
guest, but a guest kernel built with CONFIG_KVM_AMD_SEV=n would get the stub
function of sev_evict_cache() instead of the version that performs the actual
eviction. Move the function declarations under the appropriate #ifdef.
Fixes: 7b306dfa326f ("x86/sev: Evict cache lines during SNP memory validation")
Signed-off-by: Tom Lendacky <thomas.lendacky(a)amd.com>
Signed-off-by: Borislav Petkov (AMD) <bp(a)alien8.de>
Cc: stable(a)kernel.org # 6.16.x
Link: https://lore.kernel.org/r/70e38f2c4a549063de54052c9f64929705313526.17577089…
diff --git a/arch/x86/include/asm/sev.h b/arch/x86/include/asm/sev.h
index 02236962fdb1..465b19fd1a2d 100644
--- a/arch/x86/include/asm/sev.h
+++ b/arch/x86/include/asm/sev.h
@@ -562,6 +562,24 @@ enum es_result sev_es_ghcb_hv_call(struct ghcb *ghcb,
extern struct ghcb *boot_ghcb;
+static inline void sev_evict_cache(void *va, int npages)
+{
+ volatile u8 val __always_unused;
+ u8 *bytes = va;
+ int page_idx;
+
+ /*
+ * For SEV guests, a read from the first/last cache-lines of a 4K page
+ * using the guest key is sufficient to cause a flush of all cache-lines
+ * associated with that 4K page without incurring all the overhead of a
+ * full CLFLUSH sequence.
+ */
+ for (page_idx = 0; page_idx < npages; page_idx++) {
+ val = bytes[page_idx * PAGE_SIZE];
+ val = bytes[page_idx * PAGE_SIZE + PAGE_SIZE - 1];
+ }
+}
+
#else /* !CONFIG_AMD_MEM_ENCRYPT */
#define snp_vmpl 0
@@ -605,6 +623,7 @@ static inline int snp_send_guest_request(struct snp_msg_desc *mdesc,
static inline int snp_svsm_vtpm_send_command(u8 *buffer) { return -ENODEV; }
static inline void __init snp_secure_tsc_prepare(void) { }
static inline void __init snp_secure_tsc_init(void) { }
+static inline void sev_evict_cache(void *va, int npages) {}
#endif /* CONFIG_AMD_MEM_ENCRYPT */
@@ -619,24 +638,6 @@ int rmp_make_shared(u64 pfn, enum pg_level level);
void snp_leak_pages(u64 pfn, unsigned int npages);
void kdump_sev_callback(void);
void snp_fixup_e820_tables(void);
-
-static inline void sev_evict_cache(void *va, int npages)
-{
- volatile u8 val __always_unused;
- u8 *bytes = va;
- int page_idx;
-
- /*
- * For SEV guests, a read from the first/last cache-lines of a 4K page
- * using the guest key is sufficient to cause a flush of all cache-lines
- * associated with that 4K page without incurring all the overhead of a
- * full CLFLUSH sequence.
- */
- for (page_idx = 0; page_idx < npages; page_idx++) {
- val = bytes[page_idx * PAGE_SIZE];
- val = bytes[page_idx * PAGE_SIZE + PAGE_SIZE - 1];
- }
-}
#else
static inline bool snp_probe_rmptable_info(void) { return false; }
static inline int snp_rmptable_init(void) { return -ENOSYS; }
@@ -652,7 +653,6 @@ static inline int rmp_make_shared(u64 pfn, enum pg_level level) { return -ENODEV
static inline void snp_leak_pages(u64 pfn, unsigned int npages) {}
static inline void kdump_sev_callback(void) { }
static inline void snp_fixup_e820_tables(void) {}
-static inline void sev_evict_cache(void *va, int npages) {}
#endif
#endif
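The guard-plus-stub pattern that this patch moves can be sketched outside the kernel as follows. Everything named MODEL_* or model_* is hypothetical: MODEL_GUEST_SUPPORT stands in for the config option, the page size is assumed to be 4K, and the function returns a read count purely so the effect is observable, whereas the kernel function returns void.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_PAGE_SIZE 4096

/* Hypothetical build switch standing in for the guest config option. */
#ifndef MODEL_GUEST_SUPPORT
#define MODEL_GUEST_SUPPORT 1
#endif

#if MODEL_GUEST_SUPPORT
/*
 * Real variant: read the first and last byte of every page. Returns
 * the number of reads performed (illustration only).
 */
static inline int model_evict_cache(void *va, int npages)
{
	volatile uint8_t val = 0;
	uint8_t *bytes = va;
	int reads = 0;
	int page_idx;

	assert(va != NULL && npages >= 0);

	for (page_idx = 0; page_idx < npages; page_idx++) {
		val = bytes[(size_t)page_idx * MODEL_PAGE_SIZE];
		val = bytes[(size_t)page_idx * MODEL_PAGE_SIZE + MODEL_PAGE_SIZE - 1];
		reads += 2;
	}
	(void)val;

	return reads;
}
#else
/* Stub variant: compiled when the guard is off, does nothing. */
static inline int model_evict_cache(void *va, int npages)
{
	(void)va;
	(void)npages;
	return 0;
}
#endif
```

The point of the sketch is that callers silently get the no-op stub whenever the guarding option is off, so the real implementation has to sit under the option that guest builds actually enable.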
The patch below does not apply to the 6.1-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.1.y
git checkout FETCH_HEAD
git cherry-pick -x 68f27f7c7708183e7873c585ded2f1b057ac5b97
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable@vger.kernel.org>' --in-reply-to '2025092104-booting-overstate-c9cf@gregkh' --subject-prefix 'PATCH 6.1.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 68f27f7c7708183e7873c585ded2f1b057ac5b97 Mon Sep 17 00:00:00 2001
From: Krzysztof Kozlowski <krzysztof.kozlowski(a)linaro.org>
Date: Thu, 4 Sep 2025 12:18:50 +0200
Subject: [PATCH] ASoC: qcom: q6apm-lpass-dais: Fix NULL pointer dereference if
source graph failed
If an earlier opening of the source graph fails (e.g. the ADSP rejects it
due to an incorrect audioreach topology), the graph is closed and
"dai_data->graph[dai->id]" is assigned NULL. Preparing the DAI for the sink
graph continues though, and the next call to q6apm_lpass_dai_prepare()
receives dai_data->graph[dai->id]=NULL, leading to a NULL pointer
exception:
qcom-apm gprsvc:service:2:1: Error (1) Processing 0x01001002 cmd
qcom-apm gprsvc:service:2:1: DSP returned error[1001002] 1
q6apm-lpass-dais 30000000.remoteproc:glink-edge:gpr:service@1:bedais: fail to start APM port 78
q6apm-lpass-dais 30000000.remoteproc:glink-edge:gpr:service@1:bedais: ASoC: error at snd_soc_pcm_dai_prepare on TX_CODEC_DMA_TX_3: -22
Unable to handle kernel NULL pointer dereference at virtual address 00000000000000a8
...
Call trace:
q6apm_graph_media_format_pcm+0x48/0x120 (P)
q6apm_lpass_dai_prepare+0x110/0x1b4
snd_soc_pcm_dai_prepare+0x74/0x108
__soc_pcm_prepare+0x44/0x160
dpcm_be_dai_prepare+0x124/0x1c0
Fixes: 30ad723b93ad ("ASoC: qdsp6: audioreach: add q6apm lpass dai support")
Cc: stable(a)vger.kernel.org
Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski(a)linaro.org>
Reviewed-by: Srinivas Kandagatla <srinivas.kandagatla(a)oss.qualcomm.com>
Message-ID: <20250904101849.121503-2-krzysztof.kozlowski(a)linaro.org>
Signed-off-by: Mark Brown <broonie(a)kernel.org>
diff --git a/sound/soc/qcom/qdsp6/q6apm-lpass-dais.c b/sound/soc/qcom/qdsp6/q6apm-lpass-dais.c
index a0d90462fd6a..20974f10406b 100644
--- a/sound/soc/qcom/qdsp6/q6apm-lpass-dais.c
+++ b/sound/soc/qcom/qdsp6/q6apm-lpass-dais.c
@@ -213,8 +213,10 @@ static int q6apm_lpass_dai_prepare(struct snd_pcm_substream *substream, struct s
return 0;
err:
- q6apm_graph_close(dai_data->graph[dai->id]);
- dai_data->graph[dai->id] = NULL;
+ if (substream->stream == SNDRV_PCM_STREAM_PLAYBACK) {
+ q6apm_graph_close(dai_data->graph[dai->id]);
+ dai_data->graph[dai->id] = NULL;
+ }
return rc;
}
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.10.y
git checkout FETCH_HEAD
git cherry-pick -x 96fa515e70f3e4b98685ef8cac9d737fc62f10e1
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025092135-stinky-correct-5051@gregkh' --subject-prefix 'PATCH 5.10.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 96fa515e70f3e4b98685ef8cac9d737fc62f10e1 Mon Sep 17 00:00:00 2001
From: Qu Wenruo <wqu(a)suse.com>
Date: Tue, 16 Sep 2025 07:54:06 +0930
Subject: [PATCH] btrfs: tree-checker: fix the incorrect inode ref size check
[BUG]
Inside check_inode_ref(), we need to make sure every structure,
including the btrfs_inode_extref header, is covered by the item. But
our code is incorrectly using "sizeof(iref)", where @iref is just a
pointer.
This means "sizeof(iref)" will always be "sizeof(void *)", which is much
smaller than "sizeof(struct btrfs_inode_extref)".
This will allow some bad inode extrefs to sneak in, defeating tree-checker.
[FIX]
Fix the typo by calling "sizeof(*iref)", which is the same as
"sizeof(struct btrfs_inode_extref)", and will be the correct behavior we
want.
Fixes: 71bf92a9b877 ("btrfs: tree-checker: Add check for INODE_REF")
CC: stable(a)vger.kernel.org # 6.1+
Reviewed-by: Johannes Thumshirn <johannes.thumshirn(a)wdc.com>
Reviewed-by: Filipe Manana <fdmanana(a)suse.com>
Signed-off-by: Qu Wenruo <wqu(a)suse.com>
Reviewed-by: David Sterba <dsterba(a)suse.com>
Signed-off-by: David Sterba <dsterba(a)suse.com>
diff --git a/fs/btrfs/tree-checker.c b/fs/btrfs/tree-checker.c
index 0f556f4de3f9..a997c7cc35a2 100644
--- a/fs/btrfs/tree-checker.c
+++ b/fs/btrfs/tree-checker.c
@@ -1756,10 +1756,10 @@ static int check_inode_ref(struct extent_buffer *leaf,
while (ptr < end) {
u16 namelen;
- if (unlikely(ptr + sizeof(iref) > end)) {
+ if (unlikely(ptr + sizeof(*iref) > end)) {
inode_ref_err(leaf, slot,
"inode ref overflow, ptr %lu end %lu inode_ref_size %zu",
- ptr, end, sizeof(iref));
+ ptr, end, sizeof(*iref));
return -EUCLEAN;
}
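For readers less familiar with this class of bug, the difference between
"sizeof(iref)" and "sizeof(*iref)" can be reproduced outside the kernel.
The following is only a minimal sketch: struct demo_extref and both helper
functions are hypothetical names made up for illustration, and the struct is
not the real btrfs_inode_extref layout.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for an on-disk header; NOT the real btrfs_inode_extref. */
struct demo_extref {
	uint64_t parent_objectid;
	uint64_t index;
	uint16_t name_len;
} __attribute__((packed));

/* Buggy form: sizeof(ref) is the size of the pointer itself. */
static int header_fits_buggy(size_t ptr, size_t end)
{
	const struct demo_extref *ref = NULL;

	assert(ptr <= end);
	return ptr + sizeof(ref) <= end;
}

/* Fixed form: sizeof(*ref) is the size of the pointed-to structure. */
static int header_fits_fixed(size_t ptr, size_t end)
{
	const struct demo_extref *ref = NULL;

	assert(ptr <= end);
	return ptr + sizeof(*ref) <= end;
}
```

With only 10 bytes left in the item, the buggy check accepts a header that
needs 18 bytes in this sketch, while the fixed check rejects it.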
[BUG]
With my local branch to enable bs > ps support for btrfs, sometimes I
hit the following ASSERT() inside submit_one_sector():
ASSERT(block_start != EXTENT_MAP_HOLE);
Please note that it's not yet possible to hit this ASSERT() in the wild,
as it requires btrfs bs > ps support, which is not even in the
development branch.
But on the other hand, there is also a very low chance to hit the above
ASSERT() with bs < ps cases, so this is an existing bug affecting not only
the incoming bs > ps support but also the existing bs < ps support.
[CAUSE]
First, that ASSERT() means we're trying to submit a dirty block
without a real extent map or an ordered extent backing it.
Furthermore with extra debugging, the folio triggering such ASSERT() is
always larger than the fs block size in my bs > ps case.
(8K block size, 4K page size)
After some more debugging, the ASSERT() is triggered by the following
sequence:
extent_writepage()
| We got a 32K folio (4 fs blocks) at file offset 0, and the fs block
| size is 8K, page size is 4K.
| And there is another 8K folio at file offset 32K, which is also
| dirty.
| So the filemap layout looks like the following:
|
| "||" is the folio boundary in the filemap.
| "//" is the dirty range.
|
| 0 8K 16K 24K 32K 40K
| |////////| |//////////////////////||////////|
|
|- writepage_delalloc()
| |- find_lock_delalloc_range() for [0, 8K)
| | Now range [0, 8K) is properly locked.
| |
| |- find_lock_delalloc_range() for [16K, 40K)
| | |- btrfs_find_delalloc_range() returned range [16K, 40K)
| | |- lock_delalloc_folios() succeeded.
| | |
| | | The filemap range [32K, 40K) got dropped from filemap.
| | |
| | |- lock_delalloc_folios() failed with -EAGAIN.
| | | As it failed to lock the folio at [32K, 40K).
| | |
| | |- loops = 1;
| | |- max_bytes = PAGE_SIZE;
| | |- goto again;
| | | This will re-do the lookup for dirty delalloc ranges.
| | |
| | |- btrfs_find_delalloc_range() called with @max_bytes == 4K
| | | This is smaller than block size, so
| | | btrfs_find_delalloc_range() is unable to return any range.
| | \- return false;
| |
| \- Now only range [0, 8K) has an OE for it, but for dirty range
| [16K, 32K) it's dirty without an OE.
| This breaks the assumption that writepage_delalloc() will find
| and lock all dirty ranges inside the folio.
|
|- extent_writepage_io()
|- submit_one_sector() for [0, 8K)
| Succeeded
|
|- submit_one_sector() for [16K, 24K)
Triggering the ASSERT(), as there is no OE, and the original
extent map is a hole.
Please note that this also exposes the same problem for bs < ps
support, e.g. with 64K page size and 4K block size.
If we fail to lock a folio and fall back into the "loops = 1;"
branch, we will re-do the search using 64K as max_bytes.
That search may again fail to lock the next folio, and exit early without
handling all dirty blocks inside the folio.
[FIX]
Instead of using the fixed size PAGE_SIZE as @max_bytes, use
@sectorsize, so that we are ensured to find and lock any remaining
blocks inside the folio.
And since we're here, add an extra ASSERT()
before calling btrfs_find_delalloc_range() to make sure @max_bytes is
no smaller than a block, to avoid false negatives.
Cc: stable(a)vger.kernel.org #5.15+
Signed-off-by: Qu Wenruo <wqu(a)suse.com>
---
fs/btrfs/extent_io.c | 14 +++++++++++---
1 file changed, 11 insertions(+), 3 deletions(-)
diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
index ca7174fa0240..2fd82055a779 100644
--- a/fs/btrfs/extent_io.c
+++ b/fs/btrfs/extent_io.c
@@ -393,6 +393,13 @@ noinline_for_stack bool find_lock_delalloc_range(struct inode *inode,
/* step one, find a bunch of delalloc bytes starting at start */
delalloc_start = *start;
delalloc_end = 0;
+
+ /*
+ * If @max_bytes is smaller than a block, btrfs_find_delalloc_range() can
+ * return early without handling any dirty ranges.
+ */
+ ASSERT(max_bytes >= fs_info->sectorsize);
+
found = btrfs_find_delalloc_range(tree, &delalloc_start, &delalloc_end,
max_bytes, &cached_state);
if (!found || delalloc_end <= *start || delalloc_start > orig_end) {
@@ -423,13 +430,14 @@ noinline_for_stack bool find_lock_delalloc_range(struct inode *inode,
delalloc_end);
ASSERT(!ret || ret == -EAGAIN);
if (ret == -EAGAIN) {
- /* some of the folios are gone, lets avoid looping by
- * shortening the size of the delalloc range we're searching
+ /*
+ * Some of the folios are gone, lets avoid looping by
+ * shortening the size of the delalloc range we're searching.
*/
btrfs_free_extent_state(cached_state);
cached_state = NULL;
if (!loops) {
- max_bytes = PAGE_SIZE;
+ max_bytes = fs_info->sectorsize;
loops = 1;
goto again;
} else {
--
2.50.1
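To make the [CAUSE] section above easier to follow, here is a toy model of
a capped range search. It is not btrfs code: toy_find_dirty_range() and its
parameters are hypothetical, and the real btrfs_find_delalloc_range() works
on extent state, not on a bool array. The sketch only mirrors the statement
that a @max_bytes smaller than one block cannot return any range.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Toy model, NOT btrfs code: find the first run of dirty blocks at or after
 * @start, capped at @max_bytes. A cap smaller than one block leaves no room
 * for even a single block, so nothing can be returned.
 */
static bool toy_find_dirty_range(const bool *dirty, size_t nr_blocks,
				 size_t blocksize, size_t start,
				 size_t max_bytes, size_t *found_start,
				 size_t *found_len)
{
	size_t max_blocks;
	size_t i;
	size_t n = 0;

	assert(blocksize > 0);
	max_blocks = max_bytes / blocksize;
	i = start / blocksize;

	/* Skip clean blocks, then count dirty ones up to the cap. */
	while (i < nr_blocks && !dirty[i])
		i++;
	while (i + n < nr_blocks && dirty[i + n] && n < max_blocks)
		n++;
	if (n == 0)
		return false;
	*found_start = i * blocksize;
	*found_len = n * blocksize;
	return true;
}
```

With 8K blocks and the layout from the description (dirty at [0, 8K) and
[16K, 40K)), a retry capped at 4K finds nothing in this model, while a retry
capped at one block makes progress.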
The quilt patch titled
Subject: mm/damon/lru_sort: use param_ctx for damon_attrs staging
has been removed from the -mm tree. Its filename was
mm-damon-lru_sort-use-param_ctx-for-damon_attrs-staging.patch
This patch was dropped because it was merged into the mm-stable branch
of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
------------------------------------------------------
From: SeongJae Park <sj(a)kernel.org>
Subject: mm/damon/lru_sort: use param_ctx for damon_attrs staging
Date: Mon, 15 Sep 2025 20:15:49 -0700
damon_lru_sort_apply_parameters() allocates a new DAMON context, stages
user-specified DAMON parameters on it, and commits them to the running DAMON
context at once, using damon_commit_ctx(). The code, however, directly
updates the monitoring attributes of the running context, and those
attributes are overwritten by the later damon_commit_ctx() call. This means
that the monitoring attribute parameters have no effect. Fix the
wrong use of the parameter context.
Link: https://lkml.kernel.org/r/20250916031549.115326-1-sj@kernel.org
Fixes: a30969436428 ("mm/damon/lru_sort: use damon_commit_ctx()")
Signed-off-by: SeongJae Park <sj(a)kernel.org>
Reviewed-by: Joshua Hahn <joshua.hahnjy(a)gmail.com>
Cc: Joshua Hahn <joshua.hahnjy(a)gmail.com>
Cc: <stable(a)vger.kernel.org> [6.11+]
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/damon/lru_sort.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
--- a/mm/damon/lru_sort.c~mm-damon-lru_sort-use-param_ctx-for-damon_attrs-staging
+++ a/mm/damon/lru_sort.c
@@ -219,7 +219,7 @@ static int damon_lru_sort_apply_paramete
goto out;
}
- err = damon_set_attrs(ctx, &damon_lru_sort_mon_attrs);
+ err = damon_set_attrs(param_ctx, &damon_lru_sort_mon_attrs);
if (err)
goto out;
_
Patches currently in -mm which might be from sj(a)kernel.org are
mm-damon-sysfs-set-damon_ctx-min_sz_region-only-for-paddr-use-case.patch
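The stage-then-commit ordering problem described in the lru_sort patch above
can be sketched in plain C. This is a simplified illustration only:
struct toy_ctx, toy_commit() and the apply_*() helpers are hypothetical
names, and toy_commit() merely assumes that the commit step copies every
staged field onto the running context.

```c
#include <assert.h>

/* Hypothetical, heavily simplified context; NOT the real struct damon_ctx. */
struct toy_ctx {
	unsigned long sample_interval;
	unsigned long quota_ms;
};

/* Stand-in for a commit step that copies every staged field. */
static void toy_commit(struct toy_ctx *running, const struct toy_ctx *staged)
{
	assert(running && staged);
	*running = *staged;
}

/* Buggy order: the attribute goes to the running context, then is overwritten. */
static void apply_buggy(struct toy_ctx *running, unsigned long interval)
{
	struct toy_ctx param = { .sample_interval = 5000, .quota_ms = 10 };

	running->sample_interval = interval;	/* lost by the commit below */
	toy_commit(running, &param);
}

/* Fixed order: stage on the parameter context, then commit once. */
static void apply_fixed(struct toy_ctx *running, unsigned long interval)
{
	struct toy_ctx param = { .sample_interval = 5000, .quota_ms = 10 };

	param.sample_interval = interval;
	toy_commit(running, &param);
}
```

In the buggy variant the user-supplied interval never survives the commit,
which matches the "parameters are not really working" symptom.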
This series backports seven commits from v5.15.y that update minmax.h
and related code:
- ed6e37e30826 ("tracing: Define the is_signed_type() macro once")
- 998f03984e25 ("minmax: sanity check constant bounds when clamping")
- d470787b25e6 ("minmax: clamp more efficiently by avoiding extra
comparison")
- 1c2ee5bc9f11 ("minmax: fix header inclusions")
- d53b5d862acd ("minmax: allow min()/max()/clamp() if the arguments
have the same signedness.")
- 7ed91c5560df ("minmax: allow comparisons of 'int' against 'unsigned
char/short'")
- 22f7794ef5a3 ("minmax: relax check to allow comparison between
unsigned arguments and signed constants")
The main motivation is commit d53b5d862acd, which removes the strict
type check in min()/max() when both arguments have the same signedness.
Without this, kernel 5.10 builds can emit warnings that become build
failures when -Werror is used.
Additionally, commit ed6e37e30826 from tracing is required as a
dependency; without it, compilation fails.
Andy Shevchenko (1):
minmax: fix header inclusions
Bart Van Assche (1):
tracing: Define the is_signed_type() macro once
David Laight (3):
minmax: allow min()/max()/clamp() if the arguments have the same
signedness.
minmax: allow comparisons of 'int' against 'unsigned char/short'
minmax: relax check to allow comparison between unsigned arguments and
signed constants
Jason A. Donenfeld (2):
minmax: sanity check constant bounds when clamping
minmax: clamp more efficiently by avoiding extra comparison
include/linux/compiler.h | 6 +++
include/linux/minmax.h | 89 ++++++++++++++++++++++++++----------
include/linux/overflow.h | 1 -
include/linux/trace_events.h | 2 -
4 files changed, 70 insertions(+), 28 deletions(-)
--
2.47.3
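As background for why signedness is the property worth checking, the sketch
below shows the hazard in plain C. It does not use the kernel's min()
implementation; min_mixed(), min_same_signedness() and minmax_demo() are
hypothetical helpers, and the explicit cast only spells out what the usual
arithmetic conversions would do implicitly in a mixed comparison.

```c
#include <assert.h>

/* Mixed signedness: the int is converted to unsigned int before comparing. */
static unsigned int min_mixed(int a, unsigned int b)
{
	/* What "a < b" does implicitly, written out to make it visible. */
	return (unsigned int)a < b ? (unsigned int)a : b;
}

/* Same signedness, different widths: the conversion preserves both values. */
static long min_same_signedness(int a, long b)
{
	return a < b ? a : b;
}

/* Small self-check; call it from a test or from main(). */
int minmax_demo(void)
{
	/* -1 becomes UINT_MAX, so the "minimum" of -1 and 1 comes out as 1. */
	assert(min_mixed(-1, 1u) == 1u);
	/* int vs long keeps -1 as -1, so the result is the expected one. */
	assert(min_same_signedness(-1, 1L) == -1L);
	return 1;
}
```

Comparing an int with a long is harmless even though the types differ,
whereas comparing an int with an unsigned int can silently pick the wrong
value.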
The patch below does not apply to the 6.12-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.12.y
git checkout FETCH_HEAD
git cherry-pick -x 225d1ee0f5ba3218d1814d36564fdb5f37b50474
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025092126-upstream-favorite-2f89@gregkh' --subject-prefix 'PATCH 6.12.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 225d1ee0f5ba3218d1814d36564fdb5f37b50474 Mon Sep 17 00:00:00 2001
From: Antheas Kapenekakis <lkml(a)antheas.dev>
Date: Tue, 16 Sep 2025 09:28:18 +0200
Subject: [PATCH] platform/x86: asus-wmi: Re-add extra keys to ignore_key_wlan
quirk
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
It turns out that the dual screen models use 0x5E for attaching and
detaching the keyboard instead of 0x5F. So, re-add the codes by
reverting commit cf3940ac737d ("platform/x86: asus-wmi: Remove extra
keys from ignore_key_wlan quirk"). For our future reference, add a
comment next to 0x5E indicating that it is used for that purpose.
Fixes: cf3940ac737d ("platform/x86: asus-wmi: Remove extra keys from ignore_key_wlan quirk")
Reported-by: Rahul Chandra <rahul(a)chandra.net>
Closes: https://lore.kernel.org/all/10020-68c90c80-d-4ac6c580@106290038/
Cc: stable(a)kernel.org
Signed-off-by: Antheas Kapenekakis <lkml(a)antheas.dev>
Link: https://patch.msgid.link/20250916072818.196462-1-lkml@antheas.dev
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen(a)linux.intel.com>
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen(a)linux.intel.com>
diff --git a/drivers/platform/x86/asus-nb-wmi.c b/drivers/platform/x86/asus-nb-wmi.c
index 3a488cf9ca06..6a62bc5b02fd 100644
--- a/drivers/platform/x86/asus-nb-wmi.c
+++ b/drivers/platform/x86/asus-nb-wmi.c
@@ -673,6 +673,8 @@ static void asus_nb_wmi_key_filter(struct asus_wmi_driver *asus_wmi, int *code,
if (atkbd_reports_vol_keys)
*code = ASUS_WMI_KEY_IGNORE;
break;
+ case 0x5D: /* Wireless console Toggle */
+ case 0x5E: /* Wireless console Enable / Keyboard Attach, Detach */
case 0x5F: /* Wireless console Disable / Special Key */
if (quirks->key_wlan_event)
*code = quirks->key_wlan_event;
The patch below does not apply to the 5.15-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.15.y
git checkout FETCH_HEAD
git cherry-pick -x 5282491fc49d5614ac6ddcd012e5743eecb6a67c
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025092118-portside-cheesy-44d2@gregkh' --subject-prefix 'PATCH 5.15.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 5282491fc49d5614ac6ddcd012e5743eecb6a67c Mon Sep 17 00:00:00 2001
From: Namjae Jeon <linkinjeon(a)kernel.org>
Date: Wed, 10 Sep 2025 11:22:52 +0900
Subject: [PATCH] ksmbd: smbdirect: validate data_offset and data_length field
of smb_direct_data_transfer
If data_offset and data_length of the smb_direct_data_transfer struct are
invalid, an out-of-bounds access could happen.
This patch validates the data_offset and data_length fields in recv_done().
Cc: stable(a)vger.kernel.org
Fixes: 2ea086e35c3d ("ksmbd: add buffer validation for smb direct")
Reviewed-by: Stefan Metzmacher <metze(a)samba.org>
Reported-by: Luigino Camastra, Aisle Research <luigino.camastra(a)aisle.com>
Signed-off-by: Namjae Jeon <linkinjeon(a)kernel.org>
Signed-off-by: Steve French <stfrench(a)microsoft.com>
diff --git a/fs/smb/server/transport_rdma.c b/fs/smb/server/transport_rdma.c
index cc4322bfa1d6..d52f37578276 100644
--- a/fs/smb/server/transport_rdma.c
+++ b/fs/smb/server/transport_rdma.c
@@ -554,7 +554,7 @@ static void recv_done(struct ib_cq *cq, struct ib_wc *wc)
case SMB_DIRECT_MSG_DATA_TRANSFER: {
struct smb_direct_data_transfer *data_transfer =
(struct smb_direct_data_transfer *)recvmsg->packet;
- unsigned int data_length;
+ unsigned int data_offset, data_length;
int avail_recvmsg_count, receive_credits;
if (wc->byte_len <
@@ -565,14 +565,15 @@ static void recv_done(struct ib_cq *cq, struct ib_wc *wc)
}
data_length = le32_to_cpu(data_transfer->data_length);
- if (data_length) {
- if (wc->byte_len < sizeof(struct smb_direct_data_transfer) +
- (u64)data_length) {
- put_recvmsg(t, recvmsg);
- smb_direct_disconnect_rdma_connection(t);
- return;
- }
+ data_offset = le32_to_cpu(data_transfer->data_offset);
+ if (wc->byte_len < data_offset ||
+ wc->byte_len < (u64)data_offset + data_length) {
+ put_recvmsg(t, recvmsg);
+ smb_direct_disconnect_rdma_connection(t);
+ return;
+ }
+ if (data_length) {
if (t->full_packet_received)
recvmsg->first_segment = true;
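The shape of the offset/length validation in the ksmbd patch above can be
tried out in user space. The following is a simplified sketch, not the ksmbd
code: payload_in_bounds(), payload_in_bounds_naive() and
payload_self_check() are hypothetical helpers, with byte_len standing in
for wc->byte_len.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Check that the payload described by @data_offset and @data_length lies
 * inside a received buffer of @byte_len bytes.
 */
static bool payload_in_bounds(uint32_t byte_len, uint32_t data_offset,
			      uint32_t data_length)
{
	/* Widen before adding so offset + length cannot wrap at 32 bits. */
	uint64_t end = (uint64_t)data_offset + data_length;

	return data_offset <= byte_len && end <= byte_len;
}

/* Naive 32-bit variant, shown only to demonstrate the wraparound. */
static bool payload_in_bounds_naive(uint32_t byte_len, uint32_t data_offset,
				    uint32_t data_length)
{
	return data_offset + data_length <= byte_len;
}

/* Small self-check; call it from a test or from main(). */
void payload_self_check(void)
{
	assert(payload_in_bounds(100, 24, 76));
	assert(!payload_in_bounds(100, 24, 77));
	/* 0xFFFFFFF0 + 0x20 wraps to 0x10 in 32-bit arithmetic. */
	assert(payload_in_bounds_naive(100, 0xFFFFFFF0u, 0x20));
	assert(!payload_in_bounds(100, 0xFFFFFFF0u, 0x20));
}
```

The (u64) cast in the patch serves the same purpose as the widening here:
a huge data_offset cannot be hidden by the sum wrapping around.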
The patch below does not apply to the 6.12-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.12.y
git checkout FETCH_HEAD
git cherry-pick -x 98c6d259319ecf6e8d027abd3f14b81324b8c0ad
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025092158-payee-omega-5893@gregkh' --subject-prefix 'PATCH 6.12.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 98c6d259319ecf6e8d027abd3f14b81324b8c0ad Mon Sep 17 00:00:00 2001
From: Hugh Dickins <hughd(a)google.com>
Date: Mon, 8 Sep 2025 15:15:03 -0700
Subject: [PATCH] mm/gup: check ref_count instead of lru before migration
Patch series "mm: better GUP pin lru_add_drain_all()", v2.
Series of lru_add_drain_all()-related patches, arising from recent mm/gup
migration report from Will Deacon.
This patch (of 5):
Will Deacon reports:-
When taking a longterm GUP pin via pin_user_pages(),
__gup_longterm_locked() tries to migrate target folios that should not be
longterm pinned, for example because they reside in a CMA region or
movable zone. This is done by first pinning all of the target folios
anyway, collecting all of the longterm-unpinnable target folios into a
list, dropping the pins that were just taken and finally handing the list
off to migrate_pages() for the actual migration.
It is critically important that no unexpected references are held on the
folios being migrated, otherwise the migration will fail and
pin_user_pages() will return -ENOMEM to its caller. Unfortunately, it is
relatively easy to observe migration failures when running pKVM (which
uses pin_user_pages() on crosvm's virtual address space to resolve stage-2
page faults from the guest) on a 6.15-based Pixel 6 device and this
results in the VM terminating prematurely.
In the failure case, 'crosvm' has called mlock(MLOCK_ONFAULT) on its
mapping of guest memory prior to the pinning. Subsequently, when
pin_user_pages() walks the page-table, the relevant 'pte' is not present
and so the faulting logic allocates a new folio, mlocks it with
mlock_folio() and maps it in the page-table.
Since commit 2fbb0c10d1e8 ("mm/munlock: mlock_page() munlock_page() batch
by pagevec"), mlock/munlock operations on a folio (formerly page), are
deferred. For example, mlock_folio() takes an additional reference on the
target folio before placing it into a per-cpu 'folio_batch' for later
processing by mlock_folio_batch(), which drops the refcount once the
operation is complete. Processing of the batches is coupled with the LRU
batch logic and can be forcefully drained with lru_add_drain_all() but as
long as a folio remains unprocessed on the batch, its refcount will be
elevated.
This deferred batching therefore interacts poorly with the pKVM pinning
scenario as we can find ourselves in a situation where the migration code
fails to migrate a folio due to the elevated refcount from the pending
mlock operation.
Hugh Dickins adds:-
!folio_test_lru() has never been a very reliable way to tell if an
lru_add_drain_all() is worth calling, to remove LRU cache references to
make the folio migratable: the LRU flag may be set even while the folio is
held with an extra reference in a per-CPU LRU cache.
5.18 commit 2fbb0c10d1e8 may have made it more unreliable. Then 6.11
commit 33dfe9204f29 ("mm/gup: clear the LRU flag of a page before adding
to LRU batch") tried to make it reliable, by moving LRU flag clearing; but
missed the mlock/munlock batches, so still unreliable as reported.
And it turns out to be difficult to extend 33dfe9204f29's LRU flag
clearing to the mlock/munlock batches: if they do benefit from batching,
mlock/munlock cannot be so effective when easily suppressed while !LRU.
Instead, switch to an expected ref_count check, which was more reliable
all along: some more false positives (unhelpful drains) than before, and
never a guarantee that the folio will prove migratable, but better.
Note on PG_private_2: ceph and nfs are still using the deprecated
PG_private_2 flag, with the aid of netfs and filemap support functions.
Although it is consistently matched by an increment of folio ref_count,
folio_expected_ref_count() intentionally does not recognize it, and ceph
folio migration currently depends on that for PG_private_2 folios to be
rejected. New references to the deprecated flag are discouraged, so do
not add it into the collect_longterm_unpinnable_folios() calculation: but
longterm pinning of transiently PG_private_2 ceph and nfs folios (an
uncommon case) may invoke a redundant lru_add_drain_all(). And this makes
easy the backport to earlier releases: up to and including 6.12, btrfs
also used PG_private_2, but without a ref_count increment.
Note for stable backports: requires 6.16 commit 86ebd50224c0 ("mm:
add folio_expected_ref_count() for reference count calculation").
Link: https://lkml.kernel.org/r/41395944-b0e3-c3ac-d648-8ddd70451d28@google.com
Link: https://lkml.kernel.org/r/bd1f314a-fca1-8f19-cac0-b936c9614557@google.com
Fixes: 9a4e9f3b2d73 ("mm: update get_user_pages_longterm to migrate pages allocated from CMA region")
Signed-off-by: Hugh Dickins <hughd(a)google.com>
Reported-by: Will Deacon <will(a)kernel.org>
Closes: https://lore.kernel.org/linux-mm/20250815101858.24352-1-will@kernel.org/
Acked-by: Kiryl Shutsemau <kas(a)kernel.org>
Acked-by: David Hildenbrand <david(a)redhat.com>
Cc: "Aneesh Kumar K.V" <aneesh.kumar(a)kernel.org>
Cc: Axel Rasmussen <axelrasmussen(a)google.com>
Cc: Chris Li <chrisl(a)kernel.org>
Cc: Christoph Hellwig <hch(a)infradead.org>
Cc: Jason Gunthorpe <jgg(a)ziepe.ca>
Cc: Johannes Weiner <hannes(a)cmpxchg.org>
Cc: John Hubbard <jhubbard(a)nvidia.com>
Cc: Keir Fraser <keirf(a)google.com>
Cc: Konstantin Khlebnikov <koct9i(a)gmail.com>
Cc: Li Zhe <lizhe.67(a)bytedance.com>
Cc: Matthew Wilcox (Oracle) <willy(a)infradead.org>
Cc: Peter Xu <peterx(a)redhat.com>
Cc: Rik van Riel <riel(a)surriel.com>
Cc: Shivank Garg <shivankg(a)amd.com>
Cc: Vlastimil Babka <vbabka(a)suse.cz>
Cc: Wei Xu <weixugc(a)google.com>
Cc: yangge <yangge1116(a)126.com>
Cc: Yuanchu Xie <yuanchu(a)google.com>
Cc: Yu Zhao <yuzhao(a)google.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
diff --git a/mm/gup.c b/mm/gup.c
index adffe663594d..82aec6443c0a 100644
--- a/mm/gup.c
+++ b/mm/gup.c
@@ -2307,7 +2307,8 @@ static unsigned long collect_longterm_unpinnable_folios(
continue;
}
- if (!folio_test_lru(folio) && drain_allow) {
+ if (drain_allow && folio_ref_count(folio) !=
+ folio_expected_ref_count(folio) + 1) {
lru_add_drain_all();
drain_allow = false;
}
The patch below does not apply to the 6.16-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.16.y
git checkout FETCH_HEAD
git cherry-pick -x c62cff40481c037307a13becbda795f7afdcfebd
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025092116-ceramics-stratus-5d18@gregkh' --subject-prefix 'PATCH 6.16.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From c62cff40481c037307a13becbda795f7afdcfebd Mon Sep 17 00:00:00 2001
From: SeongJae Park <sj(a)kernel.org>
Date: Mon, 8 Sep 2025 19:22:38 -0700
Subject: [PATCH] samples/damon/mtier: avoid starting DAMON before
initialization
Commit 964314344eab ("samples/damon/mtier: support boot time enable
setup") applied the original patch [1] only partially. It is
missing the part that avoids starting DAMON before module initialization,
probably because of a mistake during a merge. Fix it by applying the
missed part again.
Link: https://lkml.kernel.org/r/20250909022238.2989-4-sj@kernel.org
Link: https://lore.kernel.org/20250706193207.39810-4-sj@kernel.org [1]
Fixes: 964314344eab ("samples/damon/mtier: support boot time enable setup")
Signed-off-by: SeongJae Park <sj(a)kernel.org>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
diff --git a/samples/damon/mtier.c b/samples/damon/mtier.c
index 7ebd352138e4..beaf36657dea 100644
--- a/samples/damon/mtier.c
+++ b/samples/damon/mtier.c
@@ -208,6 +208,9 @@ static int damon_sample_mtier_enable_store(
if (enabled == is_enabled)
return 0;
+ if (!init_called)
+ return 0;
+
if (enabled) {
err = damon_sample_mtier_start();
if (err)
The patch below does not apply to the 6.16-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.16.y
git checkout FETCH_HEAD
git cherry-pick -x f826edeb888c5a8bd1b6e95ae6a50b0db2b21902
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025092111-specked-enviably-906d@gregkh' --subject-prefix 'PATCH 6.16.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From f826edeb888c5a8bd1b6e95ae6a50b0db2b21902 Mon Sep 17 00:00:00 2001
From: SeongJae Park <sj(a)kernel.org>
Date: Mon, 8 Sep 2025 19:22:36 -0700
Subject: [PATCH] samples/damon/wsse: avoid starting DAMON before
initialization
Patch series "samples/damon: fix boot time enable handling fixup merge
mistakes".
The first three patches of the patch series "mm/damon: fix misc bugs in DAMON
modules" [1] tried to fix issues with enabling DAMON sample modules at boot
time. The issue is that the modules can crash if they are enabled before
DAMON is initialized, for example via boot time parameter options. The three
patches fixed this by avoiding starting DAMON before the
module initialization phase.
However, probably by a mistake during a merge, only half of the change was
merged, and the part that avoids starting DAMON before the module is
initialized was missed. So the problem is not solved and the modules
can still crash if enabled before DAMON is initialized. Fix this by
applying the unmerged parts again.
Note that the broken commits are merged into 6.17-rc1, but also backported
to relevant stable kernels. So this series also needs to be merged into
the stable kernels. Hence Cc-ing stable@.
This patch (of 3):
Commit 0ed1165c3727 ("samples/damon/wsse: fix boot time enable handling")
applied the original patch [2] only partially. It is missing the
part that avoids starting DAMON before module initialization, probably
because of a mistake during a merge. Fix it by applying the missed part
again.
Link: https://lkml.kernel.org/r/20250909022238.2989-1-sj@kernel.org
Link: https://lkml.kernel.org/r/20250909022238.2989-2-sj@kernel.org
Link: https://lkml.kernel.org/r/20250706193207.39810-1-sj@kernel.org [1]
Link: https://lore.kernel.org/20250706193207.39810-2-sj@kernel.org [2]
Fixes: 0ed1165c3727 ("samples/damon/wsse: fix boot time enable handling")
Signed-off-by: SeongJae Park <sj(a)kernel.org>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
diff --git a/samples/damon/wsse.c b/samples/damon/wsse.c
index da052023b099..21eaf15f987d 100644
--- a/samples/damon/wsse.c
+++ b/samples/damon/wsse.c
@@ -118,6 +118,9 @@ static int damon_sample_wsse_enable_store(
return 0;
if (enabled) {
+ if (!init_called)
+ return 0;
+
err = damon_sample_wsse_start();
if (err)
enabled = false;
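The init_called guard restored by the two samples/damon patches above
follows a common pattern for parameters that can be set before module
initialization. Below is a toy model of that pattern, not the sample module
code: all toy_*() names and start_count are hypothetical, and the sketch
assumes the init function acts on an enable request that was recorded
earlier.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy module state; the names are illustrative only. */
static bool init_called;
static bool enabled;
static int start_count;

static int toy_start(void)
{
	assert(init_called);	/* starting before init is the crash scenario */
	start_count++;
	return 0;
}

/* Parameter setter: may run during boot parameter parsing, before toy_init(). */
static int toy_enable_store(bool want)
{
	if (want == enabled)
		return 0;
	enabled = want;
	if (!init_called)
		return 0;	/* only record the request; toy_init() acts on it */
	return want ? toy_start() : 0;
}

/* Module init: honours a request recorded before initialization. */
static int toy_init(void)
{
	init_called = true;
	return enabled ? toy_start() : 0;
}
```

Without the early return in the setter, a boot time "enable" would reach
toy_start() before initialization, which is the situation the patches avoid.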
The patch below does not apply to the 6.1-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.1.y
git checkout FETCH_HEAD
git cherry-pick -x 3539b1467e94336d5854ebf976d9627bfb65d6c3
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025092128-embassy-flyable-e3fb@gregkh' --subject-prefix 'PATCH 6.1.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 3539b1467e94336d5854ebf976d9627bfb65d6c3 Mon Sep 17 00:00:00 2001
From: Jens Axboe <axboe(a)kernel.dk>
Date: Thu, 18 Sep 2025 10:21:14 -0600
Subject: [PATCH] io_uring: include dying ring in task_work "should cancel"
state
When running task_work for an exiting task, rather than perform the
issue retry attempt, the task_work is canceled. However, this isn't
done for a ring that has been closed. This can lead to requests being
successfully completed post the ring being closed, which is somewhat
confusing and surprising to an application.
Rather than just check the task exit state, also include the ring
ref state in deciding whether or not to terminate a given request when
run from task_work.
Cc: stable(a)vger.kernel.org # 6.1+
Link: https://github.com/axboe/liburing/discussions/1459
Reported-by: Benedek Thaler <thaler(a)thaler.hu>
Signed-off-by: Jens Axboe <axboe(a)kernel.dk>
diff --git a/io_uring/io_uring.c b/io_uring/io_uring.c
index 93633613a165..bcec12256f34 100644
--- a/io_uring/io_uring.c
+++ b/io_uring/io_uring.c
@@ -1406,8 +1406,10 @@ static void io_req_task_cancel(struct io_kiocb *req, io_tw_token_t tw)
void io_req_task_submit(struct io_kiocb *req, io_tw_token_t tw)
{
- io_tw_lock(req->ctx, tw);
- if (unlikely(io_should_terminate_tw()))
+ struct io_ring_ctx *ctx = req->ctx;
+
+ io_tw_lock(ctx, tw);
+ if (unlikely(io_should_terminate_tw(ctx)))
io_req_defer_failed(req, -EFAULT);
else if (req->flags & REQ_F_FORCE_ASYNC)
io_queue_iowq(req);
diff --git a/io_uring/io_uring.h b/io_uring/io_uring.h
index abc6de227f74..1880902be6fd 100644
--- a/io_uring/io_uring.h
+++ b/io_uring/io_uring.h
@@ -476,9 +476,9 @@ static inline bool io_allowed_run_tw(struct io_ring_ctx *ctx)
* 2) PF_KTHREAD is set, in which case the invoker of the task_work is
* our fallback task_work.
*/
-static inline bool io_should_terminate_tw(void)
+static inline bool io_should_terminate_tw(struct io_ring_ctx *ctx)
{
- return current->flags & (PF_KTHREAD | PF_EXITING);
+ return (current->flags & (PF_KTHREAD | PF_EXITING)) || percpu_ref_is_dying(&ctx->refs);
}
static inline void io_req_queue_tw_complete(struct io_kiocb *req, s32 res)
diff --git a/io_uring/poll.c b/io_uring/poll.c
index c786e587563b..6090a26975d4 100644
--- a/io_uring/poll.c
+++ b/io_uring/poll.c
@@ -224,7 +224,7 @@ static int io_poll_check_events(struct io_kiocb *req, io_tw_token_t tw)
{
int v;
- if (unlikely(io_should_terminate_tw()))
+ if (unlikely(io_should_terminate_tw(req->ctx)))
return -ECANCELED;
do {
diff --git a/io_uring/timeout.c b/io_uring/timeout.c
index 7f13bfa9f2b6..17e3aab0af36 100644
--- a/io_uring/timeout.c
+++ b/io_uring/timeout.c
@@ -324,7 +324,7 @@ static void io_req_task_link_timeout(struct io_kiocb *req, io_tw_token_t tw)
int ret;
if (prev) {
- if (!io_should_terminate_tw()) {
+ if (!io_should_terminate_tw(req->ctx)) {
struct io_cancel_data cd = {
.ctx = req->ctx,
.data = prev->cqe.user_data,
diff --git a/io_uring/uring_cmd.c b/io_uring/uring_cmd.c
index 053bac89b6c0..213716e10d70 100644
--- a/io_uring/uring_cmd.c
+++ b/io_uring/uring_cmd.c
@@ -118,7 +118,7 @@ static void io_uring_cmd_work(struct io_kiocb *req, io_tw_token_t tw)
struct io_uring_cmd *ioucmd = io_kiocb_to_cmd(req, struct io_uring_cmd);
unsigned int flags = IO_URING_F_COMPLETE_DEFER;
- if (io_should_terminate_tw())
+ if (io_should_terminate_tw(req->ctx))
flags |= IO_URING_F_TASK_DEAD;
/* task_work executor checks the deffered list completion */
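
For anyone preparing the backport, the behavioural change in io_uring/io_uring.h above can be modelled in user space. This is only a sketch: MODEL_PF_EXITING, MODEL_PF_KTHREAD, struct model_ring and both function names are hypothetical stand-ins, and the boolean refs_dying merely plays the role of percpu_ref_is_dying(&ctx->refs).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel task flags; the values are illustrative. */
#define MODEL_PF_EXITING 0x1u
#define MODEL_PF_KTHREAD 0x2u

struct model_ring {
	bool refs_dying;	/* models percpu_ref_is_dying(&ctx->refs) */
};

/* Old check: only the state of the task running the task_work matters. */
static bool should_terminate_old(unsigned int task_flags)
{
	return (task_flags & (MODEL_PF_KTHREAD | MODEL_PF_EXITING)) != 0;
}

/* New check: also terminate when the ring itself is being torn down. */
static bool should_terminate_new(unsigned int task_flags,
				 const struct model_ring *ring)
{
	assert(ring != NULL);
	return should_terminate_old(task_flags) || ring->refs_dying;
}
```

With a healthy task and a closed ring, the old check lets the request proceed while the new one cancels it, which is the case the commit message describes.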
The patch below does not apply to the 6.6-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.6.y
git checkout FETCH_HEAD
git cherry-pick -x 3539b1467e94336d5854ebf976d9627bfb65d6c3
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025092127-emit-dean-5272@gregkh' --subject-prefix 'PATCH 6.6.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 3539b1467e94336d5854ebf976d9627bfb65d6c3 Mon Sep 17 00:00:00 2001
From: Jens Axboe <axboe(a)kernel.dk>
Date: Thu, 18 Sep 2025 10:21:14 -0600
Subject: [PATCH] io_uring: include dying ring in task_work "should cancel"
state
When running task_work for an exiting task, rather than perform the
issue retry attempt, the task_work is canceled. However, this isn't
done for a ring that has been closed. This can lead to requests being
successfully completed post the ring being closed, which is somewhat
confusing and surprising to an application.
Rather than just check the task exit state, also include the ring
ref state in deciding whether or not to terminate a given request when
run from task_work.
Cc: stable(a)vger.kernel.org # 6.1+
Link: https://github.com/axboe/liburing/discussions/1459
Reported-by: Benedek Thaler <thaler(a)thaler.hu>
Signed-off-by: Jens Axboe <axboe(a)kernel.dk>
diff --git a/io_uring/io_uring.c b/io_uring/io_uring.c
index 93633613a165..bcec12256f34 100644
--- a/io_uring/io_uring.c
+++ b/io_uring/io_uring.c
@@ -1406,8 +1406,10 @@ static void io_req_task_cancel(struct io_kiocb *req, io_tw_token_t tw)
void io_req_task_submit(struct io_kiocb *req, io_tw_token_t tw)
{
- io_tw_lock(req->ctx, tw);
- if (unlikely(io_should_terminate_tw()))
+ struct io_ring_ctx *ctx = req->ctx;
+
+ io_tw_lock(ctx, tw);
+ if (unlikely(io_should_terminate_tw(ctx)))
io_req_defer_failed(req, -EFAULT);
else if (req->flags & REQ_F_FORCE_ASYNC)
io_queue_iowq(req);
diff --git a/io_uring/io_uring.h b/io_uring/io_uring.h
index abc6de227f74..1880902be6fd 100644
--- a/io_uring/io_uring.h
+++ b/io_uring/io_uring.h
@@ -476,9 +476,9 @@ static inline bool io_allowed_run_tw(struct io_ring_ctx *ctx)
* 2) PF_KTHREAD is set, in which case the invoker of the task_work is
* our fallback task_work.
*/
-static inline bool io_should_terminate_tw(void)
+static inline bool io_should_terminate_tw(struct io_ring_ctx *ctx)
{
- return current->flags & (PF_KTHREAD | PF_EXITING);
+ return (current->flags & (PF_KTHREAD | PF_EXITING)) || percpu_ref_is_dying(&ctx->refs);
}
static inline void io_req_queue_tw_complete(struct io_kiocb *req, s32 res)
diff --git a/io_uring/poll.c b/io_uring/poll.c
index c786e587563b..6090a26975d4 100644
--- a/io_uring/poll.c
+++ b/io_uring/poll.c
@@ -224,7 +224,7 @@ static int io_poll_check_events(struct io_kiocb *req, io_tw_token_t tw)
{
int v;
- if (unlikely(io_should_terminate_tw()))
+ if (unlikely(io_should_terminate_tw(req->ctx)))
return -ECANCELED;
do {
diff --git a/io_uring/timeout.c b/io_uring/timeout.c
index 7f13bfa9f2b6..17e3aab0af36 100644
--- a/io_uring/timeout.c
+++ b/io_uring/timeout.c
@@ -324,7 +324,7 @@ static void io_req_task_link_timeout(struct io_kiocb *req, io_tw_token_t tw)
int ret;
if (prev) {
- if (!io_should_terminate_tw()) {
+ if (!io_should_terminate_tw(req->ctx)) {
struct io_cancel_data cd = {
.ctx = req->ctx,
.data = prev->cqe.user_data,
diff --git a/io_uring/uring_cmd.c b/io_uring/uring_cmd.c
index 053bac89b6c0..213716e10d70 100644
--- a/io_uring/uring_cmd.c
+++ b/io_uring/uring_cmd.c
@@ -118,7 +118,7 @@ static void io_uring_cmd_work(struct io_kiocb *req, io_tw_token_t tw)
struct io_uring_cmd *ioucmd = io_kiocb_to_cmd(req, struct io_uring_cmd);
unsigned int flags = IO_URING_F_COMPLETE_DEFER;
- if (io_should_terminate_tw())
+ if (io_should_terminate_tw(req->ctx))
flags |= IO_URING_F_TASK_DEAD;
/* task_work executor checks the deffered list completion */
The patch below does not apply to the 6.12-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.12.y
git checkout FETCH_HEAD
git cherry-pick -x a09a8a1fbb374e0053b97306da9dbc05bd384685
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025092110-music-knoll-828f@gregkh' --subject-prefix 'PATCH 6.12.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From a09a8a1fbb374e0053b97306da9dbc05bd384685 Mon Sep 17 00:00:00 2001
From: Hugh Dickins <hughd(a)google.com>
Date: Mon, 8 Sep 2025 15:16:53 -0700
Subject: [PATCH] mm/gup: local lru_add_drain() to avoid lru_add_drain_all()
In many cases, if collect_longterm_unpinnable_folios() does need to drain
the LRU cache to release a reference, the cache in question is on this
same CPU, and much more efficiently drained by a preliminary local
lru_add_drain(), than the later cross-CPU lru_add_drain_all().
Marked for stable, to counter the increase in lru_add_drain_all()s from
"mm/gup: check ref_count instead of lru before migration". Note for clean
backports: can take 6.16 commit a03db236aebf ("gup: optimize longterm
pin_user_pages() for large folio") first.
Link: https://lkml.kernel.org/r/66f2751f-283e-816d-9530-765db7edc465@google.com
Signed-off-by: Hugh Dickins <hughd(a)google.com>
Acked-by: David Hildenbrand <david(a)redhat.com>
Cc: "Aneesh Kumar K.V" <aneesh.kumar(a)kernel.org>
Cc: Axel Rasmussen <axelrasmussen(a)google.com>
Cc: Chris Li <chrisl(a)kernel.org>
Cc: Christoph Hellwig <hch(a)infradead.org>
Cc: Jason Gunthorpe <jgg(a)ziepe.ca>
Cc: Johannes Weiner <hannes(a)cmpxchg.org>
Cc: John Hubbard <jhubbard(a)nvidia.com>
Cc: Keir Fraser <keirf(a)google.com>
Cc: Konstantin Khlebnikov <koct9i(a)gmail.com>
Cc: Li Zhe <lizhe.67(a)bytedance.com>
Cc: Matthew Wilcox (Oracle) <willy(a)infradead.org>
Cc: Peter Xu <peterx(a)redhat.com>
Cc: Rik van Riel <riel(a)surriel.com>
Cc: Shivank Garg <shivankg(a)amd.com>
Cc: Vlastimil Babka <vbabka(a)suse.cz>
Cc: Wei Xu <weixugc(a)google.com>
Cc: Will Deacon <will(a)kernel.org>
Cc: yangge <yangge1116(a)126.com>
Cc: Yuanchu Xie <yuanchu(a)google.com>
Cc: Yu Zhao <yuzhao(a)google.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
diff --git a/mm/gup.c b/mm/gup.c
index 82aec6443c0a..b47066a54f52 100644
--- a/mm/gup.c
+++ b/mm/gup.c
@@ -2287,8 +2287,8 @@ static unsigned long collect_longterm_unpinnable_folios(
struct pages_or_folios *pofs)
{
unsigned long collected = 0;
- bool drain_allow = true;
struct folio *folio;
+ int drained = 0;
long i = 0;
for (folio = pofs_get_folio(pofs, i); folio;
@@ -2307,10 +2307,17 @@ static unsigned long collect_longterm_unpinnable_folios(
continue;
}
- if (drain_allow && folio_ref_count(folio) !=
- folio_expected_ref_count(folio) + 1) {
+ if (drained == 0 &&
+ folio_ref_count(folio) !=
+ folio_expected_ref_count(folio) + 1) {
+ lru_add_drain();
+ drained = 1;
+ }
+ if (drained == 1 &&
+ folio_ref_count(folio) !=
+ folio_expected_ref_count(folio) + 1) {
lru_add_drain_all();
- drain_allow = false;
+ drained = 2;
}
if (!folio_isolate_lru(folio))
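
The two-stage drain in the mm/gup.c hunk above is a small state machine: try the cheap local drain once, and escalate to the cross-CPU drain only if an unexpected reference is still present. A minimal user-space sketch follows; struct drain_model and visit_folio() are hypothetical, the counters stand in for lru_add_drain() and lru_add_drain_all(), and the two booleans model the result of the folio_ref_count() comparison before and after the local drain.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct drain_model {
	int drained;		/* 0: none yet, 1: local drain done, 2: global drain done */
	int local_drains;	/* models calls to lru_add_drain() */
	int global_drains;	/* models calls to lru_add_drain_all() */
};

/*
 * extra_ref:             the folio currently holds an unexpected reference
 * extra_ref_after_local: whether that reference survives a local drain
 */
static void visit_folio(struct drain_model *m, bool extra_ref,
			bool extra_ref_after_local)
{
	assert(m != NULL);

	if (m->drained == 0 && extra_ref) {
		m->local_drains++;		/* cheap: this CPU only */
		m->drained = 1;
		extra_ref = extra_ref_after_local;	/* re-check the refcount */
	}
	if (m->drained == 1 && extra_ref) {
		m->global_drains++;		/* expensive: every CPU */
		m->drained = 2;
	}
}
```

Each kind of drain runs at most once per call to the collecting function, and the global one is skipped entirely when the local drain already released the reference.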
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.4.y
git checkout FETCH_HEAD
git cherry-pick -x 1b34cbbf4f011a121ef7b2d7d6e6920a036d5285
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025092108-unmarked-tropical-1899@gregkh' --subject-prefix 'PATCH 5.4.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 1b34cbbf4f011a121ef7b2d7d6e6920a036d5285 Mon Sep 17 00:00:00 2001
From: Herbert Xu <herbert(a)gondor.apana.org.au>
Date: Tue, 16 Sep 2025 17:20:59 +0800
Subject: [PATCH] crypto: af_alg - Disallow concurrent writes in af_alg_sendmsg
Issuing two writes to the same af_alg socket is bogus as the
data will be interleaved in an unpredictable fashion. Furthermore,
concurrent writes may create inconsistencies in the internal
socket state.
Disallow this by adding a new ctx->write field that indicates
exclusive ownership for writing.
Fixes: 8ff590903d5 ("crypto: algif_skcipher - User-space interface for skcipher operations")
Reported-by: Muhammad Alifa Ramdhan <ramdhan(a)starlabs.sg>
Reported-by: Bing-Jhong Billy Jheng <billy(a)starlabs.sg>
Signed-off-by: Herbert Xu <herbert(a)gondor.apana.org.au>
diff --git a/crypto/af_alg.c b/crypto/af_alg.c
index 407f2c238f2c..ca6fdcc6c54a 100644
--- a/crypto/af_alg.c
+++ b/crypto/af_alg.c
@@ -970,6 +970,12 @@ int af_alg_sendmsg(struct socket *sock, struct msghdr *msg, size_t size,
}
lock_sock(sk);
+ if (ctx->write) {
+ release_sock(sk);
+ return -EBUSY;
+ }
+ ctx->write = true;
+
if (ctx->init && !ctx->more) {
if (ctx->used) {
err = -EINVAL;
@@ -1105,6 +1111,7 @@ int af_alg_sendmsg(struct socket *sock, struct msghdr *msg, size_t size,
unlock:
af_alg_data_wakeup(sk);
+ ctx->write = false;
release_sock(sk);
return copied ?: err;
diff --git a/include/crypto/if_alg.h b/include/crypto/if_alg.h
index f7b3b93f3a49..0c70f3a55575 100644
--- a/include/crypto/if_alg.h
+++ b/include/crypto/if_alg.h
@@ -135,6 +135,7 @@ struct af_alg_async_req {
* SG?
* @enc: Cryptographic operation to be performed when
* recvmsg is invoked.
+ * @write: True if we are in the middle of a write.
* @init: True if metadata has been sent.
* @len: Length of memory allocated for this data structure.
* @inflight: Non-zero when AIO requests are in flight.
@@ -151,10 +152,11 @@ struct af_alg_ctx {
size_t used;
atomic_t rcvused;
- bool more;
- bool merge;
- bool enc;
- bool init;
+ u32 more:1,
+ merge:1,
+ enc:1,
+ write:1,
+ init:1;
unsigned int len;
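
The ownership rule added to af_alg_sendmsg() above amounts to a begin/end pair around the write path. The sketch below is a user-space model under stated assumptions: struct model_ctx, model_write_begin() and model_write_end() are hypothetical names, and the caller is assumed to hold the lock that lock_sock() provides in the kernel, so no locking is shown here.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Flags packed into one word, mirroring the bitfield layout in the patch. */
struct model_ctx {
	unsigned int more:1,
		     write:1,	/* set while one writer owns the context */
		     init:1;
};

/* Claim write ownership; a second concurrent writer is rejected with -EBUSY. */
static int model_write_begin(struct model_ctx *ctx)
{
	assert(ctx != NULL);
	if (ctx->write)
		return -EBUSY;
	ctx->write = 1;
	return 0;
}

/* Release ownership on the way out, as the unlock: label does in the patch. */
static void model_write_end(struct model_ctx *ctx)
{
	assert(ctx != NULL);
	ctx->write = 0;
}
```

A backport to an older tree needs both halves: the early -EBUSY return before any state is touched, and the reset on every exit path that passed the check.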
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.10.y
git checkout FETCH_HEAD
git cherry-pick -x 1b34cbbf4f011a121ef7b2d7d6e6920a036d5285
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025092108-drinking-sloped-1caa@gregkh' --subject-prefix 'PATCH 5.10.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 1b34cbbf4f011a121ef7b2d7d6e6920a036d5285 Mon Sep 17 00:00:00 2001
From: Herbert Xu <herbert(a)gondor.apana.org.au>
Date: Tue, 16 Sep 2025 17:20:59 +0800
Subject: [PATCH] crypto: af_alg - Disallow concurrent writes in af_alg_sendmsg
Issuing two writes to the same af_alg socket is bogus as the
data will be interleaved in an unpredictable fashion. Furthermore,
concurrent writes may create inconsistencies in the internal
socket state.
Disallow this by adding a new ctx->write field that indicates
exclusive ownership for writing.
Fixes: 8ff590903d5 ("crypto: algif_skcipher - User-space interface for skcipher operations")
Reported-by: Muhammad Alifa Ramdhan <ramdhan(a)starlabs.sg>
Reported-by: Bing-Jhong Billy Jheng <billy(a)starlabs.sg>
Signed-off-by: Herbert Xu <herbert(a)gondor.apana.org.au>
diff --git a/crypto/af_alg.c b/crypto/af_alg.c
index 407f2c238f2c..ca6fdcc6c54a 100644
--- a/crypto/af_alg.c
+++ b/crypto/af_alg.c
@@ -970,6 +970,12 @@ int af_alg_sendmsg(struct socket *sock, struct msghdr *msg, size_t size,
}
lock_sock(sk);
+ if (ctx->write) {
+ release_sock(sk);
+ return -EBUSY;
+ }
+ ctx->write = true;
+
if (ctx->init && !ctx->more) {
if (ctx->used) {
err = -EINVAL;
@@ -1105,6 +1111,7 @@ int af_alg_sendmsg(struct socket *sock, struct msghdr *msg, size_t size,
unlock:
af_alg_data_wakeup(sk);
+ ctx->write = false;
release_sock(sk);
return copied ?: err;
diff --git a/include/crypto/if_alg.h b/include/crypto/if_alg.h
index f7b3b93f3a49..0c70f3a55575 100644
--- a/include/crypto/if_alg.h
+++ b/include/crypto/if_alg.h
@@ -135,6 +135,7 @@ struct af_alg_async_req {
* SG?
* @enc: Cryptographic operation to be performed when
* recvmsg is invoked.
+ * @write: True if we are in the middle of a write.
* @init: True if metadata has been sent.
* @len: Length of memory allocated for this data structure.
* @inflight: Non-zero when AIO requests are in flight.
@@ -151,10 +152,11 @@ struct af_alg_ctx {
size_t used;
atomic_t rcvused;
- bool more;
- bool merge;
- bool enc;
- bool init;
+ u32 more:1,
+ merge:1,
+ enc:1,
+ write:1,
+ init:1;
unsigned int len;
The patch below does not apply to the 5.15-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.15.y
git checkout FETCH_HEAD
git cherry-pick -x 1b34cbbf4f011a121ef7b2d7d6e6920a036d5285
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025092107-crowbar-posting-c6ba@gregkh' --subject-prefix 'PATCH 5.15.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 1b34cbbf4f011a121ef7b2d7d6e6920a036d5285 Mon Sep 17 00:00:00 2001
From: Herbert Xu <herbert(a)gondor.apana.org.au>
Date: Tue, 16 Sep 2025 17:20:59 +0800
Subject: [PATCH] crypto: af_alg - Disallow concurrent writes in af_alg_sendmsg
Issuing two writes to the same af_alg socket is bogus as the
data will be interleaved in an unpredictable fashion. Furthermore,
concurrent writes may create inconsistencies in the internal
socket state.
Disallow this by adding a new ctx->write field that indicates
exclusive ownership for writing.
Fixes: 8ff590903d5 ("crypto: algif_skcipher - User-space interface for skcipher operations")
Reported-by: Muhammad Alifa Ramdhan <ramdhan(a)starlabs.sg>
Reported-by: Bing-Jhong Billy Jheng <billy(a)starlabs.sg>
Signed-off-by: Herbert Xu <herbert(a)gondor.apana.org.au>
diff --git a/crypto/af_alg.c b/crypto/af_alg.c
index 407f2c238f2c..ca6fdcc6c54a 100644
--- a/crypto/af_alg.c
+++ b/crypto/af_alg.c
@@ -970,6 +970,12 @@ int af_alg_sendmsg(struct socket *sock, struct msghdr *msg, size_t size,
}
lock_sock(sk);
+ if (ctx->write) {
+ release_sock(sk);
+ return -EBUSY;
+ }
+ ctx->write = true;
+
if (ctx->init && !ctx->more) {
if (ctx->used) {
err = -EINVAL;
@@ -1105,6 +1111,7 @@ int af_alg_sendmsg(struct socket *sock, struct msghdr *msg, size_t size,
unlock:
af_alg_data_wakeup(sk);
+ ctx->write = false;
release_sock(sk);
return copied ?: err;
diff --git a/include/crypto/if_alg.h b/include/crypto/if_alg.h
index f7b3b93f3a49..0c70f3a55575 100644
--- a/include/crypto/if_alg.h
+++ b/include/crypto/if_alg.h
@@ -135,6 +135,7 @@ struct af_alg_async_req {
* SG?
* @enc: Cryptographic operation to be performed when
* recvmsg is invoked.
+ * @write: True if we are in the middle of a write.
* @init: True if metadata has been sent.
* @len: Length of memory allocated for this data structure.
* @inflight: Non-zero when AIO requests are in flight.
@@ -151,10 +152,11 @@ struct af_alg_ctx {
size_t used;
atomic_t rcvused;
- bool more;
- bool merge;
- bool enc;
- bool init;
+ u32 more:1,
+ merge:1,
+ enc:1,
+ write:1,
+ init:1;
unsigned int len;
The patch below does not apply to the 5.15-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.15.y
git checkout FETCH_HEAD
git cherry-pick -x 14e22b43df25dbd4301351b882486ea38892ae4f
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025092158-molehill-radiation-11c3@gregkh' --subject-prefix 'PATCH 5.15.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 14e22b43df25dbd4301351b882486ea38892ae4f Mon Sep 17 00:00:00 2001
From: "Matthieu Baerts (NGI0)" <matttbe(a)kernel.org>
Date: Fri, 12 Sep 2025 14:25:51 +0200
Subject: [PATCH] selftests: mptcp: connect: catch IO errors on listen side
IO errors were correctly printed to stderr, and propagated up to the
main loop for the server side, but the returned value was ignored. As a
consequence, the program for the listener side was no longer exiting
with an error code in case of IO issues.
Because of that, some issues might not have been seen. But very likely,
most issues either had an effect on the client side, or the file
transfer was not the expected one, e.g. the connection got reset before
the end. Still, it is better to fix this.
The main consequence of this issue is the error that was reported by the
selftests: the received and sent files were different, and the MIB
counters were not printed. Also, when such errors happened during the
'disconnect' tests, the program tried to continue until the timeout.
Now when an IO error is detected, the program exits directly with an
error.
Fixes: 05be5e273c84 ("selftests: mptcp: add disconnect tests")
Cc: stable(a)vger.kernel.org
Reviewed-by: Mat Martineau <martineau(a)kernel.org>
Reviewed-by: Geliang Tang <geliang(a)kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe(a)kernel.org>
Link: https://patch.msgid.link/20250912-net-mptcp-fix-sft-connect-v1-2-d40e77cbbf…
Signed-off-by: Jakub Kicinski <kuba(a)kernel.org>
diff --git a/tools/testing/selftests/net/mptcp/mptcp_connect.c b/tools/testing/selftests/net/mptcp/mptcp_connect.c
index 4f07ac9fa207..1408698df099 100644
--- a/tools/testing/selftests/net/mptcp/mptcp_connect.c
+++ b/tools/testing/selftests/net/mptcp/mptcp_connect.c
@@ -1093,6 +1093,7 @@ int main_loop_s(int listensock)
struct pollfd polls;
socklen_t salen;
int remotesock;
+ int err = 0;
int fd = 0;
again:
@@ -1125,7 +1126,7 @@ int main_loop_s(int listensock)
SOCK_TEST_TCPULP(remotesock, 0);
memset(&winfo, 0, sizeof(winfo));
- copyfd_io(fd, remotesock, 1, true, &winfo);
+ err = copyfd_io(fd, remotesock, 1, true, &winfo);
} else {
perror("accept");
return 1;
@@ -1134,10 +1135,10 @@ int main_loop_s(int listensock)
if (cfg_input)
close(fd);
- if (--cfg_repeat > 0)
+ if (!err && --cfg_repeat > 0)
goto again;
- return 0;
+ return err;
}
static void init_rng(void)
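
The control flow that the mptcp_connect.c hunks above give main_loop_s() can be written as a plain loop: keep the return value of the I/O step, stop repeating on the first error, and return that error to the caller. In this sketch io_fn, serve_repeatedly() and the two callbacks are hypothetical; do_io stands in for copyfd_io() and repeat for cfg_repeat.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical I/O step: returns 0 on success, non-zero on an I/O error. */
typedef int (*io_fn)(int iteration);

static int serve_repeatedly(io_fn do_io, int repeat, int *iterations_run)
{
	int err = 0;
	int i = 0;

	assert(do_io != NULL && iterations_run != NULL);

	do {
		err = do_io(i++);		/* previously the result was dropped */
	} while (!err && --repeat > 0);		/* stop on the first error */

	*iterations_run = i;
	return err;				/* previously always 0 */
}

/* Two illustrative callbacks. */
static int always_ok(int iteration) { (void)iteration; return 0; }
static int fail_on_second(int iteration) { return iteration == 1 ? 1 : 0; }
```

With fail_on_second and three requested repetitions, the loop stops after the second iteration and the caller sees a non-zero status instead of waiting for a timeout.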
The patch below does not apply to the 6.1-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.1.y
git checkout FETCH_HEAD
git cherry-pick -x 7f830e126dc357fc086905ce9730140fd4528d66
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025092126-fabulous-despair-ac21@gregkh' --subject-prefix 'PATCH 6.1.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 7f830e126dc357fc086905ce9730140fd4528d66 Mon Sep 17 00:00:00 2001
From: Tom Lendacky <thomas.lendacky(a)amd.com>
Date: Mon, 15 Sep 2025 11:04:12 -0500
Subject: [PATCH] x86/sev: Guard sev_evict_cache() with CONFIG_AMD_MEM_ENCRYPT
The sev_evict_cache() is guest-related code and should be guarded by
CONFIG_AMD_MEM_ENCRYPT, not CONFIG_KVM_AMD_SEV.
CONFIG_AMD_MEM_ENCRYPT=y is required for a guest to run properly as an SEV-SNP
guest, but a guest kernel built with CONFIG_KVM_AMD_SEV=n would get the stub
function of sev_evict_cache() instead of the version that performs the actual
eviction. Move the function declarations under the appropriate #ifdef.
Fixes: 7b306dfa326f ("x86/sev: Evict cache lines during SNP memory validation")
Signed-off-by: Tom Lendacky <thomas.lendacky(a)amd.com>
Signed-off-by: Borislav Petkov (AMD) <bp(a)alien8.de>
Cc: stable(a)kernel.org # 6.16.x
Link: https://lore.kernel.org/r/70e38f2c4a549063de54052c9f64929705313526.17577089…
diff --git a/arch/x86/include/asm/sev.h b/arch/x86/include/asm/sev.h
index 02236962fdb1..465b19fd1a2d 100644
--- a/arch/x86/include/asm/sev.h
+++ b/arch/x86/include/asm/sev.h
@@ -562,6 +562,24 @@ enum es_result sev_es_ghcb_hv_call(struct ghcb *ghcb,
extern struct ghcb *boot_ghcb;
+static inline void sev_evict_cache(void *va, int npages)
+{
+ volatile u8 val __always_unused;
+ u8 *bytes = va;
+ int page_idx;
+
+ /*
+ * For SEV guests, a read from the first/last cache-lines of a 4K page
+ * using the guest key is sufficient to cause a flush of all cache-lines
+ * associated with that 4K page without incurring all the overhead of a
+ * full CLFLUSH sequence.
+ */
+ for (page_idx = 0; page_idx < npages; page_idx++) {
+ val = bytes[page_idx * PAGE_SIZE];
+ val = bytes[page_idx * PAGE_SIZE + PAGE_SIZE - 1];
+ }
+}
+
#else /* !CONFIG_AMD_MEM_ENCRYPT */
#define snp_vmpl 0
@@ -605,6 +623,7 @@ static inline int snp_send_guest_request(struct snp_msg_desc *mdesc,
static inline int snp_svsm_vtpm_send_command(u8 *buffer) { return -ENODEV; }
static inline void __init snp_secure_tsc_prepare(void) { }
static inline void __init snp_secure_tsc_init(void) { }
+static inline void sev_evict_cache(void *va, int npages) {}
#endif /* CONFIG_AMD_MEM_ENCRYPT */
@@ -619,24 +638,6 @@ int rmp_make_shared(u64 pfn, enum pg_level level);
void snp_leak_pages(u64 pfn, unsigned int npages);
void kdump_sev_callback(void);
void snp_fixup_e820_tables(void);
-
-static inline void sev_evict_cache(void *va, int npages)
-{
- volatile u8 val __always_unused;
- u8 *bytes = va;
- int page_idx;
-
- /*
- * For SEV guests, a read from the first/last cache-lines of a 4K page
- * using the guest key is sufficient to cause a flush of all cache-lines
- * associated with that 4K page without incurring all the overhead of a
- * full CLFLUSH sequence.
- */
- for (page_idx = 0; page_idx < npages; page_idx++) {
- val = bytes[page_idx * PAGE_SIZE];
- val = bytes[page_idx * PAGE_SIZE + PAGE_SIZE - 1];
- }
-}
#else
static inline bool snp_probe_rmptable_info(void) { return false; }
static inline int snp_rmptable_init(void) { return -ENOSYS; }
@@ -652,7 +653,6 @@ static inline int rmp_make_shared(u64 pfn, enum pg_level level) { return -ENODEV
static inline void snp_leak_pages(u64 pfn, unsigned int npages) {}
static inline void kdump_sev_callback(void) { }
static inline void snp_fixup_e820_tables(void) {}
-static inline void sev_evict_cache(void *va, int npages) {}
#endif
#endif
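
The point of the sev.h move above is which config option selects the real helper and which one selects the empty stub. A reduced user-space sketch of that pattern is given below; MODEL_GUEST_SUPPORT is a hypothetical macro standing in for CONFIG_AMD_MEM_ENCRYPT, MODEL_PAGE_SIZE for PAGE_SIZE, and, unlike the kernel helper, model_evict_cache() returns the number of reads it performed so that the two variants can be told apart.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_PAGE_SIZE 4096

/* Hypothetical switch; build with -DMODEL_GUEST_SUPPORT=0 to get the stub. */
#ifndef MODEL_GUEST_SUPPORT
#define MODEL_GUEST_SUPPORT 1
#endif

#if MODEL_GUEST_SUPPORT
/* Real variant: read the first and last byte of every page. */
static inline int model_evict_cache(const void *va, int npages)
{
	volatile unsigned char val = 0;
	const unsigned char *bytes = va;
	int reads = 0;
	int page_idx;

	assert(va != NULL);
	for (page_idx = 0; page_idx < npages; page_idx++) {
		val = bytes[(size_t)page_idx * MODEL_PAGE_SIZE];
		val = bytes[(size_t)page_idx * MODEL_PAGE_SIZE + MODEL_PAGE_SIZE - 1];
		reads += 2;
	}
	(void)val;
	return reads;
}
#else
/* Stub variant: compiles to nothing, which is wrong for a guest that needs it. */
static inline int model_evict_cache(const void *va, int npages)
{
	(void)va;
	(void)npages;
	return 0;
}
#endif
```

If the guard tests the wrong option, callers still compile and link, but silently get the stub; that is the failure mode the commit message describes for guests built with CONFIG_KVM_AMD_SEV=n.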
The patch below does not apply to the 6.6-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.6.y
git checkout FETCH_HEAD
git cherry-pick -x 7f830e126dc357fc086905ce9730140fd4528d66
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025092125-stitch-starting-35cb@gregkh' --subject-prefix 'PATCH 6.6.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 7f830e126dc357fc086905ce9730140fd4528d66 Mon Sep 17 00:00:00 2001
From: Tom Lendacky <thomas.lendacky(a)amd.com>
Date: Mon, 15 Sep 2025 11:04:12 -0500
Subject: [PATCH] x86/sev: Guard sev_evict_cache() with CONFIG_AMD_MEM_ENCRYPT
The sev_evict_cache() is guest-related code and should be guarded by
CONFIG_AMD_MEM_ENCRYPT, not CONFIG_KVM_AMD_SEV.
CONFIG_AMD_MEM_ENCRYPT=y is required for a guest to run properly as an SEV-SNP
guest, but a guest kernel built with CONFIG_KVM_AMD_SEV=n would get the stub
function of sev_evict_cache() instead of the version that performs the actual
eviction. Move the function declarations under the appropriate #ifdef.
Fixes: 7b306dfa326f ("x86/sev: Evict cache lines during SNP memory validation")
Signed-off-by: Tom Lendacky <thomas.lendacky(a)amd.com>
Signed-off-by: Borislav Petkov (AMD) <bp(a)alien8.de>
Cc: stable(a)kernel.org # 6.16.x
Link: https://lore.kernel.org/r/70e38f2c4a549063de54052c9f64929705313526.17577089…
diff --git a/arch/x86/include/asm/sev.h b/arch/x86/include/asm/sev.h
index 02236962fdb1..465b19fd1a2d 100644
--- a/arch/x86/include/asm/sev.h
+++ b/arch/x86/include/asm/sev.h
@@ -562,6 +562,24 @@ enum es_result sev_es_ghcb_hv_call(struct ghcb *ghcb,
extern struct ghcb *boot_ghcb;
+static inline void sev_evict_cache(void *va, int npages)
+{
+ volatile u8 val __always_unused;
+ u8 *bytes = va;
+ int page_idx;
+
+ /*
+ * For SEV guests, a read from the first/last cache-lines of a 4K page
+ * using the guest key is sufficient to cause a flush of all cache-lines
+ * associated with that 4K page without incurring all the overhead of a
+ * full CLFLUSH sequence.
+ */
+ for (page_idx = 0; page_idx < npages; page_idx++) {
+ val = bytes[page_idx * PAGE_SIZE];
+ val = bytes[page_idx * PAGE_SIZE + PAGE_SIZE - 1];
+ }
+}
+
#else /* !CONFIG_AMD_MEM_ENCRYPT */
#define snp_vmpl 0
@@ -605,6 +623,7 @@ static inline int snp_send_guest_request(struct snp_msg_desc *mdesc,
static inline int snp_svsm_vtpm_send_command(u8 *buffer) { return -ENODEV; }
static inline void __init snp_secure_tsc_prepare(void) { }
static inline void __init snp_secure_tsc_init(void) { }
+static inline void sev_evict_cache(void *va, int npages) {}
#endif /* CONFIG_AMD_MEM_ENCRYPT */
@@ -619,24 +638,6 @@ int rmp_make_shared(u64 pfn, enum pg_level level);
void snp_leak_pages(u64 pfn, unsigned int npages);
void kdump_sev_callback(void);
void snp_fixup_e820_tables(void);
-
-static inline void sev_evict_cache(void *va, int npages)
-{
- volatile u8 val __always_unused;
- u8 *bytes = va;
- int page_idx;
-
- /*
- * For SEV guests, a read from the first/last cache-lines of a 4K page
- * using the guest key is sufficient to cause a flush of all cache-lines
- * associated with that 4K page without incurring all the overhead of a
- * full CLFLUSH sequence.
- */
- for (page_idx = 0; page_idx < npages; page_idx++) {
- val = bytes[page_idx * PAGE_SIZE];
- val = bytes[page_idx * PAGE_SIZE + PAGE_SIZE - 1];
- }
-}
#else
static inline bool snp_probe_rmptable_info(void) { return false; }
static inline int snp_rmptable_init(void) { return -ENOSYS; }
@@ -652,7 +653,6 @@ static inline int rmp_make_shared(u64 pfn, enum pg_level level) { return -ENODEV
static inline void snp_leak_pages(u64 pfn, unsigned int npages) {}
static inline void kdump_sev_callback(void) { }
static inline void snp_fixup_e820_tables(void) {}
-static inline void sev_evict_cache(void *va, int npages) {}
#endif
#endif
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.4.y
git checkout FETCH_HEAD
git cherry-pick -x 96fa515e70f3e4b98685ef8cac9d737fc62f10e1
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025092135-breeding-chrome-585a@gregkh' --subject-prefix 'PATCH 5.4.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 96fa515e70f3e4b98685ef8cac9d737fc62f10e1 Mon Sep 17 00:00:00 2001
From: Qu Wenruo <wqu(a)suse.com>
Date: Tue, 16 Sep 2025 07:54:06 +0930
Subject: [PATCH] btrfs: tree-checker: fix the incorrect inode ref size check
[BUG]
Inside check_inode_ref(), we need to make sure every structure,
including the btrfs_inode_extref header, is covered by the item. But
our code is incorrectly using "sizeof(iref)", where @iref is just a
pointer.
This means "sizeof(iref)" will always be "sizeof(void *)", which is much
smaller than "sizeof(struct btrfs_inode_extref)".
This will allow some bad inode extrefs to sneak in, defeating tree-checker.
[FIX]
Fix the typo by calling "sizeof(*iref)", which is the same as
"sizeof(struct btrfs_inode_extref)", and will be the correct behavior we
want.
Fixes: 71bf92a9b877 ("btrfs: tree-checker: Add check for INODE_REF")
CC: stable(a)vger.kernel.org # 6.1+
Reviewed-by: Johannes Thumshirn <johannes.thumshirn(a)wdc.com>
Reviewed-by: Filipe Manana <fdmanana(a)suse.com>
Signed-off-by: Qu Wenruo <wqu(a)suse.com>
Reviewed-by: David Sterba <dsterba(a)suse.com>
Signed-off-by: David Sterba <dsterba(a)suse.com>
diff --git a/fs/btrfs/tree-checker.c b/fs/btrfs/tree-checker.c
index 0f556f4de3f9..a997c7cc35a2 100644
--- a/fs/btrfs/tree-checker.c
+++ b/fs/btrfs/tree-checker.c
@@ -1756,10 +1756,10 @@ static int check_inode_ref(struct extent_buffer *leaf,
while (ptr < end) {
u16 namelen;
- if (unlikely(ptr + sizeof(iref) > end)) {
+ if (unlikely(ptr + sizeof(*iref) > end)) {
inode_ref_err(leaf, slot,
"inode ref overflow, ptr %lu end %lu inode_ref_size %zu",
- ptr, end, sizeof(iref));
+ ptr, end, sizeof(*iref));
return -EUCLEAN;
}
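The difference between the two sizeof forms in the hunk above can be seen in a small user-space sketch. Everything here is a stand-in: struct demo_extref is a hypothetical 18-byte header loosely modeled on the on-disk extref header, the two helper names are invented for illustration, and the packed attribute is a GCC/Clang extension. It is not the tree-checker code itself.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the extref header: 18 bytes when packed. */
struct demo_extref {
	uint64_t parent_objectid;
	uint64_t index;
	uint16_t name_len;
} __attribute__((packed));

/* Buggy form: sizeof(iref) is the size of the pointer itself. */
static int header_fits_buggy(unsigned long ptr, unsigned long end)
{
	struct demo_extref *iref = NULL;

	return ptr + sizeof(iref) <= end;
}

/* Fixed form: sizeof(*iref) is the size of the pointed-to structure. */
static int header_fits_fixed(unsigned long ptr, unsigned long end)
{
	struct demo_extref *iref = NULL;

	/* sizeof does not evaluate its operand, so the NULL is never read. */
	assert(sizeof(*iref) == sizeof(struct demo_extref));
	return ptr + sizeof(*iref) <= end;
}
```

With only 10 bytes left in the item, the buggy helper accepts the header on both 32-bit and 64-bit builds (a pointer is 4 or 8 bytes), while the fixed helper rejects it because the full header does not fit.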
The patch below does not apply to the 6.16-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.16.y
git checkout FETCH_HEAD
git cherry-pick -x e6b733ca2f99e968d696c2e812c8eb8e090bf37b
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025092121-boned-marbles-55ea@gregkh' --subject-prefix 'PATCH 6.16.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From e6b733ca2f99e968d696c2e812c8eb8e090bf37b Mon Sep 17 00:00:00 2001
From: SeongJae Park <sj(a)kernel.org>
Date: Mon, 8 Sep 2025 19:22:37 -0700
Subject: [PATCH] samples/damon/prcl: avoid starting DAMON before
initialization
Commit 2780505ec2b4 ("samples/damon/prcl: fix boot time enable crash")
incompletely applied the original patch [1]: it is missing the part that
avoids starting DAMON before module initialization, probably because of a
mistake during a merge. Fix it by applying the missing part again.
Link: https://lkml.kernel.org/r/20250909022238.2989-3-sj@kernel.org
Link: https://lore.kernel.org/20250706193207.39810-3-sj@kernel.org [1]
Fixes: 2780505ec2b4 ("samples/damon/prcl: fix boot time enable crash")
Signed-off-by: SeongJae Park <sj(a)kernel.org>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
diff --git a/samples/damon/prcl.c b/samples/damon/prcl.c
index 1b839c06a612..0226652f94d5 100644
--- a/samples/damon/prcl.c
+++ b/samples/damon/prcl.c
@@ -137,6 +137,9 @@ static int damon_sample_prcl_enable_store(
if (enabled == is_enabled)
return 0;
+ if (!init_called)
+ return 0;
+
if (enabled) {
err = damon_sample_prcl_start();
if (err)
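The guard added by the hunk above follows a common pattern: a parameter store handler may run before the module's init function, so it only records the request and leaves the actual start to init. The sketch below models that pattern in user space. All names (demo_enable_store, demo_init, demo_start, demo_stop, start_count) are hypothetical, and the assumption that init honors the recorded request is mine, not something stated in the patch.

```c
#include <assert.h>
#include <stdbool.h>

/* All names below are hypothetical; they model the pattern, not the module. */
static bool init_called;	/* set once the init function has run */
static bool enabled;		/* value requested through the parameter */
static int start_count;		/* how many times the worker was started */

static int demo_start(void) { start_count++; return 0; }
static void demo_stop(void) { }

/* Parameter store handler: may run before demo_init(), e.g. at boot. */
static int demo_enable_store(bool requested)
{
	bool was_enabled = enabled;

	enabled = requested;
	if (enabled == was_enabled)
		return 0;
	/* Too early: remember the request and let demo_init() act on it. */
	if (!init_called)
		return 0;
	if (enabled)
		return demo_start();
	demo_stop();
	return 0;
}

static int demo_init(void)
{
	assert(!init_called);	/* init is expected to run exactly once */
	init_called = true;
	return enabled ? demo_start() : 0;
}
```

Calling demo_enable_store(true) before demo_init() starts nothing; the start happens once, from demo_init(), and later stores take effect immediately.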
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.4.y
git checkout FETCH_HEAD
git cherry-pick -x 2da6de30e60dd9bb14600eff1cc99df2fa2ddae3
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025092147-truck-ceremony-311d@gregkh' --subject-prefix 'PATCH 5.4.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 2da6de30e60dd9bb14600eff1cc99df2fa2ddae3 Mon Sep 17 00:00:00 2001
From: Hugh Dickins <hughd(a)google.com>
Date: Mon, 8 Sep 2025 15:23:15 -0700
Subject: [PATCH] mm: folio_may_be_lru_cached() unless folio_test_large()
mm/swap.c and mm/mlock.c agree to drain any per-CPU batch as soon as a
large folio is added: so collect_longterm_unpinnable_folios() just wastes
effort when calling lru_add_drain[_all]() on a large folio.
But although there is good reason not to batch up PMD-sized folios, we
might well benefit from batching a small number of low-order mTHPs (though
unclear how that "small number" limitation will be implemented).
So ask if folio_may_be_lru_cached() rather than !folio_test_large(), to
insulate those particular checks from future change. Name preferred to
"folio_is_batchable" because large folios can well be put on a batch: it's
just the per-CPU LRU caches, drained much later, which need care.
Marked for stable, to counter the increase in lru_add_drain_all()s from
"mm/gup: check ref_count instead of lru before migration".
Link: https://lkml.kernel.org/r/57d2eaf8-3607-f318-e0c5-be02dce61ad0@google.com
Fixes: 9a4e9f3b2d73 ("mm: update get_user_pages_longterm to migrate pages allocated from CMA region")
Signed-off-by: Hugh Dickins <hughd(a)google.com>
Suggested-by: David Hildenbrand <david(a)redhat.com>
Acked-by: David Hildenbrand <david(a)redhat.com>
Cc: "Aneesh Kumar K.V" <aneesh.kumar(a)kernel.org>
Cc: Axel Rasmussen <axelrasmussen(a)google.com>
Cc: Chris Li <chrisl(a)kernel.org>
Cc: Christoph Hellwig <hch(a)infradead.org>
Cc: Jason Gunthorpe <jgg(a)ziepe.ca>
Cc: Johannes Weiner <hannes(a)cmpxchg.org>
Cc: John Hubbard <jhubbard(a)nvidia.com>
Cc: Keir Fraser <keirf(a)google.com>
Cc: Konstantin Khlebnikov <koct9i(a)gmail.com>
Cc: Li Zhe <lizhe.67(a)bytedance.com>
Cc: Matthew Wilcox (Oracle) <willy(a)infradead.org>
Cc: Peter Xu <peterx(a)redhat.com>
Cc: Rik van Riel <riel(a)surriel.com>
Cc: Shivank Garg <shivankg(a)amd.com>
Cc: Vlastimil Babka <vbabka(a)suse.cz>
Cc: Wei Xu <weixugc(a)google.com>
Cc: Will Deacon <will(a)kernel.org>
Cc: yangge <yangge1116(a)126.com>
Cc: Yuanchu Xie <yuanchu(a)google.com>
Cc: Yu Zhao <yuzhao(a)google.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
diff --git a/include/linux/swap.h b/include/linux/swap.h
index 2fe6ed2cc3fd..7012a0f758d8 100644
--- a/include/linux/swap.h
+++ b/include/linux/swap.h
@@ -385,6 +385,16 @@ void folio_add_lru_vma(struct folio *, struct vm_area_struct *);
void mark_page_accessed(struct page *);
void folio_mark_accessed(struct folio *);
+static inline bool folio_may_be_lru_cached(struct folio *folio)
+{
+ /*
+ * Holding PMD-sized folios in per-CPU LRU cache unbalances accounting.
+ * Holding small numbers of low-order mTHP folios in per-CPU LRU cache
+ * will be sensible, but nobody has implemented and tested that yet.
+ */
+ return !folio_test_large(folio);
+}
+
extern atomic_t lru_disable_count;
static inline bool lru_cache_disabled(void)
diff --git a/mm/gup.c b/mm/gup.c
index b47066a54f52..0bc4d140fc07 100644
--- a/mm/gup.c
+++ b/mm/gup.c
@@ -2307,13 +2307,13 @@ static unsigned long collect_longterm_unpinnable_folios(
continue;
}
- if (drained == 0 &&
+ if (drained == 0 && folio_may_be_lru_cached(folio) &&
folio_ref_count(folio) !=
folio_expected_ref_count(folio) + 1) {
lru_add_drain();
drained = 1;
}
- if (drained == 1 &&
+ if (drained == 1 && folio_may_be_lru_cached(folio) &&
folio_ref_count(folio) !=
folio_expected_ref_count(folio) + 1) {
lru_add_drain_all();
diff --git a/mm/mlock.c b/mm/mlock.c
index a1d93ad33c6d..bb0776f5ef7c 100644
--- a/mm/mlock.c
+++ b/mm/mlock.c
@@ -255,7 +255,7 @@ void mlock_folio(struct folio *folio)
folio_get(folio);
if (!folio_batch_add(fbatch, mlock_lru(folio)) ||
- folio_test_large(folio) || lru_cache_disabled())
+ !folio_may_be_lru_cached(folio) || lru_cache_disabled())
mlock_folio_batch(fbatch);
local_unlock(&mlock_fbatch.lock);
}
@@ -278,7 +278,7 @@ void mlock_new_folio(struct folio *folio)
folio_get(folio);
if (!folio_batch_add(fbatch, mlock_new(folio)) ||
- folio_test_large(folio) || lru_cache_disabled())
+ !folio_may_be_lru_cached(folio) || lru_cache_disabled())
mlock_folio_batch(fbatch);
local_unlock(&mlock_fbatch.lock);
}
@@ -299,7 +299,7 @@ void munlock_folio(struct folio *folio)
*/
folio_get(folio);
if (!folio_batch_add(fbatch, folio) ||
- folio_test_large(folio) || lru_cache_disabled())
+ !folio_may_be_lru_cached(folio) || lru_cache_disabled())
mlock_folio_batch(fbatch);
local_unlock(&mlock_fbatch.lock);
}
diff --git a/mm/swap.c b/mm/swap.c
index 6ae2d5680574..b74ebe865dd9 100644
--- a/mm/swap.c
+++ b/mm/swap.c
@@ -192,7 +192,7 @@ static void __folio_batch_add_and_move(struct folio_batch __percpu *fbatch,
local_lock(&cpu_fbatches.lock);
if (!folio_batch_add(this_cpu_ptr(fbatch), folio) ||
- folio_test_large(folio) || lru_cache_disabled())
+ !folio_may_be_lru_cached(folio) || lru_cache_disabled())
folio_batch_move_lru(this_cpu_ptr(fbatch), move_fn);
if (disable_irq)
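The flush rule that the mm/mlock.c and mm/swap.c hunks share can be sketched in user space as follows. This is a simplified model under stated assumptions: the demo_ types, the batch capacity of 4 and the flush counter are invented for illustration, and only the shape of the condition (batch full, folio not cacheable, or caching disabled) is taken from the patch.

```c
#include <assert.h>
#include <stdbool.h>

#define DEMO_BATCH_SIZE 4	/* hypothetical capacity, chosen for the demo */

struct demo_folio {
	unsigned int order;	/* 0 means a single page */
};

struct demo_batch {
	unsigned int nr;	/* folios currently held */
	unsigned int flushes;	/* how often the batch was drained */
};

static bool demo_folio_test_large(const struct demo_folio *folio)
{
	return folio->order > 0;
}

/* Mirrors the helper in the patch: for now only small folios may linger. */
static bool demo_folio_may_be_lru_cached(const struct demo_folio *folio)
{
	return !demo_folio_test_large(folio);
}

/* Returns the space left after adding, so 0 means the batch is full. */
static unsigned int demo_batch_add(struct demo_batch *b)
{
	assert(b->nr < DEMO_BATCH_SIZE);
	b->nr++;
	return DEMO_BATCH_SIZE - b->nr;
}

static void demo_batch_flush(struct demo_batch *b)
{
	b->nr = 0;
	b->flushes++;
}

/* Add a folio; flush at once if the batch is full, the folio must not be
 * cached, or caching is disabled. */
static void demo_add_and_maybe_flush(struct demo_batch *b,
				     const struct demo_folio *folio,
				     bool cache_disabled)
{
	if (!demo_batch_add(b) ||
	    !demo_folio_may_be_lru_cached(folio) || cache_disabled)
		demo_batch_flush(b);
}
```

Keeping the policy behind one predicate means a later change (for example, allowing some low-order folios to stay cached) would touch only demo_folio_may_be_lru_cached() in this model, which is the insulation the commit message describes.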
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.10.y
git checkout FETCH_HEAD
git cherry-pick -x 2da6de30e60dd9bb14600eff1cc99df2fa2ddae3
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025092146-exhume-krypton-1383@gregkh' --subject-prefix 'PATCH 5.10.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 2da6de30e60dd9bb14600eff1cc99df2fa2ddae3 Mon Sep 17 00:00:00 2001
From: Hugh Dickins <hughd(a)google.com>
Date: Mon, 8 Sep 2025 15:23:15 -0700
Subject: [PATCH] mm: folio_may_be_lru_cached() unless folio_test_large()
mm/swap.c and mm/mlock.c agree to drain any per-CPU batch as soon as a
large folio is added: so collect_longterm_unpinnable_folios() just wastes
effort when calling lru_add_drain[_all]() on a large folio.
But although there is good reason not to batch up PMD-sized folios, we
might well benefit from batching a small number of low-order mTHPs (though
unclear how that "small number" limitation will be implemented).
So ask if folio_may_be_lru_cached() rather than !folio_test_large(), to
insulate those particular checks from future change. Name preferred to
"folio_is_batchable" because large folios can well be put on a batch: it's
just the per-CPU LRU caches, drained much later, which need care.
Marked for stable, to counter the increase in lru_add_drain_all()s from
"mm/gup: check ref_count instead of lru before migration".
Link: https://lkml.kernel.org/r/57d2eaf8-3607-f318-e0c5-be02dce61ad0@google.com
Fixes: 9a4e9f3b2d73 ("mm: update get_user_pages_longterm to migrate pages allocated from CMA region")
Signed-off-by: Hugh Dickins <hughd(a)google.com>
Suggested-by: David Hildenbrand <david(a)redhat.com>
Acked-by: David Hildenbrand <david(a)redhat.com>
Cc: "Aneesh Kumar K.V" <aneesh.kumar(a)kernel.org>
Cc: Axel Rasmussen <axelrasmussen(a)google.com>
Cc: Chris Li <chrisl(a)kernel.org>
Cc: Christoph Hellwig <hch(a)infradead.org>
Cc: Jason Gunthorpe <jgg(a)ziepe.ca>
Cc: Johannes Weiner <hannes(a)cmpxchg.org>
Cc: John Hubbard <jhubbard(a)nvidia.com>
Cc: Keir Fraser <keirf(a)google.com>
Cc: Konstantin Khlebnikov <koct9i(a)gmail.com>
Cc: Li Zhe <lizhe.67(a)bytedance.com>
Cc: Matthew Wilcox (Oracle) <willy(a)infradead.org>
Cc: Peter Xu <peterx(a)redhat.com>
Cc: Rik van Riel <riel(a)surriel.com>
Cc: Shivank Garg <shivankg(a)amd.com>
Cc: Vlastimil Babka <vbabka(a)suse.cz>
Cc: Wei Xu <weixugc(a)google.com>
Cc: Will Deacon <will(a)kernel.org>
Cc: yangge <yangge1116(a)126.com>
Cc: Yuanchu Xie <yuanchu(a)google.com>
Cc: Yu Zhao <yuzhao(a)google.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
diff --git a/include/linux/swap.h b/include/linux/swap.h
index 2fe6ed2cc3fd..7012a0f758d8 100644
--- a/include/linux/swap.h
+++ b/include/linux/swap.h
@@ -385,6 +385,16 @@ void folio_add_lru_vma(struct folio *, struct vm_area_struct *);
void mark_page_accessed(struct page *);
void folio_mark_accessed(struct folio *);
+static inline bool folio_may_be_lru_cached(struct folio *folio)
+{
+ /*
+ * Holding PMD-sized folios in per-CPU LRU cache unbalances accounting.
+ * Holding small numbers of low-order mTHP folios in per-CPU LRU cache
+ * will be sensible, but nobody has implemented and tested that yet.
+ */
+ return !folio_test_large(folio);
+}
+
extern atomic_t lru_disable_count;
static inline bool lru_cache_disabled(void)
diff --git a/mm/gup.c b/mm/gup.c
index b47066a54f52..0bc4d140fc07 100644
--- a/mm/gup.c
+++ b/mm/gup.c
@@ -2307,13 +2307,13 @@ static unsigned long collect_longterm_unpinnable_folios(
continue;
}
- if (drained == 0 &&
+ if (drained == 0 && folio_may_be_lru_cached(folio) &&
folio_ref_count(folio) !=
folio_expected_ref_count(folio) + 1) {
lru_add_drain();
drained = 1;
}
- if (drained == 1 &&
+ if (drained == 1 && folio_may_be_lru_cached(folio) &&
folio_ref_count(folio) !=
folio_expected_ref_count(folio) + 1) {
lru_add_drain_all();
diff --git a/mm/mlock.c b/mm/mlock.c
index a1d93ad33c6d..bb0776f5ef7c 100644
--- a/mm/mlock.c
+++ b/mm/mlock.c
@@ -255,7 +255,7 @@ void mlock_folio(struct folio *folio)
folio_get(folio);
if (!folio_batch_add(fbatch, mlock_lru(folio)) ||
- folio_test_large(folio) || lru_cache_disabled())
+ !folio_may_be_lru_cached(folio) || lru_cache_disabled())
mlock_folio_batch(fbatch);
local_unlock(&mlock_fbatch.lock);
}
@@ -278,7 +278,7 @@ void mlock_new_folio(struct folio *folio)
folio_get(folio);
if (!folio_batch_add(fbatch, mlock_new(folio)) ||
- folio_test_large(folio) || lru_cache_disabled())
+ !folio_may_be_lru_cached(folio) || lru_cache_disabled())
mlock_folio_batch(fbatch);
local_unlock(&mlock_fbatch.lock);
}
@@ -299,7 +299,7 @@ void munlock_folio(struct folio *folio)
*/
folio_get(folio);
if (!folio_batch_add(fbatch, folio) ||
- folio_test_large(folio) || lru_cache_disabled())
+ !folio_may_be_lru_cached(folio) || lru_cache_disabled())
mlock_folio_batch(fbatch);
local_unlock(&mlock_fbatch.lock);
}
diff --git a/mm/swap.c b/mm/swap.c
index 6ae2d5680574..b74ebe865dd9 100644
--- a/mm/swap.c
+++ b/mm/swap.c
@@ -192,7 +192,7 @@ static void __folio_batch_add_and_move(struct folio_batch __percpu *fbatch,
local_lock(&cpu_fbatches.lock);
if (!folio_batch_add(this_cpu_ptr(fbatch), folio) ||
- folio_test_large(folio) || lru_cache_disabled())
+ !folio_may_be_lru_cached(folio) || lru_cache_disabled())
folio_batch_move_lru(this_cpu_ptr(fbatch), move_fn);
if (disable_irq)
The patch below does not apply to the 5.15-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.15.y
git checkout FETCH_HEAD
git cherry-pick -x 2da6de30e60dd9bb14600eff1cc99df2fa2ddae3
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025092145-system-superjet-a0fb@gregkh' --subject-prefix 'PATCH 5.15.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 2da6de30e60dd9bb14600eff1cc99df2fa2ddae3 Mon Sep 17 00:00:00 2001
From: Hugh Dickins <hughd(a)google.com>
Date: Mon, 8 Sep 2025 15:23:15 -0700
Subject: [PATCH] mm: folio_may_be_lru_cached() unless folio_test_large()
mm/swap.c and mm/mlock.c agree to drain any per-CPU batch as soon as a
large folio is added: so collect_longterm_unpinnable_folios() just wastes
effort when calling lru_add_drain[_all]() on a large folio.
But although there is good reason not to batch up PMD-sized folios, we
might well benefit from batching a small number of low-order mTHPs (though
unclear how that "small number" limitation will be implemented).
So ask if folio_may_be_lru_cached() rather than !folio_test_large(), to
insulate those particular checks from future change. Name preferred to
"folio_is_batchable" because large folios can well be put on a batch: it's
just the per-CPU LRU caches, drained much later, which need care.
Marked for stable, to counter the increase in lru_add_drain_all()s from
"mm/gup: check ref_count instead of lru before migration".
Link: https://lkml.kernel.org/r/57d2eaf8-3607-f318-e0c5-be02dce61ad0@google.com
Fixes: 9a4e9f3b2d73 ("mm: update get_user_pages_longterm to migrate pages allocated from CMA region")
Signed-off-by: Hugh Dickins <hughd(a)google.com>
Suggested-by: David Hildenbrand <david(a)redhat.com>
Acked-by: David Hildenbrand <david(a)redhat.com>
Cc: "Aneesh Kumar K.V" <aneesh.kumar(a)kernel.org>
Cc: Axel Rasmussen <axelrasmussen(a)google.com>
Cc: Chris Li <chrisl(a)kernel.org>
Cc: Christoph Hellwig <hch(a)infradead.org>
Cc: Jason Gunthorpe <jgg(a)ziepe.ca>
Cc: Johannes Weiner <hannes(a)cmpxchg.org>
Cc: John Hubbard <jhubbard(a)nvidia.com>
Cc: Keir Fraser <keirf(a)google.com>
Cc: Konstantin Khlebnikov <koct9i(a)gmail.com>
Cc: Li Zhe <lizhe.67(a)bytedance.com>
Cc: Matthew Wilcox (Oracle) <willy(a)infradead.org>
Cc: Peter Xu <peterx(a)redhat.com>
Cc: Rik van Riel <riel(a)surriel.com>
Cc: Shivank Garg <shivankg(a)amd.com>
Cc: Vlastimil Babka <vbabka(a)suse.cz>
Cc: Wei Xu <weixugc(a)google.com>
Cc: Will Deacon <will(a)kernel.org>
Cc: yangge <yangge1116(a)126.com>
Cc: Yuanchu Xie <yuanchu(a)google.com>
Cc: Yu Zhao <yuzhao(a)google.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
diff --git a/include/linux/swap.h b/include/linux/swap.h
index 2fe6ed2cc3fd..7012a0f758d8 100644
--- a/include/linux/swap.h
+++ b/include/linux/swap.h
@@ -385,6 +385,16 @@ void folio_add_lru_vma(struct folio *, struct vm_area_struct *);
void mark_page_accessed(struct page *);
void folio_mark_accessed(struct folio *);
+static inline bool folio_may_be_lru_cached(struct folio *folio)
+{
+ /*
+ * Holding PMD-sized folios in per-CPU LRU cache unbalances accounting.
+ * Holding small numbers of low-order mTHP folios in per-CPU LRU cache
+ * will be sensible, but nobody has implemented and tested that yet.
+ */
+ return !folio_test_large(folio);
+}
+
extern atomic_t lru_disable_count;
static inline bool lru_cache_disabled(void)
diff --git a/mm/gup.c b/mm/gup.c
index b47066a54f52..0bc4d140fc07 100644
--- a/mm/gup.c
+++ b/mm/gup.c
@@ -2307,13 +2307,13 @@ static unsigned long collect_longterm_unpinnable_folios(
continue;
}
- if (drained == 0 &&
+ if (drained == 0 && folio_may_be_lru_cached(folio) &&
folio_ref_count(folio) !=
folio_expected_ref_count(folio) + 1) {
lru_add_drain();
drained = 1;
}
- if (drained == 1 &&
+ if (drained == 1 && folio_may_be_lru_cached(folio) &&
folio_ref_count(folio) !=
folio_expected_ref_count(folio) + 1) {
lru_add_drain_all();
diff --git a/mm/mlock.c b/mm/mlock.c
index a1d93ad33c6d..bb0776f5ef7c 100644
--- a/mm/mlock.c
+++ b/mm/mlock.c
@@ -255,7 +255,7 @@ void mlock_folio(struct folio *folio)
folio_get(folio);
if (!folio_batch_add(fbatch, mlock_lru(folio)) ||
- folio_test_large(folio) || lru_cache_disabled())
+ !folio_may_be_lru_cached(folio) || lru_cache_disabled())
mlock_folio_batch(fbatch);
local_unlock(&mlock_fbatch.lock);
}
@@ -278,7 +278,7 @@ void mlock_new_folio(struct folio *folio)
folio_get(folio);
if (!folio_batch_add(fbatch, mlock_new(folio)) ||
- folio_test_large(folio) || lru_cache_disabled())
+ !folio_may_be_lru_cached(folio) || lru_cache_disabled())
mlock_folio_batch(fbatch);
local_unlock(&mlock_fbatch.lock);
}
@@ -299,7 +299,7 @@ void munlock_folio(struct folio *folio)
*/
folio_get(folio);
if (!folio_batch_add(fbatch, folio) ||
- folio_test_large(folio) || lru_cache_disabled())
+ !folio_may_be_lru_cached(folio) || lru_cache_disabled())
mlock_folio_batch(fbatch);
local_unlock(&mlock_fbatch.lock);
}
diff --git a/mm/swap.c b/mm/swap.c
index 6ae2d5680574..b74ebe865dd9 100644
--- a/mm/swap.c
+++ b/mm/swap.c
@@ -192,7 +192,7 @@ static void __folio_batch_add_and_move(struct folio_batch __percpu *fbatch,
local_lock(&cpu_fbatches.lock);
if (!folio_batch_add(this_cpu_ptr(fbatch), folio) ||
- folio_test_large(folio) || lru_cache_disabled())
+ !folio_may_be_lru_cached(folio) || lru_cache_disabled())
folio_batch_move_lru(this_cpu_ptr(fbatch), move_fn);
if (disable_irq)
The patch below does not apply to the 6.1-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.1.y
git checkout FETCH_HEAD
git cherry-pick -x 2da6de30e60dd9bb14600eff1cc99df2fa2ddae3
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025092144-angler-cuddly-30db@gregkh' --subject-prefix 'PATCH 6.1.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 2da6de30e60dd9bb14600eff1cc99df2fa2ddae3 Mon Sep 17 00:00:00 2001
From: Hugh Dickins <hughd(a)google.com>
Date: Mon, 8 Sep 2025 15:23:15 -0700
Subject: [PATCH] mm: folio_may_be_lru_cached() unless folio_test_large()
mm/swap.c and mm/mlock.c agree to drain any per-CPU batch as soon as a
large folio is added: so collect_longterm_unpinnable_folios() just wastes
effort when calling lru_add_drain[_all]() on a large folio.
But although there is good reason not to batch up PMD-sized folios, we
might well benefit from batching a small number of low-order mTHPs (though
unclear how that "small number" limitation will be implemented).
So ask if folio_may_be_lru_cached() rather than !folio_test_large(), to
insulate those particular checks from future change. Name preferred to
"folio_is_batchable" because large folios can well be put on a batch: it's
just the per-CPU LRU caches, drained much later, which need care.
Marked for stable, to counter the increase in lru_add_drain_all()s from
"mm/gup: check ref_count instead of lru before migration".
Link: https://lkml.kernel.org/r/57d2eaf8-3607-f318-e0c5-be02dce61ad0@google.com
Fixes: 9a4e9f3b2d73 ("mm: update get_user_pages_longterm to migrate pages allocated from CMA region")
Signed-off-by: Hugh Dickins <hughd(a)google.com>
Suggested-by: David Hildenbrand <david(a)redhat.com>
Acked-by: David Hildenbrand <david(a)redhat.com>
Cc: "Aneesh Kumar K.V" <aneesh.kumar(a)kernel.org>
Cc: Axel Rasmussen <axelrasmussen(a)google.com>
Cc: Chris Li <chrisl(a)kernel.org>
Cc: Christoph Hellwig <hch(a)infradead.org>
Cc: Jason Gunthorpe <jgg(a)ziepe.ca>
Cc: Johannes Weiner <hannes(a)cmpxchg.org>
Cc: John Hubbard <jhubbard(a)nvidia.com>
Cc: Keir Fraser <keirf(a)google.com>
Cc: Konstantin Khlebnikov <koct9i(a)gmail.com>
Cc: Li Zhe <lizhe.67(a)bytedance.com>
Cc: Matthew Wilcox (Oracle) <willy(a)infradead.org>
Cc: Peter Xu <peterx(a)redhat.com>
Cc: Rik van Riel <riel(a)surriel.com>
Cc: Shivank Garg <shivankg(a)amd.com>
Cc: Vlastimil Babka <vbabka(a)suse.cz>
Cc: Wei Xu <weixugc(a)google.com>
Cc: Will Deacon <will(a)kernel.org>
Cc: yangge <yangge1116(a)126.com>
Cc: Yuanchu Xie <yuanchu(a)google.com>
Cc: Yu Zhao <yuzhao(a)google.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
diff --git a/include/linux/swap.h b/include/linux/swap.h
index 2fe6ed2cc3fd..7012a0f758d8 100644
--- a/include/linux/swap.h
+++ b/include/linux/swap.h
@@ -385,6 +385,16 @@ void folio_add_lru_vma(struct folio *, struct vm_area_struct *);
void mark_page_accessed(struct page *);
void folio_mark_accessed(struct folio *);
+static inline bool folio_may_be_lru_cached(struct folio *folio)
+{
+ /*
+ * Holding PMD-sized folios in per-CPU LRU cache unbalances accounting.
+ * Holding small numbers of low-order mTHP folios in per-CPU LRU cache
+ * will be sensible, but nobody has implemented and tested that yet.
+ */
+ return !folio_test_large(folio);
+}
+
extern atomic_t lru_disable_count;
static inline bool lru_cache_disabled(void)
diff --git a/mm/gup.c b/mm/gup.c
index b47066a54f52..0bc4d140fc07 100644
--- a/mm/gup.c
+++ b/mm/gup.c
@@ -2307,13 +2307,13 @@ static unsigned long collect_longterm_unpinnable_folios(
continue;
}
- if (drained == 0 &&
+ if (drained == 0 && folio_may_be_lru_cached(folio) &&
folio_ref_count(folio) !=
folio_expected_ref_count(folio) + 1) {
lru_add_drain();
drained = 1;
}
- if (drained == 1 &&
+ if (drained == 1 && folio_may_be_lru_cached(folio) &&
folio_ref_count(folio) !=
folio_expected_ref_count(folio) + 1) {
lru_add_drain_all();
diff --git a/mm/mlock.c b/mm/mlock.c
index a1d93ad33c6d..bb0776f5ef7c 100644
--- a/mm/mlock.c
+++ b/mm/mlock.c
@@ -255,7 +255,7 @@ void mlock_folio(struct folio *folio)
folio_get(folio);
if (!folio_batch_add(fbatch, mlock_lru(folio)) ||
- folio_test_large(folio) || lru_cache_disabled())
+ !folio_may_be_lru_cached(folio) || lru_cache_disabled())
mlock_folio_batch(fbatch);
local_unlock(&mlock_fbatch.lock);
}
@@ -278,7 +278,7 @@ void mlock_new_folio(struct folio *folio)
folio_get(folio);
if (!folio_batch_add(fbatch, mlock_new(folio)) ||
- folio_test_large(folio) || lru_cache_disabled())
+ !folio_may_be_lru_cached(folio) || lru_cache_disabled())
mlock_folio_batch(fbatch);
local_unlock(&mlock_fbatch.lock);
}
@@ -299,7 +299,7 @@ void munlock_folio(struct folio *folio)
*/
folio_get(folio);
if (!folio_batch_add(fbatch, folio) ||
- folio_test_large(folio) || lru_cache_disabled())
+ !folio_may_be_lru_cached(folio) || lru_cache_disabled())
mlock_folio_batch(fbatch);
local_unlock(&mlock_fbatch.lock);
}
diff --git a/mm/swap.c b/mm/swap.c
index 6ae2d5680574..b74ebe865dd9 100644
--- a/mm/swap.c
+++ b/mm/swap.c
@@ -192,7 +192,7 @@ static void __folio_batch_add_and_move(struct folio_batch __percpu *fbatch,
local_lock(&cpu_fbatches.lock);
if (!folio_batch_add(this_cpu_ptr(fbatch), folio) ||
- folio_test_large(folio) || lru_cache_disabled())
+ !folio_may_be_lru_cached(folio) || lru_cache_disabled())
folio_batch_move_lru(this_cpu_ptr(fbatch), move_fn);
if (disable_irq)
The patch below does not apply to the 6.6-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.6.y
git checkout FETCH_HEAD
git cherry-pick -x 2da6de30e60dd9bb14600eff1cc99df2fa2ddae3
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025092143-defrost-backboned-d1ea@gregkh' --subject-prefix 'PATCH 6.6.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 2da6de30e60dd9bb14600eff1cc99df2fa2ddae3 Mon Sep 17 00:00:00 2001
From: Hugh Dickins <hughd(a)google.com>
Date: Mon, 8 Sep 2025 15:23:15 -0700
Subject: [PATCH] mm: folio_may_be_lru_cached() unless folio_test_large()
mm/swap.c and mm/mlock.c agree to drain any per-CPU batch as soon as a
large folio is added: so collect_longterm_unpinnable_folios() just wastes
effort when calling lru_add_drain[_all]() on a large folio.
But although there is good reason not to batch up PMD-sized folios, we
might well benefit from batching a small number of low-order mTHPs (though
unclear how that "small number" limitation will be implemented).
So ask if folio_may_be_lru_cached() rather than !folio_test_large(), to
insulate those particular checks from future change. Name preferred to
"folio_is_batchable" because large folios can well be put on a batch: it's
just the per-CPU LRU caches, drained much later, which need care.
Marked for stable, to counter the increase in lru_add_drain_all()s from
"mm/gup: check ref_count instead of lru before migration".
Link: https://lkml.kernel.org/r/57d2eaf8-3607-f318-e0c5-be02dce61ad0@google.com
Fixes: 9a4e9f3b2d73 ("mm: update get_user_pages_longterm to migrate pages allocated from CMA region")
Signed-off-by: Hugh Dickins <hughd(a)google.com>
Suggested-by: David Hildenbrand <david(a)redhat.com>
Acked-by: David Hildenbrand <david(a)redhat.com>
Cc: "Aneesh Kumar K.V" <aneesh.kumar(a)kernel.org>
Cc: Axel Rasmussen <axelrasmussen(a)google.com>
Cc: Chris Li <chrisl(a)kernel.org>
Cc: Christoph Hellwig <hch(a)infradead.org>
Cc: Jason Gunthorpe <jgg(a)ziepe.ca>
Cc: Johannes Weiner <hannes(a)cmpxchg.org>
Cc: John Hubbard <jhubbard(a)nvidia.com>
Cc: Keir Fraser <keirf(a)google.com>
Cc: Konstantin Khlebnikov <koct9i(a)gmail.com>
Cc: Li Zhe <lizhe.67(a)bytedance.com>
Cc: Matthew Wilcox (Oracle) <willy(a)infradead.org>
Cc: Peter Xu <peterx(a)redhat.com>
Cc: Rik van Riel <riel(a)surriel.com>
Cc: Shivank Garg <shivankg(a)amd.com>
Cc: Vlastimil Babka <vbabka(a)suse.cz>
Cc: Wei Xu <weixugc(a)google.com>
Cc: Will Deacon <will(a)kernel.org>
Cc: yangge <yangge1116(a)126.com>
Cc: Yuanchu Xie <yuanchu(a)google.com>
Cc: Yu Zhao <yuzhao(a)google.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
diff --git a/include/linux/swap.h b/include/linux/swap.h
index 2fe6ed2cc3fd..7012a0f758d8 100644
--- a/include/linux/swap.h
+++ b/include/linux/swap.h
@@ -385,6 +385,16 @@ void folio_add_lru_vma(struct folio *, struct vm_area_struct *);
void mark_page_accessed(struct page *);
void folio_mark_accessed(struct folio *);
+static inline bool folio_may_be_lru_cached(struct folio *folio)
+{
+ /*
+ * Holding PMD-sized folios in per-CPU LRU cache unbalances accounting.
+ * Holding small numbers of low-order mTHP folios in per-CPU LRU cache
+ * will be sensible, but nobody has implemented and tested that yet.
+ */
+ return !folio_test_large(folio);
+}
+
extern atomic_t lru_disable_count;
static inline bool lru_cache_disabled(void)
diff --git a/mm/gup.c b/mm/gup.c
index b47066a54f52..0bc4d140fc07 100644
--- a/mm/gup.c
+++ b/mm/gup.c
@@ -2307,13 +2307,13 @@ static unsigned long collect_longterm_unpinnable_folios(
continue;
}
- if (drained == 0 &&
+ if (drained == 0 && folio_may_be_lru_cached(folio) &&
folio_ref_count(folio) !=
folio_expected_ref_count(folio) + 1) {
lru_add_drain();
drained = 1;
}
- if (drained == 1 &&
+ if (drained == 1 && folio_may_be_lru_cached(folio) &&
folio_ref_count(folio) !=
folio_expected_ref_count(folio) + 1) {
lru_add_drain_all();
diff --git a/mm/mlock.c b/mm/mlock.c
index a1d93ad33c6d..bb0776f5ef7c 100644
--- a/mm/mlock.c
+++ b/mm/mlock.c
@@ -255,7 +255,7 @@ void mlock_folio(struct folio *folio)
folio_get(folio);
if (!folio_batch_add(fbatch, mlock_lru(folio)) ||
- folio_test_large(folio) || lru_cache_disabled())
+ !folio_may_be_lru_cached(folio) || lru_cache_disabled())
mlock_folio_batch(fbatch);
local_unlock(&mlock_fbatch.lock);
}
@@ -278,7 +278,7 @@ void mlock_new_folio(struct folio *folio)
folio_get(folio);
if (!folio_batch_add(fbatch, mlock_new(folio)) ||
- folio_test_large(folio) || lru_cache_disabled())
+ !folio_may_be_lru_cached(folio) || lru_cache_disabled())
mlock_folio_batch(fbatch);
local_unlock(&mlock_fbatch.lock);
}
@@ -299,7 +299,7 @@ void munlock_folio(struct folio *folio)
*/
folio_get(folio);
if (!folio_batch_add(fbatch, folio) ||
- folio_test_large(folio) || lru_cache_disabled())
+ !folio_may_be_lru_cached(folio) || lru_cache_disabled())
mlock_folio_batch(fbatch);
local_unlock(&mlock_fbatch.lock);
}
diff --git a/mm/swap.c b/mm/swap.c
index 6ae2d5680574..b74ebe865dd9 100644
--- a/mm/swap.c
+++ b/mm/swap.c
@@ -192,7 +192,7 @@ static void __folio_batch_add_and_move(struct folio_batch __percpu *fbatch,
local_lock(&cpu_fbatches.lock);
if (!folio_batch_add(this_cpu_ptr(fbatch), folio) ||
- folio_test_large(folio) || lru_cache_disabled())
+ !folio_may_be_lru_cached(folio) || lru_cache_disabled())
folio_batch_move_lru(this_cpu_ptr(fbatch), move_fn);
if (disable_irq)
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.4.y
git checkout FETCH_HEAD
git cherry-pick -x 98c6d259319ecf6e8d027abd3f14b81324b8c0ad
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025092156-candied-rogue-bf08@gregkh' --subject-prefix 'PATCH 5.4.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 98c6d259319ecf6e8d027abd3f14b81324b8c0ad Mon Sep 17 00:00:00 2001
From: Hugh Dickins <hughd(a)google.com>
Date: Mon, 8 Sep 2025 15:15:03 -0700
Subject: [PATCH] mm/gup: check ref_count instead of lru before migration
Patch series "mm: better GUP pin lru_add_drain_all()", v2.
Series of lru_add_drain_all()-related patches, arising from recent mm/gup
migration report from Will Deacon.
This patch (of 5):
Will Deacon reports:-
When taking a longterm GUP pin via pin_user_pages(),
__gup_longterm_locked() tries to migrate target folios that should not be
longterm pinned, for example because they reside in a CMA region or
movable zone. This is done by first pinning all of the target folios
anyway, collecting all of the longterm-unpinnable target folios into a
list, dropping the pins that were just taken and finally handing the list
off to migrate_pages() for the actual migration.
It is critically important that no unexpected references are held on the
folios being migrated, otherwise the migration will fail and
pin_user_pages() will return -ENOMEM to its caller. Unfortunately, it is
relatively easy to observe migration failures when running pKVM (which
uses pin_user_pages() on crosvm's virtual address space to resolve stage-2
page faults from the guest) on a 6.15-based Pixel 6 device and this
results in the VM terminating prematurely.
In the failure case, 'crosvm' has called mlock(MLOCK_ONFAULT) on its
mapping of guest memory prior to the pinning. Subsequently, when
pin_user_pages() walks the page-table, the relevant 'pte' is not present
and so the faulting logic allocates a new folio, mlocks it with
mlock_folio() and maps it in the page-table.
Since commit 2fbb0c10d1e8 ("mm/munlock: mlock_page() munlock_page() batch
by pagevec"), mlock/munlock operations on a folio (formerly page), are
deferred. For example, mlock_folio() takes an additional reference on the
target folio before placing it into a per-cpu 'folio_batch' for later
processing by mlock_folio_batch(), which drops the refcount once the
operation is complete. Processing of the batches is coupled with the LRU
batch logic and can be forcefully drained with lru_add_drain_all() but as
long as a folio remains unprocessed on the batch, its refcount will be
elevated.
This deferred batching therefore interacts poorly with the pKVM pinning
scenario as we can find ourselves in a situation where the migration code
fails to migrate a folio due to the elevated refcount from the pending
mlock operation.
Hugh Dickins adds:-
!folio_test_lru() has never been a very reliable way to tell if an
lru_add_drain_all() is worth calling, to remove LRU cache references to
make the folio migratable: the LRU flag may be set even while the folio is
held with an extra reference in a per-CPU LRU cache.
5.18 commit 2fbb0c10d1e8 may have made it more unreliable. Then 6.11
commit 33dfe9204f29 ("mm/gup: clear the LRU flag of a page before adding
to LRU batch") tried to make it reliable, by moving LRU flag clearing; but
missed the mlock/munlock batches, so still unreliable as reported.
And it turns out to be difficult to extend 33dfe9204f29's LRU flag
clearing to the mlock/munlock batches: if they do benefit from batching,
mlock/munlock cannot be so effective when easily suppressed while !LRU.
Instead, switch to an expected ref_count check, which was more reliable
all along: some more false positives (unhelpful drains) than before, and
never a guarantee that the folio will prove migratable, but better.
Note on PG_private_2: ceph and nfs are still using the deprecated
PG_private_2 flag, with the aid of netfs and filemap support functions.
Although it is consistently matched by an increment of folio ref_count,
folio_expected_ref_count() intentionally does not recognize it, and ceph
folio migration currently depends on that for PG_private_2 folios to be
rejected. New references to the deprecated flag are discouraged, so do
not add it into the collect_longterm_unpinnable_folios() calculation: but
longterm pinning of transiently PG_private_2 ceph and nfs folios (an
uncommon case) may invoke a redundant lru_add_drain_all(). And this makes
easy the backport to earlier releases: up to and including 6.12, btrfs
also used PG_private_2, but without a ref_count increment.
Note for stable backports: requires 6.16 commit 86ebd50224c0 ("mm:
add folio_expected_ref_count() for reference count calculation").
Link: https://lkml.kernel.org/r/41395944-b0e3-c3ac-d648-8ddd70451d28@google.com
Link: https://lkml.kernel.org/r/bd1f314a-fca1-8f19-cac0-b936c9614557@google.com
Fixes: 9a4e9f3b2d73 ("mm: update get_user_pages_longterm to migrate pages allocated from CMA region")
Signed-off-by: Hugh Dickins <hughd(a)google.com>
Reported-by: Will Deacon <will(a)kernel.org>
Closes: https://lore.kernel.org/linux-mm/20250815101858.24352-1-will@kernel.org/
Acked-by: Kiryl Shutsemau <kas(a)kernel.org>
Acked-by: David Hildenbrand <david(a)redhat.com>
Cc: "Aneesh Kumar K.V" <aneesh.kumar(a)kernel.org>
Cc: Axel Rasmussen <axelrasmussen(a)google.com>
Cc: Chris Li <chrisl(a)kernel.org>
Cc: Christoph Hellwig <hch(a)infradead.org>
Cc: Jason Gunthorpe <jgg(a)ziepe.ca>
Cc: Johannes Weiner <hannes(a)cmpxchg.org>
Cc: John Hubbard <jhubbard(a)nvidia.com>
Cc: Keir Fraser <keirf(a)google.com>
Cc: Konstantin Khlebnikov <koct9i(a)gmail.com>
Cc: Li Zhe <lizhe.67(a)bytedance.com>
Cc: Matthew Wilcox (Oracle) <willy(a)infradead.org>
Cc: Peter Xu <peterx(a)redhat.com>
Cc: Rik van Riel <riel(a)surriel.com>
Cc: Shivank Garg <shivankg(a)amd.com>
Cc: Vlastimil Babka <vbabka(a)suse.cz>
Cc: Wei Xu <weixugc(a)google.com>
Cc: yangge <yangge1116(a)126.com>
Cc: Yuanchu Xie <yuanchu(a)google.com>
Cc: Yu Zhao <yuzhao(a)google.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
diff --git a/mm/gup.c b/mm/gup.c
index adffe663594d..82aec6443c0a 100644
--- a/mm/gup.c
+++ b/mm/gup.c
@@ -2307,7 +2307,8 @@ static unsigned long collect_longterm_unpinnable_folios(
continue;
}
- if (!folio_test_lru(folio) && drain_allow) {
+ if (drain_allow && folio_ref_count(folio) !=
+ folio_expected_ref_count(folio) + 1) {
lru_add_drain_all();
drain_allow = false;
}
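The check introduced by the hunk above can be modeled in user space as a rough sketch. The struct toy_folio and the helper toy_should_drain() are hypothetical stand-ins, not the kernel's struct folio API; the "+ 1" accounts for the pin the caller has just taken, as in the mm/gup.c change.

```c
#include <stdbool.h>

/* Hypothetical user-space model; not the kernel's struct folio. */
struct toy_folio {
	int ref_count;      /* all references currently held */
	int expected_refs;  /* references explained by mappings, page cache, etc. */
};

/*
 * Mirrors the shape of the new test in collect_longterm_unpinnable_folios():
 * the caller holds one pin of its own, so any count other than
 * expected + 1 hints at a stray reference (for example a per-CPU batch).
 */
static bool toy_should_drain(const struct toy_folio *folio, bool drain_allow)
{
	return drain_allow && folio->ref_count != folio->expected_refs + 1;
}
```

As the commit message notes, a mismatch only suggests that a drain may help; it is never a guarantee that the folio will prove migratable.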
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.10.y
git checkout FETCH_HEAD
git cherry-pick -x 98c6d259319ecf6e8d027abd3f14b81324b8c0ad
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025092154-canon-user-98b7@gregkh' --subject-prefix 'PATCH 5.10.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 98c6d259319ecf6e8d027abd3f14b81324b8c0ad Mon Sep 17 00:00:00 2001
From: Hugh Dickins <hughd(a)google.com>
Date: Mon, 8 Sep 2025 15:15:03 -0700
Subject: [PATCH] mm/gup: check ref_count instead of lru before migration
The patch below does not apply to the 5.15-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.15.y
git checkout FETCH_HEAD
git cherry-pick -x 98c6d259319ecf6e8d027abd3f14b81324b8c0ad
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025092152-bobtail-scarring-ffff@gregkh' --subject-prefix 'PATCH 5.15.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 98c6d259319ecf6e8d027abd3f14b81324b8c0ad Mon Sep 17 00:00:00 2001
From: Hugh Dickins <hughd(a)google.com>
Date: Mon, 8 Sep 2025 15:15:03 -0700
Subject: [PATCH] mm/gup: check ref_count instead of lru before migration
wcd934x_codec_parse_data() contains a device reference count leak in
of_slim_get_device() where device_find_child() increases the reference
count of the device but this reference is not properly decreased in
the success path. Add put_device() in wcd934x_codec_parse_data(),
which ensures that the reference count of the device is correctly
managed.
Calling path: of_slim_get_device() -> of_find_slim_device() ->
device_find_child(). As the comment of device_find_child() says, 'NOTE:
you will need to drop the reference with put_device() after use.'.
Found by code review.
Cc: stable(a)vger.kernel.org
Fixes: a61f3b4f476e ("ASoC: wcd934x: add support to wcd9340/wcd9341 codec")
Signed-off-by: Ma Ke <make24(a)iscas.ac.cn>
---
sound/soc/codecs/wcd934x.c | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/sound/soc/codecs/wcd934x.c b/sound/soc/codecs/wcd934x.c
index 1bb7e1dc7e6b..9ffa65329934 100644
--- a/sound/soc/codecs/wcd934x.c
+++ b/sound/soc/codecs/wcd934x.c
@@ -5849,10 +5849,13 @@ static int wcd934x_codec_parse_data(struct wcd934x_codec *wcd)
slim_get_logical_addr(wcd->sidev);
wcd->if_regmap = regmap_init_slimbus(wcd->sidev,
&wcd934x_ifc_regmap_config);
- if (IS_ERR(wcd->if_regmap))
+ if (IS_ERR(wcd->if_regmap)) {
+ put_device(&wcd->sidev->dev);
return dev_err_probe(dev, PTR_ERR(wcd->if_regmap),
"Failed to allocate ifc register map\n");
+ }
+ put_device(&wcd->sidev->dev);
of_property_read_u32(dev->parent->of_node, "qcom,dmic-sample-rate",
&wcd->dmic_sample_rate);
--
2.17.1
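The reference handling in the wcd934x patch above can be illustrated with a small user-space sketch. The struct toy_device and the toy_* helpers are hypothetical and only model the reference count; they are not the driver-core API.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct device; only the refcount is modeled. */
struct toy_device {
	int refs;
};

/* Like device_find_child(): a successful lookup returns a held reference. */
static struct toy_device *toy_find_child(struct toy_device *child)
{
	if (child)
		child->refs++;
	return child;
}

/* Like put_device(): drop one reference taken earlier. */
static void toy_put_device(struct toy_device *dev)
{
	assert(dev->refs > 0);
	dev->refs--;
}

/*
 * Shape of the fixed wcd934x_codec_parse_data(): the reference is dropped
 * on the error path and on the success path alike.
 */
static int toy_parse_data(struct toy_device *child, int regmap_err)
{
	struct toy_device *sidev = toy_find_child(child);

	if (!sidev)
		return -1;
	if (regmap_err) {
		toy_put_device(sidev);
		return regmap_err;
	}
	toy_put_device(sidev);
	return 0;
}
```

In this model the count returns to its starting value after every call, which is the property the patch restores.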
When do_task() exhausts its iteration budget (!ret), it sets the state
to TASK_STATE_IDLE to reschedule, without a secondary check on the
current task->state. This can overwrite the TASK_STATE_DRAINING state
set by a concurrent call to rxe_cleanup_task() or rxe_disable_task().
While state changes are protected by a spinlock, both rxe_cleanup_task()
and rxe_disable_task() release the lock while waiting for the task to
finish draining in the while(!is_done(task)) loop. The race occurs if
do_task() hits its iteration limit and acquires the lock in this window.
The cleanup logic may then proceed while the task incorrectly
reschedules itself, leading to a potential use-after-free.
This bug was introduced during the migration from tasklets to workqueues,
where the special handling for the draining case was lost.
Fix this by restoring the original pre-migration behavior. If the state is
TASK_STATE_DRAINING when iterations are exhausted, set cont to 1 to
force a new loop iteration. This allows the task to finish its work, so
that a subsequent iteration can reach the switch statement and correctly
transition the state to TASK_STATE_DRAINED, stopping the task as intended.
Fixes: 9b4b7c1f9f54 ("RDMA/rxe: Add workqueue support for rxe tasks")
Cc: stable(a)vger.kernel.org
Reviewed-by: Zhu Yanjun <yanjun.zhu(a)linux.dev>
Signed-off-by: Gui-Dong Han <hanguidong02(a)gmail.com>
---
v2:
* Rewrite commit message for clarity. Thanks to Zhu Yanjun for the review.
---
drivers/infiniband/sw/rxe/rxe_task.c | 8 ++++++--
1 file changed, 6 insertions(+), 2 deletions(-)
diff --git a/drivers/infiniband/sw/rxe/rxe_task.c b/drivers/infiniband/sw/rxe/rxe_task.c
index 6f8f353e9583..f522820b950c 100644
--- a/drivers/infiniband/sw/rxe/rxe_task.c
+++ b/drivers/infiniband/sw/rxe/rxe_task.c
@@ -132,8 +132,12 @@ static void do_task(struct rxe_task *task)
* yield the cpu and reschedule the task
*/
if (!ret) {
- task->state = TASK_STATE_IDLE;
- resched = 1;
+ if (task->state != TASK_STATE_DRAINING) {
+ task->state = TASK_STATE_IDLE;
+ resched = 1;
+ } else {
+ cont = 1;
+ }
goto exit;
}
--
2.25.1
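The decision made in the !ret branch of the rxe patch above can be sketched in isolation. The enum and struct below are hypothetical simplifications of the rxe task state; locking and the surrounding loop are left out.

```c
/* Hypothetical subset of the rxe task states named in the commit message. */
enum toy_state {
	TOY_IDLE,
	TOY_BUSY,
	TOY_DRAINING,
	TOY_DRAINED,
};

struct toy_decision {
	enum toy_state state;
	int resched;  /* queue the task to run again later */
	int cont;     /* loop again right away */
};

/*
 * Decision taken when the iteration budget runs out (the !ret branch):
 * a DRAINING state set by a concurrent cleanup must not be overwritten.
 */
static struct toy_decision toy_budget_exhausted(enum toy_state current)
{
	struct toy_decision d = { .state = current, .resched = 0, .cont = 0 };

	if (current != TOY_DRAINING) {
		d.state = TOY_IDLE;
		d.resched = 1;
	} else {
		d.cont = 1;
	}
	return d;
}
```

With cont set, the real do_task() loops again so that a later pass can reach the switch statement and move the task to TASK_STATE_DRAINED.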
To: linux-kernel(a)vger.kernel.org
Cc: Paul Walmsley <paul.walmsley(a)sifive.com>
Cc: Samuel Holland <samuel.holland(a)sifive.com>
Cc: stable(a)vger.kernel.org
Cc: linux-riscv(a)lists.infradead.org
Cc: Thomas Gleixner <tglx(a)linutronix.de>
According to the PLIC specification[1], global interrupt sources are
assigned small unsigned integer identifiers beginning at the value 1.
An interrupt ID of 0 is reserved to mean "no interrupt".
The current plic_irq_resume() and plic_irq_suspend() functions incorrectly
start the loop from index 0, which could access the register space of the
reserved interrupt ID 0.
This fix changes the loop to start from index 1, skipping the reserved
interrupt ID 0 as per the PLIC specification.
This prevents potential undefined behavior when accessing the reserved
register space during suspend/resume cycles.
Fixes: e80f0b6a2cf3 ("irqchip/irq-sifive-plic: Add syscore callbacks for hibernation")
Co-developed-by: Jia Wang <wangjia(a)ultrarisc.com>
Signed-off-by: Jia Wang <wangjia(a)ultrarisc.com>
Signed-off-by: Lucas Zampieri <lzampier(a)redhat.com>
[1] https://github.com/riscv/riscv-plic-spec/releases/tag/1.0.0
---
drivers/irqchip/irq-sifive-plic.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/irqchip/irq-sifive-plic.c b/drivers/irqchip/irq-sifive-plic.c
index bf69a4802b71..1c2b4d2575ac 100644
--- a/drivers/irqchip/irq-sifive-plic.c
+++ b/drivers/irqchip/irq-sifive-plic.c
@@ -252,7 +252,7 @@ static int plic_irq_suspend(void)
priv = per_cpu_ptr(&plic_handlers, smp_processor_id())->priv;
- for (i = 0; i < priv->nr_irqs; i++) {
+ for (i = 1; i < priv->nr_irqs; i++) {
__assign_bit(i, priv->prio_save,
readl(priv->regs + PRIORITY_BASE + i * PRIORITY_PER_ID));
}
@@ -283,7 +283,7 @@ static void plic_irq_resume(void)
priv = per_cpu_ptr(&plic_handlers, smp_processor_id())->priv;
- for (i = 0; i < priv->nr_irqs; i++) {
+ for (i = 1; i < priv->nr_irqs; i++) {
index = BIT_WORD(i);
writel((priv->prio_save[index] & BIT_MASK(i)) ? 1 : 0,
priv->regs + PRIORITY_BASE + i * PRIORITY_PER_ID);
--
2.51.0
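The effect of the loop bound change in the PLIC patch above can be shown with a small sketch that only computes which priority register offsets a suspend or resume pass would touch. The helper toy_plic_offsets() and the constant TOY_PRIORITY_PER_ID are hypothetical; the stride of 4 bytes per interrupt ID is an assumption made for illustration.

```c
#include <stddef.h>
#include <stdint.h>

#define TOY_PRIORITY_PER_ID 4  /* assumed: one 32-bit priority register per ID */

/*
 * Collect the register offsets a suspend/resume pass would touch.
 * ID 0 means "no interrupt", so the walk starts at 1.
 * Returns the number of offsets written to out[].
 */
static size_t toy_plic_offsets(unsigned int nr_irqs, uint32_t *out, size_t cap)
{
	size_t n = 0;

	for (unsigned int i = 1; i < nr_irqs && n < cap; i++)
		out[n++] = i * TOY_PRIORITY_PER_ID;
	return n;
}
```

Offset 0, which belongs to the reserved ID 0, never appears in the output.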
The callback return value is ignored in damon_sysfs_damon_call(), which
means that it is not possible to detect invalid user input when writing
commands such as 'commit' to /sys/kernel/mm/damon/admin/kdamonds/<K>/state.
Fix it.
Signed-off-by: Akinobu Mita <akinobu.mita(a)gmail.com>
Fixes: f64539dcdb87 ("mm/damon/sysfs: use damon_call() for update_schemes_stats")
Cc: <stable(a)vger.kernel.org> # v6.14.x
Reviewed-by: SeongJae Park <sj(a)kernel.org>
---
mm/damon/sysfs.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/mm/damon/sysfs.c b/mm/damon/sysfs.c
index fe4e73d0ebbb..3ffe3a77b5db 100644
--- a/mm/damon/sysfs.c
+++ b/mm/damon/sysfs.c
@@ -1627,12 +1627,14 @@ static int damon_sysfs_damon_call(int (*fn)(void *data),
struct damon_sysfs_kdamond *kdamond)
{
struct damon_call_control call_control = {};
+ int err;
if (!kdamond->damon_ctx)
return -EINVAL;
call_control.fn = fn;
call_control.data = kdamond;
- return damon_call(kdamond->damon_ctx, &call_control);
+ err = damon_call(kdamond->damon_ctx, &call_control);
+ return err ? err : call_control.return_code;
}
struct damon_sysfs_schemes_walk_data {
--
2.43.0
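The return-value handling added by the DAMON patch above follows a common pattern: report a delivery failure first, otherwise surface the callback's own result. A minimal user-space sketch, with hypothetical toy_* names standing in for damon_call() and struct damon_call_control:

```c
#include <stddef.h>

/* Hypothetical mirror of the fields the patch relies on. */
struct toy_call_control {
	int (*fn)(void *data);
	void *data;
	int return_code;  /* filled in with fn()'s result */
};

/* Stand-in for damon_call(): runs fn unless the call could not be delivered. */
static int toy_call(struct toy_call_control *c, int delivery_err)
{
	if (delivery_err)
		return delivery_err;
	c->return_code = c->fn(c->data);
	return 0;
}

static int toy_sysfs_call(int (*fn)(void *data), void *data, int delivery_err)
{
	struct toy_call_control c = { .fn = fn, .data = data };
	int err = toy_call(&c, delivery_err);

	/* Delivery failure wins; otherwise surface the callback's own result. */
	return err ? err : c.return_code;
}

/* Example callbacks: one rejects its input, one accepts it. */
static int toy_reject(void *data) { (void)data; return -22; }
static int toy_accept(void *data) { (void)data; return 0; }
```

Before the fix, the equivalent of toy_sysfs_call() returned only err, so a callback that rejected invalid input still looked like a success to the sysfs writer.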
It seems like everywhere in this file, when the request is not
bidirectional, req->src is mapped with DMA_TO_DEVICE and req->dst is
mapped with DMA_FROM_DEVICE.
Fixes: 62f58b1637b7 ("crypto: aspeed - add HACE crypto driver")
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Thomas Fourier <fourier.thomas(a)gmail.com>
---
v1->v2:
- fix confusion between dst and src in commit message
drivers/crypto/aspeed/aspeed-hace-crypto.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/crypto/aspeed/aspeed-hace-crypto.c b/drivers/crypto/aspeed/aspeed-hace-crypto.c
index a72dfebc53ff..fa201dae1f81 100644
--- a/drivers/crypto/aspeed/aspeed-hace-crypto.c
+++ b/drivers/crypto/aspeed/aspeed-hace-crypto.c
@@ -346,7 +346,7 @@ static int aspeed_sk_start_sg(struct aspeed_hace_dev *hace_dev)
} else {
dma_unmap_sg(hace_dev->dev, req->dst, rctx->dst_nents,
- DMA_TO_DEVICE);
+ DMA_FROM_DEVICE);
dma_unmap_sg(hace_dev->dev, req->src, rctx->src_nents,
DMA_TO_DEVICE);
}
--
2.43.0
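The rule behind the aspeed fix above is that a buffer should be unmapped with the same direction it was mapped with. The following sketch tracks that in user space; the toy_* types are hypothetical and do not call the real DMA API.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the DMA direction constants. */
enum toy_dma_dir { TOY_TO_DEVICE, TOY_FROM_DEVICE };

struct toy_mapping {
	enum toy_dma_dir dir;
	bool mapped;
};

static void toy_map(struct toy_mapping *m, enum toy_dma_dir dir)
{
	m->dir = dir;
	m->mapped = true;
}

/* Returns false when the unmap direction disagrees with the map direction. */
static bool toy_unmap(struct toy_mapping *m, enum toy_dma_dir dir)
{
	bool ok = m->mapped && m->dir == dir;

	m->mapped = false;
	return ok;
}
```

In these terms, the old code mapped req->dst as TOY_FROM_DEVICE and unmapped it as TOY_TO_DEVICE, which toy_unmap() would flag.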
commit 96939cec994070aa5df852c10fad5fc303a97ea3 upstream.
When a SYN containing the 'C' flag (deny join id0) was received, this
piece of information was not propagated to the path-manager.
Even if this flag is mainly set on the server side, a client can also
tell the server it cannot try to establish new subflows to the client's
initial IP address and port. The server's PM should then record such
info when received, and before sending events about the new connection.
Fixes: df377be38725 ("mptcp: add deny_join_id0 in mptcp_options_received")
Reviewed-by: Mat Martineau <martineau(a)kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe(a)kernel.org>
Link: https://patch.msgid.link/20250912-net-mptcp-pm-uspace-deny_join_id0-v1-1-40…
Signed-off-by: Jakub Kicinski <kuba(a)kernel.org>
[ Conflicts in subflow.c, because of differences in the context, e.g.
introduced by commit 3a236aef280e ("mptcp: refactor passive socket
initialization"), which is not in this version. The same lines --
using 'mptcp_sk(new_msk)' instead of 'owner' -- can still be added
approximately at the same place, before calling
mptcp_pm_new_connection(). ]
Signed-off-by: Matthieu Baerts (NGI0) <matttbe(a)kernel.org>
---
net/mptcp/subflow.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/net/mptcp/subflow.c b/net/mptcp/subflow.c
index 6bc36132d490..f67d8c98d58a 100644
--- a/net/mptcp/subflow.c
+++ b/net/mptcp/subflow.c
@@ -758,6 +758,9 @@ static struct sock *subflow_syn_recv_sock(const struct sock *sk,
*/
WRITE_ONCE(mptcp_sk(new_msk)->first, child);
+ if (mp_opt.deny_join_id0)
+ WRITE_ONCE(mptcp_sk(new_msk)->pm.remote_deny_join_id0, true);
+
/* new mpc subflow takes ownership of the newly
* created mptcp socket
*/
--
2.51.0
Currently this is hidden behind perfmon_capable() since this is
technically an info leak, given that this is a system wide metric.
However the granularity reported here is always PAGE_SIZE aligned, which
matches what the core kernel is already willing to expose to userspace
if querying how many free RAM pages there are on the system, and that
doesn't need any special privileges. In addition other drm drivers seem
happy to expose this.
The motivation here is oneAPI, which wants to use the system-wide 'used'
reporting, not the per-client fdinfo stats. This has also come up with
some perf overlay applications wanting this information.
Fixes: 1105ac15d2a1 ("drm/xe/uapi: restrict system wide accounting")
Signed-off-by: Matthew Auld <matthew.auld(a)intel.com>
Cc: Thomas Hellström <thomas.hellstrom(a)linux.intel.com>
Cc: Joshua Santosh <joshua.santosh.ranjan(a)intel.com>
Cc: José Roberto de Souza <jose.souza(a)intel.com>
Cc: Matthew Brost <matthew.brost(a)intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi(a)intel.com>
Cc: <stable(a)vger.kernel.org> # v6.8+
---
drivers/gpu/drm/xe/xe_query.c | 15 ++++++---------
1 file changed, 6 insertions(+), 9 deletions(-)
diff --git a/drivers/gpu/drm/xe/xe_query.c b/drivers/gpu/drm/xe/xe_query.c
index e1b603aba61b..2e9ff33ed2fe 100644
--- a/drivers/gpu/drm/xe/xe_query.c
+++ b/drivers/gpu/drm/xe/xe_query.c
@@ -276,8 +276,7 @@ static int query_mem_regions(struct xe_device *xe,
mem_regions->mem_regions[0].instance = 0;
mem_regions->mem_regions[0].min_page_size = PAGE_SIZE;
mem_regions->mem_regions[0].total_size = man->size << PAGE_SHIFT;
- if (perfmon_capable())
- mem_regions->mem_regions[0].used = ttm_resource_manager_usage(man);
+ mem_regions->mem_regions[0].used = ttm_resource_manager_usage(man);
mem_regions->num_mem_regions = 1;
for (i = XE_PL_VRAM0; i <= XE_PL_VRAM1; ++i) {
@@ -293,13 +292,11 @@ static int query_mem_regions(struct xe_device *xe,
mem_regions->mem_regions[mem_regions->num_mem_regions].total_size =
man->size;
- if (perfmon_capable()) {
- xe_ttm_vram_get_used(man,
- &mem_regions->mem_regions
- [mem_regions->num_mem_regions].used,
- &mem_regions->mem_regions
- [mem_regions->num_mem_regions].cpu_visible_used);
- }
+ xe_ttm_vram_get_used(man,
+ &mem_regions->mem_regions
+ [mem_regions->num_mem_regions].used,
+ &mem_regions->mem_regions
+ [mem_regions->num_mem_regions].cpu_visible_used);
mem_regions->mem_regions[mem_regions->num_mem_regions].cpu_visible_size =
xe_ttm_vram_get_cpu_visible_size(man);
--
2.51.0
commit 2293c57484ae64c9a3c847c8807db8c26a3a4d41 upstream.
During the connection establishment, a peer can tell the other one that
it cannot establish new subflows to the initial IP address and port by
setting the 'C' flag [1]. Doing so makes sense when the sender is behind
a strict NAT, operating behind a legacy Layer 4 load balancer, or using
anycast IP address for example.
When this 'C' flag is set, the path-managers must then not try to
establish new subflows to the other peer's initial IP address and port.
The in-kernel PM has access to this info, but the userspace PM didn't.
The RFC8684 [1] is strict about that:
(...) therefore the receiver MUST NOT try to open any additional
subflows toward this address and port.
So it is important to tell the userspace about that as it is responsible
for the respect of this flag.
When a new connection is created and established, the Netlink events
now contain the existing but not currently used 'flags' attribute. When
MPTCP_PM_EV_FLAG_DENY_JOIN_ID0 is set, it means no other subflows
to the initial IP address and port -- info that are also part of the
event -- can be established.
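A minimal sketch of how a userspace path-manager might act on this flag,
assuming the event has already been parsed from Netlink. The struct
pm_event type and the may_join_initial_addr() helper are hypothetical and
only illustrate the check; the flag value mirrors the _BITUL(0) definition
added by this patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Mirrors the uapi value added by the patch: _BITUL(0). */
#ifndef MPTCP_PM_EV_FLAG_DENY_JOIN_ID0
#define MPTCP_PM_EV_FLAG_DENY_JOIN_ID0 (1UL << 0)
#endif

/* Hypothetical, simplified view of a parsed "created"/"established" event. */
struct pm_event {
	bool has_flags;  /* true if the optional 'flags' attribute was present */
	uint16_t flags;  /* value of the attribute, only valid if has_flags */
};

/*
 * Returns true if new subflows may be opened towards the peer's initial
 * IP address and port. The kernel only emits the attribute when at least
 * one flag is set, so a missing attribute means "no restriction".
 */
static bool may_join_initial_addr(const struct pm_event *ev)
{
	assert(ev);

	if (!ev->has_flags)
		return true;

	return !(ev->flags & MPTCP_PM_EV_FLAG_DENY_JOIN_ID0);
}
```

A path-manager built this way would skip the initial address and port when
the helper returns false, and only use addresses announced by the peer.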
Link: https://datatracker.ietf.org/doc/html/rfc8684#section-3.1-20.6 [1]
Fixes: 702c2f646d42 ("mptcp: netlink: allow userspace-driven subflow establishment")
Reported-by: Marek Majkowski <marek(a)cloudflare.com>
Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/532
Reviewed-by: Mat Martineau <martineau(a)kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe(a)kernel.org>
Link: https://patch.msgid.link/20250912-net-mptcp-pm-uspace-deny_join_id0-v1-2-40…
Signed-off-by: Jakub Kicinski <kuba(a)kernel.org>
[ Conflicts in mptcp_pm.yaml, because the indentation has been modified
in commit ec362192aa9e ("netlink: specs: fix up indentation errors"),
which is not in this version. Applying the same modifications, but at
a different level. ]
Signed-off-by: Matthieu Baerts (NGI0) <matttbe(a)kernel.org>
---
Documentation/netlink/specs/mptcp_pm.yaml | 4 ++--
include/uapi/linux/mptcp.h | 2 ++
include/uapi/linux/mptcp_pm.h | 4 ++--
net/mptcp/pm_netlink.c | 7 +++++++
4 files changed, 13 insertions(+), 4 deletions(-)
diff --git a/Documentation/netlink/specs/mptcp_pm.yaml b/Documentation/netlink/specs/mptcp_pm.yaml
index 7e295bad8b29..a670a9bbe01b 100644
--- a/Documentation/netlink/specs/mptcp_pm.yaml
+++ b/Documentation/netlink/specs/mptcp_pm.yaml
@@ -28,13 +28,13 @@ definitions:
traffic-patterns it can take a long time until the
MPTCP_EVENT_ESTABLISHED is sent.
Attributes: token, family, saddr4 | saddr6, daddr4 | daddr6, sport,
- dport, server-side.
+ dport, server-side, [flags].
-
name: established
doc: >-
A MPTCP connection is established (can start new subflows).
Attributes: token, family, saddr4 | saddr6, daddr4 | daddr6, sport,
- dport, server-side.
+ dport, server-side, [flags].
-
name: closed
doc: >-
diff --git a/include/uapi/linux/mptcp.h b/include/uapi/linux/mptcp.h
index 67d015df8893..5fd5b4cf75ca 100644
--- a/include/uapi/linux/mptcp.h
+++ b/include/uapi/linux/mptcp.h
@@ -31,6 +31,8 @@
#define MPTCP_INFO_FLAG_FALLBACK _BITUL(0)
#define MPTCP_INFO_FLAG_REMOTE_KEY_RECEIVED _BITUL(1)
+#define MPTCP_PM_EV_FLAG_DENY_JOIN_ID0 _BITUL(0)
+
#define MPTCP_PM_ADDR_FLAG_SIGNAL (1 << 0)
#define MPTCP_PM_ADDR_FLAG_SUBFLOW (1 << 1)
#define MPTCP_PM_ADDR_FLAG_BACKUP (1 << 2)
diff --git a/include/uapi/linux/mptcp_pm.h b/include/uapi/linux/mptcp_pm.h
index 6ac84b2f636c..7359d34da446 100644
--- a/include/uapi/linux/mptcp_pm.h
+++ b/include/uapi/linux/mptcp_pm.h
@@ -16,10 +16,10 @@
* good time to allocate memory and send ADD_ADDR if needed. Depending on the
* traffic-patterns it can take a long time until the MPTCP_EVENT_ESTABLISHED
* is sent. Attributes: token, family, saddr4 | saddr6, daddr4 | daddr6,
- * sport, dport, server-side.
+ * sport, dport, server-side, [flags].
* @MPTCP_EVENT_ESTABLISHED: A MPTCP connection is established (can start new
* subflows). Attributes: token, family, saddr4 | saddr6, daddr4 | daddr6,
- * sport, dport, server-side.
+ * sport, dport, server-side, [flags].
* @MPTCP_EVENT_CLOSED: A MPTCP connection has stopped. Attribute: token.
* @MPTCP_EVENT_ANNOUNCED: A new address has been announced by the peer.
* Attributes: token, rem_id, family, daddr4 | daddr6 [, dport].
diff --git a/net/mptcp/pm_netlink.c b/net/mptcp/pm_netlink.c
index b763729b85e0..463c2e7956d5 100644
--- a/net/mptcp/pm_netlink.c
+++ b/net/mptcp/pm_netlink.c
@@ -2211,6 +2211,7 @@ static int mptcp_event_created(struct sk_buff *skb,
const struct sock *ssk)
{
int err = nla_put_u32(skb, MPTCP_ATTR_TOKEN, READ_ONCE(msk->token));
+ u16 flags = 0;
if (err)
return err;
@@ -2218,6 +2219,12 @@ static int mptcp_event_created(struct sk_buff *skb,
if (nla_put_u8(skb, MPTCP_ATTR_SERVER_SIDE, READ_ONCE(msk->pm.server_side)))
return -EMSGSIZE;
+ if (READ_ONCE(msk->pm.remote_deny_join_id0))
+ flags |= MPTCP_PM_EV_FLAG_DENY_JOIN_ID0;
+
+ if (flags && nla_put_u16(skb, MPTCP_ATTR_FLAGS, flags))
+ return -EMSGSIZE;
+
return mptcp_event_add_subflow(skb, ssk);
}
--
2.51.0
commit 2293c57484ae64c9a3c847c8807db8c26a3a4d41 upstream.
During the connection establishment, a peer can tell the other one that
it cannot establish new subflows to the initial IP address and port by
setting the 'C' flag [1]. Doing so makes sense when the sender is behind
a strict NAT, operating behind a legacy Layer 4 load balancer, or using
anycast IP address for example.
When this 'C' flag is set, the path-managers must then not try to
establish new subflows to the other peer's initial IP address and port.
The in-kernel PM has access to this info, but the userspace PM didn't.
The RFC8684 [1] is strict about that:
(...) therefore the receiver MUST NOT try to open any additional
subflows toward this address and port.
So it is important to tell the userspace about that as it is responsible
for the respect of this flag.
When a new connection is created and established, the Netlink events
now contain the existing but not currently used 'flags' attribute. When
MPTCP_PM_EV_FLAG_DENY_JOIN_ID0 is set, it means no other subflows
to the initial IP address and port -- info that are also part of the
event -- can be established.
Link: https://datatracker.ietf.org/doc/html/rfc8684#section-3.1-20.6 [1]
Fixes: 702c2f646d42 ("mptcp: netlink: allow userspace-driven subflow establishment")
Reported-by: Marek Majkowski <marek(a)cloudflare.com>
Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/532
Reviewed-by: Mat Martineau <martineau(a)kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe(a)kernel.org>
Link: https://patch.msgid.link/20250912-net-mptcp-pm-uspace-deny_join_id0-v1-2-40…
Signed-off-by: Jakub Kicinski <kuba(a)kernel.org>
[ Conflicts in mptcp_pm.yaml, because the indentation has been modified
in commit ec362192aa9e ("netlink: specs: fix up indentation errors"),
which is not in this version. Applying the same modifications, but at
a different level. ]
Signed-off-by: Matthieu Baerts (NGI0) <matttbe(a)kernel.org>
---
Documentation/netlink/specs/mptcp_pm.yaml | 4 ++--
include/uapi/linux/mptcp.h | 2 ++
include/uapi/linux/mptcp_pm.h | 4 ++--
net/mptcp/pm_netlink.c | 7 +++++++
4 files changed, 13 insertions(+), 4 deletions(-)
diff --git a/Documentation/netlink/specs/mptcp_pm.yaml b/Documentation/netlink/specs/mptcp_pm.yaml
index ecfe5ee33de2..c77f32cfcae9 100644
--- a/Documentation/netlink/specs/mptcp_pm.yaml
+++ b/Documentation/netlink/specs/mptcp_pm.yaml
@@ -28,13 +28,13 @@ definitions:
traffic-patterns it can take a long time until the
MPTCP_EVENT_ESTABLISHED is sent.
Attributes: token, family, saddr4 | saddr6, daddr4 | daddr6, sport,
- dport, server-side.
+ dport, server-side, [flags].
-
name: established
doc: >-
A MPTCP connection is established (can start new subflows).
Attributes: token, family, saddr4 | saddr6, daddr4 | daddr6, sport,
- dport, server-side.
+ dport, server-side, [flags].
-
name: closed
doc: >-
diff --git a/include/uapi/linux/mptcp.h b/include/uapi/linux/mptcp.h
index 67d015df8893..5fd5b4cf75ca 100644
--- a/include/uapi/linux/mptcp.h
+++ b/include/uapi/linux/mptcp.h
@@ -31,6 +31,8 @@
#define MPTCP_INFO_FLAG_FALLBACK _BITUL(0)
#define MPTCP_INFO_FLAG_REMOTE_KEY_RECEIVED _BITUL(1)
+#define MPTCP_PM_EV_FLAG_DENY_JOIN_ID0 _BITUL(0)
+
#define MPTCP_PM_ADDR_FLAG_SIGNAL (1 << 0)
#define MPTCP_PM_ADDR_FLAG_SUBFLOW (1 << 1)
#define MPTCP_PM_ADDR_FLAG_BACKUP (1 << 2)
diff --git a/include/uapi/linux/mptcp_pm.h b/include/uapi/linux/mptcp_pm.h
index 6ac84b2f636c..7359d34da446 100644
--- a/include/uapi/linux/mptcp_pm.h
+++ b/include/uapi/linux/mptcp_pm.h
@@ -16,10 +16,10 @@
* good time to allocate memory and send ADD_ADDR if needed. Depending on the
* traffic-patterns it can take a long time until the MPTCP_EVENT_ESTABLISHED
* is sent. Attributes: token, family, saddr4 | saddr6, daddr4 | daddr6,
- * sport, dport, server-side.
+ * sport, dport, server-side, [flags].
* @MPTCP_EVENT_ESTABLISHED: A MPTCP connection is established (can start new
* subflows). Attributes: token, family, saddr4 | saddr6, daddr4 | daddr6,
- * sport, dport, server-side.
+ * sport, dport, server-side, [flags].
* @MPTCP_EVENT_CLOSED: A MPTCP connection has stopped. Attribute: token.
* @MPTCP_EVENT_ANNOUNCED: A new address has been announced by the peer.
* Attributes: token, rem_id, family, daddr4 | daddr6 [, dport].
diff --git a/net/mptcp/pm_netlink.c b/net/mptcp/pm_netlink.c
index 50aaf259959a..ce7d42d3bd00 100644
--- a/net/mptcp/pm_netlink.c
+++ b/net/mptcp/pm_netlink.c
@@ -408,6 +408,7 @@ static int mptcp_event_created(struct sk_buff *skb,
const struct sock *ssk)
{
int err = nla_put_u32(skb, MPTCP_ATTR_TOKEN, READ_ONCE(msk->token));
+ u16 flags = 0;
if (err)
return err;
@@ -415,6 +416,12 @@ static int mptcp_event_created(struct sk_buff *skb,
if (nla_put_u8(skb, MPTCP_ATTR_SERVER_SIDE, READ_ONCE(msk->pm.server_side)))
return -EMSGSIZE;
+ if (READ_ONCE(msk->pm.remote_deny_join_id0))
+ flags |= MPTCP_PM_EV_FLAG_DENY_JOIN_ID0;
+
+ if (flags && nla_put_u16(skb, MPTCP_ATTR_FLAGS, flags))
+ return -EMSGSIZE;
+
return mptcp_event_add_subflow(skb, ssk);
}
--
2.51.0
Hi Sasha,
Thank you for maintaining the stable versions with Greg!
If I remember correctly, you run some scripts on your side to maintain the
queue/* branches in the linux-stable-rc Git tree [1], is that correct?
These branches have not been updated for a bit more than 3 weeks. Is that
expected?
Personally, I find them useful. But if it is just me, I can work without
them.
[1]
https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git/…
Cheers,
Matt
--
Sponsored by the NGI0 Core fund.
The patch titled
Subject: kmsan: Fix out-of-bounds access to shadow memory
has been added to the -mm mm-hotfixes-unstable branch. Its filename is
kmsan-fix-out-of-bounds-access-to-shadow-memory.patch
This patch will shortly appear at
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patche…
This patch will later appear in the mm-hotfixes-unstable branch at
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next via the mm-everything
branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there every 2-3 working days
------------------------------------------------------
From: Eric Biggers <ebiggers(a)kernel.org>
Subject: kmsan: Fix out-of-bounds access to shadow memory
Date: Thu, 11 Sep 2025 12:58:58 -0700
Running sha224_kunit on a KMSAN-enabled kernel results in a crash in
kmsan_internal_set_shadow_origin():
BUG: unable to handle page fault for address: ffffbc3840291000
#PF: supervisor read access in kernel mode
#PF: error_code(0x0000) - not-present page
PGD 1810067 P4D 1810067 PUD 192d067 PMD 3c17067 PTE 0
Oops: 0000 [#1] SMP NOPTI
CPU: 0 UID: 0 PID: 81 Comm: kunit_try_catch Tainted: G N 6.17.0-rc3 #10 PREEMPT(voluntary)
Tainted: [N]=TEST
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.17.0-0-gb52ca86e094d-prebuilt.qemu.org 04/01/2014
RIP: 0010:kmsan_internal_set_shadow_origin+0x91/0x100
[...]
Call Trace:
<TASK>
__msan_memset+0xee/0x1a0
sha224_final+0x9e/0x350
test_hash_buffer_overruns+0x46f/0x5f0
? kmsan_get_shadow_origin_ptr+0x46/0xa0
? __pfx_test_hash_buffer_overruns+0x10/0x10
kunit_try_run_case+0x198/0xa00
This occurs when memset() is called on a buffer that is not 4-byte aligned
and extends to the end of a guard page, i.e. the next page is unmapped.
The bug is that the loop at the end of kmsan_internal_set_shadow_origin()
accesses the wrong shadow memory bytes when the address is not 4-byte
aligned. Since each 4 bytes are associated with an origin, it rounds the
address and size so that it can access all the origins that contain the
buffer. However, when it checks the corresponding shadow bytes for a
particular origin, it incorrectly uses the original unrounded shadow
address. This results in reads from shadow memory beyond the end of the
buffer's shadow memory, which crashes when that memory is not mapped.
To fix this, correctly align the shadow address before accessing the 4
shadow bytes corresponding to each origin.
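The arithmetic behind the bug can be sketched in userspace C. This is an
illustration under stated assumptions, not the kernel code: ORIGIN_SIZE
stands in for KMSAN_ORIGIN_SIZE, and the helper names are hypothetical.
It counts how many shadow bytes the origin loop reads past the end of the
buffer's shadow when it indexes from the unaligned shadow start versus the
aligned one.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define ORIGIN_SIZE 4u /* stands in for KMSAN_ORIGIN_SIZE */

/* How far addr sits past the previous 4-byte boundary. */
static size_t origin_pad(uint64_t addr)
{
	return (size_t)(addr % ORIGIN_SIZE);
}

/* Number of 4-byte origin slots that overlap [addr, addr + size). */
static size_t origin_slots(uint64_t addr, size_t size)
{
	assert(size > 0);
	return (size + origin_pad(addr) + ORIGIN_SIZE - 1) / ORIGIN_SIZE;
}

/*
 * Shadow bytes read beyond the buffer's shadow when the loop walks
 * 4-byte words starting at the unaligned shadow address (old behaviour).
 */
static size_t overrun_unaligned(uint64_t addr, size_t size)
{
	return origin_slots(addr, size) * ORIGIN_SIZE - size;
}

/*
 * Same count when the walk starts 'pad' bytes earlier, at the aligned
 * shadow address (fixed behaviour). Any remainder stays inside the last
 * aligned 4-byte word, so it cannot cross into the next page.
 */
static size_t overrun_aligned(uint64_t addr, size_t size)
{
	return origin_slots(addr, size) * ORIGIN_SIZE - origin_pad(addr) - size;
}
```

For a 5-byte buffer starting at 0x1ffb, which ends exactly at a page
boundary, the unaligned walk reads 3 shadow bytes into the next page, while
the aligned walk stops at the boundary.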
Link: https://lkml.kernel.org/r/20250911195858.394235-1-ebiggers@kernel.org
Fixes: 2ef3cec44c60 ("kmsan: do not wipe out origin when doing partial unpoisoning")
Signed-off-by: Eric Biggers <ebiggers(a)kernel.org>
Tested-by: Alexander Potapenko <glider(a)google.com>
Reviewed-by: Alexander Potapenko <glider(a)google.com>
Cc: Dmitriy Vyukov <dvyukov(a)google.com>
Cc: Marco Elver <elver(a)google.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/kmsan/core.c | 10 +++++++---
mm/kmsan/kmsan_test.c | 16 ++++++++++++++++
2 files changed, 23 insertions(+), 3 deletions(-)
--- a/mm/kmsan/core.c~kmsan-fix-out-of-bounds-access-to-shadow-memory
+++ a/mm/kmsan/core.c
@@ -195,7 +195,8 @@ void kmsan_internal_set_shadow_origin(vo
u32 origin, bool checked)
{
u64 address = (u64)addr;
- u32 *shadow_start, *origin_start;
+ void *shadow_start;
+ u32 *aligned_shadow, *origin_start;
size_t pad = 0;
KMSAN_WARN_ON(!kmsan_metadata_is_contiguous(addr, size));
@@ -214,9 +215,12 @@ void kmsan_internal_set_shadow_origin(vo
}
__memset(shadow_start, b, size);
- if (!IS_ALIGNED(address, KMSAN_ORIGIN_SIZE)) {
+ if (IS_ALIGNED(address, KMSAN_ORIGIN_SIZE)) {
+ aligned_shadow = shadow_start;
+ } else {
pad = address % KMSAN_ORIGIN_SIZE;
address -= pad;
+ aligned_shadow = shadow_start - pad;
size += pad;
}
size = ALIGN(size, KMSAN_ORIGIN_SIZE);
@@ -230,7 +234,7 @@ void kmsan_internal_set_shadow_origin(vo
* corresponding shadow slot is zero.
*/
for (int i = 0; i < size / KMSAN_ORIGIN_SIZE; i++) {
- if (origin || !shadow_start[i])
+ if (origin || !aligned_shadow[i])
origin_start[i] = origin;
}
}
--- a/mm/kmsan/kmsan_test.c~kmsan-fix-out-of-bounds-access-to-shadow-memory
+++ a/mm/kmsan/kmsan_test.c
@@ -556,6 +556,21 @@ DEFINE_TEST_MEMSETXX(16)
DEFINE_TEST_MEMSETXX(32)
DEFINE_TEST_MEMSETXX(64)
+/* Test case: ensure that KMSAN does not access shadow memory out of bounds. */
+static void test_memset_on_guarded_buffer(struct kunit *test)
+{
+ void *buf = vmalloc(PAGE_SIZE);
+
+ kunit_info(test,
+ "memset() on ends of guarded buffer should not crash\n");
+
+ for (size_t size = 0; size <= 128; size++) {
+ memset(buf, 0xff, size);
+ memset(buf + PAGE_SIZE - size, 0xff, size);
+ }
+ vfree(buf);
+}
+
static noinline void fibonacci(int *array, int size, int start)
{
if (start < 2 || (start == size))
@@ -677,6 +692,7 @@ static struct kunit_case kmsan_test_case
KUNIT_CASE(test_memset16),
KUNIT_CASE(test_memset32),
KUNIT_CASE(test_memset64),
+ KUNIT_CASE(test_memset_on_guarded_buffer),
KUNIT_CASE(test_long_origin_chain),
KUNIT_CASE(test_stackdepot_roundtrip),
KUNIT_CASE(test_unpoison_memory),
_
Patches currently in -mm which might be from ebiggers(a)kernel.org are
kmsan-fix-out-of-bounds-access-to-shadow-memory.patch
The patch titled
Subject: Squashfs: fix uninit-value in squashfs_get_parent
has been added to the -mm mm-nonmm-unstable branch. Its filename is
squashfs-fix-uninit-value-in-squashfs_get_parent.patch
This patch will shortly appear at
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patche…
This patch will later appear in the mm-nonmm-unstable branch at
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next via the mm-everything
branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there every 2-3 working days
------------------------------------------------------
From: Phillip Lougher <phillip(a)squashfs.org.uk>
Subject: Squashfs: fix uninit-value in squashfs_get_parent
Date: Fri, 19 Sep 2025 00:33:08 +0100
Syzkaller reports a "KMSAN: uninit-value in squashfs_get_parent" bug.
This is caused by open_by_handle_at() being called with a file handle
containing an invalid parent inode number. In particular the inode number
is that of a symbolic link, rather than a directory.
Squashfs_get_parent() gets called with that symbolic link inode, and
accesses the parent member field.
unsigned int parent_ino = squashfs_i(inode)->parent;
Because non-directory inodes in Squashfs do not have a parent value, this
is uninitialised, and this causes an uninitialised value access.
The fix is to initialise parent with the invalid inode 0, which will cause
an EINVAL error to be returned.
Regular inodes used to share the parent field with the block_list_start
field. This is removed in this commit to enable the parent field to
contain the invalid inode number 0.
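The layout change can be illustrated with a small standalone sketch. The
two structs below are simplified, hypothetical stand-ins for struct
squashfs_inode_info before and after the patch; the members fragment_size
and fragment_offset are assumptions added to make the overlap visible and
are not shown in the diff.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-in for the old layout: parent lives inside the union. */
struct old_layout {
	union {
		struct {
			uint64_t fragment_block;
			int fragment_size;      /* assumed member */
			int fragment_offset;    /* assumed member */
			uint64_t block_list_start;
		};
		struct {
			uint64_t dir_idx_start;
			int dir_idx_offset;
			int dir_idx_cnt;
			int parent;
		};
	};
};

/* Simplified stand-in for the new layout: parent has its own storage. */
struct new_layout {
	int parent;
	union {
		struct {
			uint64_t fragment_block;
			int fragment_size;      /* assumed member */
			int fragment_offset;    /* assumed member */
			uint64_t block_list_start;
		};
		struct {
			uint64_t dir_idx_start;
			int dir_idx_offset;
			int dir_idx_cnt;
		};
	};
};

/* In the old layout, writing parent would clobber block_list_start. */
static_assert(offsetof(struct old_layout, parent) ==
	      offsetof(struct old_layout, block_list_start),
	      "parent aliases block_list_start in the old layout");
```

With parent moved out of the union, regular-file inodes can store 0 in it
without touching block_list_start.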
Link: https://lkml.kernel.org/r/20250918233308.293861-1-phillip@squashfs.org.uk
Fixes: 122601408d20 ("Squashfs: export operations")
Signed-off-by: Phillip Lougher <phillip(a)squashfs.org.uk>
Reported-by: syzbot+157bdef5cf596ad0da2c(a)syzkaller.appspotmail.com
Closes: https://lore.kernel.org/all/68cc2431.050a0220.139b6.0001.GAE@google.com/
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
fs/squashfs/inode.c | 7 +++++++
fs/squashfs/squashfs_fs_i.h | 2 +-
2 files changed, 8 insertions(+), 1 deletion(-)
--- a/fs/squashfs/inode.c~squashfs-fix-uninit-value-in-squashfs_get_parent
+++ a/fs/squashfs/inode.c
@@ -169,6 +169,7 @@ int squashfs_read_inode(struct inode *in
squashfs_i(inode)->start = le32_to_cpu(sqsh_ino->start_block);
squashfs_i(inode)->block_list_start = block;
squashfs_i(inode)->offset = offset;
+ squashfs_i(inode)->parent = 0;
inode->i_data.a_ops = &squashfs_aops;
TRACE("File inode %x:%x, start_block %llx, block_list_start "
@@ -216,6 +217,7 @@ int squashfs_read_inode(struct inode *in
squashfs_i(inode)->start = le64_to_cpu(sqsh_ino->start_block);
squashfs_i(inode)->block_list_start = block;
squashfs_i(inode)->offset = offset;
+ squashfs_i(inode)->parent = 0;
inode->i_data.a_ops = &squashfs_aops;
TRACE("File inode %x:%x, start_block %llx, block_list_start "
@@ -296,6 +298,7 @@ int squashfs_read_inode(struct inode *in
inode->i_mode |= S_IFLNK;
squashfs_i(inode)->start = block;
squashfs_i(inode)->offset = offset;
+ squashfs_i(inode)->parent = 0;
if (type == SQUASHFS_LSYMLINK_TYPE) {
__le32 xattr;
@@ -333,6 +336,7 @@ int squashfs_read_inode(struct inode *in
set_nlink(inode, le32_to_cpu(sqsh_ino->nlink));
rdev = le32_to_cpu(sqsh_ino->rdev);
init_special_inode(inode, inode->i_mode, new_decode_dev(rdev));
+ squashfs_i(inode)->parent = 0;
TRACE("Device inode %x:%x, rdev %x\n",
SQUASHFS_INODE_BLK(ino), offset, rdev);
@@ -357,6 +361,7 @@ int squashfs_read_inode(struct inode *in
set_nlink(inode, le32_to_cpu(sqsh_ino->nlink));
rdev = le32_to_cpu(sqsh_ino->rdev);
init_special_inode(inode, inode->i_mode, new_decode_dev(rdev));
+ squashfs_i(inode)->parent = 0;
TRACE("Device inode %x:%x, rdev %x\n",
SQUASHFS_INODE_BLK(ino), offset, rdev);
@@ -377,6 +382,7 @@ int squashfs_read_inode(struct inode *in
inode->i_mode |= S_IFSOCK;
set_nlink(inode, le32_to_cpu(sqsh_ino->nlink));
init_special_inode(inode, inode->i_mode, 0);
+ squashfs_i(inode)->parent = 0;
break;
}
case SQUASHFS_LFIFO_TYPE:
@@ -396,6 +402,7 @@ int squashfs_read_inode(struct inode *in
inode->i_op = &squashfs_inode_ops;
set_nlink(inode, le32_to_cpu(sqsh_ino->nlink));
init_special_inode(inode, inode->i_mode, 0);
+ squashfs_i(inode)->parent = 0;
break;
}
default:
--- a/fs/squashfs/squashfs_fs_i.h~squashfs-fix-uninit-value-in-squashfs_get_parent
+++ a/fs/squashfs/squashfs_fs_i.h
@@ -16,6 +16,7 @@ struct squashfs_inode_info {
u64 xattr;
unsigned int xattr_size;
int xattr_count;
+ int parent;
union {
struct {
u64 fragment_block;
@@ -27,7 +28,6 @@ struct squashfs_inode_info {
u64 dir_idx_start;
int dir_idx_offset;
int dir_idx_cnt;
- int parent;
};
};
struct inode vfs_inode;
_
Patches currently in -mm which might be from phillip(a)squashfs.org.uk are
squashfs-fix-uninit-value-in-squashfs_get_parent.patch
The patch titled
Subject: fs/proc/task_mmu: check cur_buf for NULL
has been added to the -mm mm-hotfixes-unstable branch. Its filename is
fs-proc-task_mmu-check-cur_buf-for-null.patch
This patch will shortly appear at
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patche…
This patch will later appear in the mm-hotfixes-unstable branch at
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next via the mm-everything
branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there every 2-3 working days
------------------------------------------------------
From: Jakub Acs <acsjakub(a)amazon.de>
Subject: fs/proc/task_mmu: check cur_buf for NULL
Date: Fri, 19 Sep 2025 14:21:04 +0000
When a PAGEMAP_SCAN ioctl invoked with vec_len = 0 reaches
pagemap_scan_backout_range(), the kernel panics with a null-ptr-deref:
[ 44.936808] Oops: general protection fault, probably for non-canonical address 0xdffffc0000000000: 0000 [#1] SMP DEBUG_PAGEALLOC KASAN NOPTI
[ 44.937797] KASAN: null-ptr-deref in range [0x0000000000000000-0x0000000000000007]
[ 44.938391] CPU: 1 UID: 0 PID: 2480 Comm: reproducer Not tainted 6.17.0-rc6 #22 PREEMPT(none)
[ 44.939062] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.3-0-ga6ed6b701f0a-prebuilt.qemu.org 04/01/2014
[ 44.939935] RIP: 0010:pagemap_scan_thp_entry.isra.0+0x741/0xa80
<snip registers, unreliable trace>
[ 44.946828] Call Trace:
[ 44.947030] <TASK>
[ 44.949219] pagemap_scan_pmd_entry+0xec/0xfa0
[ 44.952593] walk_pmd_range.isra.0+0x302/0x910
[ 44.954069] walk_pud_range.isra.0+0x419/0x790
[ 44.954427] walk_p4d_range+0x41e/0x620
[ 44.954743] walk_pgd_range+0x31e/0x630
[ 44.955057] __walk_page_range+0x160/0x670
[ 44.956883] walk_page_range_mm+0x408/0x980
[ 44.958677] walk_page_range+0x66/0x90
[ 44.958984] do_pagemap_scan+0x28d/0x9c0
[ 44.961833] do_pagemap_cmd+0x59/0x80
[ 44.962484] __x64_sys_ioctl+0x18d/0x210
[ 44.962804] do_syscall_64+0x5b/0x290
[ 44.963111] entry_SYSCALL_64_after_hwframe+0x76/0x7e
vec_len = 0 in pagemap_scan_init_bounce_buffer() means no buffers are
allocated and p->vec_buf remains set to NULL.
This breaks an assumption made later in pagemap_scan_backout_range(), that
page_region is always allocated for p->vec_buf_index.
Fix it by explicitly checking cur_buf for NULL before dereferencing.
Other sites that might run into the same dereference issue are already
(directly or transitively) protected by checking p->vec_buf.
Note:
From the PAGEMAP_SCAN man page, it seems vec_len = 0 is valid when no
output is requested and the caller is only interested in the side effects,
hence it passes the check in pagemap_scan_get_args().
This issue was found by syzkaller.
Link: https://lkml.kernel.org/r/20250919142106.43527-1-acsjakub@amazon.de
Fixes: 52526ca7fdb9 ("fs/proc/task_mmu: implement IOCTL to get and optionally clear info about PTEs")
Signed-off-by: Jakub Acs <acsjakub(a)amazon.de>
Cc: David Hildenbrand <david(a)redhat.com>
Cc: Vlastimil Babka <vbabka(a)suse.cz>
Cc: Lorenzo Stoakes <lorenzo.stoakes(a)oracle.com>
Cc: Jinjiang Tu <tujinjiang(a)huawei.com>
Cc: Suren Baghdasaryan <surenb(a)google.com>
Cc: Penglei Jiang <superman.xpt(a)gmail.com>
Cc: Mark Brown <broonie(a)kernel.org>
Cc: Baolin Wang <baolin.wang(a)linux.alibaba.com>
Cc: Ryan Roberts <ryan.roberts(a)arm.com>
Cc: Andrei Vagin <avagin(a)gmail.com>
Cc: "Michał Mirosław" <mirq-linux(a)rere.qmqm.pl>
Cc: Stephen Rothwell <sfr(a)canb.auug.org.au>
Cc: Muhammad Usama Anjum <usama.anjum(a)collabora.com>
Cc: Alexey Dobriyan <adobriyan(a)gmail.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
fs/proc/task_mmu.c | 3 +++
1 file changed, 3 insertions(+)
--- a/fs/proc/task_mmu.c~fs-proc-task_mmu-check-cur_buf-for-null
+++ a/fs/proc/task_mmu.c
@@ -2417,6 +2417,9 @@ static void pagemap_scan_backout_range(s
{
struct page_region *cur_buf = &p->vec_buf[p->vec_buf_index];
+ if (!cur_buf)
+ return;
+
if (cur_buf->start != addr)
cur_buf->end = addr;
else
_
Patches currently in -mm which might be from acsjakub(a)amazon.de are
fs-proc-task_mmu-check-cur_buf-for-null.patch
Bug-report: https://lore.kernel.org/all/915c0e00-b92d-4e37-9d4b-0f6a4580da97@oracle.com/
Summary: The 6.12.y backport of commit 7c62c442b6eb ("x86/vmscape:
Enumerate VMSCAPE bug") added VULNBL_AMD(0x1a, SRSO | VMSCAPE) even though
6.12.y doesn't have commit 877818802c3e ("x86/bugs: Add
SRSO_USER_KERNEL_NO support").
Boris Ostrovsky suggested backporting three commits to 6.12.y:
1. commit: 877818802c3e ("x86/bugs: Add SRSO_USER_KERNEL_NO support")
2. commit: 8442df2b49ed ("x86/bugs: KVM: Add support for SRSO_MSR_FIX")
and its fix
3. commit: e3417ab75ab2 ("KVM: SVM: Set/clear SRSO's BP_SPEC_REDUCE on 0
<=> 1 VM count transitions") -- Maybe optional
Together, these three patches change the current mitigation status on
Turin for 6.12.48 from Safe RET to Reduced Speculation. Leaving it at Safe
RET likely causes heavy performance regressions.
Tested on Turin:
[ 3.188134] Speculative Return Stack Overflow: Mitigation: Reduced Speculation
Backports:
1. Patch 1 had minor conflict as VMSCAPE commit added VULNBL_AMD(0x1a,
SRSO | VMSCAPE), and resolution is to skip that line.
2. Patch 2 and 3 are clean cherry-picks, 3 is a fix for 2.
Note: I checked the other stable trees (6.6 down to 5.10); they do not
have this backport problem.
Thanks,
Harshit
Borislav Petkov (1):
x86/bugs: KVM: Add support for SRSO_MSR_FIX
Borislav Petkov (AMD) (1):
x86/bugs: Add SRSO_USER_KERNEL_NO support
Sean Christopherson (1):
KVM: SVM: Set/clear SRSO's BP_SPEC_REDUCE on 0 <=> 1 VM count
transitions
Documentation/admin-guide/hw-vuln/srso.rst | 13 +++++
arch/x86/include/asm/cpufeatures.h | 5 ++
arch/x86/include/asm/msr-index.h | 1 +
arch/x86/kernel/cpu/bugs.c | 28 ++++++++--
arch/x86/kvm/svm/svm.c | 65 ++++++++++++++++++++++
arch/x86/kvm/svm/svm.h | 2 +
arch/x86/lib/msr.c | 2 +
7 files changed, 112 insertions(+), 4 deletions(-)
--
2.50.1
The following commit has been merged into the x86/cpu branch of tip:
Commit-ID: 32278c677947ae2f042c9535674a7fff9a245dd3
Gitweb: https://git.kernel.org/tip/32278c677947ae2f042c9535674a7fff9a245dd3
Author: Sean Christopherson <seanjc(a)google.com>
AuthorDate: Fri, 08 Aug 2025 10:23:56 -07:00
Committer: Borislav Petkov (AMD) <bp(a)alien8.de>
CommitterDate: Fri, 19 Sep 2025 20:21:12 +02:00
x86/umip: Check that the instruction opcode is at least two bytes
When checking for a potential UMIP violation on #GP, verify the decoder found
at least two opcode bytes to avoid false positives when the kernel encounters
an unknown instruction that starts with 0f. Because the array of opcode.bytes
is zero-initialized by insn_init(), peeking at bytes[1] will misinterpret
garbage as a potential SLDT or STR instruction, and can incorrectly trigger
emulation.
E.g. if a VPALIGNR instruction
62 83 c5 05 0f 08 ff vpalignr xmm17{k5},xmm23,XMMWORD PTR [r8],0xff
hits a #GP, the kernel emulates it as STR and squashes the #GP (and corrupts
the userspace code stream).
Arguably the check should look for exactly two bytes, but no three byte
opcodes use '0f 00 xx' or '0f 01 xx' as an escape, i.e. it should be
impossible to get a false positive if the first two opcode bytes match '0f 00'
or '0f 01'. Go with a more conservative check with respect to the existing
code to minimize the chances of breaking userspace, e.g. due to decoder
weirdness.
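For illustration only, here is a minimal userspace sketch of the difference
between the old and the new check. The struct opcode_model type and the
helper names are hypothetical stand-ins, not the kernel's struct insn or its
decoder API; the sketch only assumes what is stated above, namely that unused
opcode bytes read as zero.

```c
#include <stdint.h>

/* Hypothetical stand-in for the decoder's opcode field. */
struct opcode_model {
	uint8_t nbytes;   /* number of opcode bytes the decoder found */
	uint8_t bytes[4]; /* zero-initialized, so unused slots read as 0 */
};

/* Old check: only looks at the first opcode byte. */
static int old_check_passes(const struct opcode_model *op)
{
	return op->bytes[0] == 0x0f;
}

/* New check: additionally requires at least two decoded opcode bytes. */
static int new_check_passes(const struct opcode_model *op)
{
	return op->nbytes >= 2 && op->bytes[0] == 0x0f;
}

/* Would a later test for the '0f 00' group (SLDT/STR) match? */
static int looks_like_0f_00(const struct opcode_model *op)
{
	return op->bytes[1] == 0x00;
}
```

With a single decoded opcode byte 0f, the old check passes and the zeroed
bytes[1] then looks like '0f 00'; the new check rejects the instruction before
bytes[1] is ever inspected.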
Analyzed by Nick Bray <ncbray(a)google.com>.
Fixes: 1e5db223696a ("x86/umip: Add emulation code for UMIP instructions")
Reported-by: Dan Snyder <dansnyder(a)google.com>
Signed-off-by: Sean Christopherson <seanjc(a)google.com>
Signed-off-by: Borislav Petkov (AMD) <bp(a)alien8.de>
Acked-by: Peter Zijlstra (Intel) <peterz(a)infradead.org>
Cc: stable(a)vger.kernel.org
---
arch/x86/kernel/umip.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/arch/x86/kernel/umip.c b/arch/x86/kernel/umip.c
index 5a4b213..406ac01 100644
--- a/arch/x86/kernel/umip.c
+++ b/arch/x86/kernel/umip.c
@@ -156,8 +156,8 @@ static int identify_insn(struct insn *insn)
if (!insn->modrm.nbytes)
return -EINVAL;
- /* All the instructions of interest start with 0x0f. */
- if (insn->opcode.bytes[0] != 0xf)
+ /* The instructions of interest have 2-byte opcodes: 0F 00 or 0F 01. */
+ if (insn->opcode.nbytes < 2 || insn->opcode.bytes[0] != 0xf)
return -EINVAL;
if (insn->opcode.bytes[1] == 0x1) {
The following commit has been merged into the x86/cpu branch of tip:
Commit-ID: 27b1fd62012dfe9d3eb8ecde344d7aa673695ecf
Gitweb: https://git.kernel.org/tip/27b1fd62012dfe9d3eb8ecde344d7aa673695ecf
Author: Sean Christopherson <seanjc(a)google.com>
AuthorDate: Fri, 08 Aug 2025 10:23:57 -07:00
Committer: Borislav Petkov (AMD) <bp(a)alien8.de>
CommitterDate: Fri, 19 Sep 2025 21:34:48 +02:00
x86/umip: Fix decoding of register forms of 0F 01 (SGDT and SIDT aliases)
Filter out the register forms of 0F 01 when determining whether or not to
emulate in response to a potential UMIP violation #GP, as SGDT and SIDT only
accept memory operands. The register variants of 0F 01 are used to encode
instructions for things like VMX and SGX, i.e. not checking the Mod field
would cause the kernel to incorrectly emulate on #GP, e.g. due to a CPL
violation on VMLAUNCH.
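For illustration only, a minimal userspace sketch of the Mod-field filter.
The helper and enum names are hypothetical; the sketch models only the /0 and
/1 cases touched here and assumes the standard ModRM layout (Mod in bits 7:6,
Reg in bits 5:3).

```c
#include <stdint.h>

/* Hypothetical helpers extracting ModRM fields. */
static unsigned int modrm_mod(uint8_t modrm) { return (modrm & 0xc0) >> 6; }
static unsigned int modrm_reg(uint8_t modrm) { return (modrm & 0x38) >> 3; }

enum sketch_insn { SKETCH_NONE, SKETCH_SGDT, SKETCH_SIDT };

/* Classify the ModRM byte that follows a 0F 01 opcode. */
static enum sketch_insn classify_0f01(uint8_t modrm)
{
	switch (modrm_reg(modrm)) {
	case 0:
		/* Mod == 3 is the register form: not SGDT. */
		return modrm_mod(modrm) == 3 ? SKETCH_NONE : SKETCH_SGDT;
	case 1:
		/* Mod == 3 is the register form: not SIDT. */
		return modrm_mod(modrm) == 3 ? SKETCH_NONE : SKETCH_SIDT;
	default:
		return SKETCH_NONE;
	}
}
```

For example, VMLAUNCH is encoded as 0F 01 C2; the ModRM byte 0xc2 has Mod == 3
and Reg == 0, so it is no longer treated as SGDT.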
Fixes: 1e5db223696a ("x86/umip: Add emulation code for UMIP instructions")
Signed-off-by: Sean Christopherson <seanjc(a)google.com>
Signed-off-by: Borislav Petkov (AMD) <bp(a)alien8.de>
Acked-by: Peter Zijlstra (Intel) <peterz(a)infradead.org>
Cc: stable(a)vger.kernel.org
---
arch/x86/kernel/umip.c | 11 +++++++++++
1 file changed, 11 insertions(+)
diff --git a/arch/x86/kernel/umip.c b/arch/x86/kernel/umip.c
index 406ac01..d432f38 100644
--- a/arch/x86/kernel/umip.c
+++ b/arch/x86/kernel/umip.c
@@ -163,8 +163,19 @@ static int identify_insn(struct insn *insn)
if (insn->opcode.bytes[1] == 0x1) {
switch (X86_MODRM_REG(insn->modrm.value)) {
case 0:
+ /* The reg form of 0F 01 /0 encodes VMX instructions. */
+ if (X86_MODRM_MOD(insn->modrm.value) == 3)
+ return -EINVAL;
+
return UMIP_INST_SGDT;
case 1:
+ /*
+ * The reg form of 0F 01 /1 encodes MONITOR/MWAIT,
+ * STAC/CLAC, and ENCLS.
+ */
+ if (X86_MODRM_MOD(insn->modrm.value) == 3)
+ return -EINVAL;
+
return UMIP_INST_SIDT;
case 4:
return UMIP_INST_SMSW;
From: HariKrishna Sagala <hariconscious(a)gmail.com>
Syzbot reported an uninit-value bug in kmalloc_reserve() on
commit 320475fbd590 ("Merge tag 'mtd/fixes-for-6.17-rc6' of
git://git.kernel.org/pub/scm/linux/kernel/git/mtd/linux").
Syzbot KMSAN reported use of uninitialized memory originating from
kmalloc_reserve(), where memory allocated via kmem_cache_alloc_node() or
kmalloc_node_track_caller() was not explicitly initialized.
This can lead to undefined behavior when the allocated buffer
is later accessed.
Fix this by requesting zero-initialized memory: OR __GFP_ZERO into the
gfp flags before allocating.
Reported-by: syzbot+9a4fbb77c9d4aacd3388(a)syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=9a4fbb77c9d4aacd3388
Fixes: 915d975b2ffa ("net: deal with integer overflows in kmalloc_reserve()")
Tested-by: syzbot+9a4fbb77c9d4aacd3388(a)syzkaller.appspotmail.com
Signed-off-by: HariKrishna Sagala <hariconscious(a)gmail.com>
---
net/core/skbuff.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index ee0274417948..2308ebf99bbd 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -573,6 +573,7 @@ static void *kmalloc_reserve(unsigned int *size, gfp_t flags, int node,
void *obj;
obj_size = SKB_HEAD_ALIGN(*size);
+ flags |= __GFP_ZERO;
if (obj_size <= SKB_SMALL_HEAD_CACHE_SIZE &&
!(flags & KMALLOC_NOT_NORMAL_BITS)) {
obj = kmem_cache_alloc_node(net_hotdata.skb_small_head_cache,
--
2.43.0
If of_device_register() fails, we should call put_device() to
decrement the reference count for cleanup. Otherwise memory may be leaked.
So fix this by calling put_device(), so that the name can be freed in
kobject_cleanup().
Calling path: of_device_register() -> of_device_add() -> device_add().
As the comment of device_add() says, 'if device_add() succeeds, you should
call device_del() when you want to get rid of it. If device_add() has
not succeeded, use only put_device() to drop the reference count'.
Found by code review.
Cc: stable(a)vger.kernel.org
Fixes: cf44bbc26cf1 ("[SPARC]: Beginnings of generic of_device framework.")
Signed-off-by: Ma Ke <make24(a)iscas.ac.cn>
---
Changes in v2:
- retained kfree() manually due to the lack of a release callback function.
---
arch/sparc/kernel/of_device_64.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/arch/sparc/kernel/of_device_64.c b/arch/sparc/kernel/of_device_64.c
index f98c2901f335..f53092b07b9e 100644
--- a/arch/sparc/kernel/of_device_64.c
+++ b/arch/sparc/kernel/of_device_64.c
@@ -677,6 +677,7 @@ static struct platform_device * __init scan_one_device(struct device_node *dp,
if (of_device_register(op)) {
printk("%pOF: Could not register of device.\n", dp);
+ put_device(&op->dev);
kfree(op);
op = NULL;
}
--
2.25.1
Hi stable maintainers,
While skimming over stable backports for VMSCAPE commits, I found
something unusual.
This is regarding the 6.12.y commit: 7c62c442b6eb ("x86/vmscape:
Enumerate VMSCAPE bug")
commit 7c62c442b6eb95d21bc4c5afc12fee721646ebe2
Author: Pawan Gupta <pawan.kumar.gupta(a)linux.intel.com>
Date: Thu Aug 14 10:20:42 2025 -0700
x86/vmscape: Enumerate VMSCAPE bug
Commit a508cec6e5215a3fbc7e73ae86a5c5602187934d upstream.
The VMSCAPE vulnerability may allow a guest to cause Branch Target
Injection (BTI) in userspace hypervisors.
Kernels (both host and guest) have existing defenses against direct BTI
attacks from guests. There are also inter-process BTI mitigations which
prevent processes from attacking each other. However, the threat in this
case is to a userspace hypervisor within the same process as the attacker.
Userspace hypervisors have access to their own sensitive data like disk
encryption keys and also typically have access to all guest data. This
means guest userspace may use the hypervisor as a confused deputy to attack
sensitive guest kernel data. There are no existing mitigations for these
attacks.
Introduce X86_BUG_VMSCAPE for this vulnerability and set it on affected
Intel and AMD CPUs.
Signed-off-by: Pawan Gupta <pawan.kumar.gupta(a)linux.intel.com>
Signed-off-by: Dave Hansen <dave.hansen(a)linux.intel.com>
Reviewed-by: Borislav Petkov (AMD) <bp(a)alien8.de>
Signed-off-by: Borislav Petkov (AMD) <bp(a)alien8.de>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
So the problem in this commit is this part of the backport:
in file: arch/x86/kernel/cpu/common.c
VULNBL_AMD(0x15, RETBLEED),
VULNBL_AMD(0x16, RETBLEED),
- VULNBL_AMD(0x17, RETBLEED | SMT_RSB | SRSO),
- VULNBL_HYGON(0x18, RETBLEED | SMT_RSB | SRSO),
- VULNBL_AMD(0x19, SRSO | TSA),
+ VULNBL_AMD(0x17, RETBLEED | SMT_RSB | SRSO | VMSCAPE),
+ VULNBL_HYGON(0x18, RETBLEED | SMT_RSB | SRSO | VMSCAPE),
+ VULNBL_AMD(0x19, SRSO | TSA | VMSCAPE),
+ VULNBL_AMD(0x1a, SRSO | VMSCAPE),
+
{}
Notice the part where VULNBL_AMD(0x1a, SRSO | VMSCAPE) is added, 6.12.y
doesn't have commit: 877818802c3e ("x86/bugs: Add SRSO_USER_KERNEL_NO
support") so I think we shouldn't be adding VULNBL_AMD(0x1a, SRSO |
VMSCAPE) directly.
Boris Ostrovsky suggested that I verify this on a Turin machine, as this
could cause a very big performance regression. He stated that if the SRSO
mitigation status is Safe RET we likely have a problem, and we are in
that situation.
# lscpu | grep -E "CPU family"
CPU family: 26
Note: CPU family 26 -> 0x1a
And the Turin machine reports the SRSO mitigation status as "Safe RET"
# uname -r
6.12.48-master.20250917.el8.rc1.x86_64
# cat /sys/devices/system/cpu/vulnerabilities/spec_rstack_overflow
Mitigation: Safe RET
Boris Ostrovsky suggested backporting three commits to 6.12.y:
1. commit: 877818802c3e ("x86/bugs: Add SRSO_USER_KERNEL_NO support")
2. commit: 8442df2b49ed ("x86/bugs: KVM: Add support for SRSO_MSR_FIX")
and its fix
3. commit: e3417ab75ab2 ("KVM: SVM: Set/clear SRSO's BP_SPEC_REDUCE on 0
<=> 1 VM count transitions") -- Maybe optional
After backporting these three:
# uname -r
6.12.48-master.20250919.el8.dev.x86_64 // Note: this is the kernel with
the above three patches applied.
# dmesg | grep -C 2 Reduce
[ 3.186135] Speculative Store Bypass: Mitigation: Speculative Store Bypass disabled via prctl
[ 3.187135] Speculative Return Stack Overflow: Reducing speculation to address VM/HV SRSO attack vector.
[ 3.188134] Speculative Return Stack Overflow: Mitigation: Reduced Speculation
[ 3.189135] VMSCAPE: Mitigation: IBPB before exit to userspace
[ 3.191139] x86/fpu: Supporting XSAVE feature 0x001: 'x87 floating point registers'
# cat /sys/devices/system/cpu/vulnerabilities/spec_rstack_overflow
Mitigation: Reduced Speculation
I can send my backports to stable if this looks good. Thoughts?
Thanks,
Harshit
The iput() function is a dangerous one - if the reference counter goes
to zero, the function may block for a long time due to:
- inode_wait_for_writeback() waits until writeback on this inode
completes
- the filesystem-specific "evict_inode" callback can do similar
things; e.g. all netfs-based filesystems will call
netfs_wait_for_outstanding_io() which is similar to
inode_wait_for_writeback()
Therefore, callers must carefully evaluate the context they're in and
check whether invoking iput() is a good idea at all.
Most of the time, this is not a problem because the dcache holds
references to all inodes, and the dcache is usually the one to release
the last reference. But this assumption is fragile. For example,
under (memcg) memory pressure, the dcache shrinker is more likely to
release inode references, moving the inode eviction to contexts where
that was extremely unlikely to occur.
Our production servers "found" at least two deadlock bugs in the Ceph
filesystem that were caused by this iput() behavior:
1. Writeback may lead to iput() calls in Ceph (e.g. from
ceph_put_wrbuffer_cap_refs()) which deadlocks in
inode_wait_for_writeback(). Waiting for writeback completion from
within writeback will obviously never be able to make any progress.
This leads to blocked kworkers like this:
INFO: task kworker/u777:6:1270802 blocked for more than 122 seconds.
Not tainted 6.16.7-i1-es #773
task:kworker/u777:6 state:D stack:0 pid:1270802 tgid:1270802 ppid:2
task_flags:0x4208060 flags:0x00004000
Workqueue: writeback wb_workfn (flush-ceph-3)
Call Trace:
<TASK>
__schedule+0x4ea/0x17d0
schedule+0x1c/0xc0
inode_wait_for_writeback+0x71/0xb0
evict+0xcf/0x200
ceph_put_wrbuffer_cap_refs+0xdd/0x220
ceph_invalidate_folio+0x97/0xc0
ceph_writepages_start+0x127b/0x14d0
do_writepages+0xba/0x150
__writeback_single_inode+0x34/0x290
writeback_sb_inodes+0x203/0x470
__writeback_inodes_wb+0x4c/0xe0
wb_writeback+0x189/0x2b0
wb_workfn+0x30b/0x3d0
process_one_work+0x143/0x2b0
worker_thread+0x30a/0x450
2. In the Ceph messenger thread (net/ceph/messenger*.c), any iput()
call may invoke ceph_evict_inode() which will deadlock in
netfs_wait_for_outstanding_io(); since this blocks the messenger
thread, completions from the Ceph servers will not ever be received
and handled.
It looks like these deadlock bugs have been in the Ceph filesystem
code since forever (therefore no "Fixes" tag in this patch). There
may be various ways to solve this:
- make iput() asynchronous and defer the actual eviction like fput()
(may add overhead)
- make iput() only asynchronous if I_SYNC is set (doesn't solve random
things happening inside the "evict_inode" callback)
- add iput_deferred() to make this asynchronous behavior/overhead
optional and explicit
- refactor Ceph to avoid iput() calls from within writeback and
messenger (if that is even possible)
- add a Ceph-specific workaround
After advice from Mateusz Guzik, I decided to do the last of these. The
implementation is simple because it piggybacks on the existing
work_struct for ceph_queue_inode_work() - ceph_inode_work() calls
iput() at the end which means we can donate the last reference to it.
Since Ceph has a few iput() callers in a loop, it seemed simple enough
to pass this counter and use atomic_sub() instead of atomic_dec().
This patch adds ceph_iput_n_async() and converts lots of iput() calls
to it - at least those that may come through writeback and the
messenger.
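For illustration only, the "donate the last reference" idea can be modeled
in userspace with C11 atomics. Everything here is a hypothetical stand-in
(struct obj_model, put_n_deferred(), the deferred flag replacing
queue_work()); it is a single-threaded model of the counting logic, not the
patch's implementation and not a statement about its concurrency behavior.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for an inode; only the counter is modeled. */
struct obj_model {
	atomic_int refcount;
	bool deferred; /* set when the final put was handed to a worker */
};

/*
 * Drop n references. If that would release the last one, keep a single
 * reference alive and hand it to a worker instead of releasing the
 * object in the caller's context.
 */
static void put_n_deferred(struct obj_model *o, int n)
{
	if (!o)
		return;

	/* atomic_fetch_sub() returns the old value; old - n is the new count. */
	if (atomic_fetch_sub(&o->refcount, n) - n > 0)
		return; /* somebody else still holds a reference */

	/* Count reached zero: restore one reference and donate it. */
	atomic_store(&o->refcount, 1);
	o->deferred = true; /* stands in for queueing the work item */
}
```

Passing the count n in one call replaces the former "while (put-- > 0)
iput(inode)" loops with a single subtraction.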
Signed-off-by: Max Kellermann <max.kellermann(a)ionos.com>
Cc: Mateusz Guzik <mjguzik(a)gmail.com>
Cc: stable(a)vger.kernel.org
---
fs/ceph/addr.c | 2 +-
fs/ceph/caps.c | 21 ++++++++++-----------
fs/ceph/dir.c | 2 +-
fs/ceph/inode.c | 42 ++++++++++++++++++++++++++++++++++++++++++
fs/ceph/mds_client.c | 32 ++++++++++++++++----------------
fs/ceph/quota.c | 4 ++--
fs/ceph/snap.c | 10 +++++-----
fs/ceph/super.h | 7 +++++++
8 files changed, 84 insertions(+), 36 deletions(-)
diff --git a/fs/ceph/addr.c b/fs/ceph/addr.c
index 322ed268f14a..fc497c91530e 100644
--- a/fs/ceph/addr.c
+++ b/fs/ceph/addr.c
@@ -265,7 +265,7 @@ static void finish_netfs_read(struct ceph_osd_request *req)
subreq->error = err;
trace_netfs_sreq(subreq, netfs_sreq_trace_io_progress);
netfs_read_subreq_terminated(subreq);
- iput(req->r_inode);
+ ceph_iput_async(req->r_inode);
ceph_dec_osd_stopping_blocker(fsc->mdsc);
}
diff --git a/fs/ceph/caps.c b/fs/ceph/caps.c
index b1a8ff612c41..bd88b5287a2b 100644
--- a/fs/ceph/caps.c
+++ b/fs/ceph/caps.c
@@ -1771,7 +1771,7 @@ void ceph_flush_snaps(struct ceph_inode_info *ci,
spin_unlock(&mdsc->snap_flush_lock);
if (need_put)
- iput(inode);
+ ceph_iput_async(inode);
}
/*
@@ -3318,8 +3318,8 @@ static void __ceph_put_cap_refs(struct ceph_inode_info *ci, int had,
}
if (wake)
wake_up_all(&ci->i_cap_wq);
- while (put-- > 0)
- iput(inode);
+ if (put > 0)
+ ceph_iput_n_async(inode, put);
}
void ceph_put_cap_refs(struct ceph_inode_info *ci, int had)
@@ -3418,9 +3418,8 @@ void ceph_put_wrbuffer_cap_refs(struct ceph_inode_info *ci, int nr,
}
if (complete_capsnap)
wake_up_all(&ci->i_cap_wq);
- while (put-- > 0) {
- iput(inode);
- }
+ if (put > 0)
+ ceph_iput_n_async(inode, put);
}
/*
@@ -3917,7 +3916,7 @@ static void handle_cap_flush_ack(struct inode *inode, u64 flush_tid,
if (wake_mdsc)
wake_up_all(&mdsc->cap_flushing_wq);
if (drop)
- iput(inode);
+ ceph_iput_async(inode);
}
void __ceph_remove_capsnap(struct inode *inode, struct ceph_cap_snap *capsnap,
@@ -4008,7 +4007,7 @@ static void handle_cap_flushsnap_ack(struct inode *inode, u64 flush_tid,
wake_up_all(&ci->i_cap_wq);
if (wake_mdsc)
wake_up_all(&mdsc->cap_flushing_wq);
- iput(inode);
+ ceph_iput_async(inode);
}
}
@@ -4557,7 +4556,7 @@ void ceph_handle_caps(struct ceph_mds_session *session,
done:
mutex_unlock(&session->s_mutex);
done_unlocked:
- iput(inode);
+ ceph_iput_async(inode);
out:
ceph_dec_mds_stopping_blocker(mdsc);
@@ -4636,7 +4635,7 @@ unsigned long ceph_check_delayed_caps(struct ceph_mds_client *mdsc)
doutc(cl, "on %p %llx.%llx\n", inode,
ceph_vinop(inode));
ceph_check_caps(ci, 0);
- iput(inode);
+ ceph_iput_async(inode);
spin_lock(&mdsc->cap_delay_lock);
}
@@ -4675,7 +4674,7 @@ static void flush_dirty_session_caps(struct ceph_mds_session *s)
spin_unlock(&mdsc->cap_dirty_lock);
ceph_wait_on_async_create(inode);
ceph_check_caps(ci, CHECK_CAPS_FLUSH);
- iput(inode);
+ ceph_iput_async(inode);
spin_lock(&mdsc->cap_dirty_lock);
}
spin_unlock(&mdsc->cap_dirty_lock);
diff --git a/fs/ceph/dir.c b/fs/ceph/dir.c
index 32973c62c1a2..ec73ed52a227 100644
--- a/fs/ceph/dir.c
+++ b/fs/ceph/dir.c
@@ -1290,7 +1290,7 @@ static void ceph_async_unlink_cb(struct ceph_mds_client *mdsc,
ceph_mdsc_free_path_info(&path_info);
}
out:
- iput(req->r_old_inode);
+ ceph_iput_async(req->r_old_inode);
ceph_mdsc_release_dir_caps(req);
}
diff --git a/fs/ceph/inode.c b/fs/ceph/inode.c
index f67025465de0..385d5261632d 100644
--- a/fs/ceph/inode.c
+++ b/fs/ceph/inode.c
@@ -2191,6 +2191,48 @@ void ceph_queue_inode_work(struct inode *inode, int work_bit)
}
}
+/**
+ * Queue an asynchronous iput() call in a worker thread. Use this
+ * instead of iput() in contexts where evicting the inode is unsafe.
+ * For example, inode eviction may cause deadlocks in
+ * inode_wait_for_writeback() (when called from within writeback) or
+ * in netfs_wait_for_outstanding_io() (when called from within the
+ * Ceph messenger).
+ *
+ * @n: how many references to put
+ */
+void ceph_iput_n_async(struct inode *inode, int n)
+{
+ if (unlikely(!inode))
+ return;
+
+ if (likely(atomic_sub_return(n, &inode->i_count) > 0))
+ /* somebody else is holding another reference -
+ * nothing left to do for us
+ */
+ return;
+
+ doutc(ceph_inode_to_fs_client(inode)->client, "%p %llx.%llx\n", inode, ceph_vinop(inode));
+
+ /* the reference counter is now 0, i.e. nobody else is holding
+ * a reference to this inode; restore it to 1 and donate it to
+ * ceph_inode_work() which will call iput() at the end
+ */
+ atomic_set(&inode->i_count, 1);
+
+ /* simply queue a ceph_inode_work() without setting
+ * i_work_mask bit; other than putting the reference, there is
+ * nothing to do
+ */
+ WARN_ON_ONCE(!queue_work(ceph_inode_to_fs_client(inode)->inode_wq,
+ &ceph_inode(inode)->i_work));
+
+ /* note: queue_work() cannot fail; if i_work were already
+ * queued, then it would be holding another reference, but no
+ * such reference exists
+ */
+}
+
static void ceph_do_invalidate_pages(struct inode *inode)
{
struct ceph_client *cl = ceph_inode_to_client(inode);
diff --git a/fs/ceph/mds_client.c b/fs/ceph/mds_client.c
index 3bc72b47fe4d..d7fce1ad8073 100644
--- a/fs/ceph/mds_client.c
+++ b/fs/ceph/mds_client.c
@@ -1097,14 +1097,14 @@ void ceph_mdsc_release_request(struct kref *kref)
ceph_msg_put(req->r_reply);
if (req->r_inode) {
ceph_put_cap_refs(ceph_inode(req->r_inode), CEPH_CAP_PIN);
- iput(req->r_inode);
+ ceph_iput_async(req->r_inode);
}
if (req->r_parent) {
ceph_put_cap_refs(ceph_inode(req->r_parent), CEPH_CAP_PIN);
- iput(req->r_parent);
+ ceph_iput_async(req->r_parent);
}
- iput(req->r_target_inode);
- iput(req->r_new_inode);
+ ceph_iput_async(req->r_target_inode);
+ ceph_iput_async(req->r_new_inode);
if (req->r_dentry)
dput(req->r_dentry);
if (req->r_old_dentry)
@@ -1118,7 +1118,7 @@ void ceph_mdsc_release_request(struct kref *kref)
*/
ceph_put_cap_refs(ceph_inode(req->r_old_dentry_dir),
CEPH_CAP_PIN);
- iput(req->r_old_dentry_dir);
+ ceph_iput_async(req->r_old_dentry_dir);
}
kfree(req->r_path1);
kfree(req->r_path2);
@@ -1240,7 +1240,7 @@ static void __unregister_request(struct ceph_mds_client *mdsc,
}
if (req->r_unsafe_dir) {
- iput(req->r_unsafe_dir);
+ ceph_iput_async(req->r_unsafe_dir);
req->r_unsafe_dir = NULL;
}
@@ -1413,7 +1413,7 @@ static int __choose_mds(struct ceph_mds_client *mdsc,
cap = rb_entry(rb_first(&ci->i_caps), struct ceph_cap, ci_node);
if (!cap) {
spin_unlock(&ci->i_ceph_lock);
- iput(inode);
+ ceph_iput_async(inode);
goto random;
}
mds = cap->session->s_mds;
@@ -1422,7 +1422,7 @@ static int __choose_mds(struct ceph_mds_client *mdsc,
cap == ci->i_auth_cap ? "auth " : "", cap);
spin_unlock(&ci->i_ceph_lock);
out:
- iput(inode);
+ ceph_iput_async(inode);
return mds;
random:
@@ -1841,7 +1841,7 @@ int ceph_iterate_session_caps(struct ceph_mds_session *session,
spin_unlock(&session->s_cap_lock);
if (last_inode) {
- iput(last_inode);
+ ceph_iput_async(last_inode);
last_inode = NULL;
}
if (old_cap) {
@@ -1874,7 +1874,7 @@ int ceph_iterate_session_caps(struct ceph_mds_session *session,
session->s_cap_iterator = NULL;
spin_unlock(&session->s_cap_lock);
- iput(last_inode);
+ ceph_iput_async(last_inode);
if (old_cap)
ceph_put_cap(session->s_mdsc, old_cap);
@@ -1903,8 +1903,8 @@ static int remove_session_caps_cb(struct inode *inode, int mds, void *arg)
wake_up_all(&ci->i_cap_wq);
if (invalidate)
ceph_queue_invalidate(inode);
- while (iputs--)
- iput(inode);
+ if (iputs > 0)
+ ceph_iput_n_async(inode, iputs);
return 0;
}
@@ -1944,7 +1944,7 @@ static void remove_session_caps(struct ceph_mds_session *session)
spin_unlock(&session->s_cap_lock);
inode = ceph_find_inode(sb, vino);
- iput(inode);
+ ceph_iput_async(inode);
spin_lock(&session->s_cap_lock);
}
@@ -2512,7 +2512,7 @@ static void ceph_cap_unlink_work(struct work_struct *work)
doutc(cl, "on %p %llx.%llx\n", inode,
ceph_vinop(inode));
ceph_check_caps(ci, CHECK_CAPS_FLUSH);
- iput(inode);
+ ceph_iput_async(inode);
spin_lock(&mdsc->cap_delay_lock);
}
}
@@ -3933,7 +3933,7 @@ static void handle_reply(struct ceph_mds_session *session, struct ceph_msg *msg)
!req->r_reply_info.has_create_ino) {
/* This should never happen on an async create */
WARN_ON_ONCE(req->r_deleg_ino);
- iput(in);
+ ceph_iput_async(in);
in = NULL;
}
@@ -5313,7 +5313,7 @@ static void handle_lease(struct ceph_mds_client *mdsc,
out:
mutex_unlock(&session->s_mutex);
- iput(inode);
+ ceph_iput_async(inode);
ceph_dec_mds_stopping_blocker(mdsc);
return;
diff --git a/fs/ceph/quota.c b/fs/ceph/quota.c
index d90eda19bcc4..bba00f8926e6 100644
--- a/fs/ceph/quota.c
+++ b/fs/ceph/quota.c
@@ -76,7 +76,7 @@ void ceph_handle_quota(struct ceph_mds_client *mdsc,
le64_to_cpu(h->max_files));
spin_unlock(&ci->i_ceph_lock);
- iput(inode);
+ ceph_iput_async(inode);
out:
ceph_dec_mds_stopping_blocker(mdsc);
}
@@ -190,7 +190,7 @@ void ceph_cleanup_quotarealms_inodes(struct ceph_mds_client *mdsc)
node = rb_first(&mdsc->quotarealms_inodes);
qri = rb_entry(node, struct ceph_quotarealm_inode, node);
rb_erase(node, &mdsc->quotarealms_inodes);
- iput(qri->inode);
+ ceph_iput_async(qri->inode);
kfree(qri);
}
mutex_unlock(&mdsc->quotarealms_inodes_mutex);
diff --git a/fs/ceph/snap.c b/fs/ceph/snap.c
index c65f2b202b2b..19f097e79b3c 100644
--- a/fs/ceph/snap.c
+++ b/fs/ceph/snap.c
@@ -735,7 +735,7 @@ static void queue_realm_cap_snaps(struct ceph_mds_client *mdsc,
if (!inode)
continue;
spin_unlock(&realm->inodes_with_caps_lock);
- iput(lastinode);
+ ceph_iput_async(lastinode);
lastinode = inode;
/*
@@ -762,7 +762,7 @@ static void queue_realm_cap_snaps(struct ceph_mds_client *mdsc,
spin_lock(&realm->inodes_with_caps_lock);
}
spin_unlock(&realm->inodes_with_caps_lock);
- iput(lastinode);
+ ceph_iput_async(lastinode);
if (capsnap)
kmem_cache_free(ceph_cap_snap_cachep, capsnap);
@@ -955,7 +955,7 @@ static void flush_snaps(struct ceph_mds_client *mdsc)
ihold(inode);
spin_unlock(&mdsc->snap_flush_lock);
ceph_flush_snaps(ci, &session);
- iput(inode);
+ ceph_iput_async(inode);
spin_lock(&mdsc->snap_flush_lock);
}
spin_unlock(&mdsc->snap_flush_lock);
@@ -1116,12 +1116,12 @@ void ceph_handle_snap(struct ceph_mds_client *mdsc,
ceph_get_snap_realm(mdsc, realm);
ceph_change_snap_realm(inode, realm);
spin_unlock(&ci->i_ceph_lock);
- iput(inode);
+ ceph_iput_async(inode);
continue;
skip_inode:
spin_unlock(&ci->i_ceph_lock);
- iput(inode);
+ ceph_iput_async(inode);
}
/* we may have taken some of the old realm's children. */
diff --git a/fs/ceph/super.h b/fs/ceph/super.h
index cf176aab0f82..15c09b6c94aa 100644
--- a/fs/ceph/super.h
+++ b/fs/ceph/super.h
@@ -1085,6 +1085,13 @@ static inline void ceph_queue_flush_snaps(struct inode *inode)
ceph_queue_inode_work(inode, CEPH_I_WORK_FLUSH_SNAPS);
}
+void ceph_iput_n_async(struct inode *inode, int n);
+
+static inline void ceph_iput_async(struct inode *inode)
+{
+ ceph_iput_n_async(inode, 1);
+}
+
extern int ceph_try_to_choose_auth_mds(struct inode *inode, int mask);
extern int __ceph_do_getattr(struct inode *inode, struct page *locked_page,
int mask, bool force);
--
2.47.3
From: Kan Liang <kan.liang(a)linux.intel.com>
[ Upstream commit b0823d5fbacb1c551d793cbfe7af24e0d1fa45ed ]
The perf_fuzzer found a hard-lockup crash on a RaptorLake machine:
Oops: general protection fault, maybe for address 0xffff89aeceab400: 0000
CPU: 23 UID: 0 PID: 0 Comm: swapper/23
Tainted: [W]=WARN
Hardware name: Dell Inc. Precision 9660/0VJ762
RIP: 0010:native_read_pmc+0x7/0x40
Code: cc e8 8d a9 01 00 48 89 03 5b cd cc cc cc cc 0f 1f ...
RSP: 000:fffb03100273de8 EFLAGS: 00010046
....
Call Trace:
<TASK>
icl_update_topdown_event+0x165/0x190
? ktime_get+0x38/0xd0
intel_pmu_read_event+0xf9/0x210
__perf_event_read+0xf9/0x210
CPUs 16-23 are E-core CPUs that don't support the perf metrics feature.
The icl_update_topdown_event() should not be invoked on these CPUs.
It's a regression of commit:
f9bdf1f95339 ("perf/x86/intel: Avoid disable PMU if !cpuc->enabled in sample read")
The bug introduced by that commit is that the is_topdown_event() function
is mistakenly used to replace the is_topdown_count() call to check if the
topdown functions for the perf metrics feature should be invoked.
Fix it.
Fixes: f9bdf1f95339 ("perf/x86/intel: Avoid disable PMU if !cpuc->enabled in sample read")
Closes: https://lore.kernel.org/lkml/352f0709-f026-cd45-e60c-60dfd97f73f3@maine.edu/
Reported-by: Vince Weaver <vincent.weaver(a)maine.edu>
Signed-off-by: Kan Liang <kan.liang(a)linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz(a)infradead.org>
Signed-off-by: Ingo Molnar <mingo(a)kernel.org>
Tested-by: Vince Weaver <vincent.weaver(a)maine.edu>
Cc: stable(a)vger.kernel.org # v6.15+
Link: https://lore.kernel.org/r/20250612143818.2889040-1-kan.liang@linux.intel.com
[ omitted PEBS check ]
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Signed-off-by: Angel Adetula <angeladetula(a)google.com>
---
arch/x86/events/intel/core.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
index 5e43d390f7a3..36d8404f406d 100644
--- a/arch/x86/events/intel/core.c
+++ b/arch/x86/events/intel/core.c
@@ -2793,7 +2793,7 @@ static void intel_pmu_read_event(struct perf_event *event)
if (pmu_enabled)
intel_pmu_disable_all();
- if (is_topdown_event(event))
+ if (is_topdown_count(event))
static_call(intel_pmu_update_topdown_event)(event);
else
intel_pmu_drain_pebs_buffer();
--
2.51.0.470.ga7dc726c21-goog
According to documentation, the DP PHY on x1e80100 has another clock
called ref.
The current X Elite devices supported upstream work fine without this
clock, because the boot firmware leaves this clock enabled. But we should
not rely on that. Also, when it comes to power management, this clock
also needs to be disabled on suspend. So even though this change breaks
the ABI, it is needed in order to make sure we disable this clock on runtime
PM, when that is going to be enabled in the driver.
So rework the driver to allow a different number of clocks, fix the
dt-bindings schema and add the clock to the DT node as well.
Signed-off-by: Abel Vesa <abel.vesa(a)linaro.org>
---
Changes in v3:
- Use dev_err_probe() on clocks parsing failure.
- Explain why the ABI break is necessary.
- Drop the extra 'clk' suffix from the clock name. So ref instead of
refclk.
- Link to v2: https://lore.kernel.org/r/20250903-phy-qcom-edp-add-missing-refclk-v2-0-d88…
Changes in v2:
- Fix schema by adding the minItems, as suggested by Krzysztof.
- Use devm_clk_bulk_get_all, as suggested by Konrad.
- Rephrase the commit messages to reflect the flexible number of clocks.
- Link to v1: https://lore.kernel.org/r/20250730-phy-qcom-edp-add-missing-refclk-v1-0-6f7…
---
Abel Vesa (3):
dt-bindings: phy: qcom-edp: Add missing clock for X Elite
phy: qcom: edp: Make the number of clocks flexible
arm64: dts: qcom: Add missing TCSR ref clock to the DP PHYs
.../devicetree/bindings/phy/qcom,edp-phy.yaml | 28 +++++++++++++++++++++-
arch/arm64/boot/dts/qcom/x1e80100.dtsi | 12 ++++++----
drivers/phy/qualcomm/phy-qcom-edp.c | 16 ++++++-------
3 files changed, 43 insertions(+), 13 deletions(-)
---
base-commit: 65dd046ef55861190ecde44c6d9fcde54b9fb77d
change-id: 20250730-phy-qcom-edp-add-missing-refclk-5ab82828f8e7
Best regards,
--
Abel Vesa <abel.vesa(a)linaro.org>
When a first MPTCP connection gets successfully established after a
blackhole period, 'active_disable_times' was supposed to be reset when
this connection was done via any non-loopback interfaces.
Unfortunately, the opposite condition was checked: only reset when the
connection was established via a loopback interface. Fix this by
simply checking the opposite condition.
This is similar to what is done with TCP FastOpen, see
tcp_fastopen_active_disable_ofo_check().
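For illustration only, the old and new conditions can be compared in a small
userspace sketch. struct dev_model, SKETCH_IFF_LOOPBACK and the helper names
are hypothetical stand-ins for the net_device and IFF_LOOPBACK used in the
kernel; only the boolean logic is taken from this patch.

```c
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_IFF_LOOPBACK 0x8 /* hypothetical flag bit for this sketch */

/* Hypothetical stand-in for a network device. */
struct dev_model {
	unsigned int flags;
};

/* Old condition: reset the counter only for loopback devices. */
static bool old_should_reset(const struct dev_model *dev)
{
	return dev && (dev->flags & SKETCH_IFF_LOOPBACK);
}

/* New condition: reset unless the device is known to be loopback. */
static bool new_should_reset(const struct dev_model *dev)
{
	return !(dev && (dev->flags & SKETCH_IFF_LOOPBACK));
}
```

Note that with the new condition the counter is also reset when no device
could be found (dev is NULL).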
This patch is a follow-up of a previous discussion linked to commit
893c49a78d9f ("mptcp: Use __sk_dst_get() and dst_dev_rcu() in
mptcp_active_enable()."), see [1].
Fixes: 27069e7cb3d1 ("mptcp: disable active MPTCP in case of blackhole")
Cc: stable(a)vger.kernel.org
Link: https://lore.kernel.org/4209a283-8822-47bd-95b7-87e96d9b7ea3@kernel.org [1]
Signed-off-by: Matthieu Baerts (NGI0) <matttbe(a)kernel.org>
---
Cc: Kuniyuki Iwashima <kuniyu(a)google.com>
Note: sending this fix to net-next, similar to commits 108a86c71c93
("mptcp: Call dst_release() in mptcp_active_enable().") and 893c49a78d9f
("mptcp: Use __sk_dst_get() and dst_dev_rcu() in mptcp_active_enable().").
Also to avoid conflicts, and because we are close to the merge window.
---
net/mptcp/ctrl.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/net/mptcp/ctrl.c b/net/mptcp/ctrl.c
index e8ffa62ec183f3cd8156e3969ac4a7d0213a990b..d96130e49942e2fb878cd1897ad43c1d420fb233 100644
--- a/net/mptcp/ctrl.c
+++ b/net/mptcp/ctrl.c
@@ -507,7 +507,7 @@ void mptcp_active_enable(struct sock *sk)
rcu_read_lock();
dst = __sk_dst_get(sk);
dev = dst ? dst_dev_rcu(dst) : NULL;
- if (dev && (dev->flags & IFF_LOOPBACK))
+ if (!(dev && (dev->flags & IFF_LOOPBACK)))
atomic_set(&pernet->active_disable_times, 0);
rcu_read_unlock();
}
---
base-commit: b127e355f1af1e4a635ed8f78cb0d11c916613cf
change-id: 20250918-net-next-mptcp-blackhole-reset-loopback-d82c518e409f
Best regards,
--
Matthieu Baerts (NGI0) <matttbe(a)kernel.org>
This patch fixes a crash in icl_update_topdown_event().
This fix has already been applied to the 'linux-6.1.y', 'linux-6.6.y', and
'linux-6.15.y' stable trees. This submission is to request application to the
'linux-6.12.y' stable tree, as it appears to be still missing there.
This should also fix kernel bug CVE-2025-38322.
Kan Liang (1):
perf/x86/intel: Fix crash in icl_update_topdown_event()
arch/x86/events/intel/core.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
base-commit: f6cf124428f51e3ef07a8e54c743873face9d2b2
--
2.51.0.470.ga7dc726c21-goog
The patch below does not apply to the 6.16-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.16.y
git checkout FETCH_HEAD
git cherry-pick -x a1b51534b532dd4f0499907865553ee9251bebc3
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025091753-raider-wake-9e9d@gregkh' --subject-prefix 'PATCH 6.16.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From a1b51534b532dd4f0499907865553ee9251bebc3 Mon Sep 17 00:00:00 2001
From: Alex Elder <elder(a)riscstar.com>
Date: Tue, 12 Aug 2025 22:13:35 -0500
Subject: [PATCH] dt-bindings: serial: 8250: allow "main" and "uart" as clock
names
There are two compatible strings defined in "8250.yaml" that require
two clocks to be specified, along with their names:
- "spacemit,k1-uart", used in "spacemit/k1.dtsi"
- "nxp,lpc1850-uart", used in "lpc/lpc18xx.dtsi"
When only one clock is used, the name is not required. However there
are two places that do specify a name:
- In "mediatek/mt7623.dtsi", the clock for the "mediatek,mtk-btif"
compatible serial device is named "main"
- In "qca/ar9132.dtsi", the clock for the "ns8250" compatible
serial device is named "uart"
In commit d2db0d7815444 ("dt-bindings: serial: 8250: allow clock
'uartclk' and 'reg' for nxp,lpc1850-uart"), Frank Li added the
restriction that two named clocks be used for the NXP platform
mentioned above.
Change that logic, so that an additional condition for (only) the
SpacemiT platform similarly restricts the two clocks to have the
names "core" and "bus".
Finally, add "main" and "uart" as allowed names when a single clock is
specified.
Fixes: 2c0594f9f0629 ("dt-bindings: serial: 8250: support an optional second clock")
Cc: stable <stable(a)kernel.org>
Reported-by: kernel test robot <lkp(a)intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202507160314.wrC51lXX-lkp@intel.com/
Signed-off-by: Alex Elder <elder(a)riscstar.com>
Acked-by: Conor Dooley <conor.dooley(a)microchip.com>
Link: https://lore.kernel.org/r/20250813031338.2328392-1-elder@riscstar.com
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
diff --git a/Documentation/devicetree/bindings/serial/8250.yaml b/Documentation/devicetree/bindings/serial/8250.yaml
index f59c0b37e8eb..b243afa69a1a 100644
--- a/Documentation/devicetree/bindings/serial/8250.yaml
+++ b/Documentation/devicetree/bindings/serial/8250.yaml
@@ -59,7 +59,12 @@ allOf:
items:
- const: uartclk
- const: reg
- else:
+ - if:
+ properties:
+ compatible:
+ contains:
+ const: spacemit,k1-uart
+ then:
properties:
clock-names:
items:
@@ -183,6 +188,9 @@ properties:
minItems: 1
maxItems: 2
oneOf:
+ - enum:
+ - main
+ - uart
- items:
- const: core
- const: bus
From: Alexander Sverdlin <alexander.sverdlin(a)siemens.com>
KCSAN reports:
BUG: KCSAN: data-race in do_raw_write_lock / do_raw_write_lock
write (marked) to 0xffff800009cf504c of 4 bytes by task 1102 on cpu 1:
do_raw_write_lock+0x120/0x204
_raw_write_lock_irq
do_exit
call_usermodehelper_exec_async
ret_from_fork
read to 0xffff800009cf504c of 4 bytes by task 1103 on cpu 0:
do_raw_write_lock+0x88/0x204
_raw_write_lock_irq
do_exit
call_usermodehelper_exec_async
ret_from_fork
value changed: 0xffffffff -> 0x00000001
Reported by Kernel Concurrency Sanitizer on:
CPU: 0 PID: 1103 Comm: kworker/u4:1 6.1.111
Commit 1a365e822372 ("locking/spinlock/debug: Fix various data races") has
addressed most of these races, but the conversion is incomplete. In
do_raw_write_lock(), only the debug_write_lock_after() part was converted
to WRITE_ONCE(); the debug_write_lock_before() part still reads the owner
fields with plain loads. Convert those reads to READ_ONCE().
Cc: stable(a)vger.kernel.org
Fixes: 1a365e822372 ("locking/spinlock/debug: Fix various data races")
Reported-by: Adrian Freihofer <adrian.freihofer(a)siemens.com>
Acked-by: Waiman Long <longman(a)redhat.com>
Signed-off-by: Alexander Sverdlin <alexander.sverdlin(a)siemens.com>
Reviewed-by: Paul E. McKenney <paulmck(a)kernel.org>
Signed-off-by: Boqun Feng <boqun.feng(a)gmail.com>
---
Notes:
SubmissionLink: https://lore.kernel.org/all/20250826102731.52507-1-alexander.sverdlin@sieme…
kernel/locking/spinlock_debug.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/kernel/locking/spinlock_debug.c b/kernel/locking/spinlock_debug.c
index 87b03d2e41db..2338b3adfb55 100644
--- a/kernel/locking/spinlock_debug.c
+++ b/kernel/locking/spinlock_debug.c
@@ -184,8 +184,8 @@ void do_raw_read_unlock(rwlock_t *lock)
static inline void debug_write_lock_before(rwlock_t *lock)
{
RWLOCK_BUG_ON(lock->magic != RWLOCK_MAGIC, lock, "bad magic");
- RWLOCK_BUG_ON(lock->owner == current, lock, "recursion");
- RWLOCK_BUG_ON(lock->owner_cpu == raw_smp_processor_id(),
+ RWLOCK_BUG_ON(READ_ONCE(lock->owner) == current, lock, "recursion");
+ RWLOCK_BUG_ON(READ_ONCE(lock->owner_cpu) == raw_smp_processor_id(),
lock, "cpu recursion");
}
--
2.51.0
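For readers unfamiliar with marked accesses, the pairing that the
spinlock_debug patch above completes can be sketched in userspace. This is
only an illustration under stated assumptions: the macros are simplified
volatile-cast stand-ins for the kernel's READ_ONCE()/WRITE_ONCE(), and
struct demo_rwlock plus the demo_* helpers are hypothetical names, not
kernel code.

```c
/*
 * Simplified stand-ins for the kernel macros (assumption: plain
 * volatile casts; the kernel's real definitions do more checking).
 * __typeof__ is a GCC/Clang extension.
 */
#define READ_ONCE(x)       (*(const volatile __typeof__(x) *)&(x))
#define WRITE_ONCE(x, val) (*(volatile __typeof__(x) *)&(x) = (val))

#define DEMO_NO_OWNER 0xffffffffu

/* Hypothetical, reduced debug state; not the kernel's rwlock_t. */
struct demo_rwlock {
	unsigned int owner_cpu;
};

/* "After lock" side: publish the new owner with a marked store. */
static void demo_set_owner(struct demo_rwlock *lock, unsigned int cpu)
{
	WRITE_ONCE(lock->owner_cpu, cpu);
}

/*
 * "Before lock" side: a marked load pairs with the marked store above.
 * Returns 1 if taking the lock on 'cpu' would look like CPU recursion.
 */
static int demo_would_recurse(struct demo_rwlock *lock, unsigned int cpu)
{
	return READ_ONCE(lock->owner_cpu) == cpu;
}
```

The sketch only shows which side stores and which side loads; it does not
reproduce the race itself, which needs two CPUs contending for the lock.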
This series updates the mailbox API in two steps, adding two new MBX
operations:
1) Request the PF's link state (speed and up/down). The legacy approach is
obsolete for the new E610 adapter and cannot provide correct link state
data. This bumps the API to 1.6.
2) Ask the PF which features it supports. API version negotiation has been
messy for some time because new features, not supported by every driver
able to link with ixgbevf, were added too loosely together with
corresponding API versions. The list of supported features is now
provided by an MBX operation. This bumps the API to 1.7.
Jedrzej Jagielski (4):
ixgbevf: fix getting link speed data for E610 devices
ixgbe: handle IXGBE_VF_GET_PF_LINK_STATE mailbox operation
ixgbevf: fix mailbox API compatibility by negotiating supported
features
ixgbe: handle IXGBE_VF_FEATURES_NEGOTIATE mbox cmd
drivers/net/ethernet/intel/ixgbe/ixgbe_mbx.h | 15 ++
.../net/ethernet/intel/ixgbe/ixgbe_sriov.c | 79 ++++++++
drivers/net/ethernet/intel/ixgbevf/defines.h | 1 +
drivers/net/ethernet/intel/ixgbevf/ipsec.c | 10 +
drivers/net/ethernet/intel/ixgbevf/ixgbevf.h | 7 +
.../net/ethernet/intel/ixgbevf/ixgbevf_main.c | 34 +++-
drivers/net/ethernet/intel/ixgbevf/mbx.h | 8 +
drivers/net/ethernet/intel/ixgbevf/vf.c | 182 +++++++++++++++---
drivers/net/ethernet/intel/ixgbevf/vf.h | 1 +
9 files changed, 304 insertions(+), 33 deletions(-)
--
2.31.1
Hi Luca Weiss and Tamura Dai,
On 9/12/25 02:24, Luca Weiss wrote:
> Hi Tamura,
>
> On Fri Sep 12, 2025 at 9:01 AM CEST, Tamura Dai wrote:
>> The bug is a typo in the compatible string for the touchscreen node.
>> According to Documentation/devicetree/bindings/input/touchscreen/edt-ft5x06.yaml,
>> the correct compatible is "focaltech,ft8719", but the device tree used
>> "focaltech,fts8719".
>
> +Joel
>
> I don't think this patch is really correct, in the sdm845-mainline fork
> there's a different commit which has some more changes to make the
> touchscreen work:
>
> https://gitlab.com/sdm845-mainline/linux/-/commit/2ca76ac2e046158814b043fd4…
Yes, this patch is not correct. My commit from the gitlab repo is the
correct one. But I personally don't have the shiftmq6 device to smoke
test before sending the patch. That's why I was hesitant to send it
upstream. I have now requested someone to confirm if the touchscreen
works with my gitlab commit. If it's all good, I will send the correct
patch later.
Regards,
Joel
Commit 67a873df0c41 ("vhost: basic in order support") passes the number
of used elements to vhost_net_rx_peek_head_len() so that it can signal
the used entries correctly before trying to busy poll. But it forgets to
clear the count, which lets the count run out of sync with handle_rx()
and breaks busy polling.
Fix this by passing a pointer to the count and clearing it after
signaling the used entries.
Acked-by: Michael S. Tsirkin <mst(a)redhat.com>
Cc: stable(a)vger.kernel.org
Fixes: 67a873df0c41 ("vhost: basic in order support")
Signed-off-by: Jason Wang <jasowang(a)redhat.com>
---
drivers/vhost/net.c | 7 ++++---
1 file changed, 4 insertions(+), 3 deletions(-)
diff --git a/drivers/vhost/net.c b/drivers/vhost/net.c
index c6508fe0d5c8..16e39f3ab956 100644
--- a/drivers/vhost/net.c
+++ b/drivers/vhost/net.c
@@ -1014,7 +1014,7 @@ static int peek_head_len(struct vhost_net_virtqueue *rvq, struct sock *sk)
}
static int vhost_net_rx_peek_head_len(struct vhost_net *net, struct sock *sk,
- bool *busyloop_intr, unsigned int count)
+ bool *busyloop_intr, unsigned int *count)
{
struct vhost_net_virtqueue *rnvq = &net->vqs[VHOST_NET_VQ_RX];
struct vhost_net_virtqueue *tnvq = &net->vqs[VHOST_NET_VQ_TX];
@@ -1024,7 +1024,8 @@ static int vhost_net_rx_peek_head_len(struct vhost_net *net, struct sock *sk,
if (!len && rvq->busyloop_timeout) {
/* Flush batched heads first */
- vhost_net_signal_used(rnvq, count);
+ vhost_net_signal_used(rnvq, *count);
+ *count = 0;
/* Both tx vq and rx socket were polled here */
vhost_net_busy_poll(net, rvq, tvq, busyloop_intr, true);
@@ -1180,7 +1181,7 @@ static void handle_rx(struct vhost_net *net)
do {
sock_len = vhost_net_rx_peek_head_len(net, sock->sk,
- &busyloop_intr, count);
+ &busyloop_intr, &count);
if (!sock_len)
break;
sock_len += sock_hlen;
--
2.34.1
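The difference between passing the count by value and by pointer, as in
the vhost patch above, can be sketched with a toy model. Everything named
demo_* is hypothetical and stands in for the vhost structures; the model
only shows why the caller's counter stays stale when it is passed by
value, not the exact symptom seen in handle_rx().

```c
/* Hypothetical stand-in for a virtqueue; counts heads reported so far. */
struct demo_vq {
	unsigned int signalled;
};

static void demo_signal_used(struct demo_vq *vq, unsigned int count)
{
	vq->signalled += count;
}

/* Old shape: 'count' is a copy, so the caller's counter is never reset. */
static void demo_peek_by_value(struct demo_vq *vq, unsigned int count)
{
	demo_signal_used(vq, count);
}

/* New shape: flush the batch, then clear the caller's counter. */
static void demo_peek_by_pointer(struct demo_vq *vq, unsigned int *count)
{
	demo_signal_used(vq, *count);
	*count = 0;
}
```

In this toy model, calling demo_peek_by_value() twice with the same
counter accounts for the same batch twice, while demo_peek_by_pointer()
accounts for it once.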
The specification, Section 7.10, "Software Steps to Drain Page Requests &
Responses," requires software to submit an Invalidation Wait Descriptor
(inv_wait_dsc) with the Page-request Drain (PD=1) flag set, along with
the Invalidation Wait Completion Status Write flag (SW=1). It then waits
for the Invalidation Wait Descriptor's completion.
However, the PD field in the Invalidation Wait Descriptor is optional, as
stated in Section 6.5.2.9, "Invalidation Wait Descriptor":
"Page-request Drain (PD): Remapping hardware implementations reporting
Page-request draining as not supported (PDS = 0 in ECAP_REG) treat this
field as reserved."
This implies that if the IOMMU doesn't support the PDS capability, software
can't drain page requests and group responses as expected.
Do not enable PCI/PRI if the IOMMU doesn't support PDS.
Reported-by: Joel Granados <joel.granados(a)kernel.org>
Closes: https://lore.kernel.org/r/20250909-jag-pds-v1-1-ad8cba0e494e@kernel.org
Fixes: 66ac4db36f4c ("iommu/vt-d: Add page request draining support")
Cc: stable(a)vger.kernel.org
Signed-off-by: Lu Baolu <baolu.lu(a)linux.intel.com>
---
drivers/iommu/intel/iommu.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
index 9c3ab9d9f69a..92759a8f8330 100644
--- a/drivers/iommu/intel/iommu.c
+++ b/drivers/iommu/intel/iommu.c
@@ -3812,7 +3812,7 @@ static struct iommu_device *intel_iommu_probe_device(struct device *dev)
}
if (info->ats_supported && ecap_prs(iommu->ecap) &&
- pci_pri_supported(pdev))
+ ecap_pds(iommu->ecap) && pci_pri_supported(pdev))
info->pri_supported = 1;
}
}
--
2.43.0
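The capability gating added by the iommu/vt-d patch above boils down to
one more term in a conjunction. A minimal sketch follows; the bit
positions and the demo_* names are placeholders chosen for illustration
and are not the real ECAP_REG layout or the driver's helpers.

```c
#include <stdbool.h>
#include <stdint.h>

/* Placeholder capability bits (assumption: not the real register layout). */
#define DEMO_ECAP_PRS (1ULL << 0)	/* page request support */
#define DEMO_ECAP_PDS (1ULL << 1)	/* page-request drain support */

/*
 * PRI is only usable when the device supports ATS and PRI and the IOMMU
 * reports both page request support and page-request draining.
 */
static bool demo_pri_usable(uint64_t ecap, bool ats_supported,
			    bool dev_pri_supported)
{
	return ats_supported &&
	       (ecap & DEMO_ECAP_PRS) &&
	       (ecap & DEMO_ECAP_PDS) &&
	       dev_pri_supported;
}
```

With PRS set but PDS clear, demo_pri_usable() returns false, which mirrors
the intent of not enabling PCI/PRI when draining is unavailable.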
In order to set the AMCR register, which configures the
memory-region split between ospi1 and ospi2, we need to
identify the ospi instance.
Using memory-region-names makes it possible to identify the
ospi instance this memory-region belongs to.
Fixes: cad2492de91c ("arm64: dts: st: Add SPI NOR flash support on stm32mp257f-ev1 board")
Cc: stable(a)vger.kernel.org
Signed-off-by: Patrice Chotard <patrice.chotard(a)foss.st.com>
---
Changes in v3:
- Set again "Cc: <stable(a)vger.kernel.org>"
- Link to v2: https://lore.kernel.org/r/20250811-upstream_fix_dts_omm-v2-1-00ff55076bd5@f…
Changes in v2:
- Update commit message.
- Use correct memory-region-names value.
- Remove "Cc: <stable(a)vger.kernel.org>" tag as the fixed patch is not part of a LTS.
- Link to v1: https://lore.kernel.org/r/20250806-upstream_fix_dts_omm-v1-1-e68c15ed422d@f…
---
arch/arm64/boot/dts/st/stm32mp257f-ev1.dts | 1 +
1 file changed, 1 insertion(+)
diff --git a/arch/arm64/boot/dts/st/stm32mp257f-ev1.dts b/arch/arm64/boot/dts/st/stm32mp257f-ev1.dts
index 2f561ad4066544445e93db78557bc4be1c27095a..7bd8433c1b4344bb5d58193a5e6314f9ae89e0a4 100644
--- a/arch/arm64/boot/dts/st/stm32mp257f-ev1.dts
+++ b/arch/arm64/boot/dts/st/stm32mp257f-ev1.dts
@@ -197,6 +197,7 @@ &i2c8 {
&ommanager {
memory-region = <&mm_ospi1>;
+ memory-region-names = "ospi1";
pinctrl-0 = <&ospi_port1_clk_pins_a
&ospi_port1_io03_pins_a
&ospi_port1_cs0_pins_a>;
---
base-commit: 038d61fd642278bab63ee8ef722c50d10ab01e8f
change-id: 20250806-upstream_fix_dts_omm-c006b69042f1
Best regards,
--
Patrice Chotard <patrice.chotard(a)foss.st.com>
In the IOMMU Shared Virtual Addressing (SVA) context, the IOMMU hardware
shares and walks the CPU's page tables. The x86 architecture maps the
kernel's virtual address space into the upper portion of every process's
page table. Consequently, in an SVA context, the IOMMU hardware can walk
and cache kernel page table entries.
The Linux kernel currently lacks a notification mechanism for kernel page
table changes, specifically when page table pages are freed and reused.
The IOMMU driver is only notified of changes to user virtual address
mappings. This can cause the IOMMU's internal caches to retain stale
entries for kernel VA.
A Use-After-Free (UAF) and Write-After-Free (WAF) condition arises when
kernel page table pages are freed and later reallocated. The IOMMU could
misinterpret the new data as valid page table entries. The IOMMU might
then walk into attacker-controlled memory, leading to arbitrary physical
memory DMA access or privilege escalation. This is also a Write-After-Free
issue, as the IOMMU will potentially continue to write Accessed and Dirty
bits to the freed memory while attempting to walk the stale page tables.
Currently, SVA contexts are unprivileged and cannot access kernel
mappings. However, the IOMMU will still walk kernel-only page tables
all the way down to the leaf entries, where it realizes the mapping
is for the kernel and errors out. This means the IOMMU still caches
these intermediate page table entries, making the described vulnerability
a real concern.
To mitigate this, a new IOMMU interface is introduced to flush IOTLB
entries for the kernel address space. This interface is invoked from the
x86 architecture code that manages combined user and kernel page tables,
specifically before any kernel page table page is freed and reused.
This addresses the main issue with vfree() which is a common occurrence
and can be triggered by unprivileged users. While this resolves the
primary problem, it doesn't address an extremely rare case related to
memory unplug of memory that was present as reserved memory at boot,
which cannot be triggered by unprivileged users. The discussion can be
found at the link below.
Fixes: 26b25a2b98e4 ("iommu: Bind process address spaces to devices")
Cc: stable(a)vger.kernel.org
Suggested-by: Jann Horn <jannh(a)google.com>
Co-developed-by: Jason Gunthorpe <jgg(a)nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg(a)nvidia.com>
Signed-off-by: Lu Baolu <baolu.lu(a)linux.intel.com>
Reviewed-by: Jason Gunthorpe <jgg(a)nvidia.com>
Reviewed-by: Vasant Hegde <vasant.hegde(a)amd.com>
Reviewed-by: Kevin Tian <kevin.tian(a)intel.com>
Link: https://lore.kernel.org/linux-iommu/04983c62-3b1d-40d4-93ae-34ca04b827e5@in…
---
drivers/iommu/iommu-sva.c | 29 ++++++++++++++++++++++++++++-
include/linux/iommu.h | 4 ++++
mm/pgtable-generic.c | 2 ++
3 files changed, 34 insertions(+), 1 deletion(-)
diff --git a/drivers/iommu/iommu-sva.c b/drivers/iommu/iommu-sva.c
index 1a51cfd82808..d236aef80a8d 100644
--- a/drivers/iommu/iommu-sva.c
+++ b/drivers/iommu/iommu-sva.c
@@ -10,6 +10,8 @@
#include "iommu-priv.h"
static DEFINE_MUTEX(iommu_sva_lock);
+static bool iommu_sva_present;
+static LIST_HEAD(iommu_sva_mms);
static struct iommu_domain *iommu_sva_domain_alloc(struct device *dev,
struct mm_struct *mm);
@@ -42,6 +44,7 @@ static struct iommu_mm_data *iommu_alloc_mm_data(struct mm_struct *mm, struct de
return ERR_PTR(-ENOSPC);
}
iommu_mm->pasid = pasid;
+ iommu_mm->mm = mm;
INIT_LIST_HEAD(&iommu_mm->sva_domains);
/*
* Make sure the write to mm->iommu_mm is not reordered in front of
@@ -132,8 +135,13 @@ struct iommu_sva *iommu_sva_bind_device(struct device *dev, struct mm_struct *mm
if (ret)
goto out_free_domain;
domain->users = 1;
- list_add(&domain->next, &mm->iommu_mm->sva_domains);
+ if (list_empty(&iommu_mm->sva_domains)) {
+ if (list_empty(&iommu_sva_mms))
+ iommu_sva_present = true;
+ list_add(&iommu_mm->mm_list_elm, &iommu_sva_mms);
+ }
+ list_add(&domain->next, &iommu_mm->sva_domains);
out:
refcount_set(&handle->users, 1);
mutex_unlock(&iommu_sva_lock);
@@ -175,6 +183,13 @@ void iommu_sva_unbind_device(struct iommu_sva *handle)
list_del(&domain->next);
iommu_domain_free(domain);
}
+
+ if (list_empty(&iommu_mm->sva_domains)) {
+ list_del(&iommu_mm->mm_list_elm);
+ if (list_empty(&iommu_sva_mms))
+ iommu_sva_present = false;
+ }
+
mutex_unlock(&iommu_sva_lock);
kfree(handle);
}
@@ -312,3 +327,15 @@ static struct iommu_domain *iommu_sva_domain_alloc(struct device *dev,
return domain;
}
+
+void iommu_sva_invalidate_kva_range(unsigned long start, unsigned long end)
+{
+ struct iommu_mm_data *iommu_mm;
+
+ guard(mutex)(&iommu_sva_lock);
+ if (!iommu_sva_present)
+ return;
+
+ list_for_each_entry(iommu_mm, &iommu_sva_mms, mm_list_elm)
+ mmu_notifier_arch_invalidate_secondary_tlbs(iommu_mm->mm, start, end);
+}
diff --git a/include/linux/iommu.h b/include/linux/iommu.h
index c30d12e16473..66e4abb2df0d 100644
--- a/include/linux/iommu.h
+++ b/include/linux/iommu.h
@@ -1134,7 +1134,9 @@ struct iommu_sva {
struct iommu_mm_data {
u32 pasid;
+ struct mm_struct *mm;
struct list_head sva_domains;
+ struct list_head mm_list_elm;
};
int iommu_fwspec_init(struct device *dev, struct fwnode_handle *iommu_fwnode);
@@ -1615,6 +1617,7 @@ struct iommu_sva *iommu_sva_bind_device(struct device *dev,
struct mm_struct *mm);
void iommu_sva_unbind_device(struct iommu_sva *handle);
u32 iommu_sva_get_pasid(struct iommu_sva *handle);
+void iommu_sva_invalidate_kva_range(unsigned long start, unsigned long end);
#else
static inline struct iommu_sva *
iommu_sva_bind_device(struct device *dev, struct mm_struct *mm)
@@ -1639,6 +1642,7 @@ static inline u32 mm_get_enqcmd_pasid(struct mm_struct *mm)
}
static inline void mm_pasid_drop(struct mm_struct *mm) {}
+static inline void iommu_sva_invalidate_kva_range(unsigned long start, unsigned long end) {}
#endif /* CONFIG_IOMMU_SVA */
#ifdef CONFIG_IOMMU_IOPF
diff --git a/mm/pgtable-generic.c b/mm/pgtable-generic.c
index 0279399d4910..2717dc9afff0 100644
--- a/mm/pgtable-generic.c
+++ b/mm/pgtable-generic.c
@@ -13,6 +13,7 @@
#include <linux/swap.h>
#include <linux/swapops.h>
#include <linux/mm_inline.h>
+#include <linux/iommu.h>
#include <asm/pgalloc.h>
#include <asm/tlb.h>
@@ -430,6 +431,7 @@ static void kernel_pgtable_work_func(struct work_struct *work)
list_splice_tail_init(&kernel_pgtable_work.list, &page_list);
spin_unlock(&kernel_pgtable_work.lock);
+ iommu_sva_invalidate_kva_range(PAGE_OFFSET, TLB_FLUSH_ALL);
list_for_each_entry_safe(pt, next, &page_list, pt_list)
__pagetable_free(pt);
}
--
2.43.0
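The bookkeeping in the iommu-sva.c hunks above (a global list of mms that
have at least one SVA domain, plus a "present" flag checked before
flushing) can be modelled in userspace roughly as follows. This is a
single-threaded sketch under assumptions: the demo_* names are
hypothetical, a fixed array replaces the kernel list, a counter replaces
the real secondary-TLB invalidation, and the locking done by
iommu_sva_lock in the patch is omitted.

```c
#include <stdbool.h>
#include <stddef.h>

#define DEMO_MAX_MMS 8

/* Hypothetical per-mm state; stands in for struct iommu_mm_data. */
struct demo_mm {
	int nr_domains;		/* SVA domains currently bound to this mm */
	unsigned long flushes;	/* kernel-VA flushes this mm has received */
};

static struct demo_mm *demo_sva_mms[DEMO_MAX_MMS]; /* mms with >= 1 domain */
static size_t demo_nr_mms;
static bool demo_sva_present;

/* First domain bound to an mm puts the mm on the global list. */
static int demo_bind(struct demo_mm *mm)
{
	if (mm->nr_domains == 0) {
		if (demo_nr_mms == DEMO_MAX_MMS)
			return -1;
		demo_sva_mms[demo_nr_mms++] = mm;
		demo_sva_present = true;
	}
	mm->nr_domains++;
	return 0;
}

/* Last domain unbound removes the mm; an empty list clears the flag. */
static void demo_unbind(struct demo_mm *mm)
{
	if (mm->nr_domains == 0 || --mm->nr_domains > 0)
		return;
	for (size_t i = 0; i < demo_nr_mms; i++) {
		if (demo_sva_mms[i] == mm) {
			demo_sva_mms[i] = demo_sva_mms[--demo_nr_mms];
			break;
		}
	}
	demo_sva_present = demo_nr_mms > 0;
}

/* To be called before kernel page-table pages are freed and reused. */
static void demo_invalidate_kva_range(void)
{
	if (!demo_sva_present)
		return;
	for (size_t i = 0; i < demo_nr_mms; i++)
		demo_sva_mms[i]->flushes++;
}
```

The early return on the flag keeps the common case, where no SVA binding
exists, cheap; only mms that are actually shared with an IOMMU get a
flush.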
When do_task() exhausts its RXE_MAX_ITERATIONS budget, it unconditionally
sets the task state to TASK_STATE_IDLE to reschedule. This overwrites
the TASK_STATE_DRAINING state that may have been concurrently set by
rxe_cleanup_task() or rxe_disable_task().
This race condition breaks the cleanup and disable logic, which expects
the task to stop processing new work. The cleanup code may proceed while
do_task() reschedules itself, leading to a potential use-after-free.
This bug was introduced during the migration from tasklets to workqueues,
where the special handling for the draining case was lost.
Fix this by restoring the original behavior. If the state is
TASK_STATE_DRAINING when iterations are exhausted, continue the loop by
setting cont to 1. This allows new iterations to finish the remaining
work and reach the switch statement, which properly transitions the
state to TASK_STATE_DRAINED and stops the task as intended.
Fixes: 9b4b7c1f9f54 ("RDMA/rxe: Add workqueue support for rxe tasks")
Cc: stable(a)vger.kernel.org
Signed-off-by: Gui-Dong Han <hanguidong02(a)gmail.com>
---
drivers/infiniband/sw/rxe/rxe_task.c | 8 ++++++--
1 file changed, 6 insertions(+), 2 deletions(-)
diff --git a/drivers/infiniband/sw/rxe/rxe_task.c b/drivers/infiniband/sw/rxe/rxe_task.c
index 6f8f353e9583..f522820b950c 100644
--- a/drivers/infiniband/sw/rxe/rxe_task.c
+++ b/drivers/infiniband/sw/rxe/rxe_task.c
@@ -132,8 +132,12 @@ static void do_task(struct rxe_task *task)
* yield the cpu and reschedule the task
*/
if (!ret) {
- task->state = TASK_STATE_IDLE;
- resched = 1;
+ if (task->state != TASK_STATE_DRAINING) {
+ task->state = TASK_STATE_IDLE;
+ resched = 1;
+ } else {
+ cont = 1;
+ }
goto exit;
}
--
2.25.1
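The decision taken in the rxe_task.c hunk above when the iteration budget
runs out can be isolated into a small function. The sketch below is a
model only: the demo_* types are hypothetical, the state names merely
mirror the ones mentioned in the commit message, and the locking around
the state in the real driver is left out.

```c
enum demo_state { DEMO_IDLE, DEMO_BUSY, DEMO_DRAINING, DEMO_DRAINED };

struct demo_task {
	enum demo_state state;
};

struct demo_decision {
	int resched;	/* yield the CPU and queue the task again */
	int cont;	/* keep looping so the drain can complete */
};

/* Decide what to do once the iteration budget is used up. */
static struct demo_decision demo_budget_exhausted(struct demo_task *task)
{
	struct demo_decision d = { 0, 0 };

	if (task->state != DEMO_DRAINING) {
		task->state = DEMO_IDLE;
		d.resched = 1;
	} else {
		/* Do not overwrite DRAINING; let the loop finish the drain. */
		d.cont = 1;
	}
	return d;
}
```

The point of the else branch is that a concurrently requested drain is
never replaced by an idle state, so whoever waits for the drain is not
left racing with a rescheduled task.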
An untrusted device may return a NULL context pointer in the request
header. hptiop_iop_request_callback_itl() dereferences that pointer
unconditionally to write result fields and to invoke arg->done(), which
can cause a NULL pointer dereference.
Add a NULL check for the reconstructed context pointer. If it is NULL,
acknowledge the request by writing the tag to the outbound queue and
return early.
Fixes: ede1e6f8b432 ("[SCSI] hptiop: HighPoint RocketRAID 3xxx controller driver")
Cc: stable(a)vger.kernel.org
Signed-off-by: Guangshuo Li <lgs201920130244(a)gmail.com>
---
drivers/scsi/hptiop.c | 5 +++++
1 file changed, 5 insertions(+)
diff --git a/drivers/scsi/hptiop.c b/drivers/scsi/hptiop.c
index 21f1d9871a33..2b29cd83ce5e 100644
--- a/drivers/scsi/hptiop.c
+++ b/drivers/scsi/hptiop.c
@@ -812,6 +812,11 @@ static void hptiop_iop_request_callback_itl(struct hptiop_hba *hba, u32 tag)
(readl(&req->context) |
((u64)readl(&req->context_hi32)<<32));
+ if (!arg) {
+ writel(tag, &hba->u.itl.iop->outbound_queue);
+ return;
+ }
+
if (readl(&req->result) == IOP_RESULT_SUCCESS) {
arg->result = HPT_IOCTL_RESULT_OK;
--
2.43.0
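The pattern in the hptiop patch above, rebuilding a 64-bit context from
two 32-bit halves supplied by the device and refusing to use it when it is
NULL, can be sketched in plain C. The demo_* structures are hypothetical
stand-ins; ordinary memory reads replace readl(), and the write to the
outbound queue is reduced to an error return.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical completion argument; stands in for the driver's ioctl arg. */
struct demo_arg {
	int result;
	int done_called;
};

/* Hypothetical request header as filled in by the (untrusted) device. */
struct demo_req {
	uint32_t context;
	uint32_t context_hi32;
};

/* Rebuild the pointer from its low and high halves. */
static struct demo_arg *demo_context_to_arg(const struct demo_req *req)
{
	uint64_t raw = (uint64_t)req->context |
		       ((uint64_t)req->context_hi32 << 32);

	return (struct demo_arg *)(uintptr_t)raw;
}

/* Returns 0 when the request was completed, -1 for a NULL context. */
static int demo_complete(const struct demo_req *req)
{
	struct demo_arg *arg = demo_context_to_arg(req);

	if (!arg)
		return -1;	/* nothing to dereference; bail out early */

	arg->result = 0;
	arg->done_called = 1;
	return 0;
}
```

Note that a NULL check only covers the all-zero case; the sketch makes no
attempt to validate other bogus values a device might return.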
If ab->fw.m3_data points to data, the fw pointer remains NULL.
If m3_mem is then not allocated, fw is dereferenced to pass fw->size to
the ath11k_err() call.
Use m3_len instead of fw->size.
Found by Linux Verification Center (linuxtesting.org) with SVACE.
Fixes: 7db88b962f06 ("wifi: ath11k: add firmware-2.bin support")
Cc: stable(a)vger.kernel.org
Signed-off-by: Matvey Kovalev <matvey.kovalev(a)ispras.ru>
---
drivers/net/wireless/ath/ath11k/qmi.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/net/wireless/ath/ath11k/qmi.c b/drivers/net/wireless/ath/ath11k/qmi.c
index 378ac96b861b7..1a42b4abe7168 100644
--- a/drivers/net/wireless/ath/ath11k/qmi.c
+++ b/drivers/net/wireless/ath/ath11k/qmi.c
@@ -2557,7 +2557,7 @@ static int ath11k_qmi_m3_load(struct ath11k_base *ab)
GFP_KERNEL);
if (!m3_mem->vaddr) {
ath11k_err(ab, "failed to allocate memory for M3 with size %zu\n",
- fw->size);
+ m3_len);
ret = -ENOMEM;
goto out;
}
--
2.43.0.windows.1
The SolidRun CN9130 SoC based boards have a variety of functional
problems, in particular
- SATA ports
- CN9132 CEX-7 eMMC
- CN9132 Clearfog PCI-E x2 / x4 ports
are not functional.
The SATA issue was recently introduced via changes to the
armada-cp11x.dtsi, whereas the eMMC and SPI problems were present in the
board dts from the very beginning.
This patch-set aims to resolve the problems after testing on Debian 13
release (Linux v6.12).
Signed-off-by: Josua Mayer <josua(a)solid-run.com>
---
Changes in v2:
- fixed mistakes in the original board device-trees that caused
functional issues with eMMC and pci.
- Link to v1: https://lore.kernel.org/r/20250911-cn913x-sr-fix-sata-v1-1-9e72238d0988@sol…
---
Josua Mayer (4):
arm64: dts: marvell: cn913x-solidrun: fix sata ports status
arm64: dts: marvell: cn9132-clearfog: disable eMMC high-speed modes
arm64: dts: marvell: cn9132-clearfog: fix multi-lane pci x2 and x4 ports
arm64: dts: marvell: cn9130-sr-som: add missing properties to emmc
arch/arm64/boot/dts/marvell/cn9130-cf.dtsi | 7 ++++---
arch/arm64/boot/dts/marvell/cn9130-sr-som.dtsi | 2 ++
arch/arm64/boot/dts/marvell/cn9131-cf-solidwan.dts | 6 ++++--
arch/arm64/boot/dts/marvell/cn9132-clearfog.dts | 22 ++++++++++++++++------
arch/arm64/boot/dts/marvell/cn9132-sr-cex7.dtsi | 8 ++++++++
5 files changed, 34 insertions(+), 11 deletions(-)
---
base-commit: 8f5ae30d69d7543eee0d70083daf4de8fe15d585
change-id: 20250911-cn913x-sr-fix-sata-5c737ebdb97f
Best regards,
--
Josua Mayer <josua(a)solid-run.com>
devm_kcalloc() may fail. ndtest_probe() allocates three DMA address
arrays (dcr_dma, label_dma, dimm_dma) and later unconditionally uses
them in ndtest_nvdimm_init(), which can lead to a NULL pointer
dereference on allocation failure.
Add NULL checks for all three allocations and return -ENOMEM if any
allocation fails.
Fixes: 9399ab61ad82 ("ndtest: Add dimms to the two buses")
Cc: stable(a)vger.kernel.org
Signed-off-by: Guangshuo Li <lgs201920130244(a)gmail.com>
---
tools/testing/nvdimm/test/ndtest.c | 5 +++++
1 file changed, 5 insertions(+)
diff --git a/tools/testing/nvdimm/test/ndtest.c b/tools/testing/nvdimm/test/ndtest.c
index 68a064ce598c..516f304bb0b9 100644
--- a/tools/testing/nvdimm/test/ndtest.c
+++ b/tools/testing/nvdimm/test/ndtest.c
@@ -855,6 +855,11 @@ static int ndtest_probe(struct platform_device *pdev)
p->dimm_dma = devm_kcalloc(&p->pdev.dev, NUM_DCR,
sizeof(dma_addr_t), GFP_KERNEL);
+ if (!p->dcr_dma || !p->label_dma || !p->dimm_dma) {
+ pr_err("%s: failed to allocate DMA address arrays\n", __func__);
+ return -ENOMEM;
+ }
+
rc = ndtest_nvdimm_init(p);
if (rc)
goto err;
--
2.43.0
Hi,
I would like to request backporting 5326ab737a47 ("virtio_console: fix
order of fields cols and rows") to all LTS kernels.
I'm working on QEMU patches that add virtio console size support.
Without the fix, rows and columns will be swapped.
As far as I know, there are no device implementations that use the
wrong order and would be broken by the fix.
Note: A previous version [1] of the patch contained "Cc: stable" and
"Fixes:" tags, but they seem to have been accidentally left out from
the final version.
[1]: https://lore.kernel.org/all/20250320172654.624657-1-maxbr@linux.ibm.com/
Thanks,
Filip Hejsek
kcalloc_node() may fail. When the interrupter array allocation returns
NULL, subsequent code uses xhci->interrupters (e.g. in xhci_add_interrupter()
and in cleanup paths), leading to a potential NULL pointer dereference.
Check the allocation and bail out to the existing fail path to avoid
the NULL dereference.
Fixes: c99b38c412343 ("xhci: add support to allocate several interrupters")
Cc: stable(a)vger.kernel.org
Signed-off-by: Guangshuo Li <lgs201920130244(a)gmail.com>
---
drivers/usb/host/xhci-mem.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/drivers/usb/host/xhci-mem.c b/drivers/usb/host/xhci-mem.c
index d698095fc88d..da257856e864 100644
--- a/drivers/usb/host/xhci-mem.c
+++ b/drivers/usb/host/xhci-mem.c
@@ -2505,7 +2505,8 @@ int xhci_mem_init(struct xhci_hcd *xhci, gfp_t flags)
"Allocating primary event ring");
xhci->interrupters = kcalloc_node(xhci->max_interrupters, sizeof(*xhci->interrupters),
flags, dev_to_node(dev));
-
+ if (!xhci->interrupters)
+ goto fail;
ir = xhci_alloc_interrupter(xhci, 0, flags);
if (!ir)
goto fail;
--
2.43.0
The following changes since commit 76eeb9b8de9880ca38696b2fb56ac45ac0a25c6c:
Linux 6.17-rc5 (2025-09-07 14:22:57 -0700)
are available in the Git repository at:
https://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost.git tags/for_linus
for you to fetch changes up to 549db78d951726646ae9468e86c92cbd1fe73595:
virtio_config: clarify output parameters (2025-09-16 05:37:03 -0400)
----------------------------------------------------------------
virtio,vhost: last minute fixes
More small fixes. Most notably this reverts a virtio console
change since we made it without considering compatibility
sufficiently.
Signed-off-by: Michael S. Tsirkin <mst(a)redhat.com>
----------------------------------------------------------------
Alok Tiwari (1):
vhost-scsi: fix argument order in tport allocation error message
Alyssa Ross (1):
virtio_config: clarify output parameters
Ashwini Sahu (1):
uapi: vduse: fix typo in comment
Michael S. Tsirkin (1):
Revert "virtio_console: fix order of fields cols and rows"
Sean Christopherson (3):
vhost_task: Don't wake KVM x86's recovery thread if vhost task was killed
vhost_task: Allow caller to omit handle_sigkill() callback
KVM: x86/mmu: Don't register a sigkill callback for NX hugepage recovery tasks
zhang jiao (1):
vhost: vringh: Modify the return value check
arch/x86/kvm/mmu/mmu.c | 7 +-----
drivers/char/virtio_console.c | 2 +-
drivers/vhost/scsi.c | 2 +-
drivers/vhost/vhost.c | 2 +-
drivers/vhost/vringh.c | 7 +++---
include/linux/sched/vhost_task.h | 1 +
include/linux/virtio_config.h | 11 ++++----
include/uapi/linux/vduse.h | 2 +-
kernel/vhost_task.c | 54 ++++++++++++++++++++++++++++++++++++----
9 files changed, 65 insertions(+), 23 deletions(-)
Two kcalloc() allocations (descriptor table and context table) can fail
and are used unconditionally afterwards (ALIGN()/phys conversion and
dereferences), leading to potential NULL pointer dereference.
Check both allocations and bail out early; on the second failure, free
the first allocation to avoid a leak. Do not emit extra OOM logs.
Fixes: 73d739698017 ("sb1250-mac.c: De-typedef, de-volatile, de-etc...")
Fixes: c477f3348abb ("drivers/net/sb1250-mac.c: kmalloc + memset conversion to kcalloc")
Cc: stable(a)vger.kernel.org
Signed-off-by: Guangshuo Li <lgs201920130244(a)gmail.com>
---
drivers/net/ethernet/broadcom/sb1250-mac.c | 8 +++++++-
1 file changed, 7 insertions(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/broadcom/sb1250-mac.c b/drivers/net/ethernet/broadcom/sb1250-mac.c
index 30865fe03eeb..e16a49e22488 100644
--- a/drivers/net/ethernet/broadcom/sb1250-mac.c
+++ b/drivers/net/ethernet/broadcom/sb1250-mac.c
@@ -625,6 +625,8 @@ static void sbdma_initctx(struct sbmacdma *d, struct sbmac_softc *s, int chan,
d->sbdma_dscrtable_unaligned = kcalloc(d->sbdma_maxdescr + 1,
sizeof(*d->sbdma_dscrtable),
GFP_KERNEL);
+ if (!d->sbdma_dscrtable_unaligned)
+ return; /* avoid NULL deref in ALIGN/phys conversion */
/*
* The descriptor table must be aligned to at least 16 bytes or the
@@ -644,7 +646,11 @@ static void sbdma_initctx(struct sbmacdma *d, struct sbmac_softc *s, int chan,
d->sbdma_ctxtable = kcalloc(d->sbdma_maxdescr,
sizeof(*d->sbdma_ctxtable), GFP_KERNEL);
-
+ if (!d->sbdma_ctxtable) {
+ kfree(d->sbdma_dscrtable_unaligned);
+ d->sbdma_dscrtable_unaligned = NULL;
+ return;
+ }
#ifdef CONFIG_SBMAC_COALESCE
/*
* Setup Rx/Tx DMA coalescing defaults
--
2.43.0
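The staged allocation with unwinding described in the sb1250-mac patch
above can be sketched in userspace as follows. Assumptions are labeled:
the demo_* names are hypothetical, calloc()/free() replace
kcalloc()/kfree(), a small failure-injection hook exists only so the error
paths can be exercised, and the sketch returns an error code whereas the
patched function returns void.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical stand-in for the two tables set up by the DMA context. */
struct demo_dma {
	void *dscrtable;
	void *ctxtable;
};

/* Failure injection (illustration only): fail the Nth call, -1 = never. */
static int demo_fail_at = -1;
static int demo_calls;

static void *demo_calloc(size_t n, size_t size)
{
	if (demo_calls++ == demo_fail_at)
		return NULL;
	return calloc(n, size);
}

static int demo_initctx(struct demo_dma *d, size_t maxdescr)
{
	d->dscrtable = demo_calloc(maxdescr + 1, 16);
	if (!d->dscrtable)
		return -ENOMEM;	/* nothing allocated yet, nothing to undo */

	d->ctxtable = demo_calloc(maxdescr, sizeof(void *));
	if (!d->ctxtable) {
		/* Undo the first allocation so it is not leaked. */
		free(d->dscrtable);
		d->dscrtable = NULL;
		return -ENOMEM;
	}
	return 0;
}
```

Resetting the pointer to NULL after freeing keeps later cleanup code from
freeing the same table twice.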
Our implementation for BAR2 (lmembar) resize works at the xe_vram layer
and only releases that BAR before resizing. That is not always
sufficient. If the parent bridge needs to move, BAR0 also needs to be
released, otherwise the resize fails. This happens when not enough space
was allocated from the beginning.
Also, there's a BAR0 in the upstream port of the PCIe switch in BMG that
prevents the resize from propagating to the bridge, as previously discussed
at https://lore.kernel.org/intel-xe/20250721173057.867829-1-uwu@icenowy.me/
and https://lore.kernel.org/intel-xe/wqukxnxni2dbpdhri3cbvlrzsefgdanesgskzmxi5s…
I'm bringing that commit from Ilpo here so this can be tested with the
xe changes and go to stable (first 2 patches).
The third patch is just a code move, as all the logic is in a different
layer now. That could wait longer though as there are other refactors
coming through the PCI tree and that would conflict (see second link
above).
With this I could resize the lmembar on some problematic hosts and after
doing an SBR, with one caveat: the audio device also prevents the BAR
from moving and it needs to be manually removed before resizing. With
the PCI refactors and BAR fitting logic that Ilpo is working on, it's
expected that it won't be needed for a long time.
Signed-off-by: Lucas De Marchi <lucas.demarchi(a)intel.com>
---
Ilpo Järvinen (1):
PCI: Release BAR0 of an integrated bridge to allow GPU BAR resize
Lucas De Marchi (2):
drm/xe: Move rebar to be done earlier
drm/xe: Move rebar to its own file
drivers/gpu/drm/xe/Makefile | 1 +
drivers/gpu/drm/xe/xe_pci.c | 3 +
drivers/gpu/drm/xe/xe_pci_rebar.c | 125 ++++++++++++++++++++++++++++++++++++++
drivers/gpu/drm/xe/xe_pci_rebar.h | 13 ++++
drivers/gpu/drm/xe/xe_vram.c | 103 -------------------------------
drivers/pci/quirks.c | 20 ++++++
6 files changed, 162 insertions(+), 103 deletions(-)
base-commit: 95bc43e85f952ef4ebfff1406883e1e07a7daeda
change-id: 20250917-xe-pci-rebar-2-c0fe2f04c879
Lucas De Marchi
The VMA count limit check in do_mmap() and do_brk_flags() uses a
strict inequality (>), which allows a process's VMA count to exceed
the configured sysctl_max_map_count limit by one.
A process with mm->map_count == sysctl_max_map_count will incorrectly
pass this check and then exceed the limit upon allocation of a new VMA
when its map_count is incremented.
Other VMA allocation paths, such as split_vma(), already use the
correct, inclusive (>=) comparison.
Fix this bug by changing the comparison to be inclusive in do_mmap()
and do_brk_flags(), bringing them in line with the correct behavior
of other allocation paths.
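The off-by-one can be illustrated with a small userspace model. This is only a sketch: struct toy_mm, toy_add_vma() and the two may_add_vma_*() helpers are hypothetical stand-ins for mm->map_count and the checks in do_mmap()/do_brk_flags(), not kernel code.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the mm_struct field used by the check. */
struct toy_mm {
	int map_count;
};

/* Old check: rejects only once the count is already above the limit. */
bool may_add_vma_strict(const struct toy_mm *mm, int max_map_count)
{
	return !(mm->map_count > max_map_count);
}

/* Fixed check: rejects as soon as the count has reached the limit. */
bool may_add_vma_inclusive(const struct toy_mm *mm, int max_map_count)
{
	return !(mm->map_count >= max_map_count);
}

/* Adds one VMA if the chosen check allows it; returns the resulting count. */
int toy_add_vma(struct toy_mm *mm, int max_map_count, bool inclusive)
{
	bool ok = inclusive ? may_add_vma_inclusive(mm, max_map_count)
			    : may_add_vma_strict(mm, max_map_count);

	if (ok)
		mm->map_count++;
	return mm->map_count;
}
```

With map_count already equal to the limit, the strict variant lets the count grow to limit + 1, while the inclusive variant leaves it at the limit.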
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Cc: <stable(a)vger.kernel.org>
Cc: Andrew Morton <akpm(a)linux-foundation.org>
Cc: David Hildenbrand <david(a)redhat.com>
Cc: "Liam R. Howlett" <Liam.Howlett(a)oracle.com>
Cc: Lorenzo Stoakes <lorenzo.stoakes(a)oracle.com>
Cc: Mike Rapoport <rppt(a)kernel.org>
Cc: Minchan Kim <minchan(a)kernel.org>
Cc: Pedro Falcato <pfalcato(a)suse.de>
Signed-off-by: Kalesh Singh <kaleshsingh(a)google.com>
---
Changes in v2:
- Fix mmap check, per Pedro
mm/mmap.c | 2 +-
mm/vma.c | 2 +-
2 files changed, 2 insertions(+), 2 deletions(-)
diff --git a/mm/mmap.c b/mm/mmap.c
index 7306253cc3b5..e5370e7fcd8f 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -374,7 +374,7 @@ unsigned long do_mmap(struct file *file, unsigned long addr,
return -EOVERFLOW;
/* Too many mappings? */
- if (mm->map_count > sysctl_max_map_count)
+ if (mm->map_count >= sysctl_max_map_count)
return -ENOMEM;
/*
diff --git a/mm/vma.c b/mm/vma.c
index 3b12c7579831..033a388bc4b1 100644
--- a/mm/vma.c
+++ b/mm/vma.c
@@ -2772,7 +2772,7 @@ int do_brk_flags(struct vma_iterator *vmi, struct vm_area_struct *vma,
if (!may_expand_vm(mm, vm_flags, len >> PAGE_SHIFT))
return -ENOMEM;
- if (mm->map_count > sysctl_max_map_count)
+ if (mm->map_count >= sysctl_max_map_count)
return -ENOMEM;
if (security_vm_enough_memory_mm(mm, len >> PAGE_SHIFT))
--
2.51.0.384.g4c02a37b29-goog
Fix a memory leak in netpoll and introduce netconsole selftests that
expose the issue when running with kmemleak detection enabled.
This patchset includes a selftest for netpoll with multiple concurrent
users (netconsole + bonding), which simulates the scenario from test[1]
that originally demonstrated the issue allegedly fixed by commit
efa95b01da18 ("netpoll: fix use after free") - a commit that is now
being reverted.
Sending this to "net" branch because this is a fix, and the selftest
might help with the backports validation.
Link: https://lore.kernel.org/lkml/96b940137a50e5c387687bb4f57de8b0435a653f.14048… [1]
Signed-off-by: Breno Leitao <leitao(a)debian.org>
---
Changes in v5:
- Set CONFIG_BONDING=m in selftests/drivers/net/config.
- Link to v4: https://lore.kernel.org/r/20250917-netconsole_torture-v4-0-0a5b3b8f81ce@deb…
Changes in v4:
- Added an additional selftest to test multiple netpoll users in
parallel
- Link to v3: https://lore.kernel.org/r/20250905-netconsole_torture-v3-0-875c7febd316@deb…
Changes in v3:
- This patchset is a merge of the fix and the selftest together as
recommended by Jakub.
Changes in v2:
- Reuse the netconsole creation from lib_netcons.sh. Thus, refactoring
the create_dynamic_target() (Jakub)
- Move the "wait" to after all the messages have been sent.
- Link to v1: https://lore.kernel.org/r/20250902-netconsole_torture-v1-1-03c6066598e9@deb…
---
Breno Leitao (4):
net: netpoll: fix incorrect refcount handling causing incorrect cleanup
selftest: netcons: refactor target creation
selftest: netcons: create a torture test
selftest: netcons: add test for netconsole over bonded interfaces
net/core/netpoll.c | 7 +-
tools/testing/selftests/drivers/net/Makefile | 2 +
tools/testing/selftests/drivers/net/config | 1 +
.../selftests/drivers/net/lib/sh/lib_netcons.sh | 197 ++++++++++++++++++---
.../selftests/drivers/net/netcons_over_bonding.sh | 76 ++++++++
.../selftests/drivers/net/netcons_torture.sh | 127 +++++++++++++
6 files changed, 385 insertions(+), 25 deletions(-)
---
base-commit: 5e87fdc37f8dc619549d49ba5c951b369ce7c136
change-id: 20250902-netconsole_torture-8fc23f0aca99
Best regards,
--
Breno Leitao <leitao(a)debian.org>
We were copying the bo content of the bos on the list
"xe->pinned.late.kernel_bo_present" twice on suspend.
Presumably the intent is to copy the pinned external bos on
the first pass.
This is harmless since we (currently) should have no pinned
external bos needing copy since
a) external system bos don't have compressed content,
b) We do not (yet) allow pinning of VRAM bos.
Still, fix this up so that we copy pinned external bos on
the first pass. We're about to allow bos pinned in VRAM.
Fixes: c6a4d46ec1d7 ("drm/xe: evict user memory in PM notifier")
Cc: Matthew Auld <matthew.auld(a)intel.com>
Cc: <stable(a)vger.kernel.org> # v6.16+
Signed-off-by: Thomas Hellström <thomas.hellstrom(a)linux.intel.com>
---
drivers/gpu/drm/xe/xe_bo_evict.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/gpu/drm/xe/xe_bo_evict.c b/drivers/gpu/drm/xe/xe_bo_evict.c
index 7484ce55a303..d5dbc51e8612 100644
--- a/drivers/gpu/drm/xe/xe_bo_evict.c
+++ b/drivers/gpu/drm/xe/xe_bo_evict.c
@@ -158,8 +158,8 @@ int xe_bo_evict_all(struct xe_device *xe)
if (ret)
return ret;
- ret = xe_bo_apply_to_pinned(xe, &xe->pinned.late.kernel_bo_present,
- &xe->pinned.late.evicted, xe_bo_evict_pinned);
+ ret = xe_bo_apply_to_pinned(xe, &xe->pinned.late.external,
+ &xe->pinned.late.external, xe_bo_evict_pinned);
if (!ret)
ret = xe_bo_apply_to_pinned(xe, &xe->pinned.late.kernel_bo_present,
--
2.51.0
Some recent Lenovo and Inspur machines with Zhaoxin CPUs fail to create
/sys/class/backlight/acpi_video0 on v6.6 kernels, while the same hardware
works correctly on v5.4.
Our analysis shows that the current implementation assumes the presence of a
GPU. The backlight registration is only triggered if a GPU is detected, but on
these platforms the backlight is handled purely by the EC without any GPU.
As a result, the detection path does not create the expected backlight node.
To fix this, move the following logic:
/* Use ACPI video if available, except when native should be preferred. */
if ((video_caps & ACPI_VIDEO_BACKLIGHT) &&
!(native_available && prefer_native_over_acpi_video()))
return acpi_backlight_video;
above the if (auto_detect) *auto_detect = true; statement.
This ensures that the ACPI video backlight node is created even when no GPU is
present, restoring the correct behavior observed on older kernels.
Fixes: 78dfc9d1d1ab ("ACPI: video: Add auto_detect arg to __acpi_video_get_backlight_type()")
Cc: stable(a)vger.kernel.org
Signed-off-by: Zihuan Zhang <zhangzihuan(a)kylinos.cn>
---
drivers/acpi/video_detect.c | 10 +++++-----
1 file changed, 5 insertions(+), 5 deletions(-)
diff --git a/drivers/acpi/video_detect.c b/drivers/acpi/video_detect.c
index d507d5e08435..c1bb22b57f56 100644
--- a/drivers/acpi/video_detect.c
+++ b/drivers/acpi/video_detect.c
@@ -1011,6 +1011,11 @@ enum acpi_backlight_type __acpi_video_get_backlight_type(bool native, bool *auto
if (acpi_backlight_dmi != acpi_backlight_undef)
return acpi_backlight_dmi;
+ /* Use ACPI video if available, except when native should be preferred. */
+ if ((video_caps & ACPI_VIDEO_BACKLIGHT) &&
+ !(native_available && prefer_native_over_acpi_video()))
+ return acpi_backlight_video;
+
if (auto_detect)
*auto_detect = true;
@@ -1024,11 +1029,6 @@ enum acpi_backlight_type __acpi_video_get_backlight_type(bool native, bool *auto
if (dell_uart_present)
return acpi_backlight_dell_uart;
- /* Use ACPI video if available, except when native should be preferred. */
- if ((video_caps & ACPI_VIDEO_BACKLIGHT) &&
- !(native_available && prefer_native_over_acpi_video()))
- return acpi_backlight_video;
-
/* Use native if available */
if (native_available)
return acpi_backlight_native;
--
2.25.1
Hi,
Glad to know you and your company from Jordan.
I'm Seven, CTO of STHL. We are a one-stop service provider for PCBA. We can help you with production from PCB to finished product assembly.
Why Partner With Us?
✅ One-Stop Expertise: From PCB fabrication, PCBA (SMT & Through-Hole), custom cable harnesses, to final product assembly – we eliminate multi-vendor coordination risks.
✅ Cost Efficiency: 40%+ clients reduce logistics/QC costs through our integrated service model (ISO 9001:2015 certified).
✅ Speed-to-Market: Average 15% faster lead times achieved via in-house vertical integration.
Recent Success Case:
Helped a German IoT startup scale from prototype to 50K-unit/month production within 6 months through our:
- PCB Design-for-Manufacturing (DFM) optimization
- Automated PCBA with 99.98% first-pass yield
- Mechanical housing CNC machining & IP67-rated assembly
Seven Marcus CTO
Shenzhen STHL Technology Co,Ltd
+8618569002840 Seven(a)pcba-china.com
Hi Team,
We are observing an intermittent regression in UDP fragmentation
handling between Linux kernel versions v5.10.35 and v5.15.71.
Problem description:
Our application sends UDP packets that exceed the path MTU. An
intermediate hop returns an ICMP Type 3, Code 4 (Fragmentation Needed)
message.
On v5.10.35, the kernel correctly updates the Path MTU cache, and
subsequent packets are fragmented as expected.
On v5.15.71, although the ICMP message is received by the kernel,
subsequent UDP packets are sometimes not fragmented and continue to be
dropped.
System details:
Egress interface MTU: 9192 bytes
Path MTU at intermediate hop: 1500 bytes
Kernel parameter: ip_no_pmtu_disc=0 (default)
Questions / request for feedback:
Is this a known regression in the 5.15 kernel series?
We have verified that the Path MTU cache is usually updated correctly.
Is there a way to detect or log cases where the cache is not updated?
If this issue has already been addressed, could you please point us to
the relevant fix commit so we can backport and test it?
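On the question of detecting a stale cache: one possible approach from the application side, sketched here under assumptions, is to read back the path MTU the kernel currently associates with a connected UDP socket via getsockopt(IP_MTU), as described in ip(7). The helper name query_path_mtu() is hypothetical, and whether this value tracks the failing case on v5.15.71 has not been verified.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Returns the path MTU the kernel currently reports for dst_ip, or -1
 * on error. IP_MTU is only valid on a connected socket, so the UDP
 * socket is connect()ed first; connecting a UDP socket sends no packet.
 */
int query_path_mtu(const char *dst_ip, unsigned short port)
{
	struct sockaddr_in dst;
	int mtu = -1;
	socklen_t len = sizeof(mtu);
	int fd;

	memset(&dst, 0, sizeof(dst));
	dst.sin_family = AF_INET;
	dst.sin_port = htons(port);
	if (inet_pton(AF_INET, dst_ip, &dst.sin_addr) != 1)
		return -1;

	fd = socket(AF_INET, SOCK_DGRAM, 0);
	if (fd < 0)
		return -1;

	if (connect(fd, (struct sockaddr *)&dst, sizeof(dst)) < 0 ||
	    getsockopt(fd, IPPROTO_IP, IP_MTU, &mtu, &len) < 0)
		mtu = -1;

	close(fd);
	return mtu;
}
```

Calling this for the affected destination shortly after the ICMP message arrives, and logging whenever the result is still 9192 rather than 1500, might help narrow down when the update is missed.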
We have reviewed several patches between v5.10.35 and v5.15.71 related
to PMTU and ICMP handling and examined the code flow,
but have not been able to pinpoint the root cause.
Any guidance, insights, or pointers would be greatly appreciated.
Best regards,
Chandrasekharreddy C
The iput() function is a dangerous one - if the reference counter goes
to zero, the function may block for a long time due to:
- inode_wait_for_writeback() waits until writeback on this inode
completes
- the filesystem-specific "evict_inode" callback can do similar
things; e.g. all netfs-based filesystems will call
netfs_wait_for_outstanding_io() which is similar to
inode_wait_for_writeback()
Therefore, callers must carefully evaluate the context they're in and
check whether invoking iput() is a good idea at all.
Most of the time, this is not a problem because the dcache holds
references to all inodes, and the dcache is usually the one to release
the last reference. But this assumption is fragile. For example,
under (memcg) memory pressure, the dcache shrinker is more likely to
release inode references, moving the inode eviction to contexts where
that was extremely unlikely to occur.
Our production servers "found" at least two deadlock bugs in the Ceph
filesystem that were caused by this iput() behavior:
1. Writeback may lead to iput() calls in Ceph (e.g. from
ceph_put_wrbuffer_cap_refs()) which deadlocks in
inode_wait_for_writeback(). Waiting for writeback completion from
within writeback will obviously never be able to make any progress.
This leads to blocked kworkers like this:
INFO: task kworker/u777:6:1270802 blocked for more than 122 seconds.
Not tainted 6.16.7-i1-es #773
task:kworker/u777:6 state:D stack:0 pid:1270802 tgid:1270802 ppid:2
task_flags:0x4208060 flags:0x00004000
Workqueue: writeback wb_workfn (flush-ceph-3)
Call Trace:
<TASK>
__schedule+0x4ea/0x17d0
schedule+0x1c/0xc0
inode_wait_for_writeback+0x71/0xb0
evict+0xcf/0x200
ceph_put_wrbuffer_cap_refs+0xdd/0x220
ceph_invalidate_folio+0x97/0xc0
ceph_writepages_start+0x127b/0x14d0
do_writepages+0xba/0x150
__writeback_single_inode+0x34/0x290
writeback_sb_inodes+0x203/0x470
__writeback_inodes_wb+0x4c/0xe0
wb_writeback+0x189/0x2b0
wb_workfn+0x30b/0x3d0
process_one_work+0x143/0x2b0
worker_thread+0x30a/0x450
2. In the Ceph messenger thread (net/ceph/messenger*.c), any iput()
call may invoke ceph_evict_inode() which will deadlock in
netfs_wait_for_outstanding_io(); since this blocks the messenger
thread, completions from the Ceph servers will not ever be received
and handled.
It looks like these deadlock bugs have been in the Ceph filesystem
code since forever (therefore no "Fixes" tag in this patch). There
may be various ways to solve this:
- make iput() asynchronous and defer the actual eviction like fput()
(may add overhead)
- make iput() only asynchronous if I_SYNC is set (doesn't solve random
things happening inside the "evict_inode" callback)
- add iput_deferred() to make this asynchronous behavior/overhead
optional and explicit
- refactor Ceph to avoid iput() calls from within writeback and
messenger (if that is even possible)
- add a Ceph-specific workaround
After advice from Mateusz Guzik, I decided to do the latter. The
implementation is simple because it piggybacks on the existing
work_struct for ceph_queue_inode_work() - ceph_inode_work() calls
iput() at the end which means we can donate the last reference to it.
This patch adds ceph_iput_async() and converts lots of iput() calls to
it - at least those that may come through writeback and the messenger.
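The "drop the reference unless it is the last one, otherwise hand the final put to a worker" idea can be modelled in userspace with C11 atomics. This is a sketch only: toy_add_unless() imitates the semantics of the kernel's atomic_add_unless(), and struct toy_inode, toy_iput_async() and toy_worker_run() are hypothetical names; a flag stands in for queue_work().

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/*
 * Userspace stand-in for atomic_add_unless(v, a, u): add a to *v
 * unless *v == u; return true if the add happened.
 */
bool toy_add_unless(atomic_int *v, int a, int u)
{
	int old = atomic_load(v);

	while (old != u) {
		/* On failure, old is refreshed with the current value. */
		if (atomic_compare_exchange_weak(v, &old, old + a))
			return true;
	}
	return false;
}

struct toy_inode {
	atomic_int i_count;	/* reference count */
	bool put_deferred;	/* hypothetical: replaces queueing real work */
};

/*
 * Returns true if the reference was dropped inline, false if the final
 * put was handed off (here: only flagged) for a worker to perform.
 */
bool toy_iput_async(struct toy_inode *inode)
{
	if (toy_add_unless(&inode->i_count, -1, 1))
		return true;	/* somebody else still holds a reference */

	inode->put_deferred = true;	/* the last reference is donated */
	return false;
}

/* What the worker would do later, in a context where blocking is safe. */
void toy_worker_run(struct toy_inode *inode)
{
	if (inode->put_deferred) {
		inode->put_deferred = false;
		atomic_fetch_sub(&inode->i_count, 1);	/* final put */
	}
}
```

The point of the model is that the caller never takes the count from 1 to 0 itself, so the potentially blocking eviction can only happen in the worker.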
Signed-off-by: Max Kellermann <max.kellermann(a)ionos.com>
Cc: Mateusz Guzik <mjguzik(a)gmail.com>
Cc: stable(a)vger.kernel.org
---
v1->v2: using atomic_add_unless() instead of atomic_add_unless() to
avoid letting i_count drop to zero which may cause races (thanks
Mateusz Guzik)
Signed-off-by: Max Kellermann <max.kellermann(a)ionos.com>
---
fs/ceph/addr.c | 2 +-
fs/ceph/caps.c | 16 ++++++++--------
fs/ceph/dir.c | 2 +-
fs/ceph/inode.c | 34 ++++++++++++++++++++++++++++++++++
fs/ceph/mds_client.c | 30 +++++++++++++++---------------
fs/ceph/quota.c | 4 ++--
fs/ceph/snap.c | 10 +++++-----
fs/ceph/super.h | 2 ++
8 files changed, 68 insertions(+), 32 deletions(-)
diff --git a/fs/ceph/addr.c b/fs/ceph/addr.c
index 322ed268f14a..fc497c91530e 100644
--- a/fs/ceph/addr.c
+++ b/fs/ceph/addr.c
@@ -265,7 +265,7 @@ static void finish_netfs_read(struct ceph_osd_request *req)
subreq->error = err;
trace_netfs_sreq(subreq, netfs_sreq_trace_io_progress);
netfs_read_subreq_terminated(subreq);
- iput(req->r_inode);
+ ceph_iput_async(req->r_inode);
ceph_dec_osd_stopping_blocker(fsc->mdsc);
}
diff --git a/fs/ceph/caps.c b/fs/ceph/caps.c
index b1a8ff612c41..af9e3ae9ab7e 100644
--- a/fs/ceph/caps.c
+++ b/fs/ceph/caps.c
@@ -1771,7 +1771,7 @@ void ceph_flush_snaps(struct ceph_inode_info *ci,
spin_unlock(&mdsc->snap_flush_lock);
if (need_put)
- iput(inode);
+ ceph_iput_async(inode);
}
/*
@@ -3319,7 +3319,7 @@ static void __ceph_put_cap_refs(struct ceph_inode_info *ci, int had,
if (wake)
wake_up_all(&ci->i_cap_wq);
while (put-- > 0)
- iput(inode);
+ ceph_iput_async(inode);
}
void ceph_put_cap_refs(struct ceph_inode_info *ci, int had)
@@ -3419,7 +3419,7 @@ void ceph_put_wrbuffer_cap_refs(struct ceph_inode_info *ci, int nr,
if (complete_capsnap)
wake_up_all(&ci->i_cap_wq);
while (put-- > 0) {
- iput(inode);
+ ceph_iput_async(inode);
}
}
@@ -3917,7 +3917,7 @@ static void handle_cap_flush_ack(struct inode *inode, u64 flush_tid,
if (wake_mdsc)
wake_up_all(&mdsc->cap_flushing_wq);
if (drop)
- iput(inode);
+ ceph_iput_async(inode);
}
void __ceph_remove_capsnap(struct inode *inode, struct ceph_cap_snap *capsnap,
@@ -4008,7 +4008,7 @@ static void handle_cap_flushsnap_ack(struct inode *inode, u64 flush_tid,
wake_up_all(&ci->i_cap_wq);
if (wake_mdsc)
wake_up_all(&mdsc->cap_flushing_wq);
- iput(inode);
+ ceph_iput_async(inode);
}
}
@@ -4557,7 +4557,7 @@ void ceph_handle_caps(struct ceph_mds_session *session,
done:
mutex_unlock(&session->s_mutex);
done_unlocked:
- iput(inode);
+ ceph_iput_async(inode);
out:
ceph_dec_mds_stopping_blocker(mdsc);
@@ -4636,7 +4636,7 @@ unsigned long ceph_check_delayed_caps(struct ceph_mds_client *mdsc)
doutc(cl, "on %p %llx.%llx\n", inode,
ceph_vinop(inode));
ceph_check_caps(ci, 0);
- iput(inode);
+ ceph_iput_async(inode);
spin_lock(&mdsc->cap_delay_lock);
}
@@ -4675,7 +4675,7 @@ static void flush_dirty_session_caps(struct ceph_mds_session *s)
spin_unlock(&mdsc->cap_dirty_lock);
ceph_wait_on_async_create(inode);
ceph_check_caps(ci, CHECK_CAPS_FLUSH);
- iput(inode);
+ ceph_iput_async(inode);
spin_lock(&mdsc->cap_dirty_lock);
}
spin_unlock(&mdsc->cap_dirty_lock);
diff --git a/fs/ceph/dir.c b/fs/ceph/dir.c
index 32973c62c1a2..ec73ed52a227 100644
--- a/fs/ceph/dir.c
+++ b/fs/ceph/dir.c
@@ -1290,7 +1290,7 @@ static void ceph_async_unlink_cb(struct ceph_mds_client *mdsc,
ceph_mdsc_free_path_info(&path_info);
}
out:
- iput(req->r_old_inode);
+ ceph_iput_async(req->r_old_inode);
ceph_mdsc_release_dir_caps(req);
}
diff --git a/fs/ceph/inode.c b/fs/ceph/inode.c
index f67025465de0..d7c0ed82bf62 100644
--- a/fs/ceph/inode.c
+++ b/fs/ceph/inode.c
@@ -2191,6 +2191,40 @@ void ceph_queue_inode_work(struct inode *inode, int work_bit)
}
}
+/**
+ * Queue an asynchronous iput() call in a worker thread. Use this
+ * instead of iput() in contexts where evicting the inode is unsafe.
+ * For example, inode eviction may cause deadlocks in
+ * inode_wait_for_writeback() (when called from within writeback) or
+ * in netfs_wait_for_outstanding_io() (when called from within the
+ * Ceph messenger).
+ */
+void ceph_iput_async(struct inode *inode)
+{
+ if (unlikely(!inode))
+ return;
+
+ if (likely(atomic_add_unless(&inode->i_count, -1, 1)))
+ /* somebody else is holding another reference -
+ * nothing left to do for us
+ */
+ return;
+
+ doutc(ceph_inode_to_fs_client(inode)->client, "%p %llx.%llx\n", inode, ceph_vinop(inode));
+
+ /* simply queue a ceph_inode_work() (donating the remaining
+ * reference) without setting i_work_mask bit; other than
+ * putting the reference, there is nothing to do
+ */
+ WARN_ON_ONCE(!queue_work(ceph_inode_to_fs_client(inode)->inode_wq,
+ &ceph_inode(inode)->i_work));
+
+ /* note: queue_work() cannot fail; if i_work were already
+ * queued, then it would be holding another reference, but no
+ * such reference exists
+ */
+}
+
static void ceph_do_invalidate_pages(struct inode *inode)
{
struct ceph_client *cl = ceph_inode_to_client(inode);
diff --git a/fs/ceph/mds_client.c b/fs/ceph/mds_client.c
index 3bc72b47fe4d..22d75c3be5a8 100644
--- a/fs/ceph/mds_client.c
+++ b/fs/ceph/mds_client.c
@@ -1097,14 +1097,14 @@ void ceph_mdsc_release_request(struct kref *kref)
ceph_msg_put(req->r_reply);
if (req->r_inode) {
ceph_put_cap_refs(ceph_inode(req->r_inode), CEPH_CAP_PIN);
- iput(req->r_inode);
+ ceph_iput_async(req->r_inode);
}
if (req->r_parent) {
ceph_put_cap_refs(ceph_inode(req->r_parent), CEPH_CAP_PIN);
- iput(req->r_parent);
+ ceph_iput_async(req->r_parent);
}
- iput(req->r_target_inode);
- iput(req->r_new_inode);
+ ceph_iput_async(req->r_target_inode);
+ ceph_iput_async(req->r_new_inode);
if (req->r_dentry)
dput(req->r_dentry);
if (req->r_old_dentry)
@@ -1118,7 +1118,7 @@ void ceph_mdsc_release_request(struct kref *kref)
*/
ceph_put_cap_refs(ceph_inode(req->r_old_dentry_dir),
CEPH_CAP_PIN);
- iput(req->r_old_dentry_dir);
+ ceph_iput_async(req->r_old_dentry_dir);
}
kfree(req->r_path1);
kfree(req->r_path2);
@@ -1240,7 +1240,7 @@ static void __unregister_request(struct ceph_mds_client *mdsc,
}
if (req->r_unsafe_dir) {
- iput(req->r_unsafe_dir);
+ ceph_iput_async(req->r_unsafe_dir);
req->r_unsafe_dir = NULL;
}
@@ -1413,7 +1413,7 @@ static int __choose_mds(struct ceph_mds_client *mdsc,
cap = rb_entry(rb_first(&ci->i_caps), struct ceph_cap, ci_node);
if (!cap) {
spin_unlock(&ci->i_ceph_lock);
- iput(inode);
+ ceph_iput_async(inode);
goto random;
}
mds = cap->session->s_mds;
@@ -1422,7 +1422,7 @@ static int __choose_mds(struct ceph_mds_client *mdsc,
cap == ci->i_auth_cap ? "auth " : "", cap);
spin_unlock(&ci->i_ceph_lock);
out:
- iput(inode);
+ ceph_iput_async(inode);
return mds;
random:
@@ -1841,7 +1841,7 @@ int ceph_iterate_session_caps(struct ceph_mds_session *session,
spin_unlock(&session->s_cap_lock);
if (last_inode) {
- iput(last_inode);
+ ceph_iput_async(last_inode);
last_inode = NULL;
}
if (old_cap) {
@@ -1874,7 +1874,7 @@ int ceph_iterate_session_caps(struct ceph_mds_session *session,
session->s_cap_iterator = NULL;
spin_unlock(&session->s_cap_lock);
- iput(last_inode);
+ ceph_iput_async(last_inode);
if (old_cap)
ceph_put_cap(session->s_mdsc, old_cap);
@@ -1904,7 +1904,7 @@ static int remove_session_caps_cb(struct inode *inode, int mds, void *arg)
if (invalidate)
ceph_queue_invalidate(inode);
while (iputs--)
- iput(inode);
+ ceph_iput_async(inode);
return 0;
}
@@ -1944,7 +1944,7 @@ static void remove_session_caps(struct ceph_mds_session *session)
spin_unlock(&session->s_cap_lock);
inode = ceph_find_inode(sb, vino);
- iput(inode);
+ ceph_iput_async(inode);
spin_lock(&session->s_cap_lock);
}
@@ -2512,7 +2512,7 @@ static void ceph_cap_unlink_work(struct work_struct *work)
doutc(cl, "on %p %llx.%llx\n", inode,
ceph_vinop(inode));
ceph_check_caps(ci, CHECK_CAPS_FLUSH);
- iput(inode);
+ ceph_iput_async(inode);
spin_lock(&mdsc->cap_delay_lock);
}
}
@@ -3933,7 +3933,7 @@ static void handle_reply(struct ceph_mds_session *session, struct ceph_msg *msg)
!req->r_reply_info.has_create_ino) {
/* This should never happen on an async create */
WARN_ON_ONCE(req->r_deleg_ino);
- iput(in);
+ ceph_iput_async(in);
in = NULL;
}
@@ -5313,7 +5313,7 @@ static void handle_lease(struct ceph_mds_client *mdsc,
out:
mutex_unlock(&session->s_mutex);
- iput(inode);
+ ceph_iput_async(inode);
ceph_dec_mds_stopping_blocker(mdsc);
return;
diff --git a/fs/ceph/quota.c b/fs/ceph/quota.c
index d90eda19bcc4..bba00f8926e6 100644
--- a/fs/ceph/quota.c
+++ b/fs/ceph/quota.c
@@ -76,7 +76,7 @@ void ceph_handle_quota(struct ceph_mds_client *mdsc,
le64_to_cpu(h->max_files));
spin_unlock(&ci->i_ceph_lock);
- iput(inode);
+ ceph_iput_async(inode);
out:
ceph_dec_mds_stopping_blocker(mdsc);
}
@@ -190,7 +190,7 @@ void ceph_cleanup_quotarealms_inodes(struct ceph_mds_client *mdsc)
node = rb_first(&mdsc->quotarealms_inodes);
qri = rb_entry(node, struct ceph_quotarealm_inode, node);
rb_erase(node, &mdsc->quotarealms_inodes);
- iput(qri->inode);
+ ceph_iput_async(qri->inode);
kfree(qri);
}
mutex_unlock(&mdsc->quotarealms_inodes_mutex);
diff --git a/fs/ceph/snap.c b/fs/ceph/snap.c
index c65f2b202b2b..19f097e79b3c 100644
--- a/fs/ceph/snap.c
+++ b/fs/ceph/snap.c
@@ -735,7 +735,7 @@ static void queue_realm_cap_snaps(struct ceph_mds_client *mdsc,
if (!inode)
continue;
spin_unlock(&realm->inodes_with_caps_lock);
- iput(lastinode);
+ ceph_iput_async(lastinode);
lastinode = inode;
/*
@@ -762,7 +762,7 @@ static void queue_realm_cap_snaps(struct ceph_mds_client *mdsc,
spin_lock(&realm->inodes_with_caps_lock);
}
spin_unlock(&realm->inodes_with_caps_lock);
- iput(lastinode);
+ ceph_iput_async(lastinode);
if (capsnap)
kmem_cache_free(ceph_cap_snap_cachep, capsnap);
@@ -955,7 +955,7 @@ static void flush_snaps(struct ceph_mds_client *mdsc)
ihold(inode);
spin_unlock(&mdsc->snap_flush_lock);
ceph_flush_snaps(ci, &session);
- iput(inode);
+ ceph_iput_async(inode);
spin_lock(&mdsc->snap_flush_lock);
}
spin_unlock(&mdsc->snap_flush_lock);
@@ -1116,12 +1116,12 @@ void ceph_handle_snap(struct ceph_mds_client *mdsc,
ceph_get_snap_realm(mdsc, realm);
ceph_change_snap_realm(inode, realm);
spin_unlock(&ci->i_ceph_lock);
- iput(inode);
+ ceph_iput_async(inode);
continue;
skip_inode:
spin_unlock(&ci->i_ceph_lock);
- iput(inode);
+ ceph_iput_async(inode);
}
/* we may have taken some of the old realm's children. */
diff --git a/fs/ceph/super.h b/fs/ceph/super.h
index cf176aab0f82..361a72a67bb8 100644
--- a/fs/ceph/super.h
+++ b/fs/ceph/super.h
@@ -1085,6 +1085,8 @@ static inline void ceph_queue_flush_snaps(struct inode *inode)
ceph_queue_inode_work(inode, CEPH_I_WORK_FLUSH_SNAPS);
}
+void ceph_iput_async(struct inode *inode);
+
extern int ceph_try_to_choose_auth_mds(struct inode *inode, int mask);
extern int __ceph_do_getattr(struct inode *inode, struct page *locked_page,
int mask, bool force);
--
2.47.3
Hi Stable,
Please provide a quote for your products:
Include:
1.Pricing (per unit)
2.Delivery cost & timeline
3.Quote expiry date
Deadline: September
Thanks!
Kamal Prasad
Albinayah Trading
The patch titled
Subject: selftests/mm: skip soft-dirty tests when CONFIG_MEM_SOFT_DIRTY is disabled
has been added to the -mm mm-new branch. Its filename is
selftests-mm-skip-soft-dirty-tests-when-config_mem_soft_dirty-is-disabled.patch
This patch will shortly appear at
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patche…
This patch will later appear in the mm-new branch at
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Note, mm-new is a provisional staging ground for work-in-progress
patches, and acceptance into mm-new is a notification for others to take
notice and to finish up reviews. Please do not hesitate to respond to
review feedback and post updated versions to replace or incrementally
fixup patches in mm-new.
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next via the mm-everything
branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there every 2-3 working days
------------------------------------------------------
From: Lance Yang <lance.yang(a)linux.dev>
Subject: selftests/mm: skip soft-dirty tests when CONFIG_MEM_SOFT_DIRTY is disabled
Date: Wed, 17 Sep 2025 21:31:37 +0800
The madv_populate and soft-dirty kselftests currently fail on systems
where CONFIG_MEM_SOFT_DIRTY is disabled.
Introduce a new helper softdirty_supported() into vm_util.c/h to ensure
tests are properly skipped when the feature is not enabled.
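The helper relies on the "sd" (soft-dirty) entry in the VmFlags line of /proc/<pid>/smaps. How check_vmflag() parses that line is not shown in this patch, so the following is only an assumed sketch of a whole-token match; vmflags_line_has() is a hypothetical name.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/*
 * Returns true if flag appears as a whole token in a "VmFlags:" line
 * as printed in /proc/<pid>/smaps, e.g. "VmFlags: rd wr mr mw me ac sd".
 */
bool vmflags_line_has(const char *line, const char *flag)
{
	static const char prefix[] = "VmFlags:";
	size_t flen = strlen(flag);
	const char *p;

	if (strncmp(line, prefix, sizeof(prefix) - 1) != 0)
		return false;

	p = line + sizeof(prefix) - 1;
	while (*p) {
		size_t tlen;

		p += strspn(p, " \t\n");	/* skip separators */
		tlen = strcspn(p, " \t\n");	/* length of the next token */
		if (tlen > 0 && tlen == flen && strncmp(p, flag, flen) == 0)
			return true;
		p += tlen;
	}
	return false;
}
```

Matching whole tokens rather than substrings avoids false positives from unrelated flags that share a letter.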
Link: https://lkml.kernel.org/r/20250917133137.62802-1-lance.yang@linux.dev
Fixes: 9f3265db6ae8 ("selftests: vm: add test for Soft-Dirty PTE bit")
Signed-off-by: Lance Yang <lance.yang(a)linux.dev>
Acked-by: David Hildenbrand <david(a)redhat.com>
Suggested-by: David Hildenbrand <david(a)redhat.com>
Cc: Lorenzo Stoakes <lorenzo.stoakes(a)oracle.com>
Cc: Shuah Khan <shuah(a)kernel.org>
Cc: Gabriel Krisman Bertazi <krisman(a)collabora.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
tools/testing/selftests/mm/madv_populate.c | 21 +------------------
tools/testing/selftests/mm/soft-dirty.c | 5 +++-
tools/testing/selftests/mm/vm_util.c | 17 +++++++++++++++
tools/testing/selftests/mm/vm_util.h | 1
4 files changed, 24 insertions(+), 20 deletions(-)
--- a/tools/testing/selftests/mm/madv_populate.c~selftests-mm-skip-soft-dirty-tests-when-config_mem_soft_dirty-is-disabled
+++ a/tools/testing/selftests/mm/madv_populate.c
@@ -264,23 +264,6 @@ static void test_softdirty(void)
munmap(addr, SIZE);
}
-static int system_has_softdirty(void)
-{
- /*
- * There is no way to check if the kernel supports soft-dirty, other
- * than by writing to a page and seeing if the bit was set. But the
- * tests are intended to check that the bit gets set when it should, so
- * doing that check would turn a potentially legitimate fail into a
- * skip. Fortunately, we know for sure that arm64 does not support
- * soft-dirty. So for now, let's just use the arch as a corse guide.
- */
-#if defined(__aarch64__)
- return 0;
-#else
- return 1;
-#endif
-}
-
int main(int argc, char **argv)
{
int nr_tests = 16;
@@ -288,7 +271,7 @@ int main(int argc, char **argv)
pagesize = getpagesize();
- if (system_has_softdirty())
+ if (softdirty_supported())
nr_tests += 5;
ksft_print_header();
@@ -300,7 +283,7 @@ int main(int argc, char **argv)
test_holes();
test_populate_read();
test_populate_write();
- if (system_has_softdirty())
+ if (softdirty_supported())
test_softdirty();
err = ksft_get_fail_cnt();
--- a/tools/testing/selftests/mm/soft-dirty.c~selftests-mm-skip-soft-dirty-tests-when-config_mem_soft_dirty-is-disabled
+++ a/tools/testing/selftests/mm/soft-dirty.c
@@ -200,8 +200,11 @@ int main(int argc, char **argv)
int pagesize;
ksft_print_header();
- ksft_set_plan(15);
+ if (!softdirty_supported())
+ ksft_exit_skip("soft-dirty is not supported\n");
+
+ ksft_set_plan(15);
pagemap_fd = open(PAGEMAP_FILE_PATH, O_RDONLY);
if (pagemap_fd < 0)
ksft_exit_fail_msg("Failed to open %s\n", PAGEMAP_FILE_PATH);
--- a/tools/testing/selftests/mm/vm_util.c~selftests-mm-skip-soft-dirty-tests-when-config_mem_soft_dirty-is-disabled
+++ a/tools/testing/selftests/mm/vm_util.c
@@ -449,6 +449,23 @@ bool check_vmflag_pfnmap(void *addr)
return check_vmflag(addr, "pf");
}
+bool softdirty_supported(void)
+{
+ char *addr;
+ bool supported = false;
+ const size_t pagesize = getpagesize();
+
+ /* New mappings are expected to be marked with VM_SOFTDIRTY (sd). */
+ addr = mmap(0, pagesize, PROT_READ | PROT_WRITE,
+ MAP_ANONYMOUS | MAP_PRIVATE, 0, 0);
+ if (addr == MAP_FAILED)
+ ksft_exit_fail_msg("mmap failed\n");
+
+ supported = check_vmflag(addr, "sd");
+ munmap(addr, pagesize);
+ return supported;
+}
+
/*
* Open an fd at /proc/$pid/maps and configure procmap_out ready for
* PROCMAP_QUERY query. Returns 0 on success, or an error code otherwise.
--- a/tools/testing/selftests/mm/vm_util.h~selftests-mm-skip-soft-dirty-tests-when-config_mem_soft_dirty-is-disabled
+++ a/tools/testing/selftests/mm/vm_util.h
@@ -104,6 +104,7 @@ bool find_vma_procmap(struct procmap_fd
int close_procmap(struct procmap_fd *procmap);
int write_sysfs(const char *file_path, unsigned long val);
int read_sysfs(const char *file_path, unsigned long *val);
+bool softdirty_supported(void);
static inline int open_self_procmap(struct procmap_fd *procmap_out)
{
_
Patches currently in -mm which might be from lance.yang(a)linux.dev are
hung_task-fix-warnings-caused-by-unaligned-lock-pointers.patch
mm-skip-mlocked-thps-that-are-underused-early-in-deferred_split_scan.patch
selftests-mm-skip-soft-dirty-tests-when-config_mem_soft_dirty-is-disabled.patch
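The softdirty_supported() helper in the patch above depends on check_vmflag() finding the "sd" flag for a fresh mapping. As a rough illustration of that kind of check, here is a minimal sketch that scans one "VmFlags:" line in the format used by /proc/<pid>/smaps. The name vmflags_line_has() is hypothetical and is not the selftests' check_vmflag(); the real helper also has to locate the right mapping first.

```c
#include <stdbool.h>
#include <string.h>

/*
 * Hypothetical helper (not the selftests' check_vmflag()): report whether a
 * "VmFlags:" line from /proc/<pid>/smaps lists the given flag token.
 */
bool vmflags_line_has(const char *line, const char *flag)
{
	static const char prefix[] = "VmFlags:";
	size_t flen = strlen(flag);
	const char *p;

	if (strncmp(line, prefix, sizeof(prefix) - 1) != 0)
		return false;

	p = line + sizeof(prefix) - 1;
	while (*p) {
		size_t tok;

		/* Skip separators between tokens. */
		while (*p == ' ' || *p == '\t' || *p == '\n')
			p++;

		/* Measure the current token; zero means end of line. */
		tok = strcspn(p, " \t\n");
		if (tok == 0)
			break;

		/* Match whole tokens only, so "sd" does not match "sdx". */
		if (tok == flen && strncmp(p, flag, flen) == 0)
			return true;
		p += tok;
	}
	return false;
}
```

With CONFIG_MEM_SOFT_DIRTY disabled the "sd" token should never appear, which is what lets the tests skip instead of fail.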
From: Frédéric Danis <frederic.danis(a)collabora.com>
OBEX download from iPhone is currently slow due to small packet size
used to transfer data which doesn't follow the MTU negotiated during
L2CAP connection, i.e. 672 bytes instead of 32767:
< ACL Data TX: Handle 11 flags 0x00 dlen 12
L2CAP: Connection Request (0x02) ident 18 len 4
PSM: 4103 (0x1007)
Source CID: 72
> ACL Data RX: Handle 11 flags 0x02 dlen 16
L2CAP: Connection Response (0x03) ident 18 len 8
Destination CID: 14608
Source CID: 72
Result: Connection successful (0x0000)
Status: No further information available (0x0000)
< ACL Data TX: Handle 11 flags 0x00 dlen 27
L2CAP: Configure Request (0x04) ident 20 len 19
Destination CID: 14608
Flags: 0x0000
Option: Maximum Transmission Unit (0x01) [mandatory]
MTU: 32767
Option: Retransmission and Flow Control (0x04) [mandatory]
Mode: Enhanced Retransmission (0x03)
TX window size: 63
Max transmit: 3
Retransmission timeout: 2000
Monitor timeout: 12000
Maximum PDU size: 1009
> ACL Data RX: Handle 11 flags 0x02 dlen 26
L2CAP: Configure Request (0x04) ident 72 len 18
Destination CID: 72
Flags: 0x0000
Option: Retransmission and Flow Control (0x04) [mandatory]
Mode: Enhanced Retransmission (0x03)
TX window size: 32
Max transmit: 255
Retransmission timeout: 0
Monitor timeout: 0
Maximum PDU size: 65527
Option: Frame Check Sequence (0x05) [mandatory]
FCS: 16-bit FCS (0x01)
< ACL Data TX: Handle 11 flags 0x00 dlen 29
L2CAP: Configure Response (0x05) ident 72 len 21
Source CID: 14608
Flags: 0x0000
Result: Success (0x0000)
Option: Maximum Transmission Unit (0x01) [mandatory]
MTU: 672
Option: Retransmission and Flow Control (0x04) [mandatory]
Mode: Enhanced Retransmission (0x03)
TX window size: 32
Max transmit: 255
Retransmission timeout: 2000
Monitor timeout: 12000
Maximum PDU size: 1009
> ACL Data RX: Handle 11 flags 0x02 dlen 32
L2CAP: Configure Response (0x05) ident 20 len 24
Source CID: 72
Flags: 0x0000
Result: Success (0x0000)
Option: Maximum Transmission Unit (0x01) [mandatory]
MTU: 32767
Option: Retransmission and Flow Control (0x04) [mandatory]
Mode: Enhanced Retransmission (0x03)
TX window size: 63
Max transmit: 3
Retransmission timeout: 2000
Monitor timeout: 12000
Maximum PDU size: 1009
Option: Frame Check Sequence (0x05) [mandatory]
FCS: 16-bit FCS (0x01)
...
> ACL Data RX: Handle 11 flags 0x02 dlen 680
Channel: 72 len 676 ctrl 0x0202 [PSM 4103 mode Enhanced Retransmission (0x03)] {chan 8}
I-frame: Unsegmented TxSeq 1 ReqSeq 2
< ACL Data TX: Handle 11 flags 0x00 dlen 13
Channel: 14608 len 9 ctrl 0x0204 [PSM 4103 mode Enhanced Retransmission (0x03)] {chan 8}
I-frame: Unsegmented TxSeq 2 ReqSeq 2
> ACL Data RX: Handle 11 flags 0x02 dlen 680
Channel: 72 len 676 ctrl 0x0304 [PSM 4103 mode Enhanced Retransmission (0x03)] {chan 8}
I-frame: Unsegmented TxSeq 2 ReqSeq 3
The MTUs are negotiated for each direction. In these traces, 32767 is
negotiated for iPhone->localhost and no MTU is provided for
localhost->iPhone, which, based on '4.4 L2CAP_CONFIGURATION_REQ' (Core
specification v5.4, Vol. 3, Part A):
The only parameters that should be included in the
L2CAP_CONFIGURATION_REQ packet are those that require different
values than the default or previously agreed values.
...
Any missing configuration parameters are assumed to have their
most recently explicitly or implicitly accepted values.
and '5.1 Maximum transmission unit (MTU)':
If the remote device sends a positive L2CAP_CONFIGURATION_RSP
packet it should include the actual MTU to be used on this channel
for traffic flowing into the local device.
...
The default value is 672 octets.
is set by BlueZ to 672 bytes.
It seems that the iPhone used the lowest negotiated value to transfer
data to the localhost instead of the one negotiated for the incoming
direction.
This could be fixed by using the MTU negotiated for the other
direction, if it exists, in the L2CAP_CONFIGURATION_RSP.
This allows segmented packets to be used, as in the following traces:
< ACL Data TX: Handle 11 flags 0x00 dlen 12
L2CAP: Connection Request (0x02) ident 22 len 4
PSM: 4103 (0x1007)
Source CID: 72
< ACL Data TX: Handle 11 flags 0x00 dlen 27
L2CAP: Configure Request (0x04) ident 24 len 19
Destination CID: 2832
Flags: 0x0000
Option: Maximum Transmission Unit (0x01) [mandatory]
MTU: 32767
Option: Retransmission and Flow Control (0x04) [mandatory]
Mode: Enhanced Retransmission (0x03)
TX window size: 63
Max transmit: 3
Retransmission timeout: 2000
Monitor timeout: 12000
Maximum PDU size: 1009
> ACL Data RX: Handle 11 flags 0x02 dlen 26
L2CAP: Configure Request (0x04) ident 15 len 18
Destination CID: 72
Flags: 0x0000
Option: Retransmission and Flow Control (0x04) [mandatory]
Mode: Enhanced Retransmission (0x03)
TX window size: 32
Max transmit: 255
Retransmission timeout: 0
Monitor timeout: 0
Maximum PDU size: 65527
Option: Frame Check Sequence (0x05) [mandatory]
FCS: 16-bit FCS (0x01)
< ACL Data TX: Handle 11 flags 0x00 dlen 29
L2CAP: Configure Response (0x05) ident 15 len 21
Source CID: 2832
Flags: 0x0000
Result: Success (0x0000)
Option: Maximum Transmission Unit (0x01) [mandatory]
MTU: 32767
Option: Retransmission and Flow Control (0x04) [mandatory]
Mode: Enhanced Retransmission (0x03)
TX window size: 32
Max transmit: 255
Retransmission timeout: 2000
Monitor timeout: 12000
Maximum PDU size: 1009
> ACL Data RX: Handle 11 flags 0x02 dlen 32
L2CAP: Configure Response (0x05) ident 24 len 24
Source CID: 72
Flags: 0x0000
Result: Success (0x0000)
Option: Maximum Transmission Unit (0x01) [mandatory]
MTU: 32767
Option: Retransmission and Flow Control (0x04) [mandatory]
Mode: Enhanced Retransmission (0x03)
TX window size: 63
Max transmit: 3
Retransmission timeout: 2000
Monitor timeout: 12000
Maximum PDU size: 1009
Option: Frame Check Sequence (0x05) [mandatory]
FCS: 16-bit FCS (0x01)
...
> ACL Data RX: Handle 11 flags 0x02 dlen 1009
Channel: 72 len 1005 ctrl 0x4202 [PSM 4103 mode Enhanced Retransmission (0x03)] {chan 8}
I-frame: Start (len 21884) TxSeq 1 ReqSeq 2
> ACL Data RX: Handle 11 flags 0x02 dlen 1009
Channel: 72 len 1005 ctrl 0xc204 [PSM 4103 mode Enhanced Retransmission (0x03)] {chan 8}
I-frame: Continuation TxSeq 2 ReqSeq 2
This has been tested with kernel 5.4 and BlueZ 5.77.
Cc: stable(a)vger.kernel.org
Signed-off-by: Frédéric Danis <frederic.danis(a)collabora.com>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz(a)intel.com>
---
net/bluetooth/l2cap_core.c | 9 ++++++++-
1 file changed, 8 insertions(+), 1 deletion(-)
diff --git a/net/bluetooth/l2cap_core.c b/net/bluetooth/l2cap_core.c
index a5bde5db58ef..40daa38276f3 100644
--- a/net/bluetooth/l2cap_core.c
+++ b/net/bluetooth/l2cap_core.c
@@ -3415,7 +3415,7 @@ static int l2cap_parse_conf_req(struct l2cap_chan *chan, void *data, size_t data
struct l2cap_conf_rfc rfc = { .mode = L2CAP_MODE_BASIC };
struct l2cap_conf_efs efs;
u8 remote_efs = 0;
- u16 mtu = L2CAP_DEFAULT_MTU;
+ u16 mtu = 0;
u16 result = L2CAP_CONF_SUCCESS;
u16 size;
@@ -3520,6 +3520,13 @@ static int l2cap_parse_conf_req(struct l2cap_chan *chan, void *data, size_t data
/* Configure output options and let the other side know
* which ones we don't like. */
+ /* If MTU is not provided in configure request, use the most recently
+ * explicitly or implicitly accepted value for the other direction,
+ * or the default value.
+ */
+ if (mtu == 0)
+ mtu = chan->imtu ? chan->imtu : L2CAP_DEFAULT_MTU;
+
if (mtu < L2CAP_DEFAULT_MIN_MTU)
result = L2CAP_CONF_UNACCEPT;
else {
--
2.45.2
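The MTU selection added by the patch above can be modeled outside the kernel. The sketch below is a simplified, stand-alone illustration: pick_response_mtu() and DEFAULT_MTU are hypothetical names, with DEFAULT_MTU standing in for the 672-octet default quoted from the specification, and imtu standing in for chan->imtu.

```c
#include <stdint.h>

#define DEFAULT_MTU 672  /* default L2CAP MTU in octets, as quoted above */

/*
 * Hypothetical model of the selection in l2cap_parse_conf_req():
 * requested_mtu is the MTU option from the peer's Configure Request
 * (0 when the option is absent); imtu is the MTU already negotiated
 * for the other direction (0 when none is known).
 */
uint16_t pick_response_mtu(uint16_t requested_mtu, uint16_t imtu)
{
	if (requested_mtu != 0)
		return requested_mtu;

	/* No MTU option in the request: reuse the other direction's value. */
	return imtu ? imtu : DEFAULT_MTU;
}
```

In the first trace the request carried no MTU option and the response advertised 672; with this selection, the response would carry 32767, matching the second trace.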
On Wed, Sep 17, 2025 at 10:03 AM Andrei Vagin <avagin(a)google.com> wrote:
>
> is
>
> On Wed, Sep 17, 2025 at 8:59 AM Eric Dumazet <edumazet(a)google.com> wrote:
> >
> > On Wed, Sep 17, 2025 at 8:39 AM Andrei Vagin <avagin(a)google.com> wrote:
> > >
> > > On Wed, Sep 17, 2025 at 6:53 AM Eric Dumazet <edumazet(a)google.com> wrote:
> > > >
> > > > Andrei Vagin reported that blamed commit broke CRIU.
> > > >
> > > > Indeed, while we want to keep sk_uid unchanged when a socket
> > > > is cloned, we want to clear sk->sk_ino.
> > > >
> > > > Otherwise, sock_diag might report multiple sockets sharing
> > > > the same inode number.
> > > >
> > > > Move the clearing part from sock_orphan() to sk_set_socket(sk, NULL),
> > > > called both from sock_orphan() and sk_clone_lock().
> > > >
> > > > Fixes: 5d6b58c932ec ("net: lockless sock_i_ino()")
> > > > Closes: https://lore.kernel.org/netdev/aMhX-VnXkYDpKd9V@google.com/
> > > > Closes: https://github.com/checkpoint-restore/criu/issues/2744
> > > > Reported-by: Andrei Vagin <avagin(a)google.com>
> > > > Signed-off-by: Eric Dumazet <edumazet(a)google.com>
> > >
> > > Acked-by: Andrei Vagin <avagin(a)google.com>
> > > I think we need to add `Cc: stable(a)vger.kernel.org`.
> >
> > I never do this. Note that the prior patch had no such CC.
>
> The original patch has been ported to the v6.16 kernels. According to the
> kernel documentation
> (https://www.kernel.org/doc/html/v6.5/process/stable-kernel-rules.html),
> adding Cc: stable(a)vger.kernel.org is required for automatic porting into
> stable trees. Without this tag, someone will likely need to manually request
> that this patch be ported. This is my understanding of how the stable
> branch process works, sorry if I missed something.
Andrei, I think I know pretty well what I am doing. You do not have to
explain to me anything.
Thank you.
Running sha224_kunit on a KMSAN-enabled kernel results in a crash in
kmsan_internal_set_shadow_origin():
BUG: unable to handle page fault for address: ffffbc3840291000
#PF: supervisor read access in kernel mode
#PF: error_code(0x0000) - not-present page
PGD 1810067 P4D 1810067 PUD 192d067 PMD 3c17067 PTE 0
Oops: 0000 [#1] SMP NOPTI
CPU: 0 UID: 0 PID: 81 Comm: kunit_try_catch Tainted: G N 6.17.0-rc3 #10 PREEMPT(voluntary)
Tainted: [N]=TEST
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.17.0-0-gb52ca86e094d-prebuilt.qemu.org 04/01/2014
RIP: 0010:kmsan_internal_set_shadow_origin+0x91/0x100
[...]
Call Trace:
<TASK>
__msan_memset+0xee/0x1a0
sha224_final+0x9e/0x350
test_hash_buffer_overruns+0x46f/0x5f0
? kmsan_get_shadow_origin_ptr+0x46/0xa0
? __pfx_test_hash_buffer_overruns+0x10/0x10
kunit_try_run_case+0x198/0xa00
This occurs when memset() is called on a buffer that is not 4-byte
aligned and extends to the end of a guard page, i.e. the next page is
unmapped.
The bug is that the loop at the end of
kmsan_internal_set_shadow_origin() accesses the wrong shadow memory
bytes when the address is not 4-byte aligned. Since each 4 bytes are
associated with an origin, it rounds the address and size so that it can
access all the origins that contain the buffer. However, when it checks
the corresponding shadow bytes for a particular origin, it incorrectly
uses the original unrounded shadow address. This results in reads from
shadow memory beyond the end of the buffer's shadow memory, which
crashes when that memory is not mapped.
To fix this, correctly align the shadow address before accessing the 4
shadow bytes corresponding to each origin.
Fixes: 2ef3cec44c60 ("kmsan: do not wipe out origin when doing partial unpoisoning")
Cc: stable(a)vger.kernel.org
Signed-off-by: Eric Biggers <ebiggers(a)kernel.org>
---
mm/kmsan/core.c | 10 +++++++---
1 file changed, 7 insertions(+), 3 deletions(-)
diff --git a/mm/kmsan/core.c b/mm/kmsan/core.c
index 1ea711786c522..8bca7fece47f0 100644
--- a/mm/kmsan/core.c
+++ b/mm/kmsan/core.c
@@ -193,11 +193,12 @@ depot_stack_handle_t kmsan_internal_chain_origin(depot_stack_handle_t id)
void kmsan_internal_set_shadow_origin(void *addr, size_t size, int b,
u32 origin, bool checked)
{
u64 address = (u64)addr;
- u32 *shadow_start, *origin_start;
+ void *shadow_start;
+ u32 *aligned_shadow, *origin_start;
size_t pad = 0;
KMSAN_WARN_ON(!kmsan_metadata_is_contiguous(addr, size));
shadow_start = kmsan_get_metadata(addr, KMSAN_META_SHADOW);
if (!shadow_start) {
@@ -212,13 +213,16 @@ void kmsan_internal_set_shadow_origin(void *addr, size_t size, int b,
}
return;
}
__memset(shadow_start, b, size);
- if (!IS_ALIGNED(address, KMSAN_ORIGIN_SIZE)) {
+ if (IS_ALIGNED(address, KMSAN_ORIGIN_SIZE)) {
+ aligned_shadow = shadow_start;
+ } else {
pad = address % KMSAN_ORIGIN_SIZE;
address -= pad;
+ aligned_shadow = shadow_start - pad;
size += pad;
}
size = ALIGN(size, KMSAN_ORIGIN_SIZE);
origin_start =
(u32 *)kmsan_get_metadata((void *)address, KMSAN_META_ORIGIN);
@@ -228,11 +232,11 @@ void kmsan_internal_set_shadow_origin(void *addr, size_t size, int b,
* and unconditionally overwrite the old origin slot.
* If the new origin is zero, overwrite the old origin slot iff the
* corresponding shadow slot is zero.
*/
for (int i = 0; i < size / KMSAN_ORIGIN_SIZE; i++) {
- if (origin || !shadow_start[i])
+ if (origin || !aligned_shadow[i])
origin_start[i] = origin;
}
}
struct page *kmsan_vmalloc_to_page_or_null(void *vaddr)
base-commit: 1b237f190eb3d36f52dffe07a40b5eb210280e00
--
2.50.1
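The arithmetic behind the overrun described above can be checked with a small user-space model. This is only a sketch under simplifying assumptions: ORIGIN_SIZE stands in for KMSAN_ORIGIN_SIZE, and origin_slots() and shadow_overrun() are hypothetical helpers that count bytes rather than touch real shadow memory.

```c
#include <stddef.h>
#include <stdint.h>

#define ORIGIN_SIZE 4u  /* stands in for KMSAN_ORIGIN_SIZE */

/* Number of origin slots covering [address, address + size). */
size_t origin_slots(uint64_t address, size_t size)
{
	size_t pad = address % ORIGIN_SIZE;
	size_t total = size + pad;

	return (total + ORIGIN_SIZE - 1) / ORIGIN_SIZE;
}

/*
 * Bytes the origin loop reads beyond the end of the buffer's shadow.
 * aligned == 0 models indexing from the unrounded shadow pointer (the bug);
 * aligned != 0 models indexing from the pointer rounded down by pad (the fix).
 */
size_t shadow_overrun(uint64_t address, size_t size, int aligned)
{
	size_t pad = address % ORIGIN_SIZE;
	size_t span = origin_slots(address, size) * ORIGIN_SIZE;
	/* End of the region read, relative to the shadow of address. */
	size_t end = aligned ? span - pad : span;

	return end > size ? end - size : 0;
}
```

For a 3-byte buffer at 0x1ffd that ends exactly on a page boundary, the unrounded pointer reads one byte into the next page's shadow, while the rounded pointer stays inside the buffer's shadow.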
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.4.y
git checkout FETCH_HEAD
git cherry-pick -x 535fd4c98452c87537a40610abba45daf5761ec6
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025091722-chatter-dyslexia-7db3@gregkh' --subject-prefix 'PATCH 5.4.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 535fd4c98452c87537a40610abba45daf5761ec6 Mon Sep 17 00:00:00 2001
From: Hugo Villeneuve <hvilleneuve(a)dimonoff.com>
Date: Thu, 31 Jul 2025 08:44:50 -0400
Subject: [PATCH] serial: sc16is7xx: fix bug in flow control levels init
When trying to set MCR[2], XON1 is incorrectly accessed instead. And when
writing to the TCR register to configure flow control levels, we are
incorrectly writing to the MSR register. The default value of $00 is then
used for TCR, which means that selectable trigger levels in FCR are used
in place of TCR.
TCR/TLR access requires EFR[4] (enable enhanced functions) and MCR[2]
to be set. EFR[4] is already set in probe().
MCR access requires LCR[7] to be zero.
Since LCR is set to $BF when trying to set MCR[2], XON1 is incorrectly
accessed instead because MCR shares the same address space as XON1.
Since MCR[2] is unmodified and still zero, when writing to TCR we are in
fact writing to MSR because TCR/TLR registers share the same address space
as MSR/SPR.
Fix by first removing useless reconfiguration of EFR[4] (enable enhanced
functions), as it is already enabled in sc16is7xx_probe() since commit
43c51bb573aa ("sc16is7xx: make sure device is in suspend once probed").
Now LCR is $00, which means that MCR access is enabled.
Also remove regcache_cache_bypass() calls since we no longer access the
enhanced registers set, and TCR is already declared as volatile (in fact
by declaring MSR as volatile, which shares the same address).
Finally disable access to TCR/TLR registers after modifying them by
clearing MCR[2].
Note: the comment about "... and internal clock div" is wrong and can be
ignored/removed as access to internal clock div registers (DLL/DLH)
is permitted only when LCR[7] is logic 1, not when enhanced features
are enabled. And DLL/DLH access is not needed in sc16is7xx_startup().
Fixes: dfeae619d781 ("serial: sc16is7xx")
Cc: stable(a)vger.kernel.org
Signed-off-by: Hugo Villeneuve <hvilleneuve(a)dimonoff.com>
Link: https://lore.kernel.org/r/20250731124451.1108864-1-hugo@hugovil.com
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
diff --git a/drivers/tty/serial/sc16is7xx.c b/drivers/tty/serial/sc16is7xx.c
index 3f38fba8f6ea..a668e0bb26b3 100644
--- a/drivers/tty/serial/sc16is7xx.c
+++ b/drivers/tty/serial/sc16is7xx.c
@@ -1177,17 +1177,6 @@ static int sc16is7xx_startup(struct uart_port *port)
sc16is7xx_port_write(port, SC16IS7XX_FCR_REG,
SC16IS7XX_FCR_FIFO_BIT);
- /* Enable EFR */
- sc16is7xx_port_write(port, SC16IS7XX_LCR_REG,
- SC16IS7XX_LCR_CONF_MODE_B);
-
- regcache_cache_bypass(one->regmap, true);
-
- /* Enable write access to enhanced features and internal clock div */
- sc16is7xx_port_update(port, SC16IS7XX_EFR_REG,
- SC16IS7XX_EFR_ENABLE_BIT,
- SC16IS7XX_EFR_ENABLE_BIT);
-
/* Enable TCR/TLR */
sc16is7xx_port_update(port, SC16IS7XX_MCR_REG,
SC16IS7XX_MCR_TCRTLR_BIT,
@@ -1199,7 +1188,8 @@ static int sc16is7xx_startup(struct uart_port *port)
SC16IS7XX_TCR_RX_RESUME(24) |
SC16IS7XX_TCR_RX_HALT(48));
- regcache_cache_bypass(one->regmap, false);
+ /* Disable TCR/TLR access */
+ sc16is7xx_port_update(port, SC16IS7XX_MCR_REG, SC16IS7XX_MCR_TCRTLR_BIT, 0);
/* Now, initialize the UART */
sc16is7xx_port_write(port, SC16IS7XX_LCR_REG, SC16IS7XX_LCR_WORD_LEN_8);
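The register aliasing described in the commit message is easier to follow with a toy model. The sketch below is not driver code: struct toy_uart and the toy_* and decode_* names are hypothetical, and the decode rules are a simplification of what the commit message states (MCR shares an address with XON1, TCR shares an address with MSR, and TCR needs both EFR[4] and MCR[2] set).

```c
#include <stdint.h>

/* Illustrative register names for a toy model; not the driver's definitions. */
enum toy_reg { TOY_MCR, TOY_XON1, TOY_MSR, TOY_TCR };

struct toy_uart {
	uint8_t lcr;  /* 0xBF selects the enhanced register set */
	uint8_t efr;  /* bit 4: enhanced functions enabled */
	uint8_t mcr;  /* bit 2: TCR/TLR access enabled */
};

/* Which register an access to the shared MCR/XON1 address really hits. */
enum toy_reg decode_mcr_addr(const struct toy_uart *u)
{
	return u->lcr == 0xBF ? TOY_XON1 : TOY_MCR;
}

/* Which register an access to the shared MSR/TCR address really hits. */
enum toy_reg decode_tcr_addr(const struct toy_uart *u)
{
	int enhanced = u->efr & (1u << 4);
	int tcr_enabled = u->mcr & (1u << 2);

	return (enhanced && tcr_enabled) ? TOY_TCR : TOY_MSR;
}

/* Try to set MCR[2]; the write only lands in MCR when the decode allows it. */
void toy_set_mcr2(struct toy_uart *u)
{
	if (decode_mcr_addr(u) == TOY_MCR)
		u->mcr |= 1u << 2;
}
```

With lcr left at 0xBF, toy_set_mcr2() changes nothing and a later TCR write decodes as MSR, which is the failure the patch describes. With lcr at 0x00 and EFR[4] already set, the same sequence reaches TCR.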
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.10.y
git checkout FETCH_HEAD
git cherry-pick -x 535fd4c98452c87537a40610abba45daf5761ec6
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025091721-speak-detoxify-e6fe@gregkh' --subject-prefix 'PATCH 5.10.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 535fd4c98452c87537a40610abba45daf5761ec6 Mon Sep 17 00:00:00 2001
From: Hugo Villeneuve <hvilleneuve(a)dimonoff.com>
Date: Thu, 31 Jul 2025 08:44:50 -0400
Subject: [PATCH] serial: sc16is7xx: fix bug in flow control levels init
When trying to set MCR[2], XON1 is incorrectly accessed instead. And when
writing to the TCR register to configure flow control levels, we are
incorrectly writing to the MSR register. The default value of $00 is then
used for TCR, which means that selectable trigger levels in FCR are used
in place of TCR.
TCR/TLR access requires EFR[4] (enable enhanced functions) and MCR[2]
to be set. EFR[4] is already set in probe().
MCR access requires LCR[7] to be zero.
Since LCR is set to $BF when trying to set MCR[2], XON1 is incorrectly
accessed instead because MCR shares the same address space as XON1.
Since MCR[2] is unmodified and still zero, when writing to TCR we are in
fact writing to MSR because TCR/TLR registers share the same address space
as MSR/SPR.
Fix by first removing useless reconfiguration of EFR[4] (enable enhanced
functions), as it is already enabled in sc16is7xx_probe() since commit
43c51bb573aa ("sc16is7xx: make sure device is in suspend once probed").
Now LCR is $00, which means that MCR access is enabled.
Also remove regcache_cache_bypass() calls since we no longer access the
enhanced registers set, and TCR is already declared as volatile (in fact
by declaring MSR as volatile, which shares the same address).
Finally disable access to TCR/TLR registers after modifying them by
clearing MCR[2].
Note: the comment about "... and internal clock div" is wrong and can be
ignored/removed as access to internal clock div registers (DLL/DLH)
is permitted only when LCR[7] is logic 1, not when enhanced features
are enabled. And DLL/DLH access is not needed in sc16is7xx_startup().
Fixes: dfeae619d781 ("serial: sc16is7xx")
Cc: stable(a)vger.kernel.org
Signed-off-by: Hugo Villeneuve <hvilleneuve(a)dimonoff.com>
Link: https://lore.kernel.org/r/20250731124451.1108864-1-hugo@hugovil.com
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
diff --git a/drivers/tty/serial/sc16is7xx.c b/drivers/tty/serial/sc16is7xx.c
index 3f38fba8f6ea..a668e0bb26b3 100644
--- a/drivers/tty/serial/sc16is7xx.c
+++ b/drivers/tty/serial/sc16is7xx.c
@@ -1177,17 +1177,6 @@ static int sc16is7xx_startup(struct uart_port *port)
sc16is7xx_port_write(port, SC16IS7XX_FCR_REG,
SC16IS7XX_FCR_FIFO_BIT);
- /* Enable EFR */
- sc16is7xx_port_write(port, SC16IS7XX_LCR_REG,
- SC16IS7XX_LCR_CONF_MODE_B);
-
- regcache_cache_bypass(one->regmap, true);
-
- /* Enable write access to enhanced features and internal clock div */
- sc16is7xx_port_update(port, SC16IS7XX_EFR_REG,
- SC16IS7XX_EFR_ENABLE_BIT,
- SC16IS7XX_EFR_ENABLE_BIT);
-
/* Enable TCR/TLR */
sc16is7xx_port_update(port, SC16IS7XX_MCR_REG,
SC16IS7XX_MCR_TCRTLR_BIT,
@@ -1199,7 +1188,8 @@ static int sc16is7xx_startup(struct uart_port *port)
SC16IS7XX_TCR_RX_RESUME(24) |
SC16IS7XX_TCR_RX_HALT(48));
- regcache_cache_bypass(one->regmap, false);
+ /* Disable TCR/TLR access */
+ sc16is7xx_port_update(port, SC16IS7XX_MCR_REG, SC16IS7XX_MCR_TCRTLR_BIT, 0);
/* Now, initialize the UART */
sc16is7xx_port_write(port, SC16IS7XX_LCR_REG, SC16IS7XX_LCR_WORD_LEN_8);
The patch below does not apply to the 5.15-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.15.y
git checkout FETCH_HEAD
git cherry-pick -x 535fd4c98452c87537a40610abba45daf5761ec6
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025091721-moonwalk-barrack-423e@gregkh' --subject-prefix 'PATCH 5.15.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 535fd4c98452c87537a40610abba45daf5761ec6 Mon Sep 17 00:00:00 2001
From: Hugo Villeneuve <hvilleneuve(a)dimonoff.com>
Date: Thu, 31 Jul 2025 08:44:50 -0400
Subject: [PATCH] serial: sc16is7xx: fix bug in flow control levels init
When trying to set MCR[2], XON1 is incorrectly accessed instead. And when
writing to the TCR register to configure flow control levels, we are
incorrectly writing to the MSR register. The default value of $00 is then
used for TCR, which means that selectable trigger levels in FCR are used
in place of TCR.
TCR/TLR access requires EFR[4] (enable enhanced functions) and MCR[2]
to be set. EFR[4] is already set in probe().
MCR access requires LCR[7] to be zero.
Since LCR is set to $BF when trying to set MCR[2], XON1 is incorrectly
accessed instead because MCR shares the same address space as XON1.
Since MCR[2] is unmodified and still zero, when writing to TCR we are in
fact writing to MSR because TCR/TLR registers share the same address space
as MSR/SPR.
Fix by first removing useless reconfiguration of EFR[4] (enable enhanced
functions), as it is already enabled in sc16is7xx_probe() since commit
43c51bb573aa ("sc16is7xx: make sure device is in suspend once probed").
Now LCR is $00, which means that MCR access is enabled.
Also remove regcache_cache_bypass() calls since we no longer access the
enhanced registers set, and TCR is already declared as volatile (in fact
by declaring MSR as volatile, which shares the same address).
Finally disable access to TCR/TLR registers after modifying them by
clearing MCR[2].
Note: the comment about "... and internal clock div" is wrong and can be
ignored/removed as access to internal clock div registers (DLL/DLH)
is permitted only when LCR[7] is logic 1, not when enhanced features
are enabled. And DLL/DLH access is not needed in sc16is7xx_startup().
Fixes: dfeae619d781 ("serial: sc16is7xx")
Cc: stable(a)vger.kernel.org
Signed-off-by: Hugo Villeneuve <hvilleneuve(a)dimonoff.com>
Link: https://lore.kernel.org/r/20250731124451.1108864-1-hugo@hugovil.com
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
diff --git a/drivers/tty/serial/sc16is7xx.c b/drivers/tty/serial/sc16is7xx.c
index 3f38fba8f6ea..a668e0bb26b3 100644
--- a/drivers/tty/serial/sc16is7xx.c
+++ b/drivers/tty/serial/sc16is7xx.c
@@ -1177,17 +1177,6 @@ static int sc16is7xx_startup(struct uart_port *port)
sc16is7xx_port_write(port, SC16IS7XX_FCR_REG,
SC16IS7XX_FCR_FIFO_BIT);
- /* Enable EFR */
- sc16is7xx_port_write(port, SC16IS7XX_LCR_REG,
- SC16IS7XX_LCR_CONF_MODE_B);
-
- regcache_cache_bypass(one->regmap, true);
-
- /* Enable write access to enhanced features and internal clock div */
- sc16is7xx_port_update(port, SC16IS7XX_EFR_REG,
- SC16IS7XX_EFR_ENABLE_BIT,
- SC16IS7XX_EFR_ENABLE_BIT);
-
/* Enable TCR/TLR */
sc16is7xx_port_update(port, SC16IS7XX_MCR_REG,
SC16IS7XX_MCR_TCRTLR_BIT,
@@ -1199,7 +1188,8 @@ static int sc16is7xx_startup(struct uart_port *port)
SC16IS7XX_TCR_RX_RESUME(24) |
SC16IS7XX_TCR_RX_HALT(48));
- regcache_cache_bypass(one->regmap, false);
+ /* Disable TCR/TLR access */
+ sc16is7xx_port_update(port, SC16IS7XX_MCR_REG, SC16IS7XX_MCR_TCRTLR_BIT, 0);
/* Now, initialize the UART */
sc16is7xx_port_write(port, SC16IS7XX_LCR_REG, SC16IS7XX_LCR_WORD_LEN_8);
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.10.y
git checkout FETCH_HEAD
git cherry-pick -x a5c98e8b1398534ae1feb6e95e2d3ee5215538ed
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025091731-subtotal-outcome-6092@gregkh' --subject-prefix 'PATCH 5.10.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From a5c98e8b1398534ae1feb6e95e2d3ee5215538ed Mon Sep 17 00:00:00 2001
From: Mathias Nyman <mathias.nyman(a)linux.intel.com>
Date: Tue, 2 Sep 2025 13:53:05 +0300
Subject: [PATCH] xhci: dbc: Fix full DbC transfer ring after several
reconnects
Pending requests will be flushed on disconnect, and the corresponding
TRBs will be turned into No-op TRBs, which are ignored by the xHC
controller once it starts processing the ring.
If the USB debug cable repeatedly disconnects before ring is started
then the ring will eventually be filled with No-op TRBs.
No new transfers can be queued when the ring is full, and driver will
print the following error message:
"xhci_hcd 0000:00:14.0: failed to queue trbs"
This is a normal case for 'in' transfers where TRBs are always enqueued
in advance, ready to take on incoming data. If no data arrives, and
device is disconnected, then ring dequeue will remain at beginning of
the ring while enqueue points to first free TRB after last cancelled
No-op TRB.
Solve this by reinitializing the rings when the debug cable disconnects
and DbC is leaving the configured state.
Clear the whole ring buffer and set enqueue and dequeue to the beginning
of ring, and set cycle bit to its initial state.
Cc: stable(a)vger.kernel.org
Fixes: dfba2174dc42 ("usb: xhci: Add DbC support in xHCI driver")
Signed-off-by: Mathias Nyman <mathias.nyman(a)linux.intel.com>
Link: https://lore.kernel.org/r/20250902105306.877476-3-mathias.nyman@linux.intel…
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
diff --git a/drivers/usb/host/xhci-dbgcap.c b/drivers/usb/host/xhci-dbgcap.c
index d0faff233e3e..63edf2d8f245 100644
--- a/drivers/usb/host/xhci-dbgcap.c
+++ b/drivers/usb/host/xhci-dbgcap.c
@@ -462,6 +462,25 @@ static void xhci_dbc_ring_init(struct xhci_ring *ring)
xhci_initialize_ring_info(ring);
}
+static int xhci_dbc_reinit_ep_rings(struct xhci_dbc *dbc)
+{
+ struct xhci_ring *in_ring = dbc->eps[BULK_IN].ring;
+ struct xhci_ring *out_ring = dbc->eps[BULK_OUT].ring;
+
+ if (!in_ring || !out_ring || !dbc->ctx) {
+ dev_warn(dbc->dev, "Can't re-init unallocated endpoints\n");
+ return -ENODEV;
+ }
+
+ xhci_dbc_ring_init(in_ring);
+ xhci_dbc_ring_init(out_ring);
+
+ /* set ep context enqueue, dequeue, and cycle to initial values */
+ xhci_dbc_init_ep_contexts(dbc);
+
+ return 0;
+}
+
static struct xhci_ring *
xhci_dbc_ring_alloc(struct device *dev, enum xhci_ring_type type, gfp_t flags)
{
@@ -885,7 +904,7 @@ static enum evtreturn xhci_dbc_do_handle_events(struct xhci_dbc *dbc)
dev_info(dbc->dev, "DbC cable unplugged\n");
dbc->state = DS_ENABLED;
xhci_dbc_flush_requests(dbc);
-
+ xhci_dbc_reinit_ep_rings(dbc);
return EVT_DISC;
}
@@ -895,7 +914,7 @@ static enum evtreturn xhci_dbc_do_handle_events(struct xhci_dbc *dbc)
writel(portsc, &dbc->regs->portsc);
dbc->state = DS_ENABLED;
xhci_dbc_flush_requests(dbc);
-
+ xhci_dbc_reinit_ep_rings(dbc);
return EVT_DISC;
}
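How a ring fills with No-op TRBs when it is never started can be shown with a toy model. This is a deliberately simplified sketch: struct toy_ring, TOY_RING_SIZE and the toy_* functions are hypothetical, and link TRBs, cycle bits and endpoint contexts are not modeled.

```c
#include <stdbool.h>
#include <string.h>

#define TOY_RING_SIZE 8  /* arbitrary size for the toy model */

enum toy_trb { TOY_FREE, TOY_NORMAL, TOY_NOOP };

struct toy_ring {
	enum toy_trb trbs[TOY_RING_SIZE];
	unsigned int enqueue;
	unsigned int dequeue;
};

/* Queue one TRB; fails when advancing enqueue would reach dequeue. */
bool toy_queue(struct toy_ring *r)
{
	unsigned int next = (r->enqueue + 1) % TOY_RING_SIZE;

	if (next == r->dequeue)
		return false;
	r->trbs[r->enqueue] = TOY_NORMAL;
	r->enqueue = next;
	return true;
}

/* Disconnect before the ring ran: pending TRBs become no-ops, dequeue stays. */
void toy_flush(struct toy_ring *r)
{
	for (unsigned int i = 0; i < TOY_RING_SIZE; i++)
		if (r->trbs[i] == TOY_NORMAL)
			r->trbs[i] = TOY_NOOP;
}

/* The fix in spirit: clear the whole ring and reset both pointers. */
void toy_reinit(struct toy_ring *r)
{
	memset(r, 0, sizeof(*r));
}
```

Repeating toy_queue() and toy_flush() without ever advancing dequeue eventually makes toy_queue() fail, mirroring the "failed to queue trbs" message; calling toy_reinit() on disconnect makes the ring usable again.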
The patch below does not apply to the 5.15-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.15.y
git checkout FETCH_HEAD
git cherry-pick -x a5c98e8b1398534ae1feb6e95e2d3ee5215538ed
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025091747-thirstily-dispose-3727@gregkh' --subject-prefix 'PATCH 5.15.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From a5c98e8b1398534ae1feb6e95e2d3ee5215538ed Mon Sep 17 00:00:00 2001
From: Mathias Nyman <mathias.nyman(a)linux.intel.com>
Date: Tue, 2 Sep 2025 13:53:05 +0300
Subject: [PATCH] xhci: dbc: Fix full DbC transfer ring after several
reconnects
Pending requests will be flushed on disconnect, and the corresponding
TRBs will be turned into No-op TRBs, which are ignored by the xHC
controller once it starts processing the ring.
If the USB debug cable repeatedly disconnects before ring is started
then the ring will eventually be filled with No-op TRBs.
No new transfers can be queued when the ring is full, and driver will
print the following error message:
"xhci_hcd 0000:00:14.0: failed to queue trbs"
This is a normal case for 'in' transfers where TRBs are always enqueued
in advance, ready to take on incoming data. If no data arrives, and
device is disconnected, then ring dequeue will remain at beginning of
the ring while enqueue points to first free TRB after last cancelled
No-op TRB.
Solve this by reinitializing the rings when the debug cable disconnects
and DbC is leaving the configured state.
Clear the whole ring buffer and set enqueue and dequeue to the beginning
of ring, and set cycle bit to its initial state.
Cc: stable(a)vger.kernel.org
Fixes: dfba2174dc42 ("usb: xhci: Add DbC support in xHCI driver")
Signed-off-by: Mathias Nyman <mathias.nyman(a)linux.intel.com>
Link: https://lore.kernel.org/r/20250902105306.877476-3-mathias.nyman@linux.intel…
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
diff --git a/drivers/usb/host/xhci-dbgcap.c b/drivers/usb/host/xhci-dbgcap.c
index d0faff233e3e..63edf2d8f245 100644
--- a/drivers/usb/host/xhci-dbgcap.c
+++ b/drivers/usb/host/xhci-dbgcap.c
@@ -462,6 +462,25 @@ static void xhci_dbc_ring_init(struct xhci_ring *ring)
xhci_initialize_ring_info(ring);
}
+static int xhci_dbc_reinit_ep_rings(struct xhci_dbc *dbc)
+{
+ struct xhci_ring *in_ring = dbc->eps[BULK_IN].ring;
+ struct xhci_ring *out_ring = dbc->eps[BULK_OUT].ring;
+
+ if (!in_ring || !out_ring || !dbc->ctx) {
+ dev_warn(dbc->dev, "Can't re-init unallocated endpoints\n");
+ return -ENODEV;
+ }
+
+ xhci_dbc_ring_init(in_ring);
+ xhci_dbc_ring_init(out_ring);
+
+ /* set ep context enqueue, dequeue, and cycle to initial values */
+ xhci_dbc_init_ep_contexts(dbc);
+
+ return 0;
+}
+
static struct xhci_ring *
xhci_dbc_ring_alloc(struct device *dev, enum xhci_ring_type type, gfp_t flags)
{
@@ -885,7 +904,7 @@ static enum evtreturn xhci_dbc_do_handle_events(struct xhci_dbc *dbc)
dev_info(dbc->dev, "DbC cable unplugged\n");
dbc->state = DS_ENABLED;
xhci_dbc_flush_requests(dbc);
-
+ xhci_dbc_reinit_ep_rings(dbc);
return EVT_DISC;
}
@@ -895,7 +914,7 @@ static enum evtreturn xhci_dbc_do_handle_events(struct xhci_dbc *dbc)
writel(portsc, &dbc->regs->portsc);
dbc->state = DS_ENABLED;
xhci_dbc_flush_requests(dbc);
-
+ xhci_dbc_reinit_ep_rings(dbc);
return EVT_DISC;
}
The patch below does not apply to the 6.1-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.1.y
git checkout FETCH_HEAD
git cherry-pick -x a5c98e8b1398534ae1feb6e95e2d3ee5215538ed
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025091706-savings-chatroom-9c46@gregkh' --subject-prefix 'PATCH 6.1.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From a5c98e8b1398534ae1feb6e95e2d3ee5215538ed Mon Sep 17 00:00:00 2001
From: Mathias Nyman <mathias.nyman(a)linux.intel.com>
Date: Tue, 2 Sep 2025 13:53:05 +0300
Subject: [PATCH] xhci: dbc: Fix full DbC transfer ring after several
reconnects
Pending requests will be flushed on disconnect, and the corresponding
TRBs will be turned into No-op TRBs, which are ignored by the xHC
controller once it starts processing the ring.
If the USB debug cable repeatedly disconnects before ring is started
then the ring will eventually be filled with No-op TRBs.
No new transfers can be queued when the ring is full, and driver will
print the following error message:
"xhci_hcd 0000:00:14.0: failed to queue trbs"
This is a normal case for 'in' transfers where TRBs are always enqueued
in advance, ready to take on incoming data. If no data arrives, and
device is disconnected, then ring dequeue will remain at beginning of
the ring while enqueue points to first free TRB after last cancelled
No-op TRB.
Solve this by reinitializing the rings when the debug cable disconnects
and DbC is leaving the configured state.
Clear the whole ring buffer and set enqueue and dequeue to the beginning
of ring, and set cycle bit to its initial state.
Cc: stable(a)vger.kernel.org
Fixes: dfba2174dc42 ("usb: xhci: Add DbC support in xHCI driver")
Signed-off-by: Mathias Nyman <mathias.nyman(a)linux.intel.com>
Link: https://lore.kernel.org/r/20250902105306.877476-3-mathias.nyman@linux.intel…
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
diff --git a/drivers/usb/host/xhci-dbgcap.c b/drivers/usb/host/xhci-dbgcap.c
index d0faff233e3e..63edf2d8f245 100644
--- a/drivers/usb/host/xhci-dbgcap.c
+++ b/drivers/usb/host/xhci-dbgcap.c
@@ -462,6 +462,25 @@ static void xhci_dbc_ring_init(struct xhci_ring *ring)
xhci_initialize_ring_info(ring);
}
+static int xhci_dbc_reinit_ep_rings(struct xhci_dbc *dbc)
+{
+ struct xhci_ring *in_ring = dbc->eps[BULK_IN].ring;
+ struct xhci_ring *out_ring = dbc->eps[BULK_OUT].ring;
+
+ if (!in_ring || !out_ring || !dbc->ctx) {
+ dev_warn(dbc->dev, "Can't re-init unallocated endpoints\n");
+ return -ENODEV;
+ }
+
+ xhci_dbc_ring_init(in_ring);
+ xhci_dbc_ring_init(out_ring);
+
+ /* set ep context enqueue, dequeue, and cycle to initial values */
+ xhci_dbc_init_ep_contexts(dbc);
+
+ return 0;
+}
+
static struct xhci_ring *
xhci_dbc_ring_alloc(struct device *dev, enum xhci_ring_type type, gfp_t flags)
{
@@ -885,7 +904,7 @@ static enum evtreturn xhci_dbc_do_handle_events(struct xhci_dbc *dbc)
dev_info(dbc->dev, "DbC cable unplugged\n");
dbc->state = DS_ENABLED;
xhci_dbc_flush_requests(dbc);
-
+ xhci_dbc_reinit_ep_rings(dbc);
return EVT_DISC;
}
@@ -895,7 +914,7 @@ static enum evtreturn xhci_dbc_do_handle_events(struct xhci_dbc *dbc)
writel(portsc, &dbc->regs->portsc);
dbc->state = DS_ENABLED;
xhci_dbc_flush_requests(dbc);
-
+ xhci_dbc_reinit_ep_rings(dbc);
return EVT_DISC;
}
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.10.y
git checkout FETCH_HEAD
git cherry-pick -x 64961557efa1b98f375c0579779e7eeda1a02c42
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025091752-daycare-art-9e78@gregkh' --subject-prefix 'PATCH 5.10.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 64961557efa1b98f375c0579779e7eeda1a02c42 Mon Sep 17 00:00:00 2001
From: Johan Hovold <johan(a)kernel.org>
Date: Thu, 24 Jul 2025 15:12:05 +0200
Subject: [PATCH] phy: ti: omap-usb2: fix device leak at unbind
Make sure to drop the reference to the control device taken by
of_find_device_by_node() during probe when the driver is unbound.
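
The reference handling described above can be sketched with a user-space model of a managed-cleanup list. This is an illustration only, under stated assumptions: the `toy_*` names, the fixed-size action table, and the initial refcount of 1 are hypothetical, and the model only mimics two documented properties of `devm_add_action_or_reset()`: registered actions run at unbind, and if registration fails the action runs immediately.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define TOY_MAX_ACTIONS 4 /* hypothetical limit standing in for allocation failure */

struct toy_device { int refcount; };

struct toy_devres {
	void (*action[TOY_MAX_ACTIONS])(void *);
	void *data[TOY_MAX_ACTIONS];
	size_t count;
};

static void toy_get_device(struct toy_device *dev)
{
	dev->refcount++;
}

/* Same shape as the omap_usb2_put_device() callback in the patch. */
static void toy_put_device(void *_dev)
{
	struct toy_device *dev = _dev;

	assert(dev->refcount > 0);
	dev->refcount--;
}

/* Register a cleanup action; on failure run it right away and report an error. */
static int toy_add_action_or_reset(struct toy_devres *res,
				   void (*action)(void *), void *data)
{
	if (res->count == TOY_MAX_ACTIONS) {
		action(data);
		return -ENOMEM;
	}
	res->action[res->count] = action;
	res->data[res->count] = data;
	res->count++;
	return 0;
}

/* Driver unbind: run the registered actions in reverse order. */
static void toy_unbind(struct toy_devres *res)
{
	while (res->count > 0) {
		res->count--;
		res->action[res->count](res->data[res->count]);
	}
}

/* Returns the control device's refcount after one probe/unbind round trip. */
static int toy_probe_then_unbind(int register_put)
{
	struct toy_device control = { .refcount = 1 };
	struct toy_devres res = { .count = 0 };

	toy_get_device(&control); /* stands in for of_find_device_by_node() */
	if (register_put &&
	    toy_add_action_or_reset(&res, toy_put_device, &control))
		return control.refcount;
	toy_unbind(&res);
	return control.refcount;
}
```

With `register_put` set to 0 the model leaves the extra reference behind after unbind, which is the leak the commit message describes; with it set to 1 the count returns to its starting value.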
Fixes: 478b6c7436c2 ("usb: phy: omap-usb2: Don't use omap_get_control_dev()")
Cc: stable(a)vger.kernel.org # 3.13
Cc: Roger Quadros <rogerq(a)kernel.org>
Signed-off-by: Johan Hovold <johan(a)kernel.org>
Link: https://lore.kernel.org/r/20250724131206.2211-3-johan@kernel.org
Signed-off-by: Vinod Koul <vkoul(a)kernel.org>
diff --git a/drivers/phy/ti/phy-omap-usb2.c b/drivers/phy/ti/phy-omap-usb2.c
index c1a0ef979142..c444bb2530ca 100644
--- a/drivers/phy/ti/phy-omap-usb2.c
+++ b/drivers/phy/ti/phy-omap-usb2.c
@@ -363,6 +363,13 @@ static void omap_usb2_init_errata(struct omap_usb *phy)
phy->flags |= OMAP_USB2_DISABLE_CHRG_DET;
}
+static void omap_usb2_put_device(void *_dev)
+{
+ struct device *dev = _dev;
+
+ put_device(dev);
+}
+
static int omap_usb2_probe(struct platform_device *pdev)
{
struct omap_usb *phy;
@@ -373,6 +380,7 @@ static int omap_usb2_probe(struct platform_device *pdev)
struct device_node *control_node;
struct platform_device *control_pdev;
const struct usb_phy_data *phy_data;
+ int ret;
phy_data = device_get_match_data(&pdev->dev);
if (!phy_data)
@@ -423,6 +431,11 @@ static int omap_usb2_probe(struct platform_device *pdev)
return -EINVAL;
}
phy->control_dev = &control_pdev->dev;
+
+ ret = devm_add_action_or_reset(&pdev->dev, omap_usb2_put_device,
+ phy->control_dev);
+ if (ret)
+ return ret;
} else {
if (of_property_read_u32_index(node,
"syscon-phy-power", 1,
The patch below does not apply to the 6.6-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.6.y
git checkout FETCH_HEAD
git cherry-pick -x a5c98e8b1398534ae1feb6e95e2d3ee5215538ed
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025091739-chaste-kilometer-da82@gregkh' --subject-prefix 'PATCH 6.6.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From a5c98e8b1398534ae1feb6e95e2d3ee5215538ed Mon Sep 17 00:00:00 2001
From: Mathias Nyman <mathias.nyman(a)linux.intel.com>
Date: Tue, 2 Sep 2025 13:53:05 +0300
Subject: [PATCH] xhci: dbc: Fix full DbC transfer ring after several
reconnects
Pending requests will be flushed on disconnect, and the corresponding
TRBs will be turned into No-op TRBs, which are ignored by the xHC
controller once it starts processing the ring.
If the USB debug cable repeatedly disconnects before ring is started
then the ring will eventually be filled with No-op TRBs.
No new transfers can be queued when the ring is full, and driver will
print the following error message:
"xhci_hcd 0000:00:14.0: failed to queue trbs"
This is a normal case for 'in' transfers where TRBs are always enqueued
in advance, ready to take on incoming data. If no data arrives, and
device is disconnected, then ring dequeue will remain at beginning of
the ring while enqueue points to first free TRB after last cancelled
No-op TRB.
Solve this by reinitializing the rings when the debug cable disconnects
and DbC is leaving the configured state.
Clear the whole ring buffer and set enqueue and dequeue to the beginning
of ring, and set cycle bit to its initial state.
Cc: stable(a)vger.kernel.org
Fixes: dfba2174dc42 ("usb: xhci: Add DbC support in xHCI driver")
Signed-off-by: Mathias Nyman <mathias.nyman(a)linux.intel.com>
Link: https://lore.kernel.org/r/20250902105306.877476-3-mathias.nyman@linux.intel…
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
diff --git a/drivers/usb/host/xhci-dbgcap.c b/drivers/usb/host/xhci-dbgcap.c
index d0faff233e3e..63edf2d8f245 100644
--- a/drivers/usb/host/xhci-dbgcap.c
+++ b/drivers/usb/host/xhci-dbgcap.c
@@ -462,6 +462,25 @@ static void xhci_dbc_ring_init(struct xhci_ring *ring)
xhci_initialize_ring_info(ring);
}
+static int xhci_dbc_reinit_ep_rings(struct xhci_dbc *dbc)
+{
+ struct xhci_ring *in_ring = dbc->eps[BULK_IN].ring;
+ struct xhci_ring *out_ring = dbc->eps[BULK_OUT].ring;
+
+ if (!in_ring || !out_ring || !dbc->ctx) {
+ dev_warn(dbc->dev, "Can't re-init unallocated endpoints\n");
+ return -ENODEV;
+ }
+
+ xhci_dbc_ring_init(in_ring);
+ xhci_dbc_ring_init(out_ring);
+
+ /* set ep context enqueue, dequeue, and cycle to initial values */
+ xhci_dbc_init_ep_contexts(dbc);
+
+ return 0;
+}
+
static struct xhci_ring *
xhci_dbc_ring_alloc(struct device *dev, enum xhci_ring_type type, gfp_t flags)
{
@@ -885,7 +904,7 @@ static enum evtreturn xhci_dbc_do_handle_events(struct xhci_dbc *dbc)
dev_info(dbc->dev, "DbC cable unplugged\n");
dbc->state = DS_ENABLED;
xhci_dbc_flush_requests(dbc);
-
+ xhci_dbc_reinit_ep_rings(dbc);
return EVT_DISC;
}
@@ -895,7 +914,7 @@ static enum evtreturn xhci_dbc_do_handle_events(struct xhci_dbc *dbc)
writel(portsc, &dbc->regs->portsc);
dbc->state = DS_ENABLED;
xhci_dbc_flush_requests(dbc);
-
+ xhci_dbc_reinit_ep_rings(dbc);
return EVT_DISC;
}
The patch below does not apply to the 5.15-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.15.y
git checkout FETCH_HEAD
git cherry-pick -x 64961557efa1b98f375c0579779e7eeda1a02c42
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025091751-nuzzle-jolt-dcac@gregkh' --subject-prefix 'PATCH 5.15.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 64961557efa1b98f375c0579779e7eeda1a02c42 Mon Sep 17 00:00:00 2001
From: Johan Hovold <johan(a)kernel.org>
Date: Thu, 24 Jul 2025 15:12:05 +0200
Subject: [PATCH] phy: ti: omap-usb2: fix device leak at unbind
Make sure to drop the reference to the control device taken by
of_find_device_by_node() during probe when the driver is unbound.
Fixes: 478b6c7436c2 ("usb: phy: omap-usb2: Don't use omap_get_control_dev()")
Cc: stable(a)vger.kernel.org # 3.13
Cc: Roger Quadros <rogerq(a)kernel.org>
Signed-off-by: Johan Hovold <johan(a)kernel.org>
Link: https://lore.kernel.org/r/20250724131206.2211-3-johan@kernel.org
Signed-off-by: Vinod Koul <vkoul(a)kernel.org>
diff --git a/drivers/phy/ti/phy-omap-usb2.c b/drivers/phy/ti/phy-omap-usb2.c
index c1a0ef979142..c444bb2530ca 100644
--- a/drivers/phy/ti/phy-omap-usb2.c
+++ b/drivers/phy/ti/phy-omap-usb2.c
@@ -363,6 +363,13 @@ static void omap_usb2_init_errata(struct omap_usb *phy)
phy->flags |= OMAP_USB2_DISABLE_CHRG_DET;
}
+static void omap_usb2_put_device(void *_dev)
+{
+ struct device *dev = _dev;
+
+ put_device(dev);
+}
+
static int omap_usb2_probe(struct platform_device *pdev)
{
struct omap_usb *phy;
@@ -373,6 +380,7 @@ static int omap_usb2_probe(struct platform_device *pdev)
struct device_node *control_node;
struct platform_device *control_pdev;
const struct usb_phy_data *phy_data;
+ int ret;
phy_data = device_get_match_data(&pdev->dev);
if (!phy_data)
@@ -423,6 +431,11 @@ static int omap_usb2_probe(struct platform_device *pdev)
return -EINVAL;
}
phy->control_dev = &control_pdev->dev;
+
+ ret = devm_add_action_or_reset(&pdev->dev, omap_usb2_put_device,
+ phy->control_dev);
+ if (ret)
+ return ret;
} else {
if (of_property_read_u32_index(node,
"syscon-phy-power", 1,
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.4.y
git checkout FETCH_HEAD
git cherry-pick -x 8d63c83d8eb922f6c316320f50c82fa88d099bea
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025091724-yeast-sublet-dc69@gregkh' --subject-prefix 'PATCH 5.4.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 8d63c83d8eb922f6c316320f50c82fa88d099bea Mon Sep 17 00:00:00 2001
From: Alan Stern <stern(a)rowland.harvard.edu>
Date: Mon, 25 Aug 2025 12:00:22 -0400
Subject: [PATCH] USB: gadget: dummy-hcd: Fix locking bug in RT-enabled kernels
Yunseong Kim and the syzbot fuzzer both reported a problem in
RT-enabled kernels caused by the way dummy-hcd mixes interrupt
management and spin-locking. The pattern was:
local_irq_save(flags);
spin_lock(&dum->lock);
...
spin_unlock(&dum->lock);
... // calls usb_gadget_giveback_request()
local_irq_restore(flags);
The code was written this way because usb_gadget_giveback_request()
needs to be called with interrupts disabled and the private lock not
held.
While this pattern works fine in non-RT kernels, it's not good when RT
is enabled. RT kernels handle spinlocks much like mutexes; in particular,
spin_lock() may sleep. But sleeping is not allowed while local
interrupts are disabled.
To fix the problem, rewrite the code to conform to the pattern used
elsewhere in dummy-hcd and other UDC drivers:
spin_lock_irqsave(&dum->lock, flags);
...
spin_unlock(&dum->lock);
usb_gadget_giveback_request(...);
spin_lock(&dum->lock);
...
spin_unlock_irqrestore(&dum->lock, flags);
This approach satisfies the RT requirements.
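
The difference between the two patterns can be shown with a toy user-space model of PREEMPT_RT locking rules. This is a sketch, not kernel code: every `toy_*` name is hypothetical, the "lock" is a plain flag, and the model assumes only two things about RT kernels: a `spinlock_t` may sleep, so taking it with interrupts hard-disabled is a bug, and the `_irqsave`/`_irqrestore` suffixes on such a lock do not change the CPU's interrupt state.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of one CPU on a PREEMPT_RT kernel; all names are hypothetical. */
static bool hard_irqs_off; /* set only by toy_local_irq_save() */
static bool lock_held;     /* stands in for dum->lock */
static int violations;     /* sleeping lock taken with IRQs hard-disabled */
static int givebacks;      /* completed giveback callbacks */

static void toy_local_irq_save(bool *flags)
{
	*flags = hard_irqs_off;
	hard_irqs_off = true;
}

static void toy_local_irq_restore(bool flags)
{
	hard_irqs_off = flags;
}

/* On RT the lock may sleep, so acquiring it with IRQs off counts as a bug. */
static void toy_spin_lock(void)
{
	if (hard_irqs_off)
		violations++;
	assert(!lock_held);
	lock_held = true;
}

static void toy_spin_unlock(void)
{
	assert(lock_held);
	lock_held = false;
}

/* On RT these leave the hard interrupt state untouched. */
static void toy_spin_lock_irqsave(bool *flags)
{
	*flags = hard_irqs_off;
	toy_spin_lock();
}

static void toy_spin_unlock_irqrestore(bool flags)
{
	toy_spin_unlock();
	hard_irqs_off = flags;
}

/* Stand-in for usb_gadget_giveback_request(): must run without the lock. */
static void toy_giveback(void)
{
	assert(!lock_held);
	givebacks++;
}

/* Old pattern: local_irq_save() wrapped around spin_lock(). */
static int toy_dequeue_old(void)
{
	bool flags;
	int before = violations;

	toy_local_irq_save(&flags);
	toy_spin_lock();
	toy_spin_unlock();
	toy_giveback();
	toy_local_irq_restore(flags);
	return violations - before;
}

/* New pattern: lock dropped only around the giveback call. */
static int toy_dequeue_new(void)
{
	bool flags;
	int before = violations;

	toy_spin_lock_irqsave(&flags);
	toy_spin_unlock();
	toy_giveback();
	toy_spin_lock();
	toy_spin_unlock_irqrestore(flags);
	return violations - before;
}
```

Each function returns the number of rule violations it triggered in the model. The old pattern records one, because the lock is taken after interrupts were hard-disabled; the new pattern records none while still calling the giveback stand-in with the lock released.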
Signed-off-by: Alan Stern <stern(a)rowland.harvard.edu>
Cc: stable <stable(a)kernel.org>
Fixes: b4dbda1a22d2 ("USB: dummy-hcd: disable interrupts during req->complete")
Reported-by: Yunseong Kim <ysk(a)kzalloc.com>
Closes: <https://lore.kernel.org/linux-usb/5b337389-73b9-4ee4-a83e-7e82bf5af87a@kzal…>
Reported-by: syzbot+8baacc4139f12fa77909(a)syzkaller.appspotmail.com
Closes: <https://lore.kernel.org/linux-usb/68ac2411.050a0220.37038e.0087.GAE@google.…>
Tested-by: syzbot+8baacc4139f12fa77909(a)syzkaller.appspotmail.com
CC: Sebastian Andrzej Siewior <bigeasy(a)linutronix.de>
CC: stable(a)vger.kernel.org
Reviewed-by: Sebastian Andrzej Siewior <bigeasy(a)linutronix.de>
Link: https://lore.kernel.org/r/bb192ae2-4eee-48ee-981f-3efdbbd0d8f0@rowland.harv…
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
diff --git a/drivers/usb/gadget/udc/dummy_hcd.c b/drivers/usb/gadget/udc/dummy_hcd.c
index 21dbfb0b3bac..1cefca660773 100644
--- a/drivers/usb/gadget/udc/dummy_hcd.c
+++ b/drivers/usb/gadget/udc/dummy_hcd.c
@@ -765,8 +765,7 @@ static int dummy_dequeue(struct usb_ep *_ep, struct usb_request *_req)
if (!dum->driver)
return -ESHUTDOWN;
- local_irq_save(flags);
- spin_lock(&dum->lock);
+ spin_lock_irqsave(&dum->lock, flags);
list_for_each_entry(iter, &ep->queue, queue) {
if (&iter->req != _req)
continue;
@@ -776,15 +775,16 @@ static int dummy_dequeue(struct usb_ep *_ep, struct usb_request *_req)
retval = 0;
break;
}
- spin_unlock(&dum->lock);
if (retval == 0) {
dev_dbg(udc_dev(dum),
"dequeued req %p from %s, len %d buf %p\n",
req, _ep->name, _req->length, _req->buf);
+ spin_unlock(&dum->lock);
usb_gadget_giveback_request(_ep, _req);
+ spin_lock(&dum->lock);
}
- local_irq_restore(flags);
+ spin_unlock_irqrestore(&dum->lock, flags);
return retval;
}
The patch below does not apply to the 6.1-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.1.y
git checkout FETCH_HEAD
git cherry-pick -x 64961557efa1b98f375c0579779e7eeda1a02c42
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025091751-geometry-screen-8bd7@gregkh' --subject-prefix 'PATCH 6.1.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 64961557efa1b98f375c0579779e7eeda1a02c42 Mon Sep 17 00:00:00 2001
From: Johan Hovold <johan(a)kernel.org>
Date: Thu, 24 Jul 2025 15:12:05 +0200
Subject: [PATCH] phy: ti: omap-usb2: fix device leak at unbind
Make sure to drop the reference to the control device taken by
of_find_device_by_node() during probe when the driver is unbound.
Fixes: 478b6c7436c2 ("usb: phy: omap-usb2: Don't use omap_get_control_dev()")
Cc: stable(a)vger.kernel.org # 3.13
Cc: Roger Quadros <rogerq(a)kernel.org>
Signed-off-by: Johan Hovold <johan(a)kernel.org>
Link: https://lore.kernel.org/r/20250724131206.2211-3-johan@kernel.org
Signed-off-by: Vinod Koul <vkoul(a)kernel.org>
diff --git a/drivers/phy/ti/phy-omap-usb2.c b/drivers/phy/ti/phy-omap-usb2.c
index c1a0ef979142..c444bb2530ca 100644
--- a/drivers/phy/ti/phy-omap-usb2.c
+++ b/drivers/phy/ti/phy-omap-usb2.c
@@ -363,6 +363,13 @@ static void omap_usb2_init_errata(struct omap_usb *phy)
phy->flags |= OMAP_USB2_DISABLE_CHRG_DET;
}
+static void omap_usb2_put_device(void *_dev)
+{
+ struct device *dev = _dev;
+
+ put_device(dev);
+}
+
static int omap_usb2_probe(struct platform_device *pdev)
{
struct omap_usb *phy;
@@ -373,6 +380,7 @@ static int omap_usb2_probe(struct platform_device *pdev)
struct device_node *control_node;
struct platform_device *control_pdev;
const struct usb_phy_data *phy_data;
+ int ret;
phy_data = device_get_match_data(&pdev->dev);
if (!phy_data)
@@ -423,6 +431,11 @@ static int omap_usb2_probe(struct platform_device *pdev)
return -EINVAL;
}
phy->control_dev = &control_pdev->dev;
+
+ ret = devm_add_action_or_reset(&pdev->dev, omap_usb2_put_device,
+ phy->control_dev);
+ if (ret)
+ return ret;
} else {
if (of_property_read_u32_index(node,
"syscon-phy-power", 1,
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.10.y
git checkout FETCH_HEAD
git cherry-pick -x 8d63c83d8eb922f6c316320f50c82fa88d099bea
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025091724-unrivaled-crystal-942a@gregkh' --subject-prefix 'PATCH 5.10.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 8d63c83d8eb922f6c316320f50c82fa88d099bea Mon Sep 17 00:00:00 2001
From: Alan Stern <stern(a)rowland.harvard.edu>
Date: Mon, 25 Aug 2025 12:00:22 -0400
Subject: [PATCH] USB: gadget: dummy-hcd: Fix locking bug in RT-enabled kernels
Yunseong Kim and the syzbot fuzzer both reported a problem in
RT-enabled kernels caused by the way dummy-hcd mixes interrupt
management and spin-locking. The pattern was:
local_irq_save(flags);
spin_lock(&dum->lock);
...
spin_unlock(&dum->lock);
... // calls usb_gadget_giveback_request()
local_irq_restore(flags);
The code was written this way because usb_gadget_giveback_request()
needs to be called with interrupts disabled and the private lock not
held.
While this pattern works fine in non-RT kernels, it's not good when RT
is enabled. RT kernels handle spinlocks much like mutexes; in particular,
spin_lock() may sleep. But sleeping is not allowed while local
interrupts are disabled.
To fix the problem, rewrite the code to conform to the pattern used
elsewhere in dummy-hcd and other UDC drivers:
spin_lock_irqsave(&dum->lock, flags);
...
spin_unlock(&dum->lock);
usb_gadget_giveback_request(...);
spin_lock(&dum->lock);
...
spin_unlock_irqrestore(&dum->lock, flags);
This approach satisfies the RT requirements.
Signed-off-by: Alan Stern <stern(a)rowland.harvard.edu>
Cc: stable <stable(a)kernel.org>
Fixes: b4dbda1a22d2 ("USB: dummy-hcd: disable interrupts during req->complete")
Reported-by: Yunseong Kim <ysk(a)kzalloc.com>
Closes: <https://lore.kernel.org/linux-usb/5b337389-73b9-4ee4-a83e-7e82bf5af87a@kzal…>
Reported-by: syzbot+8baacc4139f12fa77909(a)syzkaller.appspotmail.com
Closes: <https://lore.kernel.org/linux-usb/68ac2411.050a0220.37038e.0087.GAE@google.…>
Tested-by: syzbot+8baacc4139f12fa77909(a)syzkaller.appspotmail.com
CC: Sebastian Andrzej Siewior <bigeasy(a)linutronix.de>
CC: stable(a)vger.kernel.org
Reviewed-by: Sebastian Andrzej Siewior <bigeasy(a)linutronix.de>
Link: https://lore.kernel.org/r/bb192ae2-4eee-48ee-981f-3efdbbd0d8f0@rowland.harv…
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
diff --git a/drivers/usb/gadget/udc/dummy_hcd.c b/drivers/usb/gadget/udc/dummy_hcd.c
index 21dbfb0b3bac..1cefca660773 100644
--- a/drivers/usb/gadget/udc/dummy_hcd.c
+++ b/drivers/usb/gadget/udc/dummy_hcd.c
@@ -765,8 +765,7 @@ static int dummy_dequeue(struct usb_ep *_ep, struct usb_request *_req)
if (!dum->driver)
return -ESHUTDOWN;
- local_irq_save(flags);
- spin_lock(&dum->lock);
+ spin_lock_irqsave(&dum->lock, flags);
list_for_each_entry(iter, &ep->queue, queue) {
if (&iter->req != _req)
continue;
@@ -776,15 +775,16 @@ static int dummy_dequeue(struct usb_ep *_ep, struct usb_request *_req)
retval = 0;
break;
}
- spin_unlock(&dum->lock);
if (retval == 0) {
dev_dbg(udc_dev(dum),
"dequeued req %p from %s, len %d buf %p\n",
req, _ep->name, _req->length, _req->buf);
+ spin_unlock(&dum->lock);
usb_gadget_giveback_request(_ep, _req);
+ spin_lock(&dum->lock);
}
- local_irq_restore(flags);
+ spin_unlock_irqrestore(&dum->lock, flags);
return retval;
}
The patch below does not apply to the 6.6-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.6.y
git checkout FETCH_HEAD
git cherry-pick -x 64961557efa1b98f375c0579779e7eeda1a02c42
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025091751-almanac-unused-2e7b@gregkh' --subject-prefix 'PATCH 6.6.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 64961557efa1b98f375c0579779e7eeda1a02c42 Mon Sep 17 00:00:00 2001
From: Johan Hovold <johan(a)kernel.org>
Date: Thu, 24 Jul 2025 15:12:05 +0200
Subject: [PATCH] phy: ti: omap-usb2: fix device leak at unbind
Make sure to drop the reference to the control device taken by
of_find_device_by_node() during probe when the driver is unbound.
Fixes: 478b6c7436c2 ("usb: phy: omap-usb2: Don't use omap_get_control_dev()")
Cc: stable(a)vger.kernel.org # 3.13
Cc: Roger Quadros <rogerq(a)kernel.org>
Signed-off-by: Johan Hovold <johan(a)kernel.org>
Link: https://lore.kernel.org/r/20250724131206.2211-3-johan@kernel.org
Signed-off-by: Vinod Koul <vkoul(a)kernel.org>
diff --git a/drivers/phy/ti/phy-omap-usb2.c b/drivers/phy/ti/phy-omap-usb2.c
index c1a0ef979142..c444bb2530ca 100644
--- a/drivers/phy/ti/phy-omap-usb2.c
+++ b/drivers/phy/ti/phy-omap-usb2.c
@@ -363,6 +363,13 @@ static void omap_usb2_init_errata(struct omap_usb *phy)
phy->flags |= OMAP_USB2_DISABLE_CHRG_DET;
}
+static void omap_usb2_put_device(void *_dev)
+{
+ struct device *dev = _dev;
+
+ put_device(dev);
+}
+
static int omap_usb2_probe(struct platform_device *pdev)
{
struct omap_usb *phy;
@@ -373,6 +380,7 @@ static int omap_usb2_probe(struct platform_device *pdev)
struct device_node *control_node;
struct platform_device *control_pdev;
const struct usb_phy_data *phy_data;
+ int ret;
phy_data = device_get_match_data(&pdev->dev);
if (!phy_data)
@@ -423,6 +431,11 @@ static int omap_usb2_probe(struct platform_device *pdev)
return -EINVAL;
}
phy->control_dev = &control_pdev->dev;
+
+ ret = devm_add_action_or_reset(&pdev->dev, omap_usb2_put_device,
+ phy->control_dev);
+ if (ret)
+ return ret;
} else {
if (of_property_read_u32_index(node,
"syscon-phy-power", 1,
The patch below does not apply to the 5.15-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.15.y
git checkout FETCH_HEAD
git cherry-pick -x 8d63c83d8eb922f6c316320f50c82fa88d099bea
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025091723-stack-cargo-2b1d@gregkh' --subject-prefix 'PATCH 5.15.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 8d63c83d8eb922f6c316320f50c82fa88d099bea Mon Sep 17 00:00:00 2001
From: Alan Stern <stern(a)rowland.harvard.edu>
Date: Mon, 25 Aug 2025 12:00:22 -0400
Subject: [PATCH] USB: gadget: dummy-hcd: Fix locking bug in RT-enabled kernels
Yunseong Kim and the syzbot fuzzer both reported a problem in
RT-enabled kernels caused by the way dummy-hcd mixes interrupt
management and spin-locking. The pattern was:
local_irq_save(flags);
spin_lock(&dum->lock);
...
spin_unlock(&dum->lock);
... // calls usb_gadget_giveback_request()
local_irq_restore(flags);
The code was written this way because usb_gadget_giveback_request()
needs to be called with interrupts disabled and the private lock not
held.
While this pattern works fine in non-RT kernels, it's not good when RT
is enabled. RT kernels handle spinlocks much like mutexes; in particular,
spin_lock() may sleep. But sleeping is not allowed while local
interrupts are disabled.
To fix the problem, rewrite the code to conform to the pattern used
elsewhere in dummy-hcd and other UDC drivers:
spin_lock_irqsave(&dum->lock, flags);
...
spin_unlock(&dum->lock);
usb_gadget_giveback_request(...);
spin_lock(&dum->lock);
...
spin_unlock_irqrestore(&dum->lock, flags);
This approach satisfies the RT requirements.
Signed-off-by: Alan Stern <stern(a)rowland.harvard.edu>
Cc: stable <stable(a)kernel.org>
Fixes: b4dbda1a22d2 ("USB: dummy-hcd: disable interrupts during req->complete")
Reported-by: Yunseong Kim <ysk(a)kzalloc.com>
Closes: <https://lore.kernel.org/linux-usb/5b337389-73b9-4ee4-a83e-7e82bf5af87a@kzal…>
Reported-by: syzbot+8baacc4139f12fa77909(a)syzkaller.appspotmail.com
Closes: <https://lore.kernel.org/linux-usb/68ac2411.050a0220.37038e.0087.GAE@google.…>
Tested-by: syzbot+8baacc4139f12fa77909(a)syzkaller.appspotmail.com
CC: Sebastian Andrzej Siewior <bigeasy(a)linutronix.de>
CC: stable(a)vger.kernel.org
Reviewed-by: Sebastian Andrzej Siewior <bigeasy(a)linutronix.de>
Link: https://lore.kernel.org/r/bb192ae2-4eee-48ee-981f-3efdbbd0d8f0@rowland.harv…
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
diff --git a/drivers/usb/gadget/udc/dummy_hcd.c b/drivers/usb/gadget/udc/dummy_hcd.c
index 21dbfb0b3bac..1cefca660773 100644
--- a/drivers/usb/gadget/udc/dummy_hcd.c
+++ b/drivers/usb/gadget/udc/dummy_hcd.c
@@ -765,8 +765,7 @@ static int dummy_dequeue(struct usb_ep *_ep, struct usb_request *_req)
if (!dum->driver)
return -ESHUTDOWN;
- local_irq_save(flags);
- spin_lock(&dum->lock);
+ spin_lock_irqsave(&dum->lock, flags);
list_for_each_entry(iter, &ep->queue, queue) {
if (&iter->req != _req)
continue;
@@ -776,15 +775,16 @@ static int dummy_dequeue(struct usb_ep *_ep, struct usb_request *_req)
retval = 0;
break;
}
- spin_unlock(&dum->lock);
if (retval == 0) {
dev_dbg(udc_dev(dum),
"dequeued req %p from %s, len %d buf %p\n",
req, _ep->name, _req->length, _req->buf);
+ spin_unlock(&dum->lock);
usb_gadget_giveback_request(_ep, _req);
+ spin_lock(&dum->lock);
}
- local_irq_restore(flags);
+ spin_unlock_irqrestore(&dum->lock, flags);
return retval;
}
Fix a memory leak in netpoll and introduce netconsole selftests that
expose the issue when running with kmemleak detection enabled.
This patchset includes a selftest for netpoll with multiple concurrent
users (netconsole + bonding), which simulates the scenario from test[1]
that originally demonstrated the issue allegedly fixed by commit
efa95b01da18 ("netpoll: fix use after free") - a commit that is now
being reverted.
Sending this to "net" branch because this is a fix, and the selftest
might help with the backports validation.
Link: https://lore.kernel.org/lkml/96b940137a50e5c387687bb4f57de8b0435a653f.14048… [1]
Signed-off-by: Breno Leitao <leitao(a)debian.org>
---
Changes in v4:
- Added an additional selftest to test multiple netpoll users in
parallel
- Link to v3: https://lore.kernel.org/r/20250905-netconsole_torture-v3-0-875c7febd316@deb…
Changes in v3:
- This patchset is a merge of the fix and the selftest together as
recommended by Jakub.
Changes in v2:
- Reuse the netconsole creation from lib_netcons.sh. Thus, refactoring
the create_dynamic_target() (Jakub)
- Move the "wait" to after all the messages have been sent.
- Link to v1: https://lore.kernel.org/r/20250902-netconsole_torture-v1-1-03c6066598e9@deb…
---
Breno Leitao (4):
net: netpoll: fix incorrect refcount handling causing incorrect cleanup
selftest: netcons: refactor target creation
selftest: netcons: create a torture test
selftest: netcons: add test for netconsole over bonded interfaces
net/core/netpoll.c | 7 +-
tools/testing/selftests/drivers/net/Makefile | 2 +
.../selftests/drivers/net/lib/sh/lib_netcons.sh | 197 ++++++++++++++++++---
.../selftests/drivers/net/netcons_over_bonding.sh | 76 ++++++++
.../selftests/drivers/net/netcons_torture.sh | 127 +++++++++++++
5 files changed, 384 insertions(+), 25 deletions(-)
---
base-commit: 5e87fdc37f8dc619549d49ba5c951b369ce7c136
change-id: 20250902-netconsole_torture-8fc23f0aca99
Best regards,
--
Breno Leitao <leitao(a)debian.org>
The patch below does not apply to the 6.12-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.12.y
git checkout FETCH_HEAD
git cherry-pick -x a5c98e8b1398534ae1feb6e95e2d3ee5215538ed
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025091756-glare-cyclic-9298@gregkh' --subject-prefix 'PATCH 6.12.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From a5c98e8b1398534ae1feb6e95e2d3ee5215538ed Mon Sep 17 00:00:00 2001
From: Mathias Nyman <mathias.nyman(a)linux.intel.com>
Date: Tue, 2 Sep 2025 13:53:05 +0300
Subject: [PATCH] xhci: dbc: Fix full DbC transfer ring after several
reconnects
Pending requests are flushed on disconnect, and the corresponding
TRBs are turned into No-op TRBs, which the xHC ignores once it starts
processing the ring.
If the USB debug cable repeatedly disconnects before the ring is
started, the ring will eventually fill up with No-op TRBs.
No new transfers can be queued when the ring is full, and the driver
prints the following error message:
"xhci_hcd 0000:00:14.0: failed to queue trbs"
This is a normal case for 'in' transfers, where TRBs are always enqueued
in advance, ready to take on incoming data. If no data arrives and the
device is disconnected, the ring dequeue pointer remains at the beginning
of the ring while enqueue points to the first free TRB after the last
cancelled No-op TRB.
Solve this by reinitializing the rings when the debug cable disconnects
and DbC is leaving the configured state.
Clear the whole ring buffer, set enqueue and dequeue to the beginning
of the ring, and set the cycle bit to its initial state.
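The failure mode described above can be illustrated with a small
user-space model. This is only a sketch under stated assumptions: the
names (model_ring, model_ring_queue and so on) are hypothetical, the ring
has 8 slots, there is no link TRB, and none of this is the xHCI driver's
real data structure. It shows only that flushing to No-ops without
resetting enqueue eventually fills the ring, while reinitializing on
disconnect does not.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified model; not the xHCI ring layout. */
#define RING_SLOTS 8

enum trb_type { TRB_FREE = 0, TRB_NORMAL, TRB_NOOP };

struct model_ring {
	enum trb_type trbs[RING_SLOTS];
	size_t enqueue;     /* next slot software writes */
	size_t dequeue;     /* next slot the controller would consume */
	unsigned int cycle; /* producer cycle state */
};

/* Clear the buffer and put enqueue, dequeue and cycle in initial state. */
static void model_ring_init(struct model_ring *ring)
{
	memset(ring->trbs, 0, sizeof(ring->trbs));
	ring->enqueue = 0;
	ring->dequeue = 0;
	ring->cycle = 1;
}

/* Queue one transfer; fails when advancing enqueue would reach dequeue. */
static bool model_ring_queue(struct model_ring *ring)
{
	size_t next = (ring->enqueue + 1) % RING_SLOTS;

	if (next == ring->dequeue)
		return false; /* ring full: "failed to queue trbs" */
	ring->trbs[ring->enqueue] = TRB_NORMAL;
	ring->enqueue = next;
	return true;
}

/* Disconnect before the ring was started: pending TRBs become No-ops,
 * and dequeue does not move because nothing was processed. */
static void model_ring_flush(struct model_ring *ring)
{
	for (size_t i = 0; i < RING_SLOTS; i++)
		if (ring->trbs[i] == TRB_NORMAL)
			ring->trbs[i] = TRB_NOOP;
}

/* Number of connect/disconnect cycles completed before queueing fails,
 * or max_cycles if it never fails. */
static int model_reconnects_until_full(bool reinit_on_disconnect,
				       int max_cycles)
{
	struct model_ring ring;

	assert(max_cycles >= 0);
	model_ring_init(&ring);
	for (int n = 0; n < max_cycles; n++) {
		if (!model_ring_queue(&ring))
			return n;
		model_ring_flush(&ring);
		if (reinit_on_disconnect)
			model_ring_init(&ring);
	}
	return max_cycles;
}
```

In this model, model_reconnects_until_full(false, 100) stops after 7
cycles, whereas with reinit_on_disconnect set it runs all 100.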
Cc: stable(a)vger.kernel.org
Fixes: dfba2174dc42 ("usb: xhci: Add DbC support in xHCI driver")
Signed-off-by: Mathias Nyman <mathias.nyman(a)linux.intel.com>
Link: https://lore.kernel.org/r/20250902105306.877476-3-mathias.nyman@linux.intel…
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
diff --git a/drivers/usb/host/xhci-dbgcap.c b/drivers/usb/host/xhci-dbgcap.c
index d0faff233e3e..63edf2d8f245 100644
--- a/drivers/usb/host/xhci-dbgcap.c
+++ b/drivers/usb/host/xhci-dbgcap.c
@@ -462,6 +462,25 @@ static void xhci_dbc_ring_init(struct xhci_ring *ring)
xhci_initialize_ring_info(ring);
}
+static int xhci_dbc_reinit_ep_rings(struct xhci_dbc *dbc)
+{
+ struct xhci_ring *in_ring = dbc->eps[BULK_IN].ring;
+ struct xhci_ring *out_ring = dbc->eps[BULK_OUT].ring;
+
+ if (!in_ring || !out_ring || !dbc->ctx) {
+ dev_warn(dbc->dev, "Can't re-init unallocated endpoints\n");
+ return -ENODEV;
+ }
+
+ xhci_dbc_ring_init(in_ring);
+ xhci_dbc_ring_init(out_ring);
+
+ /* set ep context enqueue, dequeue, and cycle to initial values */
+ xhci_dbc_init_ep_contexts(dbc);
+
+ return 0;
+}
+
static struct xhci_ring *
xhci_dbc_ring_alloc(struct device *dev, enum xhci_ring_type type, gfp_t flags)
{
@@ -885,7 +904,7 @@ static enum evtreturn xhci_dbc_do_handle_events(struct xhci_dbc *dbc)
dev_info(dbc->dev, "DbC cable unplugged\n");
dbc->state = DS_ENABLED;
xhci_dbc_flush_requests(dbc);
-
+ xhci_dbc_reinit_ep_rings(dbc);
return EVT_DISC;
}
@@ -895,7 +914,7 @@ static enum evtreturn xhci_dbc_do_handle_events(struct xhci_dbc *dbc)
writel(portsc, &dbc->regs->portsc);
dbc->state = DS_ENABLED;
xhci_dbc_flush_requests(dbc);
-
+ xhci_dbc_reinit_ep_rings(dbc);
return EVT_DISC;
}
Hi,
I would like to request backport of these two commits to 6.6.y:
b5b4287accd702f562a49a60b10dbfaf7d40270f ("riscv: mm: Use hint address in mmap if available")
2116988d5372aec51f8c4fb85bf8e305ecda47a0 ("riscv: mm: Do not restrict mmap address based on hint")
Together, they amount to disabling arch_get_mmap_end and
arch_get_mmap_base for riscv.
I would like to note that the first patch conflicts with commit
724103429a2d ("riscv: mm: Fixup compat arch_get_mmap_end") in the stable
linux-6.6.y branch. STACK_TOP_MAX should end up as TASK_SIZE, *not*
TASK_SIZE_64.
Thanks,
Vivian "dramforever" Wang
Currently, the KSM-related counters in `mm_struct`, such as
`ksm_merging_pages`, `ksm_rmap_items`, and `ksm_zero_pages`, are
inherited by the child process during fork. This results in inconsistent
accounting.
When a process uses KSM, identical pages are merged and an rmap item is
created for each merged page. The `ksm_merging_pages` and
`ksm_rmap_items` counters are updated accordingly. However, after a
fork, these counters are copied to the child while the corresponding
rmap items are not. As a result, when the child later triggers an
unmerge, there are no rmap items present in the child, so the counters
remain stale, leading to incorrect accounting.
A similar issue exists with `ksm_zero_pages`, which maintains both a
global counter and a per-process counter. During fork, the per-process
counter is inherited by the child, but the global counter is not
incremented. Since the child also references zero pages, the global
counter should be updated as well. Otherwise, during zero-page unmerge,
both the global and per-process counters are decremented, causing the
global counter to become inconsistent.
To fix this, ksm_merging_pages and ksm_rmap_items are reset to 0
during fork, and the global ksm_zero_pages counter is updated with the
per-process ksm_zero_pages value inherited by the child. This ensures
that KSM statistics remain accurate and reflect the activity of each
process correctly.
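The accounting problem can be sketched with a user-space model. The
field names mirror the counters named above, but struct model_mm,
model_fork() and model_unmerge_zero_pages() are hypothetical helpers
written for illustration; they are not kernel functions, and the real
counters are atomics updated elsewhere in mm/ksm.c.

```c
#include <assert.h>

/* Hypothetical per-process counters, named after the mm_struct fields. */
struct model_mm {
	long ksm_merging_pages;
	long ksm_rmap_items;
	long ksm_zero_pages;
};

/* Hypothetical stand-in for the global ksm_zero_pages counter. */
static long model_global_zero_pages;

/* Fork copies the parent's counters into the child. With 'fixed' set,
 * apply the behavior described in the patch: reset the two counters that
 * depend on rmap items and account the inherited zero pages globally. */
static struct model_mm model_fork(const struct model_mm *parent, int fixed)
{
	struct model_mm child = *parent;

	if (fixed) {
		child.ksm_merging_pages = 0;
		child.ksm_rmap_items = 0;
		model_global_zero_pages += child.ksm_zero_pages;
	}
	return child;
}

/* Unmerging zero pages decrements both the global and per-process
 * counters, as described in the text. */
static void model_unmerge_zero_pages(struct model_mm *mm)
{
	assert(mm->ksm_zero_pages >= 0);
	model_global_zero_pages -= mm->ksm_zero_pages;
	mm->ksm_zero_pages = 0;
}
```

With a parent holding 3 zero pages, the unfixed fork followed by an
unmerge in both parent and child drives the modeled global counter to
-3; with the fix it returns to 0.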
Fixes: 7609385337a4 ("ksm: count ksm merging pages for each process")
Fixes: cb4df4cae4f2 ("ksm: count allocated ksm rmap_items for each process")
Fixes: e2942062e01d ("ksm: count all zero pages placed by KSM")
cc: stable(a)vger.kernel.org # v6.6
Signed-off-by: Donet Tom <donettom(a)linux.ibm.com>
---
include/linux/ksm.h | 8 +++++++-
1 file changed, 7 insertions(+), 1 deletion(-)
diff --git a/include/linux/ksm.h b/include/linux/ksm.h
index 22e67ca7cba3..067538fc4d58 100644
--- a/include/linux/ksm.h
+++ b/include/linux/ksm.h
@@ -56,8 +56,14 @@ static inline long mm_ksm_zero_pages(struct mm_struct *mm)
static inline void ksm_fork(struct mm_struct *mm, struct mm_struct *oldmm)
{
/* Adding mm to ksm is best effort on fork. */
- if (mm_flags_test(MMF_VM_MERGEABLE, oldmm))
+ if (mm_flags_test(MMF_VM_MERGEABLE, oldmm)) {
+ long nr_ksm_zero_pages = atomic_long_read(&mm->ksm_zero_pages);
+
+ mm->ksm_merging_pages = 0;
+ mm->ksm_rmap_items = 0;
+ atomic_long_add(nr_ksm_zero_pages, &ksm_zero_pages);
__ksm_enter(mm);
+ }
}
static inline int ksm_execve(struct mm_struct *mm)
--
2.51.0
From: Ryan Roberts <ryan.roberts(a)arm.com>
commit c910f2b65518 ("arm64/mm: Update tlb invalidation routines for
FEAT_LPA2") changed the "invalidation level unknown" hint from 0 to
TLBI_TTL_UNKNOWN (INT_MAX). But the fallback "unknown level" path in
flush_hugetlb_tlb_range() was not updated. So as it stands, when trying
to invalidate CONT_PMD_SIZE or CONT_PTE_SIZE hugetlb mappings, we will
spuriously try to invalidate at level 0 on LPA2-enabled systems.
Fix this so that the fallback passes TLBI_TTL_UNKNOWN, and while we are
at it, explicitly use the correct stride and level for CONT_PMD_SIZE and
CONT_PTE_SIZE, which should provide a minor optimization.
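As a rough illustration of the size-to-(stride, level) selection
described above, here is a user-space sketch. The MODEL_* constants
assume a 4K translation granule (16 contiguous entries per CONT block)
and are assumptions made for the example, not values read from kernel
headers; MODEL_TTL_UNKNOWN stands in for TLBI_TTL_UNKNOWN, which the
text gives as INT_MAX. model_pick_flush_old() and model_pick_flush() are
hypothetical names.

```c
#include <assert.h>
#include <limits.h>

/* Assumed sizes for a 4K granule; illustrative only. */
#define MODEL_PAGE_SIZE     (1UL << 12)
#define MODEL_CONT_PTE_SIZE (16 * MODEL_PAGE_SIZE)
#define MODEL_PMD_SIZE      (1UL << 21)
#define MODEL_CONT_PMD_SIZE (16 * MODEL_PMD_SIZE)
#define MODEL_PUD_SIZE      (1UL << 30)
#define MODEL_TTL_UNKNOWN   INT_MAX

struct model_flush {
	unsigned long stride; /* step between invalidated addresses */
	int level;            /* translation level hint */
};

/* Old behavior: anything other than PMD/PUD falls back to level 0. */
static struct model_flush model_pick_flush_old(unsigned long hugepage_size)
{
	struct model_flush f = { MODEL_PAGE_SIZE, 0 };

	if (hugepage_size == MODEL_PMD_SIZE) {
		f.stride = MODEL_PMD_SIZE;
		f.level = 2;
	} else if (hugepage_size == MODEL_PUD_SIZE) {
		f.stride = MODEL_PUD_SIZE;
		f.level = 1;
	}
	return f;
}

/* New behavior: contiguous sizes get an explicit stride and level, and
 * the fallback passes the "unknown level" hint. */
static struct model_flush model_pick_flush(unsigned long hugepage_size)
{
	struct model_flush f = { MODEL_PAGE_SIZE, MODEL_TTL_UNKNOWN };

	assert(hugepage_size != 0);
	switch (hugepage_size) {
	case MODEL_PUD_SIZE:
		f.stride = MODEL_PUD_SIZE;
		f.level = 1;
		break;
	case MODEL_CONT_PMD_SIZE:
	case MODEL_PMD_SIZE:
		f.stride = MODEL_PMD_SIZE;
		f.level = 2;
		break;
	case MODEL_CONT_PTE_SIZE:
		f.stride = MODEL_PAGE_SIZE;
		f.level = 3;
		break;
	default:
		break;
	}
	return f;
}
```

In the model, a CONT_PMD_SIZE mapping previously got (PAGE_SIZE, 0) and
now gets (PMD_SIZE, 2), which is where the minor optimization mentioned
above would come from.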
Cc: stable(a)vger.kernel.org
Fixes: c910f2b65518 ("arm64/mm: Update tlb invalidation routines for FEAT_LPA2")
Reviewed-by: Anshuman Khandual <anshuman.khandual(a)arm.com>
Reviewed-by: Catalin Marinas <catalin.marinas(a)arm.com>
Signed-off-by: Ryan Roberts <ryan.roberts(a)arm.com>
Link: https://lore.kernel.org/r/20250226120656.2400136-4-ryan.roberts@arm.com
Signed-off-by: Will Deacon <will(a)kernel.org>
Signed-off-by: Jia He <justin.he(a)arm.com>
---
arch/arm64/include/asm/hugetlb.h | 22 ++++++++++++++++------
1 file changed, 16 insertions(+), 6 deletions(-)
diff --git a/arch/arm64/include/asm/hugetlb.h b/arch/arm64/include/asm/hugetlb.h
index 92a5e0879b11..eb413631edea 100644
--- a/arch/arm64/include/asm/hugetlb.h
+++ b/arch/arm64/include/asm/hugetlb.h
@@ -68,12 +68,22 @@ static inline void flush_hugetlb_tlb_range(struct vm_area_struct *vma,
{
unsigned long stride = huge_page_size(hstate_vma(vma));
- if (stride == PMD_SIZE)
- __flush_tlb_range(vma, start, end, stride, false, 2);
- else if (stride == PUD_SIZE)
- __flush_tlb_range(vma, start, end, stride, false, 1);
- else
- __flush_tlb_range(vma, start, end, PAGE_SIZE, false, 0);
+ switch (stride) {
+#ifndef __PAGETABLE_PMD_FOLDED
+ case PUD_SIZE:
+ __flush_tlb_range(vma, start, end, PUD_SIZE, false, 1);
+ break;
+#endif
+ case CONT_PMD_SIZE:
+ case PMD_SIZE:
+ __flush_tlb_range(vma, start, end, PMD_SIZE, false, 2);
+ break;
+ case CONT_PTE_SIZE:
+ __flush_tlb_range(vma, start, end, PAGE_SIZE, false, 3);
+ break;
+ default:
+ __flush_tlb_range(vma, start, end, PAGE_SIZE, false, TLBI_TTL_UNKNOWN);
+ }
}
#endif /* __ASM_HUGETLB_H */
--
2.34.1
From: Ryan Roberts <ryan.roberts(a)arm.com>
arm64 supports multiple huge_pte sizes. Some of the sizes are covered by
a single pte entry at a particular level (PMD_SIZE, PUD_SIZE), and some
are covered by multiple ptes at a particular level (CONT_PTE_SIZE,
CONT_PMD_SIZE). So the function has to figure out the size from the
huge_pte pointer. This was previously done by walking the pgtable to
determine the level and by using the PTE_CONT bit to determine the
number of ptes at the level.
But the PTE_CONT bit is only valid when the pte is present. For
non-present pte values (e.g. markers, migration entries), the previous
implementation was therefore erroneously determining the size. There is
at least one known caller in core-mm, move_huge_pte(), which may call
huge_ptep_get_and_clear() for a non-present pte. So we must be robust to
this case. Additionally the "regular" ptep_get_and_clear() is robust to
being called for non-present ptes so it makes sense to follow the
behavior.
Fix this by using the new sz parameter which is now provided to the
function. Additionally when clearing each pte in a contig range, don't
gather the access and dirty bits if the pte is not present.
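The "only gather access and dirty bits when the pte is present" rule can
be sketched in user space as follows. struct model_pte and
model_get_clear_contig() are hypothetical; a real pte is a packed 64-bit
value, and for a non-present entry the bit positions used for dirty and
young belong to the swap/marker encoding, which is modeled here simply as
flags that must not be propagated.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical unpacked view of a pte, for illustration only. */
struct model_pte {
	bool present;
	bool dirty;
	bool young;
	unsigned long payload; /* e.g. marker or migration entry data */
};

/* Clear ncontig entries starting at ptep and return the first one.
 * Dirty/young from the remaining entries are folded in only when the
 * first entry is present, mirroring the loop in the patch. */
static struct model_pte model_get_clear_contig(struct model_pte *ptep,
					       size_t ncontig)
{
	assert(ncontig >= 1);

	struct model_pte pte = ptep[0];
	bool present = pte.present;

	ptep[0] = (struct model_pte){ 0 };
	for (size_t i = 1; i < ncontig; i++) {
		struct model_pte tmp = ptep[i];

		ptep[i] = (struct model_pte){ 0 };
		if (present) {
			if (tmp.dirty)
				pte.dirty = true;
			if (tmp.young)
				pte.young = true;
		}
	}
	return pte;
}
```

The entry count would come from the size passed by the caller (the sz
parameter in the patch), not from inspecting the entry itself.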
An alternative approach that would not require API changes would be to
store the PTE_CONT bit in a spare bit in the swap entry pte for the
non-present case. But it felt cleaner to follow other APIs' lead and
just pass in the size.
As an aside, PTE_CONT is bit 52, which corresponds to bit 40 in the swap
entry offset field (layout of non-present pte). Since hugetlb is never
swapped to disk, this field will only be populated for markers, which
always set this bit to 0 and hwpoison swap entries, which set the offset
field to a PFN. So it would only ever be 1 for a 52-bit PVA system where
memory in that high half was poisoned (I think!). So in practice, this
bit would almost always be zero for non-present ptes and we would only
clear the first entry if it was actually a contiguous block. That's
probably a less severe symptom than if it was always interpreted as 1
and cleared out potentially-present neighboring PTEs.
Cc: stable(a)vger.kernel.org
Fixes: 66b3923a1a0f ("arm64: hugetlb: add support for PTE contiguous bit")
Reviewed-by: Catalin Marinas <catalin.marinas(a)arm.com>
Signed-off-by: Ryan Roberts <ryan.roberts(a)arm.com>
Link: https://lore.kernel.org/r/20250226120656.2400136-3-ryan.roberts@arm.com
Signed-off-by: Will Deacon <will(a)kernel.org>
Signed-off-by: Jia He <justin.he(a)arm.com>
---
arch/arm64/mm/hugetlbpage.c | 53 ++++++++++++++-----------------------
1 file changed, 20 insertions(+), 33 deletions(-)
diff --git a/arch/arm64/mm/hugetlbpage.c b/arch/arm64/mm/hugetlbpage.c
index 7a54f8b4164b..d631d52986ec 100644
--- a/arch/arm64/mm/hugetlbpage.c
+++ b/arch/arm64/mm/hugetlbpage.c
@@ -121,20 +121,11 @@ static int find_num_contig(struct mm_struct *mm, unsigned long addr,
static inline int num_contig_ptes(unsigned long size, size_t *pgsize)
{
- int contig_ptes = 0;
+ int contig_ptes = 1;
*pgsize = size;
switch (size) {
-#ifndef __PAGETABLE_PMD_FOLDED
- case PUD_SIZE:
- if (pud_sect_supported())
- contig_ptes = 1;
- break;
-#endif
- case PMD_SIZE:
- contig_ptes = 1;
- break;
case CONT_PMD_SIZE:
*pgsize = PMD_SIZE;
contig_ptes = CONT_PMDS;
@@ -143,6 +134,8 @@ static inline int num_contig_ptes(unsigned long size, size_t *pgsize)
*pgsize = PAGE_SIZE;
contig_ptes = CONT_PTES;
break;
+ default:
+ WARN_ON(!__hugetlb_valid_size(size));
}
return contig_ptes;
@@ -184,24 +177,23 @@ static pte_t get_clear_contig(struct mm_struct *mm,
unsigned long pgsize,
unsigned long ncontig)
{
- pte_t orig_pte = __ptep_get(ptep);
- unsigned long i;
-
- for (i = 0; i < ncontig; i++, addr += pgsize, ptep++) {
- pte_t pte = __ptep_get_and_clear(mm, addr, ptep);
-
- /*
- * If HW_AFDBM is enabled, then the HW could turn on
- * the dirty or accessed bit for any page in the set,
- * so check them all.
- */
- if (pte_dirty(pte))
- orig_pte = pte_mkdirty(orig_pte);
-
- if (pte_young(pte))
- orig_pte = pte_mkyoung(orig_pte);
+ pte_t pte, tmp_pte;
+ bool present;
+
+ pte = __ptep_get_and_clear(mm, addr, ptep);
+ present = pte_present(pte);
+ while (--ncontig) {
+ ptep++;
+ addr += pgsize;
+ tmp_pte = __ptep_get_and_clear(mm, addr, ptep);
+ if (present) {
+ if (pte_dirty(tmp_pte))
+ pte = pte_mkdirty(pte);
+ if (pte_young(tmp_pte))
+ pte = pte_mkyoung(pte);
+ }
}
- return orig_pte;
+ return pte;
}
static pte_t get_clear_contig_flush(struct mm_struct *mm,
@@ -419,13 +411,8 @@ pte_t huge_ptep_get_and_clear(struct mm_struct *mm, unsigned long addr,
{
int ncontig;
size_t pgsize;
- pte_t orig_pte = __ptep_get(ptep);
-
- if (!pte_cont(orig_pte))
- return __ptep_get_and_clear(mm, addr, ptep);
-
- ncontig = find_num_contig(mm, addr, ptep, &pgsize);
+ ncontig = num_contig_ptes(sz, &pgsize);
return get_clear_contig(mm, addr, ptep, pgsize, ncontig);
}
--
2.34.1
Running sha224_kunit on a KMSAN-enabled kernel results in a crash in
kmsan_internal_set_shadow_origin():
BUG: unable to handle page fault for address: ffffbc3840291000
#PF: supervisor read access in kernel mode
#PF: error_code(0x0000) - not-present page
PGD 1810067 P4D 1810067 PUD 192d067 PMD 3c17067 PTE 0
Oops: 0000 [#1] SMP NOPTI
CPU: 0 UID: 0 PID: 81 Comm: kunit_try_catch Tainted: G N 6.17.0-rc3 #10 PREEMPT(voluntary)
Tainted: [N]=TEST
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.17.0-0-gb52ca86e094d-prebuilt.qemu.org 04/01/2014
RIP: 0010:kmsan_internal_set_shadow_origin+0x91/0x100
[...]
Call Trace:
<TASK>
__msan_memset+0xee/0x1a0
sha224_final+0x9e/0x350
test_hash_buffer_overruns+0x46f/0x5f0
? kmsan_get_shadow_origin_ptr+0x46/0xa0
? __pfx_test_hash_buffer_overruns+0x10/0x10
kunit_try_run_case+0x198/0xa00
This occurs when memset() is called on a buffer that is not 4-byte
aligned and extends to the end of a guard page, i.e. the next page is
unmapped.
The bug is that the loop at the end of
kmsan_internal_set_shadow_origin() accesses the wrong shadow memory
bytes when the address is not 4-byte aligned. Since each 4 bytes are
associated with an origin, it rounds the address and size so that it can
access all the origins that contain the buffer. However, when it checks
the corresponding shadow bytes for a particular origin, it incorrectly
uses the original unrounded shadow address. This results in reads from
shadow memory beyond the end of the buffer's shadow memory, which
crashes when that memory is not mapped.
To fix this, correctly align the shadow address before accessing the 4
shadow bytes corresponding to each origin.
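The arithmetic behind the out-of-bounds read can be sketched in user
space. This is a model, not the KMSAN code: model_shadow_read_end() is a
hypothetical helper that returns the byte offset, relative to the start
of the buffer's shadow, one past the last shadow byte the origin loop
would read. A result larger than the buffer size means the loop reads
beyond the buffer's shadow. MODEL_ORIGIN_SIZE stands in for
KMSAN_ORIGIN_SIZE, assumed to be 4 as the text implies.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_ORIGIN_SIZE 4u /* assumed: one origin per 4 bytes */

/* Offset (relative to the buffer's shadow start) one past the last
 * shadow byte read by the per-origin loop.
 *
 * fixed == 0: the loop indexes 4-byte slots from the unrounded shadow
 *             address, so it covers bytes [0, rounded_size).
 * fixed != 0: the loop indexes from the shadow address moved back by
 *             'pad', so it covers bytes [-pad, rounded_size - pad). */
static ptrdiff_t model_shadow_read_end(uintptr_t address, size_t size,
				       int fixed)
{
	size_t pad = address % MODEL_ORIGIN_SIZE;
	ptrdiff_t base = 0;

	assert(size <= (1u << 20)); /* keep the model's arithmetic small */
	if (pad) {
		size += pad;
		if (fixed)
			base = -(ptrdiff_t)pad;
	}
	/* Round up to a whole number of origin slots. */
	size = (size + MODEL_ORIGIN_SIZE - 1) / MODEL_ORIGIN_SIZE *
	       MODEL_ORIGIN_SIZE;
	return base + (ptrdiff_t)size;
}
```

For a 3-byte buffer ending at a page boundary (address 0xffd), the
unfixed variant yields 4, one byte past the buffer's shadow, while the
fixed variant yields 3.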
Fixes: 2ef3cec44c60 ("kmsan: do not wipe out origin when doing partial unpoisoning")
Cc: stable(a)vger.kernel.org
Signed-off-by: Eric Biggers <ebiggers(a)kernel.org>
---
v2: Added test case to kmsan_test.
mm/kmsan/core.c | 10 +++++++---
mm/kmsan/kmsan_test.c | 16 ++++++++++++++++
2 files changed, 23 insertions(+), 3 deletions(-)
diff --git a/mm/kmsan/core.c b/mm/kmsan/core.c
index 1ea711786c522..8bca7fece47f0 100644
--- a/mm/kmsan/core.c
+++ b/mm/kmsan/core.c
@@ -193,11 +193,12 @@ depot_stack_handle_t kmsan_internal_chain_origin(depot_stack_handle_t id)
void kmsan_internal_set_shadow_origin(void *addr, size_t size, int b,
u32 origin, bool checked)
{
u64 address = (u64)addr;
- u32 *shadow_start, *origin_start;
+ void *shadow_start;
+ u32 *aligned_shadow, *origin_start;
size_t pad = 0;
KMSAN_WARN_ON(!kmsan_metadata_is_contiguous(addr, size));
shadow_start = kmsan_get_metadata(addr, KMSAN_META_SHADOW);
if (!shadow_start) {
@@ -212,13 +213,16 @@ void kmsan_internal_set_shadow_origin(void *addr, size_t size, int b,
}
return;
}
__memset(shadow_start, b, size);
- if (!IS_ALIGNED(address, KMSAN_ORIGIN_SIZE)) {
+ if (IS_ALIGNED(address, KMSAN_ORIGIN_SIZE)) {
+ aligned_shadow = shadow_start;
+ } else {
pad = address % KMSAN_ORIGIN_SIZE;
address -= pad;
+ aligned_shadow = shadow_start - pad;
size += pad;
}
size = ALIGN(size, KMSAN_ORIGIN_SIZE);
origin_start =
(u32 *)kmsan_get_metadata((void *)address, KMSAN_META_ORIGIN);
@@ -228,11 +232,11 @@ void kmsan_internal_set_shadow_origin(void *addr, size_t size, int b,
* and unconditionally overwrite the old origin slot.
* If the new origin is zero, overwrite the old origin slot iff the
* corresponding shadow slot is zero.
*/
for (int i = 0; i < size / KMSAN_ORIGIN_SIZE; i++) {
- if (origin || !shadow_start[i])
+ if (origin || !aligned_shadow[i])
origin_start[i] = origin;
}
}
struct page *kmsan_vmalloc_to_page_or_null(void *vaddr)
diff --git a/mm/kmsan/kmsan_test.c b/mm/kmsan/kmsan_test.c
index c6c5b2bbede0c..902ec48b1e3e6 100644
--- a/mm/kmsan/kmsan_test.c
+++ b/mm/kmsan/kmsan_test.c
@@ -554,10 +554,25 @@ static void test_memcpy_initialized_gap(struct kunit *test)
DEFINE_TEST_MEMSETXX(16)
DEFINE_TEST_MEMSETXX(32)
DEFINE_TEST_MEMSETXX(64)
+/* Test case: ensure that KMSAN does not access shadow memory out of bounds. */
+static void test_memset_on_guarded_buffer(struct kunit *test)
+{
+ void *buf = vmalloc(PAGE_SIZE);
+
+ kunit_info(test,
+ "memset() on ends of guarded buffer should not crash\n");
+
+ for (size_t size = 0; size <= 128; size++) {
+ memset(buf, 0xff, size);
+ memset(buf + PAGE_SIZE - size, 0xff, size);
+ }
+ vfree(buf);
+}
+
static noinline void fibonacci(int *array, int size, int start)
{
if (start < 2 || (start == size))
return;
array[start] = array[start - 1] + array[start - 2];
@@ -675,10 +690,11 @@ static struct kunit_case kmsan_test_cases[] = {
KUNIT_CASE(test_memcpy_aligned_to_unaligned),
KUNIT_CASE(test_memcpy_initialized_gap),
KUNIT_CASE(test_memset16),
KUNIT_CASE(test_memset32),
KUNIT_CASE(test_memset64),
+ KUNIT_CASE(test_memset_on_guarded_buffer),
KUNIT_CASE(test_long_origin_chain),
KUNIT_CASE(test_stackdepot_roundtrip),
KUNIT_CASE(test_unpoison_memory),
KUNIT_CASE(test_copy_from_kernel_nofault),
{},
base-commit: e59a039119c3ec241228adf12dca0dd4398104d0
--
2.51.0
Hi, All
Please help to cherry-pick the following commit
25daf9af0ac1 ("soc: qcom: mdt_loader: Deal with zero e_shentsize")
into the following branches:
linux-5.4.y
linux-5.10.y
linux-5.15.y
linux-6.1.y
It fixes an issue caused by the following commit, which is already in
those branches:
9f9967fed9d0 ("soc: qcom: mdt_loader: Ensure we don't read past
the ELF header")
Please note that for the linux-6.1.y branch the following commit
needs to be cherry-picked before the 25daf9af0ac1 commit:
9f35ab0e53cc ("soc: qcom: mdt_loader: Fix error return values in
mdt_header_valid()")
# if this needs to be in a separate cherry-pick request
# please let me know.
--
Best Regards,
Yongqin Liu
---------------------------------------------------------------
#mailing list
linaro-android(a)lists.linaro.org
http://lists.linaro.org/mailman/listinfo/linaro-android
This is the start of the stable review cycle for the 5.15.192 release.
There are 64 patches in this series, all will be posted as a response
to this one. If anyone has any issues with these being applied, please
let me know.
Responses should be made by Tue, 09 Sep 2025 19:55:53 +0000.
Anything received after that time might be too late.
The whole patch series can be found in one patch at:
https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.15.192-r…
or in the git tree and branch at:
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.15.y
and the diffstat can be found below.
thanks,
greg k-h
-------------
Pseudo-Shortlog of commits:
Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Linux 5.15.192-rc1
Qiu-ji Chen <chenqiuji666(a)gmail.com>
dmaengine: mediatek: Fix a flag reuse error in mtk_cqdma_tx_status()
Aaron Kling <webgeek1234(a)gmail.com>
spi: tegra114: Use value to check for invalid delays
Taniya Das <quic_tdas(a)quicinc.com>
clk: qcom: gdsc: Set retain_ff before moving to HW CTRL
Ian Rogers <irogers(a)google.com>
perf bpf-event: Fix use-after-free in synthesis
Michael Walle <mwalle(a)kernel.org>
drm/bridge: ti-sn65dsi86: fix REFCLK setting
Larisa Grigore <larisa.grigore(a)nxp.com>
spi: spi-fsl-lpspi: Reset FIFO and disable module on transfer abort
Larisa Grigore <larisa.grigore(a)nxp.com>
spi: spi-fsl-lpspi: Set correct chip-select polarity bit
Larisa Grigore <larisa.grigore(a)nxp.com>
spi: spi-fsl-lpspi: Fix transmissions when using CONT
Wentao Liang <vulab(a)iscas.ac.cn>
pcmcia: Add error handling for add_interval() in do_validate_mem()
Takashi Iwai <tiwai(a)suse.de>
ALSA: hda/hdmi: Add pin fix for another HP EliteDesk 800 G4 model
Li Qiong <liqiong(a)nfschina.com>
mm/slub: avoid accessing metadata when pointer is invalid in object_err()
Kees Cook <kees(a)kernel.org>
randstruct: gcc-plugin: Fix attribute addition
Kees Cook <kees(a)kernel.org>
randstruct: gcc-plugin: Remove bogus void member
Gabor Juhos <j4g8y7(a)gmail.com>
arm64: dts: marvell: uDPU: define pinctrl state for alarm LEDs
Ronak Doshi <ronak.doshi(a)broadcom.com>
vmxnet3: update MTU after device quiesce
Jakob Unterwurzacher <jakobunt(a)gmail.com>
net: dsa: microchip: linearize skb for tail-tagging switches
Pieter Van Trappen <pieter.van.trappen(a)cern.ch>
net: dsa: microchip: update tag_ksz masks for KSZ9477 family
Qiu-ji Chen <chenqiuji666(a)gmail.com>
dmaengine: mediatek: Fix a possible deadlock error in mtk_cqdma_tx_status()
Hyejeong Choi <hjeong.choi(a)samsung.com>
dma-buf: insert memory barrier before updating num_fences
Emanuele Ghidoli <emanuele.ghidoli(a)toradex.com>
gpio: pca953x: fix IRQ storm on system wake up
Luca Ceresoli <luca.ceresoli(a)bootlin.com>
iio: light: opt3001: fix deadlock due to concurrent flag access
David Lechner <dlechner(a)baylibre.com>
iio: chemical: pms7003: use aligned_s64 for timestamp
Aaron Kling <webgeek1234(a)gmail.com>
spi: tegra114: Don't fail set_cs_timing when delays are zero
Alexander Danilenko <al.b.danilenko(a)gmail.com>
spi: tegra114: Remove unnecessary NULL-pointer checks
Sean Christopherson <seanjc(a)google.com>
KVM: x86: Take irqfds.lock when adding/deleting IRQ bypass producer
Rafael J. Wysocki <rafael.j.wysocki(a)intel.com>
cpufreq/sched: Explicitly synchronize limits_changed flag handling
Jann Horn <jannh(a)google.com>
mm/khugepaged: fix ->anon_vma race
Vitaly Lifshits <vitaly.lifshits(a)intel.com>
e1000e: fix heap overflow in e1000_set_eeprom
Stanislav Fort <stanislav.fort(a)aisle.com>
batman-adv: fix OOB read/write in network-coding decode
John Evans <evans1210144(a)gmail.com>
scsi: lpfc: Fix buffer free/clear order in deferred receive path
Alex Deucher <alexander.deucher(a)amd.com>
drm/amdgpu: drop hw access in non-DC audio fini
Qianfeng Rong <rongqianfeng(a)vivo.com>
wifi: mwifiex: Initialize the chan_stats array to zero
Harry Yoo <harry.yoo(a)oracle.com>
mm: move page table sync declarations to linux/pgtable.h
Harry Yoo <harry.yoo(a)oracle.com>
x86/mm/64: define ARCH_PAGE_TABLE_SYNC_MASK and arch_sync_kernel_mappings()
Ma Ke <make24(a)iscas.ac.cn>
pcmcia: Fix a NULL pointer dereference in __iodyn_find_io_region()
Cryolitia PukNgae <cryolitia(a)uniontech.com>
ALSA: usb-audio: Add mute TLV for playback volumes on some devices
Horatiu Vultur <horatiu.vultur(a)microchip.com>
phy: mscc: Stop taking ts_lock for tx_queue and use its own lock
Horatiu Vultur <horatiu.vultur(a)microchip.com>
net: phy: mscc: Fix memory leak when using one step timestamping
Kurt Kanzenbach <kurt(a)linutronix.de>
ptp: Add generic PTP is_sync() function
Qingfang Deng <dqfext(a)gmail.com>
ppp: fix memory leak in pad_compress_skb
Wang Liang <wangliang74(a)huawei.com>
net: atm: fix memory leak in atm_register_sysfs when device_register fail
Eric Dumazet <edumazet(a)google.com>
ax25: properly unshare skbs in ax25_kiss_rcv()
Dan Carpenter <dan.carpenter(a)linaro.org>
ipv4: Fix NULL vs error pointer check in inet_blackhole_dev_init()
Rosen Penev <rosenp(a)gmail.com>
net: thunder_bgx: decrement cleanup index before use
Rosen Penev <rosenp(a)gmail.com>
net: thunder_bgx: add a missing of_node_put
Dan Carpenter <dan.carpenter(a)linaro.org>
wifi: libertas: cap SSID len in lbs_associate()
Dan Carpenter <dan.carpenter(a)linaro.org>
wifi: cw1200: cap SSID length in cw1200_do_join()
Felix Fietkau <nbd(a)nbd.name>
net: ethernet: mtk_eth_soc: fix tx vlan tag for llc packets
Zhen Ni <zhen.ni(a)easystack.cn>
i40e: Fix potential invalid access when MAC list is empty
Fabian Bläse <fabian(a)blaese.de>
icmp: fix icmp_ndo_send address translation for reply direction
Miaoqian Lin <linmq006(a)gmail.com>
mISDN: Fix memory leak in dsp_hwec_enable()
Alok Tiwari <alok.a.tiwari(a)oracle.com>
xirc2ps_cs: fix register access when enabling FullDuplex
Kuniyuki Iwashima <kuniyu(a)google.com>
Bluetooth: Fix use-after-free in l2cap_sock_cleanup_listen()
Phil Sutter <phil(a)nwl.cc>
netfilter: conntrack: helper: Replace -EEXIST by -EBUSY
Wang Liang <wangliang74(a)huawei.com>
netfilter: br_netfilter: do not check confirmed bit in br_nf_local_in() after confirm
Dmitry Antipov <dmantipov(a)yandex.ru>
wifi: cfg80211: fix use-after-free in cmp_bss()
Peter Robinson <pbrobinson(a)gmail.com>
arm64: dts: rockchip: Add vcc-supply to SPI flash on rk3399-pinebook-pro
Pei Xiao <xiaopei01(a)kylinos.cn>
tee: fix NULL pointer dereference in tee_shm_put
Jiufei Xue <jiufei.xue(a)samsung.com>
fs: writeback: fix use-after-free in __mark_inode_dirty()
Timur Kristóf <timur.kristof(a)gmail.com>
drm/amd/display: Don't warn when missing DCE encoder caps
Daniel Borkmann <daniel(a)iogearbox.net>
bpf: Fix oob access in cgroup local storage
Daniel Borkmann <daniel(a)iogearbox.net>
bpf: Move bpf map owner out of common struct
Daniel Borkmann <daniel(a)iogearbox.net>
bpf: Move cgroup iterator helpers to bpf.h
Daniel Borkmann <daniel(a)iogearbox.net>
bpf: Add cookie object to bpf maps
-------------
Diffstat:
Makefile | 4 +-
arch/arm64/boot/dts/marvell/armada-3720-uDPU.dts | 9 +-
.../boot/dts/rockchip/rk3399-pinebook-pro.dts | 1 +
arch/x86/include/asm/pgtable_64_types.h | 3 +
arch/x86/kvm/x86.c | 18 ++-
arch/x86/mm/init_64.c | 18 +++
drivers/clk/qcom/gdsc.c | 21 ++--
drivers/dma-buf/dma-resv.c | 5 +-
drivers/dma/mediatek/mtk-cqdma.c | 10 +-
drivers/gpio/gpio-pca953x.c | 5 +
drivers/gpu/drm/amd/amdgpu/dce_v10_0.c | 5 -
drivers/gpu/drm/amd/amdgpu/dce_v11_0.c | 5 -
drivers/gpu/drm/amd/amdgpu/dce_v6_0.c | 5 -
drivers/gpu/drm/amd/amdgpu/dce_v8_0.c | 5 -
.../gpu/drm/amd/display/dc/dce/dce_link_encoder.c | 8 +-
drivers/gpu/drm/bridge/ti-sn65dsi86.c | 11 ++
drivers/iio/chemical/pms7003.c | 5 +-
drivers/iio/light/opt3001.c | 5 +-
drivers/isdn/mISDN/dsp_hwec.c | 6 +-
drivers/net/ethernet/cavium/thunder/thunder_bgx.c | 20 +--
drivers/net/ethernet/intel/e1000e/ethtool.c | 10 +-
drivers/net/ethernet/intel/i40e/i40e_client.c | 4 +-
drivers/net/ethernet/mediatek/mtk_eth_soc.c | 10 +-
drivers/net/ethernet/xircom/xirc2ps_cs.c | 2 +-
drivers/net/phy/mscc/mscc_ptp.c | 34 +++---
drivers/net/ppp/ppp_generic.c | 6 +-
drivers/net/vmxnet3/vmxnet3_drv.c | 5 +-
drivers/net/wireless/marvell/libertas/cfg.c | 9 +-
drivers/net/wireless/marvell/mwifiex/cfg80211.c | 5 +-
drivers/net/wireless/marvell/mwifiex/main.c | 4 +-
drivers/net/wireless/st/cw1200/sta.c | 2 +-
drivers/pcmcia/rsrc_iodyn.c | 3 +
drivers/pcmcia/rsrc_nonstatic.c | 4 +-
drivers/scsi/lpfc/lpfc_nvmet.c | 10 +-
drivers/spi/spi-fsl-lpspi.c | 15 +--
drivers/spi/spi-tegra114.c | 18 ++-
drivers/tee/tee_shm.c | 6 +-
fs/fs-writeback.c | 9 +-
include/linux/bpf-cgroup.h | 5 -
include/linux/bpf.h | 134 ++++++++++++++++++---
include/linux/pgtable.h | 16 +++
include/linux/ptp_classify.h | 15 +++
include/linux/vmalloc.h | 16 ---
kernel/bpf/arraymap.c | 1 -
kernel/bpf/core.c | 83 ++++++++++---
kernel/bpf/syscall.c | 22 ++--
kernel/sched/cpufreq_schedutil.c | 28 ++++-
mm/khugepaged.c | 15 ++-
mm/slub.c | 7 +-
net/atm/resources.c | 6 +-
net/ax25/ax25_in.c | 4 +
net/batman-adv/network-coding.c | 7 +-
net/bluetooth/l2cap_sock.c | 3 +
net/bridge/br_netfilter_hooks.c | 3 -
net/core/ptp_classifier.c | 12 ++
net/dsa/tag_ksz.c | 22 +++-
net/ipv4/devinet.c | 7 +-
net/ipv4/icmp.c | 6 +-
net/ipv6/ip6_icmp.c | 6 +-
net/netfilter/nf_conntrack_helper.c | 4 +-
net/wireless/scan.c | 3 +-
scripts/gcc-plugins/gcc-common.h | 32 +++++
scripts/gcc-plugins/randomize_layout_plugin.c | 40 ++----
sound/pci/hda/patch_hdmi.c | 1 +
sound/usb/mixer_quirks.c | 2 +
tools/perf/util/bpf-event.c | 39 ++++--
66 files changed, 600 insertions(+), 264 deletions(-)
The 4 patches in this series make the JMP_NOSPEC and CALL_NOSPEC macros used
in the kernel consistent with what is generated by the compiler.
("x86,nospec: Simplify {JMP,CALL}_NOSPEC") was merged in v6.0 and the remaining
3 patches in this series were merged in v6.15. All 4 were included in kernels
v5.15+ as prerequisites for the backport of the ITS mitigations [1].
None of these patches were included in the backport of the ITS mitigations to
the 5.10 kernel [2]. They all apply cleanly and are applicable to the 5.10
kernel, so I see no reason why they were not applied there. Please correct me
if I am wrong.
I am sending them for inclusion in the 5.10 kernel because it is still
actively maintained for this kind of vulnerability mitigation. Having these
patches unifies the handling of these cases with later kernel versions, which
makes the code easier to understand and eases future backports.
[1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?…
[2] https://lore.kernel.org/stable/20250617-its-5-10-v2-0-3e925a1512a1@linux.in…
Pawan Gupta (3):
x86/speculation: Simplify and make CALL_NOSPEC consistent
x86/speculation: Add a conditional CS prefix to CALL_NOSPEC
x86/speculation: Remove the extra #ifdef around CALL_NOSPEC
Peter Zijlstra (1):
x86,nospec: Simplify {JMP,CALL}_NOSPEC
arch/x86/include/asm/nospec-branch.h | 46 ++++++++++++++++++----------
1 file changed, 30 insertions(+), 16 deletions(-)
--
2.34.1
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.4.y
git checkout FETCH_HEAD
git cherry-pick -x 64961557efa1b98f375c0579779e7eeda1a02c42
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025091752-dizziness-decorated-ee3a@gregkh' --subject-prefix 'PATCH 5.4.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 64961557efa1b98f375c0579779e7eeda1a02c42 Mon Sep 17 00:00:00 2001
From: Johan Hovold <johan(a)kernel.org>
Date: Thu, 24 Jul 2025 15:12:05 +0200
Subject: [PATCH] phy: ti: omap-usb2: fix device leak at unbind
Make sure to drop the reference to the control device taken by
of_find_device_by_node() during probe when the driver is unbound.
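The pattern of registering a release action at probe time so that a
reference is dropped at unbind can be sketched in user space. Everything
here is a hypothetical model: struct model_devres,
model_add_action_or_reset() and model_unbind() only imitate the idea of
a managed action list and are not the kernel devres API.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical refcounted device. */
struct model_device {
	int refcount;
};

/* Models a lookup that returns the device with an extra reference. */
static void model_get_device(struct model_device *dev)
{
	dev->refcount++;
}

/* Release action: drops one reference. */
static void model_put_device(void *data)
{
	struct model_device *dev = data;

	assert(dev->refcount > 0);
	dev->refcount--;
}

#define MODEL_MAX_ACTIONS 4

/* Hypothetical list of actions to run when the driver is unbound. */
struct model_devres {
	void (*action[MODEL_MAX_ACTIONS])(void *);
	void *data[MODEL_MAX_ACTIONS];
	size_t count;
};

/* Register an action; if registration fails, run the action right away
 * so the reference is not leaked (the "or reset" part). */
static int model_add_action_or_reset(struct model_devres *res,
				     void (*action)(void *), void *data)
{
	if (res->count == MODEL_MAX_ACTIONS) {
		action(data);
		return -1;
	}
	res->action[res->count] = action;
	res->data[res->count] = data;
	res->count++;
	return 0;
}

/* Unbind: run the registered actions in reverse order. */
static void model_unbind(struct model_devres *res)
{
	while (res->count) {
		res->count--;
		res->action[res->count](res->data[res->count]);
	}
}
```

In the model, the reference taken at probe is balanced either at unbind
or immediately if the action cannot be registered, so the count returns
to its starting value on both paths.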
Fixes: 478b6c7436c2 ("usb: phy: omap-usb2: Don't use omap_get_control_dev()")
Cc: stable(a)vger.kernel.org # 3.13
Cc: Roger Quadros <rogerq(a)kernel.org>
Signed-off-by: Johan Hovold <johan(a)kernel.org>
Link: https://lore.kernel.org/r/20250724131206.2211-3-johan@kernel.org
Signed-off-by: Vinod Koul <vkoul(a)kernel.org>
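A note for backporters: the commit message above amounts to "take a
reference in probe, register a managed action that drops it at unbind".
The following is a minimal userspace sketch of that pattern, not kernel
code. All names in it (dev_model, driver_model, add_action_or_reset(),
unbind(), probe()) are hypothetical stand-ins; the real kernel helper is
devm_add_action_or_reset(), which likewise runs the action immediately if
registration fails.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical model of a refcounted struct device. */
struct dev_model {
	int refcount;
};

typedef void (*action_fn)(void *data);

#define MAX_ACTIONS 4

/* Hypothetical model of a bound driver with a list of managed actions. */
struct driver_model {
	action_fn actions[MAX_ACTIONS];
	void *data[MAX_ACTIONS];
	size_t nr_actions;
};

static void get_device_model(struct dev_model *dev)
{
	dev->refcount++;
}

/* Same shape as omap_usb2_put_device() in the patch: void * in, put. */
static void put_device_model(void *data)
{
	struct dev_model *dev = data;

	dev->refcount--;
}

/* Register a cleanup action; if that fails, run it right away. */
static int add_action_or_reset(struct driver_model *drv, action_fn fn,
			       void *data)
{
	assert(fn != NULL);

	if (drv->nr_actions == MAX_ACTIONS) {
		fn(data);
		return -ENOMEM;
	}
	drv->actions[drv->nr_actions] = fn;
	drv->data[drv->nr_actions] = data;
	drv->nr_actions++;
	return 0;
}

/* On unbind, run the registered actions in reverse order. */
static void unbind(struct driver_model *drv)
{
	while (drv->nr_actions > 0) {
		drv->nr_actions--;
		drv->actions[drv->nr_actions](drv->data[drv->nr_actions]);
	}
}

/* Probe takes a reference; with_fix selects whether it is ever dropped. */
static int probe(struct driver_model *drv, struct dev_model *control,
		 int with_fix)
{
	get_device_model(control);
	if (with_fix)
		return add_action_or_reset(drv, put_device_model, control);
	return 0;
}
```

In this model, a probe/unbind cycle without the registered action leaves
the control device's refcount one higher than before, which is the leak
the patch describes; with the action registered the count returns to its
starting value.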
diff --git a/drivers/phy/ti/phy-omap-usb2.c b/drivers/phy/ti/phy-omap-usb2.c
index c1a0ef979142..c444bb2530ca 100644
--- a/drivers/phy/ti/phy-omap-usb2.c
+++ b/drivers/phy/ti/phy-omap-usb2.c
@@ -363,6 +363,13 @@ static void omap_usb2_init_errata(struct omap_usb *phy)
phy->flags |= OMAP_USB2_DISABLE_CHRG_DET;
}
+static void omap_usb2_put_device(void *_dev)
+{
+ struct device *dev = _dev;
+
+ put_device(dev);
+}
+
static int omap_usb2_probe(struct platform_device *pdev)
{
struct omap_usb *phy;
@@ -373,6 +380,7 @@ static int omap_usb2_probe(struct platform_device *pdev)
struct device_node *control_node;
struct platform_device *control_pdev;
const struct usb_phy_data *phy_data;
+ int ret;
phy_data = device_get_match_data(&pdev->dev);
if (!phy_data)
@@ -423,6 +431,11 @@ static int omap_usb2_probe(struct platform_device *pdev)
return -EINVAL;
}
phy->control_dev = &control_pdev->dev;
+
+ ret = devm_add_action_or_reset(&pdev->dev, omap_usb2_put_device,
+ phy->control_dev);
+ if (ret)
+ return ret;
} else {
if (of_property_read_u32_index(node,
"syscon-phy-power", 1,
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.4.y
git checkout FETCH_HEAD
git cherry-pick -x edcbe06453ddfde21f6aa763f7cab655f26133cc
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025091758-flask-diligence-4c70@gregkh' --subject-prefix 'PATCH 5.4.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From edcbe06453ddfde21f6aa763f7cab655f26133cc Mon Sep 17 00:00:00 2001
From: Mathias Nyman <mathias.nyman(a)linux.intel.com>
Date: Tue, 2 Sep 2025 13:53:06 +0300
Subject: [PATCH] xhci: fix memory leak regression when freeing xhci vdev
devices depth first
A suspend-resume cycle test revealed a memory leak in 6.17-rc3.
It turns out the slot_id race fix accidentally ends up calling
xhci_free_virt_device() with an incorrect vdev parameter.
The vdev variable was reused for temporary purposes right before calling
xhci_free_virt_device().
Fix this by passing the correct vdev parameter.
The slot_id race fix that caused this regression was targeted for stable,
so this needs to be applied there as well.
Fixes: 2eb03376151b ("usb: xhci: Fix slot_id resource race conflict")
Reported-by: David Wang <00107082(a)163.com>
Closes: https://lore.kernel.org/linux-usb/20250829181354.4450-1-00107082@163.com
Suggested-by: Michal Pecio <michal.pecio(a)gmail.com>
Suggested-by: David Wang <00107082(a)163.com>
Cc: stable(a)vger.kernel.org
Tested-by: David Wang <00107082(a)163.com>
Signed-off-by: Mathias Nyman <mathias.nyman(a)linux.intel.com>
Link: https://lore.kernel.org/r/20250902105306.877476-4-mathias.nyman@linux.intel…
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
diff --git a/drivers/usb/host/xhci-mem.c b/drivers/usb/host/xhci-mem.c
index 81eaad87a3d9..c4a6544aa107 100644
--- a/drivers/usb/host/xhci-mem.c
+++ b/drivers/usb/host/xhci-mem.c
@@ -962,7 +962,7 @@ static void xhci_free_virt_devices_depth_first(struct xhci_hcd *xhci, int slot_i
out:
/* we are now at a leaf device */
xhci_debugfs_remove_slot(xhci, slot_id);
- xhci_free_virt_device(xhci, vdev, slot_id);
+ xhci_free_virt_device(xhci, xhci->devs[slot_id], slot_id);
}
int xhci_alloc_virt_device(struct xhci_hcd *xhci, int slot_id,
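For anyone preparing the backport, the bug class in the patch above (a
local variable reused as a loop temporary, then passed on as if it still
held its first value) can be reproduced in a small userspace sketch. This
is a simplified model under stated assumptions, not the xhci code:
vdev_model, hcd_model, free_vdev() and free_depth_first() are hypothetical
names, parent_slot stands in for the tt_info relationship, and "freeing"
only clears the slot and bumps a counter.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_SLOTS 4

/* Hypothetical device model; parent_slot 0 means "attached to the root". */
struct vdev_model {
	int parent_slot;
};

struct hcd_model {
	struct vdev_model *devs[MAX_SLOTS];
	int freed;
};

/* Like the real helper in spirit: a NULL vdev means nothing is freed. */
static void free_vdev(struct hcd_model *h, struct vdev_model *vdev,
		      int slot_id)
{
	if (!vdev)
		return;
	h->devs[slot_id] = NULL;
	h->freed++;
}

static void free_depth_first(struct hcd_model *h, int slot_id, int fixed)
{
	struct vdev_model *vdev;
	int i;

	assert(slot_id > 0 && slot_id < MAX_SLOTS);

	vdev = h->devs[slot_id];
	if (!vdev)
		return;

	/* Free children first. Note that vdev is reused as a temporary. */
	for (i = 1; i < MAX_SLOTS; i++) {
		vdev = h->devs[i];
		if (vdev && vdev->parent_slot == slot_id)
			free_depth_first(h, i, fixed);
	}

	/*
	 * Buggy variant: vdev now holds whatever the last loop iteration
	 * left in it. Fixed variant: look the device up again by slot_id.
	 */
	free_vdev(h, fixed ? h->devs[slot_id] : vdev, slot_id);
}
```

With a hub in slot 1, a child in slot 2 and an empty last slot, the buggy
variant of this model frees nothing because the stale vdev is NULL, while
the fixed variant frees both devices.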
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.10.y
git checkout FETCH_HEAD
git cherry-pick -x edcbe06453ddfde21f6aa763f7cab655f26133cc
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025091757-subsidy-arson-d8b7@gregkh' --subject-prefix 'PATCH 5.10.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From edcbe06453ddfde21f6aa763f7cab655f26133cc Mon Sep 17 00:00:00 2001
From: Mathias Nyman <mathias.nyman(a)linux.intel.com>
Date: Tue, 2 Sep 2025 13:53:06 +0300
Subject: [PATCH] xhci: fix memory leak regression when freeing xhci vdev
devices depth first
A suspend-resume cycle test revealed a memory leak in 6.17-rc3.
It turns out the slot_id race fix accidentally ends up calling
xhci_free_virt_device() with an incorrect vdev parameter.
The vdev variable was reused for temporary purposes right before calling
xhci_free_virt_device().
Fix this by passing the correct vdev parameter.
The slot_id race fix that caused this regression was targeted for stable,
so this needs to be applied there as well.
Fixes: 2eb03376151b ("usb: xhci: Fix slot_id resource race conflict")
Reported-by: David Wang <00107082(a)163.com>
Closes: https://lore.kernel.org/linux-usb/20250829181354.4450-1-00107082@163.com
Suggested-by: Michal Pecio <michal.pecio(a)gmail.com>
Suggested-by: David Wang <00107082(a)163.com>
Cc: stable(a)vger.kernel.org
Tested-by: David Wang <00107082(a)163.com>
Signed-off-by: Mathias Nyman <mathias.nyman(a)linux.intel.com>
Link: https://lore.kernel.org/r/20250902105306.877476-4-mathias.nyman@linux.intel…
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
diff --git a/drivers/usb/host/xhci-mem.c b/drivers/usb/host/xhci-mem.c
index 81eaad87a3d9..c4a6544aa107 100644
--- a/drivers/usb/host/xhci-mem.c
+++ b/drivers/usb/host/xhci-mem.c
@@ -962,7 +962,7 @@ static void xhci_free_virt_devices_depth_first(struct xhci_hcd *xhci, int slot_i
out:
/* we are now at a leaf device */
xhci_debugfs_remove_slot(xhci, slot_id);
- xhci_free_virt_device(xhci, vdev, slot_id);
+ xhci_free_virt_device(xhci, xhci->devs[slot_id], slot_id);
}
int xhci_alloc_virt_device(struct xhci_hcd *xhci, int slot_id,
The patch below does not apply to the 5.15-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.15.y
git checkout FETCH_HEAD
git cherry-pick -x edcbe06453ddfde21f6aa763f7cab655f26133cc
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025091757-filler-dispose-635b@gregkh' --subject-prefix 'PATCH 5.15.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From edcbe06453ddfde21f6aa763f7cab655f26133cc Mon Sep 17 00:00:00 2001
From: Mathias Nyman <mathias.nyman(a)linux.intel.com>
Date: Tue, 2 Sep 2025 13:53:06 +0300
Subject: [PATCH] xhci: fix memory leak regression when freeing xhci vdev
devices depth first
A suspend-resume cycle test revealed a memory leak in 6.17-rc3.
It turns out the slot_id race fix accidentally ends up calling
xhci_free_virt_device() with an incorrect vdev parameter.
The vdev variable was reused for temporary purposes right before calling
xhci_free_virt_device().
Fix this by passing the correct vdev parameter.
The slot_id race fix that caused this regression was targeted for stable,
so this needs to be applied there as well.
Fixes: 2eb03376151b ("usb: xhci: Fix slot_id resource race conflict")
Reported-by: David Wang <00107082(a)163.com>
Closes: https://lore.kernel.org/linux-usb/20250829181354.4450-1-00107082@163.com
Suggested-by: Michal Pecio <michal.pecio(a)gmail.com>
Suggested-by: David Wang <00107082(a)163.com>
Cc: stable(a)vger.kernel.org
Tested-by: David Wang <00107082(a)163.com>
Signed-off-by: Mathias Nyman <mathias.nyman(a)linux.intel.com>
Link: https://lore.kernel.org/r/20250902105306.877476-4-mathias.nyman@linux.intel…
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
diff --git a/drivers/usb/host/xhci-mem.c b/drivers/usb/host/xhci-mem.c
index 81eaad87a3d9..c4a6544aa107 100644
--- a/drivers/usb/host/xhci-mem.c
+++ b/drivers/usb/host/xhci-mem.c
@@ -962,7 +962,7 @@ static void xhci_free_virt_devices_depth_first(struct xhci_hcd *xhci, int slot_i
out:
/* we are now at a leaf device */
xhci_debugfs_remove_slot(xhci, slot_id);
- xhci_free_virt_device(xhci, vdev, slot_id);
+ xhci_free_virt_device(xhci, xhci->devs[slot_id], slot_id);
}
int xhci_alloc_virt_device(struct xhci_hcd *xhci, int slot_id,
The patch below does not apply to the 6.1-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.1.y
git checkout FETCH_HEAD
git cherry-pick -x edcbe06453ddfde21f6aa763f7cab655f26133cc
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025091756-outskirts-monetize-6f6b@gregkh' --subject-prefix 'PATCH 6.1.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From edcbe06453ddfde21f6aa763f7cab655f26133cc Mon Sep 17 00:00:00 2001
From: Mathias Nyman <mathias.nyman(a)linux.intel.com>
Date: Tue, 2 Sep 2025 13:53:06 +0300
Subject: [PATCH] xhci: fix memory leak regression when freeing xhci vdev
devices depth first
A suspend-resume cycle test revealed a memory leak in 6.17-rc3.
It turns out the slot_id race fix accidentally ends up calling
xhci_free_virt_device() with an incorrect vdev parameter.
The vdev variable was reused for temporary purposes right before calling
xhci_free_virt_device().
Fix this by passing the correct vdev parameter.
The slot_id race fix that caused this regression was targeted for stable,
so this needs to be applied there as well.
Fixes: 2eb03376151b ("usb: xhci: Fix slot_id resource race conflict")
Reported-by: David Wang <00107082(a)163.com>
Closes: https://lore.kernel.org/linux-usb/20250829181354.4450-1-00107082@163.com
Suggested-by: Michal Pecio <michal.pecio(a)gmail.com>
Suggested-by: David Wang <00107082(a)163.com>
Cc: stable(a)vger.kernel.org
Tested-by: David Wang <00107082(a)163.com>
Signed-off-by: Mathias Nyman <mathias.nyman(a)linux.intel.com>
Link: https://lore.kernel.org/r/20250902105306.877476-4-mathias.nyman@linux.intel…
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
diff --git a/drivers/usb/host/xhci-mem.c b/drivers/usb/host/xhci-mem.c
index 81eaad87a3d9..c4a6544aa107 100644
--- a/drivers/usb/host/xhci-mem.c
+++ b/drivers/usb/host/xhci-mem.c
@@ -962,7 +962,7 @@ static void xhci_free_virt_devices_depth_first(struct xhci_hcd *xhci, int slot_i
out:
/* we are now at a leaf device */
xhci_debugfs_remove_slot(xhci, slot_id);
- xhci_free_virt_device(xhci, vdev, slot_id);
+ xhci_free_virt_device(xhci, xhci->devs[slot_id], slot_id);
}
int xhci_alloc_virt_device(struct xhci_hcd *xhci, int slot_id,
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.4.y
git checkout FETCH_HEAD
git cherry-pick -x a5c98e8b1398534ae1feb6e95e2d3ee5215538ed
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025091734-scouts-eligible-d693@gregkh' --subject-prefix 'PATCH 5.4.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From a5c98e8b1398534ae1feb6e95e2d3ee5215538ed Mon Sep 17 00:00:00 2001
From: Mathias Nyman <mathias.nyman(a)linux.intel.com>
Date: Tue, 2 Sep 2025 13:53:05 +0300
Subject: [PATCH] xhci: dbc: Fix full DbC transfer ring after several
reconnects
Pending requests will be flushed on disconnect, and the corresponding
TRBs will be turned into No-op TRBs, which are ignored by the xHC
controller once it starts processing the ring.
If the USB debug cable repeatedly disconnects before the ring is started,
the ring will eventually be filled with No-op TRBs.
No new transfers can be queued when the ring is full, and the driver will
print the following error message:
"xhci_hcd 0000:00:14.0: failed to queue trbs"
This is a normal case for 'in' transfers where TRBs are always enqueued
in advance, ready to take on incoming data. If no data arrives, and the
device is disconnected, then the ring dequeue pointer will remain at the
beginning of the ring while enqueue points to the first free TRB after
the last cancelled No-op TRB.
Solve this by reinitializing the rings when the debug cable disconnects
and DbC is leaving the configured state.
Clear the whole ring buffer, set enqueue and dequeue to the beginning
of the ring, and set the cycle bit to its initial state.
Cc: stable(a)vger.kernel.org
Fixes: dfba2174dc42 ("usb: xhci: Add DbC support in xHCI driver")
Signed-off-by: Mathias Nyman <mathias.nyman(a)linux.intel.com>
Link: https://lore.kernel.org/r/20250902105306.877476-3-mathias.nyman@linux.intel…
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
diff --git a/drivers/usb/host/xhci-dbgcap.c b/drivers/usb/host/xhci-dbgcap.c
index d0faff233e3e..63edf2d8f245 100644
--- a/drivers/usb/host/xhci-dbgcap.c
+++ b/drivers/usb/host/xhci-dbgcap.c
@@ -462,6 +462,25 @@ static void xhci_dbc_ring_init(struct xhci_ring *ring)
xhci_initialize_ring_info(ring);
}
+static int xhci_dbc_reinit_ep_rings(struct xhci_dbc *dbc)
+{
+ struct xhci_ring *in_ring = dbc->eps[BULK_IN].ring;
+ struct xhci_ring *out_ring = dbc->eps[BULK_OUT].ring;
+
+ if (!in_ring || !out_ring || !dbc->ctx) {
+ dev_warn(dbc->dev, "Can't re-init unallocated endpoints\n");
+ return -ENODEV;
+ }
+
+ xhci_dbc_ring_init(in_ring);
+ xhci_dbc_ring_init(out_ring);
+
+ /* set ep context enqueue, dequeue, and cycle to initial values */
+ xhci_dbc_init_ep_contexts(dbc);
+
+ return 0;
+}
+
static struct xhci_ring *
xhci_dbc_ring_alloc(struct device *dev, enum xhci_ring_type type, gfp_t flags)
{
@@ -885,7 +904,7 @@ static enum evtreturn xhci_dbc_do_handle_events(struct xhci_dbc *dbc)
dev_info(dbc->dev, "DbC cable unplugged\n");
dbc->state = DS_ENABLED;
xhci_dbc_flush_requests(dbc);
-
+ xhci_dbc_reinit_ep_rings(dbc);
return EVT_DISC;
}
@@ -895,7 +914,7 @@ static enum evtreturn xhci_dbc_do_handle_events(struct xhci_dbc *dbc)
writel(portsc, &dbc->regs->portsc);
dbc->state = DS_ENABLED;
xhci_dbc_flush_requests(dbc);
-
+ xhci_dbc_reinit_ep_rings(dbc);
return EVT_DISC;
}
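To illustrate the failure mode described in the commit message above, here
is a rough userspace sketch of a transfer ring that is never started
between disconnects. It is a deliberately simplified model, not the DbC
implementation: ring_model, ring_queue(), ring_flush(), disconnect() and
reconnect_cycles() are hypothetical names, and the model ignores link TRBs
and the cycle bit.

```c
#include <assert.h>
#include <string.h>

#define RING_SIZE 8

enum trb_type { TRB_FREE = 0, TRB_NORMAL, TRB_NOOP };

/* Hypothetical ring model: fixed array plus enqueue/dequeue indexes. */
struct ring_model {
	enum trb_type trbs[RING_SIZE];
	unsigned int enqueue;
	unsigned int dequeue;
	unsigned int num_free;
};

/* Clear the ring and rewind both pointers to the start. */
static void ring_init(struct ring_model *r)
{
	memset(r->trbs, 0, sizeof(r->trbs));
	r->enqueue = 0;
	r->dequeue = 0;
	r->num_free = RING_SIZE;
}

/* Queue one TRB; fails when the ring is full ("failed to queue trbs"). */
static int ring_queue(struct ring_model *r)
{
	assert(r->enqueue < RING_SIZE);

	if (r->num_free == 0)
		return -1;
	r->trbs[r->enqueue] = TRB_NORMAL;
	r->enqueue = (r->enqueue + 1) % RING_SIZE;
	r->num_free--;
	return 0;
}

/* Flush on disconnect: pending TRBs become no-ops but keep their slots. */
static void ring_flush(struct ring_model *r)
{
	unsigned int i;

	for (i = 0; i < RING_SIZE; i++)
		if (r->trbs[i] == TRB_NORMAL)
			r->trbs[i] = TRB_NOOP;
}

static void disconnect(struct ring_model *r, int reinit)
{
	ring_flush(r);
	if (reinit)
		ring_init(r);	/* models the fix: reinitialize the ring */
}

/*
 * Run several connect/disconnect cycles in which the ring is never
 * started, queueing per_connect TRBs on each connect. Returns the
 * number of queue attempts that failed.
 */
static int reconnect_cycles(int cycles, int per_connect, int reinit)
{
	struct ring_model r;
	int failures = 0;
	int c, i;

	ring_init(&r);
	for (c = 0; c < cycles; c++) {
		for (i = 0; i < per_connect; i++)
			if (ring_queue(&r))
				failures++;
		disconnect(&r, reinit);
	}
	return failures;
}
```

In this model, queueing four TRBs per connect fills the eight-entry ring
after two cycles, so every attempt in the third cycle fails unless the
ring is reinitialized on disconnect.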
The quilt patch titled
Subject: mm: fix off-by-one error in VMA count limit checks
has been removed from the -mm tree. Its filename was
mm-fix-off-by-one-error-in-vma-count-limit-checks.patch
This patch was dropped because an updated version will be issued
------------------------------------------------------
From: Kalesh Singh <kaleshsingh(a)google.com>
Subject: mm: fix off-by-one error in VMA count limit checks
Date: Mon, 15 Sep 2025 09:36:32 -0700
The VMA count limit check in do_mmap() and do_brk_flags() uses a strict
inequality (>), which allows a process's VMA count to exceed the
configured sysctl_max_map_count limit by one.
A process with mm->map_count == sysctl_max_map_count will incorrectly pass
this check and then exceed the limit upon allocation of a new VMA when its
map_count is incremented.
Other VMA allocation paths, such as split_vma(), already use the correct,
inclusive (>=) comparison.
Fix this bug by changing the comparison to be inclusive in do_mmap() and
do_brk_flags(), bringing them in line with the correct behavior of other
allocation paths.
Link: https://lkml.kernel.org/r/20250915163838.631445-2-kaleshsingh@google.com
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Signed-off-by: Kalesh Singh <kaleshsingh(a)google.com>
Cc: David Hildenbrand <david(a)redhat.com>
Cc: "Liam R. Howlett" <Liam.Howlett(a)oracle.com>
Cc: Lorenzo Stoakes <lorenzo.stoakes(a)oracle.com>
Cc: Mike Rapoport <rppt(a)kernel.org>
Cc: Minchan Kim <minchan(a)kernel.org>
Cc: Pedro Falcato <pfalcato(a)suse.de>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/mmap.c | 2 +-
mm/vma.c | 2 +-
2 files changed, 2 insertions(+), 2 deletions(-)
--- a/mm/mmap.c~mm-fix-off-by-one-error-in-vma-count-limit-checks
+++ a/mm/mmap.c
@@ -374,7 +374,7 @@ unsigned long do_mmap(struct file *file,
return -EOVERFLOW;
/* Too many mappings? */
- if (mm->map_count > sysctl_max_map_count)
+ if (mm->map_count >= sysctl_max_map_count)
return -ENOMEM;
/*
--- a/mm/vma.c~mm-fix-off-by-one-error-in-vma-count-limit-checks
+++ a/mm/vma.c
@@ -2772,7 +2772,7 @@ int do_brk_flags(struct vma_iterator *vm
if (!may_expand_vm(mm, vm_flags, len >> PAGE_SHIFT))
return -ENOMEM;
- if (mm->map_count > sysctl_max_map_count)
+ if (mm->map_count >= sysctl_max_map_count)
return -ENOMEM;
if (security_vm_enough_memory_mm(mm, len >> PAGE_SHIFT))
_
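As a quick illustration of the off-by-one described in the changelog
above, the following userspace sketch compares the strict (>) and
inclusive (>=) checks. It is only a model under stated assumptions:
mm_model, max_map_count, add_vma_old(), add_vma_fixed() and
fill_until_refused() are hypothetical stand-ins for mm->map_count,
sysctl_max_map_count and the checks in do_mmap() and do_brk_flags().

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for the per-process VMA counter. */
struct mm_model {
	int map_count;
};

/* Hypothetical stand-in for sysctl_max_map_count. */
static int max_map_count = 4;

typedef int (*add_vma_fn)(struct mm_model *mm);

/* Old check: a process already at the limit still passes. */
static int add_vma_old(struct mm_model *mm)
{
	if (mm->map_count > max_map_count)
		return -ENOMEM;
	mm->map_count++;
	return 0;
}

/* Proposed check: refuse once the limit has been reached. */
static int add_vma_fixed(struct mm_model *mm)
{
	if (mm->map_count >= max_map_count)
		return -ENOMEM;
	mm->map_count++;
	return 0;
}

/* Add VMAs until the check refuses; return the final map_count. */
static int fill_until_refused(add_vma_fn add)
{
	struct mm_model mm = { .map_count = 0 };

	assert(add != NULL);

	while (add(&mm) == 0)
		;
	return mm.map_count;
}
```

With the limit set to 4, the strict check lets the model reach a count of
5, while the inclusive check stops at 4.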
Patches currently in -mm which might be from kaleshsingh(a)google.com are