This reverts commit b3b274bc9d3d7307308aeaf75f70731765ac999a.
On the DragonBoard 820c (which uses APQ8096/MSM8996) this change causes
the CPUs to downclock to roughly half speed under sustained load. The
regression is visible both during boot and when running CPU stress
workloads such as stress-ng: the CPUs initially ramp up to the expected
frequency, then drop to a lower OPP even though the system is clearly
CPU-bound.
Bisecting points to this commit and reverting it restores the expected
behaviour on the DragonBoard 820c - the CPUs track the cpufreq policy
and run at full performance under load.
The exact interaction with the ACD is not yet fully understood and we
would like to keep ACD in use to avoid possible SoC reliability issues.
Until we have a better fix that preserves ACD while avoiding this
performance regression, revert the bisected patch to restore the
previous behaviour.
Fixes: b3b274bc9d3d ("clk: qcom: cpu-8996: simplify the cpu_clk_notifier_cb")
Cc: stable(a)vger.kernel.org # v6.3+
Link: https://lore.kernel.org/linux-arm-msm/20230113120544.59320-8-dmitry.baryshk…
Cc: Dmitry Baryshkov <dmitry.baryshkov(a)oss.qualcomm.com>
Signed-off-by: Christopher Obbard <christopher.obbard(a)linaro.org>
---
Hi all,
This series contains a single revert for a regression affecting the
APQ8096/MSM8996 (DragonBoard 820c).
The commit being reverted, b3b274bc9d3d ("clk: qcom: cpu-8996: simplify the cpu_clk_notifier_cb"),
introduces a significant performance issue where the CPUs downclock to
~50% of their expected frequency under sustained load. The problem is
reproducible both at boot and when running CPU-bound workloads such as
stress-ng.
Bisecting the issue pointed directly to this commit and reverting it
restores correct cpufreq behaviour.
The root cause appears to be related to the interaction between the
simplified notifier callback and ACD (Adaptive Clock Distribution).
Since we would prefer to keep ACD enabled for SoC reliability reasons,
a revert is the safest option until a proper fix is identified.
Full details are included in the commit message.
Feedback & suggestions welcome.
Cheers!
Christopher Obbard
---
drivers/clk/qcom/clk-cpu-8996.c | 30 +++++++++++-------------------
1 file changed, 11 insertions(+), 19 deletions(-)
diff --git a/drivers/clk/qcom/clk-cpu-8996.c b/drivers/clk/qcom/clk-cpu-8996.c
index 21d13c0841ed..028476931747 100644
--- a/drivers/clk/qcom/clk-cpu-8996.c
+++ b/drivers/clk/qcom/clk-cpu-8996.c
@@ -547,35 +547,27 @@ static int cpu_clk_notifier_cb(struct notifier_block *nb, unsigned long event,
{
struct clk_cpu_8996_pmux *cpuclk = to_clk_cpu_8996_pmux_nb(nb);
struct clk_notifier_data *cnd = data;
+ int ret;
switch (event) {
case PRE_RATE_CHANGE:
+ ret = clk_cpu_8996_pmux_set_parent(&cpuclk->clkr.hw, ALT_INDEX);
qcom_cpu_clk_msm8996_acd_init(cpuclk->clkr.regmap);
-
- /*
- * Avoid overvolting. clk_core_set_rate_nolock() walks from top
- * to bottom, so it will change the rate of the PLL before
- * chaging the parent of PMUX. This can result in pmux getting
- * clocked twice the expected rate.
- *
- * Manually switch to PLL/2 here.
- */
- if (cnd->new_rate < DIV_2_THRESHOLD &&
- cnd->old_rate > DIV_2_THRESHOLD)
- clk_cpu_8996_pmux_set_parent(&cpuclk->clkr.hw, SMUX_INDEX);
-
break;
- case ABORT_RATE_CHANGE:
- /* Revert manual change */
- if (cnd->new_rate < DIV_2_THRESHOLD &&
- cnd->old_rate > DIV_2_THRESHOLD)
- clk_cpu_8996_pmux_set_parent(&cpuclk->clkr.hw, ACD_INDEX);
+ case POST_RATE_CHANGE:
+ if (cnd->new_rate < DIV_2_THRESHOLD)
+ ret = clk_cpu_8996_pmux_set_parent(&cpuclk->clkr.hw,
+ SMUX_INDEX);
+ else
+ ret = clk_cpu_8996_pmux_set_parent(&cpuclk->clkr.hw,
+ ACD_INDEX);
break;
default:
+ ret = 0;
break;
}
- return NOTIFY_OK;
+ return notifier_from_errno(ret);
};
static int qcom_cpu_clk_msm8996_driver_probe(struct platform_device *pdev)
---
base-commit: c17e270dfb342a782d69c4a7c4c32980455afd9c
change-id: 20251202-wip-obbardc-qcom-msm8096-clk-cpu-fix-downclock-b7561da4cb95
Best regards,
--
Christopher Obbard <christopher.obbard(a)linaro.org>
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.10.y
git checkout FETCH_HEAD
git cherry-pick -x baeb66fbd4201d1c4325074e78b1f557dff89b5b
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025120129-earthlike-curse-b3f8@gregkh' --subject-prefix 'PATCH 5.10.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From baeb66fbd4201d1c4325074e78b1f557dff89b5b Mon Sep 17 00:00:00 2001
From: Jimmy Hu <hhhuuu(a)google.com>
Date: Thu, 23 Oct 2025 05:49:45 +0000
Subject: [PATCH] usb: gadget: udc: fix use-after-free in usb_gadget_state_work
A race condition during gadget teardown can lead to a use-after-free
in usb_gadget_state_work(), as reported by KASAN:
BUG: KASAN: invalid-access in sysfs_notify+0x2c/0xd0
Workqueue: events usb_gadget_state_work
The fundamental race occurs because a concurrent event (e.g., an
interrupt) can call usb_gadget_set_state() and schedule gadget->work
at any time during the cleanup process in usb_del_gadget().
Commit 399a45e5237c ("usb: gadget: core: flush gadget workqueue after
device removal") attempted to fix this by moving flush_work() to after
device_del(). However, this does not fully solve the race, as a new
work item can still be scheduled *after* flush_work() completes but
before the gadget's memory is freed, leading to the same use-after-free.
This patch fixes the race condition robustly by introducing a 'teardown'
flag and a 'state_lock' spinlock to the usb_gadget struct. The flag is
set during cleanup in usb_del_gadget() *before* calling flush_work() to
prevent any new work from being scheduled once cleanup has commenced.
The scheduling site, usb_gadget_set_state(), now checks this flag under
the lock before queueing the work, thus safely closing the race window.
Fixes: 5702f75375aa9 ("usb: gadget: udc-core: move sysfs_notify() to a workqueue")
Cc: stable <stable(a)kernel.org>
Signed-off-by: Jimmy Hu <hhhuuu(a)google.com>
Link: https://patch.msgid.link/20251023054945.233861-1-hhhuuu@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
diff --git a/drivers/usb/gadget/udc/core.c b/drivers/usb/gadget/udc/core.c
index 694653761c44..8dbe79bdc0f9 100644
--- a/drivers/usb/gadget/udc/core.c
+++ b/drivers/usb/gadget/udc/core.c
@@ -1126,8 +1126,13 @@ static void usb_gadget_state_work(struct work_struct *work)
void usb_gadget_set_state(struct usb_gadget *gadget,
enum usb_device_state state)
{
+ unsigned long flags;
+
+ spin_lock_irqsave(&gadget->state_lock, flags);
gadget->state = state;
- schedule_work(&gadget->work);
+ if (!gadget->teardown)
+ schedule_work(&gadget->work);
+ spin_unlock_irqrestore(&gadget->state_lock, flags);
trace_usb_gadget_set_state(gadget, 0);
}
EXPORT_SYMBOL_GPL(usb_gadget_set_state);
@@ -1361,6 +1366,8 @@ static void usb_udc_nop_release(struct device *dev)
void usb_initialize_gadget(struct device *parent, struct usb_gadget *gadget,
void (*release)(struct device *dev))
{
+ spin_lock_init(&gadget->state_lock);
+ gadget->teardown = false;
INIT_WORK(&gadget->work, usb_gadget_state_work);
gadget->dev.parent = parent;
@@ -1535,6 +1542,7 @@ EXPORT_SYMBOL_GPL(usb_add_gadget_udc);
void usb_del_gadget(struct usb_gadget *gadget)
{
struct usb_udc *udc = gadget->udc;
+ unsigned long flags;
if (!udc)
return;
@@ -1548,6 +1556,13 @@ void usb_del_gadget(struct usb_gadget *gadget)
kobject_uevent(&udc->dev.kobj, KOBJ_REMOVE);
sysfs_remove_link(&udc->dev.kobj, "gadget");
device_del(&gadget->dev);
+ /*
+ * Set the teardown flag before flushing the work to prevent new work
+ * from being scheduled while we are cleaning up.
+ */
+ spin_lock_irqsave(&gadget->state_lock, flags);
+ gadget->teardown = true;
+ spin_unlock_irqrestore(&gadget->state_lock, flags);
flush_work(&gadget->work);
ida_free(&gadget_id_numbers, gadget->id_number);
cancel_work_sync(&udc->vbus_work);
diff --git a/include/linux/usb/gadget.h b/include/linux/usb/gadget.h
index 3aaf19e77558..8285b19a25e0 100644
--- a/include/linux/usb/gadget.h
+++ b/include/linux/usb/gadget.h
@@ -376,6 +376,9 @@ struct usb_gadget_ops {
* can handle. The UDC must support this and all slower speeds and lower
* number of lanes.
* @state: the state we are now (attached, suspended, configured, etc)
+ * @state_lock: Spinlock protecting the `state` and `teardown` members.
+ * @teardown: True if the device is undergoing teardown, used to prevent
+ * new work from being scheduled during cleanup.
* @name: Identifies the controller hardware type. Used in diagnostics
* and sometimes configuration.
* @dev: Driver model state for this abstract device.
@@ -451,6 +454,8 @@ struct usb_gadget {
enum usb_ssp_rate max_ssp_rate;
enum usb_device_state state;
+ spinlock_t state_lock;
+ bool teardown;
const char *name;
struct device dev;
unsigned isoch_delay;
The patch below does not apply to the 5.15-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.15.y
git checkout FETCH_HEAD
git cherry-pick -x baeb66fbd4201d1c4325074e78b1f557dff89b5b
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025120123-borax-cut-5c52@gregkh' --subject-prefix 'PATCH 5.15.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From baeb66fbd4201d1c4325074e78b1f557dff89b5b Mon Sep 17 00:00:00 2001
From: Jimmy Hu <hhhuuu(a)google.com>
Date: Thu, 23 Oct 2025 05:49:45 +0000
Subject: [PATCH] usb: gadget: udc: fix use-after-free in usb_gadget_state_work
A race condition during gadget teardown can lead to a use-after-free
in usb_gadget_state_work(), as reported by KASAN:
BUG: KASAN: invalid-access in sysfs_notify+0x2c/0xd0
Workqueue: events usb_gadget_state_work
The fundamental race occurs because a concurrent event (e.g., an
interrupt) can call usb_gadget_set_state() and schedule gadget->work
at any time during the cleanup process in usb_del_gadget().
Commit 399a45e5237c ("usb: gadget: core: flush gadget workqueue after
device removal") attempted to fix this by moving flush_work() to after
device_del(). However, this does not fully solve the race, as a new
work item can still be scheduled *after* flush_work() completes but
before the gadget's memory is freed, leading to the same use-after-free.
This patch fixes the race condition robustly by introducing a 'teardown'
flag and a 'state_lock' spinlock to the usb_gadget struct. The flag is
set during cleanup in usb_del_gadget() *before* calling flush_work() to
prevent any new work from being scheduled once cleanup has commenced.
The scheduling site, usb_gadget_set_state(), now checks this flag under
the lock before queueing the work, thus safely closing the race window.
Fixes: 5702f75375aa9 ("usb: gadget: udc-core: move sysfs_notify() to a workqueue")
Cc: stable <stable(a)kernel.org>
Signed-off-by: Jimmy Hu <hhhuuu(a)google.com>
Link: https://patch.msgid.link/20251023054945.233861-1-hhhuuu@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
diff --git a/drivers/usb/gadget/udc/core.c b/drivers/usb/gadget/udc/core.c
index 694653761c44..8dbe79bdc0f9 100644
--- a/drivers/usb/gadget/udc/core.c
+++ b/drivers/usb/gadget/udc/core.c
@@ -1126,8 +1126,13 @@ static void usb_gadget_state_work(struct work_struct *work)
void usb_gadget_set_state(struct usb_gadget *gadget,
enum usb_device_state state)
{
+ unsigned long flags;
+
+ spin_lock_irqsave(&gadget->state_lock, flags);
gadget->state = state;
- schedule_work(&gadget->work);
+ if (!gadget->teardown)
+ schedule_work(&gadget->work);
+ spin_unlock_irqrestore(&gadget->state_lock, flags);
trace_usb_gadget_set_state(gadget, 0);
}
EXPORT_SYMBOL_GPL(usb_gadget_set_state);
@@ -1361,6 +1366,8 @@ static void usb_udc_nop_release(struct device *dev)
void usb_initialize_gadget(struct device *parent, struct usb_gadget *gadget,
void (*release)(struct device *dev))
{
+ spin_lock_init(&gadget->state_lock);
+ gadget->teardown = false;
INIT_WORK(&gadget->work, usb_gadget_state_work);
gadget->dev.parent = parent;
@@ -1535,6 +1542,7 @@ EXPORT_SYMBOL_GPL(usb_add_gadget_udc);
void usb_del_gadget(struct usb_gadget *gadget)
{
struct usb_udc *udc = gadget->udc;
+ unsigned long flags;
if (!udc)
return;
@@ -1548,6 +1556,13 @@ void usb_del_gadget(struct usb_gadget *gadget)
kobject_uevent(&udc->dev.kobj, KOBJ_REMOVE);
sysfs_remove_link(&udc->dev.kobj, "gadget");
device_del(&gadget->dev);
+ /*
+ * Set the teardown flag before flushing the work to prevent new work
+ * from being scheduled while we are cleaning up.
+ */
+ spin_lock_irqsave(&gadget->state_lock, flags);
+ gadget->teardown = true;
+ spin_unlock_irqrestore(&gadget->state_lock, flags);
flush_work(&gadget->work);
ida_free(&gadget_id_numbers, gadget->id_number);
cancel_work_sync(&udc->vbus_work);
diff --git a/include/linux/usb/gadget.h b/include/linux/usb/gadget.h
index 3aaf19e77558..8285b19a25e0 100644
--- a/include/linux/usb/gadget.h
+++ b/include/linux/usb/gadget.h
@@ -376,6 +376,9 @@ struct usb_gadget_ops {
* can handle. The UDC must support this and all slower speeds and lower
* number of lanes.
* @state: the state we are now (attached, suspended, configured, etc)
+ * @state_lock: Spinlock protecting the `state` and `teardown` members.
+ * @teardown: True if the device is undergoing teardown, used to prevent
+ * new work from being scheduled during cleanup.
* @name: Identifies the controller hardware type. Used in diagnostics
* and sometimes configuration.
* @dev: Driver model state for this abstract device.
@@ -451,6 +454,8 @@ struct usb_gadget {
enum usb_ssp_rate max_ssp_rate;
enum usb_device_state state;
+ spinlock_t state_lock;
+ bool teardown;
const char *name;
struct device dev;
unsigned isoch_delay;
On converting to the of_reserved_mem_region_to_resource() helper with
commit 900730dc4705 ("wifi: ath: Use
of_reserved_mem_region_to_resource() for "memory-region"") a logic error
was introduced in the ath11k_core_coldboot_cal_support() if condition.
The original code checked for hremote_node presence and skipped
ath11k_core_coldboot_cal_support() in the other switch case but now
everything is driven entirely on the values of the resource struct.
resource_size() (in this case) is wrongly assumed to return a size of
zero if the passed resource struct is init to zero. This is not the case
as a resource struct should be always init with correct values (or at
best set the end value to -1 to signal it's not configured)
(the return value of resource_size() for a resource struct with start
and end set to zero is 1)
On top of this, using resource_size() to check if a resource struct is
initialized or not is generally wrong and other measure should be used
instead.
To better handle this, use the DEFINE_RES macro to initialize the
resource struct and set the IORESOURCE_UNSET flag by default.
Replace the resource_size() check with checking for the resource struct
flags and check if it's IORESOURCE_UNSET.
This change effectively restore the original logic and restore correct
loading of the ath11k firmware (restoring correct functionality of
Wi-Fi)
Cc: stable(a)vger.kernel.org
Fixes: 900730dc4705 ("wifi: ath: Use of_reserved_mem_region_to_resource() for "memory-region"")
Link: https://lore.kernel.org/all/20251207215359.28895-1-ansuelsmth@gmail.com/T/#…
Cc: Ilpo Järvinen <ilpo.jarvinen(a)linux.intel.com>
Signed-off-by: Christian Marangi <ansuelsmth(a)gmail.com>
---
drivers/net/wireless/ath/ath11k/qmi.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/net/wireless/ath/ath11k/qmi.c b/drivers/net/wireless/ath/ath11k/qmi.c
index ff6a97e328b8..afa663c00620 100644
--- a/drivers/net/wireless/ath/ath11k/qmi.c
+++ b/drivers/net/wireless/ath/ath11k/qmi.c
@@ -2039,8 +2039,8 @@ static int ath11k_qmi_alloc_target_mem_chunk(struct ath11k_base *ab)
static int ath11k_qmi_assign_target_mem_chunk(struct ath11k_base *ab)
{
+ struct resource res = DEFINE_RES(0, 0, IORESOURCE_UNSET);
struct device *dev = ab->dev;
- struct resource res = {};
u32 host_ddr_sz;
int i, idx, ret;
@@ -2086,7 +2086,7 @@ static int ath11k_qmi_assign_target_mem_chunk(struct ath11k_base *ab)
}
if (ath11k_core_coldboot_cal_support(ab)) {
- if (resource_size(&res)) {
+ if (res.flags != IORESOURCE_UNSET) {
ab->qmi.target_mem[idx].paddr =
res.start + host_ddr_sz;
ab->qmi.target_mem[idx].iaddr =
--
2.51.0
When the filesystem is being mounted, the kernel panics while the data
regarding slot map allocation to the local node, is being written to the
disk. This occurs because the value of slot map buffer head block
number, which should have been greater than or equal to
`OCFS2_SUPER_BLOCK_BLKNO` (evaluating to 2) is less than it, indicative
of disk metadata corruption. This triggers
BUG_ON(bh->b_blocknr < OCFS2_SUPER_BLOCK_BLKNO) in ocfs2_write_block(),
causing the kernel to panic.
This is fixed by introducing an if condition block in
ocfs2_update_disk_slot(), right before calling ocfs2_write_block(), which
checks if `bh->b_blocknr` is lesser than `OCFS2_SUPER_BLOCK_BLKNO`; if
yes, then ocfs2_error is called, which prints the error log, for
debugging purposes, and the return value of ocfs2_error() is returned
back to caller of ocfs2_update_disk_slot() i.e. ocfs2_find_slot(). If
the return value is zero. then error code EIO is returned.
Reported-by: syzbot+c818e5c4559444f88aa0(a)syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=c818e5c4559444f88aa0
Tested-by: syzbot+c818e5c4559444f88aa0(a)syzkaller.appspotmail.com
Cc: stable(a)vger.kernel.org
Signed-off-by: Prithvi Tambewagh <activprithvi(a)gmail.com>
---
fs/ocfs2/slot_map.c | 10 ++++++++++
1 file changed, 10 insertions(+)
diff --git a/fs/ocfs2/slot_map.c b/fs/ocfs2/slot_map.c
index e544c704b583..788924fc3663 100644
--- a/fs/ocfs2/slot_map.c
+++ b/fs/ocfs2/slot_map.c
@@ -193,6 +193,16 @@ static int ocfs2_update_disk_slot(struct ocfs2_super *osb,
else
ocfs2_update_disk_slot_old(si, slot_num, &bh);
spin_unlock(&osb->osb_lock);
+ if (bh->b_blocknr < OCFS2_SUPER_BLOCK_BLKNO) {
+ status = ocfs2_error(osb->sb,
+ "Invalid Slot Map Buffer Head "
+ "Block Number : %llu, Should be >= %d",
+ le16_to_cpu(bh->b_blocknr),
+ le16_to_cpu((int)OCFS2_SUPER_BLOCK_BLKNO));
+ if (!status)
+ return -EIO;
+ return status;
+ }
status = ocfs2_write_block(osb, bh, INODE_CACHE(si->si_inode));
if (status < 0)
base-commit: 24172e0d79900908cf5ebf366600616d29c9b417
--
2.43.0
tpm2_key_decode() overrides the explicit keyhandle parameter, which can
lead to problems, if the loaded parent handle does not match the handle
stored to the key file. This can easily happen as handle by definition
is an ambiguous attribute.
Cc: stable(a)vger.kernel.org # v5.13+
Fixes: f2219745250f ("security: keys: trusted: use ASN.1 TPM2 key format for the blobs")
Signed-off-by: Jarkko Sakkinen <jarkko(a)kernel.org>
---
security/keys/trusted-keys/trusted_tpm2.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/security/keys/trusted-keys/trusted_tpm2.c b/security/keys/trusted-keys/trusted_tpm2.c
index fb76c4ea496f..950684e54c71 100644
--- a/security/keys/trusted-keys/trusted_tpm2.c
+++ b/security/keys/trusted-keys/trusted_tpm2.c
@@ -121,7 +121,9 @@ static int tpm2_key_decode(struct trusted_key_payload *payload,
return -ENOMEM;
*buf = blob;
- options->keyhandle = ctx.parent;
+
+ if (!options->keyhandle)
+ options->keyhandle = ctx.parent;
memcpy(blob, ctx.priv, ctx.priv_len);
blob += ctx.priv_len;
--
2.39.5
The patch below does not apply to the 6.1-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.1.y
git checkout FETCH_HEAD
git cherry-pick -x 4da3768e1820cf15cced390242d8789aed34f54d
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025120802-remedy-glimmer-fc9d@gregkh' --subject-prefix 'PATCH 6.1.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 4da3768e1820cf15cced390242d8789aed34f54d Mon Sep 17 00:00:00 2001
From: Omar Sandoval <osandov(a)fb.com>
Date: Tue, 4 Nov 2025 09:55:26 -0800
Subject: [PATCH] KVM: SVM: Don't skip unrelated instruction if INT3/INTO is
replaced
When re-injecting a soft interrupt from an INT3, INT0, or (select) INTn
instruction, discard the exception and retry the instruction if the code
stream is changed (e.g. by a different vCPU) between when the CPU
executes the instruction and when KVM decodes the instruction to get the
next RIP.
As effectively predicted by commit 6ef88d6e36c2 ("KVM: SVM: Re-inject
INT3/INTO instead of retrying the instruction"), failure to verify that
the correct INTn instruction was decoded can effectively clobber guest
state due to decoding the wrong instruction and thus specifying the
wrong next RIP.
The bug most often manifests as "Oops: int3" panics on static branch
checks in Linux guests. Enabling or disabling a static branch in Linux
uses the kernel's "text poke" code patching mechanism. To modify code
while other CPUs may be executing that code, Linux (temporarily)
replaces the first byte of the original instruction with an int3 (opcode
0xcc), then patches in the new code stream except for the first byte,
and finally replaces the int3 with the first byte of the new code
stream. If a CPU hits the int3, i.e. executes the code while it's being
modified, then the guest kernel must look up the RIP to determine how to
handle the #BP, e.g. by emulating the new instruction. If the RIP is
incorrect, then this lookup fails and the guest kernel panics.
The bug reproduces almost instantly by hacking the guest kernel to
repeatedly check a static branch[1] while running a drgn script[2] on
the host to constantly swap out the memory containing the guest's TSS.
[1]: https://gist.github.com/osandov/44d17c51c28c0ac998ea0334edf90b5a
[2]: https://gist.github.com/osandov/10e45e45afa29b11e0c7209247afc00b
Fixes: 6ef88d6e36c2 ("KVM: SVM: Re-inject INT3/INTO instead of retrying the instruction")
Cc: stable(a)vger.kernel.org
Co-developed-by: Sean Christopherson <seanjc(a)google.com>
Signed-off-by: Omar Sandoval <osandov(a)fb.com>
Link: https://patch.msgid.link/1cc6dcdf36e3add7ee7c8d90ad58414eeb6c3d34.176227876…
Signed-off-by: Sean Christopherson <seanjc(a)google.com>
diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 48598d017d6f..974d64bf0a4d 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -2143,6 +2143,11 @@ u64 vcpu_tsc_khz(struct kvm_vcpu *vcpu);
* the gfn, i.e. retrying the instruction will hit a
* !PRESENT fault, which results in a new shadow page
* and sends KVM back to square one.
+ *
+ * EMULTYPE_SKIP_SOFT_INT - Set in combination with EMULTYPE_SKIP to only skip
+ * an instruction if it could generate a given software
+ * interrupt, which must be encoded via
+ * EMULTYPE_SET_SOFT_INT_VECTOR().
*/
#define EMULTYPE_NO_DECODE (1 << 0)
#define EMULTYPE_TRAP_UD (1 << 1)
@@ -2153,6 +2158,10 @@ u64 vcpu_tsc_khz(struct kvm_vcpu *vcpu);
#define EMULTYPE_PF (1 << 6)
#define EMULTYPE_COMPLETE_USER_EXIT (1 << 7)
#define EMULTYPE_WRITE_PF_TO_SP (1 << 8)
+#define EMULTYPE_SKIP_SOFT_INT (1 << 9)
+
+#define EMULTYPE_SET_SOFT_INT_VECTOR(v) ((u32)((v) & 0xff) << 16)
+#define EMULTYPE_GET_SOFT_INT_VECTOR(e) (((e) >> 16) & 0xff)
static inline bool kvm_can_emulate_event_vectoring(int emul_type)
{
diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index 1ae7b3c5a7c5..459db8154929 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -272,6 +272,7 @@ static void svm_set_interrupt_shadow(struct kvm_vcpu *vcpu, int mask)
}
static int __svm_skip_emulated_instruction(struct kvm_vcpu *vcpu,
+ int emul_type,
bool commit_side_effects)
{
struct vcpu_svm *svm = to_svm(vcpu);
@@ -293,7 +294,7 @@ static int __svm_skip_emulated_instruction(struct kvm_vcpu *vcpu,
if (unlikely(!commit_side_effects))
old_rflags = svm->vmcb->save.rflags;
- if (!kvm_emulate_instruction(vcpu, EMULTYPE_SKIP))
+ if (!kvm_emulate_instruction(vcpu, emul_type))
return 0;
if (unlikely(!commit_side_effects))
@@ -311,11 +312,13 @@ static int __svm_skip_emulated_instruction(struct kvm_vcpu *vcpu,
static int svm_skip_emulated_instruction(struct kvm_vcpu *vcpu)
{
- return __svm_skip_emulated_instruction(vcpu, true);
+ return __svm_skip_emulated_instruction(vcpu, EMULTYPE_SKIP, true);
}
-static int svm_update_soft_interrupt_rip(struct kvm_vcpu *vcpu)
+static int svm_update_soft_interrupt_rip(struct kvm_vcpu *vcpu, u8 vector)
{
+ const int emul_type = EMULTYPE_SKIP | EMULTYPE_SKIP_SOFT_INT |
+ EMULTYPE_SET_SOFT_INT_VECTOR(vector);
unsigned long rip, old_rip = kvm_rip_read(vcpu);
struct vcpu_svm *svm = to_svm(vcpu);
@@ -331,7 +334,7 @@ static int svm_update_soft_interrupt_rip(struct kvm_vcpu *vcpu)
* in use, the skip must not commit any side effects such as clearing
* the interrupt shadow or RFLAGS.RF.
*/
- if (!__svm_skip_emulated_instruction(vcpu, !nrips))
+ if (!__svm_skip_emulated_instruction(vcpu, emul_type, !nrips))
return -EIO;
rip = kvm_rip_read(vcpu);
@@ -367,7 +370,7 @@ static void svm_inject_exception(struct kvm_vcpu *vcpu)
kvm_deliver_exception_payload(vcpu, ex);
if (kvm_exception_is_soft(ex->vector) &&
- svm_update_soft_interrupt_rip(vcpu))
+ svm_update_soft_interrupt_rip(vcpu, ex->vector))
return;
svm->vmcb->control.event_inj = ex->vector
@@ -3642,11 +3645,12 @@ static bool svm_set_vnmi_pending(struct kvm_vcpu *vcpu)
static void svm_inject_irq(struct kvm_vcpu *vcpu, bool reinjected)
{
+ struct kvm_queued_interrupt *intr = &vcpu->arch.interrupt;
struct vcpu_svm *svm = to_svm(vcpu);
u32 type;
- if (vcpu->arch.interrupt.soft) {
- if (svm_update_soft_interrupt_rip(vcpu))
+ if (intr->soft) {
+ if (svm_update_soft_interrupt_rip(vcpu, intr->nr))
return;
type = SVM_EVTINJ_TYPE_SOFT;
@@ -3654,12 +3658,10 @@ static void svm_inject_irq(struct kvm_vcpu *vcpu, bool reinjected)
type = SVM_EVTINJ_TYPE_INTR;
}
- trace_kvm_inj_virq(vcpu->arch.interrupt.nr,
- vcpu->arch.interrupt.soft, reinjected);
+ trace_kvm_inj_virq(intr->nr, intr->soft, reinjected);
++vcpu->stat.irq_injections;
- svm->vmcb->control.event_inj = vcpu->arch.interrupt.nr |
- SVM_EVTINJ_VALID | type;
+ svm->vmcb->control.event_inj = intr->nr | SVM_EVTINJ_VALID | type;
}
void svm_complete_interrupt_delivery(struct kvm_vcpu *vcpu, int delivery_mode,
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 42ecd093bb4c..8e14c0292809 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -9338,6 +9338,23 @@ static bool is_vmware_backdoor_opcode(struct x86_emulate_ctxt *ctxt)
return false;
}
+static bool is_soft_int_instruction(struct x86_emulate_ctxt *ctxt,
+ int emulation_type)
+{
+ u8 vector = EMULTYPE_GET_SOFT_INT_VECTOR(emulation_type);
+
+ switch (ctxt->b) {
+ case 0xcc:
+ return vector == BP_VECTOR;
+ case 0xcd:
+ return vector == ctxt->src.val;
+ case 0xce:
+ return vector == OF_VECTOR;
+ default:
+ return false;
+ }
+}
+
/*
* Decode an instruction for emulation. The caller is responsible for handling
* code breakpoints. Note, manually detecting code breakpoints is unnecessary
@@ -9448,6 +9465,10 @@ int x86_emulate_instruction(struct kvm_vcpu *vcpu, gpa_t cr2_or_gpa,
* injecting single-step #DBs.
*/
if (emulation_type & EMULTYPE_SKIP) {
+ if (emulation_type & EMULTYPE_SKIP_SOFT_INT &&
+ !is_soft_int_instruction(ctxt, emulation_type))
+ return 0;
+
if (ctxt->mode != X86EMUL_MODE_PROT64)
ctxt->eip = (u32)ctxt->_eip;
else
¿Cuánto cuesta una mala contratación?
body {
margin: 0;
padding: 0;
font-family: Arial, Helvetica, sans-serif;
font-size: 14px;
color: #333;
background-color: #ffffff;
}
table {
border-spacing: 0;
width: 100%;
max-width: 600px;
margin: auto;
}
td {
padding: 12px 20px;
}
a {
color: #1a73e8;
text-decoration: none;
}
.footer {
font-size: 12px;
color: #888888;
text-align: center;
}
Una mala contratación cuesta 3X el salario. Evítalo con datos, no percepciones.
Hola, ,
¿Sabías que una mala contratación cuesta hasta 3 veces el salario anual?
El 74% de empresas admite haber contratado a la persona equivocada. El motivo: decisiones basadas en percepciones, no en datos objetivos.
PsicoSmart te ayuda a evaluar talento con precisión:
31 pruebas psicométricas validadas para medir liderazgo, honestidad e inteligencia
2,500+ exámenes técnicos especializados por industria
Verificación de identidad con captura fotográfica automática
Resultados en minutos, accesible desde cualquier dispositivo
Reduce hasta 60% el riesgo de error en selección.
¿Quieres una demostración gratuita? Responde este correo y te contacto en menos de 24 horas.
Saludos,
--------------
Atte.: Valeria Pérez
Ciudad de México: (55) 5018 0565
WhatsApp: +52 33 1607 2089
Si no deseas recibir más correos, haz clic aquí para darte de baja.
Para remover su dirección de esta lista haga <a href="https://s1.arrobamail.com/unsuscribe.php?id=yiwtsrewiswqrpseup">click aquí</a>
__pci_read_base() sets resource start and end addresses when resource
is larger than 4G but pci_bus_addr_t or resource_size_t are not capable
of representing 64-bit PCI addresses. This creates a problematic
resource that has non-zero flags but the start and end addresses do not
yield to resource size of 0 but 1.
Replace custom resource addresses setup with resource_set_range()
that correctly sets end address as -1 which results in resource_size()
returning 0.
For consistency, also use resource_set_range() in the other branch that
does size based resource setup.
Fixes: 23b13bc76f35 ("PCI: Fail safely if we can't handle BARs larger than 4GB")
Link: https://lore.kernel.org/all/20251207215359.28895-1-ansuelsmth@gmail.com/T/#…
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen(a)linux.intel.com>
Cc: stable(a)vger.kernel.org
Cc: Christian Marangi <ansuelsmth(a)gmail.com>
---
drivers/pci/probe.c | 6 ++----
1 file changed, 2 insertions(+), 4 deletions(-)
diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
index 124d2d309c58..b8294a2f11f9 100644
--- a/drivers/pci/probe.c
+++ b/drivers/pci/probe.c
@@ -287,8 +287,7 @@ int __pci_read_base(struct pci_dev *dev, enum pci_bar_type type,
if ((sizeof(pci_bus_addr_t) < 8 || sizeof(resource_size_t) < 8)
&& sz64 > 0x100000000ULL) {
res->flags |= IORESOURCE_UNSET | IORESOURCE_DISABLED;
- res->start = 0;
- res->end = 0;
+ resource_set_range(res, 0, 0);
pci_err(dev, "%s: can't handle BAR larger than 4GB (size %#010llx)\n",
res_name, (unsigned long long)sz64);
goto out;
@@ -297,8 +296,7 @@ int __pci_read_base(struct pci_dev *dev, enum pci_bar_type type,
if ((sizeof(pci_bus_addr_t) < 8) && l) {
/* Above 32-bit boundary; try to reallocate */
res->flags |= IORESOURCE_UNSET;
- res->start = 0;
- res->end = sz64 - 1;
+ resource_set_range(res, 0, sz64);
pci_info(dev, "%s: can't handle BAR above 4GB (bus address %#010llx)\n",
res_name, (unsigned long long)l64);
goto out;
base-commit: 43dfc13ca972988e620a6edb72956981b75ab6b0
--
2.39.5
The function of_node_get() returns a node pointer with its refcount
incremented, but in the error path of the fsi_master_gpio_probe(),
this reference is not released before freeing the master structure,
causing a refcount leak.
Add the missing of_node_put() in the error handling path to properly
release the device tree node reference.
Fixes: 8ef9ccf81044 ("fsi: master-gpio: Add missing release function")
Cc: stable(a)vger.kernel.org
Signed-off-by: Wentao Liang <vulab(a)iscas.ac.cn>
---
drivers/fsi/fsi-master-gpio.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/drivers/fsi/fsi-master-gpio.c b/drivers/fsi/fsi-master-gpio.c
index 69de0b5b9cbd..ae9e156284d0 100644
--- a/drivers/fsi/fsi-master-gpio.c
+++ b/drivers/fsi/fsi-master-gpio.c
@@ -861,6 +861,7 @@ static int fsi_master_gpio_probe(struct platform_device *pdev)
}
return 0;
err_free:
+ of_node_put(master->master.dev.of_node);
kfree(master);
return rc;
}
--
2.34.1
Commit 900730dc4705 ("wifi: ath: Use
of_reserved_mem_region_to_resource() for "memory-region"") uncovered a
massive problem with the usage of resource_size() helper.
The reported commit caused a regression with ath11k WiFi firmware
loading and the change was just a simple replacement of duplicate code
with a new helper of_reserved_mem_region_to_resource().
On reworking this, in the commit also a check for the presence of the
node was replaced with resource_size(&res). This was done following the
logic that if the node wasn't present then it's expected that also the
resource_size is zero, mimicking the same if-else logic.
This was also the reason the regression was mostly hard to catch at
first sight as the rework is correctly done given the assumption on the
used helpers.
BUT this is actually not the case. On further inspection on
resource_size() it was found that it NEVER actually returns 0.
Even if the resource value of start and end are 0, the return value of
resource_size() will ALWAYS be 1, resulting in the broken if-else
condition ALWAYS going in the first if condition.
This was simply confirmed by reading the resource_size() logic:
return res->end - res->start + 1;
Given the confusion, also other case of such usage were searched in the
kernel and with great suprise it seems LOTS of place assume
resource_size() should return zero in the context of the resource start
and end set to 0.
Quoting for example comments in drivers/vfio/pci/vfio_pci_core.c:
/*
* The PCI core shouldn't set up a resource with a
* type but zero size. But there may be bugs that
* cause us to do that.
*/
if (!resource_size(res))
goto no_mmap;
It really seems resource_size() was tought with the assumption that
resource struct was always correctly initialized before calling it and
never set to zero.
But across the year this got lost and now there are lots of driver that
assume resource_size() returns 0 if start and end are also 0.
To better handle this and make resource_size() returns correct value in
such case, add a simple check and return 0 if both resource start and
resource end are zero.
Cc: Rob Herring (Arm) <robh(a)kernel.org>
Cc: stable(a)vger.kernel.org
Fixes: 1a4e564b7db9 ("resource: add resource_size()")
Signed-off-by: Christian Marangi <ansuelsmth(a)gmail.com>
---
include/linux/ioport.h | 3 +++
1 file changed, 3 insertions(+)
diff --git a/include/linux/ioport.h b/include/linux/ioport.h
index 9afa30f9346f..1b8ce62255db 100644
--- a/include/linux/ioport.h
+++ b/include/linux/ioport.h
@@ -288,6 +288,9 @@ static inline void resource_set_range(struct resource *res,
static inline resource_size_t resource_size(const struct resource *res)
{
+ if (!res->start && !res->end)
+ return 0;
+
return res->end - res->start + 1;
}
static inline unsigned long resource_type(const struct resource *res)
--
2.51.0
Hi,
After a stable kernel update, the hwclock command seems no longer
functional on my SPARC system with an ST M48T59Y-70PC1 RTC:
# hwclock
[...long delay...]
hwclock: select() to /dev/rtc0 to wait for clock tick timed out
On prior kernels, there is no problem:
# hwclock
2025-10-22 22:21:04.806992-04:00
I reproduced the same failure on 6.18-rc2 and bisected to this commit:
commit 795cda8338eab036013314dbc0b04aae728880ab
Author: Esben Haabendal <esben(a)geanix.com>
Date: Fri May 16 09:23:35 2025 +0200
rtc: interface: Fix long-standing race when setting alarm
This commit was backported to all current 6.x stable branches,
as well as 5.15.x, so they all have the same regression.
Reverting this commit on top of 6.18-rc2 corrects the problem.
Let me know if you need any more info!
Thanks,
Nick
Nilay reported that since commit daaa574aba6f ("powerpc/pseries/msi: Switch
to msi_create_parent_irq_domain()"), the NVMe driver cannot enable MSI-X
when the device's MSI-X table size is larger than the firmware's MSI quota
for the device.
This is because the commit changes how rtas_prepare_msi_irqs() is called:
- Before, it is called when interrupts are allocated at the global
interrupt domain with nvec_in being the number of allocated interrupts.
rtas_prepare_msi_irqs() can return a positive number and the allocation
will be retried.
- Now, it is called at the creation of per-device interrupt domain with
nvec_in being the number of interrupts that the device supports. If
rtas_prepare_msi_irqs() returns positive, domain creation just fails.
For Nilay's NVMe driver case, rtas_prepare_msi_irqs() returns a positive
number (the quota). This causes per-device interrupt domain creation to
fail and thus the NVMe driver cannot enable MSI-X.
Rework to make this scenario works again:
- pseries_msi_ops_prepare() only prepares as many interrupts as the quota
permit.
- pseries_irq_domain_alloc() fails if the device's quota is exceeded.
Now, if the quota is exceeded, pseries_msi_ops_prepare() will only prepare
as allowed by the quota. If device drivers attempt to allocate more
interrupts than the quota permits, pseries_irq_domain_alloc() will return
an error code and msi_handle_pci_fail() will allow device drivers a retry.
Reported-by: Nilay Shroff <nilay(a)inux.ibm.com>
Closes: https://lore.kernel.org/linuxppc-dev/6af2c4c2-97f6-4758-be33-256638ef39e5@l…
Fixes: daaa574aba6f ("powerpc/pseries/msi: Switch to msi_create_parent_irq_domain()")
Signed-off-by: Nam Cao <namcao(a)linutronix.de>
Acked-by: Nilay Shroff <nilay(a)inux.ibm.com>
Cc: stable(a)vger.kernel.org
---
arch/powerpc/platforms/pseries/msi.c | 42 ++++++++++++++++++++++++++--
1 file changed, 39 insertions(+), 3 deletions(-)
diff --git a/arch/powerpc/platforms/pseries/msi.c b/arch/powerpc/platforms/pseries/msi.c
index a82aaa786e9e..8898a968a59b 100644
--- a/arch/powerpc/platforms/pseries/msi.c
+++ b/arch/powerpc/platforms/pseries/msi.c
@@ -19,6 +19,11 @@
#include "pseries.h"
+struct pseries_msi_device {
+ unsigned int msi_quota;
+ unsigned int msi_used;
+};
+
static int query_token, change_token;
#define RTAS_QUERY_FN 0
@@ -433,8 +438,26 @@ static int pseries_msi_ops_prepare(struct irq_domain *domain, struct device *dev
struct msi_domain_info *info = domain->host_data;
struct pci_dev *pdev = to_pci_dev(dev);
int type = (info->flags & MSI_FLAG_PCI_MSIX) ? PCI_CAP_ID_MSIX : PCI_CAP_ID_MSI;
+ int ret;
+
+ struct pseries_msi_device *pseries_dev __free(kfree)
+ = kmalloc(sizeof(*pseries_dev), GFP_KERNEL);
+ if (!pseries_dev)
+ return -ENOMEM;
+
+ ret = rtas_prepare_msi_irqs(pdev, nvec, type, arg);
+ if (ret > 0) {
+ nvec = ret;
+ ret = rtas_prepare_msi_irqs(pdev, nvec, type, arg);
+ }
+ if (ret < 0)
+ return ret;
- return rtas_prepare_msi_irqs(pdev, nvec, type, arg);
+ pseries_dev->msi_quota = nvec;
+ pseries_dev->msi_used = 0;
+
+ arg->scratchpad[0].ptr = no_free_ptr(pseries_dev);
+ return 0;
}
/*
@@ -443,9 +466,13 @@ static int pseries_msi_ops_prepare(struct irq_domain *domain, struct device *dev
*/
static void pseries_msi_ops_teardown(struct irq_domain *domain, msi_alloc_info_t *arg)
{
+ struct pseries_msi_device *pseries_dev = arg->scratchpad[0].ptr;
struct pci_dev *pdev = to_pci_dev(domain->dev);
rtas_disable_msi(pdev);
+
+ WARN_ON(pseries_dev->msi_used);
+ kfree(pseries_dev);
}
static void pseries_msi_shutdown(struct irq_data *d)
@@ -546,12 +573,18 @@ static int pseries_irq_domain_alloc(struct irq_domain *domain, unsigned int virq
unsigned int nr_irqs, void *arg)
{
struct pci_controller *phb = domain->host_data;
+ struct pseries_msi_device *pseries_dev;
msi_alloc_info_t *info = arg;
struct msi_desc *desc = info->desc;
struct pci_dev *pdev = msi_desc_to_pci_dev(desc);
int hwirq;
int i, ret;
+ pseries_dev = info->scratchpad[0].ptr;
+
+ if (pseries_dev->msi_used + nr_irqs > pseries_dev->msi_quota)
+ return -ENOSPC;
+
hwirq = rtas_query_irq_number(pci_get_pdn(pdev), desc->msi_index);
if (hwirq < 0) {
dev_err(&pdev->dev, "Failed to query HW IRQ: %d\n", hwirq);
@@ -567,9 +600,10 @@ static int pseries_irq_domain_alloc(struct irq_domain *domain, unsigned int virq
goto out;
irq_domain_set_hwirq_and_chip(domain, virq + i, hwirq + i,
- &pseries_msi_irq_chip, domain->host_data);
+ &pseries_msi_irq_chip, pseries_dev);
}
+ pseries_dev->msi_used++;
return 0;
out:
@@ -582,9 +616,11 @@ static void pseries_irq_domain_free(struct irq_domain *domain, unsigned int virq
unsigned int nr_irqs)
{
struct irq_data *d = irq_domain_get_irq_data(domain, virq);
- struct pci_controller *phb = irq_data_get_irq_chip_data(d);
+ struct pseries_msi_device *pseries_dev = irq_data_get_irq_chip_data(d);
+ struct pci_controller *phb = domain->host_data;
pr_debug("%s bridge %pOF %d #%d\n", __func__, phb->dn, virq, nr_irqs);
+ pseries_dev->msi_used -= nr_irqs;
irq_domain_free_irqs_parent(domain, virq, nr_irqs);
}
--
2.47.3
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.10.y
git checkout FETCH_HEAD
git cherry-pick -x 1f73b8b56cf35de29a433aee7bfff26cea98be3f
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025120110-coastal-litigator-8952@gregkh' --subject-prefix 'PATCH 5.10.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 1f73b8b56cf35de29a433aee7bfff26cea98be3f Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?=C5=81ukasz=20Bartosik?= <ukaszb(a)chromium.org>
Date: Wed, 19 Nov 2025 21:29:09 +0000
Subject: [PATCH] xhci: dbgtty: fix device unregister
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
When DbC is disconnected then xhci_dbc_tty_unregister_device()
is called. However if there is any user space process blocked
on write to DbC terminal device then it will never be signalled
and thus stay blocked indifinitely.
This fix adds a tty_vhangup() call in xhci_dbc_tty_unregister_device().
The tty_vhangup() wakes up any blocked writers and causes subsequent
write attempts to DbC terminal device to fail.
Cc: stable <stable(a)kernel.org>
Fixes: dfba2174dc42 ("usb: xhci: Add DbC support in xHCI driver")
Signed-off-by: Łukasz Bartosik <ukaszb(a)chromium.org>
Link: https://patch.msgid.link/20251119212910.1245694-1-ukaszb@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
diff --git a/drivers/usb/host/xhci-dbgtty.c b/drivers/usb/host/xhci-dbgtty.c
index b7f95565524d..57cdda4e09c8 100644
--- a/drivers/usb/host/xhci-dbgtty.c
+++ b/drivers/usb/host/xhci-dbgtty.c
@@ -550,6 +550,12 @@ static void xhci_dbc_tty_unregister_device(struct xhci_dbc *dbc)
if (!port->registered)
return;
+ /*
+ * Hang up the TTY. This wakes up any blocked
+ * writers and causes subsequent writes to fail.
+ */
+ tty_vhangup(port->port.tty);
+
tty_unregister_device(dbc_tty_driver, port->minor);
xhci_dbc_tty_exit_port(port);
port->registered = false;
Hi there,
While running performance benchmarks for the 5.15.196 LTS tags , it was
observed that several regressions across different benchmarks is being
introduced when compared to the previous 5.15.193 kernel tag. Running an
automated bisect on both of them narrowed down the culprit commit to:
- 5666bcc3c00f7 Revert "cpuidle: menu: Avoid discarding useful
information" for 5.15
Regressions on 5.15.196 include:
-9.3% : Phoronix pts/sqlite using 2 processes on OnPrem X6-2
-6.3% : Phoronix system/sqlite on OnPrem X6-2
-18% : rds-stress -M 1 (readonly rdma-mode) metrics with 1 depth & 1
thread & 1M buffer size on OnPrem X6-2
-4 -> -8% : rds-stress -M 2 (writeonly rdma-mode) metrics with 1 depth &
1 thread & 1M buffer size on OnPrem X6-2
Up to -30% : Some Netpipe metrics on OnPrem X5-2
The culprit commits' messages mention that these reverts were done due
to performance regressions introduced in Intel Jasper Lake systems but
this revert is causing issues in other systems unfortunately. I wanted
to know the maintainers' opinion on how we should proceed in order to
fix this. If we reapply it'll bring back the previous regressions on
Jasper Lake systems and if we don't revert it then it's stuck with
current regressions. If this problem has been reported before and a fix
is in the works then please let me know I shall follow developments to
that mail thread.
Thanks & Regards,
Harshvardhan
Changing the enable/disable sequence in commit c9b1150a68d9
("drm/atomic-helper: Re-order bridge chain pre-enable and post-disable")
has caused regressions on multiple platforms: R-Car, MCDE, Rockchip.
This is an alternate series to Linus' series:
https://lore.kernel.org/all/20251202-mcde-drm-regression-thirdfix-v6-0-f1bf…
This series first reverts the original commit and reverts a fix for
mediatek which is no longer needed. It then exposes helper functions
from DRM core, and finally implements the new sequence only in the tidss
driver.
There is one more fix in upstream for the original commit, commit
5d91394f2361 ("drm/exynos: fimd: Guard display clock control with
runtime PM calls"), but I have not reverted that one as it looks like a
valid patch in its own.
I added Cc stable v6.17+ to all patches, but I didn't add Fixes tags, as
I wasn't sure what should they point to. But I could perhaps add Fixes:
<original commit> to all of these.
Signed-off-by: Tomi Valkeinen <tomi.valkeinen(a)ideasonboard.com>
---
Linus Walleij (1):
drm/atomic-helper: Export and namespace some functions
Tomi Valkeinen (3):
Revert "drm/atomic-helper: Re-order bridge chain pre-enable and post-disable"
Revert "drm/mediatek: dsi: Fix DSI host and panel bridge pre-enable order"
drm/tidss: Fix enable/disable order
drivers/gpu/drm/drm_atomic_helper.c | 122 ++++++++++++++----
drivers/gpu/drm/mediatek/mtk_dsi.c | 6 -
drivers/gpu/drm/tidss/tidss_kms.c | 30 ++++-
include/drm/drm_atomic_helper.h | 22 ++++
include/drm/drm_bridge.h | 249 ++++++++++--------------------------
5 files changed, 214 insertions(+), 215 deletions(-)
---
base-commit: 88e721ab978a86426aa08da520de77430fa7bb84
change-id: 20251205-drm-seq-fix-b4ed1f56604b
Best regards,
--
Tomi Valkeinen <tomi.valkeinen(a)ideasonboard.com>
We switched from (wrongly) using the page count to an independent
shared count. Now, shared page tables have a refcount of 1 (excluding
speculative references) and instead use ptdesc->pt_share_count to
identify sharing.
We didn't convert hugetlb_pmd_shared(), so right now, we would never
detect a shared PMD table as such, because sharing/unsharing no longer
touches the refcount of a PMD table.
Page migration, like mbind() or migrate_pages() would allow for migrating
folios mapped into such shared PMD tables, even though the folios are
not exclusive. In smaps we would account them as "private" although they
are "shared", and we would be wrongly setting the PM_MMAP_EXCLUSIVE in the
pagemap interface.
Fix it by properly using ptdesc_pmd_is_shared() in hugetlb_pmd_shared().
Fixes: 59d9094df3d7 ("mm: hugetlb: independent PMD page table shared count")
Cc: <stable(a)vger.kernel.org>
Cc: Liu Shixin <liushixin2(a)huawei.com>
Signed-off-by: David Hildenbrand (Red Hat) <david(a)kernel.org>
---
include/linux/hugetlb.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h
index 019a1c5281e4e..03c8725efa289 100644
--- a/include/linux/hugetlb.h
+++ b/include/linux/hugetlb.h
@@ -1326,7 +1326,7 @@ static inline __init void hugetlb_cma_reserve(int order)
#ifdef CONFIG_HUGETLB_PMD_PAGE_TABLE_SHARING
static inline bool hugetlb_pmd_shared(pte_t *pte)
{
- return page_count(virt_to_page(pte)) > 1;
+ return ptdesc_pmd_is_shared(virt_to_ptdesc(pte));
}
#else
static inline bool hugetlb_pmd_shared(pte_t *pte)
--
2.52.0