The patch titled
Subject: mm/mm_init: fix hash table order logging in alloc_large_system_hash()
has been added to the -mm mm-hotfixes-unstable branch. Its filename is
mm-mm_init-fix-hash-table-order-logging-in-alloc_large_system_hash.patch
This patch will shortly appear at
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patche…
This patch will later appear in the mm-hotfixes-unstable branch at
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next via the mm-everything
branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there every 2-3 working days
------------------------------------------------------
From: "Isaac J. Manjarres" <isaacmanjarres(a)google.com>
Subject: mm/mm_init: fix hash table order logging in alloc_large_system_hash()
Date: Tue, 28 Oct 2025 12:10:12 -0700
When emitting the order of the allocation for a hash table,
alloc_large_system_hash() unconditionally subtracts PAGE_SHIFT from log
base 2 of the allocation size. This is not correct if the allocation size
is smaller than a page, and yields a negative value for the order as seen
below:
TCP established hash table entries: 32 (order: -4, 256 bytes, linear) TCP
bind hash table entries: 32 (order: -2, 1024 bytes, linear)
Use get_order() to compute the order when emitting the hash table
information to correctly handle cases where the allocation size is smaller
than a page:
TCP established hash table entries: 32 (order: 0, 256 bytes, linear) TCP
bind hash table entries: 32 (order: 0, 1024 bytes, linear)
Link: https://lkml.kernel.org/r/20251028191020.413002-1-isaacmanjarres@google.com
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Signed-off-by: Isaac J. Manjarres <isaacmanjarres(a)google.com>
Cc: Mike Rapoport <rppt(a)kernel.org>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/mm_init.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
--- a/mm/mm_init.c~mm-mm_init-fix-hash-table-order-logging-in-alloc_large_system_hash
+++ a/mm/mm_init.c
@@ -2469,7 +2469,7 @@ void *__init alloc_large_system_hash(con
panic("Failed to allocate %s hash table\n", tablename);
pr_info("%s hash table entries: %ld (order: %d, %lu bytes, %s)\n",
- tablename, 1UL << log2qty, ilog2(size) - PAGE_SHIFT, size,
+ tablename, 1UL << log2qty, get_order(size), size,
virt ? (huge ? "vmalloc hugepage" : "vmalloc") : "linear");
if (_hash_shift)
_
Patches currently in -mm which might be from isaacmanjarres(a)google.com are
mm-mm_init-fix-hash-table-order-logging-in-alloc_large_system_hash.patch
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.4.y
git checkout FETCH_HEAD
git cherry-pick -x 15292f1b4c55a3a7c940dbcb6cb8793871ed3d92
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025102058-citadel-crinkly-125f@gregkh' --subject-prefix 'PATCH 5.4.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 15292f1b4c55a3a7c940dbcb6cb8793871ed3d92 Mon Sep 17 00:00:00 2001
From: Babu Moger <babu.moger(a)amd.com>
Date: Fri, 10 Oct 2025 12:08:35 -0500
Subject: [PATCH] x86/resctrl: Fix miscount of bandwidth event when
reactivating previously unavailable RMID
Users can create as many monitoring groups as the number of RMIDs supported
by the hardware. However, on AMD systems, only a limited number of RMIDs
are guaranteed to be actively tracked by the hardware. RMIDs that exceed
this limit are placed in an "Unavailable" state.
When a bandwidth counter is read for such an RMID, the hardware sets
MSR_IA32_QM_CTR.Unavailable (bit 62). When such an RMID starts being tracked
again the hardware counter is reset to zero. MSR_IA32_QM_CTR.Unavailable
remains set on first read after tracking re-starts and is clear on all
subsequent reads as long as the RMID is tracked.
resctrl miscounts the bandwidth events after an RMID transitions from the
"Unavailable" state back to being tracked. This happens because when the
hardware starts counting again after resetting the counter to zero, resctrl
in turn compares the new count against the counter value stored from the
previous time the RMID was tracked.
This results in resctrl computing an event value that is either undercounting
(when new counter is more than stored counter) or a mistaken overflow (when
new counter is less than stored counter).
Reset the stored value (arch_mbm_state::prev_msr) of MSR_IA32_QM_CTR to
zero whenever the RMID is in the "Unavailable" state to ensure accurate
counting after the RMID resets to zero when it starts to be tracked again.
Example scenario that results in mistaken overflow
==================================================
1. The resctrl filesystem is mounted, and a task is assigned to a
monitoring group.
$mount -t resctrl resctrl /sys/fs/resctrl
$mkdir /sys/fs/resctrl/mon_groups/test1/
$echo 1234 > /sys/fs/resctrl/mon_groups/test1/tasks
$cat /sys/fs/resctrl/mon_groups/test1/mon_data/mon_L3_*/mbm_total_bytes
21323 <- Total bytes on domain 0
"Unavailable" <- Total bytes on domain 1
Task is running on domain 0. Counter on domain 1 is "Unavailable".
2. The task runs on domain 0 for a while and then moves to domain 1. The
counter starts incrementing on domain 1.
$cat /sys/fs/resctrl/mon_groups/test1/mon_data/mon_L3_*/mbm_total_bytes
7345357 <- Total bytes on domain 0
4545 <- Total bytes on domain 1
3. At some point, the RMID in domain 0 transitions to the "Unavailable"
state because the task is no longer executing in that domain.
$cat /sys/fs/resctrl/mon_groups/test1/mon_data/mon_L3_*/mbm_total_bytes
"Unavailable" <- Total bytes on domain 0
434341 <- Total bytes on domain 1
4. Since the task continues to migrate between domains, it may eventually
return to domain 0.
$cat /sys/fs/resctrl/mon_groups/test1/mon_data/mon_L3_*/mbm_total_bytes
17592178699059 <- Overflow on domain 0
3232332 <- Total bytes on domain 1
In this case, the RMID on domain 0 transitions from "Unavailable" state to
active state. The hardware sets MSR_IA32_QM_CTR.Unavailable (bit 62) when
the counter is read and begins tracking the RMID counting from 0.
Subsequent reads succeed but return a value smaller than the previously
saved MSR value (7345357). Consequently, the resctrl's overflow logic is
triggered, it compares the previous value (7345357) with the new, smaller
value and incorrectly interprets this as a counter overflow, adding a large
delta.
In reality, this is a false positive: the counter did not overflow but was
simply reset when the RMID transitioned from "Unavailable" back to active
state.
Here is the text from APM [1] available from [2].
"In PQOS Version 2.0 or higher, the MBM hardware will set the U bit on the
first QM_CTR read when it begins tracking an RMID that it was not
previously tracking. The U bit will be zero for all subsequent reads from
that RMID while it is still tracked by the hardware. Therefore, a QM_CTR
read with the U bit set when that RMID is in use by a processor can be
considered 0 when calculating the difference with a subsequent read."
[1] AMD64 Architecture Programmer's Manual Volume 2: System Programming
Publication # 24593 Revision 3.41 section 19.3.3 Monitoring L3 Memory
Bandwidth (MBM).
[ bp: Split commit message into smaller paragraph chunks for better
consumption. ]
Fixes: 4d05bf71f157d ("x86/resctrl: Introduce AMD QOS feature")
Signed-off-by: Babu Moger <babu.moger(a)amd.com>
Signed-off-by: Borislav Petkov (AMD) <bp(a)alien8.de>
Reviewed-by: Reinette Chatre <reinette.chatre(a)intel.com>
Tested-by: Reinette Chatre <reinette.chatre(a)intel.com>
Cc: stable(a)vger.kernel.org # needs adjustments for <= v6.17
Link: https://bugzilla.kernel.org/show_bug.cgi?id=206537 # [2]
diff --git a/arch/x86/kernel/cpu/resctrl/monitor.c b/arch/x86/kernel/cpu/resctrl/monitor.c
index c8945610d455..2cd25a0d4637 100644
--- a/arch/x86/kernel/cpu/resctrl/monitor.c
+++ b/arch/x86/kernel/cpu/resctrl/monitor.c
@@ -242,7 +242,9 @@ int resctrl_arch_rmid_read(struct rdt_resource *r, struct rdt_mon_domain *d,
u32 unused, u32 rmid, enum resctrl_event_id eventid,
u64 *val, void *ignored)
{
+ struct rdt_hw_mon_domain *hw_dom = resctrl_to_arch_mon_dom(d);
int cpu = cpumask_any(&d->hdr.cpu_mask);
+ struct arch_mbm_state *am;
u64 msr_val;
u32 prmid;
int ret;
@@ -251,12 +253,16 @@ int resctrl_arch_rmid_read(struct rdt_resource *r, struct rdt_mon_domain *d,
prmid = logical_rmid_to_physical_rmid(cpu, rmid);
ret = __rmid_read_phys(prmid, eventid, &msr_val);
- if (ret)
- return ret;
- *val = get_corrected_val(r, d, rmid, eventid, msr_val);
+ if (!ret) {
+ *val = get_corrected_val(r, d, rmid, eventid, msr_val);
+ } else if (ret == -EINVAL) {
+ am = get_arch_mbm_state(hw_dom, rmid, eventid);
+ if (am)
+ am->prev_msr = 0;
+ }
- return 0;
+ return ret;
}
static int __cntr_id_read(u32 cntr_id, u64 *val)
The following commit has been merged into the x86/urgent branch of tip:
Commit-ID: 388eff894d6bc5f921e9bfff0e4b0ab2684a96e9
Gitweb: https://git.kernel.org/tip/388eff894d6bc5f921e9bfff0e4b0ab2684a96e9
Author: Chang S. Bae <chang.seok.bae(a)intel.com>
AuthorDate: Mon, 09 Jun 2025 17:16:59 -07:00
Committer: Dave Hansen <dave.hansen(a)linux.intel.com>
CommitterDate: Tue, 28 Oct 2025 12:10:59 -07:00
x86/fpu: Ensure XFD state on signal delivery
Sean reported [1] the following splat when running KVM tests:
WARNING: CPU: 232 PID: 15391 at xfd_validate_state+0x65/0x70
Call Trace:
<TASK>
fpu__clear_user_states+0x9c/0x100
arch_do_signal_or_restart+0x142/0x210
exit_to_user_mode_loop+0x55/0x100
do_syscall_64+0x205/0x2c0
entry_SYSCALL_64_after_hwframe+0x4b/0x53
Chao further identified [2] a reproducible scenario involving signal
delivery: a non-AMX task is preempted by an AMX-enabled task which
modifies the XFD MSR.
When the non-AMX task resumes and reloads XSTATE with init values,
a warning is triggered due to a mismatch between fpstate::xfd and the
CPU's current XFD state. fpu__clear_user_states() does not currently
re-synchronize the XFD state after such preemption.
Invoke xfd_update_state() which detects and corrects the mismatch if
there is a dynamic feature.
This also benefits the sigreturn path, as fpu__restore_sig() may call
fpu__clear_user_states() when the sigframe is inaccessible.
[ dhansen: minor changelog munging ]
Closes: https://lore.kernel.org/lkml/aDCo_SczQOUaB2rS@google.com [1]
Fixes: 672365477ae8a ("x86/fpu: Update XFD state where required")
Reported-by: Sean Christopherson <seanjc(a)google.com>
Signed-off-by: Chang S. Bae <chang.seok.bae(a)intel.com>
Signed-off-by: Dave Hansen <dave.hansen(a)linux.intel.com>
Reviewed-by: Chao Gao <chao.gao(a)intel.com>
Tested-by: Chao Gao <chao.gao(a)intel.com>
Link: https://lore.kernel.org/all/aDWbctO%2FRfTGiCg3@intel.com [2]
Cc:stable@vger.kernel.org
Link: https://patch.msgid.link/20250610001700.4097-1-chang.seok.bae%40intel.com
---
arch/x86/kernel/fpu/core.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/arch/x86/kernel/fpu/core.c b/arch/x86/kernel/fpu/core.c
index 1f71cc1..e88eacb 100644
--- a/arch/x86/kernel/fpu/core.c
+++ b/arch/x86/kernel/fpu/core.c
@@ -825,6 +825,9 @@ void fpu__clear_user_states(struct fpu *fpu)
!fpregs_state_valid(fpu, smp_processor_id()))
os_xrstor_supervisor(fpu->fpstate);
+ /* Ensure XFD state is in sync before reloading XSTATE */
+ xfd_update_state(fpu->fpstate);
+
/* Reset user states in registers. */
restore_fpregs_from_init_fpstate(XFEATURE_MASK_USER_RESTORE);
The L1 substates support requires additional steps to work, namely:
-Proper handling of the CLKREQ# sideband signal. (It is mostly handled by
hardware, but software still needs to set the clkreq fields in the
PCIE_CLIENT_POWER_CON register to match the hardware implementation.)
-Program the frequency of the aux clock into the
DSP_PCIE_PL_AUX_CLK_FREQ_OFF register. (During L1 substates the core_clk
is turned off and the aux_clk is used instead.)
These steps are currently missing from the driver.
For more details, see section '18.6.6.4 L1 Substate' in the RK3658 TRM 1.1
Part 2, or section '11.6.6.4 L1 Substate' in the RK3588 TRM 1.0 Part2.
While this has always been a problem when using e.g.
CONFIG_PCIEASPM_POWER_SUPERSAVE=y, or when modifying
/sys/bus/pci/devices/.../link/l1_2_aspm, the lacking driver support for L1
substates became more apparent after commit f3ac2ff14834 ("PCI/ASPM:
Enable all ClockPM and ASPM states for devicetree platforms"), which
enabled ASPM also for CONFIG_PCIEASPM_DEFAULT=y.
When using e.g. an NVMe drive connected to the PCIe controller, the
problem will be seen as:
nvme nvme0: controller is down; will reset: CSTS=0xffffffff, PCI_STATUS=0x10
nvme nvme0: Does your device have a faulty power saving mode enabled?
nvme nvme0: Try "nvme_core.default_ps_max_latency_us=0 pcie_aspm=off pcie_port_pm=off" and report a bug
Thus, prevent advertising L1 Substates support until proper driver support
is added.
Cc: stable(a)vger.kernel.org
Fixes: 0e898eb8df4e ("PCI: rockchip-dwc: Add Rockchip RK356X host controller driver")
Fixes: f3ac2ff14834 ("PCI/ASPM: Enable all ClockPM and ASPM states for devicetree platforms")
Acked-by: Shawn Lin <shawn.lin(a)rock-chips.com>
Signed-off-by: Niklas Cassel <cassel(a)kernel.org>
---
Changes since v2:
-Improve commit message (Bjorn)
drivers/pci/controller/dwc/pcie-dw-rockchip.c | 21 +++++++++++++++++++
1 file changed, 21 insertions(+)
diff --git a/drivers/pci/controller/dwc/pcie-dw-rockchip.c b/drivers/pci/controller/dwc/pcie-dw-rockchip.c
index 3e2752c7dd09..84f882abbca5 100644
--- a/drivers/pci/controller/dwc/pcie-dw-rockchip.c
+++ b/drivers/pci/controller/dwc/pcie-dw-rockchip.c
@@ -200,6 +200,25 @@ static bool rockchip_pcie_link_up(struct dw_pcie *pci)
return FIELD_GET(PCIE_LINKUP_MASK, val) == PCIE_LINKUP;
}
+/*
+ * See e.g. section '11.6.6.4 L1 Substate' in the RK3588 TRM V1.0 for the steps
+ * needed to support L1 substates. Currently, not a single rockchip platform
+ * performs these steps, so disable L1 substates until there is proper support.
+ */
+static void rockchip_pcie_disable_l1sub(struct dw_pcie *pci)
+{
+ u32 cap, l1subcap;
+
+ cap = dw_pcie_find_ext_capability(pci, PCI_EXT_CAP_ID_L1SS);
+ if (cap) {
+ l1subcap = dw_pcie_readl_dbi(pci, cap + PCI_L1SS_CAP);
+ l1subcap &= ~(PCI_L1SS_CAP_L1_PM_SS | PCI_L1SS_CAP_ASPM_L1_1 |
+ PCI_L1SS_CAP_ASPM_L1_2 | PCI_L1SS_CAP_PCIPM_L1_1 |
+ PCI_L1SS_CAP_PCIPM_L1_2);
+ dw_pcie_writel_dbi(pci, cap + PCI_L1SS_CAP, l1subcap);
+ }
+}
+
static void rockchip_pcie_enable_l0s(struct dw_pcie *pci)
{
u32 cap, lnkcap;
@@ -264,6 +283,7 @@ static int rockchip_pcie_host_init(struct dw_pcie_rp *pp)
irq_set_chained_handler_and_data(irq, rockchip_pcie_intx_handler,
rockchip);
+ rockchip_pcie_disable_l1sub(pci);
rockchip_pcie_enable_l0s(pci);
return 0;
@@ -301,6 +321,7 @@ static void rockchip_pcie_ep_init(struct dw_pcie_ep *ep)
struct dw_pcie *pci = to_dw_pcie_from_ep(ep);
enum pci_barno bar;
+ rockchip_pcie_disable_l1sub(pci);
rockchip_pcie_enable_l0s(pci);
rockchip_pcie_ep_hide_broken_ats_cap_rk3588(ep);
--
2.51.0
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.10.y
git checkout FETCH_HEAD
git cherry-pick -x 15292f1b4c55a3a7c940dbcb6cb8793871ed3d92
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025102056-qualify-unopposed-be08@gregkh' --subject-prefix 'PATCH 5.10.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 15292f1b4c55a3a7c940dbcb6cb8793871ed3d92 Mon Sep 17 00:00:00 2001
From: Babu Moger <babu.moger(a)amd.com>
Date: Fri, 10 Oct 2025 12:08:35 -0500
Subject: [PATCH] x86/resctrl: Fix miscount of bandwidth event when
reactivating previously unavailable RMID
Users can create as many monitoring groups as the number of RMIDs supported
by the hardware. However, on AMD systems, only a limited number of RMIDs
are guaranteed to be actively tracked by the hardware. RMIDs that exceed
this limit are placed in an "Unavailable" state.
When a bandwidth counter is read for such an RMID, the hardware sets
MSR_IA32_QM_CTR.Unavailable (bit 62). When such an RMID starts being tracked
again the hardware counter is reset to zero. MSR_IA32_QM_CTR.Unavailable
remains set on first read after tracking re-starts and is clear on all
subsequent reads as long as the RMID is tracked.
resctrl miscounts the bandwidth events after an RMID transitions from the
"Unavailable" state back to being tracked. This happens because when the
hardware starts counting again after resetting the counter to zero, resctrl
in turn compares the new count against the counter value stored from the
previous time the RMID was tracked.
This results in resctrl computing an event value that is either undercounting
(when new counter is more than stored counter) or a mistaken overflow (when
new counter is less than stored counter).
Reset the stored value (arch_mbm_state::prev_msr) of MSR_IA32_QM_CTR to
zero whenever the RMID is in the "Unavailable" state to ensure accurate
counting after the RMID resets to zero when it starts to be tracked again.
Example scenario that results in mistaken overflow
==================================================
1. The resctrl filesystem is mounted, and a task is assigned to a
monitoring group.
$mount -t resctrl resctrl /sys/fs/resctrl
$mkdir /sys/fs/resctrl/mon_groups/test1/
$echo 1234 > /sys/fs/resctrl/mon_groups/test1/tasks
$cat /sys/fs/resctrl/mon_groups/test1/mon_data/mon_L3_*/mbm_total_bytes
21323 <- Total bytes on domain 0
"Unavailable" <- Total bytes on domain 1
Task is running on domain 0. Counter on domain 1 is "Unavailable".
2. The task runs on domain 0 for a while and then moves to domain 1. The
counter starts incrementing on domain 1.
$cat /sys/fs/resctrl/mon_groups/test1/mon_data/mon_L3_*/mbm_total_bytes
7345357 <- Total bytes on domain 0
4545 <- Total bytes on domain 1
3. At some point, the RMID in domain 0 transitions to the "Unavailable"
state because the task is no longer executing in that domain.
$cat /sys/fs/resctrl/mon_groups/test1/mon_data/mon_L3_*/mbm_total_bytes
"Unavailable" <- Total bytes on domain 0
434341 <- Total bytes on domain 1
4. Since the task continues to migrate between domains, it may eventually
return to domain 0.
$cat /sys/fs/resctrl/mon_groups/test1/mon_data/mon_L3_*/mbm_total_bytes
17592178699059 <- Overflow on domain 0
3232332 <- Total bytes on domain 1
In this case, the RMID on domain 0 transitions from "Unavailable" state to
active state. The hardware sets MSR_IA32_QM_CTR.Unavailable (bit 62) when
the counter is read and begins tracking the RMID counting from 0.
Subsequent reads succeed but return a value smaller than the previously
saved MSR value (7345357). Consequently, the resctrl's overflow logic is
triggered, it compares the previous value (7345357) with the new, smaller
value and incorrectly interprets this as a counter overflow, adding a large
delta.
In reality, this is a false positive: the counter did not overflow but was
simply reset when the RMID transitioned from "Unavailable" back to active
state.
Here is the text from APM [1] available from [2].
"In PQOS Version 2.0 or higher, the MBM hardware will set the U bit on the
first QM_CTR read when it begins tracking an RMID that it was not
previously tracking. The U bit will be zero for all subsequent reads from
that RMID while it is still tracked by the hardware. Therefore, a QM_CTR
read with the U bit set when that RMID is in use by a processor can be
considered 0 when calculating the difference with a subsequent read."
[1] AMD64 Architecture Programmer's Manual Volume 2: System Programming
Publication # 24593 Revision 3.41 section 19.3.3 Monitoring L3 Memory
Bandwidth (MBM).
[ bp: Split commit message into smaller paragraph chunks for better
consumption. ]
Fixes: 4d05bf71f157d ("x86/resctrl: Introduce AMD QOS feature")
Signed-off-by: Babu Moger <babu.moger(a)amd.com>
Signed-off-by: Borislav Petkov (AMD) <bp(a)alien8.de>
Reviewed-by: Reinette Chatre <reinette.chatre(a)intel.com>
Tested-by: Reinette Chatre <reinette.chatre(a)intel.com>
Cc: stable(a)vger.kernel.org # needs adjustments for <= v6.17
Link: https://bugzilla.kernel.org/show_bug.cgi?id=206537 # [2]
diff --git a/arch/x86/kernel/cpu/resctrl/monitor.c b/arch/x86/kernel/cpu/resctrl/monitor.c
index c8945610d455..2cd25a0d4637 100644
--- a/arch/x86/kernel/cpu/resctrl/monitor.c
+++ b/arch/x86/kernel/cpu/resctrl/monitor.c
@@ -242,7 +242,9 @@ int resctrl_arch_rmid_read(struct rdt_resource *r, struct rdt_mon_domain *d,
u32 unused, u32 rmid, enum resctrl_event_id eventid,
u64 *val, void *ignored)
{
+ struct rdt_hw_mon_domain *hw_dom = resctrl_to_arch_mon_dom(d);
int cpu = cpumask_any(&d->hdr.cpu_mask);
+ struct arch_mbm_state *am;
u64 msr_val;
u32 prmid;
int ret;
@@ -251,12 +253,16 @@ int resctrl_arch_rmid_read(struct rdt_resource *r, struct rdt_mon_domain *d,
prmid = logical_rmid_to_physical_rmid(cpu, rmid);
ret = __rmid_read_phys(prmid, eventid, &msr_val);
- if (ret)
- return ret;
- *val = get_corrected_val(r, d, rmid, eventid, msr_val);
+ if (!ret) {
+ *val = get_corrected_val(r, d, rmid, eventid, msr_val);
+ } else if (ret == -EINVAL) {
+ am = get_arch_mbm_state(hw_dom, rmid, eventid);
+ if (am)
+ am->prev_msr = 0;
+ }
- return 0;
+ return ret;
}
static int __cntr_id_read(u32 cntr_id, u64 *val)
From: Sakari Ailus <sakari.ailus(a)linux.intel.com>
[ Upstream commit d9f866b2bb3eec38b3734f1fed325ec7c55ccdfa ]
fwnode_graph_get_next_subnode() may return fwnode backed by ACPI
device nodes and there has been no check these devices are present
in the system, unlike there has been on fwnode OF backend.
In order to provide consistent behaviour towards callers,
add a check for device presence by introducing
a new function acpi_get_next_present_subnode(), used as the
get_next_child_node() fwnode operation that also checks device
node presence.
Signed-off-by: Sakari Ailus <sakari.ailus(a)linux.intel.com>
Reviewed-by: Laurent Pinchart <laurent.pinchart+renesas(a)ideasonboard.com>
Reviewed-by: Jonathan Cameron <jonathan.cameron(a)huawei.com>
Link: https://patch.msgid.link/20251001102636.1272722-2-sakari.ailus@linux.intel.…
[ rjw: Kerneldoc comment and changelog edits ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki(a)intel.com>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
LLM Generated explanations, may be completely bogus:
YES – this change fixes a real behavioural bug in the ACPI fwnode
backend and should go to stable.
- `drivers/acpi/property.c:1375` adds `acpi_get_next_present_subnode()`
so `.get_next_child_node` skips ACPI child devices whose `_STA` says
they are absent (`acpi_device_is_present()`), while still returning
data subnodes unchanged.
- The new helper is now wired into the ACPI fwnode ops
(`drivers/acpi/property.c:1731`), making generic helpers such as
`fwnode_get_next_child_node()` and macros like
`fwnode_for_each_child_node` (`include/linux/property.h:167`) behave
the same as the OF backend, which already filtered unavailable
children via `of_get_next_available_child()`
(`drivers/of/property.c:1070`).
- Several core helpers assume disabled endpoints never surface: e.g.
`fwnode_graph_get_endpoint_by_id()` in `drivers/base/property.c:1286`
promises to hide endpoints on disabled devices, and higher layers such
as `v4l2_fwnode_reference_get_int_prop()`
(`drivers/media/v4l2-core/v4l2-fwnode.c:1064`) iterate child nodes
without rechecking availability. On ACPI systems today they still see
powered-off devices, leading to asynchronous notifiers that wait
forever for hardware that can’t appear, or to bogus graph
enumerations. This patch closes that gap.
Risk is low: it only suppresses ACPI device nodes already known to be
absent, aligns behaviour with DT userspace expectations, and leaves data
nodes untouched. No extra dependencies are required, so the fix is self-
contained and appropriate for stable backporting. Suggest running
existing ACPI graph users (media/typec drivers) after backport to
confirm no regressions.
drivers/acpi/property.c | 24 +++++++++++++++++++++++-
1 file changed, 23 insertions(+), 1 deletion(-)
diff --git a/drivers/acpi/property.c b/drivers/acpi/property.c
index c086786fe84cb..d74678f0ba4af 100644
--- a/drivers/acpi/property.c
+++ b/drivers/acpi/property.c
@@ -1357,6 +1357,28 @@ struct fwnode_handle *acpi_get_next_subnode(const struct fwnode_handle *fwnode,
return NULL;
}
+/*
+ * acpi_get_next_present_subnode - Return the next present child node handle
+ * @fwnode: Firmware node to find the next child node for.
+ * @child: Handle to one of the device's child nodes or a null handle.
+ *
+ * Like acpi_get_next_subnode(), but the device nodes returned by
+ * acpi_get_next_present_subnode() are guaranteed to be present.
+ *
+ * Returns: The fwnode handle of the next present sub-node.
+ */
+static struct fwnode_handle *
+acpi_get_next_present_subnode(const struct fwnode_handle *fwnode,
+ struct fwnode_handle *child)
+{
+ do {
+ child = acpi_get_next_subnode(fwnode, child);
+ } while (is_acpi_device_node(child) &&
+ !acpi_device_is_present(to_acpi_device_node(child)));
+
+ return child;
+}
+
/**
* acpi_node_get_parent - Return parent fwnode of this fwnode
* @fwnode: Firmware node whose parent to get
@@ -1701,7 +1723,7 @@ static int acpi_fwnode_irq_get(const struct fwnode_handle *fwnode,
.property_read_string_array = \
acpi_fwnode_property_read_string_array, \
.get_parent = acpi_node_get_parent, \
- .get_next_child_node = acpi_get_next_subnode, \
+ .get_next_child_node = acpi_get_next_present_subnode, \
.get_named_child_node = acpi_fwnode_get_named_child_node, \
.get_name = acpi_fwnode_get_name, \
.get_name_prefix = acpi_fwnode_get_name_prefix, \
--
2.51.0
Sean reported [1] the following splat when running KVM tests:
WARNING: CPU: 232 PID: 15391 at xfd_validate_state+0x65/0x70
Call Trace:
<TASK>
fpu__clear_user_states+0x9c/0x100
arch_do_signal_or_restart+0x142/0x210
exit_to_user_mode_loop+0x55/0x100
do_syscall_64+0x205/0x2c0
entry_SYSCALL_64_after_hwframe+0x4b/0x53
Chao further identified [2] a reproducible scenarios involving signal
delivery: a non-AMX task is preempted by an AMX-enabled task which
modifies the XFD MSR.
When the non-AMX task resumes and reloads XSTATE with init values,
a warning is triggered due to a mismatch between fpstate::xfd and the
CPU's current XFD state. fpu__clear_user_states() does not currently
re-synchronize the XFD state after such preemption.
Invoke xfd_update_state() which detects and corrects the mismatch if the
dynamic feature is enabled.
This also benefits the sigreturn path, as fpu__restore_sig() may call
fpu__clear_user_states() when the sigframe is inaccessible.
Fixes: 672365477ae8a ("x86/fpu: Update XFD state where required")
Reported-by: Sean Christopherson <seanjc(a)google.com>
Closes: https://lore.kernel.org/lkml/aDCo_SczQOUaB2rS@google.com [1]
Tested-by: Chao Gao <chao.gao(a)intel.com>
Signed-off-by: Chang S. Bae <chang.seok.bae(a)intel.com>
Cc: stable(a)vger.kernel.org
Link: https://lore.kernel.org/all/aDWbctO%2FRfTGiCg3@intel.com [2]
---
arch/x86/kernel/fpu/core.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/arch/x86/kernel/fpu/core.c b/arch/x86/kernel/fpu/core.c
index ea138583dd92..5fa782a2ae7c 100644
--- a/arch/x86/kernel/fpu/core.c
+++ b/arch/x86/kernel/fpu/core.c
@@ -800,6 +800,9 @@ void fpu__clear_user_states(struct fpu *fpu)
!fpregs_state_valid(fpu, smp_processor_id()))
os_xrstor_supervisor(fpu->fpstate);
+ /* Ensure XFD state is in sync before reloading XSTATE */
+ xfd_update_state(fpu->fpstate);
+
/* Reset user states in registers. */
restore_fpregs_from_init_fpstate(XFEATURE_MASK_USER_RESTORE);
--
2.48.1
Currently, on ASUS projects, the TAS2781 codec attaches the speaker GPIO
to the first tasdevice_priv instance using devm. This causes
tas2781_read_acpi to fail on subsequent probes since the GPIO is already
managed by the first device. This causes a failure on Xbox Ally X,
because it has two amplifiers, and prevents us from quirking both the
Xbox Ally and Xbox Ally X in the realtek codec driver.
It is unnecessary to attach the GPIO to a device as it is static.
Therefore, instead of attaching it and then reading it when loading the
firmware, read its value directly in tas2781_read_acpi and store it in
the private data structure. Then, make reading the value non-fatal so
that ASUS projects that miss a speaker pin can still work, perhaps using
fallback firmware.
Fixes: 4e7035a75da9 ("ALSA: hda/tas2781: Add speaker id check for ASUS projects")
Cc: stable(a)vger.kernel.org # 6.17
Signed-off-by: Antheas Kapenekakis <lkml(a)antheas.dev>
---
include/sound/tas2781.h | 2 +-
.../hda/codecs/side-codecs/tas2781_hda_i2c.c | 44 +++++++++++--------
2 files changed, 26 insertions(+), 20 deletions(-)
diff --git a/include/sound/tas2781.h b/include/sound/tas2781.h
index 0fbcdb15c74b..29d15ba65f04 100644
--- a/include/sound/tas2781.h
+++ b/include/sound/tas2781.h
@@ -197,7 +197,6 @@ struct tasdevice_priv {
struct acoustic_data acou_data;
#endif
struct tasdevice_fw *fmw;
- struct gpio_desc *speaker_id;
struct gpio_desc *reset;
struct mutex codec_lock;
struct regmap *regmap;
@@ -215,6 +214,7 @@ struct tasdevice_priv {
unsigned int magic_num;
unsigned int chip_id;
unsigned int sysclk;
+ int speaker_id;
int irq;
int cur_prog;
diff --git a/sound/hda/codecs/side-codecs/tas2781_hda_i2c.c b/sound/hda/codecs/side-codecs/tas2781_hda_i2c.c
index 0357401a6023..c8619995b1d7 100644
--- a/sound/hda/codecs/side-codecs/tas2781_hda_i2c.c
+++ b/sound/hda/codecs/side-codecs/tas2781_hda_i2c.c
@@ -87,6 +87,7 @@ static const struct acpi_gpio_mapping tas2781_speaker_id_gpios[] = {
static int tas2781_read_acpi(struct tasdevice_priv *p, const char *hid)
{
+ struct gpio_desc *speaker_id;
struct acpi_device *adev;
struct device *physdev;
LIST_HEAD(resources);
@@ -119,19 +120,31 @@ static int tas2781_read_acpi(struct tasdevice_priv *p, const char *hid)
/* Speaker id was needed for ASUS projects. */
ret = kstrtou32(sub, 16, &subid);
if (!ret && upper_16_bits(subid) == PCI_VENDOR_ID_ASUSTEK) {
- ret = devm_acpi_dev_add_driver_gpios(p->dev,
- tas2781_speaker_id_gpios);
- if (ret < 0)
+ ret = acpi_dev_add_driver_gpios(adev, tas2781_speaker_id_gpios);
+ if (ret < 0) {
dev_err(p->dev, "Failed to add driver gpio %d.\n",
ret);
- p->speaker_id = devm_gpiod_get(p->dev, "speakerid", GPIOD_IN);
- if (IS_ERR(p->speaker_id)) {
- dev_err(p->dev, "Failed to get Speaker id.\n");
- ret = PTR_ERR(p->speaker_id);
- goto err;
+ p->speaker_id = -1;
+ goto end_2563;
+ }
+
+ speaker_id = fwnode_gpiod_get_index(acpi_fwnode_handle(adev),
+ "speakerid", 0, GPIOD_IN, NULL);
+ if (!IS_ERR(speaker_id)) {
+ p->speaker_id = gpiod_get_value_cansleep(speaker_id);
+ dev_dbg(p->dev, "Got speaker id gpio from ACPI: %d.\n",
+ p->speaker_id);
+ gpiod_put(speaker_id);
+ } else {
+ p->speaker_id = -1;
+ ret = PTR_ERR(speaker_id);
+ dev_err(p->dev, "Get speaker id gpio failed %d.\n",
+ ret);
}
+
+ acpi_dev_remove_driver_gpios(adev);
} else {
- p->speaker_id = NULL;
+ p->speaker_id = -1;
}
end_2563:
@@ -432,23 +445,16 @@ static void tasdevice_dspfw_init(void *context)
struct tas2781_hda *tas_hda = dev_get_drvdata(tas_priv->dev);
struct tas2781_hda_i2c_priv *hda_priv = tas_hda->hda_priv;
struct hda_codec *codec = tas_priv->codec;
- int ret, spk_id;
+ int ret;
tasdevice_dsp_remove(tas_priv);
tas_priv->fw_state = TASDEVICE_DSP_FW_PENDING;
- if (tas_priv->speaker_id != NULL) {
- // Speaker id need to be checked for ASUS only.
- spk_id = gpiod_get_value(tas_priv->speaker_id);
- if (spk_id < 0) {
- // Speaker id is not valid, use default.
- dev_dbg(tas_priv->dev, "Wrong spk_id = %d\n", spk_id);
- spk_id = 0;
- }
+ if (tas_priv->speaker_id >= 0) {
snprintf(tas_priv->coef_binaryname,
sizeof(tas_priv->coef_binaryname),
"TAS2XXX%04X%d.bin",
lower_16_bits(codec->core.subsystem_id),
- spk_id);
+ tas_priv->speaker_id);
} else {
snprintf(tas_priv->coef_binaryname,
sizeof(tas_priv->coef_binaryname),
base-commit: 211ddde0823f1442e4ad052a2f30f050145ccada
--
2.51.1
From: Shawn Guo <shawnguo(a)kernel.org>
Per commit 9442490a0286 ("regmap: irq: Support wake IRQ mask inversion")
the wake_invert flag is to support enable register, so cleared bits are
wake disabled.
Fixes: 68622bdfefb9 ("regmap: irq: document mask/wake_invert flags")
Cc: stable(a)vger.kernel.org
Signed-off-by: Shawn Guo <shawnguo(a)kernel.org>
---
include/linux/regmap.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/include/linux/regmap.h b/include/linux/regmap.h
index 4e1ac1fbcec4..55343795644b 100644
--- a/include/linux/regmap.h
+++ b/include/linux/regmap.h
@@ -1643,7 +1643,7 @@ struct regmap_irq_chip_data;
* @status_invert: Inverted status register: cleared bits are active interrupts.
* @status_is_level: Status register is actuall signal level: Xor status
* register with previous value to get active interrupts.
- * @wake_invert: Inverted wake register: cleared bits are wake enabled.
+ * @wake_invert: Inverted wake register: cleared bits are wake disabled.
* @type_in_mask: Use the mask registers for controlling irq type. Use this if
* the hardware provides separate bits for rising/falling edge
* or low/high level interrupts and they should be combined into
--
2.43.0
Currently mremap folio pte batch ignores the writable bit during figuring
out a set of similar ptes mapping the same folio. Suppose that the first
pte of the batch is writable while the others are not - set_ptes will
end up setting the writable bit on the other ptes, which is a violation
of mremap semantics. Therefore, use FPB_RESPECT_WRITE to check the writable
bit while determining the pte batch.
Cc: stable(a)vger.kernel.org #6.17
Fixes: f822a9a81a31 ("mm: optimize mremap() by PTE batching")
Reported-by: David Hildenbrand <david(a)redhat.com>
Debugged-by: David Hildenbrand <david(a)redhat.com>
Signed-off-by: Dev Jain <dev.jain(a)arm.com>
---
mm-selftests pass. Based on mm-new. Need David H. to confirm whether
the repro passes.
mm/mremap.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/mm/mremap.c b/mm/mremap.c
index a7f531c17b79..8ad06cf50783 100644
--- a/mm/mremap.c
+++ b/mm/mremap.c
@@ -187,7 +187,7 @@ static int mremap_folio_pte_batch(struct vm_area_struct *vma, unsigned long addr
if (!folio || !folio_test_large(folio))
return 1;
- return folio_pte_batch(folio, ptep, pte, max_nr);
+ return folio_pte_batch_flags(folio, NULL, ptep, &pte, max_nr, FPB_RESPECT_WRITE);
}
static int move_ptes(struct pagetable_move_control *pmc,
--
2.30.2
Evaluaciones Psicométricas para RR.HH.
body {
margin: 0;
padding: 0;
font-family: Arial, Helvetica, sans-serif;
font-size: 14px;
color: #333;
background-color: #ffffff;
}
table {
border-spacing: 0;
width: 100%;
max-width: 600px;
margin: auto;
}
td {
padding: 12px 20px;
}
a {
color: #1a73e8;
text-decoration: none;
}
.footer {
font-size: 12px;
color: #888888;
text-align: center;
}
Mejora tus procesos de selección con evaluaciones psicométricas fáciles y confiables.
Hola, ,
Sabemos que encontrar al candidato ideal va más allá del currículum. Por eso quiero contarte brevemente sobre PsicoSmart, una plataforma que ayuda a equipos de RR.HH. a evaluar talento con pruebas psicométricas rápidas, confiables y fáciles de aplicar.
Con PsicoSmart puedes:
Aplicar evaluaciones psicométricas 100% en línea.
Elegir entre más de 31 pruebas psicométricas
Generar reportes automáticos, visuales y fáciles de interpretar.
Comparar resultados entre candidatos en segundos.
Ahorrar horas valiosas del proceso de selección.
Si estás buscando mejorar tus contrataciones, te lo recomiendo muchísimo. Si quieres conocer más puedes responder este correo o simplemente contactarme, mis datos están abajo.
Saludos,
-----------------------------
Atte.: Valeria Pérez
Ciudad de México: (55) 5018 0565
WhatsApp: +52 33 1607 2089
Si no deseas recibir más correos, haz clic aquí para darte de baja.
Para remover su dirección de esta lista haga <a href="https://s1.arrobamail.com/unsuscribe.php?id=yiwtsrewisppiseup">click aquí</a>
From: Junrui Luo <moonafterrain(a)outlook.com>
The asd_pci_remove() function fails to synchronize with pending tasklets
before freeing the asd_ha structure, leading to a potential use-after-free
vulnerability.
When a device removal is triggered (via hot-unplug or module unload), race condition can occur.
The fix adds tasklet_kill() before freeing the asd_ha structure, ensuring
all scheduled tasklets complete before cleanup proceeds.
Reported-by: Yuhao Jiang <danisjiang(a)gmail.com>
Reported-by: Junrui Luo <moonafterrain(a)outlook.com>
Fixes: 2908d778ab3e ("[SCSI] aic94xx: new driver")
Cc: stable(a)vger.kernel.org
Signed-off-by: Junrui Luo <moonafterrain(a)outlook.com>
---
drivers/scsi/aic94xx/aic94xx_init.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/drivers/scsi/aic94xx/aic94xx_init.c b/drivers/scsi/aic94xx/aic94xx_init.c
index adf3d9145606..95f3620059f7 100644
--- a/drivers/scsi/aic94xx/aic94xx_init.c
+++ b/drivers/scsi/aic94xx/aic94xx_init.c
@@ -882,6 +882,9 @@ static void asd_pci_remove(struct pci_dev *dev)
asd_disable_ints(asd_ha);
+ /* Ensure all scheduled tasklets complete before freeing resources */
+ tasklet_kill(&asd_ha->seq.dl_tasklet);
+
asd_remove_dev_attrs(asd_ha);
/* XXX more here as needed */
--
2.51.1.dirty
From: Junrui Luo <moonafterrain(a)outlook.com>
The uninorth_insert_memory and uninorth_remove_memory functions lack
proper validation of the pg_start parameter before using it as an array
index into the GATT (Graphics Address Translation Table).
The current bounds check fails to reject negative
pg_start values and potentially causes out-of-bounds writes.
Fix by explicitly checking that pg_start is non-negative before
performing bounds checking. This makes the security requirement clear
and does not rely on implicit type conversion behavior.
The uninorth_remove_memory function has no bounds checking at all, so
add the same validation there.
Reported-by: Yuhao Jiang <danisjiang(a)gmail.com>
Reported-by: Junrui Luo <moonafterrain(a)outlook.com>
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Cc: stable(a)vger.kernel.org
Signed-off-by: Junrui Luo <moonafterrain(a)outlook.com>
---
drivers/char/agp/uninorth-agp.c | 15 ++++++++++++++-
1 file changed, 14 insertions(+), 1 deletion(-)
diff --git a/drivers/char/agp/uninorth-agp.c b/drivers/char/agp/uninorth-agp.c
index b8d7115b8c9e..4e0b949016f7 100644
--- a/drivers/char/agp/uninorth-agp.c
+++ b/drivers/char/agp/uninorth-agp.c
@@ -169,7 +169,9 @@ static int uninorth_insert_memory(struct agp_memory *mem, off_t pg_start, int ty
temp = agp_bridge->current_size;
num_entries = A_SIZE_32(temp)->num_entries;
- if ((pg_start + mem->page_count) > num_entries)
+ if (pg_start < 0 ||
+ (pg_start + mem->page_count) > num_entries ||
+ (pg_start + mem->page_count) < pg_start)
return -EINVAL;
gp = (u32 *) &agp_bridge->gatt_table[pg_start];
@@ -200,6 +202,9 @@ static int uninorth_insert_memory(struct agp_memory *mem, off_t pg_start, int ty
static int uninorth_remove_memory(struct agp_memory *mem, off_t pg_start, int type)
{
size_t i;
+ int num_entries;
+ void *temp;
+
u32 *gp;
int mask_type;
@@ -215,6 +220,14 @@ static int uninorth_remove_memory(struct agp_memory *mem, off_t pg_start, int ty
if (mem->page_count == 0)
return 0;
+ temp = agp_bridge->current_size;
+ num_entries = A_SIZE_32(temp)->num_entries;
+
+ if (pg_start < 0 ||
+ (pg_start + mem->page_count) > num_entries ||
+ (pg_start + mem->page_count) < pg_start)
+ return -EINVAL;
+
gp = (u32 *) &agp_bridge->gatt_table[pg_start];
for (i = 0; i < mem->page_count; ++i) {
gp[i] = scratch_value;
--
2.51.1.dirty
The patch below does not apply to the 5.15-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.15.y
git checkout FETCH_HEAD
git cherry-pick -x 15292f1b4c55a3a7c940dbcb6cb8793871ed3d92
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025102054-slacks-ambush-f774@gregkh' --subject-prefix 'PATCH 5.15.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 15292f1b4c55a3a7c940dbcb6cb8793871ed3d92 Mon Sep 17 00:00:00 2001
From: Babu Moger <babu.moger(a)amd.com>
Date: Fri, 10 Oct 2025 12:08:35 -0500
Subject: [PATCH] x86/resctrl: Fix miscount of bandwidth event when
reactivating previously unavailable RMID
Users can create as many monitoring groups as the number of RMIDs supported
by the hardware. However, on AMD systems, only a limited number of RMIDs
are guaranteed to be actively tracked by the hardware. RMIDs that exceed
this limit are placed in an "Unavailable" state.
When a bandwidth counter is read for such an RMID, the hardware sets
MSR_IA32_QM_CTR.Unavailable (bit 62). When such an RMID starts being tracked
again the hardware counter is reset to zero. MSR_IA32_QM_CTR.Unavailable
remains set on first read after tracking re-starts and is clear on all
subsequent reads as long as the RMID is tracked.
resctrl miscounts the bandwidth events after an RMID transitions from the
"Unavailable" state back to being tracked. This happens because when the
hardware starts counting again after resetting the counter to zero, resctrl
in turn compares the new count against the counter value stored from the
previous time the RMID was tracked.
This results in resctrl computing an event value that is either undercounting
(when new counter is more than stored counter) or a mistaken overflow (when
new counter is less than stored counter).
Reset the stored value (arch_mbm_state::prev_msr) of MSR_IA32_QM_CTR to
zero whenever the RMID is in the "Unavailable" state to ensure accurate
counting after the RMID resets to zero when it starts to be tracked again.
Example scenario that results in mistaken overflow
==================================================
1. The resctrl filesystem is mounted, and a task is assigned to a
monitoring group.
$mount -t resctrl resctrl /sys/fs/resctrl
$mkdir /sys/fs/resctrl/mon_groups/test1/
$echo 1234 > /sys/fs/resctrl/mon_groups/test1/tasks
$cat /sys/fs/resctrl/mon_groups/test1/mon_data/mon_L3_*/mbm_total_bytes
21323 <- Total bytes on domain 0
"Unavailable" <- Total bytes on domain 1
Task is running on domain 0. Counter on domain 1 is "Unavailable".
2. The task runs on domain 0 for a while and then moves to domain 1. The
counter starts incrementing on domain 1.
$cat /sys/fs/resctrl/mon_groups/test1/mon_data/mon_L3_*/mbm_total_bytes
7345357 <- Total bytes on domain 0
4545 <- Total bytes on domain 1
3. At some point, the RMID in domain 0 transitions to the "Unavailable"
state because the task is no longer executing in that domain.
$cat /sys/fs/resctrl/mon_groups/test1/mon_data/mon_L3_*/mbm_total_bytes
"Unavailable" <- Total bytes on domain 0
434341 <- Total bytes on domain 1
4. Since the task continues to migrate between domains, it may eventually
return to domain 0.
$cat /sys/fs/resctrl/mon_groups/test1/mon_data/mon_L3_*/mbm_total_bytes
17592178699059 <- Overflow on domain 0
3232332 <- Total bytes on domain 1
In this case, the RMID on domain 0 transitions from "Unavailable" state to
active state. The hardware sets MSR_IA32_QM_CTR.Unavailable (bit 62) when
the counter is read and begins tracking the RMID counting from 0.
Subsequent reads succeed but return a value smaller than the previously
saved MSR value (7345357). Consequently, the resctrl's overflow logic is
triggered, it compares the previous value (7345357) with the new, smaller
value and incorrectly interprets this as a counter overflow, adding a large
delta.
In reality, this is a false positive: the counter did not overflow but was
simply reset when the RMID transitioned from "Unavailable" back to active
state.
Here is the text from APM [1] available from [2].
"In PQOS Version 2.0 or higher, the MBM hardware will set the U bit on the
first QM_CTR read when it begins tracking an RMID that it was not
previously tracking. The U bit will be zero for all subsequent reads from
that RMID while it is still tracked by the hardware. Therefore, a QM_CTR
read with the U bit set when that RMID is in use by a processor can be
considered 0 when calculating the difference with a subsequent read."
[1] AMD64 Architecture Programmer's Manual Volume 2: System Programming
Publication # 24593 Revision 3.41 section 19.3.3 Monitoring L3 Memory
Bandwidth (MBM).
[ bp: Split commit message into smaller paragraph chunks for better
consumption. ]
Fixes: 4d05bf71f157d ("x86/resctrl: Introduce AMD QOS feature")
Signed-off-by: Babu Moger <babu.moger(a)amd.com>
Signed-off-by: Borislav Petkov (AMD) <bp(a)alien8.de>
Reviewed-by: Reinette Chatre <reinette.chatre(a)intel.com>
Tested-by: Reinette Chatre <reinette.chatre(a)intel.com>
Cc: stable(a)vger.kernel.org # needs adjustments for <= v6.17
Link: https://bugzilla.kernel.org/show_bug.cgi?id=206537 # [2]
diff --git a/arch/x86/kernel/cpu/resctrl/monitor.c b/arch/x86/kernel/cpu/resctrl/monitor.c
index c8945610d455..2cd25a0d4637 100644
--- a/arch/x86/kernel/cpu/resctrl/monitor.c
+++ b/arch/x86/kernel/cpu/resctrl/monitor.c
@@ -242,7 +242,9 @@ int resctrl_arch_rmid_read(struct rdt_resource *r, struct rdt_mon_domain *d,
u32 unused, u32 rmid, enum resctrl_event_id eventid,
u64 *val, void *ignored)
{
+ struct rdt_hw_mon_domain *hw_dom = resctrl_to_arch_mon_dom(d);
int cpu = cpumask_any(&d->hdr.cpu_mask);
+ struct arch_mbm_state *am;
u64 msr_val;
u32 prmid;
int ret;
@@ -251,12 +253,16 @@ int resctrl_arch_rmid_read(struct rdt_resource *r, struct rdt_mon_domain *d,
prmid = logical_rmid_to_physical_rmid(cpu, rmid);
ret = __rmid_read_phys(prmid, eventid, &msr_val);
- if (ret)
- return ret;
- *val = get_corrected_val(r, d, rmid, eventid, msr_val);
+ if (!ret) {
+ *val = get_corrected_val(r, d, rmid, eventid, msr_val);
+ } else if (ret == -EINVAL) {
+ am = get_arch_mbm_state(hw_dom, rmid, eventid);
+ if (am)
+ am->prev_msr = 0;
+ }
- return 0;
+ return ret;
}
static int __cntr_id_read(u32 cntr_id, u64 *val)
From: Christian Hitz <christian.hitz(a)bbv.ch>
If a GPIO is used to control the chip's enable pin, it needs to be pulled
high before any i2c communication is attempted.
Currently, the enable GPIO handling is not correct.
Assume the enable GPIO is low when the probe function is entered. In this
case the device is in SHUTDOWN mode and does not react to i2c commands.
During probe the following sequence happens:
1. The call to lp50xx_reset() on line 548 has no effect as i2c is not
possible yet.
2. Then - on line 552 - lp50xx_enable_disable() is called. As
"priv->enable_gpio“ has not yet been initialized, setting the GPIO has
no effect. Also the i2c enable command is not executed as the device
is still in SHUTDOWN.
3. On line 556 the call to lp50xx_probe_dt() finally parses the rest of
the DT and the configured priv->enable_gpio is set up.
As a result the device is still in SHUTDOWN mode and not ready for
operation.
Split lp50xx_enable_disable() into distinct enable and disable functions
to enforce correct ordering between enable_gpio manipulations and i2c
commands.
Read enable_gpio configuration from DT before attempting to manipulate
enable_gpio.
Add delays to observe correct wait timing after manipulating enable_gpio
and before any i2c communication.
Fixes: 242b81170fb8 ("leds: lp50xx: Add the LP50XX family of the RGB LED driver")
Cc: stable(a)vger.kernel.org
Signed-off-by: Christian Hitz <christian.hitz(a)bbv.ch>
---
Changes in v2:
- Unconditionally reset in lp50xx_enable
- Define magic numbers
- Improve log message
---
drivers/leds/leds-lp50xx.c | 55 +++++++++++++++++++++++++++-----------
1 file changed, 40 insertions(+), 15 deletions(-)
diff --git a/drivers/leds/leds-lp50xx.c b/drivers/leds/leds-lp50xx.c
index 94f8ef6b482c..d3485d814cf4 100644
--- a/drivers/leds/leds-lp50xx.c
+++ b/drivers/leds/leds-lp50xx.c
@@ -50,6 +50,12 @@
#define LP50XX_SW_RESET 0xff
#define LP50XX_CHIP_EN BIT(6)
+#define LP50XX_CHIP_DISABLE 0x00
+#define LP50XX_START_TIME_US 500
+#define LP50XX_RESET_TIME_US 3
+
+#define LP50XX_EN_GPIO_LOW 0
+#define LP50XX_EN_GPIO_HIGH 1
/* There are 3 LED outputs per bank */
#define LP50XX_LEDS_PER_MODULE 3
@@ -371,19 +377,42 @@ static int lp50xx_reset(struct lp50xx *priv)
return regmap_write(priv->regmap, priv->chip_info->reset_reg, LP50XX_SW_RESET);
}
-static int lp50xx_enable_disable(struct lp50xx *priv, int enable_disable)
+static int lp50xx_enable(struct lp50xx *priv)
{
int ret;
- ret = gpiod_direction_output(priv->enable_gpio, enable_disable);
+ if (priv->enable_gpio) {
+ ret = gpiod_direction_output(priv->enable_gpio, LP50XX_EN_GPIO_HIGH);
+ if (ret)
+ return ret;
+
+ udelay(LP50XX_START_TIME_US);
+ }
+
+ ret = lp50xx_reset(priv);
if (ret)
return ret;
- if (enable_disable)
- return regmap_write(priv->regmap, LP50XX_DEV_CFG0, LP50XX_CHIP_EN);
- else
- return regmap_write(priv->regmap, LP50XX_DEV_CFG0, 0);
+ return regmap_write(priv->regmap, LP50XX_DEV_CFG0, LP50XX_CHIP_EN);
+}
+static int lp50xx_disable(struct lp50xx *priv)
+{
+ int ret;
+
+ ret = regmap_write(priv->regmap, LP50XX_DEV_CFG0, LP50XX_CHIP_DISABLE);
+ if (ret)
+ return ret;
+
+ if (priv->enable_gpio) {
+ ret = gpiod_direction_output(priv->enable_gpio, LP50XX_EN_GPIO_LOW);
+ if (ret)
+ return ret;
+
+ udelay(LP50XX_RESET_TIME_US);
+ }
+
+ return 0;
}
static int lp50xx_probe_leds(struct fwnode_handle *child, struct lp50xx *priv,
@@ -447,6 +476,10 @@ static int lp50xx_probe_dt(struct lp50xx *priv)
return dev_err_probe(priv->dev, PTR_ERR(priv->enable_gpio),
"Failed to get enable GPIO\n");
+ ret = lp50xx_enable(priv);
+ if (ret)
+ return ret;
+
priv->regulator = devm_regulator_get(priv->dev, "vled");
if (IS_ERR(priv->regulator))
priv->regulator = NULL;
@@ -547,14 +580,6 @@ static int lp50xx_probe(struct i2c_client *client)
return ret;
}
- ret = lp50xx_reset(led);
- if (ret)
- return ret;
-
- ret = lp50xx_enable_disable(led, 1);
- if (ret)
- return ret;
-
return lp50xx_probe_dt(led);
}
@@ -563,7 +588,7 @@ static void lp50xx_remove(struct i2c_client *client)
struct lp50xx *led = i2c_get_clientdata(client);
int ret;
- ret = lp50xx_enable_disable(led, 0);
+ ret = lp50xx_disable(led);
if (ret)
dev_err(led->dev, "Failed to disable chip\n");
--
2.51.1
A user reports that on their Lenovo Corsola Magneton with EC firmware
steelix-15194.270.0 the driver probe fails with EINVAL. It turns out
that the power LED does not contain any color components as indicated
by the following "ectool led power query" output:
Brightness range for LED 1:
red : 0x0
green : 0x0
blue : 0x0
yellow : 0x0
white : 0x0
amber : 0x0
The LED also does not react to commands sent manually through ectool and
is generally non-functional.
Instead of failing the probe for all LEDs managed by the EC when one
without color components is encountered, silently skip those.
Fixes: 8d6ce6f3ec9d ("leds: Add ChromeOS EC driver")
Cc: stable(a)vger.kernel.org
Signed-off-by: Thomas Weißschuh <linux(a)weissschuh.net>
---
drivers/leds/leds-cros_ec.c | 5 ++---
1 file changed, 2 insertions(+), 3 deletions(-)
diff --git a/drivers/leds/leds-cros_ec.c b/drivers/leds/leds-cros_ec.c
index 377cf04e202a..bea3cc3fbfd2 100644
--- a/drivers/leds/leds-cros_ec.c
+++ b/drivers/leds/leds-cros_ec.c
@@ -142,9 +142,6 @@ static int cros_ec_led_count_subleds(struct device *dev,
}
}
- if (!num_subleds)
- return -EINVAL;
-
*max_brightness = common_range;
return num_subleds;
}
@@ -189,6 +186,8 @@ static int cros_ec_led_probe_one(struct device *dev, struct cros_ec_device *cros
&priv->led_mc_cdev.led_cdev.max_brightness);
if (num_subleds < 0)
return num_subleds;
+ if (num_subleds == 0)
+ return 0; /* LED without any colors, skip */
priv->cros_ec = cros_ec;
priv->led_id = id;
---
base-commit: 3a8660878839faadb4f1a6dd72c3179c1df56787
change-id: 20251028-cros_ec-leds-no-colors-18eb8d1efa92
Best regards,
--
Thomas Weißschuh <linux(a)weissschuh.net>
When simple_write_to_buffer() succeeds, it returns the number of bytes
actually copied to the buffer, which may be less than the requested
'count' if the buffer size is insufficient. However, the current code
incorrectly uses 'count' as the index for null termination instead of
the actual bytes copied, leading to out-of-bound write.
Add a check for the count and use the return value as the index.
Found via static analysis. This is similar to the
commit da9374819eb3 ("iio: backend: fix out-of-bound write")
Fixes: b1c5d68ea66e ("iio: dac: ad3552r-hs: add support for internal ramp")
Cc: stable(a)vger.kernel.org
Signed-off-by: Miaoqian Lin <linmq006(a)gmail.com>
---
drivers/iio/dac/ad3552r-hs.c | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/drivers/iio/dac/ad3552r-hs.c b/drivers/iio/dac/ad3552r-hs.c
index 41b96b48ba98..a9578afa7015 100644
--- a/drivers/iio/dac/ad3552r-hs.c
+++ b/drivers/iio/dac/ad3552r-hs.c
@@ -549,12 +549,15 @@ static ssize_t ad3552r_hs_write_data_source(struct file *f,
guard(mutex)(&st->lock);
+ if (count >= sizeof(buf))
+ return -ENOSPC;
+
ret = simple_write_to_buffer(buf, sizeof(buf) - 1, ppos, userbuf,
count);
if (ret < 0)
return ret;
- buf[count] = '\0';
+ buf[ret] = '\0';
ret = match_string(dbgfs_attr_source, ARRAY_SIZE(dbgfs_attr_source),
buf);
--
2.39.5 (Apple Git-154)
A recent change fixed device reference leaks when looking up drm
platform device driver data during bind() but failed to remove a partial
fix which had been added by commit 80805b62ea5b ("drm/mediatek: Fix
kobject put for component sub-drivers").
This results in a reference imbalance on component bind() failures and
on unbind() which could lead to a user-after-free.
Make sure to only drop the references after retrieving the driver data
by effectively reverting the previous partial fix.
Note that holding a reference to a device does not prevent its driver
data from going away so there is no point in keeping the reference.
Fixes: 1f403699c40f ("drm/mediatek: Fix device/node reference count leaks in mtk_drm_get_all_drm_priv")
Reported-by: Sjoerd Simons <sjoerd(a)collabora.com>
Link: https://lore.kernel.org/r/20251003-mtk-drm-refcount-v1-1-3b3f2813b0db@colla…
Cc: stable(a)vger.kernel.org
Cc: Ma Ke <make24(a)iscas.ac.cn>
Cc: AngeloGioacchino Del Regno <angelogioacchino.delregno(a)collabora.com>
Signed-off-by: Johan Hovold <johan(a)kernel.org>
---
drivers/gpu/drm/mediatek/mtk_drm_drv.c | 10 ----------
1 file changed, 10 deletions(-)
diff --git a/drivers/gpu/drm/mediatek/mtk_drm_drv.c b/drivers/gpu/drm/mediatek/mtk_drm_drv.c
index 384b0510272c..a94c51a83261 100644
--- a/drivers/gpu/drm/mediatek/mtk_drm_drv.c
+++ b/drivers/gpu/drm/mediatek/mtk_drm_drv.c
@@ -686,10 +686,6 @@ static int mtk_drm_bind(struct device *dev)
for (i = 0; i < private->data->mmsys_dev_num; i++)
private->all_drm_private[i]->drm = NULL;
err_put_dev:
- for (i = 0; i < private->data->mmsys_dev_num; i++) {
- /* For device_find_child in mtk_drm_get_all_priv() */
- put_device(private->all_drm_private[i]->dev);
- }
put_device(private->mutex_dev);
return ret;
}
@@ -697,18 +693,12 @@ static int mtk_drm_bind(struct device *dev)
static void mtk_drm_unbind(struct device *dev)
{
struct mtk_drm_private *private = dev_get_drvdata(dev);
- int i;
/* for multi mmsys dev, unregister drm dev in mmsys master */
if (private->drm_master) {
drm_dev_unregister(private->drm);
mtk_drm_kms_deinit(private->drm);
drm_dev_put(private->drm);
-
- for (i = 0; i < private->data->mmsys_dev_num; i++) {
- /* For device_find_child in mtk_drm_get_all_priv() */
- put_device(private->all_drm_private[i]->dev);
- }
put_device(private->mutex_dev);
}
private->mtk_drm_bound = false;
--
2.49.1
The code in bmc150-accel-core.c unconditionally calls
bmc150_accel_set_interrupt() in the iio_buffer_setup_ops,
such as on the runtime PM resume path giving a kernel
splat like this if the device has no interrupts:
Unable to handle kernel NULL pointer dereference at virtual
address 00000001 when read
CPU: 0 UID: 0 PID: 393 Comm: iio-sensor-prox Not tainted
6.18.0-rc1-postmarketos-stericsson-00001-g6b43386e3737 #73 PREEMPT
Hardware name: ST-Ericsson Ux5x0 platform (Device Tree Support)
PC is at bmc150_accel_set_interrupt+0x98/0x194
LR is at __pm_runtime_resume+0x5c/0x64
(...)
Call trace:
bmc150_accel_set_interrupt from bmc150_accel_buffer_postenable+0x40/0x108
bmc150_accel_buffer_postenable from __iio_update_buffers+0xbe0/0xcbc
__iio_update_buffers from enable_store+0x84/0xc8
enable_store from kernfs_fop_write_iter+0x154/0x1b4
kernfs_fop_write_iter from do_iter_readv_writev+0x178/0x1e4
do_iter_readv_writev from vfs_writev+0x158/0x3f4
vfs_writev from do_writev+0x74/0xe4
do_writev from __sys_trace_return+0x0/0x10
This bug seems to have been in the driver since the beginning,
but it only manifests recently, I do not know why.
Cc: stable(a)vger.kernel.org
Signed-off-by: Linus Walleij <linus.walleij(a)linaro.org>
---
drivers/iio/accel/bmc150-accel-core.c | 5 +++++
drivers/iio/accel/bmc150-accel.h | 1 +
2 files changed, 6 insertions(+)
diff --git a/drivers/iio/accel/bmc150-accel-core.c b/drivers/iio/accel/bmc150-accel-core.c
index 3c5d1560b163..f4421a3b0ef2 100644
--- a/drivers/iio/accel/bmc150-accel-core.c
+++ b/drivers/iio/accel/bmc150-accel-core.c
@@ -523,6 +523,10 @@ static int bmc150_accel_set_interrupt(struct bmc150_accel_data *data, int i,
const struct bmc150_accel_interrupt_info *info = intr->info;
int ret;
+ /* We do not always have an IRQ */
+ if (!data->has_irq)
+ return 0;
+
if (state) {
if (atomic_inc_return(&intr->users) > 1)
return 0;
@@ -1696,6 +1700,7 @@ int bmc150_accel_core_probe(struct device *dev, struct regmap *regmap, int irq,
}
if (irq > 0) {
+ data->has_irq = true;
ret = devm_request_threaded_irq(dev, irq,
bmc150_accel_irq_handler,
bmc150_accel_irq_thread_handler,
diff --git a/drivers/iio/accel/bmc150-accel.h b/drivers/iio/accel/bmc150-accel.h
index 7a7baf52e595..6e9bb69a1fd4 100644
--- a/drivers/iio/accel/bmc150-accel.h
+++ b/drivers/iio/accel/bmc150-accel.h
@@ -59,6 +59,7 @@ enum bmc150_accel_trigger_id {
struct bmc150_accel_data {
struct regmap *regmap;
struct regulator_bulk_data regulators[2];
+ bool has_irq;
struct bmc150_accel_interrupt interrupts[BMC150_ACCEL_INTERRUPTS];
struct bmc150_accel_trigger triggers[BMC150_ACCEL_TRIGGERS];
struct mutex mutex;
---
base-commit: 3a8660878839faadb4f1a6dd72c3179c1df56787
change-id: 20251027-fix-bmc150-7e568122b265
Best regards,
--
Linus Walleij <linus.walleij(a)linaro.org>