Hi,
On Sat, 2025-12-13 at 04:38 -0500, Sasha Levin wrote:
> This is a note to let you know that I've just added the patch titled
>
> drm/xe: Enforce correct user fence signaling order using
>
> to the 6.18-stable tree which can be found at:
>
> http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
>
> The filename of the patch is:
> drm-xe-enforce-correct-user-fence-signaling-order-us.patch
> and it can be found in the queue-6.18 subdirectory.
>
> If you, or anyone else, feels it should not be added to the stable
> tree,
> please let <stable(a)vger.kernel.org> know about it.
>
>
(Already replied to a similar email from GregKH) Please skip this
patch. The patch looks already applied and appears to be the result of
an incorrect merge resolution.
Thanks,
Thomas
>
> commit e0d6df858765e6228a87c8559ccbe6826a1a6fef
> Author: Matthew Brost <matthew.brost(a)intel.com>
> Date: Fri Oct 31 16:40:45 2025 -0700
>
> drm/xe: Enforce correct user fence signaling order using
>
> [ Upstream commit adda4e855ab6409a3edaa585293f1f2069ab7299 ]
>
> Prevent application hangs caused by out-of-order fence signaling
> when
> user fences are attached. Use drm_syncobj (via dma-fence-chain)
> to
> guarantee that each user fence signals in order, regardless of
> the
> signaling order of the attached fences. Ensure user fence
> writebacks to
> user space occur in the correct sequence.
>
> v7:
> - Skip drm_syncbj create of error (CI)
>
> Fixes: dd08ebf6c352 ("drm/xe: Introduce a new DRM driver for
> Intel GPUs")
> Signed-off-by: Matthew Brost <matthew.brost(a)intel.com>
> Reviewed-by: Thomas Hellström <thomas.hellstrom(a)linux.intel.com>
> Link:
> https://patch.msgid.link/20251031234050.3043507-2-matthew.brost@intel.com
> Signed-off-by: Sasha Levin <sashal(a)kernel.org>
>
> diff --git a/drivers/gpu/drm/xe/xe_exec_queue.c
> b/drivers/gpu/drm/xe/xe_exec_queue.c
> index cb5f204c08ed6..a6efe4e8ab556 100644
> --- a/drivers/gpu/drm/xe/xe_exec_queue.c
> +++ b/drivers/gpu/drm/xe/xe_exec_queue.c
> @@ -344,6 +344,9 @@ void xe_exec_queue_destroy(struct kref *ref)
> struct xe_exec_queue *q = container_of(ref, struct
> xe_exec_queue, refcount);
> struct xe_exec_queue *eq, *next;
>
> + if (q->ufence_syncobj)
> + drm_syncobj_put(q->ufence_syncobj);
> +
> if (q->ufence_syncobj)
> drm_syncobj_put(q->ufence_syncobj);
>
The current UFS clocks does not align with their respective names,
causing the ref_clk to be set to an incorrect frequency as below,
which results in command timeouts.
ufshcd-qcom 1d84000.ufshc: invalid ref_clk setting = 300000000
This commit fixes the issue by properly reordering the UFS clocks to
match their names.
Fixes: ea172f61f4fd ("arm64: dts: qcom: qcs615: Fix up UFS clocks")
Cc: stable(a)vger.kernel.org
Signed-off-by: Pradeep P V K <pradeep.pragallapati(a)oss.qualcomm.com>
---
arch/arm64/boot/dts/qcom/talos.dtsi | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/arch/arm64/boot/dts/qcom/talos.dtsi b/arch/arm64/boot/dts/qcom/talos.dtsi
index d1dbfa3bd81c..95d26e313622 100644
--- a/arch/arm64/boot/dts/qcom/talos.dtsi
+++ b/arch/arm64/boot/dts/qcom/talos.dtsi
@@ -1399,10 +1399,10 @@
<&gcc GCC_AGGRE_UFS_PHY_AXI_CLK>,
<&gcc GCC_UFS_PHY_AHB_CLK>,
<&gcc GCC_UFS_PHY_UNIPRO_CORE_CLK>,
- <&gcc GCC_UFS_PHY_ICE_CORE_CLK>,
<&rpmhcc RPMH_CXO_CLK>,
<&gcc GCC_UFS_PHY_TX_SYMBOL_0_CLK>,
- <&gcc GCC_UFS_PHY_RX_SYMBOL_0_CLK>;
+ <&gcc GCC_UFS_PHY_RX_SYMBOL_0_CLK>,
+ <&gcc GCC_UFS_PHY_ICE_CORE_CLK>;
clock-names = "core_clk",
"bus_aggr_clk",
"iface_clk",
--
2.17.1
When imported dma-bufs are destroyed, TTM is not fully
individualizing the dma-resv, but it *is* copying the fences that
need to be waited for before declaring idle. So in the case where
the bo->resv != bo->_resv we can still drop the preempt-fences, but
make sure we do that on bo->_resv which contains the fence-pointer
copy.
In the case where the copying fails, bo->_resv will typically not
contain any fences pointers at all, so there will be nothing to
drop. In that case, TTM would have ensured all fences that would
have been copied are signaled, including any remaining preempt
fences.
Fixes: dd08ebf6c352 ("drm/xe: Introduce a new DRM driver for Intel GPUs")
Cc: Matthew Brost <matthew.brost(a)intel.com>
Cc: <stable(a)vger.kernel.org> # v6.8+
Signed-off-by: Thomas Hellström <thomas.hellstrom(a)linux.intel.com>
---
drivers/gpu/drm/xe/xe_bo.c | 15 ++++-----------
1 file changed, 4 insertions(+), 11 deletions(-)
diff --git a/drivers/gpu/drm/xe/xe_bo.c b/drivers/gpu/drm/xe/xe_bo.c
index 6280e6a013ff..8b6474cd3eaf 100644
--- a/drivers/gpu/drm/xe/xe_bo.c
+++ b/drivers/gpu/drm/xe/xe_bo.c
@@ -1526,7 +1526,7 @@ static bool xe_ttm_bo_lock_in_destructor(struct ttm_buffer_object *ttm_bo)
* always succeed here, as long as we hold the lru lock.
*/
spin_lock(&ttm_bo->bdev->lru_lock);
- locked = dma_resv_trylock(ttm_bo->base.resv);
+ locked = dma_resv_trylock(&ttm_bo->base._resv);
spin_unlock(&ttm_bo->bdev->lru_lock);
xe_assert(xe, locked);
@@ -1546,13 +1546,6 @@ static void xe_ttm_bo_release_notify(struct ttm_buffer_object *ttm_bo)
bo = ttm_to_xe_bo(ttm_bo);
xe_assert(xe_bo_device(bo), !(bo->created && kref_read(&ttm_bo->base.refcount)));
- /*
- * Corner case where TTM fails to allocate memory and this BOs resv
- * still points the VMs resv
- */
- if (ttm_bo->base.resv != &ttm_bo->base._resv)
- return;
-
if (!xe_ttm_bo_lock_in_destructor(ttm_bo))
return;
@@ -1562,14 +1555,14 @@ static void xe_ttm_bo_release_notify(struct ttm_buffer_object *ttm_bo)
* TODO: Don't do this for external bos once we scrub them after
* unbind.
*/
- dma_resv_for_each_fence(&cursor, ttm_bo->base.resv,
+ dma_resv_for_each_fence(&cursor, &ttm_bo->base._resv,
DMA_RESV_USAGE_BOOKKEEP, fence) {
if (xe_fence_is_xe_preempt(fence) &&
!dma_fence_is_signaled(fence)) {
if (!replacement)
replacement = dma_fence_get_stub();
- dma_resv_replace_fences(ttm_bo->base.resv,
+ dma_resv_replace_fences(&ttm_bo->base._resv,
fence->context,
replacement,
DMA_RESV_USAGE_BOOKKEEP);
@@ -1577,7 +1570,7 @@ static void xe_ttm_bo_release_notify(struct ttm_buffer_object *ttm_bo)
}
dma_fence_put(replacement);
- dma_resv_unlock(ttm_bo->base.resv);
+ dma_resv_unlock(&ttm_bo->base._resv);
}
static void xe_ttm_bo_delete_mem_notify(struct ttm_buffer_object *ttm_bo)
--
2.51.1
The patch titled
Subject: mm: consider non-anon swap cache folios in folio_expected_ref_count()
has been added to the -mm mm-hotfixes-unstable branch. Its filename is
mm-consider-non-anon-swap-cache-folios-in-folio_expected_ref_count.patch
This patch will shortly appear at
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patche…
This patch will later appear in the mm-hotfixes-unstable branch at
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next via various
branches at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there most days
------------------------------------------------------
From: Bijan Tabatabai <bijan311(a)gmail.com>
Subject: mm: consider non-anon swap cache folios in folio_expected_ref_count()
Date: Tue, 16 Dec 2025 14:07:27 -0600
Currently, folio_expected_ref_count() only adds references for the swap
cache if the folio is anonymous. However, according to the comment above
the definition of PG_swapcache in enum pageflags, shmem folios can also
have PG_swapcache set. This patch makes sure references for the swap
cache are added if folio_test_swapcache(folio) is true.
This issue was found when trying to hot-unplug memory in a QEMU/KVM
virtual machine. When initiating hot-unplug when most of the guest memory
is allocated, hot-unplug hangs partway through removal due to migration
failures. The following message would be printed several times, and would
be printed again about every five seconds:
[ 49.641309] migrating pfn b12f25 failed ret:7
[ 49.641310] page: refcount:2 mapcount:0 mapping:0000000033bd8fe2 index:0x7f404d925 pfn:0xb12f25
[ 49.641311] aops:swap_aops
[ 49.641313] flags: 0x300000000030508(uptodate|active|owner_priv_1|reclaim|swapbacked|node=0|zone=3)
[ 49.641314] raw: 0300000000030508 ffffed312c4bc908 ffffed312c4bc9c8 0000000000000000
[ 49.641315] raw: 00000007f404d925 00000000000c823b 00000002ffffffff 0000000000000000
[ 49.641315] page dumped because: migration failure
When debugging this, I found that these migration failures were due to
__migrate_folio() returning -EAGAIN for a small set of folios because the
expected reference count it calculates via folio_expected_ref_count() is
one less than the actual reference count of the folios. Furthermore, all
of the affected folios were not anonymous, but had the PG_swapcache flag
set, inspiring this patch. After applying this patch, the memory
hot-unplug behaves as expected.
I tested this on a machine running Ubuntu 24.04 with kernel version
6.8.0-90-generic and 64GB of memory. The guest VM is managed by libvirt
and runs Ubuntu 24.04 with kernel version 6.18 (though the head of the
mm-unstable branch as a Dec 16, 2025 was also tested and behaves the same)
and 48GB of memory. The libvirt XML definition for the VM can be found at
[1]. CONFIG_MHP_DEFAULT_ONLINE_TYPE_ONLINE_MOVABLE is set in the guest
kernel so the hot-pluggable memory is automatically onlined.
Below are the steps to reproduce this behavior:
1) Define and start and virtual machine
host$ virsh -c qemu:///system define ./test_vm.xml # test_vm.xml from [1]
host$ virsh -c qemu:///system start test_vm
2) Setup swap in the guest
guest$ sudo fallocate -l 32G /swapfile
guest$ sudo chmod 0600 /swapfile
guest$ sudo mkswap /swapfile
guest$ sudo swapon /swapfile
3) Use alloc_data [2] to allocate most of the remaining guest memory
guest$ ./alloc_data 45
4) In a separate guest terminal, monitor the amount of used memory
guest$ watch -n1 free -h
5) When alloc_data has finished allocating, initiate the memory
hot-unplug using the provided xml file [3]
host$ virsh -c qemu:///system detach-device test_vm ./remove.xml --live
After initiating the memory hot-unplug, you should see the amount of
available memory in the guest decrease, and the amount of used swap data
increase. If everything works as expected, when all of the memory is
unplugged, there should be around 8.5-9GB of data in swap. If the
unplugging is unsuccessful, the amount of used swap data will settle below
that. If that happens, you should be able to see log messages in dmesg
similar to the one posted above.
Link: https://lkml.kernel.org/r/20251216200727.2360228-1-bijan311@gmail.com
Link: https://github.com/BijanT/linux_patch_files/blob/main/test_vm.xml [1]
Link: https://github.com/BijanT/linux_patch_files/blob/main/alloc_data.c [2]
Link: https://github.com/BijanT/linux_patch_files/blob/main/remove.xml [3]
Fixes: 86ebd50224c0 ("mm: add folio_expected_ref_count() for reference count calculation")
Signed-off-by: Bijan Tabatabai <bijan311(a)gmail.com>
Acked-by: David Hildenbrand (Red Hat) <david(a)kernel.org>
Acked-by: Zi Yan <ziy(a)nvidia.com>
Reviewed-by: Baolin Wang <baolin.wang(a)linux.alibaba.com>
Cc: Liam Howlett <liam.howlett(a)oracle.com>
Cc: Lorenzo Stoakes <lorenzo.stoakes(a)oracle.com>
Cc: Michal Hocko <mhocko(a)suse.com>
Cc: Mike Rapoport <rppt(a)kernel.org>
Cc: Shivank Garg <shivankg(a)amd.com>
Cc: Suren Baghdasaryan <surenb(a)google.com>
Cc: Vlastimil Babka <vbabka(a)suse.cz>
Cc: Kairui Song <ryncsn(a)gmail.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
include/linux/mm.h | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)
--- a/include/linux/mm.h~mm-consider-non-anon-swap-cache-folios-in-folio_expected_ref_count
+++ a/include/linux/mm.h
@@ -2459,10 +2459,10 @@ static inline int folio_expected_ref_cou
if (WARN_ON_ONCE(page_has_type(&folio->page) && !folio_test_hugetlb(folio)))
return 0;
- if (folio_test_anon(folio)) {
- /* One reference per page from the swapcache. */
- ref_count += folio_test_swapcache(folio) << order;
- } else {
+ /* One reference per page from the swapcache. */
+ ref_count += folio_test_swapcache(folio) << order;
+
+ if (!folio_test_anon(folio)) {
/* One reference per page from the pagecache. */
ref_count += !!folio->mapping << order;
/* One reference from PG_private. */
_
Patches currently in -mm which might be from bijan311(a)gmail.com are
mm-consider-non-anon-swap-cache-folios-in-folio_expected_ref_count.patch
Hi,
This series intends to fix the race between the MHI stack and the MHI client
drivers due to the MHI 'auto_queue' feature. As it turns out often, the best
way to fix an issue in a feature is to drop the feature itself and this series
does exactly that.
There is no real benefit in having the 'auto_queue' feature in the MHI stack,
other than saving a few lines of code in the client drivers. Since the QRTR is
the only client driver which makes use of this feature, this series reworks the
QRTR driver to manage the buffer on its own.
Testing
=======
Tested on Qcom X1E based Lenovo Thinkpad T14s laptop with WLAN device.
Merge Strategy
==============
Since this series modifies many subsystem drivers, I'd like to get acks from
relevant subsystem maintainers and take the series through MHI tree.
Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam(a)oss.qualcomm.com>
---
Manivannan Sadhasivam (2):
net: qrtr: Drop the MHI auto_queue feature for IPCR DL channels
bus: mhi: host: Drop the auto_queue support
drivers/accel/qaic/mhi_controller.c | 44 -------------------
drivers/bus/mhi/host/init.c | 10 -----
drivers/bus/mhi/host/internal.h | 3 --
drivers/bus/mhi/host/main.c | 81 +----------------------------------
drivers/bus/mhi/host/pci_generic.c | 20 +--------
drivers/net/wireless/ath/ath11k/mhi.c | 4 --
drivers/net/wireless/ath/ath12k/mhi.c | 4 --
include/linux/mhi.h | 14 ------
net/qrtr/mhi.c | 67 ++++++++++++++++++++++++-----
9 files changed, 60 insertions(+), 187 deletions(-)
---
base-commit: 8f0b4cce4481fb22653697cced8d0d04027cb1e8
change-id: 20251217-qrtr-fix-c058251d8d1a
Best regards,
--
Manivannan Sadhasivam <manivannan.sadhasivam(a)oss.qualcomm.com>
The documentation for `Cursor::peek_next` incorrectly describes it as
"Access the previous node without moving the cursor" when it actually
accesses the next node. Update the description to correctly state
"Access the next node without moving the cursor" to match the function
name and implementation.
Reported-by: Miguel Ojeda <ojeda(a)kernel.org>
Closes: https://github.com/Rust-for-Linux/linux/issues/1205
Fixes: 98c14e40e07a0 ("rust: rbtree: add cursor")
Cc: stable(a)vger.kernel.org
Signed-off-by: WeiKang Guo <guoweikang.kernel(a)outlook.com>
---
rust/kernel/rbtree.rs | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/rust/kernel/rbtree.rs b/rust/kernel/rbtree.rs
index 4729eb56827a..cd187e2ca328 100644
--- a/rust/kernel/rbtree.rs
+++ b/rust/kernel/rbtree.rs
@@ -985,7 +985,7 @@ pub fn peek_prev(&self) -> Option<(&K, &V)> {
self.peek(Direction::Prev)
}
- /// Access the previous node without moving the cursor.
+ /// Access the next node without moving the cursor.
pub fn peek_next(&self) -> Option<(&K, &V)> {
self.peek(Direction::Next)
}
--
2.52.0
When running the Rust maple tree kunit tests with lockdep, you may
trigger a warning that looks like this:
lib/maple_tree.c:780 suspicious rcu_dereference_check() usage!
other info that might help us debug this:
rcu_scheduler_active = 2, debug_locks = 1
no locks held by kunit_try_catch/344.
stack backtrace:
CPU: 3 UID: 0 PID: 344 Comm: kunit_try_catch Tainted: G N 6.19.0-rc1+ #2 NONE
Tainted: [N]=TEST
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.17.0-0-gb52ca86e094d-prebuilt.qemu.org 04/01/2014
Call Trace:
<TASK>
dump_stack_lvl+0x71/0x90
lockdep_rcu_suspicious+0x150/0x190
mas_start+0x104/0x150
mas_find+0x179/0x240
_RINvNtCs5QSdWC790r4_4core3ptr13drop_in_placeINtNtCs1cdwasc6FUb_6kernel10maple_tree9MapleTreeINtNtNtBL_5alloc4kbox3BoxlNtNtB1x_9allocator7KmallocEEECsgxAQYCfdR72_25doctests_kernel_generated+0xaf/0x130
rust_doctest_kernel_maple_tree_rs_0+0x600/0x6b0
? lock_release+0xeb/0x2a0
? kunit_try_catch_run+0x210/0x210
kunit_try_run_case+0x74/0x160
? kunit_try_catch_run+0x210/0x210
kunit_generic_run_threadfn_adapter+0x12/0x30
kthread+0x21c/0x230
? __do_trace_sched_kthread_stop_ret+0x40/0x40
ret_from_fork+0x16c/0x270
? __do_trace_sched_kthread_stop_ret+0x40/0x40
ret_from_fork_asm+0x11/0x20
</TASK>
This is because the destructor of maple tree calls mas_find() without
taking rcu_read_lock() or the spinlock. Doing that is actually ok in
this case since the destructor has exclusive access to the entire maple
tree, but it triggers a lockdep warning. To fix that, take the rcu read
lock.
In the future, it's possible that memory reclaim could gain a feature
where it reallocates entries in maple trees even if no user-code is
touching it. If that feature is added, then this use of rcu read lock
would become load-bearing, so I did not make it conditional on lockdep.
We have to repeatedly take and release rcu because the destructor of T
might perform operations that sleep.
Reported-by: Andreas Hindborg <a.hindborg(a)kernel.org>
Closes: https://rust-for-linux.zulipchat.com/#narrow/channel/x/topic/x/near/5642151…
Fixes: da939ef4c494 ("rust: maple_tree: add MapleTree")
Cc: stable(a)vger.kernel.org
Signed-off-by: Alice Ryhl <aliceryhl(a)google.com>
---
Intended for the same tree as any other maple tree patch. (I believe
that's Andrew Morton's tree.)
---
rust/kernel/maple_tree.rs | 11 ++++++++++-
1 file changed, 10 insertions(+), 1 deletion(-)
diff --git a/rust/kernel/maple_tree.rs b/rust/kernel/maple_tree.rs
index e72eec56bf5772ada09239f47748cd649212d8b0..265d6396a78a17886c8b5a3ebe7ba39ccc354add 100644
--- a/rust/kernel/maple_tree.rs
+++ b/rust/kernel/maple_tree.rs
@@ -265,7 +265,16 @@ unsafe fn free_all_entries(self: Pin<&mut Self>) {
loop {
// This uses the raw accessor because we're destroying pointers without removing them
// from the maple tree, which is only valid because this is the destructor.
- let ptr = ma_state.mas_find_raw(usize::MAX);
+ //
+ // Take the rcu lock because mas_find_raw() requires that you hold either the spinlock
+ // or the rcu read lock. This is only really required if memory reclaim might
+ // reallocate entries in the tree, as we otherwise have exclusive access. That feature
+ // doesn't exist yet, so for now, taking the rcu lock only serves the purpose of
+ // silencing lockdep.
+ let ptr = {
+ let _rcu = kernel::sync::rcu::Guard::new();
+ ma_state.mas_find_raw(usize::MAX)
+ };
if ptr.is_null() {
break;
}
---
base-commit: 8f0b4cce4481fb22653697cced8d0d04027cb1e8
change-id: 20251217-maple-drop-rcu-dfe72fb5f49e
Best regards,
--
Alice Ryhl <aliceryhl(a)google.com>
Hi ,
I'm reaching out to see if my last email made it to your inbox.
I can gladly provide more details if required. Please let me know how I can assist.
I'm awaiting your feedback on this matter.
Regards
Connie Griggs
Please reply with REMOVE if you don't wish to receive further emails
-----Original Message-----
Hi ,
Hope you're having a great day!
Are you looking for leads from NRF 2026 Retail's Big Show?
Attendees count: 30,000 Leads
Data Fields: Company Name, Web URL, Contact Name, Title, Direct Email, Phone Number, Mailing Address, Industry, Employee Size, Annual Sales.
If you're interested in these leads, I'd be glad to share the pricing. Let me know!
Thanks for the quick reply. I'm excited to get your thoughts.
Regards
Connie Griggs
Demand Generation Manager
Leads Focus Hub Inc.,
Please reply with REMOVE if you don't wish to receive further emails
The patch titled
Subject: rapidio: replace rio_free_net() with kfree() in rio_scan_alloc_net()
has been added to the -mm mm-nonmm-unstable branch. Its filename is
rapidio-replace-rio_free_net-with-kfree-in-rio_scan_alloc_net.patch
This patch will shortly appear at
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patche…
This patch will later appear in the mm-nonmm-unstable branch at
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next via various
branches at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there most days
------------------------------------------------------
From: Haoxiang Li <lihaoxiang(a)isrc.iscas.ac.cn>
Subject: rapidio: replace rio_free_net() with kfree() in rio_scan_alloc_net()
Date: Wed, 17 Dec 2025 17:15:48 +0800
When idtab allocation fails, net is not registered with rio_add_net() yet,
so kfree(net) is sufficient to release the memory.
Link: https://lkml.kernel.org/r/20251217091548.482358-1-lihaoxiang@isrc.iscas.ac.…
Fixes: e6b585ca6e81 ("rapidio: move net allocation into core code")
Signed-off-by: Haoxiang Li <lihaoxiang(a)isrc.iscas.ac.cn>
Cc: Alexandre Bounine <alex.bou9(a)gmail.com>
Cc: Matt Porter <mporter(a)kernel.crashing.org>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
drivers/rapidio/rio-scan.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/rapidio/rio-scan.c~rapidio-replace-rio_free_net-with-kfree-in-rio_scan_alloc_net
+++ a/drivers/rapidio/rio-scan.c
@@ -854,7 +854,7 @@ static struct rio_net *rio_scan_alloc_ne
if (idtab == NULL) {
pr_err("RIO: failed to allocate destID table\n");
- rio_free_net(net);
+ kfree(net);
net = NULL;
} else {
net->enum_data = idtab;
_
Patches currently in -mm which might be from lihaoxiang(a)isrc.iscas.ac.cn are
rapidio-replace-rio_free_net-with-kfree-in-rio_scan_alloc_net.patch
The patch titled
Subject: rust: maple_tree: rcu_read_lock() in destructor to silence lockdep
has been added to the -mm mm-hotfixes-unstable branch. Its filename is
rust-maple_tree-rcu_read_lock-in-destructor-to-silence-lockdep.patch
This patch will shortly appear at
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patche…
This patch will later appear in the mm-hotfixes-unstable branch at
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next via various
branches at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there most days
------------------------------------------------------
From: Alice Ryhl <aliceryhl(a)google.com>
Subject: rust: maple_tree: rcu_read_lock() in destructor to silence lockdep
Date: Wed, 17 Dec 2025 13:10:37 +0000
When running the Rust maple tree kunit tests with lockdep, you may trigger
a warning that looks like this:
lib/maple_tree.c:780 suspicious rcu_dereference_check() usage!
other info that might help us debug this:
rcu_scheduler_active = 2, debug_locks = 1
no locks held by kunit_try_catch/344.
stack backtrace:
CPU: 3 UID: 0 PID: 344 Comm: kunit_try_catch Tainted: G N 6.19.0-rc1+ #2 NONE
Tainted: [N]=TEST
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.17.0-0-gb52ca86e094d-prebuilt.qemu.org 04/01/2014
Call Trace:
<TASK>
dump_stack_lvl+0x71/0x90
lockdep_rcu_suspicious+0x150/0x190
mas_start+0x104/0x150
mas_find+0x179/0x240
_RINvNtCs5QSdWC790r4_4core3ptr13drop_in_placeINtNtCs1cdwasc6FUb_6kernel10maple_tree9MapleTreeINtNtNtBL_5alloc4kbox3BoxlNtNtB1x_9allocator7KmallocEEECsgxAQYCfdR72_25doctests_kernel_generated+0xaf/0x130
rust_doctest_kernel_maple_tree_rs_0+0x600/0x6b0
? lock_release+0xeb/0x2a0
? kunit_try_catch_run+0x210/0x210
kunit_try_run_case+0x74/0x160
? kunit_try_catch_run+0x210/0x210
kunit_generic_run_threadfn_adapter+0x12/0x30
kthread+0x21c/0x230
? __do_trace_sched_kthread_stop_ret+0x40/0x40
ret_from_fork+0x16c/0x270
? __do_trace_sched_kthread_stop_ret+0x40/0x40
ret_from_fork_asm+0x11/0x20
</TASK>
This is because the destructor of maple tree calls mas_find() without
taking rcu_read_lock() or the spinlock. Doing that is actually ok in this
case since the destructor has exclusive access to the entire maple tree,
but it triggers a lockdep warning. To fix that, take the rcu read lock.
In the future, it's possible that memory reclaim could gain a feature
where it reallocates entries in maple trees even if no user-code is
touching it. If that feature is added, then this use of rcu read lock
would become load-bearing, so I did not make it conditional on lockdep.
We have to repeatedly take and release rcu because the destructor of T
might perform operations that sleep.
Link: https://lkml.kernel.org/r/20251217-maple-drop-rcu-v1-1-702af063573f@google.…
Fixes: da939ef4c494 ("rust: maple_tree: add MapleTree")
Signed-off-by: Alice Ryhl <aliceryhl(a)google.com>
Reported-by: Andreas Hindborg <a.hindborg(a)kernel.org>
Closes: https://rust-for-linux.zulipchat.com/#narrow/channel/x/topic/x/near/5642151…
Reviewed-by: Gary Guo <gary(a)garyguo.net>
Reviewed-by: Daniel Almeida <daniel.almeida(a)collabora.com>
Cc: Andrew Ballance <andrewjballance(a)gmail.com>
Cc: Bj��rn Roy Baron <bjorn3_gh(a)protonmail.com>
Cc: Boqun Feng <boqun.feng(a)gmail.com>
Cc: Danilo Krummrich <dakr(a)kernel.org>
Cc: Liam Howlett <liam.howlett(a)oracle.com>
Cc: Matthew Wilcox (Oracle) <willy(a)infradead.org>
Cc: Miguel Ojeda <ojeda(a)kernel.org>
Cc: Trevor Gross <tmgross(a)umich.edu>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
rust/kernel/maple_tree.rs | 11 ++++++++++-
1 file changed, 10 insertions(+), 1 deletion(-)
--- a/rust/kernel/maple_tree.rs~rust-maple_tree-rcu_read_lock-in-destructor-to-silence-lockdep
+++ a/rust/kernel/maple_tree.rs
@@ -265,7 +265,16 @@ impl<T: ForeignOwnable> MapleTree<T> {
loop {
// This uses the raw accessor because we're destroying pointers without removing them
// from the maple tree, which is only valid because this is the destructor.
- let ptr = ma_state.mas_find_raw(usize::MAX);
+ //
+ // Take the rcu lock because mas_find_raw() requires that you hold either the spinlock
+ // or the rcu read lock. This is only really required if memory reclaim might
+ // reallocate entries in the tree, as we otherwise have exclusive access. That feature
+ // doesn't exist yet, so for now, taking the rcu lock only serves the purpose of
+ // silencing lockdep.
+ let ptr = {
+ let _rcu = kernel::sync::rcu::Guard::new();
+ ma_state.mas_find_raw(usize::MAX)
+ };
if ptr.is_null() {
break;
}
_
Patches currently in -mm which might be from aliceryhl(a)google.com are
rust-maple_tree-rcu_read_lock-in-destructor-to-silence-lockdep.patch
From: Thomas Gleixner <tglx(a)linutronix.de>
Luigi reported that retriggering a posted MSI interrupt does not work
correctly.
The reason is that the retrigger happens at the vector domain by sending an
IPI to the actual vector on the target CPU. That works correctly exactly
once because the posted MSI interrupt chip does not issue an EOI as that's
only required for the posted MSI notification vector itself.
As a consequence the vector becomes stale in the ISR, which not only
affects this vector but also any lower priority vector in the affected
APIC because the ISR bit is not cleared.
Luigi proposed to set the vector in the remap PIR bitmap and raise the
posted MSI notification vector. That works, but that still does not cure a
related problem:
If there is ever a stray interrupt on such a vector, then the related
APIC ISR bit becomes stale due to the lack of EOI as described above.
Unlikely to happen, but if it happens it's not debuggable at all.
So instead of playing games with the PIR, this can be actually solved
for both cases by:
1) Keeping track of the posted interrupt vector handler state
2) Implementing a posted MSI specific irq_ack() callback which checks that
state. If the posted vector handler is inactive it issues an EOI,
otherwise it delegates that to the posted handler.
This is correct versus affinity changes and concurrent events on the posted
vector as the actual handler invocation is serialized through the interrupt
descriptor lock.
Fixes: ed1e48ea4370 ("iommu/vt-d: Enable posted mode for device MSIs")
Reported-by: Luigi Rizzo <lrizzo(a)google.com>
Signed-off-by: Thomas Gleixner <tglx(a)linutronix.de>
Tested-by: Luigi Rizzo <lrizzo(a)google.com>
Cc: stable(a)vger.kernel.org
Closes: https://lore.kernel.org/lkml/20251124104836.3685533-1-lrizzo@google.com
---
V2: Use __this_cpu...() - Luigi
---
arch/x86/include/asm/irq_remapping.h | 7 +++++++
arch/x86/kernel/irq.c | 23 +++++++++++++++++++++++
drivers/iommu/intel/irq_remapping.c | 8 ++++----
3 files changed, 34 insertions(+), 4 deletions(-)
--- a/arch/x86/include/asm/irq_remapping.h
+++ b/arch/x86/include/asm/irq_remapping.h
@@ -87,4 +87,11 @@ static inline void panic_if_irq_remap(co
}
#endif /* CONFIG_IRQ_REMAP */
+
+#ifdef CONFIG_X86_POSTED_MSI
+void intel_ack_posted_msi_irq(struct irq_data *irqd);
+#else
+#define intel_ack_posted_msi_irq NULL
+#endif
+
#endif /* __X86_IRQ_REMAPPING_H */
--- a/arch/x86/kernel/irq.c
+++ b/arch/x86/kernel/irq.c
@@ -396,6 +396,7 @@ DEFINE_IDTENTRY_SYSVEC_SIMPLE(sysvec_kvm
/* Posted Interrupt Descriptors for coalesced MSIs to be posted */
DEFINE_PER_CPU_ALIGNED(struct pi_desc, posted_msi_pi_desc);
+static DEFINE_PER_CPU_CACHE_HOT(bool, posted_msi_handler_active);
void intel_posted_msi_init(void)
{
@@ -413,6 +414,25 @@ void intel_posted_msi_init(void)
this_cpu_write(posted_msi_pi_desc.ndst, destination);
}
+void intel_ack_posted_msi_irq(struct irq_data *irqd)
+{
+ irq_move_irq(irqd);
+
+ /*
+ * Handle the rare case that irq_retrigger() raised the actual
+ * assigned vector on the target CPU, which means that it was not
+ * invoked via the posted MSI handler below. In that case APIC EOI
+ * is required as otherwise the ISR entry becomes stale and lower
+ * priority interrupts are never going to be delivered after that.
+ *
+ * If the posted handler invoked the device interrupt handler then
+ * the EOI would be premature because it would acknowledge the
+ * posted vector.
+ */
+ if (unlikely(!__this_cpu_read(posted_msi_handler_active)))
+ apic_eoi();
+}
+
static __always_inline bool handle_pending_pir(unsigned long *pir, struct pt_regs *regs)
{
unsigned long pir_copy[NR_PIR_WORDS];
@@ -445,6 +465,8 @@ DEFINE_IDTENTRY_SYSVEC(sysvec_posted_msi
pid = this_cpu_ptr(&posted_msi_pi_desc);
+ /* Mark the handler active for intel_ack_posted_msi_irq() */
+ __this_cpu_write(posted_msi_handler_active, true);
inc_irq_stat(posted_msi_notification_count);
irq_enter();
@@ -473,6 +495,7 @@ DEFINE_IDTENTRY_SYSVEC(sysvec_posted_msi
apic_eoi();
irq_exit();
+ __this_cpu_write(posted_msi_handler_active, false);
set_irq_regs(old_regs);
}
#endif /* X86_POSTED_MSI */
--- a/drivers/iommu/intel/irq_remapping.c
+++ b/drivers/iommu/intel/irq_remapping.c
@@ -1303,17 +1303,17 @@ static struct irq_chip intel_ir_chip = {
* irq_enter();
* handle_edge_irq()
* irq_chip_ack_parent()
- * irq_move_irq(); // No EOI
+ * intel_ack_posted_msi_irq(); // No EOI
* handle_irq_event()
* driver_handler()
* handle_edge_irq()
* irq_chip_ack_parent()
- * irq_move_irq(); // No EOI
+ * intel_ack_posted_msi_irq(); // No EOI
* handle_irq_event()
* driver_handler()
* handle_edge_irq()
* irq_chip_ack_parent()
- * irq_move_irq(); // No EOI
+ * intel_ack_posted_msi_irq(); // No EOI
* handle_irq_event()
* driver_handler()
* apic_eoi()
@@ -1322,7 +1322,7 @@ static struct irq_chip intel_ir_chip = {
*/
static struct irq_chip intel_ir_chip_post_msi = {
.name = "INTEL-IR-POST",
- .irq_ack = irq_move_irq,
+ .irq_ack = intel_ack_posted_msi_irq,
.irq_set_affinity = intel_ir_set_affinity,
.irq_compose_msi_msg = intel_ir_compose_msi_msg,
.irq_set_vcpu_affinity = intel_ir_set_vcpu_affinity,
From: Li Nan <linan122(a)huawei.com>
commit 9c47127a807da3e36ce80f7c83a1134a291fc021 upstream.
Raid checks if pad3 is zero when loading superblock from disk. Arrays
created with new features may fail to assemble on old kernels as pad3
is used.
Add module parameter check_new_feature to bypass this check.
Link: https://lore.kernel.org/linux-raid/20251103125757.1405796-5-linan666@huawei…
Signed-off-by: Li Nan <linan122(a)huawei.com>
Reviewed-by: Xiao Ni <xni(a)redhat.com>
Signed-off-by: Yu Kuai <yukuai(a)fnnas.com>
---
drivers/md/md.c | 12 +++++++++---
1 file changed, 9 insertions(+), 3 deletions(-)
diff --git a/drivers/md/md.c b/drivers/md/md.c
index 4e033c26fdd4..9d9cb7e1e6e8 100644
--- a/drivers/md/md.c
+++ b/drivers/md/md.c
@@ -340,6 +340,7 @@ static int start_readonly;
*/
static bool create_on_open = true;
static bool legacy_async_del_gendisk = true;
+static bool check_new_feature = true;
/*
* We have a system wide 'event count' that is incremented
@@ -1752,9 +1753,13 @@ static int super_1_load(struct md_rdev *rdev, struct md_rdev *refdev, int minor_
}
if (sb->pad0 ||
sb->pad3[0] ||
- memcmp(sb->pad3, sb->pad3+1, sizeof(sb->pad3) - sizeof(sb->pad3[1])))
- /* Some padding is non-zero, might be a new feature */
- return -EINVAL;
+ memcmp(sb->pad3, sb->pad3+1, sizeof(sb->pad3) - sizeof(sb->pad3[1]))) {
+ pr_warn("Some padding is non-zero on %pg, might be a new feature\n",
+ rdev->bdev);
+ if (check_new_feature)
+ return -EINVAL;
+ pr_warn("check_new_feature is disabled, data corruption possible\n");
+ }
rdev->preferred_minor = 0xffff;
rdev->data_offset = le64_to_cpu(sb->data_offset);
@@ -10459,6 +10464,7 @@ module_param(start_dirty_degraded, int, S_IRUGO|S_IWUSR);
module_param_call(new_array, add_named_array, NULL, NULL, S_IWUSR);
module_param(create_on_open, bool, S_IRUSR|S_IWUSR);
module_param(legacy_async_del_gendisk, bool, 0600);
+module_param(check_new_feature, bool, 0600);
MODULE_LICENSE("GPL");
MODULE_DESCRIPTION("MD RAID framework");
--
2.39.2
(Cc: stable(a)vger.kernel.org)
On Wed Dec 10, 2025 at 12:25 PM CET, Marko Turk wrote:
> MMIO backend of PCI Bar always assumes little-endian devices and
> will convert to CPU endianness automatically. Remove the u32::from_le
> conversion which would cause a bug on big-endian machines.
>
> Signed-off-by: Marko Turk <mt(a)markoturk.info>
> Fixes: 685376d18e9a ("samples: rust: add Rust PCI sample driver")
Applied to driver-core-linus, thanks!
This reverts commit b3b274bc9d3d7307308aeaf75f70731765ac999a.
On the DragonBoard 820c (which uses APQ8096/MSM8996) this change causes
the CPUs to downclock to roughly half speed under sustained load. The
regression is visible both during boot and when running CPU stress
workloads such as stress-ng: the CPUs initially ramp up to the expected
frequency, then drop to a lower OPP even though the system is clearly
CPU-bound.
Bisecting points to this commit and reverting it restores the expected
behaviour on the DragonBoard 820c - the CPUs track the cpufreq policy
and run at full performance under load.
The exact interaction with the ACD is not yet fully understood and we
would like to keep ACD in use to avoid possible SoC reliability issues.
Until we have a better fix that preserves ACD while avoiding this
performance regression, revert the bisected patch to restore the
previous behaviour.
Fixes: b3b274bc9d3d ("clk: qcom: cpu-8996: simplify the cpu_clk_notifier_cb")
Cc: stable(a)vger.kernel.org # v6.3+
Link: https://lore.kernel.org/linux-arm-msm/20230113120544.59320-8-dmitry.baryshk…
Cc: Dmitry Baryshkov <dmitry.baryshkov(a)oss.qualcomm.com>
Signed-off-by: Christopher Obbard <christopher.obbard(a)linaro.org>
---
Hi all,
This series contains a single revert for a regression affecting the
APQ8096/MSM8996 (DragonBoard 820c).
The commit being reverted, b3b274bc9d3d ("clk: qcom: cpu-8996: simplify the cpu_clk_notifier_cb"),
introduces a significant performance issue where the CPUs downclock to
~50% of their expected frequency under sustained load. The problem is
reproducible both at boot and when running CPU-bound workloads such as
stress-ng.
Bisecting the issue pointed directly to this commit and reverting it
restores correct cpufreq behaviour.
The root cause appears to be related to the interaction between the
simplified notifier callback and ACD (Adaptive Clock Distribution).
Since we would prefer to keep ACD enabled for SoC reliability reasons,
a revert is the safest option until a proper fix is identified.
Full details are included in the commit message.
Feedback & suggestions welcome.
Cheers!
Christopher Obbard
---
drivers/clk/qcom/clk-cpu-8996.c | 30 +++++++++++-------------------
1 file changed, 11 insertions(+), 19 deletions(-)
diff --git a/drivers/clk/qcom/clk-cpu-8996.c b/drivers/clk/qcom/clk-cpu-8996.c
index 21d13c0841ed..028476931747 100644
--- a/drivers/clk/qcom/clk-cpu-8996.c
+++ b/drivers/clk/qcom/clk-cpu-8996.c
@@ -547,35 +547,27 @@ static int cpu_clk_notifier_cb(struct notifier_block *nb, unsigned long event,
{
struct clk_cpu_8996_pmux *cpuclk = to_clk_cpu_8996_pmux_nb(nb);
struct clk_notifier_data *cnd = data;
+ int ret;
switch (event) {
case PRE_RATE_CHANGE:
+ ret = clk_cpu_8996_pmux_set_parent(&cpuclk->clkr.hw, ALT_INDEX);
qcom_cpu_clk_msm8996_acd_init(cpuclk->clkr.regmap);
-
- /*
- * Avoid overvolting. clk_core_set_rate_nolock() walks from top
- * to bottom, so it will change the rate of the PLL before
- * chaging the parent of PMUX. This can result in pmux getting
- * clocked twice the expected rate.
- *
- * Manually switch to PLL/2 here.
- */
- if (cnd->new_rate < DIV_2_THRESHOLD &&
- cnd->old_rate > DIV_2_THRESHOLD)
- clk_cpu_8996_pmux_set_parent(&cpuclk->clkr.hw, SMUX_INDEX);
-
break;
- case ABORT_RATE_CHANGE:
- /* Revert manual change */
- if (cnd->new_rate < DIV_2_THRESHOLD &&
- cnd->old_rate > DIV_2_THRESHOLD)
- clk_cpu_8996_pmux_set_parent(&cpuclk->clkr.hw, ACD_INDEX);
+ case POST_RATE_CHANGE:
+ if (cnd->new_rate < DIV_2_THRESHOLD)
+ ret = clk_cpu_8996_pmux_set_parent(&cpuclk->clkr.hw,
+ SMUX_INDEX);
+ else
+ ret = clk_cpu_8996_pmux_set_parent(&cpuclk->clkr.hw,
+ ACD_INDEX);
break;
default:
+ ret = 0;
break;
}
- return NOTIFY_OK;
+ return notifier_from_errno(ret);
};
static int qcom_cpu_clk_msm8996_driver_probe(struct platform_device *pdev)
---
base-commit: c17e270dfb342a782d69c4a7c4c32980455afd9c
change-id: 20251202-wip-obbardc-qcom-msm8096-clk-cpu-fix-downclock-b7561da4cb95
Best regards,
--
Christopher Obbard <christopher.obbard(a)linaro.org>