The vNTB endpoint function (pci-epf-vntb) can be configured and reconfigured
through configfs (link/unlink functions, start/stop the controller, update
parameters). In practice, several pitfalls present: double-unmapping when two
windows share a BAR, wrong parameter order in .drop_link leading to wrong
object lookups, duplicate EPC teardown that leads to oopses, a work item
running after resources were torn down, and inability to re-link/restart
fundamentally because ntb_dev was embedded and the vPCI bus teardown was
incomplete.
This series addresses those issues and hardens resource management across NTB
EPF and PCI EP core:
- Avoid double iounmap when PEER_SPAD and CONFIG share the same BAR.
- Fix configfs .drop_link parameter order so the correct groups are used during
unlink.
- Remove duplicate EPC resource teardown in both pci-epf-vntb and pci-epf-ntb,
avoiding crashes on .allow_link failures and during .drop_link.
- Stop the delayed cmd_handler work before clearing BARs/doorbells.
- Manage ntb_dev as a devm-managed allocation and implement .remove() in the
vNTB PCI driver. Switch to pci_scan_root_bus().
With these changes, the controller can now be stopped, a function unlinked,
configfs settings updated, and the controller re-linked and restarted
without rebooting the endpoint, as long as the underlying pci_epc_ops
.stop() is non-destructive and .start() restores normal operation.
Patches 1-5 carry Fixes tags and are candidates for stable.
Patch 6 is a preparatory one for Patch 7.
Patch 7 is a behavioral improvement that completes lifetime management for
relink/restart scenarios.
Apologies for the delay between v2 and v3, and thank you for the review.
v2->v3 changes:
- Added Reviewed-by tag for [PATCH v2 4/6]
- Split [PATCH v2 6/6] into two, based on the feedback from Frank.
(No code changes overall.)
v1->v2 changes:
- Incorporated feedback from Frank.
- Added Reviewed-by tags (except for patches #4 and #6).
- Fixed a typo in patch #5 title.
(No code changes overall.)
v2: https://lore.kernel.org/all/20251029080321.807943-1-den@valinux.co.jp/
v1: https://lore.kernel.org/all/20251023071757.901181-1-den@valinux.co.jp/
Koichiro Den (7):
NTB: epf: Avoid pci_iounmap() with offset when PEER_SPAD and CONFIG
share BAR
PCI: endpoint: Fix parameter order for .drop_link
PCI: endpoint: pci-epf-vntb: Remove duplicate resource teardown
PCI: endpoint: pci-epf-ntb: Remove duplicate resource teardown
NTB: epf: vntb: Stop cmd_handler work in epf_ntb_epc_cleanup
PCI: endpoint: pci-epf-vntb: Switch vpci_scan_bus() to use
pci_scan_root_bus()
PCI: endpoint: pci-epf-vntb: Manage ntb_dev lifetime and fix vpci bus
teardown
drivers/ntb/hw/epf/ntb_hw_epf.c | 3 +-
drivers/pci/endpoint/functions/pci-epf-ntb.c | 56 +-----------
drivers/pci/endpoint/functions/pci-epf-vntb.c | 86 ++++++++++++-------
drivers/pci/endpoint/pci-ep-cfs.c | 8 +-
4 files changed, 62 insertions(+), 91 deletions(-)
--
2.48.1
syzbot reported a kernel BUG in ocfs2_find_victim_chain() because the
`cl_next_free_rec` field of the allocation chain list is 0, triggring the
BUG_ON(!cl->cl_next_free_rec) condition and panicking the kernel.
To fix this, `cl_next_free_rec` is checked inside the caller of
ocfs2_find_victim_chain() i.e. ocfs2_claim_suballoc_bits() and if it is
equal to 0, ocfs2_error() is called, to log the corruption and force the
filesystem into read-only mode, to prevent further damage.
Reported-by: syzbot+96d38c6e1655c1420a72(a)syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=96d38c6e1655c1420a72
Tested-by: syzbot+96d38c6e1655c1420a72(a)syzkaller.appspotmail.com
Cc: stable(a)vger.kernel.org
Signed-off-by: Prithvi Tambewagh <activprithvi(a)gmail.com>
---
fs/ocfs2/suballoc.c | 7 +++++++
1 file changed, 7 insertions(+)
diff --git a/fs/ocfs2/suballoc.c b/fs/ocfs2/suballoc.c
index 6ac4dcd54588..84bb2d11c2aa 100644
--- a/fs/ocfs2/suballoc.c
+++ b/fs/ocfs2/suballoc.c
@@ -1993,6 +1993,13 @@ static int ocfs2_claim_suballoc_bits(struct ocfs2_alloc_context *ac,
cl = (struct ocfs2_chain_list *) &fe->id2.i_chain;
+ if (le16_to_cpu(cl->cl_next_free_rec) == 0) {
+ status = ocfs2_error(ac->ac_inode->i_sb,
+ "Chain allocator dinode %llu has 0 chains\n",
+ (unsigned long long)le64_to_cpu(fe->i_blkno));
+ goto bail;
+ }
+
victim = ocfs2_find_victim_chain(cl);
ac->ac_chain = victim;
base-commit: 939f15e640f193616691d3bcde0089760e75b0d3
--
2.34.1
syzbot reported a kernel BUG in ocfs2_find_victim_chain() because the
`cl_next_free_rec` field of the allocation chain list is 0, triggring the
BUG_ON(!cl->cl_next_free_rec) condition and panicking the kernel.
To fix this, `cl_next_free_rec` is checked inside the caller of
ocfs2_find_victim_chain() i.e. ocfs2_claim_suballoc_bits() and if it is
equal to 0, ocfs2_error() is called, to log the corruption and force the
filesystem into read-only mode, to prevent further damage.
Reported-by: syzbot+96d38c6e1655c1420a72(a)syzkaller.appspotmail.com
Tested-by: syzbot+96d38c6e1655c1420a72(a)syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=96d38c6e1655c1420a72
Cc: stable(a)vger.kernel.org
Signed-off-by: Prithvi Tambewagh <activprithvi(a)gmail.com>
---
fs/ocfs2/suballoc.c | 7 +++++++
1 file changed, 7 insertions(+)
diff --git a/fs/ocfs2/suballoc.c b/fs/ocfs2/suballoc.c
index 6ac4dcd54588..c7eb6efc00b4 100644
--- a/fs/ocfs2/suballoc.c
+++ b/fs/ocfs2/suballoc.c
@@ -1993,6 +1993,13 @@ static int ocfs2_claim_suballoc_bits(struct ocfs2_alloc_context *ac,
cl = (struct ocfs2_chain_list *) &fe->id2.i_chain;
+ if( le16_to_cpu(cl->cl_next_free_rec) == 0) {
+ status = ocfs2_error(ac->ac_inode->i_sb,
+ "Chain allocator dinode %llu has 0 chains\n",
+ (unsigned long long)le64_to_cpu(fe->i_blkno));
+ goto bail;
+ }
+
victim = ocfs2_find_victim_chain(cl);
ac->ac_chain = victim;
base-commit: 939f15e640f193616691d3bcde0089760e75b0d3
--
2.34.1
Hi Greg, Sasha, Jiayuan,
On 27/11/2025 14:41, gregkh(a)linuxfoundation.org wrote:
>
> This is a note to let you know that I've just added the patch titled
>
> mptcp: Fix proto fallback detection with BPF
>
> to the 6.1-stable tree which can be found at:
> http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
>
> The filename of the patch is:
> mptcp-fix-proto-fallback-detection-with-bpf.patch
> and it can be found in the queue-6.1 subdirectory.
>
> If you, or anyone else, feels it should not be added to the stable tree,
> please let <stable(a)vger.kernel.org> know about it.
@Sasha: thank you for having resolved the conflicts for this patch (and
many others related to MPTCP recently). Sadly, it is causing troubles.
@Greg/Sasha: is it possible to remove it from 6.1, 5.15 and 5.10 queues
please?
(The related patch in 6.6 and above is OK)
@Jiayuan: did you not specify you initially saw this issue on a v6.1
kernel? By chance, do you already have a fix for that version?
Cheers,
Matt
--
Sponsored by the NGI0 Core fund.
If the CM ID is destroyed while the CM event for multicast creating is
still queued the cancel_work_sync() will prevent the work from running
which also prevents destroying the ah_attr. This leaks a refcount and
triggers a WARN:
GID entry ref leak for dev syz1 index 2 ref=573
WARNING: CPU: 1 PID: 655 at drivers/infiniband/core/cache.c:809 release_gid_table drivers/infiniband/core/cache.c:806 [inline]
WARNING: CPU: 1 PID: 655 at drivers/infiniband/core/cache.c:809 gid_table_release_one+0x284/0x3cc drivers/infiniband/core/cache.c:886
Destroy the ah_attr after canceling the work, it is safe to call this
twice.
Cc: stable(a)vger.kernel.org
Fixes: fe454dc31e84 ("RDMA/ucma: Fix use-after-free bug in ucma_create_uevent")
Reported-by: syzbot+b0da83a6c0e2e2bddbd4(a)syzkaller.appspotmail.com
Closes: https://lore.kernel.org/all/68232e7b.050a0220.f2294.09f6.GAE@google.com
Signed-off-by: Jason Gunthorpe <jgg(a)nvidia.com>
---
drivers/infiniband/core/cma.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c
index 95e89f5c147c2c..4f5fd47086ab90 100644
--- a/drivers/infiniband/core/cma.c
+++ b/drivers/infiniband/core/cma.c
@@ -2031,6 +2031,8 @@ static void destroy_mc(struct rdma_id_private *id_priv,
dev_put(ndev);
cancel_work_sync(&mc->iboe_join.work);
+ if (event->event == RDMA_CM_EVENT_MULTICAST_JOIN)
+ rdma_destroy_ah_attr(&event->param.ud.ah_attr);
}
kfree(mc);
}
base-commit: 3fbaef0942719187f3396bfd0c780d55d35e0980
--
2.43.0