On Sun, Oct 26, 2025 at 09:44:14PM -0700, Vivek Kasireddy wrote:
> +/**
> + * dma_buf_match_interconnects - determine if there is a specific interconnect
> + * that is supported by both exporter and importer.
> + * @attach: [in] attachment to populate ic_match field
> + * @exp: [in] array of interconnects supported by exporter
> + * @exp_ics: [in] number of interconnects supported by exporter
> + * @imp: [in] array of interconnects supported by importer
> + * @imp_ics: [in] number of interconnects supported by importer
> + *
> + * This helper function iterates through the list of interconnects supported by
> + * both exporter and importer to find a match. A successful match means that
> + * a common interconnect type is supported by both parties and the exporter's
> + * match_interconnect() callback also confirms that the importer is compatible
> + * with the exporter for that interconnect type.
Document which of the exporter/importer is supposed to call this
> + *
> + * If a match is found, the attach->ic_match field is populated with a copy
> + * of the exporter's match data.
> + * Return: true if a match is found, false otherwise.
> + */
> +bool dma_buf_match_interconnects(struct dma_buf_attachment *attach,
> +				 const struct dma_buf_interconnect_match *exp,
> +				 unsigned int exp_ics,
> +				 const struct dma_buf_interconnect_match *imp,
> +				 unsigned int imp_ics)
> +{
> +	const struct dma_buf_interconnect_ops *ic_ops;
> +	struct dma_buf_interconnect_match *ic_match;
> +	struct dma_buf *dmabuf = attach->dmabuf;
> +	unsigned int i, j;
> +
> +	if (!exp || !imp)
> +		return false;
> +
> +	if (!attach->allow_ic)
> +		return false;
Seems redundant with this check for ic_ops == NULL:
> +	ic_ops = dmabuf->ops->interconnect_ops;
> +	if (!ic_ops || !ic_ops->match_interconnect)
> +		return false;
This seems like too much of a maze to me..
I think you should structure it like this. First declare an interconnect:
struct dma_buf_interconnect iov_interconnect = {
	.name = "IOV interconnect",
	.match = ...,
};
Then the exporters "subclass"
struct dma_buf_interconnect_ops vfio_iov_interconnect = {
	.interconnect = &iov_interconnect,
	.map = vfio_map,
};
I guess no container_of technique..
Then in VFIO's attach trigger the new code:
const struct dma_buf_interconnect_match vfio_exp_ics[] = {
	{ &vfio_iov_interconnect },
};

dma_buf_match_interconnects(attach, vfio_exp_ics, ARRAY_SIZE(vfio_exp_ics));
Which will call back to the importer:
static const struct dma_buf_attach_ops xe_dma_buf_attach_ops = {
	.get_importer_interconnects = ...,
};
dma_buf_match_interconnects() would call
aops->get_importer_interconnects
and match first on .interconnect, then call the interconnect->match
function with the exporter/importer match structs if not NULL.
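Roughly, the core helper would then look something like this (sketch of
the flow only, all of these names are made up to match the shape above,
nothing here exists yet):

bool dma_buf_match_interconnects(struct dma_buf_attachment *attach,
				 const struct dma_buf_interconnect_match *exp,
				 unsigned int num_exp)
{
	const struct dma_buf_attach_ops *aops = attach->importer_ops;
	const struct dma_buf_interconnect_match *imp;
	unsigned int num_imp, i, j;

	if (!aops || !aops->get_importer_interconnects)
		return false;

	imp = aops->get_importer_interconnects(attach, &num_imp);
	if (!imp)
		return false;

	for (i = 0; i < num_exp; i++) {
		for (j = 0; j < num_imp; j++) {
			const struct dma_buf_interconnect *ic = exp[i].interconnect;

			if (ic != imp[j].interconnect)
				continue;
			if (ic->match && !ic->match(attach, &exp[i], &imp[j]))
				continue;

			/* copy the exporter's match data for later map/unmap */
			attach->ic_match = exp[i];
			return true;
		}
	}
	return false;
}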
> +struct dma_buf_interconnect_match {
> +	const struct dma_buf_interconnect *type;
> +	struct device *dev;
> +	unsigned int bar;
> +};
This should be more general, dev and bar are unique to the iov
importer. Maybe just something simple:
struct dma_buf_interconnect_match {
	struct dma_buf_interconnect *ic; // no need for type
	const struct dma_buf_interconnect_ops *exporter_ic_ops;
	u64 match_data[2]; // dev and bar are IOV specific, generalize
};
Then some helper:
const struct dma_buf_interconnect_match supports_ics[] = {
	IOV_INTERCONNECT(&vfio_iov_interconnect, dev, bar),
};
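where IOV_INTERCONNECT() would be a trivial wrapper, e.g. something like
this (sketch only, the packing of dev/bar into match_data is just an
assumption):

#define IOV_INTERCONNECT(ops, dev, bar)					\
	{								\
		.ic = (ops)->interconnect,				\
		.exporter_ic_ops = (ops),				\
		.match_data = { (u64)(unsigned long)(dev), (bar) },	\
	}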
And it would be nice if interconnect-aware drivers could more easily
interwork with non-interconnect importers.
So I'd add an exporter type of 'p2p dma mapped scatterlist' that just
matches the legacy importer.
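i.e. roughly (sketch, name made up):

static const struct dma_buf_interconnect dma_buf_sgt_interconnect = {
	.name = "p2p dma mapped scatterlist",
	/* no ->match needed, any legacy sg_table importer qualifies */
};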
Jason
On Sun, Oct 26, 2025 at 09:44:13PM -0700, Vivek Kasireddy wrote:
> For the map operation, the dma-buf core will create an xarray but
> the exporter needs to populate it with the interconnect specific
> addresses. And, similarly for unmap, the exporter is expected to
> cleanup the individual entries of the xarray.
I don't think we should limit this to xarrays, nor do I think it is a
great data structure for what is usually needed here..
I just posted the patches showing what iommufd needs, and it wants
something like
struct mapping {
	struct p2p_provider *provider;
	size_t nelms;
	struct phys_vec *phys;
};
Which is not something that makes sense as an xarray.
I think the interconnect should have its own functions for map/unmap,
i.e. instead of trying to have them as a common
dma_buf_interconnect_ops do something like
struct dma_buf_interconnect_ops {
	const char *name;
	bool (*supports_interconnects)(struct dma_buf_attachment *attach,
				       const struct dma_buf_interconnect_match *,
				       unsigned int num_ics);
};

struct dma_buf_iov_interconnect_ops {
	struct dma_buf_interconnect_ops ic_ops;
	struct xx *(*map)(struct dma_buf_attachment *attach,
			  unsigned int *bar_number,
			  size_t *nelms);
	// No unmap for iov
};
static inline struct xx *dma_buf_iov_map(struct dma_buf_attachment *attach,
					 unsigned int *bar_number,
					 size_t *nelms)
{
	return container_of(attach->ic_ops, struct dma_buf_iov_interconnect_ops,
			    ic_ops)->map(attach, bar_number, nelms);
}
> +/**
> + * dma_buf_attachment_is_dynamic - check if the importer can handle move_notify.
> + * @attach: the attachment to check
> + *
> + * Returns true if a DMA-buf importer has indicated that it can handle dmabuf
> + * location changes through the move_notify callback.
> + */
> +static inline bool
> +dma_buf_attachment_is_dynamic(struct dma_buf_attachment *attach)
> +{
> +	return !!attach->importer_ops;
> +}
Why is this in this patch?
I also think this patch should be second in the series; it makes more
sense to figure out how to attach with an interconnect and then show how
to map/unmap with that interconnect.
Like I'm not sure why this introduces allow_ic?
Jason
On Sun, Oct 26, 2025 at 03:55:04PM +0800, Shuai Xue wrote:
>
>
> > On 2025/10/22 20:50, Jason Gunthorpe wrote:
> > On Mon, Oct 13, 2025 at 06:26:11PM +0300, Leon Romanovsky wrote:
> > > From: Leon Romanovsky <leonro(a)nvidia.com>
> > >
> > > Add support for exporting PCI device MMIO regions through dma-buf,
> > > enabling safe sharing of non-struct page memory with controlled
> > > lifetime management. This allows RDMA and other subsystems to import
> > > dma-buf FDs and build them into memory regions for PCI P2P operations.
> > >
> > > The implementation provides a revocable attachment mechanism using
> > > dma-buf move operations. MMIO regions are normally pinned as BARs
> > > don't change physical addresses, but access is revoked when the VFIO
> > > device is closed or a PCI reset is issued. This ensures kernel
> > > self-defense against potentially hostile userspace.
> >
> > Let's enhance this:
> >
> > Currently VFIO can take MMIO regions from the device's BAR and map
> > them into a PFNMAP VMA with special PTEs. This mapping type ensures
> > the memory cannot be used with things like pin_user_pages(), hmm, and
> > so on. In practice only the user process CPU and KVM can safely make
> > use of these VMA. When VFIO shuts down these VMAs are cleaned by
> > unmap_mapping_range() to prevent any UAF of the MMIO beyond driver
> > unbind.
> >
> > However, VFIO type 1 has an insecure behavior where it uses
> > follow_pfnmap_*() to fish a MMIO PFN out of a VMA and program it back
> > into the IOMMU. This has a long history of enabling P2P DMA inside
> > VMs, but has serious lifetime problems by allowing a UAF of the MMIO
> > after the VFIO driver has been unbound.
>
> Hi, Jason,
>
> Can you elaborate on this more?
>
> From my understanding of the VFIO type 1 implementation:
>
> - When a device is opened through VFIO type 1, it increments the
> device->refcount
> - During unbind, the driver waits for this refcount to drop to zero via
> wait_for_completion(&device->comp)
> - This should prevent the unbind() from completing while the device is
> still in use
>
> Given this refcount mechanism, I cannot figure out how the UAF can
> occur.
A second vfio device can be opened and then use follow_pfnmap_*() to
read the first vfio device's PTEs. There is no relationship between
the first and second VFIO devices, so once the first is unbound it
sails through the device->comp while the second device retains the PFN
in its type1 iommu_domain.
Jason
On 10/20/25 13:18, Matthew Brost wrote:
> On Mon, Oct 20, 2025 at 10:16:23AM +0200, Philipp Stanner wrote:
>> On Fri, 2025-10-17 at 14:28 -0700, Matthew Brost wrote:
>>> On Fri, Oct 17, 2025 at 11:31:47AM +0200, Philipp Stanner wrote:
>>>> It seems that DMA_FENCE_FLAG_SEQNO64_BIT has no real effects anymore,
>>>> since seqno is a u64 everywhere.
>>>>
>>>> Remove the unneeded flag.
>>>>
>>>> Signed-off-by: Philipp Stanner <phasta(a)kernel.org>
>>>> ---
>>>> Seems to me that this flag doesn't really do anything anymore?
>>>>
>>>> I *suspect* that it could be that some drivers pass a u32 to
>>>> dma_fence_init()? I guess they could be ported, couldn't they.
>>>>
>>>
>>> Xe uses 32-bit hardware fence sequence numbers—see [1] and [2]. We could
>>> switch to 64-bit hardware fence sequence numbers, but that would require
>>> changes on the driver side. If you sent this to our CI, I’m fairly
>>> certain we’d see a bunch of failures. I suspect this would also break
>>> several other drivers.
>>
>> What exactly breaks? Help me out here; if you pass a u32 for a u64,
>
> Seqno wraps.
>
>> doesn't the C standard guarantee that the higher, unused 32 bits will
>> be 0?
>
> return (int)(lower_32_bits(f1) - lower_32_bits(f2)) > 0;
>
> Look at the above logic.
>
> f1 = 0x0;
> f2 = 0xffffffff; /* -1 */
>
> The above statement will correctly return true.
>
> Compared to the below statement which returns false.
>
> return f1 > f2;
>
> We test seqno wraps in Xe by setting our initial seqno to -127, so again, if
> you send this patch to our CI, any test which sends more than 127 jobs on a
> queue will likely fail.
Yeah, exactly, that's why this flag is needed for quite a lot of things.
The question is what is missing in the documentation to make that clear?
Regards,
Christian.
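(The wrap case is easy to see standalone; illustrative userspace snippet,
equivalent to the lower_32_bits() comparison quoted above:)

#include <stdint.h>
#include <stdio.h>

static int is_later32(uint64_t f1, uint64_t f2)
{
	/* same idea as (int)(lower_32_bits(f1) - lower_32_bits(f2)) > 0 */
	return (int32_t)((uint32_t)f1 - (uint32_t)f2) > 0;
}

int main(void)
{
	uint64_t f1 = 0x0, f2 = 0xffffffff;

	printf("wrap-safe: %d\n", is_later32(f1, f2));	/* 1: f1 is later */
	printf("plain:     %d\n", f1 > f2);		/* 0: wrap handled wrong */
	return 0;
}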
>
> Matt
>
>>
>> Because the only thing the flag still does is do this lower_32 check in
>> fence_is_later.
>>
>> P.
>>
>>>
>>> As I mentioned, all Xe-supported platforms could be updated since their
>>> rings support 64-bit store instructions. However, I suspect that very
>>> old i915 platforms don’t support such instructions in the ring. I agree
>>> this is a legacy issue, and we should probably use 64-bit sequence
>>> numbers in Xe. But again, platforms and drivers that are decades old
>>> might break as a result.
>>>
>>> Matt
>>>
>>> [1] https://elixir.bootlin.com/linux/v6.17.1/source/drivers/gpu/drm/xe/xe_hw_fe…
>>> [2] https://elixir.bootlin.com/linux/v6.17.1/source/drivers/gpu/drm/xe/xe_hw_fe…
>>>
>>>> P.
>>>> ---
>>>> drivers/dma-buf/dma-fence.c | 3 +--
>>>> include/linux/dma-fence.h | 10 +---------
>>>> 2 files changed, 2 insertions(+), 11 deletions(-)
>>>>
>>>> diff --git a/drivers/dma-buf/dma-fence.c b/drivers/dma-buf/dma-fence.c
>>>> index 3f78c56b58dc..24794c027813 100644
>>>> --- a/drivers/dma-buf/dma-fence.c
>>>> +++ b/drivers/dma-buf/dma-fence.c
>>>> @@ -1078,8 +1078,7 @@ void
>>>> dma_fence_init64(struct dma_fence *fence, const struct dma_fence_ops *ops,
>>>> spinlock_t *lock, u64 context, u64 seqno)
>>>> {
>>>> - __dma_fence_init(fence, ops, lock, context, seqno,
>>>> - BIT(DMA_FENCE_FLAG_SEQNO64_BIT));
>>>> + __dma_fence_init(fence, ops, lock, context, seqno, 0);
>>>> }
>>>> EXPORT_SYMBOL(dma_fence_init64);
>>>>
>>>> diff --git a/include/linux/dma-fence.h b/include/linux/dma-fence.h
>>>> index 64639e104110..4eca2db28625 100644
>>>> --- a/include/linux/dma-fence.h
>>>> +++ b/include/linux/dma-fence.h
>>>> @@ -98,7 +98,6 @@ struct dma_fence {
>>>> };
>>>>
>>>> enum dma_fence_flag_bits {
>>>> - DMA_FENCE_FLAG_SEQNO64_BIT,
>>>> DMA_FENCE_FLAG_SIGNALED_BIT,
>>>> DMA_FENCE_FLAG_TIMESTAMP_BIT,
>>>> DMA_FENCE_FLAG_ENABLE_SIGNAL_BIT,
>>>> @@ -470,14 +469,7 @@ dma_fence_is_signaled(struct dma_fence *fence)
>>>> */
>>>> static inline bool __dma_fence_is_later(struct dma_fence *fence, u64 f1, u64 f2)
>>>> {
>>>> - /* This is for backward compatibility with drivers which can only handle
>>>> - * 32bit sequence numbers. Use a 64bit compare when the driver says to
>>>> - * do so.
>>>> - */
>>>> - if (test_bit(DMA_FENCE_FLAG_SEQNO64_BIT, &fence->flags))
>>>> - return f1 > f2;
>>>> -
>>>> - return (int)(lower_32_bits(f1) - lower_32_bits(f2)) > 0;
>>>> + return f1 > f2;
>>>> }
>>>>
>>>> /**
>>>> --
>>>> 2.49.0
>>>>
>>
From: Matthew Auld <matthew.auld(a)intel.com>
[ Upstream commit edb1745fc618ba8ef63a45ce3ae60de1bdf29231 ]
Since the dma-resv is shared we don't need to reserve and add a fence
slot fence twice, plus no need to loop through the dependencies.
Signed-off-by: Matthew Auld <matthew.auld(a)intel.com>
Cc: Thomas Hellström <thomas.hellstrom(a)linux.intel.com>
Cc: Matthew Brost <matthew.brost(a)intel.com>
Reviewed-by: Jonathan Cavitt <jonathan.cavitt(a)intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom(a)linux.intel.com>
Link: https://lore.kernel.org/r/20250829164715.720735-2-matthew.auld@intel.com
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
LLM Generated explanations, may be completely bogus:
YES
Explanation
- What it fixes
- Removes redundant dma-resv operations when a backup BO shares the
same reservation object as the original BO, preventing the same
fence from being reserved/added twice to the same `dma_resv`.
- Avoids scanning the same dependency set twice when source and
destination BOs share the same `dma_resv`.
- Why the change is correct
- The backup object is created to share the parent’s reservation
object, so a single reserve/add is sufficient:
- The backup BO is initialized with the parent’s resv:
`drivers/gpu/drm/xe/xe_bo.c:1309` (`xe_bo_init_locked(...,
bo->ttm.base.resv, ...)`), ensuring `bo->ttm.base.resv ==
backup->ttm.base.resv`.
- The patch adds an explicit invariant check to document and enforce
this: `drivers/gpu/drm/xe/xe_bo.c:1225` (`xe_assert(xe,
bo->ttm.base.resv == backup->ttm.base.resv)`).
- With shared `dma_resv`, adding the same fence twice is at best
redundant (wasting fence slots and memory) and at worst error-prone.
Reserving fence slots only once and adding the fence once is the
correct behavior.
- Specific code changes and effects
- Evict path (GPU migration copy case):
- Before: reserves and adds fence on both `bo->ttm.base.resv` and
`backup->ttm.base.resv`.
- After: reserves and adds exactly once, guarded by the shared-resv
assertion.
- See single reserve and add: `drivers/gpu/drm/xe/xe_bo.c:1226`
(reserve) and `drivers/gpu/drm/xe/xe_bo.c:1237` (add fence). This
is the core fix; the removed second reserve/add on the backup is
the redundant part eliminated.
- Restore path (migration copy back):
- Same simplification: reserve once, add once on the shared
`dma_resv`.
- See single reserve and add: `drivers/gpu/drm/xe/xe_bo.c:1375`
(reserve) and `drivers/gpu/drm/xe/xe_bo.c:1387` (add fence).
- Dependency handling in migrate:
- Before: added deps for both src and dst based only on `src_bo !=
dst_bo`.
- After: only add dst deps if the resv objects differ, avoiding
double-walking the same `dma_resv`.
- See updated condition: `drivers/gpu/drm/xe/xe_migrate.c:932`
(`src_bo->ttm.base.resv != dst_bo->ttm.base.resv`).
- User-visible impact without the patch
- Duplicate `dma_resv_add_fence()` calls on the same reservation
object can:
- Consume extra shared-fence slots and memory.
- Inflate dependency lists, causing unnecessary scheduler waits and
overhead.
- Increase failure likelihood of `dma_resv_reserve_fences()` under
memory pressure.
- These paths are exercised during suspend/resume flows of pinned VRAM
BOs (evict/restore), so reliability and performance in power
transitions can be affected.
- Scope and risk
- Small, focused changes localized to the Intel Xe driver
migration/evict/restore paths:
- Files: `drivers/gpu/drm/xe/xe_bo.c`,
`drivers/gpu/drm/xe/xe_migrate.c`.
- No API changes or architectural refactors; logic strictly reduces
redundant operations.
- The `xe_assert` acts as a safety net to catch unexpected non-shared
`resv` usage; normal runtime behavior is unchanged when the
invariant holds.
- The CPU copy fallback paths are untouched.
- Stable backport considerations
- This is a clear correctness and robustness fix, not a feature.
- Low regression risk if the stable branch also creates the backup BO
with the parent’s `dma_resv` (as shown by the use of
`xe_bo_init_locked(..., bo->ttm.base.resv, ...)` in
`drivers/gpu/drm/xe/xe_bo.c:1309`).
- If a stable branch diverges and the backup BO does not share the
resv, this patch would need adjustment (i.e., keep dual reserve/add
in that case). The added `xe_assert` helps surface such mismatches
during testing.
Conclusion: This commit fixes a real bug (duplicate fence reserve/add
and duplicate dependency scanning on a shared `dma_resv`) with a
minimal, well-scoped change. It aligns with stable rules (important
bugfix, low risk, contained), so it should be backported.
drivers/gpu/drm/xe/xe_bo.c | 13 +------------
drivers/gpu/drm/xe/xe_migrate.c | 2 +-
2 files changed, 2 insertions(+), 13 deletions(-)
diff --git a/drivers/gpu/drm/xe/xe_bo.c b/drivers/gpu/drm/xe/xe_bo.c
index d07e23eb1a54d..5a61441d68af5 100644
--- a/drivers/gpu/drm/xe/xe_bo.c
+++ b/drivers/gpu/drm/xe/xe_bo.c
@@ -1242,14 +1242,11 @@ int xe_bo_evict_pinned(struct xe_bo *bo)
else
migrate = mem_type_to_migrate(xe, bo->ttm.resource->mem_type);
+ xe_assert(xe, bo->ttm.base.resv == backup->ttm.base.resv);
ret = dma_resv_reserve_fences(bo->ttm.base.resv, 1);
if (ret)
goto out_backup;
- ret = dma_resv_reserve_fences(backup->ttm.base.resv, 1);
- if (ret)
- goto out_backup;
-
fence = xe_migrate_copy(migrate, bo, backup, bo->ttm.resource,
backup->ttm.resource, false);
if (IS_ERR(fence)) {
@@ -1259,8 +1256,6 @@ int xe_bo_evict_pinned(struct xe_bo *bo)
dma_resv_add_fence(bo->ttm.base.resv, fence,
DMA_RESV_USAGE_KERNEL);
- dma_resv_add_fence(backup->ttm.base.resv, fence,
- DMA_RESV_USAGE_KERNEL);
dma_fence_put(fence);
} else {
ret = xe_bo_vmap(backup);
@@ -1338,10 +1333,6 @@ int xe_bo_restore_pinned(struct xe_bo *bo)
if (ret)
goto out_unlock_bo;
- ret = dma_resv_reserve_fences(backup->ttm.base.resv, 1);
- if (ret)
- goto out_unlock_bo;
-
fence = xe_migrate_copy(migrate, backup, bo,
backup->ttm.resource, bo->ttm.resource,
false);
@@ -1352,8 +1343,6 @@ int xe_bo_restore_pinned(struct xe_bo *bo)
dma_resv_add_fence(bo->ttm.base.resv, fence,
DMA_RESV_USAGE_KERNEL);
- dma_resv_add_fence(backup->ttm.base.resv, fence,
- DMA_RESV_USAGE_KERNEL);
dma_fence_put(fence);
} else {
ret = xe_bo_vmap(backup);
diff --git a/drivers/gpu/drm/xe/xe_migrate.c b/drivers/gpu/drm/xe/xe_migrate.c
index 2a627ed64b8f8..ba9b8590eccb2 100644
--- a/drivers/gpu/drm/xe/xe_migrate.c
+++ b/drivers/gpu/drm/xe/xe_migrate.c
@@ -901,7 +901,7 @@ struct dma_fence *xe_migrate_copy(struct xe_migrate *m,
if (!fence) {
err = xe_sched_job_add_deps(job, src_bo->ttm.base.resv,
DMA_RESV_USAGE_BOOKKEEP);
- if (!err && src_bo != dst_bo)
+ if (!err && src_bo->ttm.base.resv != dst_bo->ttm.base.resv)
err = xe_sched_job_add_deps(job, dst_bo->ttm.base.resv,
DMA_RESV_USAGE_BOOKKEEP);
if (err)
--
2.51.0
On Tue, 21 Oct 2025 17:20:22 +1300, Barry Song wrote:
> From: Barry Song <v-songbaohua(a)oppo.com>
>
> We can allocate high-order pages, but mapping them one by
> one is inefficient. This patch changes the code to map
> as large a chunk as possible. The code looks somewhat
>
> [ ... ]
Reviewed-by: Maxime Ripard <mripard(a)kernel.org>
Thanks!
Maxime
Add a helper for retrieving a pointer to the struct dma_resv for a given GEM object. We
also introduce it in a new trait, BaseObjectPrivate, which we automatically
implement for all gem objects and don't expose to users outside of the
crate.
Signed-off-by: Lyude Paul <lyude(a)redhat.com>
---
rust/kernel/drm/gem/mod.rs | 12 ++++++++++++
1 file changed, 12 insertions(+)
diff --git a/rust/kernel/drm/gem/mod.rs b/rust/kernel/drm/gem/mod.rs
index 32bff2e8463f4..67813cfb0db42 100644
--- a/rust/kernel/drm/gem/mod.rs
+++ b/rust/kernel/drm/gem/mod.rs
@@ -200,6 +200,18 @@ fn create_mmap_offset(&self) -> Result<u64> {
impl<T: IntoGEMObject> BaseObject for T {}
+/// Crate-private base operations shared by all GEM object classes.
+#[expect(unused)]
+pub(crate) trait BaseObjectPrivate: IntoGEMObject {
+    /// Return a pointer to this object's dma_resv.
+    fn raw_dma_resv(&self) -> *mut bindings::dma_resv {
+        // SAFETY: `as_raw()` always returns a valid pointer to the base DRM GEM object.
+        unsafe { (*self.as_raw()).resv }
+    }
+}
+
+impl<T: IntoGEMObject> BaseObjectPrivate for T {}
+
/// A base GEM object.
///
/// Invariants
--
2.51.0
On Tue, Oct 21, 2025 at 4:43 PM Matthew Brost <matthew.brost(a)intel.com> wrote:
>
> On Sat, Oct 18, 2025 at 12:42:30AM -0700, Matthew Brost wrote:
> > On Fri, Oct 17, 2025 at 11:43:51PM -0700, Matthew Brost wrote:
> > > On Fri, Oct 17, 2025 at 10:37:46AM -0500, Rob Herring wrote:
> > > > On Thu, Oct 16, 2025 at 11:25:34PM -0700, Matthew Brost wrote:
> > > > > On Thu, Oct 16, 2025 at 04:06:05PM -0500, Rob Herring (Arm) wrote:
> > > > > > Add a driver for Arm Ethos-U65/U85 NPUs. The Ethos-U NPU has a
> > > > > > relatively simple interface with a single command stream to describe
> > > > > > buffers, operation settings, and network operations. It supports up to 8
> > > > > > memory regions (though no h/w bounds on a region). The Ethos NPUs
> > > > > > are designed to use an SRAM for scratch memory. Region 2 is reserved
> > > > > > for SRAM (like the downstream driver stack and compiler). Userspace
> > > > > > doesn't need access to the SRAM.
> > > >
> > > > Thanks for the review.
> > > >
> > > > [...]
> > > >
> > > > > > +static struct dma_fence *ethosu_job_run(struct drm_sched_job *sched_job)
> > > > > > +{
> > > > > > + struct ethosu_job *job = to_ethosu_job(sched_job);
> > > > > > + struct ethosu_device *dev = job->dev;
> > > > > > + struct dma_fence *fence = NULL;
> > > > > > + int ret;
> > > > > > +
> > > > > > + if (unlikely(job->base.s_fence->finished.error))
> > > > > > + return NULL;
> > > > > > +
> > > > > > + fence = ethosu_fence_create(dev);
> > > > >
> > > > > Another reclaim issue: ethosu_fence_create allocates memory using
> > > > > GFP_KERNEL. Since we're already in the DMA fence signaling path
> > > > > (reclaim), this can lead to a deadlock.
> > > > >
> > > > > Without too much thought, you likely want to move this allocation to
> > > > > ethosu_job_do_push, but before taking dev->sched_lock or calling
> > > > > drm_sched_job_arm.
> > > > >
> > > > > We really should fix the DRM scheduler work queue to be tainted with
> > > > > reclaim. If I recall correctly, we'd need to update the work queue
> > > > > layer. Let me look into that—I've seen this type of bug several times,
> > > > > and lockdep should be able to catch it.
> > > >
> > > > Likely the rocket driver suffers from the same issues...
> > > >
> > >
> > > I am not surprised by this statement.
> > >
> > > > >
> > > > > > + if (IS_ERR(fence))
> > > > > > + return fence;
> > > > > > +
> > > > > > + if (job->done_fence)
> > > > > > + dma_fence_put(job->done_fence);
> > > > > > + job->done_fence = dma_fence_get(fence);
> > > > > > +
> > > > > > + ret = pm_runtime_get_sync(dev->base.dev);
> > > > >
> > > > > I haven't looked at your PM design, but this generally looks quite
> > > > > dangerous with respect to reclaim. For example, if your PM resume paths
> > > > > allocate memory or take locks that allocate memory underneath, you're
> > > > > likely to run into issues.
> > > > >
> > > > > A better approach would be to attach a PM reference to your job upon
> > > > > creation and release it upon job destruction. That would be safer and
> > > > > save you headaches in the long run.
> > > >
> > > > Our PM is nothing more than clock enable/disable and register init.
> > > >
> > > > If the runtime PM API doesn't work and needs special driver wrappers,
> > > > then I'm inclined to just not use it and manage clocks directly (as
> > > > that's all it is doing).
> > > >
> > >
> > > Yes, then you’re probably fine. More complex drivers can do all sorts of
> > > things during a PM wake, which is why PM wakes should generally be the
> > > outermost layer. I still suggest, to future-proof your code, that you
> > > move the PM reference to an outer layer.
> > >
> >
> > Also, taking a PM reference in a function call — as opposed to tying it
> > to an object's lifetime — is risky. It can quickly lead to imbalances in
> > PM references if things go sideways or function calls become unbalanced.
> > Depending on how your driver uses the DRM scheduler, this seems like a
> > real possibility.
> >
> > Matt
> >
> > > > >
> > > > > This is what we do in Xe [1] [2].
> > > > >
> > > > > Also, in general, this driver has been reviewed (RB’d), but it's not
> > > > > great that I spotted numerous issues within just five minutes. I suggest
> > > > > taking a step back and thoroughly evaluating everything this driver is
> > > > > doing.
> > > >
> > > > Well, if it is hard to get simple drivers right, then it's a problem
> > > > with the subsystem APIs IMO.
> > > >
> > >
> > > Yes, agreed. We should have assertions and lockdep annotations in place
> > > to catch driver-side misuses. This is the second driver I’ve randomly
> > > looked at over the past year that has broken DMA fencing and reclaim
> > > rules. I’ll take an action item to fix this in the DRM scheduler, but
> > > I’m afraid I’ll likely break multiple drivers in the process as misuses
> > > / lockdep will complain.
>
> I've posted a series [1] for the DRM scheduler which will complain about the
> things I've pointed out here.
Thanks. I ran v6 with them and no lockdep splats.
Rob
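(For reference, the ordering Matt suggests, doing every allocation that
can touch reclaim before drm_sched_job_arm(), looks roughly like this;
hypothetical sketch, not the actual ethosu code:)

static int ethosu_job_do_push(struct ethosu_job *job)
{
	struct dma_fence *fence;

	/* allocate while we are still allowed to block on reclaim */
	fence = ethosu_fence_create(job->dev);
	if (IS_ERR(fence))
		return PTR_ERR(fence);
	job->done_fence = fence;

	mutex_lock(&job->dev->sched_lock);
	/* fence signalling critical section starts at arm time */
	drm_sched_job_arm(&job->base);
	drm_sched_entity_push_job(&job->base);
	mutex_unlock(&job->dev->sched_lock);

	return 0;
}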