Linaro-mm-sig October 2025

linaro-mm-sig@lists.linaro.org

27 participants
63 discussions

Re: [RFC v2 0/8] dma-buf: Add support for mapping dmabufs via interconnects

by Christian König

On 10/31/25 06:15, Kasireddy, Vivek wrote: > Hi Jason, > >> Subject: Re: [RFC v2 0/8] dma-buf: Add support for mapping dmabufs via >> interconnects >> >> On Thu, Oct 30, 2025 at 06:17:11AM +0000, Kasireddy, Vivek wrote: >>> It mostly looks OK to me but there are a few things that I want to discuss, >>> after briefly looking at the patches in your branch: >>> - I am wondering what is the benefit of the SGT compatibility stuff especially >>> when Christian suggested that he'd like to see SGT usage gone from >>> dma-buf >> >> I think to get rid of SGT we do need to put it in a little well >> defined box and then create alternatives and remove things using >> SGT. This is a long journey, and I think this is the first step. >> >> If SGT is some special case it will be harder to excise. >> >> So the next steps would be to make all the exporters directly declare >> a SGT and then remove the SGT related ops from dma_ops itself and >> remove the compat sgt in the attach logic. This is not hard, it is all >> simple mechanical work. > IMO, this SGT compatibility stuff should ideally be a separate follow-on > effort (and patch series) that would also probably include updates to > various drivers to add the SGT mapping type. Nope, just the other way around. In other words the SGT compatibility is a pre-requisite. We should first demonstrate with existing drivers that the new interface works and does what it promised to do and then extend it with new functionality. Regards, Christian. > >> >> This way the only compat requirement is to automatically give an >> import match list for a SGT only importer which is very little code in >> the core. >> >> The point is we make the SGT stuff nonspecial and fully aligned with >> the mapping type in small steps. This way neither importer nor >> exporter should have any special code to deal with interworking. >> >> To remove SGT we'd want to teach the core code how to create some kind >> of conversion mapping type, eg exporter uses SGT importer uses NEW so >> the magic conversion mapping type does the adapatation. >> >> In this way we can convert importers and exporters to use NEW in any >> order and they still interwork with each other. >> >>> eventually. Also, if matching fails, IMO, indicating that to the >>> importer (allow_ic) and having both exporter/importer fallback to >>> the current legacy mechanism would be simpler than the SGT >>> compatibility stuff. >> >> I don't want to have three paths in importers. >> >> If the importer supports SGT it should declare it in a match and the >> core code should always return a SGT match for the importer to use >> >> The importer should not have to code 'oh it is sgt but it somehow a >> little different' via an allow_ic type idea. >> >>> - Also, I thought PCIe P2P (along with SGT) use-cases are already well >> handled >>> by the existing map_dma_buf() and other interfaces. So, it might be >> confusing >>> if the newer interfaces also provide a mechanism to handle P2P although a >>> bit differently. I might be missing something here but shouldn't the existing >>> allow_peer2peer and other related stuff be left alone? >> >> P2P is part of SGT, it gets pulled into the SGT stuff as steps toward >> isolating SGT properly. Again as we move things to use native SGT >> exporters we would remove the exporter related allow_peer2peer items >> when they become unused. >> >>> - You are also adding custom attach/detach ops for each mapping_type. I >> think >>> it makes sense to reuse existing attach/detach ops if possible and initiate >> the >>> matching process from there, at-least initially. >> >> I started there, but as soon as I went to adding PAL I realized the >> attach/detach logic was completely different for each of the mapping >> types. So this is looking alot simpler. >> >> If the driver wants to share the same attach/detach ops for some of >> its mapping types then it can just set the same function pointer to >> all of them and pick up the mapping type from the attach->map_type. >> >>> - Looks like your design doesn't call for a dma_buf_map_interconnect() or >> other >>> similar helpers provided by dma-buf core that the importers can use. Is that >>> because the return type would not be known to the core? >> >> I don't want to have a single shared 'map' operation, that is the >> whole point of this design. Each mapping type has its own ops, own >> types, own function signatures that the client calls directly. >> >> No more type confusion or trying to abuse phys_addr_t, dma_addr_t, or >> scatterlist for in appropriate things. If your driver wants something >> special, like IOV, then give it proper clear types so it is >> understandable. >> >>> - And, just to confirm, with your design if I want to add a new interconnect/ >>> mapping_type (not just IOV but in general), all that is needed is to provide >> custom >>> attach/detach, match ops and one or more ops to map/unmap the address >> list >>> right? Does this mean that the role of dma-buf core would be limited to just >>> match and the exporters are expected to do most of the heavy lifting and >>> checking for stuff like dynamic importers, resv lock held, etc? >> >> I expect the core code would continue to provide wrappers and helpers >> to call the ops that can do any required common stuff. >> >> However, keep in mind, when the importer moves to use mapping type it >> also must be upgraded to use the dynamic importer flow as this API >> doesn't support non-dynamic importers using mapping type. >> >> I will add some of these remarks to the commit messages.. > Sounds good. I'll start testing/working on IOV interconnect patches based on > your design. > > Thanks, > Vivek >> >> Thanks! >> Jason

4 months

Re: [PATCH v2] drm/sched: Fix deadlock in drm_sched_entity_kill_jobs_cb

by Christian König

On 10/31/25 12:50, Tvrtko Ursulin wrote: > > On 31/10/2025 09:07, Pierre-Eric Pelloux-Prayer wrote: >> The Mesa issue referenced below pointed out a possible deadlock: >> >> [ 1231.611031] Possible interrupt unsafe locking scenario: >> >> [ 1231.611033] CPU0 CPU1 >> [ 1231.611034] ---- ---- >> [ 1231.611035] lock(&xa->xa_lock#17); >> [ 1231.611038] local_irq_disable(); >> [ 1231.611039] lock(&fence->lock); >> [ 1231.611041] lock(&xa->xa_lock#17); >> [ 1231.611044] <Interrupt> >> [ 1231.611045] lock(&fence->lock); >> [ 1231.611047] >> *** DEADLOCK *** >> >> In this example, CPU0 would be any function accessing job->dependencies >> through the xa_* functions that doesn't disable interrupts (eg: >> drm_sched_job_add_dependency, drm_sched_entity_kill_jobs_cb). >> >> CPU1 is executing drm_sched_entity_kill_jobs_cb as a fence signalling >> callback so in an interrupt context. It will deadlock when trying to >> grab the xa_lock which is already held by CPU0. >> >> Replacing all xa_* usage by their xa_*_irq counterparts would fix >> this issue, but Christian pointed out another issue: dma_fence_signal >> takes fence.lock and so does dma_fence_add_callback. >> >> dma_fence_signal() // locks f1.lock >> -> drm_sched_entity_kill_jobs_cb() >> -> foreach dependencies >> -> dma_fence_add_callback() // locks f2.lock >> >> This will deadlock if f1 and f2 share the same spinlock. > > Is it possible to hit this case? > > Same lock means same execution timeline Nope, exactly that is incorrect. It's completely up to the implementation what they use this lock for. >, which should mean dependency should have been squashed in drm_sched_job_add_dependency(), no? This makes it less likely, but not impossible to trigger. Regards, Christian. > > Or would sharing the lock but not sharing the entity->fence_context be considered legal? It would be surprising at least. > > Also, would anyone have time to add a kunit test? ;) > > Regards, > > Tvrtko > >> To fix both issues, the code iterating on dependencies and re-arming them >> is moved out to drm_sched_entity_kill_jobs_work. >> >> Link: https://gitlab.freedesktop.org/mesa/mesa/-/issues/13908 >> Reported-by: Mikhail Gavrilov <mikhail.v.gavrilov(a)gmail.com> >> Suggested-by: Christian König <christian.koenig(a)amd.com> >> Reviewed-by: Christian König <christian.koenig(a)amd.com> >> Signed-off-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer(a)amd.com> >> --- >> drivers/gpu/drm/scheduler/sched_entity.c | 34 +++++++++++++----------- >> 1 file changed, 19 insertions(+), 15 deletions(-) >> >> diff --git a/drivers/gpu/drm/scheduler/sched_entity.c b/drivers/gpu/drm/scheduler/sched_entity.c >> index c8e949f4a568..fe174a4857be 100644 >> --- a/drivers/gpu/drm/scheduler/sched_entity.c >> +++ b/drivers/gpu/drm/scheduler/sched_entity.c >> @@ -173,26 +173,15 @@ int drm_sched_entity_error(struct drm_sched_entity *entity) >> } >> EXPORT_SYMBOL(drm_sched_entity_error); >> +static void drm_sched_entity_kill_jobs_cb(struct dma_fence *f, >> + struct dma_fence_cb *cb); >> + >> static void drm_sched_entity_kill_jobs_work(struct work_struct *wrk) >> { >> struct drm_sched_job *job = container_of(wrk, typeof(*job), work); >> - >> - drm_sched_fence_scheduled(job->s_fence, NULL); >> - drm_sched_fence_finished(job->s_fence, -ESRCH); >> - WARN_ON(job->s_fence->parent); >> - job->sched->ops->free_job(job); >> -} >> - >> -/* Signal the scheduler finished fence when the entity in question is killed. */ >> -static void drm_sched_entity_kill_jobs_cb(struct dma_fence *f, >> - struct dma_fence_cb *cb) >> -{ >> - struct drm_sched_job *job = container_of(cb, struct drm_sched_job, >> - finish_cb); >> + struct dma_fence *f; >> unsigned long index; >> - dma_fence_put(f); >> - >> /* Wait for all dependencies to avoid data corruptions */ >> xa_for_each(&job->dependencies, index, f) { >> struct drm_sched_fence *s_fence = to_drm_sched_fence(f); >> @@ -220,6 +209,21 @@ static void drm_sched_entity_kill_jobs_cb(struct dma_fence *f, >> dma_fence_put(f); >> } >> + drm_sched_fence_scheduled(job->s_fence, NULL); >> + drm_sched_fence_finished(job->s_fence, -ESRCH); >> + WARN_ON(job->s_fence->parent); >> + job->sched->ops->free_job(job); >> +} >> + >> +/* Signal the scheduler finished fence when the entity in question is killed. */ >> +static void drm_sched_entity_kill_jobs_cb(struct dma_fence *f, >> + struct dma_fence_cb *cb) >> +{ >> + struct drm_sched_job *job = container_of(cb, struct drm_sched_job, >> + finish_cb); >> + >> + dma_fence_put(f); >> + >> INIT_WORK(&job->work, drm_sched_entity_kill_jobs_work); >> schedule_work(&job->work); >> } >

4 months

Re: [PATCH v5 9/9] vfio/pci: Add dma-buf export support for MMIO regions

by Leon Romanovsky

On Thu, Oct 30, 2025 at 02:38:36PM -0600, Alex Williamson wrote: > On Mon, 13 Oct 2025 18:26:11 +0300 > Leon Romanovsky <leon(a)kernel.org> wrote: > > diff --git a/drivers/vfio/pci/vfio_pci_core.c b/drivers/vfio/pci/vfio_pci_core.c > > index fe247d0e2831..56b1320238a9 100644 > > --- a/drivers/vfio/pci/vfio_pci_core.c > > +++ b/drivers/vfio/pci/vfio_pci_core.c > > @@ -1511,6 +1520,19 @@ int vfio_pci_core_ioctl_feature(struct vfio_device *device, u32 flags, > > return vfio_pci_core_pm_exit(vdev, flags, arg, argsz); > > case VFIO_DEVICE_FEATURE_PCI_VF_TOKEN: > > return vfio_pci_core_feature_token(vdev, flags, arg, argsz); > > + case VFIO_DEVICE_FEATURE_DMA_BUF: > > + if (device->ops->ioctl != vfio_pci_core_ioctl) > > + /* > > + * Devices that overwrite general .ioctl() callback > > + * usually do it to implement their own > > + * VFIO_DEVICE_GET_REGION_INFO handlerm and they present > > Typo, "handlerm" Thanks, this part of code is going to be different in v6. > <...> > > @@ -2482,6 +2506,10 @@ static int vfio_pci_dev_set_hot_reset(struct vfio_device_set *dev_set, > > > > ret = pci_reset_bus(pdev); > > > > + list_for_each_entry(vdev, &dev_set->device_list, vdev.dev_set_list) > > + if (__vfio_pci_memory_enabled(vdev)) > > + vfio_pci_dma_buf_move(vdev, false); > > + > > vdev = list_last_entry(&dev_set->device_list, > > struct vfio_pci_core_device, vdev.dev_set_list); > > > > This needs to be placed in the existing undo loop with the up_write(), > otherwise it can be missed in the error case. I'll move, but it caused me to wonder what did you want to achieve with this "vdev = list_last_entry ..." line? vdev is overwritten immediately after that line. > > > diff --git a/drivers/vfio/pci/vfio_pci_dmabuf.c b/drivers/vfio/pci/vfio_pci_dmabuf.c > > new file mode 100644 > > index 000000000000..eaba010777f3 > > --- /dev/null > > +++ b/drivers/vfio/pci/vfio_pci_dmabuf.c > > +static unsigned int calc_sg_nents(struct vfio_pci_dma_buf *priv, > > + struct dma_iova_state *state) > > +{ > > + struct phys_vec *phys_vec = priv->phys_vec; > > + unsigned int nents = 0; > > + u32 i; > > + > > + if (!state || !dma_use_iova(state)) > > + for (i = 0; i < priv->nr_ranges; i++) > > + nents += DIV_ROUND_UP(phys_vec[i].len, UINT_MAX); > > + else > > + /* > > + * In IOVA case, there is only one SG entry which spans > > + * for whole IOVA address space, but we need to make sure > > + * that it fits sg->length, maybe we need more. > > + */ > > + nents = DIV_ROUND_UP(priv->size, UINT_MAX); > > I think we're arguably running afoul of the coding style standard here > that this is not a single simple statement and should use braces. > <...> > > +err_unmap_dma: > > + if (!i || !state) > > + ; /* Do nothing */ > > + else if (dma_use_iova(state)) > > + dma_iova_destroy(attachment->dev, state, mapped_len, dir, > > + attrs); > > + else > > + for_each_sgtable_dma_sg(sgt, sgl, i) > > + dma_unmap_phys(attachment->dev, sg_dma_address(sgl), > > + sg_dma_len(sgl), dir, attrs); > > Same, here for braces. > <...> > > + if (!state) > > + ; /* Do nothing */ > > + else if (dma_use_iova(state)) > > + dma_iova_destroy(attachment->dev, state, priv->size, dir, > > + attrs); > > + else > > + for_each_sgtable_dma_sg(sgt, sgl, i) > > + dma_unmap_phys(attachment->dev, sg_dma_address(sgl), > > + sg_dma_len(sgl), dir, attrs); > > + > > Here too. I will change it, but it is worth to admit that I'm consistent in my coding style. > > > + sg_free_table(sgt); > > + kfree(sgt); > > +} > ... > > diff --git a/include/uapi/linux/vfio.h b/include/uapi/linux/vfio.h > > index 75100bf009ba..63214467c875 100644 > > --- a/include/uapi/linux/vfio.h > > +++ b/include/uapi/linux/vfio.h > > @@ -1478,6 +1478,31 @@ struct vfio_device_feature_bus_master { > > }; > > #define VFIO_DEVICE_FEATURE_BUS_MASTER 10 > > > > +/** > > + * Upon VFIO_DEVICE_FEATURE_GET create a dma_buf fd for the > > + * regions selected. > > + * > > + * open_flags are the typical flags passed to open(2), eg O_RDWR, O_CLOEXEC, > > + * etc. offset/length specify a slice of the region to create the dmabuf from. > > + * nr_ranges is the total number of (P2P DMA) ranges that comprise the dmabuf. > > + * > > Probably worth noting that .flags should be zero, I see we enforce > that. Thanks, Added, thanks > > Alex >

4 months

Re: [RFC v2 0/8] dma-buf: Add support for mapping dmabufs via interconnects

by Jason Gunthorpe

On Thu, Oct 30, 2025 at 06:17:11AM +0000, Kasireddy, Vivek wrote: > It mostly looks OK to me but there are a few things that I want to discuss, > after briefly looking at the patches in your branch: > - I am wondering what is the benefit of the SGT compatibility stuff especially > when Christian suggested that he'd like to see SGT usage gone from > dma-buf I think to get rid of SGT we do need to put it in a little well defined box and then create alternatives and remove things using SGT. This is a long journey, and I think this is the first step. If SGT is some special case it will be harder to excise. So the next steps would be to make all the exporters directly declare a SGT and then remove the SGT related ops from dma_ops itself and remove the compat sgt in the attach logic. This is not hard, it is all simple mechanical work. This way the only compat requirement is to automatically give an import match list for a SGT only importer which is very little code in the core. The point is we make the SGT stuff nonspecial and fully aligned with the mapping type in small steps. This way neither importer nor exporter should have any special code to deal with interworking. To remove SGT we'd want to teach the core code how to create some kind of conversion mapping type, eg exporter uses SGT importer uses NEW so the magic conversion mapping type does the adapatation. In this way we can convert importers and exporters to use NEW in any order and they still interwork with each other. > eventually. Also, if matching fails, IMO, indicating that to the > importer (allow_ic) and having both exporter/importer fallback to > the current legacy mechanism would be simpler than the SGT > compatibility stuff. I don't want to have three paths in importers. If the importer supports SGT it should declare it in a match and the core code should always return a SGT match for the importer to use The importer should not have to code 'oh it is sgt but it somehow a little different' via an allow_ic type idea. > - Also, I thought PCIe P2P (along with SGT) use-cases are already well handled > by the existing map_dma_buf() and other interfaces. So, it might be confusing > if the newer interfaces also provide a mechanism to handle P2P although a > bit differently. I might be missing something here but shouldn't the existing > allow_peer2peer and other related stuff be left alone? P2P is part of SGT, it gets pulled into the SGT stuff as steps toward isolating SGT properly. Again as we move things to use native SGT exporters we would remove the exporter related allow_peer2peer items when they become unused. > - You are also adding custom attach/detach ops for each mapping_type. I think > it makes sense to reuse existing attach/detach ops if possible and initiate the > matching process from there, at-least initially. I started there, but as soon as I went to adding PAL I realized the attach/detach logic was completely different for each of the mapping types. So this is looking alot simpler. If the driver wants to share the same attach/detach ops for some of its mapping types then it can just set the same function pointer to all of them and pick up the mapping type from the attach->map_type. > - Looks like your design doesn't call for a dma_buf_map_interconnect() or other > similar helpers provided by dma-buf core that the importers can use. Is that > because the return type would not be known to the core? I don't want to have a single shared 'map' operation, that is the whole point of this design. Each mapping type has its own ops, own types, own function signatures that the client calls directly. No more type confusion or trying to abuse phys_addr_t, dma_addr_t, or scatterlist for in appropriate things. If your driver wants something special, like IOV, then give it proper clear types so it is understandable. > - And, just to confirm, with your design if I want to add a new interconnect/ > mapping_type (not just IOV but in general), all that is needed is to provide custom > attach/detach, match ops and one or more ops to map/unmap the address list > right? Does this mean that the role of dma-buf core would be limited to just > match and the exporters are expected to do most of the heavy lifting and > checking for stuff like dynamic importers, resv lock held, etc? I expect the core code would continue to provide wrappers and helpers to call the ops that can do any required common stuff. However, keep in mind, when the importer moves to use mapping type it also must be upgraded to use the dynamic importer flow as this API doesn't support non-dynamic importers using mapping type. I will add some of these remarks to the commit messages.. Thanks! Jason

4 months

Re: [PATCH v5 9/9] vfio/pci: Add dma-buf export support for MMIO regions

by Leon Romanovsky

On Wed, Oct 29, 2025 at 05:25:03PM -0700, Samiullah Khawaja wrote: > On Mon, Oct 13, 2025 at 8:27 AM Leon Romanovsky <leon(a)kernel.org> wrote: > > > > From: Leon Romanovsky <leonro(a)nvidia.com> > > > > Add support for exporting PCI device MMIO regions through dma-buf, > > enabling safe sharing of non-struct page memory with controlled > > lifetime management. This allows RDMA and other subsystems to import > > dma-buf FDs and build them into memory regions for PCI P2P operations. > > > > The implementation provides a revocable attachment mechanism using > > dma-buf move operations. MMIO regions are normally pinned as BARs > > don't change physical addresses, but access is revoked when the VFIO > > device is closed or a PCI reset is issued. This ensures kernel > > self-defense against potentially hostile userspace. > > > > Signed-off-by: Jason Gunthorpe <jgg(a)nvidia.com> > > Signed-off-by: Vivek Kasireddy <vivek.kasireddy(a)intel.com> > > Signed-off-by: Leon Romanovsky <leonro(a)nvidia.com> > > --- > > drivers/vfio/pci/Kconfig | 3 + > > drivers/vfio/pci/Makefile | 2 + > > drivers/vfio/pci/vfio_pci_config.c | 22 +- > > drivers/vfio/pci/vfio_pci_core.c | 28 ++ > > drivers/vfio/pci/vfio_pci_dmabuf.c | 446 +++++++++++++++++++++++++++++ > > drivers/vfio/pci/vfio_pci_priv.h | 23 ++ > > include/linux/vfio_pci_core.h | 1 + > > include/uapi/linux/vfio.h | 25 ++ > > 8 files changed, 546 insertions(+), 4 deletions(-) > > create mode 100644 drivers/vfio/pci/vfio_pci_dmabuf.c <...> > > +void vfio_pci_dma_buf_move(struct vfio_pci_core_device *vdev, bool revoked) > > +{ > > + struct vfio_pci_dma_buf *priv; > > + struct vfio_pci_dma_buf *tmp; > > + > > + lockdep_assert_held_write(&vdev->memory_lock); > > + > > + list_for_each_entry_safe(priv, tmp, &vdev->dmabufs, dmabufs_elm) { > > + if (!get_file_active(&priv->dmabuf->file)) > > + continue; > > + > > + if (priv->revoked != revoked) { > > + dma_resv_lock(priv->dmabuf->resv, NULL); > > + priv->revoked = revoked; > > + dma_buf_move_notify(priv->dmabuf); > > I think this should only be called when revoked is true, otherwise > this will be calling move_notify on the already revoked dmabuf > attachments. This case is protected by "if (priv->revoked)" check both in vfio_pci_dma_buf_map and vfio_pci_dma_buf_attach. They will prevent DMABUF recreation if revoked is false. VTW, please trim your replies, it is time consuming to find your reply among 600 lines of unrelated text. Thanks > > + dma_resv_unlock(priv->dmabuf->resv); > > + } > > + dma_buf_put(priv->dmabuf); > > + } > > +}

4 months

Re: [PATCH v5 9/9] vfio/pci: Add dma-buf export support for MMIO regions

by Leon Romanovsky

On Wed, Oct 29, 2025, at 18:50, Alex Mastro wrote: > On Mon, Oct 13, 2025 at 06:26:11PM +0300, Leon Romanovsky wrote: >> + /* >> + * dma_buf_fd() consumes the reference, when the file closes the dmabuf >> + * will be released. >> + */ >> + return dma_buf_fd(priv->dmabuf, get_dma_buf.open_flags); > > I think this still needs to unwind state on fd allocation error. Reference > ownership is only transferred on success. Yes, you are correct, i need to call to dma_buf_put() in case of error. I will fix. Thanks > >> + >> +err_dev_put: >> + vfio_device_put_registration(&vdev->vdev); >> +err_free_phys: >> + kfree(priv->phys_vec); >> +err_free_priv: >> + kfree(priv); >> +err_free_ranges: >> + kfree(dma_ranges); >> + return ret; >> +}

4 months

Re: [PATCH v1] drm/sched: fix deadlock in drm_sched_entity_kill_jobs_cb

by Christian König

On 10/29/25 10:11, Pierre-Eric Pelloux-Prayer wrote: > https://gitlab.freedesktop.org/mesa/mesa/-/issues/13908 pointed out > a possible deadlock: > > [ 1231.611031] Possible interrupt unsafe locking scenario: > > [ 1231.611033] CPU0 CPU1 > [ 1231.611034] ---- ---- > [ 1231.611035] lock(&xa->xa_lock#17); > [ 1231.611038] local_irq_disable(); > [ 1231.611039] lock(&fence->lock); > [ 1231.611041] lock(&xa->xa_lock#17); > [ 1231.611044] <Interrupt> > [ 1231.611045] lock(&fence->lock); > [ 1231.611047] > *** DEADLOCK *** > > My initial fix was to replace xa_erase by xa_erase_irq, but Christian > pointed out that calling dma_fence_add_callback from a callback can > also deadlock if the signalling fence and the one passed to > dma_fence_add_callback share the same lock. > > To fix both issues, the code iterating on dependencies and re-arming them > is moved out to drm_sched_entity_kill_jobs_work. > > Suggested-by: Christian König <christian.koenig(a)amd.com> > Signed-off-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer(a)amd.com> Reviewed-by: Christian König <christian.koenig(a)amd.com> > --- > drivers/gpu/drm/scheduler/sched_entity.c | 34 +++++++++++++----------- > 1 file changed, 19 insertions(+), 15 deletions(-) > > diff --git a/drivers/gpu/drm/scheduler/sched_entity.c b/drivers/gpu/drm/scheduler/sched_entity.c > index c8e949f4a568..fe174a4857be 100644 > --- a/drivers/gpu/drm/scheduler/sched_entity.c > +++ b/drivers/gpu/drm/scheduler/sched_entity.c > @@ -173,26 +173,15 @@ int drm_sched_entity_error(struct drm_sched_entity *entity) > } > EXPORT_SYMBOL(drm_sched_entity_error); > > +static void drm_sched_entity_kill_jobs_cb(struct dma_fence *f, > + struct dma_fence_cb *cb); > + > static void drm_sched_entity_kill_jobs_work(struct work_struct *wrk) > { > struct drm_sched_job *job = container_of(wrk, typeof(*job), work); > - > - drm_sched_fence_scheduled(job->s_fence, NULL); > - drm_sched_fence_finished(job->s_fence, -ESRCH); > - WARN_ON(job->s_fence->parent); > - job->sched->ops->free_job(job); > -} > - > -/* Signal the scheduler finished fence when the entity in question is killed. */ > -static void drm_sched_entity_kill_jobs_cb(struct dma_fence *f, > - struct dma_fence_cb *cb) > -{ > - struct drm_sched_job *job = container_of(cb, struct drm_sched_job, > - finish_cb); > + struct dma_fence *f; > unsigned long index; > > - dma_fence_put(f); > - > /* Wait for all dependencies to avoid data corruptions */ > xa_for_each(&job->dependencies, index, f) { > struct drm_sched_fence *s_fence = to_drm_sched_fence(f); > @@ -220,6 +209,21 @@ static void drm_sched_entity_kill_jobs_cb(struct dma_fence *f, > dma_fence_put(f); > } > > + drm_sched_fence_scheduled(job->s_fence, NULL); > + drm_sched_fence_finished(job->s_fence, -ESRCH); > + WARN_ON(job->s_fence->parent); > + job->sched->ops->free_job(job); > +} > + > +/* Signal the scheduler finished fence when the entity in question is killed. */ > +static void drm_sched_entity_kill_jobs_cb(struct dma_fence *f, > + struct dma_fence_cb *cb) > +{ > + struct drm_sched_job *job = container_of(cb, struct drm_sched_job, > + finish_cb); > + > + dma_fence_put(f); > + > INIT_WORK(&job->work, drm_sched_entity_kill_jobs_work); > schedule_work(&job->work); > }

4 months

Re: [RFC v2 0/8] dma-buf: Add support for mapping dmabufs via interconnects

by Jason Gunthorpe

On Wed, Oct 29, 2025 at 11:25:34AM +0200, Leon Romanovsky wrote: > On Tue, Oct 28, 2025 at 09:27:26PM -0300, Jason Gunthorpe wrote: > > On Sun, Oct 26, 2025 at 09:44:12PM -0700, Vivek Kasireddy wrote: > > > In a typical dma-buf use case, a dmabuf exporter makes its buffer > > > buffer available to an importer by mapping it using DMA APIs > > > such as dma_map_sgtable() or dma_map_resource(). However, this > > > is not desirable in some cases where the exporter and importer > > > are directly connected via a physical or virtual link (or > > > interconnect) and the importer can access the buffer without > > > having it DMA mapped. > > > > I think my explanation was not so clear, I spent a few hours and typed > > in what I was thinking about here: > > > > https://github.com/jgunthorpe/linux/commits/dmabuf_map_type > > > > I didn't type in the last patch for iommufd side, hopefully it is > > clear enough. Adding iov should follow the pattern of the "physical > > address list" patch. > > > > I think the use of EXPORT_SYMBOL_FOR_MODULES() to lock down the > > physical addres list mapping type to iommufd is clever and I'm hoping > > addresses Chrsitian's concerns about abuse. > > > > Single GPU drivers can easilly declare their own mapping type for > > their own private interconnect without needing to change the core > > code. > > > > This seems to be fairly straightforward and reasonably type safe.. > > It makes me wonder what am I supposed to do with my series now [1]? > How do you see submission plan now? > > [1] https://lore.kernel.org/all/cover.1760368250.git.leon@kernel.org/ IMHO that series needs the small tweaks and should go this merge window, ideally along with the iommufd half. I think this thread is a topic for the next cycle, I expect it will take some time to converge on the dmabuf core changes, and adapting your series is quite simple. Jason

4 months

Re: [RFC v2 0/8] dma-buf: Add support for mapping dmabufs via interconnects

by Jason Gunthorpe

On Sun, Oct 26, 2025 at 09:44:12PM -0700, Vivek Kasireddy wrote: > In a typical dma-buf use case, a dmabuf exporter makes its buffer > buffer available to an importer by mapping it using DMA APIs > such as dma_map_sgtable() or dma_map_resource(). However, this > is not desirable in some cases where the exporter and importer > are directly connected via a physical or virtual link (or > interconnect) and the importer can access the buffer without > having it DMA mapped. I think my explanation was not so clear, I spent a few hours and typed in what I was thinking about here: https://github.com/jgunthorpe/linux/commits/dmabuf_map_type I didn't type in the last patch for iommufd side, hopefully it is clear enough. Adding iov should follow the pattern of the "physical address list" patch. I think the use of EXPORT_SYMBOL_FOR_MODULES() to lock down the physical addres list mapping type to iommufd is clever and I'm hoping addresses Chrsitian's concerns about abuse. Single GPU drivers can easilly declare their own mapping type for their own private interconnect without needing to change the core code. This seems to be fairly straightforward and reasonably type safe.. What do you think? Jason

4 months

Re: [RFC 1/8] dma-buf: Add support for map/unmap APIs for interconnects

by Christian König

On 10/14/25 09:08, Vivek Kasireddy wrote: > For the map operation, the dma-buf core will create an xarray but > the exporter is expected to populate it with the interconnect > specific addresses. And, similarly for unmap, the exporter is > expected to cleanup the individual entries of the xarray. > > Cc: Jason Gunthorpe <jgg(a)nvidia.com> > Cc: Christian Koenig <christian.koenig(a)amd.com> > Cc: Sumit Semwal <sumit.semwal(a)linaro.org> > Cc: Thomas Hellström <thomas.hellstrom(a)linux.intel.com> > Cc: Simona Vetter <simona.vetter(a)ffwll.ch> > Signed-off-by: Vivek Kasireddy <vivek.kasireddy(a)intel.com> > --- > drivers/dma-buf/dma-buf.c | 68 ++++++++++++++++++++++++++++ > include/linux/dma-buf-interconnect.h | 29 ++++++++++++ > include/linux/dma-buf.h | 11 +++++ > 3 files changed, 108 insertions(+) > create mode 100644 include/linux/dma-buf-interconnect.h > > diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c > index 2bcf9ceca997..162642bd53e8 100644 > --- a/drivers/dma-buf/dma-buf.c > +++ b/drivers/dma-buf/dma-buf.c > @@ -1612,6 +1612,74 @@ void dma_buf_vunmap_unlocked(struct dma_buf *dmabuf, struct iosys_map *map) > } > EXPORT_SYMBOL_NS_GPL(dma_buf_vunmap_unlocked, "DMA_BUF"); > > +struct dma_buf_ranges * > +dma_buf_map_interconnect(struct dma_buf_attachment *attach) > +{ > + const struct dma_buf_interconnect_ops *ic_ops; > + struct dma_buf *dmabuf = attach->dmabuf; > + struct dma_buf_ranges *ranges; > + int ret; > + > + might_sleep(); > + > + if (WARN_ON(!attach || !attach->dmabuf)) > + return ERR_PTR(-EINVAL); > + > + if (!dma_buf_attachment_is_dynamic(attach)) > + return ERR_PTR(-EINVAL); > + > + if (!attach->allow_ic) > + return ERR_PTR(-EOPNOTSUPP); > + > + dma_resv_assert_held(attach->dmabuf->resv); > + > + ic_ops = dmabuf->ops->interconnect_ops; > + if (!ic_ops || !ic_ops->map_interconnect) > + return ERR_PTR(-EINVAL); > + > + ranges = kzalloc(sizeof(*ranges), GFP_KERNEL); > + if (!ranges) > + return ERR_PTR(-ENOMEM); > + > + xa_init(&ranges->ranges); > + ret = ic_ops->map_interconnect(attach, ranges); > + if (ret) > + goto err_free_ranges; > + > + return ranges; > + > +err_free_ranges: > + xa_destroy(&ranges->ranges); > + kfree(ranges); > + return ERR_PTR(ret); > +} > +EXPORT_SYMBOL_NS_GPL(dma_buf_map_interconnect, "DMA_BUF"); > + > +void dma_buf_unmap_interconnect(struct dma_buf_attachment *attach, > + struct dma_buf_ranges *ranges) > +{ > + const struct dma_buf_interconnect_ops *ic_ops; > + struct dma_buf *dmabuf = attach->dmabuf; > + > + if (WARN_ON(!attach || !attach->dmabuf || !ranges)) > + return; > + > + if (!attach->allow_ic) > + return; > + > + ic_ops = dmabuf->ops->interconnect_ops; > + if (!ic_ops || !ic_ops->unmap_interconnect) > + return; > + > + dma_resv_assert_held(attach->dmabuf->resv); > + > + ic_ops->unmap_interconnect(attach, ranges); > + > + xa_destroy(&ranges->ranges); > + kfree(ranges); > +} > +EXPORT_SYMBOL_NS_GPL(dma_buf_unmap_interconnect, "DMA_BUF"); > + > #ifdef CONFIG_DEBUG_FS > static int dma_buf_debug_show(struct seq_file *s, void *unused) > { > diff --git a/include/linux/dma-buf-interconnect.h b/include/linux/dma-buf-interconnect.h > new file mode 100644 > index 000000000000..17504dea9691 > --- /dev/null > +++ b/include/linux/dma-buf-interconnect.h > @@ -0,0 +1,29 @@ > +/* SPDX-License-Identifier: MIT */ > + > +#ifndef __DMA_BUF_INTERCONNECT_H__ > +#define __DMA_BUF_INTERCONNECT_H__ > + > +#include <linux/xarray.h> > + > +struct dma_buf_attachment; > + > +struct dma_buf_ranges { > + struct xarray ranges; > + unsigned int nranges; > +}; Hui? How is that supposed to work? Should the exporter fill in the xarray with values? That clearly needs more description. And IIRC xarray can only contain pointers because the lower bits are used for internal handling. Some kind of iterator like interface would be preferred where you have first and next callbacks. > + > +enum dma_buf_interconnect_type { > + DMA_BUF_INTERCONNECT_NONE = 0, Let's start with a DMA_BUF_DMA_ADDR type. > +}; > + > +struct dma_buf_interconnect { > + enum dma_buf_interconnect_type type; > +}; > + > +struct dma_buf_interconnect_ops { > + int (*map_interconnect)(struct dma_buf_attachment *attach, > + struct dma_buf_ranges *ranges); > + void (*unmap_interconnect)(struct dma_buf_attachment *attach, > + struct dma_buf_ranges *ranges); > +}; Please put those directly into the dma_buf_ops structure, I don't really see a value in separating them. Additional to that I'm not sure if the "interconnect" is a good naming, essentially we want to use the new mapping functions to replace the sg_table as well. Regards, Christian. > +#endif > diff --git a/include/linux/dma-buf.h b/include/linux/dma-buf.h > index d58e329ac0e7..db91c67c00d6 100644 > --- a/include/linux/dma-buf.h > +++ b/include/linux/dma-buf.h > @@ -23,6 +23,8 @@ > #include <linux/dma-fence.h> > #include <linux/wait.h> > > +#include <linux/dma-buf-interconnect.h> > + > struct device; > struct dma_buf; > struct dma_buf_attachment; > @@ -276,6 +278,8 @@ struct dma_buf_ops { > > int (*vmap)(struct dma_buf *dmabuf, struct iosys_map *map); > void (*vunmap)(struct dma_buf *dmabuf, struct iosys_map *map); > + > + const struct dma_buf_interconnect_ops *interconnect_ops; > }; > > /** > @@ -502,7 +506,9 @@ struct dma_buf_attachment { > struct device *dev; > struct list_head node; > bool peer2peer; > + bool allow_ic; > const struct dma_buf_attach_ops *importer_ops; > + struct dma_buf_interconnect interconnect; > void *importer_priv; > void *priv; > }; > @@ -589,6 +595,11 @@ struct sg_table *dma_buf_map_attachment(struct dma_buf_attachment *, > enum dma_data_direction); > void dma_buf_unmap_attachment(struct dma_buf_attachment *, struct sg_table *, > enum dma_data_direction); > + > +struct dma_buf_ranges *dma_buf_map_interconnect(struct dma_buf_attachment *); > +void dma_buf_unmap_interconnect(struct dma_buf_attachment *, > + struct dma_buf_ranges *); > + > void dma_buf_move_notify(struct dma_buf *dma_buf); > int dma_buf_begin_cpu_access(struct dma_buf *dma_buf, > enum dma_data_direction dir);

4 months

2026

2025

2024

2023

2022

2021

2020

2019

2018

2017

2016

2015

2014

2013

2012

2011

Linaro-mm-sig October 2025