Linaro-mm-sig

linaro-mm-sig@lists.linaro.org

10 participants
3222 discussions

Re: [PATCH 13/18] drm/amdgpu: independence for the amdkfd_fence! v2

by Christian König

On 11/28/25 11:10, Philipp Stanner wrote: > On Fri, 2025-11-28 at 11:06 +0100, Christian König wrote: >> On 11/27/25 12:10, Philipp Stanner wrote: >>> On Thu, 2025-11-13 at 15:51 +0100, Christian König wrote: >>>> This should allow amdkfd_fences to outlive the amdgpu module. >>>> >>>> v2: implement Felix suggestion to lock the fence while signaling it. >>>> >>>> Signed-off-by: Christian König <christian.koenig(a)amd.com> >>>> --- >>>> >>>> > > […] > >>>> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_process.c b/drivers/gpu/drm/amd/amdkfd/kfd_process.c >>>> index a085faac9fe1..8fac70b839ed 100644 >>>> --- a/drivers/gpu/drm/amd/amdkfd/kfd_process.c >>>> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_process.c >>>> @@ -1173,7 +1173,7 @@ static void kfd_process_wq_release(struct work_struct *work) >>>> synchronize_rcu(); >>>> ef = rcu_access_pointer(p->ef); >>>> if (ef) >>>> - dma_fence_signal(ef); >>>> + amdkfd_fence_signal(ef); >>>> >>>> kfd_process_remove_sysfs(p); >>>> kfd_debugfs_remove_process(p); >>>> @@ -1990,7 +1990,6 @@ kfd_process_gpuid_from_node(struct kfd_process *p, struct kfd_node *node, >>>> static int signal_eviction_fence(struct kfd_process *p) >>>> { >>>> struct dma_fence *ef; >>>> - int ret; >>>> >>>> rcu_read_lock(); >>>> ef = dma_fence_get_rcu_safe(&p->ef); >>>> @@ -1998,10 +1997,10 @@ static int signal_eviction_fence(struct kfd_process *p) >>>> if (!ef) >>>> return -EINVAL; >>>> >>>> - ret = dma_fence_signal(ef); >>>> + amdkfd_fence_signal(ef); >>>> dma_fence_put(ef); >>>> >>>> - return ret; >>>> + return 0; >>> >>> Oh wait, that's the code I'm also touching in my return code series! >>> >>> https://lore.kernel.org/dri-devel/cef83fed-5994-4c77-962c-9c7aac9f7306@amd.… >>> >>> >>> Does this series then solve the problem Felix pointed out in >>> evict_process_worker()? >> >> No it doesn't, I wasn't aware that the higher level code actually needs the status. After all Felix is the maintainer of this part. >> >> This patch here needs to be rebased on top of yours and changed accordingly to still return the fence status correctly. >> >> But thanks for pointing that out. > > > Alright, so my (repaired, v2) status-code-removal series shall enter drm-misc-next first, and then your series here. ACK? Works for me, I just need both to re-base the amdgpu patches on top. Christian. > > > P.

1 week, 4 days

Re: [PATCH 13/18] drm/amdgpu: independence for the amdkfd_fence! v2

by Christian König

On 11/27/25 12:10, Philipp Stanner wrote: > On Thu, 2025-11-13 at 15:51 +0100, Christian König wrote: >> This should allow amdkfd_fences to outlive the amdgpu module. >> >> v2: implement Felix suggestion to lock the fence while signaling it. >> >> Signed-off-by: Christian König <christian.koenig(a)amd.com> >> --- >> drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h | 6 +++ >> .../gpu/drm/amd/amdgpu/amdgpu_amdkfd_fence.c | 39 ++++++++----------- >> drivers/gpu/drm/amd/amdkfd/kfd_process.c | 7 ++-- >> drivers/gpu/drm/amd/amdkfd/kfd_svm.c | 4 +- >> 4 files changed, 27 insertions(+), 29 deletions(-) >> >> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h >> index 8bdfcde2029b..6254cef04213 100644 >> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h >> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h >> @@ -196,6 +196,7 @@ int kfd_debugfs_kfd_mem_limits(struct seq_file *m, void *data); >> #endif >> #if IS_ENABLED(CONFIG_HSA_AMD) >> bool amdkfd_fence_check_mm(struct dma_fence *f, struct mm_struct *mm); >> +void amdkfd_fence_signal(struct dma_fence *f); >> struct amdgpu_amdkfd_fence *to_amdgpu_amdkfd_fence(struct dma_fence *f); >> void amdgpu_amdkfd_remove_all_eviction_fences(struct amdgpu_bo *bo); >> int amdgpu_amdkfd_evict_userptr(struct mmu_interval_notifier *mni, >> @@ -210,6 +211,11 @@ bool amdkfd_fence_check_mm(struct dma_fence *f, struct mm_struct *mm) >> return false; >> } >> >> +static inline >> +void amdkfd_fence_signal(struct dma_fence *f) >> +{ > > I would add a short comment here: "Empty function because …" > >> +} >> + >> static inline >> struct amdgpu_amdkfd_fence *to_amdgpu_amdkfd_fence(struct dma_fence *f) >> { >> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_fence.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_fence.c >> index 09c919f72b6c..f76c3c52a2a1 100644 >> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_fence.c >> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_fence.c >> @@ -127,29 +127,9 @@ static bool amdkfd_fence_enable_signaling(struct dma_fence *f) >> if (!svm_range_schedule_evict_svm_bo(fence)) >> return true; >> } >> - return false; >> -} >> - >> -/** >> - * amdkfd_fence_release - callback that fence can be freed >> - * >> - * @f: dma_fence >> - * >> - * This function is called when the reference count becomes zero. >> - * Drops the mm_struct reference and RCU schedules freeing up the fence. >> - */ >> -static void amdkfd_fence_release(struct dma_fence *f) >> -{ >> - struct amdgpu_amdkfd_fence *fence = to_amdgpu_amdkfd_fence(f); >> - >> - /* Unconditionally signal the fence. The process is getting >> - * terminated. >> - */ >> - if (WARN_ON(!fence)) >> - return; /* Not an amdgpu_amdkfd_fence */ >> - >> mmdrop(fence->mm); >> - kfree_rcu(f, rcu); >> + fence->mm = NULL; > > That the storage actually takes place is guaranteed by the lock taken > when calling the fence ops? > >> + return false; >> } >> >> /** >> @@ -174,9 +154,22 @@ bool amdkfd_fence_check_mm(struct dma_fence *f, struct mm_struct *mm) >> return false; >> } >> >> +void amdkfd_fence_signal(struct dma_fence *f) >> +{ >> + struct amdgpu_amdkfd_fence *fence = to_amdgpu_amdkfd_fence(f); >> + long flags; >> + >> + dma_fence_lock_irqsafe(f, flags) >> + if (fence->mm) { >> + mmdrop(fence->mm); >> + fence->mm = NULL; >> + } >> + dma_fence_signal_locked(f); >> + dma_fence_unlock_irqrestore(f, flags) >> +} >> + >> static const struct dma_fence_ops amdkfd_fence_ops = { >> .get_driver_name = amdkfd_fence_get_driver_name, >> .get_timeline_name = amdkfd_fence_get_timeline_name, >> .enable_signaling = amdkfd_fence_enable_signaling, >> - .release = amdkfd_fence_release, >> }; >> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_process.c b/drivers/gpu/drm/amd/amdkfd/kfd_process.c >> index a085faac9fe1..8fac70b839ed 100644 >> --- a/drivers/gpu/drm/amd/amdkfd/kfd_process.c >> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_process.c >> @@ -1173,7 +1173,7 @@ static void kfd_process_wq_release(struct work_struct *work) >> synchronize_rcu(); >> ef = rcu_access_pointer(p->ef); >> if (ef) >> - dma_fence_signal(ef); >> + amdkfd_fence_signal(ef); >> >> kfd_process_remove_sysfs(p); >> kfd_debugfs_remove_process(p); >> @@ -1990,7 +1990,6 @@ kfd_process_gpuid_from_node(struct kfd_process *p, struct kfd_node *node, >> static int signal_eviction_fence(struct kfd_process *p) >> { >> struct dma_fence *ef; >> - int ret; >> >> rcu_read_lock(); >> ef = dma_fence_get_rcu_safe(&p->ef); >> @@ -1998,10 +1997,10 @@ static int signal_eviction_fence(struct kfd_process *p) >> if (!ef) >> return -EINVAL; >> >> - ret = dma_fence_signal(ef); >> + amdkfd_fence_signal(ef); >> dma_fence_put(ef); >> >> - return ret; >> + return 0; > > Oh wait, that's the code I'm also touching in my return code series! > > https://lore.kernel.org/dri-devel/cef83fed-5994-4c77-962c-9c7aac9f7306@amd.… > > > Does this series then solve the problem Felix pointed out in > evict_process_worker()? No it doesn't, I wasn't aware that the higher level code actually needs the status. After all Felix is the maintainer of this part. This patch here needs to be rebased on top of yours and changed accordingly to still return the fence status correctly. But thanks for pointing that out. Regards, Christian. > > > P. > > >> } >> >> static void evict_process_worker(struct work_struct *work) >> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c >> index c30dfb8ec236..566950702b7d 100644 >> --- a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c >> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c >> @@ -428,7 +428,7 @@ static void svm_range_bo_release(struct kref *kref) >> >> if (!dma_fence_is_signaled(&svm_bo->eviction_fence->base)) >> /* We're not in the eviction worker. Signal the fence. */ >> - dma_fence_signal(&svm_bo->eviction_fence->base); >> + amdkfd_fence_signal(&svm_bo->eviction_fence->base); >> dma_fence_put(&svm_bo->eviction_fence->base); >> amdgpu_bo_unref(&svm_bo->bo); >> kfree(svm_bo); >> @@ -3628,7 +3628,7 @@ static void svm_range_evict_svm_bo_worker(struct work_struct *work) >> mmap_read_unlock(mm); >> mmput(mm); >> >> - dma_fence_signal(&svm_bo->eviction_fence->base); >> + amdkfd_fence_signal(&svm_bo->eviction_fence->base); >> >> /* This is the last reference to svm_bo, after svm_range_vram_node_free >> * has been called in svm_migrate_vram_to_ram >

1 week, 4 days

Re: [PATCH 09/18] drm/amdgpu: fix KFD eviction fence enable_signaling path

by Christian König

On 11/27/25 11:57, Philipp Stanner wrote: > On Thu, 2025-11-13 at 15:51 +0100, Christian König wrote: >> Calling dma_fence_is_signaled() here is illegal! > > OK, but why is that patch in this series? Because the next patch depends on it, otherwise the series won't compile. My plan is to push the amdgpu patches through amd-staging-drm-next as soon as Alex rebased that branch on drm-next during the next cycle. Regards, Christian. > > P. > >> >> Signed-off-by: Christian König <christian.koenig(a)amd.com> >> --- >> drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_fence.c | 6 ------ >> 1 file changed, 6 deletions(-) >> >> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_fence.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_fence.c >> index 1ef758ac5076..09c919f72b6c 100644 >> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_fence.c >> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_fence.c >> @@ -120,12 +120,6 @@ static bool amdkfd_fence_enable_signaling(struct dma_fence *f) >> { >> struct amdgpu_amdkfd_fence *fence = to_amdgpu_amdkfd_fence(f); >> >> - if (!fence) >> - return false; >> - >> - if (dma_fence_is_signaled(f)) >> - return true; >> - >> if (!fence->svm_bo) { >> if (!kgd2kfd_schedule_evict_and_restore_process(fence->mm, f)) >> return true; >

1 week, 4 days

Re: [PATCH 2/6] amd/amdkfd: Ignore return code of dma_fence_signal()

by Christian König

On 11/27/25 10:48, Philipp Stanner wrote: > On Wed, 2025-11-26 at 16:24 -0500, Kuehling, Felix wrote: >> >> On 2025-11-26 08:19, Philipp Stanner wrote: >>> The return code of dma_fence_signal() is not really useful as there is >>> nothing reasonable to do if a fence was already signaled. That return >>> code shall be removed from the kernel. >>> >>> Ignore dma_fence_signal()'s return code. >> >> I think this is not correct. Looking at the comment in >> evict_process_worker, we use the return value to decide a race >> conditions where multiple threads are trying to signal the eviction >> fence. Only one of them should schedule the restore work. And the other >> ones need to increment the reference count to keep evictions balanced. > > Thank you for pointing that out. Seems then amdkfd is the only user who > actually relies on the feature. See below > >> >> Regards, >> Felix >> >> >>> >>> Suggested-by: Christian König <christian.koenig(a)amd.com> >>> Signed-off-by: Philipp Stanner <phasta(a)kernel.org> >>> --- >>> drivers/gpu/drm/amd/amdkfd/kfd_process.c | 5 ++--- >>> 1 file changed, 2 insertions(+), 3 deletions(-) >>> >>> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_process.c b/drivers/gpu/drm/amd/amdkfd/kfd_process.c >>> index ddfe30c13e9d..950fafa4b3c3 100644 >>> --- a/drivers/gpu/drm/amd/amdkfd/kfd_process.c >>> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_process.c >>> @@ -1986,7 +1986,6 @@ kfd_process_gpuid_from_node(struct kfd_process *p, struct kfd_node *node, >>> static int signal_eviction_fence(struct kfd_process *p) >>> { >>> struct dma_fence *ef; >>> - int ret; >>> >>> rcu_read_lock(); >>> ef = dma_fence_get_rcu_safe(&p->ef); >>> @@ -1994,10 +1993,10 @@ static int signal_eviction_fence(struct kfd_process *p) >>> if (!ef) >>> return -EINVAL; >>> >>> - ret = dma_fence_signal(ef); >>> + dma_fence_signal(ef); > > The issue now is that dma_fence_signal()'s return code is actually non- > racy, because check + bit-set are protected by lock. > > Christian's new spinlock series would add a lock function for fences: > https://lore.kernel.org/dri-devel/20251113145332.16805-5-christian.koenig@a… > > > So I suppose this should work: > > dma_fence_lock_irqsave(ef, flags); > if (dma_fence_test_signaled_flag(ef)) { > dma_fence_unlock_irqrestore(ef, flags); > return true; > } > dma_fence_signal_locked(ef); > dma_fence_unlock_irqrestore(ef, flags); > > return false; > > > + some cosmetic adjustments for the boolean of course. > > > Would that fly and be reasonable? @Felix, Christian. I was just about to reply with the same idea when your mail arrived. So yes looks totally reasonable to me. Regards, Christian. > > > P.

1 week, 5 days

Re: [PATCH] dma-buf: add no-op stubs when CONFIG_DMA_SHARED_BUFFER is disabled

by Christian König

On 11/27/25 09:23, Viresh Kumar wrote: > On 27-11-25, 09:07, Christian König wrote: >> On 11/27/25 08:40, Viresh Kumar wrote: >>> Move several dma-buf function declarations under >>> CONFIG_DMA_SHARED_BUFFER and provide static inline no-op implementations >>> for the disabled case to allow the callers to build when the feature is >>> not compiled in. >> >> Good point, but which driver actually needs that? > > This broke some WIP stuff [1] which isn't posted upstream yet. That's why I > didn't mention anything in the commit log, though I could have added a comment > about that in the non-commit-log part. Well then better send that out with the full patch set. >> In other words there should be a concrete example of what breaks in the commit message. > > There is time for those changes to be posted and not sure if they will be > accepted or not. But either way, this change made sense in general and so I > thought there is nothing wrong to get this upstream right away. Yeah when it is unused intermediately then that is usually a no-go even if I agree that it makes sense. >>> +static inline struct dma_buf *dma_buf_get(int fd) >>> +{ >>> + return NULL; >> >> And here ERR_PTR(-EINVAL). > > I am not really sure if this should be EINVAL in this case. EOPNOTSUPP still > makes sense as the API isn't supported. When the API isn't compiled in the fd can't be valid (because you can't create a dma_buf object in the first place). So returning -EINVAL still makes a lot of sense. Regards, Christian. > >>> +static inline struct dma_buf *dma_buf_iter_begin(void) >>> +{ >>> + return NULL; >>> +} >>> + >>> +static inline struct dma_buf *dma_buf_iter_next(struct dma_buf *dmbuf) >>> +{ >>> + return NULL; >>> +} >> >> Those two are only for BPF and not driver use. > > Will drop them. >

1 week, 5 days

Re: [PATCH 1/6] dma-buf/dma-fence: Add dma_fence_test_signaled_flag()

by Christian König

On 11/27/25 10:16, Philipp Stanner wrote: > On Thu, 2025-11-27 at 09:11 +0100, Christian König wrote: >> On 11/26/25 17:55, Matthew Brost wrote: >>> On Wed, Nov 26, 2025 at 08:41:27AM -0800, Matthew Brost wrote: >>>> On Wed, Nov 26, 2025 at 02:19:10PM +0100, Philipp Stanner wrote: >>>>> The dma_fence framework checks at many places whether the signaled flag >>>>> of a fence is already set. The code can be simplified and made more >>>>> readable by providing a helper function for that. >>>>> >>>>> Add dma_fence_test_signaled_flag(), which only checks whether a fence is >>>>> signaled. Use it internally. >>>>> >>>>> Suggested-by: Tvrtko Ursulin <tvrtko.ursulin(a)igalia.com> >>>>> Signed-off-by: Philipp Stanner <phasta(a)kernel.org> >>>> >>>> This is a nice cleanp: >>>> Reviewed-by: Matthew Brost <matthew.brost(a)intel.com> >>>> >>>>> --- >>>>> drivers/dma-buf/dma-fence.c | 19 +++++++++---------- >>>>> include/linux/dma-fence.h | 24 ++++++++++++++++++++++-- >>>>> 2 files changed, 31 insertions(+), 12 deletions(-) >>>>> >>>>> diff --git a/drivers/dma-buf/dma-fence.c b/drivers/dma-buf/dma-fence.c >>>>> index 39e6f93dc310..25117a906846 100644 >>>>> --- a/drivers/dma-buf/dma-fence.c >>>>> +++ b/drivers/dma-buf/dma-fence.c >>>>> @@ -372,8 +372,7 @@ int dma_fence_signal_timestamp_locked(struct dma_fence *fence, >>>>> >>>>> lockdep_assert_held(fence->lock); >>>>> >>>>> - if (unlikely(test_and_set_bit(DMA_FENCE_FLAG_SIGNALED_BIT, >>>>> - &fence->flags))) >>> >>> I need to read a little better, I think this change isn't quite right. >>> The original code is test and set, the updated code is test only (i.e., >>> you are missing the set step). So maybe just leave this line as is. >> >> Oh, good point! I've totally missed that as well. > > Oh dear; I also just saw it when opening the mail client ._. > >> >> But that means that this patch set hasn't even been smoke tested. > > I've built it and did some basic testing with my Nouveau system. Any > suggestions? Do you have a CI that one can trigger? DMA-buf has CONFIG_DMABUF_SELFTESTS which should be able to catch things like that. But even running Nouveau should have found this since basically no fence at would signal any more. Regards, Christian. > > Thx > P.

1 week, 5 days

Re: [PATCH 1/6] dma-buf/dma-fence: Add dma_fence_test_signaled_flag()

by Christian König

On 11/26/25 17:55, Matthew Brost wrote: > On Wed, Nov 26, 2025 at 08:41:27AM -0800, Matthew Brost wrote: >> On Wed, Nov 26, 2025 at 02:19:10PM +0100, Philipp Stanner wrote: >>> The dma_fence framework checks at many places whether the signaled flag >>> of a fence is already set. The code can be simplified and made more >>> readable by providing a helper function for that. >>> >>> Add dma_fence_test_signaled_flag(), which only checks whether a fence is >>> signaled. Use it internally. >>> >>> Suggested-by: Tvrtko Ursulin <tvrtko.ursulin(a)igalia.com> >>> Signed-off-by: Philipp Stanner <phasta(a)kernel.org> >> >> This is a nice cleanp: >> Reviewed-by: Matthew Brost <matthew.brost(a)intel.com> >> >>> --- >>> drivers/dma-buf/dma-fence.c | 19 +++++++++---------- >>> include/linux/dma-fence.h | 24 ++++++++++++++++++++++-- >>> 2 files changed, 31 insertions(+), 12 deletions(-) >>> >>> diff --git a/drivers/dma-buf/dma-fence.c b/drivers/dma-buf/dma-fence.c >>> index 39e6f93dc310..25117a906846 100644 >>> --- a/drivers/dma-buf/dma-fence.c >>> +++ b/drivers/dma-buf/dma-fence.c >>> @@ -372,8 +372,7 @@ int dma_fence_signal_timestamp_locked(struct dma_fence *fence, >>> >>> lockdep_assert_held(fence->lock); >>> >>> - if (unlikely(test_and_set_bit(DMA_FENCE_FLAG_SIGNALED_BIT, >>> - &fence->flags))) > > I need to read a little better, I think this change isn't quite right. > The original code is test and set, the updated code is test only (i.e., > you are missing the set step). So maybe just leave this line as is. Oh, good point! I've totally missed that as well. But that means that this patch set hasn't even been smoke tested. Regards, Christian. > > Matt > >>> + if (unlikely(dma_fence_test_signaled_flag(fence))) >>> return -EINVAL; >>> >>> /* Stash the cb_list before replacing it with the timestamp */ >>> @@ -545,7 +544,7 @@ void dma_fence_release(struct kref *kref) >>> trace_dma_fence_destroy(fence); >>> >>> if (!list_empty(&fence->cb_list) && >>> - !test_bit(DMA_FENCE_FLAG_SIGNALED_BIT, &fence->flags)) { >>> + !dma_fence_test_signaled_flag(fence)) { >>> const char __rcu *timeline; >>> const char __rcu *driver; >>> unsigned long flags; >>> @@ -602,7 +601,7 @@ static bool __dma_fence_enable_signaling(struct dma_fence *fence) >>> was_set = test_and_set_bit(DMA_FENCE_FLAG_ENABLE_SIGNAL_BIT, >>> &fence->flags); >>> >>> - if (test_bit(DMA_FENCE_FLAG_SIGNALED_BIT, &fence->flags)) >>> + if (dma_fence_test_signaled_flag(fence)) >>> return false; >>> >>> if (!was_set && fence->ops->enable_signaling) { >>> @@ -666,7 +665,7 @@ int dma_fence_add_callback(struct dma_fence *fence, struct dma_fence_cb *cb, >>> if (WARN_ON(!fence || !func)) >>> return -EINVAL; >>> >>> - if (test_bit(DMA_FENCE_FLAG_SIGNALED_BIT, &fence->flags)) { >>> + if (dma_fence_test_signaled_flag(fence)) { >>> INIT_LIST_HEAD(&cb->node); >>> return -ENOENT; >>> } >>> @@ -783,7 +782,7 @@ dma_fence_default_wait(struct dma_fence *fence, bool intr, signed long timeout) >>> >>> spin_lock_irqsave(fence->lock, flags); >>> >>> - if (test_bit(DMA_FENCE_FLAG_SIGNALED_BIT, &fence->flags)) >>> + if (dma_fence_test_signaled_flag(fence)) >>> goto out; >>> >>> if (intr && signal_pending(current)) { >>> @@ -800,7 +799,7 @@ dma_fence_default_wait(struct dma_fence *fence, bool intr, signed long timeout) >>> cb.task = current; >>> list_add(&cb.base.node, &fence->cb_list); >>> >>> - while (!test_bit(DMA_FENCE_FLAG_SIGNALED_BIT, &fence->flags) && ret > 0) { >>> + while (!dma_fence_test_signaled_flag(fence) && ret > 0) { >>> if (intr) >>> __set_current_state(TASK_INTERRUPTIBLE); >>> else >>> @@ -832,7 +831,7 @@ dma_fence_test_signaled_any(struct dma_fence **fences, uint32_t count, >>> >>> for (i = 0; i < count; ++i) { >>> struct dma_fence *fence = fences[i]; >>> - if (test_bit(DMA_FENCE_FLAG_SIGNALED_BIT, &fence->flags)) { >>> + if (dma_fence_test_signaled_flag(fence)) { >>> if (idx) >>> *idx = i; >>> return true; >>> @@ -1108,7 +1107,7 @@ const char __rcu *dma_fence_driver_name(struct dma_fence *fence) >>> RCU_LOCKDEP_WARN(!rcu_read_lock_held(), >>> "RCU protection is required for safe access to returned string"); >>> >>> - if (!test_bit(DMA_FENCE_FLAG_SIGNALED_BIT, &fence->flags)) >>> + if (!dma_fence_test_signaled_flag(fence)) >>> return fence->ops->get_driver_name(fence); >>> else >>> return "detached-driver"; >>> @@ -1140,7 +1139,7 @@ const char __rcu *dma_fence_timeline_name(struct dma_fence *fence) >>> RCU_LOCKDEP_WARN(!rcu_read_lock_held(), >>> "RCU protection is required for safe access to returned string"); >>> >>> - if (!test_bit(DMA_FENCE_FLAG_SIGNALED_BIT, &fence->flags)) >>> + if (!dma_fence_test_signaled_flag(fence)) >>> return fence->ops->get_timeline_name(fence); >>> else >>> return "signaled-timeline"; >>> diff --git a/include/linux/dma-fence.h b/include/linux/dma-fence.h >>> index 64639e104110..19972f5d176f 100644 >>> --- a/include/linux/dma-fence.h >>> +++ b/include/linux/dma-fence.h >>> @@ -401,6 +401,26 @@ void dma_fence_enable_sw_signaling(struct dma_fence *fence); >>> const char __rcu *dma_fence_driver_name(struct dma_fence *fence); >>> const char __rcu *dma_fence_timeline_name(struct dma_fence *fence); >>> >>> +/* >>> + * dma_fence_test_signaled_flag - Only check whether a fence is signaled yet. >>> + * @fence: the fence to check >>> + * >>> + * This function just checks whether @fence is signaled, without interacting >>> + * with the fence in any way. The user must, therefore, ensure through other >>> + * means that fences get signaled eventually. >>> + * >>> + * This function uses test_bit(), which is thread-safe. Naturally, this function >>> + * should be used opportunistically; a fence could get signaled at any moment >>> + * after the check is done. >>> + * >>> + * Return: true if signaled, false otherwise. >>> + */ >>> +static inline bool >>> +dma_fence_test_signaled_flag(struct dma_fence *fence) >>> +{ >>> + return test_bit(DMA_FENCE_FLAG_SIGNALED_BIT, &fence->flags); >>> +} >>> + >>> /** >>> * dma_fence_is_signaled_locked - Return an indication if the fence >>> * is signaled yet. >>> @@ -418,7 +438,7 @@ const char __rcu *dma_fence_timeline_name(struct dma_fence *fence); >>> static inline bool >>> dma_fence_is_signaled_locked(struct dma_fence *fence) >>> { >>> - if (test_bit(DMA_FENCE_FLAG_SIGNALED_BIT, &fence->flags)) >>> + if (dma_fence_test_signaled_flag(fence)) >>> return true; >>> >>> if (fence->ops->signaled && fence->ops->signaled(fence)) { >>> @@ -448,7 +468,7 @@ dma_fence_is_signaled_locked(struct dma_fence *fence) >>> static inline bool >>> dma_fence_is_signaled(struct dma_fence *fence) >>> { >>> - if (test_bit(DMA_FENCE_FLAG_SIGNALED_BIT, &fence->flags)) >>> + if (dma_fence_test_signaled_flag(fence)) >>> return true; >>> >>> if (fence->ops->signaled && fence->ops->signaled(fence)) { >>> -- >>> 2.49.0 >>>

1 week, 5 days

Re: [PATCH] dma-buf: add no-op stubs when CONFIG_DMA_SHARED_BUFFER is disabled

by Christian König

On 11/27/25 08:40, Viresh Kumar wrote: > Move several dma-buf function declarations under > CONFIG_DMA_SHARED_BUFFER and provide static inline no-op implementations > for the disabled case to allow the callers to build when the feature is > not compiled in. Good point, but which driver actually needs that? At least the whole DRM subsystem selects CONFIG_DMA_SHARED_BUFFER and others (like V4L) usually don't compile their whole infrastructure when that option isn't selected. In other words there should be a concrete example of what breaks in the commit message. > > Signed-off-by: Viresh Kumar <viresh.kumar(a)linaro.org> > --- > include/linux/dma-buf.h | 116 ++++++++++++++++++++++++++++++++++++++++ > 1 file changed, 116 insertions(+) > > diff --git a/include/linux/dma-buf.h b/include/linux/dma-buf.h > index d58e329ac0e7..06e494d8f6b0 100644 > --- a/include/linux/dma-buf.h > +++ b/include/linux/dma-buf.h > @@ -568,6 +568,7 @@ static inline bool dma_buf_is_dynamic(struct dma_buf *dmabuf) > return !!dmabuf->ops->pin; > } > > +#ifdef CONFIG_DMA_SHARED_BUFFER > struct dma_buf_attachment *dma_buf_attach(struct dma_buf *dmabuf, > struct device *dev); > struct dma_buf_attachment * > @@ -609,4 +610,119 @@ int dma_buf_vmap_unlocked(struct dma_buf *dmabuf, struct iosys_map *map); > void dma_buf_vunmap_unlocked(struct dma_buf *dmabuf, struct iosys_map *map); > struct dma_buf *dma_buf_iter_begin(void); > struct dma_buf *dma_buf_iter_next(struct dma_buf *dmbuf); > + > +#else > +static inline struct dma_buf_attachment *dma_buf_attach(struct dma_buf *dmabuf, > + struct device *dev) > +{ > + return NULL; This should probably be an ERR_PTR(-EOPNOTSUPP); > +} > + > +static inline struct dma_buf_attachment * > +dma_buf_dynamic_attach(struct dma_buf *dmabuf, struct device *dev, > + const struct dma_buf_attach_ops *importer_ops, > + void *importer_priv) > +{ > + return NULL; Same here, ERR_PTR(-EOPNOTSUPP). > +} > + > +static inline void dma_buf_detach(struct dma_buf *dmabuf, > + struct dma_buf_attachment *attach) { } > + > +static inline int dma_buf_pin(struct dma_buf_attachment *attach) > +{ > + return -EOPNOTSUPP; > +} > + > +static inline void dma_buf_unpin(struct dma_buf_attachment *attach) { } > + > +static inline struct dma_buf * > +dma_buf_export(const struct dma_buf_export_info *exp_info) > +{ > + return NULL; > +} > + > + > +static inline int dma_buf_fd(struct dma_buf *dmabuf, int flags) > +{ > + return -EOPNOTSUPP; > +} > + > + > +static inline struct dma_buf *dma_buf_get(int fd) > +{ > + return NULL; And here ERR_PTR(-EINVAL). > +} > + > +static inline void dma_buf_put(struct dma_buf *dmabuf) { } > + > +static inline struct sg_table * > +dma_buf_map_attachment(struct dma_buf_attachment *, enum dma_data_direction) > +{ > + return NULL; ERR_PTR(-EINVAL) > +} > + > +static inline void dma_buf_unmap_attachment(struct dma_buf_attachment *, > + struct sg_table *, > + enum dma_data_direction) { } > + > +static inline void dma_buf_move_notify(struct dma_buf *dma_buf) { } > + > +static inline int dma_buf_begin_cpu_access(struct dma_buf *dma_buf, > + enum dma_data_direction dir) > +{ > + return -EOPNOTSUPP; > +} > + > +static inline int dma_buf_end_cpu_access(struct dma_buf *dma_buf, > + enum dma_data_direction dir) > +{ > + return -EOPNOTSUPP; > +} > + > +static inline struct sg_table * > +dma_buf_map_attachment_unlocked(struct dma_buf_attachment *attach, > + enum dma_data_direction direction) > +{ > + return NULL; ERR_PTR(-EINVAL) > +} > + > +static inline void > +dma_buf_unmap_attachment_unlocked(struct dma_buf_attachment *attach, > + struct sg_table *sg_table, > + enum dma_data_direction direction) { } > + > +static inline int dma_buf_mmap(struct dma_buf *, struct vm_area_struct *, > + unsigned long) > +{ > + return -EOPNOTSUPP; > +} > + > +static inline int dma_buf_vmap(struct dma_buf *dmabuf, struct iosys_map *map) > +{ > + return -EOPNOTSUPP; > +} > + > +static inline void dma_buf_vunmap(struct dma_buf *dmabuf, struct iosys_map *map) > +{ } > + > +static inline int dma_buf_vmap_unlocked(struct dma_buf *dmabuf, > + struct iosys_map *map) > +{ > + return -EOPNOTSUPP; > +} > + > +static inline void dma_buf_vunmap_unlocked(struct dma_buf *dmabuf, > + struct iosys_map *map) { } > + > +static inline struct dma_buf *dma_buf_iter_begin(void) > +{ > + return NULL; > +} > + > +static inline struct dma_buf *dma_buf_iter_next(struct dma_buf *dmbuf) > +{ > + return NULL; > +} Those two are only for BPF and not driver use. Regards, Christian. > +#endif /* CONFIG_DMA_SHARED_BUFFER */ > #endif /* __DMA_BUF_H__ */

1 week, 5 days

Re: [PATCH 2/6] amd/amdkfd: Ignore return code of dma_fence_signal()

by Kuehling, Felix

On 2025-11-26 08:19, Philipp Stanner wrote: > The return code of dma_fence_signal() is not really useful as there is > nothing reasonable to do if a fence was already signaled. That return > code shall be removed from the kernel. > > Ignore dma_fence_signal()'s return code. I think this is not correct. Looking at the comment in evict_process_worker, we use the return value to decide a race conditions where multiple threads are trying to signal the eviction fence. Only one of them should schedule the restore work. And the other ones need to increment the reference count to keep evictions balanced. Regards, Felix > > Suggested-by: Christian König <christian.koenig(a)amd.com> > Signed-off-by: Philipp Stanner <phasta(a)kernel.org> > --- > drivers/gpu/drm/amd/amdkfd/kfd_process.c | 5 ++--- > 1 file changed, 2 insertions(+), 3 deletions(-) > > diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_process.c b/drivers/gpu/drm/amd/amdkfd/kfd_process.c > index ddfe30c13e9d..950fafa4b3c3 100644 > --- a/drivers/gpu/drm/amd/amdkfd/kfd_process.c > +++ b/drivers/gpu/drm/amd/amdkfd/kfd_process.c > @@ -1986,7 +1986,6 @@ kfd_process_gpuid_from_node(struct kfd_process *p, struct kfd_node *node, > static int signal_eviction_fence(struct kfd_process *p) > { > struct dma_fence *ef; > - int ret; > > rcu_read_lock(); > ef = dma_fence_get_rcu_safe(&p->ef); > @@ -1994,10 +1993,10 @@ static int signal_eviction_fence(struct kfd_process *p) > if (!ef) > return -EINVAL; > > - ret = dma_fence_signal(ef); > + dma_fence_signal(ef); > dma_fence_put(ef); > > - return ret; > + return 0; > } > > static void evict_process_worker(struct work_struct *work)

1 week, 6 days

[PATCH v2 0/9] Initial DMABUF support for iommufd

by Jason Gunthorpe

This series is the start of adding full DMABUF support to iommufd. Currently it is limited to only work with VFIO's DMABUF exporter. It sits on top of Leon's series to add a DMABUF exporter to VFIO: https://lore.kernel.org/all/20251120-dmabuf-vfio-v9-0-d7f71607f371@nvidia.c… The existing IOMMU_IOAS_MAP_FILE is enhanced to detect DMABUF fd's, but otherwise works the same as it does today for a memfd. The user can select a slice of the FD to map into the ioas and if the underliyng alignment requirements are met it will be placed in the iommu_domain. Though limited, it is enough to allow a VMM like QEMU to connect MMIO BAR memory from VFIO to an iommu_domain controlled by iommufd. This is used for PCI Peer to Peer support in VMs, and is the last feature that the VFIO type 1 container has that iommufd couldn't do. The VFIO type1 version extracts raw PFNs from VMAs, which has no lifetime control and is a use-after-free security problem. Instead iommufd relies on revokable DMABUFs. Whenever VFIO thinks there should be no access to the MMIO it can shoot down the mapping in iommufd which will unmap it from the iommu_domain. There is no automatic remap, this is a safety protocol so the kernel doesn't get stuck. Userspace is expected to know it is doing something that will revoke the dmabuf and map/unmap it around the activity. Eg when QEMU goes to issue FLR it should do the map/unmap to iommufd. Since DMABUF is missing some key general features for this use case it relies on a "private interconnect" between VFIO and iommufd via the vfio_pci_dma_buf_iommufd_map() call. The call confirms the DMABUF has revoke semantics and delivers a phys_addr for the memory suitable for use with iommu_map(). Medium term there is a desire to expand the supported DMABUFs to include GPU drivers to support DPDK/SPDK type use cases so future series will work to add a general concept of revoke and a general negotiation of interconnect to remove vfio_pci_dma_buf_iommufd_map(). I also plan another series to modify iommufd's vfio_compat to transparently pull a dmabuf out of a VFIO VMA to emulate more of the uAPI of type1. The latest series for interconnect negotation to exchange a phys_addr is: https://lore.kernel.org/r/20251027044712.1676175-1-vivek.kasireddy@intel.com And the discussion for design of revoke is here: https://lore.kernel.org/dri-devel/20250114173103.GE5556@nvidia.com/ This is on github: https://github.com/jgunthorpe/linux/commits/iommufd_dmabuf v2: - Rebase on Leon's v9 - Fix mislocking in an iopt_fill_domain() error path - Revise the comments around how the sub page offset works - Remove a useless WARN_ON in iopt_pages_rw_access() - Fixed missed memory free in the selftest v1: https://patch.msgid.link/r/0-v1-64bed2430cdb+31b-iommufd_dmabuf_jgg@nvidia.… Jason Gunthorpe (9): vfio/pci: Add vfio_pci_dma_buf_iommufd_map() iommufd: Add DMABUF to iopt_pages iommufd: Do not map/unmap revoked DMABUFs iommufd: Allow a DMABUF to be revoked iommufd: Allow MMIO pages in a batch iommufd: Have pfn_reader process DMABUF iopt_pages iommufd: Have iopt_map_file_pages convert the fd to a file iommufd: Accept a DMABUF through IOMMU_IOAS_MAP_FILE iommufd/selftest: Add some tests for the dmabuf flow drivers/iommu/iommufd/io_pagetable.c | 78 +++- drivers/iommu/iommufd/io_pagetable.h | 54 ++- drivers/iommu/iommufd/ioas.c | 8 +- drivers/iommu/iommufd/iommufd_private.h | 14 +- drivers/iommu/iommufd/iommufd_test.h | 10 + drivers/iommu/iommufd/main.c | 10 + drivers/iommu/iommufd/pages.c | 414 ++++++++++++++++-- drivers/iommu/iommufd/selftest.c | 143 ++++++ drivers/vfio/pci/vfio_pci_dmabuf.c | 34 ++ include/linux/vfio_pci_core.h | 4 + tools/testing/selftests/iommu/iommufd.c | 43 ++ tools/testing/selftests/iommu/iommufd_utils.h | 44 ++ 12 files changed, 786 insertions(+), 70 deletions(-) base-commit: f836737ed56db9e2d5b047c56a31e05af0f3f116 -- 2.43.0

1 week, 6 days

Jump to page:

2025

2024

2023

2022

2021

2020

2019

2018

2017

2016

2015

2014

2013

2012

2011

Linaro-mm-sig