In a recent discussion with Philip and Danilo, the question came up of what had already been tried, but never finished, to clean up the dma_fence framework.
So here are the different ideas I came up with but never fully finished, with the patches themselves modernized and rebased on top of drm-misc-next.
The main goal of these changes is to make it easier to implement dma_fence backends and to avoid enforcing unnecessary constraints on implementations.
As a first step, the locking around the dma_fence_ops.signaled callback is made consistent by removing the dma_fence_is_signaled_locked() function.
This was mostly used by the backends themselves, but if polling the HW is desired, the backends can call their own functions for this directly without going through the dma-fence layer.
XE actually seems to be the only driver which makes use of that for a bit more handling. For all other cases, testing the signaled flag should be enough.
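To illustrate the split between testing the signaled flag and polling the hardware, here is a minimal userspace sketch. It does not use the kernel API: struct mock_fence, mock_test_signaled_flag(), mock_backend_hw_done() and mock_backend_try_signal() are hypothetical stand-ins, loosely modelled on what the XE patch in this series does in its IRQ worker.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical userspace stand-in; NOT the kernel's struct dma_fence. */
struct mock_fence {
	bool signaled_flag;		/* models DMA_FENCE_FLAG_SIGNALED_BIT */
	uint32_t seqno;			/* sequence number this fence waits for */
	const uint32_t *hw_seqno;	/* models the counter written by the HW */
};

/* Models dma_fence_test_signaled_flag(): looks at the flag only, never polls. */
static bool mock_test_signaled_flag(const struct mock_fence *f)
{
	return f->signaled_flag;
}

/* Backend-private HW poll; the signed difference keeps it wraparound safe. */
static bool mock_backend_hw_done(const struct mock_fence *f)
{
	return (int32_t)(*f->hw_seqno - f->seqno) >= 0;
}

/*
 * What a backend could do in its interrupt path: test the flag first and
 * only then call its own poll function directly, without going through
 * the fence layer. Returns true if the fence is signaled afterwards.
 */
static bool mock_backend_try_signal(struct mock_fence *f)
{
	assert(f->hw_seqno);

	if (!mock_test_signaled_flag(f) && !mock_backend_hw_done(f))
		return false;

	f->signaled_flag = true;	/* stands in for dma_fence_signal_locked() */
	return true;
}
```

Callers that only want to know whether the fence has already been signaled would use the flag test alone. Only the backend, which knows how to read its hardware, combines the flag test with the poll.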
Then the forced call to dma_fence_signal() is removed from the dma-fence layer and moved into the backend implementations.
This allows the backend implementations to clean up after they have signaled the fence. Such cleanup can include removing now-signaled fences from lists, dropping references, starting work, etc.
Nouveau especially seems to have a really messy workaround because of that, involving the DMA_FENCE_FLAG_USER_BITS and installing callbacks, because the reference to the context couldn't be dropped directly after signaling. As far as I can see, this can now be cleaned up.
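The signal-then-clean-up pattern described above can be sketched in plain userspace C as follows. This is only an illustration under assumptions: struct mock_fence, struct mock_ctx, mock_fence_signal() and mock_backend_check_signaled() are hypothetical names and do not exist in the kernel or in these patches.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; NOT the kernel's dma_fence types. */
struct mock_fence {
	bool signaled;
	int refcount;			/* the pending list holds one reference */
	struct mock_fence *next;	/* singly linked pending list */
};

struct mock_ctx {
	struct mock_fence *pending;	/* fences the backend still tracks */
};

/* Stands in for dma_fence_signal(). */
static void mock_fence_signal(struct mock_fence *f)
{
	f->signaled = true;
}

/*
 * Backend-side check: because the backend itself signals the fence, it
 * knows exactly when that happened and can clean up right afterwards.
 */
static void mock_backend_check_signaled(struct mock_ctx *ctx,
					struct mock_fence *f, bool hw_done)
{
	struct mock_fence **p;

	if (f->signaled || !hw_done)
		return;

	mock_fence_signal(f);

	/* Cleanup after signaling: unlink and drop the list's reference. */
	for (p = &ctx->pending; *p; p = &(*p)->next) {
		if (*p == f) {
			*p = f->next;
			f->next = NULL;
			f->refcount--;
			break;
		}
	}
}
```

The point of the sketch is the ordering: the unlink and the reference drop happen in the same function that signals, so no extra callback or flag is needed to learn that the fence was signaled.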
In the long term this should also allow reworking the error handling, e.g. removing dma_fence_set_error() and instead passing the error as a mandatory parameter to dma_fence_signal().
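A rough userspace sketch of what such an interface could look like is given here. Nothing in it is part of this series or of the kernel: the signature of mock_fence_signal() and the bool return value are assumptions made purely for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in; NOT the kernel's struct dma_fence. */
struct mock_fence {
	bool signaled;
	int error;	/* 0 or a negative errno, fixed at signal time */
};

/*
 * Hypothetical signal function taking the error as a mandatory argument,
 * so that there is no separate set_error step which must be ordered
 * before signaling. Returns true if this call signaled the fence.
 */
static bool mock_fence_signal(struct mock_fence *f, int error)
{
	assert(error <= 0);

	if (f->signaled)
		return false;	/* the first signal wins, error stays as is */

	f->error = error;
	f->signaled = true;
	return true;
}

/* Status query: 0 if unsignaled, 1 if signaled OK, negative errno on error. */
static int mock_fence_get_status(const struct mock_fence *f)
{
	if (!f->signaled)
		return 0;
	return f->error ? f->error : 1;
}
```

With this shape, the error and the signaled state change together in one call, which is the property the reworked error handling would be after.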
The last piece is to stop calling the enable_signaling callback with the dma_fence lock held. This makes it possible for backends to acquire locks which are semantically ordered outside of the dma_fence lock.
This is necessary to allow using the dma_fence inline lock in more cases. Previously, backends used some common external lock for their dma_fences, for example to make it possible to remove fences from linked lists.
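The lock ordering this enables can be modelled in userspace as below. This is a sketch under assumptions, not kernel code: struct mock_lock merely records whether a lock is held, and mock_enable_signaling(), struct mock_ctx and struct mock_fence are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy lock that only tracks ownership, enough to check the ordering. */
struct mock_lock {
	bool held;
};

static void mock_lock_acquire(struct mock_lock *l)
{
	assert(!l->held);
	l->held = true;
}

static void mock_lock_release(struct mock_lock *l)
{
	assert(l->held);
	l->held = false;
}

/* Hypothetical backend context with a list lock ordered OUTSIDE the fence lock. */
struct mock_ctx {
	struct mock_lock list_lock;
	int num_pending;
};

struct mock_fence {
	struct mock_lock inline_lock;	/* models the per-fence inline lock */
	struct mock_ctx *ctx;
	bool on_list;
};

/*
 * Models an enable_signaling callback that is invoked WITHOUT the fence
 * lock held, so it can take the outer list lock first and the inner
 * fence lock second.
 */
static bool mock_enable_signaling(struct mock_fence *f)
{
	assert(!f->inline_lock.held);	/* the rule this sketch relies on */

	mock_lock_acquire(&f->ctx->list_lock);	/* outer lock */
	mock_lock_acquire(&f->inline_lock);	/* inner lock */
	if (!f->on_list) {
		f->on_list = true;
		f->ctx->num_pending++;
	}
	mock_lock_release(&f->inline_lock);
	mock_lock_release(&f->ctx->list_lock);

	return true;
}
```

If the callback were entered with the fence lock already held, taking list_lock here would invert the order, which is why backends previously had to share one external lock between the list and the fences.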
Please comment and review,
Christian.
Dropping the _sw_ part from the names has been proposed multiple times now, and IIRC people generally agreed with the idea already.
The function requests a fence to signal and triggers some sort of HW interaction on most backends.
So this is not really software related at all, and the callback is already just named enable_signaling as well.
Just streamline that and use a consistent name everywhere.
Assisted-by: Claude Sonnet 4
Signed-off-by: Christian König <christian.koenig@amd.com>
---
 drivers/dma-buf/dma-fence.c                   |  8 ++--
 drivers/dma-buf/st-dma-fence-chain.c          |  4 +-
 drivers/dma-buf/st-dma-fence-unwrap.c         | 42 +++++++++----------
 drivers/dma-buf/st-dma-fence.c                | 16 +++----
 drivers/dma-buf/st-dma-resv.c                 | 10 ++---
 drivers/gpu/drm/i915/i915_active.c            |  2 +-
 .../gpu/drm/ttm/tests/ttm_bo_validate_test.c  |  2 +-
 drivers/gpu/drm/ttm/ttm_bo.c                  |  2 +-
 drivers/gpu/drm/xe/xe_bo.c                    |  2 +-
 drivers/gpu/drm/xe/xe_sched_job.c             |  2 +-
 drivers/gpu/drm/xe/xe_svm.c                   |  2 +-
 drivers/gpu/drm/xe/xe_userptr.c               |  2 +-
 drivers/gpu/drm/xe/xe_vm.c                    |  4 +-
 include/linux/dma-fence.h                     |  4 +-
 14 files changed, 51 insertions(+), 51 deletions(-)
diff --git a/drivers/dma-buf/dma-fence.c b/drivers/dma-buf/dma-fence.c index c7ea1e75d38a..0ec81a568bbd 100644 --- a/drivers/dma-buf/dma-fence.c +++ b/drivers/dma-buf/dma-fence.c @@ -534,7 +534,7 @@ dma_fence_wait_timeout(struct dma_fence *fence, bool intr, signed long timeout)
__dma_fence_might_wait();
- dma_fence_enable_sw_signaling(fence); + dma_fence_enable_signaling(fence);
rcu_read_lock(); ops = rcu_dereference(fence->ops); @@ -656,14 +656,14 @@ static bool __dma_fence_enable_signaling(struct dma_fence *fence) }
/** - * dma_fence_enable_sw_signaling - enable signaling on fence + * dma_fence_enable_signaling - enable signaling on fence * @fence: the fence to enable * * This will request for sw signaling to be enabled, to make the fence * complete as soon as possible. This calls &dma_fence_ops.enable_signaling * internally. */ -void dma_fence_enable_sw_signaling(struct dma_fence *fence) +void dma_fence_enable_signaling(struct dma_fence *fence) { unsigned long flags;
@@ -671,7 +671,7 @@ void dma_fence_enable_sw_signaling(struct dma_fence *fence) __dma_fence_enable_signaling(fence); dma_fence_unlock_irqrestore(fence, flags); } -EXPORT_SYMBOL(dma_fence_enable_sw_signaling); +EXPORT_SYMBOL(dma_fence_enable_signaling);
/** * dma_fence_add_callback - add a callback to be called when the fence diff --git a/drivers/dma-buf/st-dma-fence-chain.c b/drivers/dma-buf/st-dma-fence-chain.c index a3023d3fedc9..e0d9b69bfa76 100644 --- a/drivers/dma-buf/st-dma-fence-chain.c +++ b/drivers/dma-buf/st-dma-fence-chain.c @@ -82,7 +82,7 @@ static void test_sanitycheck(struct kunit *test)
chain = mock_chain(NULL, f, 1); if (chain) - dma_fence_enable_sw_signaling(chain); + dma_fence_enable_signaling(chain); else KUNIT_FAIL(test, "Failed to create chain");
@@ -139,7 +139,7 @@ static int fence_chains_init(struct fence_chains *fc, unsigned int count,
fc->tail = fc->chains[i];
- dma_fence_enable_sw_signaling(fc->chains[i]); + dma_fence_enable_signaling(fc->chains[i]); }
fc->chain_length = i; diff --git a/drivers/dma-buf/st-dma-fence-unwrap.c b/drivers/dma-buf/st-dma-fence-unwrap.c index 4e7ee25372ba..4d9d313b460c 100644 --- a/drivers/dma-buf/st-dma-fence-unwrap.c +++ b/drivers/dma-buf/st-dma-fence-unwrap.c @@ -103,7 +103,7 @@ static void test_sanitycheck(struct kunit *test) f = mock_fence(); KUNIT_ASSERT_NOT_NULL(test, f);
- dma_fence_enable_sw_signaling(f); + dma_fence_enable_signaling(f);
array = mock_array(1, f); KUNIT_ASSERT_NOT_NULL(test, array); @@ -122,7 +122,7 @@ static void test_unwrap_array(struct kunit *test) f1 = mock_fence(); KUNIT_ASSERT_NOT_NULL(test, f1);
- dma_fence_enable_sw_signaling(f1); + dma_fence_enable_signaling(f1);
f2 = mock_fence(); if (!f2) { @@ -131,7 +131,7 @@ static void test_unwrap_array(struct kunit *test) return; }
- dma_fence_enable_sw_signaling(f2); + dma_fence_enable_signaling(f2);
array = mock_array(2, f1, f2); KUNIT_ASSERT_NOT_NULL(test, array); @@ -160,7 +160,7 @@ static void test_unwrap_chain(struct kunit *test) f1 = mock_fence(); KUNIT_ASSERT_NOT_NULL(test, f1);
- dma_fence_enable_sw_signaling(f1); + dma_fence_enable_signaling(f1);
f2 = mock_fence(); if (!f2) { @@ -169,7 +169,7 @@ static void test_unwrap_chain(struct kunit *test) return; }
- dma_fence_enable_sw_signaling(f2); + dma_fence_enable_signaling(f2);
chain = mock_chain(f1, f2); KUNIT_ASSERT_NOT_NULL(test, chain); @@ -198,7 +198,7 @@ static void test_unwrap_chain_array(struct kunit *test) f1 = mock_fence(); KUNIT_ASSERT_NOT_NULL(test, f1);
- dma_fence_enable_sw_signaling(f1); + dma_fence_enable_signaling(f1);
f2 = mock_fence(); if (!f2) { @@ -207,7 +207,7 @@ static void test_unwrap_chain_array(struct kunit *test) return; }
- dma_fence_enable_sw_signaling(f2); + dma_fence_enable_signaling(f2);
array = mock_array(2, f1, f2); KUNIT_ASSERT_NOT_NULL(test, array); @@ -239,7 +239,7 @@ static void test_unwrap_merge(struct kunit *test) f1 = mock_fence(); KUNIT_ASSERT_NOT_NULL(test, f1);
- dma_fence_enable_sw_signaling(f1); + dma_fence_enable_signaling(f1);
f2 = mock_fence(); if (!f2) { @@ -247,7 +247,7 @@ static void test_unwrap_merge(struct kunit *test) goto error_put_f1; }
- dma_fence_enable_sw_signaling(f2); + dma_fence_enable_signaling(f2);
f3 = dma_fence_unwrap_merge(f1, f2); if (!f3) { @@ -285,7 +285,7 @@ static void test_unwrap_merge_duplicate(struct kunit *test) f1 = mock_fence(); KUNIT_ASSERT_NOT_NULL(test, f1);
- dma_fence_enable_sw_signaling(f1); + dma_fence_enable_signaling(f1);
f2 = dma_fence_unwrap_merge(f1, f1); if (!f2) { @@ -322,7 +322,7 @@ static void test_unwrap_merge_seqno(struct kunit *test) f1 = __mock_fence(ctx[1], 1); KUNIT_ASSERT_NOT_NULL(test, f1);
- dma_fence_enable_sw_signaling(f1); + dma_fence_enable_signaling(f1);
f2 = __mock_fence(ctx[1], 2); if (!f2) { @@ -330,7 +330,7 @@ static void test_unwrap_merge_seqno(struct kunit *test) goto error_put_f1; }
- dma_fence_enable_sw_signaling(f2); + dma_fence_enable_signaling(f2);
f3 = __mock_fence(ctx[0], 1); if (!f3) { @@ -338,7 +338,7 @@ static void test_unwrap_merge_seqno(struct kunit *test) goto error_put_f2; }
- dma_fence_enable_sw_signaling(f3); + dma_fence_enable_signaling(f3);
f4 = dma_fence_unwrap_merge(f1, f2, f3); if (!f4) { @@ -378,7 +378,7 @@ static void test_unwrap_merge_order(struct kunit *test) f1 = mock_fence(); KUNIT_ASSERT_NOT_NULL(test, f1);
- dma_fence_enable_sw_signaling(f1); + dma_fence_enable_signaling(f1);
f2 = mock_fence(); if (!f2) { @@ -387,7 +387,7 @@ static void test_unwrap_merge_order(struct kunit *test) return; }
- dma_fence_enable_sw_signaling(f2); + dma_fence_enable_signaling(f2);
a1 = mock_array(2, f1, f2); KUNIT_ASSERT_NOT_NULL(test, a1); @@ -442,7 +442,7 @@ static void test_unwrap_merge_complex(struct kunit *test) f1 = mock_fence(); KUNIT_ASSERT_NOT_NULL(test, f1);
- dma_fence_enable_sw_signaling(f1); + dma_fence_enable_signaling(f1);
f2 = mock_fence(); if (!f2) { @@ -450,7 +450,7 @@ static void test_unwrap_merge_complex(struct kunit *test) goto error_put_f1; }
- dma_fence_enable_sw_signaling(f2); + dma_fence_enable_signaling(f2);
f3 = dma_fence_unwrap_merge(f1, f2); if (!f3) { @@ -510,7 +510,7 @@ static void test_unwrap_merge_complex_seqno(struct kunit *test) f1 = __mock_fence(ctx[0], 2); KUNIT_ASSERT_NOT_NULL(test, f1);
- dma_fence_enable_sw_signaling(f1); + dma_fence_enable_signaling(f1);
f2 = __mock_fence(ctx[1], 1); if (!f2) { @@ -518,7 +518,7 @@ static void test_unwrap_merge_complex_seqno(struct kunit *test) goto error_put_f1; }
- dma_fence_enable_sw_signaling(f2); + dma_fence_enable_signaling(f2);
f3 = __mock_fence(ctx[0], 1); if (!f3) { @@ -526,7 +526,7 @@ static void test_unwrap_merge_complex_seqno(struct kunit *test) goto error_put_f2; }
- dma_fence_enable_sw_signaling(f3); + dma_fence_enable_signaling(f3);
f4 = __mock_fence(ctx[1], 2); if (!f4) { @@ -534,7 +534,7 @@ static void test_unwrap_merge_complex_seqno(struct kunit *test) goto error_put_f3; }
- dma_fence_enable_sw_signaling(f4); + dma_fence_enable_signaling(f4);
f5 = mock_array(2, dma_fence_get(f1), dma_fence_get(f2)); if (!f5) { diff --git a/drivers/dma-buf/st-dma-fence.c b/drivers/dma-buf/st-dma-fence.c index 499272229696..856d0d302a5d 100644 --- a/drivers/dma-buf/st-dma-fence.c +++ b/drivers/dma-buf/st-dma-fence.c @@ -42,7 +42,7 @@ static void test_sanitycheck(struct kunit *test) f = mock_fence(); KUNIT_ASSERT_NOT_NULL(test, f);
- dma_fence_enable_sw_signaling(f); + dma_fence_enable_signaling(f);
dma_fence_signal(f); dma_fence_put(f); @@ -55,7 +55,7 @@ static void test_signaling(struct kunit *test) f = mock_fence(); KUNIT_ASSERT_NOT_NULL(test, f);
- dma_fence_enable_sw_signaling(f); + dma_fence_enable_signaling(f);
if (dma_fence_is_signaled(f)) { KUNIT_FAIL(test, "Fence unexpectedly signaled on creation"); @@ -127,7 +127,7 @@ static void test_late_add_callback(struct kunit *test) f = mock_fence(); KUNIT_ASSERT_NOT_NULL(test, f);
- dma_fence_enable_sw_signaling(f); + dma_fence_enable_signaling(f);
dma_fence_signal(f);
@@ -209,7 +209,7 @@ static void test_status(struct kunit *test) f = mock_fence(); KUNIT_ASSERT_NOT_NULL(test, f);
- dma_fence_enable_sw_signaling(f); + dma_fence_enable_signaling(f);
if (dma_fence_get_status(f)) { KUNIT_FAIL(test, "Fence unexpectedly has signaled status on creation"); @@ -233,7 +233,7 @@ static void test_error(struct kunit *test) f = mock_fence(); KUNIT_ASSERT_NOT_NULL(test, f);
- dma_fence_enable_sw_signaling(f); + dma_fence_enable_signaling(f);
dma_fence_set_error(f, -EIO);
@@ -260,7 +260,7 @@ static void test_wait(struct kunit *test) f = mock_fence(); KUNIT_ASSERT_NOT_NULL(test, f);
- dma_fence_enable_sw_signaling(f); + dma_fence_enable_signaling(f);
if (dma_fence_wait_timeout(f, false, 0) != 0) { KUNIT_FAIL(test, "Wait reported complete before being signaled"); @@ -300,7 +300,7 @@ static void test_wait_timeout(struct kunit *test) wt.f = mock_fence(); KUNIT_ASSERT_NOT_NULL(test, wt.f);
- dma_fence_enable_sw_signaling(wt.f); + dma_fence_enable_signaling(wt.f);
if (dma_fence_wait_timeout(wt.f, false, 1) != 0) { KUNIT_FAIL(test, "Wait reported complete before being signaled"); @@ -379,7 +379,7 @@ static int thread_signal_callback(void *arg) break; }
- dma_fence_enable_sw_signaling(f1); + dma_fence_enable_signaling(f1);
rcu_assign_pointer(t->fences[t->id], f1); smp_wmb(); diff --git a/drivers/dma-buf/st-dma-resv.c b/drivers/dma-buf/st-dma-resv.c index 95a4becdb892..0b96136bbd54 100644 --- a/drivers/dma-buf/st-dma-resv.c +++ b/drivers/dma-buf/st-dma-resv.c @@ -48,7 +48,7 @@ static void test_sanitycheck(struct kunit *test) f = alloc_fence(); KUNIT_ASSERT_NOT_NULL(test, f);
- dma_fence_enable_sw_signaling(f); + dma_fence_enable_signaling(f);
dma_fence_signal(f); dma_fence_put(f); @@ -73,7 +73,7 @@ static void test_signaling(struct kunit *test) f = alloc_fence(); KUNIT_ASSERT_NOT_NULL(test, f);
- dma_fence_enable_sw_signaling(f); + dma_fence_enable_signaling(f);
dma_resv_init(&resv); r = dma_resv_lock(&resv, NULL); @@ -117,7 +117,7 @@ static void test_for_each(struct kunit *test) f = alloc_fence(); KUNIT_ASSERT_NOT_NULL(test, f);
- dma_fence_enable_sw_signaling(f); + dma_fence_enable_signaling(f);
dma_resv_init(&resv); r = dma_resv_lock(&resv, NULL); @@ -176,7 +176,7 @@ static void test_for_each_unlocked(struct kunit *test) f = alloc_fence(); KUNIT_ASSERT_NOT_NULL(test, f);
- dma_fence_enable_sw_signaling(f); + dma_fence_enable_signaling(f);
dma_resv_init(&resv); r = dma_resv_lock(&resv, NULL); @@ -246,7 +246,7 @@ static void test_get_fences(struct kunit *test) f = alloc_fence(); KUNIT_ASSERT_NOT_NULL(test, f);
- dma_fence_enable_sw_signaling(f); + dma_fence_enable_signaling(f);
dma_resv_init(&resv); r = dma_resv_lock(&resv, NULL); diff --git a/drivers/gpu/drm/i915/i915_active.c b/drivers/gpu/drm/i915/i915_active.c index 5cb7a72774a0..e7632c1ff4be 100644 --- a/drivers/gpu/drm/i915/i915_active.c +++ b/drivers/gpu/drm/i915/i915_active.c @@ -543,7 +543,7 @@ static void enable_signaling(struct i915_active_fence *active) if (!fence) return;
- dma_fence_enable_sw_signaling(fence); + dma_fence_enable_signaling(fence); dma_fence_put(fence); }
diff --git a/drivers/gpu/drm/ttm/tests/ttm_bo_validate_test.c b/drivers/gpu/drm/ttm/tests/ttm_bo_validate_test.c index 2db221f6fc3a..56ad8ef32584 100644 --- a/drivers/gpu/drm/ttm/tests/ttm_bo_validate_test.c +++ b/drivers/gpu/drm/ttm/tests/ttm_bo_validate_test.c @@ -69,7 +69,7 @@ static void dma_resv_kunit_active_fence_init(struct kunit *test, struct dma_fence *fence;
fence = alloc_mock_fence(test); - dma_fence_enable_sw_signaling(fence); + dma_fence_enable_signaling(fence);
dma_resv_lock(resv, NULL); dma_resv_reserve_fences(resv, 1); diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c index bcd76f6bb7f0..3980f376e3ba 100644 --- a/drivers/gpu/drm/ttm/ttm_bo.c +++ b/drivers/gpu/drm/ttm/ttm_bo.c @@ -224,7 +224,7 @@ static void ttm_bo_flush_all_fences(struct ttm_buffer_object *bo)
dma_resv_iter_begin(&cursor, resv, DMA_RESV_USAGE_BOOKKEEP); dma_resv_for_each_fence_unlocked(&cursor, fence) - dma_fence_enable_sw_signaling(fence); + dma_fence_enable_signaling(fence); dma_resv_iter_end(&cursor); }
diff --git a/drivers/gpu/drm/xe/xe_bo.c b/drivers/gpu/drm/xe/xe_bo.c index 4c80bac67622..85e6d9a0f575 100644 --- a/drivers/gpu/drm/xe/xe_bo.c +++ b/drivers/gpu/drm/xe/xe_bo.c @@ -670,7 +670,7 @@ static int xe_bo_trigger_rebind(struct xe_device *xe, struct xe_bo *bo, dma_resv_iter_begin(&cursor, bo->ttm.base.resv, DMA_RESV_USAGE_BOOKKEEP); dma_resv_for_each_fence_unlocked(&cursor, fence) - dma_fence_enable_sw_signaling(fence); + dma_fence_enable_signaling(fence); dma_resv_iter_end(&cursor); }
diff --git a/drivers/gpu/drm/xe/xe_sched_job.c b/drivers/gpu/drm/xe/xe_sched_job.c index ae5b38b2a884..a4fa00632a30 100644 --- a/drivers/gpu/drm/xe/xe_sched_job.c +++ b/drivers/gpu/drm/xe/xe_sched_job.c @@ -214,7 +214,7 @@ void xe_sched_job_set_error(struct xe_sched_job *job, int error)
trace_xe_sched_job_set_error(job);
- dma_fence_enable_sw_signaling(job->fence); + dma_fence_enable_signaling(job->fence); xe_hw_fence_irq_run(job->q->fence_irq); }
diff --git a/drivers/gpu/drm/xe/xe_svm.c b/drivers/gpu/drm/xe/xe_svm.c index e1651e70c8f0..dba73786d82a 100644 --- a/drivers/gpu/drm/xe/xe_svm.c +++ b/drivers/gpu/drm/xe/xe_svm.c @@ -1090,7 +1090,7 @@ static int xe_drm_pagemap_populate_mm(struct drm_pagemap *dpagemap, dma_resv_wait_timeout(bo->ttm.base.resv, DMA_RESV_USAGE_KERNEL, false, MAX_SCHEDULE_TIMEOUT); else if (pre_migrate_fence) - dma_fence_enable_sw_signaling(pre_migrate_fence); + dma_fence_enable_signaling(pre_migrate_fence); }
drm_pagemap_devmem_init(&bo->devmem_allocation, dev, mm, diff --git a/drivers/gpu/drm/xe/xe_userptr.c b/drivers/gpu/drm/xe/xe_userptr.c index 6761005c0b90..2e45e42c648f 100644 --- a/drivers/gpu/drm/xe/xe_userptr.c +++ b/drivers/gpu/drm/xe/xe_userptr.c @@ -180,7 +180,7 @@ xe_vma_userptr_invalidate_pass1(struct xe_vm *vm, struct xe_userptr_vma *uvma) dma_resv_iter_begin(&cursor, xe_vm_resv(vm), DMA_RESV_USAGE_BOOKKEEP); dma_resv_for_each_fence_unlocked(&cursor, fence) { - dma_fence_enable_sw_signaling(fence); + dma_fence_enable_signaling(fence); if (signaled && !dma_fence_is_signaled(fence)) signaled = false; } diff --git a/drivers/gpu/drm/xe/xe_vm.c b/drivers/gpu/drm/xe/xe_vm.c index 080c2fff0e95..73ac031ffb04 100644 --- a/drivers/gpu/drm/xe/xe_vm.c +++ b/drivers/gpu/drm/xe/xe_vm.c @@ -256,7 +256,7 @@ int xe_vm_add_compute_exec_queue(struct xe_vm *vm, struct xe_exec_queue *q) */ wait = __xe_vm_userptr_needs_repin(vm) || preempt_fences_waiting(vm); if (wait) - dma_fence_enable_sw_signaling(pfence); + dma_fence_enable_signaling(pfence);
xe_svm_notifier_unlock(vm);
@@ -287,7 +287,7 @@ void xe_vm_remove_compute_exec_queue(struct xe_vm *vm, struct xe_exec_queue *q) --vm->preempt.num_exec_queues; } if (q->lr.pfence) { - dma_fence_enable_sw_signaling(q->lr.pfence); + dma_fence_enable_signaling(q->lr.pfence); dma_fence_put(q->lr.pfence); q->lr.pfence = NULL; } diff --git a/include/linux/dma-fence.h b/include/linux/dma-fence.h index b52ab692b22e..158cd609f103 100644 --- a/include/linux/dma-fence.h +++ b/include/linux/dma-fence.h @@ -448,7 +448,7 @@ int dma_fence_add_callback(struct dma_fence *fence, dma_fence_func_t func); bool dma_fence_remove_callback(struct dma_fence *fence, struct dma_fence_cb *cb); -void dma_fence_enable_sw_signaling(struct dma_fence *fence); +void dma_fence_enable_signaling(struct dma_fence *fence);
/** * DOC: Safe external access to driver provided object members @@ -534,7 +534,7 @@ dma_fence_is_signaled_locked(struct dma_fence *fence) * Returns true if the fence was already signaled, false if not. Since this * function doesn't enable signaling, it is not guaranteed to ever return * true if dma_fence_add_callback(), dma_fence_wait() or - * dma_fence_enable_sw_signaling() haven't been called before. + * dma_fence_enable_signaling() haven't been called before. * * It's recommended for seqno fences to call dma_fence_signal when the * operation is complete, it makes it possible to prevent issues from
Instead of dma_fence_is_signaled_locked() use dma_fence_test_signaled_flag().
The extra polling check seems unnecessary for those use cases.
Signed-off-by: Christian König <christian.koenig@amd.com>
---
 drivers/dma-buf/sw_sync.c | 2 +-
 include/linux/dma-fence.h | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/dma-buf/sw_sync.c b/drivers/dma-buf/sw_sync.c
index 8df20b0218a9..243991bc1506 100644
--- a/drivers/dma-buf/sw_sync.c
+++ b/drivers/dma-buf/sw_sync.c
@@ -262,7 +262,7 @@ static struct sync_pt *sync_pt_create(struct sync_timeline *obj,
 	INIT_LIST_HEAD(&pt->link);
 
 	spin_lock_irq(&obj->lock);
-	if (!dma_fence_is_signaled_locked(&pt->base)) {
+	if (!dma_fence_test_signaled_flag(&pt->base)) {
 		struct rb_node **p = &obj->pt_tree.rb_node;
 		struct rb_node *parent = NULL;
 
diff --git a/include/linux/dma-fence.h b/include/linux/dma-fence.h
index 158cd609f103..803e10ca76e3 100644
--- a/include/linux/dma-fence.h
+++ b/include/linux/dma-fence.h
@@ -658,7 +658,7 @@ static inline struct dma_fence *dma_fence_later(struct dma_fence *f1,
  */
 static inline int dma_fence_get_status_locked(struct dma_fence *fence)
 {
-	if (dma_fence_is_signaled_locked(fence))
+	if (dma_fence_test_signaled_flag(fence))
 		return fence->error ?: 1;
 	else
 		return 0;
Instead of dma_fence_is_signaled_locked() use dma_fence_test_signaled_flag().
The extra polling check seems unnecessary for those use cases.
Signed-off-by: Christian König <christian.koenig@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c       | 8 ++++----
 drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c        | 2 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_userq_fence.c | 2 +-
 3 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
index ea69b1bac7c6..1192b9800ff2 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
@@ -652,7 +652,7 @@ void amdgpu_fence_driver_set_error(struct amdgpu_ring *ring, int error)
 
 		fence = rcu_dereference_protected(drv->fences[i],
 						  lockdep_is_held(&drv->lock));
-		if (fence && !dma_fence_is_signaled_locked(fence))
+		if (fence && !dma_fence_test_signaled_flag(fence))
 			dma_fence_set_error(fence, error);
 	}
 	spin_unlock_irqrestore(&drv->lock, flags);
@@ -677,7 +677,7 @@ void amdgpu_fence_driver_force_completion(struct amdgpu_ring *ring,
 
 		fence = rcu_dereference_protected(drv->fences[i],
 						  lockdep_is_held(&drv->lock));
-		if (fence && !dma_fence_is_signaled_locked(fence)) {
+		if (fence && !dma_fence_test_signaled_flag(fence)) {
 			if (fence == timedout_fence)
 				dma_fence_set_error(fence, -ETIME);
 			else
@@ -738,7 +738,7 @@ void amdgpu_ring_set_fence_errors_and_reemit(struct amdgpu_ring *ring,
 		rcu_read_lock();
 		unprocessed = rcu_dereference(*ptr);
 
-		if (unprocessed && !dma_fence_is_signaled_locked(unprocessed)) {
+		if (unprocessed && !dma_fence_test_signaled_flag(unprocessed)) {
 			fence = container_of(unprocessed, struct amdgpu_fence, base);
 			is_guilty_fence = fence == guilty_fence;
 			is_guilty_context = fence->context == guilty_fence->context;
@@ -802,7 +802,7 @@ void amdgpu_ring_backup_unprocessed_commands(struct amdgpu_ring *ring,
 		rcu_read_lock();
 		unprocessed = rcu_dereference(*ptr);
 
-		if (unprocessed && !dma_fence_is_signaled(unprocessed)) {
+		if (unprocessed && !dma_fence_test_signaled_flag(unprocessed)) {
 			fence = container_of(unprocessed, struct amdgpu_fence, base);
 
 			amdgpu_ring_backup_unprocessed_command(ring, fence);
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
index d6bee5c30073..ae9d6a2eefab 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
@@ -460,7 +460,7 @@ bool amdgpu_ring_soft_recovery(struct amdgpu_ring *ring, unsigned int vmid,
 		return false;
 
 	dma_fence_lock_irqsave(fence, flags);
-	if (!dma_fence_is_signaled_locked(fence))
+	if (!dma_fence_test_signaled_flag(fence))
 		dma_fence_set_error(fence, -ENODATA);
 	dma_fence_unlock_irqrestore(fence, flags);
 
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_userq_fence.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_userq_fence.c
index a41fb72dba94..2cc6552a6399 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_userq_fence.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_userq_fence.c
@@ -426,7 +426,7 @@ amdgpu_userq_fence_driver_set_error(struct amdgpu_userq_fence *fence,
 
 	f = rcu_dereference_protected(&fence->base,
 				      lockdep_is_held(&fence_drv->fence_list_lock));
-	if (f && !dma_fence_is_signaled_locked(f))
+	if (f && !dma_fence_test_signaled_flag(f))
 		dma_fence_set_error(f, error);
 	spin_unlock_irqrestore(&fence_drv->fence_list_lock, flags);
 }
Instead of dma_fence_is_signaled_locked() use dma_fence_test_signaled_flag().
The extra polling check seems unnecessary for those use cases.
Signed-off-by: Christian König <christian.koenig@amd.com>
---
 drivers/gpu/drm/nouveau/nouveau_fence.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/nouveau/nouveau_fence.c b/drivers/gpu/drm/nouveau/nouveau_fence.c
index edbe9e08ba0f..6601ef52e301 100644
--- a/drivers/gpu/drm/nouveau/nouveau_fence.c
+++ b/drivers/gpu/drm/nouveau/nouveau_fence.c
@@ -83,7 +83,7 @@ nouveau_fence_context_kill(struct nouveau_fence_chan *fctx, int error)
 
 	spin_lock_irqsave(&fctx->lock, flags);
 	list_for_each_entry_safe(fence, tmp, &fctx->pending, head) {
-		if (error && !dma_fence_is_signaled_locked(&fence->base))
+		if (error && !dma_fence_test_signaled_flag(&fence->base))
 			dma_fence_set_error(&fence->base, error);
 
 		if (nouveau_fence_signal(fence))
Instead of dma_fence_is_signaled_locked() use dma_fence_test_signaled_flag().
No functional difference, since the mock HW fence has no signaled callback anyway.
Signed-off-by: Christian König <christian.koenig@amd.com>
---
 drivers/gpu/drm/scheduler/tests/mock_scheduler.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/scheduler/tests/mock_scheduler.c b/drivers/gpu/drm/scheduler/tests/mock_scheduler.c
index 14403a762335..82dce344bfa1 100644
--- a/drivers/gpu/drm/scheduler/tests/mock_scheduler.c
+++ b/drivers/gpu/drm/scheduler/tests/mock_scheduler.c
@@ -224,7 +224,7 @@ mock_sched_timedout_job(struct drm_sched_job *sched_job)
 	}
 
 	spin_lock_irqsave(&sched->lock, flags);
-	if (!dma_fence_is_signaled_locked(&job->hw_fence)) {
+	if (!dma_fence_test_signaled_flag(&job->hw_fence)) {
 		list_del(&job->link);
 		job->flags |= DRM_MOCK_SCHED_JOB_TIMEDOUT;
 		dma_fence_set_error(&job->hw_fence, -ETIMEDOUT);
@@ -258,7 +258,7 @@ static void mock_sched_cancel_job(struct drm_sched_job *sched_job)
 	hrtimer_cancel(&job->timer);
 
 	spin_lock_irqsave(&sched->lock, flags);
-	if (!dma_fence_is_signaled_locked(&job->hw_fence)) {
+	if (!dma_fence_test_signaled_flag(&job->hw_fence)) {
 		list_del(&job->link);
 		dma_fence_set_error(&job->hw_fence, -ECANCELED);
 		dma_fence_signal_locked(&job->hw_fence);
This use case is a bit more complicated since the irq worker is actually the one signaling the fence.
The patch should not introduce any functional change, but the code can probably be cleaned up quite a bit after the full patch set lands.
Signed-off-by: Christian König <christian.koenig@amd.com>
---
 drivers/gpu/drm/xe/xe_hw_fence.c | 28 +++++++++++++++-------------
 1 file changed, 15 insertions(+), 13 deletions(-)

diff --git a/drivers/gpu/drm/xe/xe_hw_fence.c b/drivers/gpu/drm/xe/xe_hw_fence.c
index 14720623ad00..a4e0278559b8 100644
--- a/drivers/gpu/drm/xe/xe_hw_fence.c
+++ b/drivers/gpu/drm/xe/xe_hw_fence.c
@@ -16,6 +16,8 @@
 
 static struct kmem_cache *xe_hw_fence_slab;
 
+static struct xe_hw_fence *to_xe_hw_fence(struct dma_fence *fence);
+
 int __init xe_hw_fence_module_init(void)
 {
 	xe_hw_fence_slab = kmem_cache_create("xe_hw_fence",
@@ -47,6 +49,16 @@ static void fence_free(struct rcu_head *rcu)
 	kmem_cache_free(xe_hw_fence_slab, fence);
 }
 
+static bool xe_hw_fence_signaled(struct dma_fence *dma_fence)
+{
+	struct xe_hw_fence *fence = to_xe_hw_fence(dma_fence);
+	struct xe_device *xe = fence->xe;
+	u32 seqno = xe_map_rd(xe, &fence->seqno_map, 0, u32);
+
+	return dma_fence->error ||
+		!__dma_fence_is_later(dma_fence, dma_fence->seqno, seqno);
+}
+
 static void hw_fence_irq_run_cb(struct irq_work *work)
 {
 	struct xe_hw_fence_irq *irq = container_of(work, typeof(*irq), work);
@@ -60,7 +72,9 @@ static void hw_fence_irq_run_cb(struct irq_work *work)
 			struct dma_fence *dma_fence = &fence->dma;
 
 			trace_xe_hw_fence_try_signal(fence);
-			if (dma_fence_is_signaled_locked(dma_fence)) {
+			if (dma_fence_test_signaled_flag(dma_fence) ||
+			    xe_hw_fence_signaled(dma_fence)) {
+				dma_fence_signal_locked(dma_fence);
 				trace_xe_hw_fence_signal(fence);
 				list_del_init(&fence->irq_link);
 				dma_fence_put(dma_fence);
@@ -120,8 +134,6 @@ void xe_hw_fence_ctx_finish(struct xe_hw_fence_ctx *ctx)
 {
 }
 
-static struct xe_hw_fence *to_xe_hw_fence(struct dma_fence *fence);
-
 static struct xe_hw_fence_irq *xe_hw_fence_irq(struct xe_hw_fence *fence)
 {
 	return container_of(fence->dma.extern_lock, struct xe_hw_fence_irq,
@@ -142,16 +154,6 @@ static const char *xe_hw_fence_get_timeline_name(struct dma_fence *dma_fence)
 	return fence->name;
 }
 
-static bool xe_hw_fence_signaled(struct dma_fence *dma_fence)
-{
-	struct xe_hw_fence *fence = to_xe_hw_fence(dma_fence);
-	struct xe_device *xe = fence->xe;
-	u32 seqno = xe_map_rd(xe, &fence->seqno_map, 0, u32);
-
-	return dma_fence->error ||
-		!__dma_fence_is_later(dma_fence, dma_fence->seqno, seqno);
-}
-
 static bool xe_hw_fence_enable_signaling(struct dma_fence *dma_fence)
 {
 	struct xe_hw_fence *fence = to_xe_hw_fence(dma_fence);
Finally, remove one of the biggest troublemakers in the dma_fence handling.
The signaled callback is now consistently called without holding the dma_fence lock.
Signed-off-by: Christian König <christian.koenig@amd.com>
---
 include/linux/dma-fence.h | 36 ------------------------------------
 1 file changed, 36 deletions(-)

diff --git a/include/linux/dma-fence.h b/include/linux/dma-fence.h
index 803e10ca76e3..ad69acbea218 100644
--- a/include/linux/dma-fence.h
+++ b/include/linux/dma-fence.h
@@ -493,40 +493,6 @@ dma_fence_test_signaled_flag(struct dma_fence *fence)
 	return test_bit(DMA_FENCE_FLAG_SIGNALED_BIT, &fence->flags);
 }
 
-/**
- * dma_fence_is_signaled_locked - Return an indication if the fence
- *                                is signaled yet.
- * @fence: the fence to check
- *
- * Returns true if the fence was already signaled, false if not. Since this
- * function doesn't enable signaling, it is not guaranteed to ever return
- * true if dma_fence_add_callback(), dma_fence_wait() or
- * dma_fence_enable_sw_signaling() haven't been called before.
- *
- * This function requires &dma_fence.lock to be held.
- *
- * See also dma_fence_is_signaled().
- */
-static inline bool
-dma_fence_is_signaled_locked(struct dma_fence *fence)
-{
-	const struct dma_fence_ops *ops;
-
-	if (dma_fence_test_signaled_flag(fence))
-		return true;
-
-	rcu_read_lock();
-	ops = rcu_dereference(fence->ops);
-	if (ops && ops->signaled && ops->signaled(fence)) {
-		rcu_read_unlock();
-		dma_fence_signal_locked(fence);
-		return true;
-	}
-	rcu_read_unlock();
-
-	return false;
-}
-
 /**
  * dma_fence_is_signaled - Return an indication if the fence is signaled yet.
  * @fence: the fence to check
@@ -540,8 +506,6 @@ dma_fence_is_signaled_locked(struct dma_fence *fence)
  * operation is complete, it makes it possible to prevent issues from
  * wraparound between time of issue and time of use by checking the return
  * value of this function before calling hardware-specific wait instructions.
- *
- * See also dma_fence_is_signaled_locked().
  */
 static inline bool
 dma_fence_is_signaled(struct dma_fence *fence)
Rename the dma_fence_ops.signaled callback to check_signaled and move the call to dma_fence_signal() into the actual drivers.
This way backends can do cleanup after calling dma_fence_signal().
For example, it might be necessary to remove items from linked lists, drop additional references, start work items, etc.
It also gives backends a clean point to know when all registered callbacks are finished.
No intended functional change.
Assisted-by: Claude Sonnet 4
Signed-off-by: Christian König <christian.koenig@amd.com>
---
 drivers/dma-buf/dma-fence-array.c             | 10 +++----
 drivers/dma-buf/dma-fence-chain.c             | 16 +++++++-----
 drivers/dma-buf/sw_sync.c                     |  9 ++++---
 .../gpu/drm/amd/amdgpu/amdgpu_userq_fence.c   |  8 +++---
 drivers/gpu/drm/etnaviv/etnaviv_gpu.c         |  7 ++---
 drivers/gpu/drm/i915/i915_request.c           |  7 ++---
 drivers/gpu/drm/msm/msm_fence.c               |  8 +++---
 drivers/gpu/drm/nouveau/nouveau_fence.c       | 12 +++++----
 drivers/gpu/drm/radeon/radeon_fence.c         |  8 +++---
 drivers/gpu/drm/vc4/vc4_fence.c               |  7 ++---
 drivers/gpu/drm/virtio/virtgpu_fence.c        |  5 ++--
 drivers/gpu/drm/xe/xe_hw_fence.c              | 17 ++++++------
 include/linux/dma-fence.h                     | 26 +++++++++----------
 13 files changed, 72 insertions(+), 68 deletions(-)

diff --git a/drivers/dma-buf/dma-fence-array.c b/drivers/dma-buf/dma-fence-array.c
index 5e10e8df372f..49ea2ba7c460 100644
--- a/drivers/dma-buf/dma-fence-array.c
+++ b/drivers/dma-buf/dma-fence-array.c
@@ -100,7 +100,7 @@ static bool dma_fence_array_enable_signaling(struct dma_fence *fence)
 	return true;
 }
 
-static bool dma_fence_array_signaled(struct dma_fence *fence)
+static void dma_fence_array_signaled(struct dma_fence *fence)
 {
 	struct dma_fence_array *array = to_dma_fence_array(fence);
 	int num_pending;
@@ -123,18 +123,18 @@ static bool dma_fence_array_signaled(struct dma_fence *fence)
 	if (test_bit(DMA_FENCE_FLAG_ENABLE_SIGNAL_BIT, &array->base.flags)) {
 		if (num_pending <= 0)
 			goto signal;
-		return false;
+		return;
 	}
 
 	for (i = 0; i < array->num_fences; ++i) {
 		if (dma_fence_is_signaled(array->fences[i]) && !--num_pending)
 			goto signal;
 	}
-	return false;
+	return;
 
 signal:
 	dma_fence_array_clear_pending_error(array);
-	return true;
+	dma_fence_signal(fence);
 }
 
 static void dma_fence_array_release(struct dma_fence *fence)
@@ -163,7 +163,7 @@ const struct dma_fence_ops dma_fence_array_ops = {
 	.get_driver_name = dma_fence_array_get_driver_name,
 	.get_timeline_name = dma_fence_array_get_timeline_name,
 	.enable_signaling = dma_fence_array_enable_signaling,
-	.signaled = dma_fence_array_signaled,
+	.check_signaled = dma_fence_array_signaled,
 	.release = dma_fence_array_release,
 	.set_deadline = dma_fence_array_set_deadline,
 };
diff --git a/drivers/dma-buf/dma-fence-chain.c b/drivers/dma-buf/dma-fence-chain.c
index a588f55ea4d3..ff4f02900237 100644
--- a/drivers/dma-buf/dma-fence-chain.c
+++ b/drivers/dma-buf/dma-fence-chain.c
@@ -161,18 +161,20 @@ static bool dma_fence_chain_enable_signaling(struct dma_fence *fence)
 	return false;
 }
 
-static bool dma_fence_chain_signaled(struct dma_fence *fence)
+static void dma_fence_chain_signaled(struct dma_fence *fence)
 {
-	dma_fence_chain_for_each(fence, fence) {
-		struct dma_fence *f = dma_fence_chain_contained(fence);
+	struct dma_fence *iter = fence;
+
+	dma_fence_chain_for_each(iter, iter) {
+		struct dma_fence *f = dma_fence_chain_contained(iter);
 
 		if (!dma_fence_is_signaled(f)) {
-			dma_fence_put(fence);
-			return false;
+			dma_fence_put(iter);
+			return;
 		}
 	}
 
-	return true;
+	dma_fence_signal(fence);
 }
 
 static void dma_fence_chain_release(struct dma_fence *fence)
@@ -221,7 +223,7 @@ const struct dma_fence_ops dma_fence_chain_ops = {
 	.get_driver_name = dma_fence_chain_get_driver_name,
 	.get_timeline_name = dma_fence_chain_get_timeline_name,
 	.enable_signaling = dma_fence_chain_enable_signaling,
-	.signaled = dma_fence_chain_signaled,
+	.check_signaled = dma_fence_chain_signaled,
 	.release = dma_fence_chain_release,
 	.set_deadline = dma_fence_chain_set_deadline,
 };
diff --git a/drivers/dma-buf/sw_sync.c b/drivers/dma-buf/sw_sync.c
index 243991bc1506..c3b2563f2541 100644
--- a/drivers/dma-buf/sw_sync.c
+++ b/drivers/dma-buf/sw_sync.c
@@ -167,11 +167,12 @@ static void timeline_fence_release(struct dma_fence *fence)
 	dma_fence_free(fence);
 }
-static bool timeline_fence_signaled(struct dma_fence *fence) +static void timeline_fence_signaled(struct dma_fence *fence) { struct sync_timeline *parent = dma_fence_parent(fence);
- return !__dma_fence_is_later(fence, fence->seqno, parent->value); + if (!__dma_fence_is_later(fence, fence->seqno, parent->value)) + dma_fence_signal(fence); }
static void timeline_fence_set_deadline(struct dma_fence *fence, ktime_t deadline) @@ -193,7 +194,7 @@ static void timeline_fence_set_deadline(struct dma_fence *fence, ktime_t deadlin static const struct dma_fence_ops timeline_fence_ops = { .get_driver_name = timeline_fence_get_driver_name, .get_timeline_name = timeline_fence_get_timeline_name, - .signaled = timeline_fence_signaled, + .check_signaled = timeline_fence_signaled, .release = timeline_fence_release, .set_deadline = timeline_fence_set_deadline, }; @@ -218,7 +219,7 @@ static void sync_timeline_signal(struct sync_timeline *obj, unsigned int inc) obj->value += inc;
list_for_each_entry_safe(pt, next, &obj->pt_list, link) { - if (!timeline_fence_signaled(&pt->base)) + if (__dma_fence_is_later(&pt->base, pt->base.seqno, obj->value)) break;
dma_fence_get(&pt->base); diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_userq_fence.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_userq_fence.c index 2cc6552a6399..b0c904a74f7a 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_userq_fence.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_userq_fence.c @@ -311,7 +311,7 @@ static const char *amdgpu_userq_fence_get_timeline_name(struct dma_fence *f) return fence->fence_drv->timeline_name; }
-static bool amdgpu_userq_fence_signaled(struct dma_fence *f) +static void amdgpu_userq_fence_signaled(struct dma_fence *f) { struct amdgpu_userq_fence *fence = to_amdgpu_userq_fence(f); struct amdgpu_userq_fence_driver *fence_drv = fence->fence_drv; @@ -321,9 +321,7 @@ static bool amdgpu_userq_fence_signaled(struct dma_fence *f) wptr = fence->base.seqno;
if (rptr >= wptr) - return true; - - return false; + dma_fence_signal(f); }
static void amdgpu_userq_fence_free(struct rcu_head *rcu) @@ -347,7 +345,7 @@ static void amdgpu_userq_fence_release(struct dma_fence *f) static const struct dma_fence_ops amdgpu_userq_fence_ops = { .get_driver_name = amdgpu_userq_fence_get_driver_name, .get_timeline_name = amdgpu_userq_fence_get_timeline_name, - .signaled = amdgpu_userq_fence_signaled, + .check_signaled = amdgpu_userq_fence_signaled, .release = amdgpu_userq_fence_release, };
diff --git a/drivers/gpu/drm/etnaviv/etnaviv_gpu.c b/drivers/gpu/drm/etnaviv/etnaviv_gpu.c index a891d4f1f843..4f19c4c2a232 100644 --- a/drivers/gpu/drm/etnaviv/etnaviv_gpu.c +++ b/drivers/gpu/drm/etnaviv/etnaviv_gpu.c @@ -1153,11 +1153,12 @@ static const char *etnaviv_fence_get_timeline_name(struct dma_fence *fence) return dev_name(f->gpu->dev); }
-static bool etnaviv_fence_signaled(struct dma_fence *fence) +static void etnaviv_fence_signaled(struct dma_fence *fence) { struct etnaviv_fence *f = to_etnaviv_fence(fence);
- return (s32)(f->gpu->completed_fence - f->base.seqno) >= 0; + if ((s32)(f->gpu->completed_fence - f->base.seqno) >= 0) + dma_fence_signal(fence); }
static void etnaviv_fence_release(struct dma_fence *fence) @@ -1170,7 +1171,7 @@ static void etnaviv_fence_release(struct dma_fence *fence) static const struct dma_fence_ops etnaviv_fence_ops = { .get_driver_name = etnaviv_fence_get_driver_name, .get_timeline_name = etnaviv_fence_get_timeline_name, - .signaled = etnaviv_fence_signaled, + .check_signaled = etnaviv_fence_signaled, .release = etnaviv_fence_release, };
diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c index d2c7b1090df0..c39a7f4b6dc7 100644 --- a/drivers/gpu/drm/i915/i915_request.c +++ b/drivers/gpu/drm/i915/i915_request.c @@ -87,9 +87,10 @@ static const char *i915_fence_get_timeline_name(struct dma_fence *fence) return ctx->name; }
-static bool i915_fence_signaled(struct dma_fence *fence) +static void i915_fence_signaled(struct dma_fence *fence) { - return i915_request_completed(to_request(fence)); + if (i915_request_completed(to_request(fence))) + dma_fence_signal(fence); }
static bool i915_fence_enable_signaling(struct dma_fence *fence) @@ -176,7 +177,7 @@ const struct dma_fence_ops i915_fence_ops = { .get_driver_name = i915_fence_get_driver_name, .get_timeline_name = i915_fence_get_timeline_name, .enable_signaling = i915_fence_enable_signaling, - .signaled = i915_fence_signaled, + .check_signaled = i915_fence_signaled, .wait = i915_fence_wait, .release = i915_fence_release, }; diff --git a/drivers/gpu/drm/msm/msm_fence.c b/drivers/gpu/drm/msm/msm_fence.c index 3dca8e09c192..a3297d3194ca 100644 --- a/drivers/gpu/drm/msm/msm_fence.c +++ b/drivers/gpu/drm/msm/msm_fence.c @@ -123,10 +123,12 @@ static const char *msm_fence_get_timeline_name(struct dma_fence *fence) return f->fctx->name; }
-static bool msm_fence_signaled(struct dma_fence *fence) +static void msm_fence_signaled(struct dma_fence *fence) { struct msm_fence *f = to_msm_fence(fence); - return msm_fence_completed(f->fctx, f->base.seqno); + + if (msm_fence_completed(f->fctx, f->base.seqno)) + dma_fence_signal(fence); }
static void msm_fence_set_deadline(struct dma_fence *fence, ktime_t deadline) @@ -167,7 +169,7 @@ static void msm_fence_set_deadline(struct dma_fence *fence, ktime_t deadline) static const struct dma_fence_ops msm_fence_ops = { .get_driver_name = msm_fence_get_driver_name, .get_timeline_name = msm_fence_get_timeline_name, - .signaled = msm_fence_signaled, + .check_signaled = msm_fence_signaled, .set_deadline = msm_fence_set_deadline, };
diff --git a/drivers/gpu/drm/nouveau/nouveau_fence.c b/drivers/gpu/drm/nouveau/nouveau_fence.c index 6601ef52e301..64df1d7de460 100644 --- a/drivers/gpu/drm/nouveau/nouveau_fence.c +++ b/drivers/gpu/drm/nouveau/nouveau_fence.c @@ -455,7 +455,7 @@ static const char *nouveau_fence_get_timeline_name(struct dma_fence *f) * result. The drm node should still be there, so we can derive the index from * the fence context. */ -static bool nouveau_fence_is_signaled(struct dma_fence *f) +static void nouveau_fence_is_signaled(struct dma_fence *f) { struct nouveau_fence *fence = to_nouveau_fence(f); struct nouveau_fence_chan *fctx = nouveau_fctx(fence); @@ -468,7 +468,8 @@ static bool nouveau_fence_is_signaled(struct dma_fence *f) ret = (int)(fctx->read(chan) - fence->base.seqno) >= 0; rcu_read_unlock();
- return ret; + if (ret) + dma_fence_signal(f); }
static bool nouveau_fence_no_signaling(struct dma_fence *f) @@ -486,7 +487,8 @@ static bool nouveau_fence_no_signaling(struct dma_fence *f) * being able to enable signaling. It will still get signaled eventually, * just not right away. */ - if (nouveau_fence_is_signaled(f)) { + nouveau_fence_is_signaled(f); + if (dma_fence_test_signaled_flag(f)) { list_del(&fence->head);
dma_fence_put(&fence->base); @@ -509,7 +511,7 @@ static const struct dma_fence_ops nouveau_fence_ops_legacy = { .get_driver_name = nouveau_fence_get_get_driver_name, .get_timeline_name = nouveau_fence_get_timeline_name, .enable_signaling = nouveau_fence_no_signaling, - .signaled = nouveau_fence_is_signaled, + .check_signaled = nouveau_fence_is_signaled, .wait = nouveau_fence_wait_legacy, .release = nouveau_fence_release }; @@ -536,6 +538,6 @@ static const struct dma_fence_ops nouveau_fence_ops_uevent = { .get_driver_name = nouveau_fence_get_get_driver_name, .get_timeline_name = nouveau_fence_get_timeline_name, .enable_signaling = nouveau_fence_enable_signaling, - .signaled = nouveau_fence_is_signaled, + .check_signaled = nouveau_fence_is_signaled, .release = nouveau_fence_release }; diff --git a/drivers/gpu/drm/radeon/radeon_fence.c b/drivers/gpu/drm/radeon/radeon_fence.c index 02a40e4750c7..45f01ebe5a78 100644 --- a/drivers/gpu/drm/radeon/radeon_fence.c +++ b/drivers/gpu/drm/radeon/radeon_fence.c @@ -350,7 +350,7 @@ static bool radeon_fence_seq_signaled(struct radeon_device *rdev, return false; }
-static bool radeon_fence_is_signaled(struct dma_fence *f) +static void radeon_fence_is_signaled(struct dma_fence *f) { struct radeon_fence *fence = to_radeon_fence(f); struct radeon_device *rdev = fence->rdev; @@ -358,9 +358,7 @@ static bool radeon_fence_is_signaled(struct dma_fence *f) u64 seq = fence->seq;
if (atomic64_read(&rdev->fence_drv[ring].last_seq) >= seq) - return true; - - return false; + dma_fence_signal(f); }
/** @@ -1046,7 +1044,7 @@ const struct dma_fence_ops radeon_fence_ops = { .get_driver_name = radeon_fence_get_driver_name, .get_timeline_name = radeon_fence_get_timeline_name, .enable_signaling = radeon_fence_enable_signaling, - .signaled = radeon_fence_is_signaled, + .check_signaled = radeon_fence_is_signaled, .wait = radeon_fence_default_wait, .release = NULL, }; diff --git a/drivers/gpu/drm/vc4/vc4_fence.c b/drivers/gpu/drm/vc4/vc4_fence.c index 580214e2158c..3db2588906ac 100644 --- a/drivers/gpu/drm/vc4/vc4_fence.c +++ b/drivers/gpu/drm/vc4/vc4_fence.c @@ -33,16 +33,17 @@ static const char *vc4_fence_get_timeline_name(struct dma_fence *fence) return "vc4-v3d"; }
-static bool vc4_fence_signaled(struct dma_fence *fence) +static void vc4_fence_signaled(struct dma_fence *fence) { struct vc4_fence *f = to_vc4_fence(fence); struct vc4_dev *vc4 = to_vc4_dev(f->dev);
- return vc4->finished_seqno >= f->seqno; + if (vc4->finished_seqno >= f->seqno) + dma_fence_signal(fence); }
const struct dma_fence_ops vc4_fence_ops = { .get_driver_name = vc4_fence_get_driver_name, .get_timeline_name = vc4_fence_get_timeline_name, - .signaled = vc4_fence_signaled, + .check_signaled = vc4_fence_signaled, }; diff --git a/drivers/gpu/drm/virtio/virtgpu_fence.c b/drivers/gpu/drm/virtio/virtgpu_fence.c index c3e66ef2133a..2118de27bd14 100644 --- a/drivers/gpu/drm/virtio/virtgpu_fence.c +++ b/drivers/gpu/drm/virtio/virtgpu_fence.c @@ -40,19 +40,18 @@ static const char *virtio_gpu_get_timeline_name(struct dma_fence *f) return "controlq"; }
-static bool virtio_gpu_fence_signaled(struct dma_fence *f) +static void virtio_gpu_fence_signaled(struct dma_fence *f) { /* leaked fence outside driver before completing * initialization with virtio_gpu_fence_emit. */ WARN_ON_ONCE(f->seqno == 0); - return false; }
static const struct dma_fence_ops virtio_gpu_fence_ops = { .get_driver_name = virtio_gpu_get_driver_name, .get_timeline_name = virtio_gpu_get_timeline_name, - .signaled = virtio_gpu_fence_signaled, + .check_signaled = virtio_gpu_fence_signaled, };
struct virtio_gpu_fence *virtio_gpu_fence_alloc(struct virtio_gpu_device *vgdev, diff --git a/drivers/gpu/drm/xe/xe_hw_fence.c b/drivers/gpu/drm/xe/xe_hw_fence.c index a4e0278559b8..bda2fde0b216 100644 --- a/drivers/gpu/drm/xe/xe_hw_fence.c +++ b/drivers/gpu/drm/xe/xe_hw_fence.c @@ -49,14 +49,15 @@ static void fence_free(struct rcu_head *rcu) kmem_cache_free(xe_hw_fence_slab, fence); }
-static bool xe_hw_fence_signaled(struct dma_fence *dma_fence) +static void xe_hw_fence_signaled(struct dma_fence *dma_fence) { struct xe_hw_fence *fence = to_xe_hw_fence(dma_fence); struct xe_device *xe = fence->xe; u32 seqno = xe_map_rd(xe, &fence->seqno_map, 0, u32);
- return dma_fence->error || - !__dma_fence_is_later(dma_fence, dma_fence->seqno, seqno); + if (dma_fence->error || + !__dma_fence_is_later(dma_fence, dma_fence->seqno, seqno)) + dma_fence_signal(dma_fence); }
static void hw_fence_irq_run_cb(struct irq_work *work) @@ -72,9 +73,8 @@ static void hw_fence_irq_run_cb(struct irq_work *work) struct dma_fence *dma_fence = &fence->dma;
trace_xe_hw_fence_try_signal(fence); - if (dma_fence_test_signaled_flag(dma_fence) || - xe_hw_fence_signaled(dma_fence)) { - dma_fence_signal_locked(dma_fence); + xe_hw_fence_signaled(dma_fence); + if (dma_fence_test_signaled_flag(dma_fence)) { trace_xe_hw_fence_signal(fence); list_del_init(&fence->irq_link); dma_fence_put(dma_fence); @@ -163,7 +163,8 @@ static bool xe_hw_fence_enable_signaling(struct dma_fence *dma_fence) list_add_tail(&fence->irq_link, &irq->pending);
/* SW completed (no HW IRQ) so kick handler to signal fence */ - if (xe_hw_fence_signaled(dma_fence)) + xe_hw_fence_signaled(dma_fence); + if (dma_fence_test_signaled_flag(dma_fence)) xe_hw_fence_irq_run(irq);
return true; @@ -181,7 +182,7 @@ static const struct dma_fence_ops xe_hw_fence_ops = { .get_driver_name = xe_hw_fence_get_driver_name, .get_timeline_name = xe_hw_fence_get_timeline_name, .enable_signaling = xe_hw_fence_enable_signaling, - .signaled = xe_hw_fence_signaled, + .check_signaled = xe_hw_fence_signaled, .release = xe_hw_fence_release, };
diff --git a/include/linux/dma-fence.h b/include/linux/dma-fence.h
index ad69acbea218..e93ea4ac0636 100644
--- a/include/linux/dma-fence.h
+++ b/include/linux/dma-fence.h
@@ -195,21 +195,22 @@ struct dma_fence_ops {
 	bool (*enable_signaling)(struct dma_fence *fence);
 
 	/**
-	 * @signaled:
+	 * @check_signaled:
 	 *
 	 * Peek whether the fence is signaled, as a fastpath optimization for
-	 * e.g. dma_fence_wait() or dma_fence_add_callback(). Note that this
+	 * e.g. dma_fence_wait() or dma_fence_add_callback(). If the fence is
+	 * signaled, the implementation must call dma_fence_signal(). This
 	 * callback does not need to make any guarantees beyond that a fence
-	 * once indicates as signalled must always return true from this
-	 * callback. This callback may return false even if the fence has
-	 * completed already, in this case information hasn't propogated throug
-	 * the system yet. See also dma_fence_is_signaled().
+	 * that is signaled will have dma_fence_signal() called. The callback
+	 * may be called even if the fence has already been signaled, in which
+	 * case the dma_fence_signal() call will be a no-op. See also
+	 * dma_fence_is_signaled().
 	 *
-	 * May set &dma_fence.error if returning true.
+	 * May set &dma_fence.error before calling dma_fence_signal().
 	 *
 	 * This callback is optional.
 	 */
-	bool (*signaled)(struct dma_fence *fence);
+	void (*check_signaled)(struct dma_fence *fence);
 
 	/**
 	 * @wait:
@@ -517,14 +518,11 @@ dma_fence_is_signaled(struct dma_fence *fence)
 
 	rcu_read_lock();
 	ops = rcu_dereference(fence->ops);
-	if (ops && ops->signaled && ops->signaled(fence)) {
-		rcu_read_unlock();
-		dma_fence_signal(fence);
-		return true;
-	}
+	if (ops && ops->check_signaled)
+		ops->check_signaled(fence);
 	rcu_read_unlock();
 
-	return false;
+	return dma_fence_test_signaled_flag(fence);
 }
 
 /**
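Instead of having the dma-fence layer call dma_fence_signal_locked() when enable_signaling reports failure, move that call into the backend implementations where necessary.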
This way backends can do cleanup after calling dma_fence_signal().
For example, it might be necessary to remove items from linked lists, drop additional references, or start work items.
It also gives backends a clean point to know when all registered callbacks are finished.
No intended functional change.
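As an illustration only, the resulting enable_signaling contract can be sketched in userspace as follows. All toy_* names are hypothetical stand-ins, locking is omitted entirely, and the single callback slot replaces the real callback list for brevity. The only detail borrowed from the real API is that adding a callback to an already signaled fence fails with -ENOENT.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins; none of these are kernel types. */
struct toy_fence;
typedef void (*toy_cb_fn)(struct toy_fence *fence);

struct toy_fence_ops {
	/* Mirrors the void-returning enable_signaling callback. */
	void (*enable_signaling)(struct toy_fence *fence);
};

struct toy_fence {
	const struct toy_fence_ops *ops;
	bool signaled;          /* stands in for the signaled flag bit */
	bool signaling_enabled; /* stands in for the enable-signal flag bit */
	bool hw_done;           /* pretend hardware state */
	toy_cb_fn cb;           /* single callback slot, for brevity */
	int irq_refs;           /* backend state set up when arming */
};

/* Stand-in for dma_fence_signal_locked(); locking itself is omitted. */
static void toy_fence_signal_locked(struct toy_fence *fence)
{
	toy_cb_fn cb = fence->cb;

	if (fence->signaled)
		return;
	fence->signaled = true;
	fence->cb = NULL;
	if (cb)
		cb(fence);
}

/* Backend: signal directly if already done, otherwise arm the "IRQ". */
static void toy_backend_enable_signaling(struct toy_fence *fence)
{
	if (fence->hw_done) {
		toy_fence_signal_locked(fence);
		return;
	}
	fence->irq_refs++;
}

static const struct toy_fence_ops toy_backend_ops = {
	.enable_signaling = toy_backend_enable_signaling,
};

/* Stand-in for dma_fence_add_callback() with the new contract. */
static int toy_fence_add_callback(struct toy_fence *fence, toy_cb_fn fn)
{
	if (!fence->signaled && !fence->signaling_enabled) {
		fence->signaling_enabled = true;
		if (fence->ops && fence->ops->enable_signaling)
			fence->ops->enable_signaling(fence);
	}

	/* No return value to inspect any more; test the flag instead. */
	if (fence->signaled)
		return -ENOENT;

	fence->cb = fn;
	return 0;
}

/* Small helper callback used to observe signaling in the sketch. */
static int toy_cb_calls;

static void toy_count_cb(struct toy_fence *fence)
{
	(void)fence;
	toy_cb_calls++;
}
```

In this sketch the core never signals on behalf of the backend; it only looks at the flag after the callback has returned.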
Assisted-by: Claude Sonnet 4
Signed-off-by: Christian König <christian.koenig@amd.com>
---
 drivers/dma-buf/dma-fence-array.c                |  7 +++----
 drivers/dma-buf/dma-fence-chain.c                | 16 +++++++++-------
 drivers/dma-buf/dma-fence.c                      | 16 +++++-----------
 drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_fence.c | 16 +++++++++-------
 .../gpu/drm/amd/amdgpu/amdgpu_eviction_fence.c   |  3 +--
 drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c        |  4 +---
 drivers/gpu/drm/i915/i915_request.c              |  5 +++--
 drivers/gpu/drm/nouveau/nouveau_fence.c          | 15 +++++----------
 drivers/gpu/drm/radeon/radeon_fence.c            | 12 +++++++-----
 drivers/gpu/drm/vmwgfx/vmwgfx_fence.c            |  6 +++---
 drivers/gpu/drm/xe/xe_hw_fence.c                 |  4 +---
 drivers/gpu/drm/xe/xe_preempt_fence.c            |  3 +--
 drivers/gpu/host1x/fence.c                       | 10 +++++-----
 include/linux/dma-fence.h                        | 12 ++++++------
 14 files changed, 59 insertions(+), 70 deletions(-)
diff --git a/drivers/dma-buf/dma-fence-array.c b/drivers/dma-buf/dma-fence-array.c index 49ea2ba7c460..541c9c169624 100644 --- a/drivers/dma-buf/dma-fence-array.c +++ b/drivers/dma-buf/dma-fence-array.c @@ -67,7 +67,7 @@ static void dma_fence_array_cb_func(struct dma_fence *f, dma_fence_put(&array->base); }
-static bool dma_fence_array_enable_signaling(struct dma_fence *fence) +static void dma_fence_array_enable_signaling(struct dma_fence *fence) { struct dma_fence_array *array = to_dma_fence_array(fence); struct dma_fence_array_cb *cb = array->callbacks; @@ -92,12 +92,11 @@ static bool dma_fence_array_enable_signaling(struct dma_fence *fence) dma_fence_put(&array->base); if (atomic_dec_and_test(&array->num_pending)) { dma_fence_array_clear_pending_error(array); - return false; + dma_fence_signal_locked(fence); + return; } } } - - return true; }
static void dma_fence_array_signaled(struct dma_fence *fence) diff --git a/drivers/dma-buf/dma-fence-chain.c b/drivers/dma-buf/dma-fence-chain.c index ff4f02900237..6617f4150c73 100644 --- a/drivers/dma-buf/dma-fence-chain.c +++ b/drivers/dma-buf/dma-fence-chain.c @@ -9,7 +9,7 @@
#include <linux/dma-fence-chain.h>
-static bool dma_fence_chain_enable_signaling(struct dma_fence *fence); +static void dma_fence_chain_enable_signaling(struct dma_fence *fence);
/** * dma_fence_chain_get_prev - use RCU to get a reference to the previous fence @@ -122,13 +122,14 @@ static const char *dma_fence_chain_get_timeline_name(struct dma_fence *fence) static void dma_fence_chain_irq_work(struct irq_work *work) { struct dma_fence_chain *chain; + unsigned long flags;
chain = container_of(work, typeof(*chain), work);
/* Try to rearm the callback */ - if (!dma_fence_chain_enable_signaling(&chain->base)) - /* Ok, we are done. No more unsignaled fences left */ - dma_fence_signal(&chain->base); + dma_fence_lock_irqsave(&chain->base, flags); + dma_fence_chain_enable_signaling(&chain->base); + dma_fence_unlock_irqrestore(&chain->base, flags); dma_fence_put(&chain->base); }
@@ -142,7 +143,7 @@ static void dma_fence_chain_cb(struct dma_fence *f, struct dma_fence_cb *cb) dma_fence_put(f); }
-static bool dma_fence_chain_enable_signaling(struct dma_fence *fence) +static void dma_fence_chain_enable_signaling(struct dma_fence *fence) { struct dma_fence_chain *head = to_dma_fence_chain(fence);
@@ -153,12 +154,13 @@ static bool dma_fence_chain_enable_signaling(struct dma_fence *fence) dma_fence_get(f); if (!dma_fence_add_callback(f, &head->cb, dma_fence_chain_cb)) { dma_fence_put(fence); - return true; + return; } dma_fence_put(f); } dma_fence_put(&head->base); - return false; + /* Ok, we are done. No more unsignaled fences left */ + dma_fence_signal_locked(&head->base); }
static void dma_fence_chain_signaled(struct dma_fence *fence) diff --git a/drivers/dma-buf/dma-fence.c b/drivers/dma-buf/dma-fence.c index 0ec81a568bbd..15b425984c36 100644 --- a/drivers/dma-buf/dma-fence.c +++ b/drivers/dma-buf/dma-fence.c @@ -626,7 +626,7 @@ void dma_fence_free(struct dma_fence *fence) } EXPORT_SYMBOL(dma_fence_free);
-static bool __dma_fence_enable_signaling(struct dma_fence *fence) +static void __dma_fence_enable_signaling(struct dma_fence *fence) { const struct dma_fence_ops *ops; bool was_set; @@ -637,22 +637,15 @@ static bool __dma_fence_enable_signaling(struct dma_fence *fence) &fence->flags);
if (dma_fence_test_signaled_flag(fence)) - return false; + return;
rcu_read_lock(); ops = rcu_dereference(fence->ops); if (!was_set && ops && ops->enable_signaling) { trace_dma_fence_enable_signal(fence); - - if (!ops->enable_signaling(fence)) { - rcu_read_unlock(); - dma_fence_signal_locked(fence); - return false; - } + ops->enable_signaling(fence); } rcu_read_unlock(); - - return true; }
/** @@ -710,7 +703,8 @@ int dma_fence_add_callback(struct dma_fence *fence, struct dma_fence_cb *cb, }
dma_fence_lock_irqsave(fence, flags); - if (__dma_fence_enable_signaling(fence)) { + __dma_fence_enable_signaling(fence); + if (!dma_fence_test_signaled_flag(fence)) { cb->func = func; list_add_tail(&cb->node, &fence->cb_list); } else { diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_fence.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_fence.c index 6a364357522b..15f546c9098e 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_fence.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_fence.c @@ -118,15 +118,17 @@ static const char *amdkfd_fence_get_timeline_name(struct dma_fence *f) * * @f: dma_fence */ -static bool amdkfd_fence_enable_signaling(struct dma_fence *f) +static void amdkfd_fence_enable_signaling(struct dma_fence *f) { struct amdgpu_amdkfd_fence *fence = to_amdgpu_amdkfd_fence(f);
- if (!fence) - return false; + if (!fence) { + dma_fence_signal_locked(f); + return; + }
if (dma_fence_is_signaled(f)) - return true; + return;
/* if fence->svm_bo is NULL, means this fence is created through * init_kfd_vm() or amdgpu_amdkfd_gpuvm_restore_process_bos(). @@ -134,12 +136,12 @@ static bool amdkfd_fence_enable_signaling(struct dma_fence *f) */ if (!fence->svm_bo) { if (!kgd2kfd_schedule_evict_and_restore_process(fence->mm, fence->context_id, f)) - return true; + return; } else { if (!svm_range_schedule_evict_svm_bo(fence)) - return true; + return; } - return false; + dma_fence_signal_locked(f); }
/** diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_eviction_fence.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_eviction_fence.c index f6b7522c3c82..ac2b337e0e8f 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_eviction_fence.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_eviction_fence.c @@ -40,12 +40,11 @@ amdgpu_eviction_fence_get_timeline_name(struct dma_fence *f) return ef->timeline_name; }
-static bool amdgpu_eviction_fence_enable_signaling(struct dma_fence *f) +static void amdgpu_eviction_fence_enable_signaling(struct dma_fence *f) { struct amdgpu_eviction_fence *ev_fence = to_ev_fence(f);
schedule_work(&ev_fence->evf_mgr->suspend_work); - return true; }
static const struct dma_fence_ops amdgpu_eviction_fence_ops = { diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c index 1192b9800ff2..f7ddc3e50d34 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c @@ -833,12 +833,10 @@ static const char *amdgpu_fence_get_timeline_name(struct dma_fence *f) * to fence_queue that checks if this fence is signaled, and if so it * signals the fence and removes itself. */ -static bool amdgpu_fence_enable_signaling(struct dma_fence *f) +static void amdgpu_fence_enable_signaling(struct dma_fence *f) { if (!timer_pending(&to_amdgpu_fence(f)->ring->fence_drv.fallback_timer)) amdgpu_fence_schedule_fallback(to_amdgpu_fence(f)->ring); - - return true; }
/** diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c index c39a7f4b6dc7..d9ffcb0e40e3 100644 --- a/drivers/gpu/drm/i915/i915_request.c +++ b/drivers/gpu/drm/i915/i915_request.c @@ -93,9 +93,10 @@ static void i915_fence_signaled(struct dma_fence *fence) dma_fence_signal(fence); }
-static bool i915_fence_enable_signaling(struct dma_fence *fence) +static void i915_fence_enable_signaling(struct dma_fence *fence) { - return i915_request_enable_breadcrumb(to_request(fence)); + if (!i915_request_enable_breadcrumb(to_request(fence))) + dma_fence_signal_locked(fence); }
static signed long i915_fence_wait(struct dma_fence *fence, diff --git a/drivers/gpu/drm/nouveau/nouveau_fence.c b/drivers/gpu/drm/nouveau/nouveau_fence.c index 64df1d7de460..7250f58ee443 100644 --- a/drivers/gpu/drm/nouveau/nouveau_fence.c +++ b/drivers/gpu/drm/nouveau/nouveau_fence.c @@ -472,7 +472,7 @@ static void nouveau_fence_is_signaled(struct dma_fence *f) dma_fence_signal(f); }
-static bool nouveau_fence_no_signaling(struct dma_fence *f) +static void nouveau_fence_no_signaling(struct dma_fence *f) { struct nouveau_fence *fence = to_nouveau_fence(f);
@@ -492,10 +492,8 @@ static bool nouveau_fence_no_signaling(struct dma_fence *f) list_del(&fence->head);
dma_fence_put(&fence->base); - return false; + dma_fence_signal_locked(f); } - - return true; }
static void nouveau_fence_release(struct dma_fence *f) @@ -516,22 +514,19 @@ static const struct dma_fence_ops nouveau_fence_ops_legacy = { .release = nouveau_fence_release };
-static bool nouveau_fence_enable_signaling(struct dma_fence *f) +static void nouveau_fence_enable_signaling(struct dma_fence *f) { struct nouveau_fence *fence = to_nouveau_fence(f); struct nouveau_fence_chan *fctx = nouveau_fctx(fence); - bool ret;
if (!fctx->notify_ref++) nvif_event_allow(&fctx->event);
- ret = nouveau_fence_no_signaling(f); - if (ret) + nouveau_fence_no_signaling(f); + if (!dma_fence_test_signaled_flag(f)) set_bit(DMA_FENCE_FLAG_USER_BITS, &fence->base.flags); else if (!--fctx->notify_ref) nvif_event_block(&fctx->event); - - return ret; }
static const struct dma_fence_ops nouveau_fence_ops_uevent = { diff --git a/drivers/gpu/drm/radeon/radeon_fence.c b/drivers/gpu/drm/radeon/radeon_fence.c index 45f01ebe5a78..5a543d8ea0d9 100644 --- a/drivers/gpu/drm/radeon/radeon_fence.c +++ b/drivers/gpu/drm/radeon/radeon_fence.c @@ -369,13 +369,15 @@ static void radeon_fence_is_signaled(struct dma_fence *f) * to fence_queue that checks if this fence is signaled, and if so it * signals the fence and removes itself. */ -static bool radeon_fence_enable_signaling(struct dma_fence *f) +static void radeon_fence_enable_signaling(struct dma_fence *f) { struct radeon_fence *fence = to_radeon_fence(f); struct radeon_device *rdev = fence->rdev;
- if (atomic64_read(&rdev->fence_drv[fence->ring].last_seq) >= fence->seq) - return false; + if (atomic64_read(&rdev->fence_drv[fence->ring].last_seq) >= fence->seq) { + dma_fence_signal_locked(f); + return; + }
if (down_read_trylock(&rdev->exclusive_lock)) { radeon_irq_kms_sw_irq_get(rdev, fence->ring); @@ -387,7 +389,8 @@ static bool radeon_fence_enable_signaling(struct dma_fence *f) if (atomic64_read(&rdev->fence_drv[fence->ring].last_seq) >= fence->seq) { radeon_irq_kms_sw_irq_put(rdev, fence->ring); up_read(&rdev->exclusive_lock); - return false; + dma_fence_signal_locked(f); + return; }
up_read(&rdev->exclusive_lock); @@ -403,7 +406,6 @@ static bool radeon_fence_enable_signaling(struct dma_fence *f) fence->fence_wake.func = radeon_fence_check_signaled; __add_wait_queue(&rdev->fence_queue, &fence->fence_wake); dma_fence_get(f); - return true; }
/** diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_fence.c b/drivers/gpu/drm/vmwgfx/vmwgfx_fence.c index 4ef84ff9b638..cb92232ca4ee 100644 --- a/drivers/gpu/drm/vmwgfx/vmwgfx_fence.c +++ b/drivers/gpu/drm/vmwgfx/vmwgfx_fence.c @@ -95,7 +95,7 @@ static const char *vmw_fence_get_timeline_name(struct dma_fence *f) * enabled. If interrupts were already enabled we just increment the number of * seqno waiters. */ -static bool vmw_fence_enable_signaling(struct dma_fence *f) +static void vmw_fence_enable_signaling(struct dma_fence *f) { u32 seqno; struct vmw_fence_obj *fence = @@ -110,13 +110,13 @@ static bool vmw_fence_enable_signaling(struct dma_fence *f) vmw_seqno_waiter_remove(dev_priv); fence->waiter_added = false; } - return false; + dma_fence_signal_locked(f); + return; } else if (!fence->waiter_added) { fence->waiter_added = true; if (vmw_seqno_waiter_add(dev_priv)) goto check_for_race; } - return true; }
static u32 __vmw_fences_update(struct vmw_fence_manager *fman); diff --git a/drivers/gpu/drm/xe/xe_hw_fence.c b/drivers/gpu/drm/xe/xe_hw_fence.c index bda2fde0b216..44563dfd75ab 100644 --- a/drivers/gpu/drm/xe/xe_hw_fence.c +++ b/drivers/gpu/drm/xe/xe_hw_fence.c @@ -154,7 +154,7 @@ static const char *xe_hw_fence_get_timeline_name(struct dma_fence *dma_fence) return fence->name; }
-static bool xe_hw_fence_enable_signaling(struct dma_fence *dma_fence)
+static void xe_hw_fence_enable_signaling(struct dma_fence *dma_fence)
 {
 	struct xe_hw_fence *fence = to_xe_hw_fence(dma_fence);
 	struct xe_hw_fence_irq *irq = xe_hw_fence_irq(fence);
@@ -166,8 +166,6 @@ static bool xe_hw_fence_enable_signaling(struct dma_fence *dma_fence)
 	xe_hw_fence_signaled(dma_fence);
 	if (dma_fence_test_signaled_flag(dma_fence))
 		xe_hw_fence_irq_run(irq);
-
-	return true;
 }

 static void xe_hw_fence_release(struct dma_fence *dma_fence)
diff --git a/drivers/gpu/drm/xe/xe_preempt_fence.c b/drivers/gpu/drm/xe/xe_preempt_fence.c
index d6427b473ddd..c6e5472ec7ac 100644
--- a/drivers/gpu/drm/xe/xe_preempt_fence.c
+++ b/drivers/gpu/drm/xe/xe_preempt_fence.c
@@ -67,7 +67,7 @@ preempt_fence_get_timeline_name(struct dma_fence *fence)
 	return "preempt";
 }

-static bool preempt_fence_enable_signaling(struct dma_fence *fence)
+static void preempt_fence_enable_signaling(struct dma_fence *fence)
 {
 	struct xe_preempt_fence *pfence =
 		container_of(fence, typeof(*pfence), base);
@@ -75,7 +75,6 @@ static bool preempt_fence_enable_signaling(struct dma_fence *fence)

 	pfence->error = q->ops->suspend(q);
 	queue_work(q->vm->xe->preempt_fence_wq, &pfence->preempt_work);
-	return true;
 }

 static const struct dma_fence_ops preempt_fence_ops = {
diff --git a/drivers/gpu/host1x/fence.c b/drivers/gpu/host1x/fence.c
index b9a7d0bf91f8..4a74df718540 100644
--- a/drivers/gpu/host1x/fence.c
+++ b/drivers/gpu/host1x/fence.c
@@ -30,12 +30,14 @@ static struct host1x_syncpt_fence *to_host1x_fence(struct dma_fence *f)
 	return container_of(f, struct host1x_syncpt_fence, base);
 }

-static bool host1x_syncpt_fence_enable_signaling(struct dma_fence *f)
+static void host1x_syncpt_fence_enable_signaling(struct dma_fence *f)
 {
 	struct host1x_syncpt_fence *sf = to_host1x_fence(f);

-	if (host1x_syncpt_is_expired(sf->sp, sf->threshold))
-		return false;
+	if (host1x_syncpt_is_expired(sf->sp, sf->threshold)) {
+		dma_fence_signal_locked(f);
+		return;
+	}

 	/* Reference for interrupt path. */
 	dma_fence_get(f);
@@ -62,8 +64,6 @@ static bool host1x_syncpt_fence_enable_signaling(struct dma_fence *f)
 	 * so we need to initialize all state used by signalling
 	 * before it.
 	 */
-
-	return true;
 }

 static const struct dma_fence_ops host1x_syncpt_fence_ops = {
diff --git a/include/linux/dma-fence.h b/include/linux/dma-fence.h
index e93ea4ac0636..c8e4d5a61d72 100644
--- a/include/linux/dma-fence.h
+++ b/include/linux/dma-fence.h
@@ -174,12 +174,12 @@ struct dma_fence_ops {
 	 * This is called with irq's disabled, so only spinlocks which disable
 	 * IRQ's can be used in the code outside of this callback.
 	 *
-	 * A return value of false indicates the fence already passed,
-	 * or some failure occurred that made it impossible to enable
-	 * signaling. True indicates successful enabling.
+	 * If the fence has already passed or if some failure occurred that
+	 * makes it impossible to enable signaling, the implementation must
+	 * call dma_fence_signal_locked() before returning.
 	 *
-	 * &dma_fence.error may be set in enable_signaling, but only when false
-	 * is returned.
+	 * &dma_fence.error may be set in enable_signaling before calling
+	 * dma_fence_signal_locked().
 	 *
 	 * Since many implementations can call dma_fence_signal() even when before
 	 * @enable_signaling has been called there's a race window, where the
@@ -192,7 +192,7 @@ struct dma_fence_ops {
 	 * This callback is optional. If this callback is not present, then the
 	 * driver must always have signaling enabled.
 	 */
-	bool (*enable_signaling)(struct dma_fence *fence);
+	void (*enable_signaling)(struct dma_fence *fence);

 	/**
 	 * @check_signaled:
Make the enable_signaling callback responsible for acquiring the fence lock when needed.
This gives backends control over their locking strategy and allows them to nest locks in their desired order.
Calling the callback with the fence lock held caused quite some trouble in the past and is the reason for multiple workarounds.
As a first cleanup step, this patch also removes the lockdep annotation workaround from dma_fence_chain and dma_fence_array, since it is no longer necessary.
Assisted-by: Claude Sonet 4
Signed-off-by: Christian König <christian.koenig@amd.com>
---
 drivers/dma-buf/dma-fence-array.c             | 16 +---------
 drivers/dma-buf/dma-fence-chain.c             | 19 ++---------
 drivers/dma-buf/dma-fence.c                   | 32 +++++++------------
 .../gpu/drm/amd/amdgpu/amdgpu_amdkfd_fence.c  | 18 +++++------
 drivers/gpu/drm/i915/i915_request.c           |  4 +++
 drivers/gpu/drm/nouveau/nouveau_fence.c       | 16 ++++++++--
 drivers/gpu/drm/radeon/radeon_fence.c         |  6 ++++
 drivers/gpu/drm/vmwgfx/vmwgfx_fence.c         |  6 +++-
 drivers/gpu/drm/xe/xe_hw_fence.c              |  3 ++
 drivers/gpu/drm/xe/xe_preempt_fence.c         |  3 ++
 drivers/gpu/host1x/fence.c                    |  6 ++++
 include/linux/dma-fence.h                     | 10 ++++--
 12 files changed, 71 insertions(+), 68 deletions(-)
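To illustrate the lock nesting this change is meant to allow, here is a minimal userspace sketch. It is not kernel code: every mock_* type and function is a hypothetical stand-in invented for this note, and the single-threaded "lock" only asserts correct acquire/release pairing. It shows an enable_signaling-style callback that is entered without the fence lock held, takes a backend lock first, nests the fence lock inside it, and signals under the fence lock when the fence has already passed.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Userspace mock, NOT kernel code. All mock_* names are hypothetical
 * stand-ins invented for this illustration.
 */
struct mock_lock {
	bool held;
};

static void mock_lock_acquire(struct mock_lock *l)
{
	assert(!l->held);	/* catches recursive locking in this mock */
	l->held = true;
}

static void mock_lock_release(struct mock_lock *l)
{
	assert(l->held);
	l->held = false;
}

struct mock_backend {
	struct mock_lock list_lock;	/* ordered outside the fence lock */
	unsigned int hw_seqno;		/* last seqno completed by the "HW" */
	unsigned int pending;		/* fences waiting for an interrupt */
};

struct mock_fence {
	struct mock_lock lock;		/* stands in for the dma_fence lock */
	struct mock_backend *backend;
	unsigned int seqno;
	bool signaled;
};

/* Like dma_fence_signal_locked(): the caller must hold the fence lock. */
static void mock_fence_signal_locked(struct mock_fence *f)
{
	assert(f->lock.held);
	f->signaled = true;
}

/*
 * Entered WITHOUT the fence lock held, so the backend can take its
 * outer lock first and nest the fence lock inside it.
 */
static void mock_enable_signaling(struct mock_fence *f)
{
	struct mock_backend *b = f->backend;

	mock_lock_acquire(&b->list_lock);
	mock_lock_acquire(&f->lock);
	if (b->hw_seqno >= f->seqno)
		mock_fence_signal_locked(f);	/* fence already passed */
	else
		b->pending++;			/* wait for the interrupt */
	mock_lock_release(&f->lock);
	mock_lock_release(&b->list_lock);
}
```

With the old calling convention the fence lock would already be held on entry, so the backend lock in this sketch could only be taken inside it, never outside.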
diff --git a/drivers/dma-buf/dma-fence-array.c b/drivers/dma-buf/dma-fence-array.c
index 541c9c169624..fcda2dcc6010 100644
--- a/drivers/dma-buf/dma-fence-array.c
+++ b/drivers/dma-buf/dma-fence-array.c
@@ -92,7 +92,7 @@ static void dma_fence_array_enable_signaling(struct dma_fence *fence)
 			dma_fence_put(&array->base);
 			if (atomic_dec_and_test(&array->num_pending)) {
 				dma_fence_array_clear_pending_error(array);
-				dma_fence_signal_locked(fence);
+				dma_fence_signal(fence);
 				return;
 			}
 		}
@@ -197,8 +197,6 @@ void dma_fence_array_init(struct dma_fence_array *array,
 			  int num_fences, struct dma_fence **fences,
 			  u64 context, unsigned seqno)
 {
-	static struct lock_class_key dma_fence_array_lock_key;
-
 	WARN_ON(!num_fences || !fences);

 	array->num_fences = num_fences;
@@ -207,18 +205,6 @@ void dma_fence_array_init(struct dma_fence_array *array,
 			       seqno);
 	init_irq_work(&array->work, irq_dma_fence_array_work);

-	/*
-	 * dma_fence_array_enable_signaling() is invoked while holding
-	 * array->base.inline_lock and may call dma_fence_add_callback()
-	 * on the underlying fences, which takes their inline_lock.
-	 *
-	 * Since both locks share the same lockdep class, this legitimate
-	 * nesting confuses lockdep and triggers a recursive locking
-	 * warning. Assign a separate lockdep class to the array lock
-	 * to model this hierarchy correctly.
-	 */
-	lockdep_set_class(&array->base.inline_lock, &dma_fence_array_lock_key);
-
 	atomic_set(&array->num_pending, num_fences);
 	array->fences = fences;

diff --git a/drivers/dma-buf/dma-fence-chain.c b/drivers/dma-buf/dma-fence-chain.c
index 6617f4150c73..943ec919138d 100644
--- a/drivers/dma-buf/dma-fence-chain.c
+++ b/drivers/dma-buf/dma-fence-chain.c
@@ -122,14 +122,11 @@ static const char *dma_fence_chain_get_timeline_name(struct dma_fence *fence)
 static void dma_fence_chain_irq_work(struct irq_work *work)
 {
 	struct dma_fence_chain *chain;
-	unsigned long flags;

 	chain = container_of(work, typeof(*chain), work);

 	/* Try to rearm the callback */
-	dma_fence_lock_irqsave(&chain->base, flags);
 	dma_fence_chain_enable_signaling(&chain->base);
-	dma_fence_unlock_irqrestore(&chain->base, flags);
 	dma_fence_put(&chain->base);
 }

@@ -159,8 +156,9 @@ static void dma_fence_chain_enable_signaling(struct dma_fence *fence)
 		dma_fence_put(f);
 	}
 	dma_fence_put(&head->base);
+
 	/* Ok, we are done. No more unsignaled fences left */
-	dma_fence_signal_locked(&head->base);
+	dma_fence_signal(&head->base);
 }

 static void dma_fence_chain_signaled(struct dma_fence *fence)
@@ -246,7 +244,6 @@ void dma_fence_chain_init(struct dma_fence_chain *chain,
 			  struct dma_fence *fence,
 			  uint64_t seqno)
 {
-	static struct lock_class_key dma_fence_chain_lock_key;
 	struct dma_fence_chain *prev_chain = to_dma_fence_chain(prev);
 	uint64_t context;

@@ -268,18 +265,6 @@ void dma_fence_chain_init(struct dma_fence_chain *chain,
 	dma_fence_init64(&chain->base, &dma_fence_chain_ops, NULL,
 			 context, seqno);

-	/*
-	 * dma_fence_chain_enable_signaling() is invoked while holding
-	 * chain->base.inline_lock and may call dma_fence_add_callback()
-	 * on the underlying fences, which takes their inline_lock.
-	 *
-	 * Since both locks share the same lockdep class, this legitimate
-	 * nesting confuses lockdep and triggers a recursive locking
-	 * warning. Assign a separate lockdep class to the chain lock
-	 * to model this hierarchy correctly.
-	 */
-	lockdep_set_class(&chain->base.inline_lock, &dma_fence_chain_lock_key);
-
 	/*
 	 * Chaining dma_fence_chain container together is only allowed through
 	 * the prev fence and not through the contained fence.
diff --git a/drivers/dma-buf/dma-fence.c b/drivers/dma-buf/dma-fence.c
index 15b425984c36..f201dff75247 100644
--- a/drivers/dma-buf/dma-fence.c
+++ b/drivers/dma-buf/dma-fence.c
@@ -626,13 +626,19 @@ void dma_fence_free(struct dma_fence *fence)
 }
 EXPORT_SYMBOL(dma_fence_free);

-static void __dma_fence_enable_signaling(struct dma_fence *fence)
+/**
+ * dma_fence_enable_signaling - enable signaling on fence
+ * @fence: the fence to enable
+ *
+ * This will request for sw signaling to be enabled, to make the fence
+ * complete as soon as possible. This calls &dma_fence_ops.enable_signaling
+ * internally.
+ */
+void dma_fence_enable_signaling(struct dma_fence *fence)
 {
 	const struct dma_fence_ops *ops;
 	bool was_set;

-	dma_fence_assert_held(fence);
-
 	was_set = test_and_set_bit(DMA_FENCE_FLAG_ENABLE_SIGNAL_BIT,
 				   &fence->flags);

@@ -647,23 +653,6 @@ static void __dma_fence_enable_signaling(struct dma_fence *fence)
 	}
 	rcu_read_unlock();
 }
-
-/**
- * dma_fence_enable_signaling - enable signaling on fence
- * @fence: the fence to enable
- *
- * This will request for sw signaling to be enabled, to make the fence
- * complete as soon as possible. This calls &dma_fence_ops.enable_signaling
- * internally.
- */
-void dma_fence_enable_signaling(struct dma_fence *fence)
-{
-	unsigned long flags;
-
-	dma_fence_lock_irqsave(fence, flags);
-	__dma_fence_enable_signaling(fence);
-	dma_fence_unlock_irqrestore(fence, flags);
-}
 EXPORT_SYMBOL(dma_fence_enable_signaling);

 /**
@@ -702,8 +691,9 @@ int dma_fence_add_callback(struct dma_fence *fence, struct dma_fence_cb *cb,
 		return -ENOENT;
 	}

+	dma_fence_enable_signaling(fence);
+
 	dma_fence_lock_irqsave(fence, flags);
-	__dma_fence_enable_signaling(fence);
 	if (!dma_fence_test_signaled_flag(fence)) {
 		cb->func = func;
 		list_add_tail(&cb->node, &fence->cb_list);
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_fence.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_fence.c
index 15f546c9098e..368c2083b4bd 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_fence.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_fence.c
@@ -121,27 +121,27 @@ static const char *amdkfd_fence_get_timeline_name(struct dma_fence *f)
 static void amdkfd_fence_enable_signaling(struct dma_fence *f)
 {
 	struct amdgpu_amdkfd_fence *fence = to_amdgpu_amdkfd_fence(f);
+	unsigned long flags;

-	if (!fence) {
-		dma_fence_signal_locked(f);
-		return;
-	}
-
-	if (dma_fence_is_signaled(f))
-		return;
+	dma_fence_lock_irqsave(f, flags);

 	/* if fence->svm_bo is NULL, means this fence is created through
 	 * init_kfd_vm() or amdgpu_amdkfd_gpuvm_restore_process_bos().
 	 * Therefore, this fence is amdgpu_amdkfd_fence->eviction_fence.
 	 */
 	if (!fence->svm_bo) {
-		if (!kgd2kfd_schedule_evict_and_restore_process(fence->mm, fence->context_id, f))
+		if (!kgd2kfd_schedule_evict_and_restore_process(fence->mm, fence->context_id, f)) {
+			dma_fence_unlock_irqrestore(f, flags);
 			return;
+		}
 	} else {
-		if (!svm_range_schedule_evict_svm_bo(fence))
+		if (!svm_range_schedule_evict_svm_bo(fence)) {
+			dma_fence_unlock_irqrestore(f, flags);
 			return;
+		}
 	}
 	dma_fence_signal_locked(f);
+	dma_fence_unlock_irqrestore(f, flags);
 }

 /**
diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c
index d9ffcb0e40e3..9218a4d6ef11 100644
--- a/drivers/gpu/drm/i915/i915_request.c
+++ b/drivers/gpu/drm/i915/i915_request.c
@@ -95,8 +95,12 @@ static void i915_fence_signaled(struct dma_fence *fence)

 static void i915_fence_enable_signaling(struct dma_fence *fence)
 {
+	unsigned long flags;
+
+	dma_fence_lock_irqsave(fence, flags);
 	if (!i915_request_enable_breadcrumb(to_request(fence)))
 		dma_fence_signal_locked(fence);
+	dma_fence_unlock_irqrestore(fence, flags);
 }

 static signed long i915_fence_wait(struct dma_fence *fence,
diff --git a/drivers/gpu/drm/nouveau/nouveau_fence.c b/drivers/gpu/drm/nouveau/nouveau_fence.c
index 7250f58ee443..f494281d0ed2 100644
--- a/drivers/gpu/drm/nouveau/nouveau_fence.c
+++ b/drivers/gpu/drm/nouveau/nouveau_fence.c
@@ -472,7 +472,7 @@ static void nouveau_fence_is_signaled(struct dma_fence *f)
 		dma_fence_signal(f);
 }

-static void nouveau_fence_no_signaling(struct dma_fence *f)
+static void __nouveau_fence_no_signaling(struct dma_fence *f)
 {
 	struct nouveau_fence *fence = to_nouveau_fence(f);

@@ -496,6 +496,15 @@ static void nouveau_fence_no_signaling(struct dma_fence *f)
 	}
 }

+static void nouveau_fence_no_signaling(struct dma_fence *f)
+{
+	unsigned long flags;
+
+	dma_fence_lock_irqsave(f, flags);
+	__nouveau_fence_no_signaling(f);
+	dma_fence_unlock_irqrestore(f, flags);
+}
+
 static void nouveau_fence_release(struct dma_fence *f)
 {
 	struct nouveau_fence *fence = to_nouveau_fence(f);
@@ -518,15 +527,18 @@ static void nouveau_fence_enable_signaling(struct dma_fence *f)
 {
 	struct nouveau_fence *fence = to_nouveau_fence(f);
 	struct nouveau_fence_chan *fctx = nouveau_fctx(fence);
+	unsigned long flags;

 	if (!fctx->notify_ref++)
 		nvif_event_allow(&fctx->event);

-	nouveau_fence_no_signaling(f);
+	dma_fence_lock_irqsave(f, flags);
+	__nouveau_fence_no_signaling(f);
 	if (!dma_fence_test_signaled_flag(f))
 		set_bit(DMA_FENCE_FLAG_USER_BITS, &fence->base.flags);
 	else if (!--fctx->notify_ref)
 		nvif_event_block(&fctx->event);
+	dma_fence_unlock_irqrestore(f, flags);
 }

 static const struct dma_fence_ops nouveau_fence_ops_uevent = {
diff --git a/drivers/gpu/drm/radeon/radeon_fence.c b/drivers/gpu/drm/radeon/radeon_fence.c
index 5a543d8ea0d9..bf48bd2556ec 100644
--- a/drivers/gpu/drm/radeon/radeon_fence.c
+++ b/drivers/gpu/drm/radeon/radeon_fence.c
@@ -373,9 +373,13 @@ static void radeon_fence_enable_signaling(struct dma_fence *f)
 {
 	struct radeon_fence *fence = to_radeon_fence(f);
 	struct radeon_device *rdev = fence->rdev;
+	unsigned long flags;
+
+	dma_fence_lock_irqsave(f, flags);

 	if (atomic64_read(&rdev->fence_drv[fence->ring].last_seq) >= fence->seq) {
 		dma_fence_signal_locked(f);
+		dma_fence_unlock_irqrestore(f, flags);
 		return;
 	}

@@ -390,6 +394,7 @@ static void radeon_fence_enable_signaling(struct dma_fence *f)
 			radeon_irq_kms_sw_irq_put(rdev, fence->ring);
 			up_read(&rdev->exclusive_lock);
 			dma_fence_signal_locked(f);
+			dma_fence_unlock_irqrestore(f, flags);
 			return;
 		}

@@ -406,6 +411,7 @@ static void radeon_fence_enable_signaling(struct dma_fence *f)
 	fence->fence_wake.func = radeon_fence_check_signaled;
 	__add_wait_queue(&rdev->fence_queue, &fence->fence_wake);
 	dma_fence_get(f);
+	dma_fence_unlock_irqrestore(f, flags);
 }

 /**
diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_fence.c b/drivers/gpu/drm/vmwgfx/vmwgfx_fence.c
index cb92232ca4ee..c88999098bb5 100644
--- a/drivers/gpu/drm/vmwgfx/vmwgfx_fence.c
+++ b/drivers/gpu/drm/vmwgfx/vmwgfx_fence.c
@@ -100,9 +100,11 @@ static void vmw_fence_enable_signaling(struct dma_fence *f)
 	u32 seqno;
 	struct vmw_fence_obj *fence =
 		container_of(f, struct vmw_fence_obj, base);
-
 	struct vmw_fence_manager *fman = fman_from_fence(fence);
 	struct vmw_private *dev_priv = fman->dev_priv;
+	unsigned long flags;
+
+	dma_fence_lock_irqsave(f, flags);
 check_for_race:
 	seqno = vmw_fence_read(dev_priv);
 	if (seqno - fence->base.seqno < VMW_FENCE_WRAP) {
@@ -111,12 +113,14 @@ static void vmw_fence_enable_signaling(struct dma_fence *f)
 			fence->waiter_added = false;
 		}
 		dma_fence_signal_locked(f);
+		dma_fence_unlock_irqrestore(f, flags);
 		return;
 	} else if (!fence->waiter_added) {
 		fence->waiter_added = true;
 		if (vmw_seqno_waiter_add(dev_priv))
 			goto check_for_race;
 	}
+	dma_fence_unlock_irqrestore(f, flags);
 }

 static u32 __vmw_fences_update(struct vmw_fence_manager *fman);
diff --git a/drivers/gpu/drm/xe/xe_hw_fence.c b/drivers/gpu/drm/xe/xe_hw_fence.c
index 44563dfd75ab..5356553001cb 100644
--- a/drivers/gpu/drm/xe/xe_hw_fence.c
+++ b/drivers/gpu/drm/xe/xe_hw_fence.c
@@ -158,7 +158,9 @@ static void xe_hw_fence_enable_signaling(struct dma_fence *dma_fence)
 {
 	struct xe_hw_fence *fence = to_xe_hw_fence(dma_fence);
 	struct xe_hw_fence_irq *irq = xe_hw_fence_irq(fence);
+	unsigned long flags;

+	dma_fence_lock_irqsave(dma_fence, flags);
 	dma_fence_get(dma_fence);
 	list_add_tail(&fence->irq_link, &irq->pending);

@@ -166,6 +168,7 @@ static void xe_hw_fence_enable_signaling(struct dma_fence *dma_fence)
 	xe_hw_fence_signaled(dma_fence);
 	if (dma_fence_test_signaled_flag(dma_fence))
 		xe_hw_fence_irq_run(irq);
+	dma_fence_unlock_irqrestore(dma_fence, flags);
 }

 static void xe_hw_fence_release(struct dma_fence *dma_fence)
diff --git a/drivers/gpu/drm/xe/xe_preempt_fence.c b/drivers/gpu/drm/xe/xe_preempt_fence.c
index c6e5472ec7ac..ea0b9ed9d8cd 100644
--- a/drivers/gpu/drm/xe/xe_preempt_fence.c
+++ b/drivers/gpu/drm/xe/xe_preempt_fence.c
@@ -72,9 +72,12 @@ static void preempt_fence_enable_signaling(struct dma_fence *fence)
 	struct xe_preempt_fence *pfence =
 		container_of(fence, typeof(*pfence), base);
 	struct xe_exec_queue *q = pfence->q;
+	unsigned long flags;

+	dma_fence_lock_irqsave(fence, flags);
 	pfence->error = q->ops->suspend(q);
 	queue_work(q->vm->xe->preempt_fence_wq, &pfence->preempt_work);
+	dma_fence_unlock_irqrestore(fence, flags);
 }

 static const struct dma_fence_ops preempt_fence_ops = {
diff --git a/drivers/gpu/host1x/fence.c b/drivers/gpu/host1x/fence.c
index 4a74df718540..75b101aae756 100644
--- a/drivers/gpu/host1x/fence.c
+++ b/drivers/gpu/host1x/fence.c
@@ -33,9 +33,13 @@ static struct host1x_syncpt_fence *to_host1x_fence(struct dma_fence *f)
 static void host1x_syncpt_fence_enable_signaling(struct dma_fence *f)
 {
 	struct host1x_syncpt_fence *sf = to_host1x_fence(f);
+	unsigned long flags;
+
+	dma_fence_lock_irqsave(f, flags);

 	if (host1x_syncpt_is_expired(sf->sp, sf->threshold)) {
 		dma_fence_signal_locked(f);
+		dma_fence_unlock_irqrestore(f, flags);
 		return;
 	}

@@ -64,6 +68,8 @@ static void host1x_syncpt_fence_enable_signaling(struct dma_fence *f)
 	 * so we need to initialize all state used by signalling
 	 * before it.
 	 */
+
+	dma_fence_unlock_irqrestore(f, flags);
 }

 static const struct dma_fence_ops host1x_syncpt_fence_ops = {
diff --git a/include/linux/dma-fence.h b/include/linux/dma-fence.h
index c8e4d5a61d72..e6b17aa1b769 100644
--- a/include/linux/dma-fence.h
+++ b/include/linux/dma-fence.h
@@ -171,12 +171,16 @@ struct dma_fence_ops {
 	 * implementation know that there is another driver waiting on the
 	 * signal (ie. hw->sw case).
 	 *
-	 * This is called with irq's disabled, so only spinlocks which disable
-	 * IRQ's can be used in the code outside of this callback.
+	 * The callback is responsible for acquiring the fence lock if needed
+	 * using dma_fence_lock_irqsave(). This gives drivers control over their
+	 * locking strategy and allows them to minimize the critical section if
+	 * they have complex logic.
 	 *
 	 * If the fence has already passed or if some failure occurred that
 	 * makes it impossible to enable signaling, the implementation must
-	 * call dma_fence_signal_locked() before returning.
+	 * call dma_fence_signal_locked() before returning. Note that
+	 * dma_fence_signal_locked() requires the fence lock to be held, so
+	 * implementations calling it MUST acquire the lock first.
 	 *
 	 * &dma_fence.error may be set in enable_signaling before calling
 	 * dma_fence_signal_locked().