On 12/11/25 15:35, Tvrtko Ursulin wrote:
Hi,
On 11/12/2025 13:16, Christian König wrote:
Implement per-fence spinlocks, allowing implementations to not provide an external spinlock to protect the fence internal state. Instead, a spinlock embedded into the fence structure itself is used in this case.
Shared spinlocks have the problem that implementations need to guarantee that the lock lives at least as long as all fences referencing it.
Using a per-fence spinlock allows completely decoupling spinlock producer and consumer lifetimes, simplifying the handling in most use cases.
v2: improve naming, coverage and function documentation
v3: fix one additional locking issue in the selftests
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
I don't think I gave r-b on this one. Not just yet at least. Maybe you have missed the comments I had in the previous two rounds? I will repeat them below.
I was already wondering why you gave comments and an r-b, but thought that the comments might just be optional.
Going to remove that and go over the comments below.
@@ -365,7 +364,7 @@ void dma_fence_signal_timestamp_locked(struct dma_fence *fence,
 	struct dma_fence_cb *cur, *tmp;
 	struct list_head cb_list;
 
-	lockdep_assert_held(fence->lock);
+	lockdep_assert_held(dma_fence_spinlock(fence));
 
 	if (unlikely(test_and_set_bit(DMA_FENCE_FLAG_SIGNALED_BIT,
 				      &fence->flags)))
@@ -412,9 +411,9 @@ void dma_fence_signal_timestamp(struct dma_fence *fence, ktime_t timestamp)
 	if (WARN_ON(!fence))
 		return;
 
-	spin_lock_irqsave(fence->lock, flags);
+	dma_fence_lock_irqsave(fence, flags);
For the locking wrappers I think it would be better to introduce them in a purely mechanical patch preceding this one. That is, just add the wrappers and nothing else.
That doesn't fully work for all cases, but I will separate it out a bit more.
 static inline uint64_t amdgpu_vm_tlb_seq(struct amdgpu_vm *vm)
 {
+	struct dma_fence *fence;
 	unsigned long flags;
-	spinlock_t *lock;
 
 	/*
 	 * Workaround to stop racing between the fence signaling and handling
-	 * the cb. The lock is static after initially setting it up, just make
-	 * sure that the dma_fence structure isn't freed up.
+	 * the cb.
 	 */
 	rcu_read_lock();
-	lock = vm->last_tlb_flush->lock;
+	fence = dma_fence_get_rcu(vm->last_tlb_flush);
Why does this belong here? If taking a reference fixes some race it needs to be a separate patch. If it doesn't then this patch shouldn't be adding it.
The code previously assumed that the lock is global and can't go away while the function is called. When we start to use an inline lock, that assumption is no longer true.
But you're right that can be a separate patch.
@@ -362,6 +368,38 @@ dma_fence_get_rcu_safe(struct dma_fence __rcu **fencep)
 	} while (1);
 }
 
+/**
+ * dma_fence_spinlock - return pointer to the spinlock protecting the fence
+ * @fence: the fence to get the lock from
+ *
+ * Return either the pointer to the embedded or the external spinlock.
+ */
+static inline spinlock_t *dma_fence_spinlock(struct dma_fence *fence)
+{
+	return test_bit(DMA_FENCE_FLAG_INLINE_LOCK_BIT, &fence->flags) ?
+		&fence->inline_lock : fence->extern_lock;
Is sprinkling of conditionals better than growing the struct? Probably yes, since branch misses are cheaper than cache misses. Unless the code grows significantly on some hot path and we get instruction cache misses instead. Who knows. But let's say in the commit message that we considered it and decided on this solution due to xyz.
Sure.
On a quick grep there is one arch where this grows the struct past a cache line anyway, but as it is PA-RISC I guess no one cares. Let's mention that in the commit message as well.
Interesting, I was aware of the problems on SPARC regarding spinlocks, but that PA-RISC also has something more complicated than an int is news to me.
Anyway I agree it doesn't really matter.
Regards, Christian.
Regards,
Tvrtko
+}
+/**
+ * dma_fence_lock_irqsave - irqsave lock the fence
+ * @fence: the fence to lock
+ * @flags: where to store the CPU flags
+ *
+ * Lock the fence, preventing it from changing to the signaled state.
+ */
+#define dma_fence_lock_irqsave(fence, flags)	\
+	spin_lock_irqsave(dma_fence_spinlock(fence), flags)
+/**
+ * dma_fence_unlock_irqrestore - unlock the fence and irqrestore
+ * @fence: the fence to unlock
+ * @flags: the CPU flags to restore
+ *
+ * Unlock the fence, allowing it to change its state to signaled again.
+ */
+#define dma_fence_unlock_irqrestore(fence, flags)	\
+	spin_unlock_irqrestore(dma_fence_spinlock(fence), flags)
 #ifdef CONFIG_LOCKDEP
 bool dma_fence_begin_signalling(void);
 void dma_fence_end_signalling(bool cookie);