On 11/14/25 11:50, Tvrtko Ursulin wrote:
@@ -569,12 +577,12 @@ void dma_fence_release(struct kref *kref)
 		spin_unlock_irqrestore(fence->lock, flags);
 	}

-	rcu_read_unlock();
-	if (fence->ops->release)
-		fence->ops->release(fence);
+	ops = rcu_dereference(fence->ops);
+	if (ops->release)
+		ops->release(fence);
 	else
 		dma_fence_free(fence);
+	rcu_read_unlock();
The risk is that a spinlock taken in the release callback will trigger a warning on PREEMPT_RT. But at least the current code base does not have anything like that AFAICS, so I guess it is okay.
I don't think that this is a problem. When PREEMPT_RT is enabled both RCU and spinlocks become preemptible.
So as far as I know it is perfectly valid to grab a spinlock inside an RCU read-side critical section.
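For illustration, a minimal sketch of the nesting in question (hypothetical helper, not from the patch):

static void example_lock_under_rcu(struct dma_fence *fence)
{
	const struct dma_fence_ops *ops;
	unsigned long flags;

	rcu_read_lock();
	ops = rcu_dereference(fence->ops);

	/* spinlock_t taken inside an RCU read-side critical section:
	 * on PREEMPT_RT both are preemptible, so this does not sleep
	 * in atomic context and triggers no warning.
	 */
	spin_lock_irqsave(fence->lock, flags);
	if (ops->signaled && ops->signaled(fence))
		dma_fence_signal_locked(fence);
	spin_unlock_irqrestore(fence->lock, flags);

	rcu_read_unlock();
}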
diff --git a/include/linux/dma-fence.h b/include/linux/dma-fence.h
index 64639e104110..77f07735f556 100644
--- a/include/linux/dma-fence.h
+++ b/include/linux/dma-fence.h
@@ -66,7 +66,7 @@ struct seq_file;
  */
 struct dma_fence {
 	spinlock_t *lock;
-	const struct dma_fence_ops *ops;
+	const struct dma_fence_ops __rcu *ops;
 	/*
 	 * We clear the callback list on kref_put so that by the time we
 	 * release the fence it is unused. No one should be adding to the
@@ -218,6 +218,10 @@ struct dma_fence_ops {
 	 * timed out. Can also return other error values on custom implementations,
 	 * which should be treated as if the fence is signaled. For example a hardware
 	 * lockup could be reported like that.
+	 *
+	 * Implementing this callback prevents the BO from detaching after
s/BO/fence/
+	 * signaling and so it is mandatory for the module providing the
+	 * dma_fence_ops to stay loaded as long as the dma_fence exists.
 	 */
 	signed long (*wait)(struct dma_fence *fence,
 			    bool intr, signed long timeout);
@@ -229,6 +233,13 @@ struct dma_fence_ops {
 	 * Can be called from irq context. This callback is optional. If it is
 	 * NULL, then dma_fence_free() is instead called as the default
 	 * implementation.
+	 *
+	 * Implementing this callback prevents the BO from detaching after
Ditto.
Both fixed, thanks.
+	 * signaling and so it is mandatory for the module providing the
+	 * dma_fence_ops to stay loaded as long as the dma_fence exists.
+	 *
+	 * If the callback is implemented the memory backing the dma_fence
+	 * object must be freed RCU safe.
 	 */
 	void (*release)(struct dma_fence *fence);

@@ -418,13 +429,19 @@ const char __rcu *dma_fence_timeline_name(struct dma_fence *fence);
 static inline bool
 dma_fence_is_signaled_locked(struct dma_fence *fence)
 {
+	const struct dma_fence_ops *ops;
 	if (test_bit(DMA_FENCE_FLAG_SIGNALED_BIT, &fence->flags))
 		return true;

-	if (fence->ops->signaled && fence->ops->signaled(fence)) {
+	rcu_read_lock();
+	ops = rcu_dereference(fence->ops);
+	if (ops->signaled && ops->signaled(fence)) {
+		rcu_read_unlock();
 		dma_fence_signal_locked(fence);
 		return true;
 	}
+	rcu_read_unlock();

 	return false;
 }

@@ -448,13 +465,19 @@ dma_fence_is_signaled_locked(struct dma_fence *fence)
 static inline bool
 dma_fence_is_signaled(struct dma_fence *fence)
 {
+	const struct dma_fence_ops *ops;
 	if (test_bit(DMA_FENCE_FLAG_SIGNALED_BIT, &fence->flags))
 		return true;

-	if (fence->ops->signaled && fence->ops->signaled(fence)) {
+	rcu_read_lock();
+	ops = rcu_dereference(fence->ops);
+	if (ops->signaled && ops->signaled(fence)) {
+		rcu_read_unlock();
With the unlocked version two threads could race: one could make fence->lock go away just around here, before the dma_fence_signal() below takes it. It seems it is only safe to rcu_read_unlock() before signaling when using the lock embedded in the fence (later in the series). Can you think of a downside to holding the RCU read lock until after signaling? That would make it safe, I think.
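Roughly something like this, i.e. an untested sketch of holding the read lock across the signal:

static inline bool dma_fence_is_signaled(struct dma_fence *fence)
{
	const struct dma_fence_ops *ops;

	if (test_bit(DMA_FENCE_FLAG_SIGNALED_BIT, &fence->flags))
		return true;

	rcu_read_lock();
	ops = rcu_dereference(fence->ops);
	if (ops->signaled && ops->signaled(fence)) {
		/*
		 * Unlock only after dma_fence_signal() so fence->lock
		 * cannot be freed between the check above and taking
		 * the lock inside dma_fence_signal().
		 */
		dma_fence_signal(fence);
		rcu_read_unlock();
		return true;
	}
	rcu_read_unlock();

	return false;
}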
Well, it's good to talk about it, but I don't think it is necessary to protect the lock in this particular case.
See, the RCU protection is only for the fence->ops pointer; the lock can be taken way after the fence is already signaled.
That's why I came up with the patch to move the lock into the fence in the first place.
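I.e. the direction is roughly this (hand-wavy sketch, field name made up, not the actual patch):

struct dma_fence {
	/*
	 * Hypothetical layout: with the spinlock embedded, its
	 * lifetime is tied to the fence object itself instead of to
	 * whatever driver structure used to provide it.
	 */
	spinlock_t embedded_lock;
	spinlock_t *lock;	/* kept for compatibility, points at &embedded_lock */
	const struct dma_fence_ops __rcu *ops;
	/* cb_list/timestamp/rcu union, context, seqno, flags, refcount as before */
};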
Regards, Christian.
Regards,
Tvrtko
 		dma_fence_signal(fence);
 		return true;
 	}
+	rcu_read_unlock();

 	return false;
 }
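As an aside on the new ->release() documentation above, a hypothetical driver-side sketch of what "freed RCU safe" means in practice, assuming the rcu_head already embedded in struct dma_fence:

struct my_fence {
	struct dma_fence base;
	/* driver private state */
};

static void my_fence_release(struct dma_fence *fence)
{
	struct my_fence *f = container_of(fence, struct my_fence, base);

	/*
	 * Free through kfree_rcu() so readers still inside
	 * rcu_dereference(fence->ops) never see the memory vanish
	 * under them.
	 */
	kfree_rcu(f, base.rcu);
}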