The commit mentioned in the fixes tag below introduced a mechanism through which fence producers can fully decouple from fence consumers. This, desirable, mechanism is based on the fence's signaled-bit as the "decoupling point".
A sophisticated interaction between RCU and atomic instructions attempts to ensure that fence consumers can still interact with fence producers through the dma_fence_ops, callback pointers into the producer.
This is the desired behavior: to check for decoupling, the signaled-bit is first checked. If it's not yet signaled, RCU ensures that the ops pointer cannot yet be NULL.
Hereby, dma_fence_signal_timestamp_locked() first sets the signaled-bit, and then sets the ops pointer to NULL. Readers first load the ops pointer, and then check through the signaled-bit whether the pointer can legally be accessed.
These set and load operations could occur out of order on weakly ordered platforms. Hence, we need to enforce strict ordering all the time.
Add the appropriate memory barriers.
Cc: stable@vger.kernel.org Fixes: f4cc3ab824d6 ("dma-buf: protected fence ops by RCU v8") Signed-off-by: Philipp Stanner phasta@kernel.org --- Tested with dmabuf and drm_sched unit tests.
Memory barriers are notoriously difficult, so I would appreciate if some of the more experienced folks can check this. Notably, I am not sure whether the smp_wmb() is necessary.
The documentation for test_and_set_bit() makes the mysterious statement "This is an atomic fully-ordered operation (implied full memory barrier)", but the kcsan_mb() seems to be some sort of debugging barrier, and in any case the docu doesn't make it obvious to me whether that "full barrier" comes before or after the bit setting takes place.
Moreover, in my opinion we should order dma_fence_is_signaled(), too – but if we agree to merge Christian's new series [1] that need should disappear.
[1] https://lore.kernel.org/dri-devel/20260624122917.2483-1-christian.koenig@amd... --- drivers/dma-buf/dma-fence.c | 24 ++++++++++++++++++++++++ 1 file changed, 24 insertions(+)
diff --git a/drivers/dma-buf/dma-fence.c b/drivers/dma-buf/dma-fence.c index c7ea1e75d38a..2e80b01499de 100644 --- a/drivers/dma-buf/dma-fence.c +++ b/drivers/dma-buf/dma-fence.c @@ -363,6 +363,18 @@ void dma_fence_signal_timestamp_locked(struct dma_fence *fence, &fence->flags))) return;
+ /* + * Fully order setting of the bit above with setting of the ops pointer + * to NULL below, so that all parties can use the signaled flag to + * detect that the fence decoupled from its ops in a safe manner. + * + * The counter parts of this barrier are in dma_fence_timeline_name() + * and dma_fence_driver_name(). All other future parties that rely on + * the signaled flag for valid access to the ops pointer will need a + * memory barrier. + */ + smp_wmb(); + trace_dma_fence_signaled(fence);
/* @@ -1170,6 +1182,12 @@ const char __rcu *dma_fence_driver_name(struct dma_fence *fence)
/* RCU protection is required for safe access to returned string */ ops = rcu_dereference(fence->ops); + /* + * Fully order the dereference above with the flag check. Otherwise, + * ops could be dereferenced as a NULL pointer. The barrier's + * counterpart is in dma_fence_signal_timestamp_locked(). + */ + smp_rmb(); if (!dma_fence_test_signaled_flag(fence)) return (const char __rcu *)ops->get_driver_name(fence); else @@ -1203,6 +1221,12 @@ const char __rcu *dma_fence_timeline_name(struct dma_fence *fence)
/* RCU protection is required for safe access to returned string */ ops = rcu_dereference(fence->ops); + /* + * Fully order the dereference above with the flag check. Otherwise, + * ops could be dereferenced as a NULL pointer. The barrier's + * counterpart is in dma_fence_signal_timestamp_locked(). + */ + smp_rmb(); if (!dma_fence_test_signaled_flag(fence)) return (const char __rcu *)ops->get_driver_name(fence); else
base-commit: cdeb2ccd993ed8647adbbda2c3b103aa717fd6f7