Urgent fixes!
dma_resv_test_signaled is completely broken, dma_resv_wait_timeout kind of broken.
Maarten Lankhorst (2): dma-buf: Fix dma_resv_wait_timeout handling of timeout = 0. dma-buf: Fix dma_resv_test_signaled.
drivers/dma-buf/dma-resv.c | 20 +++++++++++--------- 1 file changed, 11 insertions(+), 9 deletions(-)
Commit ada5c48b11a3 ("dma-buf: use new iterator in dma_resv_wait_timeout") accidentally started mishandling timeout = 0, by forcing a blocking wait with timeout = 1 passed to fences. This is not intended, as timeout = 0 may be used for peeking, similar to test_signaled.
Fixes: ada5c48b11a3 ("dma-buf: use new iterator in dma_resv_wait_timeout") Cc: Christian König christian.koenig@amd.com Cc: Daniel Vetter daniel.vetter@ffwll.ch Signed-off-by: Maarten Lankhorst maarten.lankhorst@linux.intel.com --- drivers/dma-buf/dma-resv.c | 12 ++++++------ 1 file changed, 6 insertions(+), 6 deletions(-)
diff --git a/drivers/dma-buf/dma-resv.c b/drivers/dma-buf/dma-resv.c index 9eb2baa387d4..70a8082660c5 100644 --- a/drivers/dma-buf/dma-resv.c +++ b/drivers/dma-buf/dma-resv.c @@ -617,18 +617,18 @@ EXPORT_SYMBOL_GPL(dma_resv_get_fences); long dma_resv_wait_timeout(struct dma_resv *obj, bool wait_all, bool intr, unsigned long timeout) { - long ret = timeout ? timeout : 1; + long ret = timeout ?: 1; struct dma_resv_iter cursor; struct dma_fence *fence;
dma_resv_iter_begin(&cursor, obj, wait_all); dma_resv_for_each_fence_unlocked(&cursor, fence) { + ret = dma_fence_wait_timeout(fence, intr, timeout); + if (ret <= 0) + break;
- ret = dma_fence_wait_timeout(fence, intr, ret); - if (ret <= 0) { - dma_resv_iter_end(&cursor); - return ret; - } + if (timeout) + timeout = ret; } dma_resv_iter_end(&cursor);
Am 15.10.21 um 13:57 schrieb Maarten Lankhorst:
Commit ada5c48b11a3 ("dma-buf: use new iterator in dma_resv_wait_timeout") accidentally started mishandling timeout = 0, by forcing a blocking wait with timeout = 1 passed to fences. This is not intended, as timeout = 0 may be used for peeking, similar to test_signaled.
Fixes: ada5c48b11a3 ("dma-buf: use new iterator in dma_resv_wait_timeout") Cc: Christian König christian.koenig@amd.com Cc: Daniel Vetter daniel.vetter@ffwll.ch Signed-off-by: Maarten Lankhorst maarten.lankhorst@linux.intel.com
Sorry for the delay, back from sick leave just today.
Good catch, but when I read the old code correctly that was also broken before by passing in 1 to dma_fence_wait_timeout() for a timeout of 0.
drivers/dma-buf/dma-resv.c | 12 ++++++------ 1 file changed, 6 insertions(+), 6 deletions(-)
diff --git a/drivers/dma-buf/dma-resv.c b/drivers/dma-buf/dma-resv.c index 9eb2baa387d4..70a8082660c5 100644 --- a/drivers/dma-buf/dma-resv.c +++ b/drivers/dma-buf/dma-resv.c @@ -617,18 +617,18 @@ EXPORT_SYMBOL_GPL(dma_resv_get_fences); long dma_resv_wait_timeout(struct dma_resv *obj, bool wait_all, bool intr, unsigned long timeout) {
- long ret = timeout ? timeout : 1;
- long ret = timeout ?: 1;
Please don't change the coding style here.
Apart from that looks good to me.
Christian.
struct dma_resv_iter cursor; struct dma_fence *fence; dma_resv_iter_begin(&cursor, obj, wait_all); dma_resv_for_each_fence_unlocked(&cursor, fence) {
ret = dma_fence_wait_timeout(fence, intr, timeout);
if (ret <= 0)
break;
ret = dma_fence_wait_timeout(fence, intr, ret);
if (ret <= 0) {
dma_resv_iter_end(&cursor);
return ret;
}
if (timeout)
} dma_resv_iter_end(&cursor);timeout = ret;
Commit 7fa828cb9265 ("dma-buf: use new iterator in dma_resv_test_signaled") accidentally forgot to test whether the dma-buf is actually signaled, breaking pretty much everything depending on it.
Fixes: 7fa828cb9265 ("dma-buf: use new iterator in dma_resv_test_signaled") Cc: Christian König christian.koenig@amd.com Cc: Daniel Vetter daniel.vetter@ffwll.ch Signed-off-by: Maarten Lankhorst maarten.lankhorst@linux.intel.com --- drivers/dma-buf/dma-resv.c | 8 +++++--- 1 file changed, 5 insertions(+), 3 deletions(-)
diff --git a/drivers/dma-buf/dma-resv.c b/drivers/dma-buf/dma-resv.c index 70a8082660c5..37ab2875e469 100644 --- a/drivers/dma-buf/dma-resv.c +++ b/drivers/dma-buf/dma-resv.c @@ -655,14 +655,16 @@ bool dma_resv_test_signaled(struct dma_resv *obj, bool test_all) { struct dma_resv_iter cursor; struct dma_fence *fence; + bool ret = true;
dma_resv_iter_begin(&cursor, obj, test_all); dma_resv_for_each_fence_unlocked(&cursor, fence) { - dma_resv_iter_end(&cursor); - return false; + ret = dma_fence_is_signaled(fence); + if (!ret) + break; } dma_resv_iter_end(&cursor); - return true; + return ret; } EXPORT_SYMBOL_GPL(dma_resv_test_signaled);
Am 15.10.21 um 13:57 schrieb Maarten Lankhorst:
Commit 7fa828cb9265 ("dma-buf: use new iterator in dma_resv_test_signaled") accidentally forgot to test whether the dma-buf is actually signaled, breaking pretty much everything depending on it.
NAK, the dma_resv_for_each_fence_unlocked() returns only unsignaled fences. So the code is correct as it is.
Regards, Christian.
Fixes: 7fa828cb9265 ("dma-buf: use new iterator in dma_resv_test_signaled") Cc: Christian König christian.koenig@amd.com Cc: Daniel Vetter daniel.vetter@ffwll.ch Signed-off-by: Maarten Lankhorst maarten.lankhorst@linux.intel.com
drivers/dma-buf/dma-resv.c | 8 +++++--- 1 file changed, 5 insertions(+), 3 deletions(-)
diff --git a/drivers/dma-buf/dma-resv.c b/drivers/dma-buf/dma-resv.c index 70a8082660c5..37ab2875e469 100644 --- a/drivers/dma-buf/dma-resv.c +++ b/drivers/dma-buf/dma-resv.c @@ -655,14 +655,16 @@ bool dma_resv_test_signaled(struct dma_resv *obj, bool test_all) { struct dma_resv_iter cursor; struct dma_fence *fence;
- bool ret = true;
dma_resv_iter_begin(&cursor, obj, test_all); dma_resv_for_each_fence_unlocked(&cursor, fence) {
dma_resv_iter_end(&cursor);
return false;
ret = dma_fence_is_signaled(fence);
if (!ret)
} dma_resv_iter_end(&cursor);break;
- return true;
- return ret; } EXPORT_SYMBOL_GPL(dma_resv_test_signaled);
Op 15-10-2021 om 14:07 schreef Christian König:
Am 15.10.21 um 13:57 schrieb Maarten Lankhorst:
Commit 7fa828cb9265 ("dma-buf: use new iterator in dma_resv_test_signaled") accidentally forgot to test whether the dma-buf is actually signaled, breaking pretty much everything depending on it.
NAK, the dma_resv_for_each_fence_unlocked() returns only unsignaled fences. So the code is correct as it is.
That seems like it might cause some unexpected behavior when that function is called with one of the fence locks held, if it calls dma_fence_signal().
Could it be changed to only test the signaled bit, in which case this patch would still be useful?
Or at least add some lockdep annotations, that fence->lock might be taken. So any hangs would at least be easy to spot with lockdep.
~Maarten
Am 15.10.21 um 14:52 schrieb Maarten Lankhorst:
Op 15-10-2021 om 14:07 schreef Christian König:
Am 15.10.21 um 13:57 schrieb Maarten Lankhorst:
Commit 7fa828cb9265 ("dma-buf: use new iterator in dma_resv_test_signaled") accidentally forgot to test whether the dma-buf is actually signaled, breaking pretty much everything depending on it.
NAK, the dma_resv_for_each_fence_unlocked() returns only unsignaled fences. So the code is correct as it is.
That seems like it might cause some unexpected behavior when that function is called with one of the fence locks held, if it calls dma_fence_signal().
Could it be changed to only test the signaled bit, in which case this patch would still be useful?
That's exactly what I suggested as well, but Daniel was against that because of concerns around barriers.
Or at least add some lockdep annotations, that fence->lock might be taken. So any hangs would at least be easy to spot with lockdep.
That should be trivial doable.
Christian.
~Maarten
Linaro-mm-sig mailing list Linaro-mm-sig@lists.linaro.org https://lists.linaro.org/mailman/listinfo/linaro-mm-sig
On Fri, Oct 15, 2021 at 02:56:59PM +0200, Christian König wrote:
Am 15.10.21 um 14:52 schrieb Maarten Lankhorst:
Op 15-10-2021 om 14:07 schreef Christian König:
Am 15.10.21 um 13:57 schrieb Maarten Lankhorst:
Commit 7fa828cb9265 ("dma-buf: use new iterator in dma_resv_test_signaled") accidentally forgot to test whether the dma-buf is actually signaled, breaking pretty much everything depending on it.
NAK, the dma_resv_for_each_fence_unlocked() returns only unsignaled fences. So the code is correct as it is.
That seems like it might cause some unexpected behavior when that function is called with one of the fence locks held, if it calls dma_fence_signal().
Could it be changed to only test the signaled bit, in which case this patch would still be useful?
That's exactly what I suggested as well, but Daniel was against that because of concerns around barriers.
I don't want open-coded bitmask tests, because the current code we have in dma-fence.c is missing barriers, and that doesn't get better if we spread that all around. But if you want this then wrap it in some static inline in dma-fence.h or so, that's fine. Just not open-coded outside of these files, like i915-gem code does a lot (which imo is just plain a disaster).
Or at least add some lockdep annotations, that fence->lock might be taken. So any hangs would at least be easy to spot with lockdep.
That should be trivial doable.
might_lock is trivial to add, but it's more complicated. The spinlock is provided by the fence code, which means there's lots of different lockdep classes. A might_lock on fence->lock is better than nothing, but maybe not good enough.
What we might need are a few more pieces: - a fake dma-fence spinlock lockdep key, maybe call it dma_fence_lock_key or so. - in dma_fence_init we lock dma_fence_lock_key, and then might_lock the actual spinlock passed as an argument. This establishes dependencies from that fake lock to all real fence spinlocks - anywhere we need a might lock we take dma_fence_lock_key instead
The potential issue here is that this might result in lockdep splats in cases where fences somehow naturally nest (maybe drm/sched job fence vs hw fence). So perhaps too much. -Daniel
linaro-mm-sig@lists.linaro.org