From: Jérôme Glisse jglisse@redhat.com
The debugfs take reference on fence without dropping them. Also the rcu section are not well balance. Fix all that ...
Signed-off-by: Jérôme Glisse jglisse@redhat.com Cc: Christian König christian.koenig@amd.com Cc: Daniel Vetter daniel.vetter@ffwll.ch Cc: Sumit Semwal sumit.semwal@linaro.org Cc: linux-media@vger.kernel.org Cc: dri-devel@lists.freedesktop.org Cc: linaro-mm-sig@lists.linaro.org Cc: Stéphane Marchesin marcheu@chromium.org Cc: stable@vger.kernel.org --- drivers/dma-buf/dma-buf.c | 11 +++++++++-- 1 file changed, 9 insertions(+), 2 deletions(-)
diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c index 13884474d158..f6f4de42ac49 100644 --- a/drivers/dma-buf/dma-buf.c +++ b/drivers/dma-buf/dma-buf.c @@ -1051,24 +1051,31 @@ static int dma_buf_debug_show(struct seq_file *s, void *unused) fobj = rcu_dereference(robj->fence); shared_count = fobj ? fobj->shared_count : 0; fence = rcu_dereference(robj->fence_excl); + fence = dma_fence_get_rcu(fence); if (!read_seqcount_retry(&robj->seq, seq)) break; rcu_read_unlock(); } - - if (fence) + if (fence) { seq_printf(s, "\tExclusive fence: %s %s %ssignalled\n", fence->ops->get_driver_name(fence), fence->ops->get_timeline_name(fence), dma_fence_is_signaled(fence) ? "" : "un"); + dma_fence_put(fence); + } + + rcu_read_lock(); for (i = 0; i < shared_count; i++) { fence = rcu_dereference(fobj->shared[i]); if (!dma_fence_get_rcu(fence)) continue; + rcu_read_unlock(); seq_printf(s, "\tShared fence: %s %s %ssignalled\n", fence->ops->get_driver_name(fence), fence->ops->get_timeline_name(fence), dma_fence_is_signaled(fence) ? "" : "un"); + dma_fence_put(fence); + rcu_read_lock(); } rcu_read_unlock();
Am 06.12.18 um 02:41 schrieb jglisse@redhat.com:
From: Jérôme Glisse jglisse@redhat.com
The debugfs take reference on fence without dropping them. Also the rcu section are not well balance. Fix all that ...
Signed-off-by: Jérôme Glisse jglisse@redhat.com Cc: Christian König christian.koenig@amd.com Cc: Daniel Vetter daniel.vetter@ffwll.ch Cc: Sumit Semwal sumit.semwal@linaro.org Cc: linux-media@vger.kernel.org Cc: dri-devel@lists.freedesktop.org Cc: linaro-mm-sig@lists.linaro.org Cc: Stéphane Marchesin marcheu@chromium.org Cc: stable@vger.kernel.org
Well NAK, you are now taking the RCU lock twice and dropping the RCU and still accessing fobj has a huge potential for accessing freed up memory.
The only correct thing I can see here is to grab a reference to the fence before printing any info on it, Christian.
drivers/dma-buf/dma-buf.c | 11 +++++++++-- 1 file changed, 9 insertions(+), 2 deletions(-)
diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c index 13884474d158..f6f4de42ac49 100644 --- a/drivers/dma-buf/dma-buf.c +++ b/drivers/dma-buf/dma-buf.c @@ -1051,24 +1051,31 @@ static int dma_buf_debug_show(struct seq_file *s, void *unused) fobj = rcu_dereference(robj->fence); shared_count = fobj ? fobj->shared_count : 0; fence = rcu_dereference(robj->fence_excl);
}fence = dma_fence_get_rcu(fence); if (!read_seqcount_retry(&robj->seq, seq)) break; rcu_read_unlock();
if (fence)
if (fence) { seq_printf(s, "\tExclusive fence: %s %s %ssignalled\n", fence->ops->get_driver_name(fence), fence->ops->get_timeline_name(fence), dma_fence_is_signaled(fence) ? "" : "un");
dma_fence_put(fence);
}
for (i = 0; i < shared_count; i++) { fence = rcu_dereference(fobj->shared[i]); if (!dma_fence_get_rcu(fence)) continue;rcu_read_lock();
rcu_read_unlock(); seq_printf(s, "\tShared fence: %s %s %ssignalled\n", fence->ops->get_driver_name(fence), fence->ops->get_timeline_name(fence), dma_fence_is_signaled(fence) ? "" : "un");
dma_fence_put(fence);
} rcu_read_unlock();rcu_read_lock();
On Thu, Dec 06, 2018 at 08:09:28AM +0000, Koenig, Christian wrote:
Am 06.12.18 um 02:41 schrieb jglisse@redhat.com:
From: Jérôme Glisse jglisse@redhat.com
The debugfs take reference on fence without dropping them. Also the rcu section are not well balance. Fix all that ...
Signed-off-by: Jérôme Glisse jglisse@redhat.com Cc: Christian König christian.koenig@amd.com Cc: Daniel Vetter daniel.vetter@ffwll.ch Cc: Sumit Semwal sumit.semwal@linaro.org Cc: linux-media@vger.kernel.org Cc: dri-devel@lists.freedesktop.org Cc: linaro-mm-sig@lists.linaro.org Cc: Stéphane Marchesin marcheu@chromium.org Cc: stable@vger.kernel.org
Well NAK, you are now taking the RCU lock twice and dropping the RCU and still accessing fobj has a huge potential for accessing freed up memory.
The only correct thing I can see here is to grab a reference to the fence before printing any info on it, Christian.
Hu ? That is exactly what i am doing, take reference under rcu, rcu_unlock print the fence info, drop the fence reference, rcu lock rinse and repeat ...
Note that the fobj in _existing_ code is access outside the rcu end that there is an rcu imbalance in that code ie a lonlely rcu_unlock after the for loop.
So that the existing code is broken.
drivers/dma-buf/dma-buf.c | 11 +++++++++-- 1 file changed, 9 insertions(+), 2 deletions(-)
diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c index 13884474d158..f6f4de42ac49 100644 --- a/drivers/dma-buf/dma-buf.c +++ b/drivers/dma-buf/dma-buf.c @@ -1051,24 +1051,31 @@ static int dma_buf_debug_show(struct seq_file *s, void *unused) fobj = rcu_dereference(robj->fence); shared_count = fobj ? fobj->shared_count : 0; fence = rcu_dereference(robj->fence_excl);
}fence = dma_fence_get_rcu(fence); if (!read_seqcount_retry(&robj->seq, seq)) break; rcu_read_unlock();
if (fence)
if (fence) { seq_printf(s, "\tExclusive fence: %s %s %ssignalled\n", fence->ops->get_driver_name(fence), fence->ops->get_timeline_name(fence), dma_fence_is_signaled(fence) ? "" : "un");
dma_fence_put(fence);
}
for (i = 0; i < shared_count; i++) { fence = rcu_dereference(fobj->shared[i]); if (!dma_fence_get_rcu(fence)) continue;rcu_read_lock();
rcu_read_unlock(); seq_printf(s, "\tShared fence: %s %s %ssignalled\n", fence->ops->get_driver_name(fence), fence->ops->get_timeline_name(fence), dma_fence_is_signaled(fence) ? "" : "un");
dma_fence_put(fence);
} rcu_read_unlock();rcu_read_lock();
Am 06.12.18 um 16:21 schrieb Jerome Glisse:
On Thu, Dec 06, 2018 at 08:09:28AM +0000, Koenig, Christian wrote:
Am 06.12.18 um 02:41 schrieb jglisse@redhat.com:
From: Jérôme Glisse jglisse@redhat.com
The debugfs take reference on fence without dropping them. Also the rcu section are not well balance. Fix all that ...
Signed-off-by: Jérôme Glisse jglisse@redhat.com Cc: Christian König christian.koenig@amd.com Cc: Daniel Vetter daniel.vetter@ffwll.ch Cc: Sumit Semwal sumit.semwal@linaro.org Cc: linux-media@vger.kernel.org Cc: dri-devel@lists.freedesktop.org Cc: linaro-mm-sig@lists.linaro.org Cc: Stéphane Marchesin marcheu@chromium.org Cc: stable@vger.kernel.org
Well NAK, you are now taking the RCU lock twice and dropping the RCU and still accessing fobj has a huge potential for accessing freed up memory.
The only correct thing I can see here is to grab a reference to the fence before printing any info on it, Christian.
Hu ? That is exactly what i am doing, take reference under rcu, rcu_unlock print the fence info, drop the fence reference, rcu lock rinse and repeat ...
Note that the fobj in _existing_ code is access outside the rcu end that there is an rcu imbalance in that code ie a lonlely rcu_unlock after the for loop.
So that the existing code is broken.
No, the existing code is perfectly fine.
Please note the break in the loop before the rcu_unlock();
if (!read_seqcount_retry(&robj->seq, seq)) break; <- HERE! rcu_read_unlock(); }
So your patch breaks that and take the RCU read lock twice.
Regards, Christian.
drivers/dma-buf/dma-buf.c | 11 +++++++++-- 1 file changed, 9 insertions(+), 2 deletions(-)
diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c index 13884474d158..f6f4de42ac49 100644 --- a/drivers/dma-buf/dma-buf.c +++ b/drivers/dma-buf/dma-buf.c @@ -1051,24 +1051,31 @@ static int dma_buf_debug_show(struct seq_file *s, void *unused) fobj = rcu_dereference(robj->fence); shared_count = fobj ? fobj->shared_count : 0; fence = rcu_dereference(robj->fence_excl);
fence = dma_fence_get_rcu(fence); if (!read_seqcount_retry(&robj->seq, seq)) break; rcu_read_unlock(); }
if (fence)
if (fence) { seq_printf(s, "\tExclusive fence: %s %s %ssignalled\n", fence->ops->get_driver_name(fence), fence->ops->get_timeline_name(fence), dma_fence_is_signaled(fence) ? "" : "un");
dma_fence_put(fence);
}
rcu_read_lock(); for (i = 0; i < shared_count; i++) { fence = rcu_dereference(fobj->shared[i]); if (!dma_fence_get_rcu(fence)) continue;
rcu_read_unlock(); seq_printf(s, "\tShared fence: %s %s %ssignalled\n", fence->ops->get_driver_name(fence), fence->ops->get_timeline_name(fence), dma_fence_is_signaled(fence) ? "" : "un");
dma_fence_put(fence);
rcu_read_lock(); } rcu_read_unlock();
On Thu, Dec 06, 2018 at 04:08:12PM +0000, Koenig, Christian wrote:
Am 06.12.18 um 16:21 schrieb Jerome Glisse:
On Thu, Dec 06, 2018 at 08:09:28AM +0000, Koenig, Christian wrote:
Am 06.12.18 um 02:41 schrieb jglisse@redhat.com:
From: Jérôme Glisse jglisse@redhat.com
The debugfs take reference on fence without dropping them. Also the rcu section are not well balance. Fix all that ...
Signed-off-by: Jérôme Glisse jglisse@redhat.com Cc: Christian König christian.koenig@amd.com Cc: Daniel Vetter daniel.vetter@ffwll.ch Cc: Sumit Semwal sumit.semwal@linaro.org Cc: linux-media@vger.kernel.org Cc: dri-devel@lists.freedesktop.org Cc: linaro-mm-sig@lists.linaro.org Cc: Stéphane Marchesin marcheu@chromium.org Cc: stable@vger.kernel.org
Well NAK, you are now taking the RCU lock twice and dropping the RCU and still accessing fobj has a huge potential for accessing freed up memory.
The only correct thing I can see here is to grab a reference to the fence before printing any info on it, Christian.
Hu ? That is exactly what i am doing, take reference under rcu, rcu_unlock print the fence info, drop the fence reference, rcu lock rinse and repeat ...
Note that the fobj in _existing_ code is access outside the rcu end that there is an rcu imbalance in that code ie a lonlely rcu_unlock after the for loop.
So that the existing code is broken.
No, the existing code is perfectly fine.
Please note the break in the loop before the rcu_unlock();
if (!read_seqcount_retry(&robj->seq, seq)) break; <- HERE! rcu_read_unlock(); }
So your patch breaks that and take the RCU read lock twice.
Ok missed that, i wonder if the refcount in balance explains the crash that was reported to me ... i sent a patch just for that.
Thank you for reviewing and pointing out the code i was oblivious too :)
Cheers, Jérôme
linaro-mm-sig@lists.linaro.org