On 2017-07-22 00:20, Christian König wrote:
> From: Christian König <christian.koenig(a)amd.com>
>
> With hardware resets in mind it is possible that all shared fences are
> signaled, but the exclusive one isn't. Fix waiting for everything in this situation.
>
> Signed-off-by: Christian König <christian.koenig(a)amd.com>
> ---
> drivers/dma-buf/reservation.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/dma-buf/reservation.c b/drivers/dma-buf/reservation.c
> index e2eff86..ce3f9c1 100644
> --- a/drivers/dma-buf/reservation.c
> +++ b/drivers/dma-buf/reservation.c
> @@ -461,7 +461,7 @@ long reservation_object_wait_timeout_rcu(struct reservation_object *obj,
> }
> }
>
> - if (!shared_count) {
> + if (!fence) {
The previous code seems to be a bug: the exclusive fence isn't waited on at all
if shared_count != 0.
With your fix there is still a case where the exclusive fence can be
skipped: when fobj->shared[shared_count-1] isn't signaled.
Regards,
David Zhou
> struct dma_fence *fence_excl = rcu_dereference(obj->fence_excl);
>
> if (fence_excl &&
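To make David's concern concrete: even with the fix, the function can return right
after waiting on the last unsignaled shared fence, without ever looking at the
exclusive one. A rough, untested sketch of one way to close that hole follows; the
surrounding retry-loop code is reconstructed from the 4.12-era implementation rather
than quoted from the patch, so treat it as an illustration only. The idea is to keep
retrying after a successful wait so the "if (!fence)" path eventually picks up a
still-unsignaled exclusive fence:

	rcu_read_unlock();

	if (fence) {
		if (read_seqcount_retry(&obj->seq, seq)) {
			dma_fence_put(fence);
			goto retry;
		}

		ret = dma_fence_wait_timeout(fence, intr, ret);
		dma_fence_put(fence);
		/*
		 * Retry even after the last shared fence so the "if (!fence)"
		 * branch gets a chance to wait on the exclusive fence too.
		 */
		if (ret > 0 && wait_all)
			goto retry;
	}

	return ret;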
On Fri, Jul 21, 2017 at 06:20:01PM +0200, Christian König wrote:
> From: Christian König <christian.koenig(a)amd.com>
>
> With hardware resets in mind it is possible that all shared fences are
> signaled, but the exclusive one isn't. Fix waiting for everything in this situation.
How did you end up with both shared and exclusive fences on the same
reservation object? At least I thought the point of exclusive was that
it's exclusive (and has an implicit barrier on all previous shared
fences). Same for shared fences, they need to wait for the exclusive one
(and replace it).
Is this fallout from the amdgpu trickery where by default you do all
shared fences? I thought we'd aligned semantics a while back ...
-Daniel
>
> Signed-off-by: Christian König <christian.koenig(a)amd.com>
> ---
> drivers/dma-buf/reservation.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/dma-buf/reservation.c b/drivers/dma-buf/reservation.c
> index e2eff86..ce3f9c1 100644
> --- a/drivers/dma-buf/reservation.c
> +++ b/drivers/dma-buf/reservation.c
> @@ -461,7 +461,7 @@ long reservation_object_wait_timeout_rcu(struct reservation_object *obj,
> }
> }
>
> - if (!shared_count) {
> + if (!fence) {
> struct dma_fence *fence_excl = rcu_dereference(obj->fence_excl);
>
> if (fence_excl &&
> --
> 2.7.4
>
> _______________________________________________
> dri-devel mailing list
> dri-devel(a)lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/dri-devel
--
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
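For readers who don't have the implicit-fencing rules in their head, the convention
Daniel is describing maps onto the reservation_object API of that era roughly as
follows. This is a sketch of the expected caller pattern, not code from the thread;
obj, write_fence and read_fence are placeholders, error handling is minimal, and the
reservation ww_mutex is assumed to be held:

	/*
	 * Writer: the job behind write_fence is expected to synchronize with
	 * all previously attached fences, so it becomes the new exclusive
	 * fence (which also retires the shared slots).
	 */
	reservation_object_add_excl_fence(obj, write_fence);

	/*
	 * Reader: the job behind read_fence is expected to synchronize with
	 * the current exclusive fence, and is attached as a shared fence.
	 */
	ret = reservation_object_reserve_shared(obj);
	if (ret)
		return ret;
	reservation_object_add_shared_fence(obj, read_fence);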
2017-04-26 Christian König <deathsimple(a)vodafone.de>:
> Am 26.04.2017 um 16:46 schrieb Andres Rodriguez:
> > When a timeout of zero is specified, the caller is only interested in
> > the fence status.
> >
> > In the current implementation, dma_fence_default_wait will always call
> > schedule_timeout() at least once for an unsignaled fence. This adds a
> > significant overhead to a fence status query.
> >
> > Avoid this overhead by returning early if a zero timeout is specified.
> >
> > v2: move early return after enable_signaling
> >
> > Signed-off-by: Andres Rodriguez <andresx7(a)gmail.com>
>
> Reviewed-by: Christian König <christian.koenig(a)amd.com>
Pushed to drm-misc-next. Thanks, all.
Gustavo
From: Markus Elfring <elfring(a)users.sourceforge.net>
Date: Mon, 8 May 2017 11:05:05 +0200
A few update suggestions from static source code analysis
were taken into account.
Markus Elfring (4):
Combine two function calls into one in dma_buf_debug_show()
Improve a size determination in dma_buf_attach()
Adjust a null pointer check in dma_buf_attach()
Use seq_putc() in two functions
drivers/dma-buf/dma-buf.c | 8 +++-----
drivers/dma-buf/sync_debug.c | 6 +++---
2 files changed, 6 insertions(+), 8 deletions(-)
--
2.12.2
When a timeout of zero is specified, the caller is only interested in
the fence status.
In the current implementation, dma_fence_default_wait will always call
schedule_timeout() at least once for an unsignaled fence. This adds a
significant overhead to a fence status query.
Avoid this overhead by returning early if a zero timeout is specified.
v2: move early return after enable_signaling
Signed-off-by: Andres Rodriguez <andresx7(a)gmail.com>
---
If I'm understanding correctly, I don't think we need to register the
default wait callback. But if that isn't the case please let me know.
This patch has the same perf improvements as v1.
drivers/dma-buf/dma-fence.c | 5 +++++
1 file changed, 5 insertions(+)
diff --git a/drivers/dma-buf/dma-fence.c b/drivers/dma-buf/dma-fence.c
index 0918d3f..57da14c 100644
--- a/drivers/dma-buf/dma-fence.c
+++ b/drivers/dma-buf/dma-fence.c
@@ -402,6 +402,11 @@ dma_fence_default_wait(struct dma_fence *fence, bool intr, signed long timeout)
}
}
+ if (!timeout) {
+ ret = 0;
+ goto out;
+ }
+
cb.base.func = dma_fence_default_wait_cb;
cb.task = current;
list_add(&cb.base.node, &fence->cb_list);
--
2.9.3
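For what it's worth, here is a caller-side illustration of the status query this
optimizes. It is hypothetical usage, not part of the patch: a poll simply passes a
timeout of 0 and checks whether any timeout is left over.

	/*
	 * Hypothetical helper: true if the fence has signaled, false if it is
	 * still pending; with the zero-timeout fast path it never sleeps.
	 */
	static bool fence_status_poll(struct dma_fence *fence)
	{
		signed long ret = dma_fence_wait_timeout(fence, false, 0);

		/* > 0 means signaled, 0 means timed out (still pending) */
		return ret > 0;
	}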
On 2017-04-26 06:13 AM, Christian König wrote:
> Am 26.04.2017 um 11:59 schrieb Dave Airlie:
>> On 26 April 2017 at 17:20, Christian König <deathsimple(a)vodafone.de>
>> wrote:
>>> NAK, I'm wondering how often I have to reject that change. We should
>>> probably add a comment here.
>>>
>>> Even with a zero timeout we still need to enable signaling, otherwise
>>> some fences will never signal if userspace just polls on them.
>>>
>>> If a caller is only interested in the fence status without enabling
>>> signaling, it should call dma_fence_is_signaled() instead.
>> Can we not move the return 0 (with spin unlock) down to after we enable
>> signalling, but before we enter the schedule_timeout(1)?
>
> Yes, that would be an option.
>
I was actually arguing with Dave about this on IRC yesterday. Seems like
I owe him a beer now.
-Andres
> Christian.
>
>>
>> Dave.
>
>
On 26 April 2017 at 17:20, Christian König <deathsimple(a)vodafone.de> wrote:
> NAK, I'm wondering how often I have to reject that change. We should
> probably add a comment here.
>
> Even with a zero timeout we still need to enable signaling, otherwise some
> fences will never signal if userspace just polls on them.
>
> If a caller is only interested in the fence status without enabling
> signaling, it should call dma_fence_is_signaled() instead.
Can we not move the return 0 (with spin unlock) down to after we enable
signalling, but before we enter the schedule_timeout(1)?
Dave.
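The distinction Christian is drawing, spelled out as hypothetical caller code:
dma_fence_is_signaled() reports the current status without enabling interrupt-driven
signaling, while a wait with a zero timeout still goes through enable_signaling (with
the reordering Dave suggests, which is what the v2 patch above does), so repeated
polls from userspace still make progress:

	/* Pure status check: no signaling is enabled as a side effect. */
	if (dma_fence_is_signaled(fence))
		return 0;

	/*
	 * Zero-timeout wait: signaling is enabled before the early return,
	 * so an unsignaled fence will still signal eventually under polling.
	 */
	ret = dma_fence_wait_timeout(fence, false, 0);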
CC a few extra lists I missed.
Regards,
Andres
On 2017-04-25 09:36 PM, Andres Rodriguez wrote:
> When a timeout of zero is specified, the caller is only interested in
> the fence status.
>
> In the current implementation, dma_fence_default_wait will always call
> schedule_timeout() at least once for an unsignaled fence. This adds a
> significant overhead to a fence status query.
>
> Avoid this overhead by returning early if a zero timeout is specified.
>
> Signed-off-by: Andres Rodriguez <andresx7(a)gmail.com>
> ---
>
> This heavily affects the performance of the Source2 engine running on
> radv.
>
> This patch improves dota2(radv) perf on a i7-6700k+RX480 system from
> 72fps->81fps.
>
> drivers/dma-buf/dma-fence.c | 3 +++
> 1 file changed, 3 insertions(+)
>
> diff --git a/drivers/dma-buf/dma-fence.c b/drivers/dma-buf/dma-fence.c
> index 0918d3f..348e9e2 100644
> --- a/drivers/dma-buf/dma-fence.c
> +++ b/drivers/dma-buf/dma-fence.c
> @@ -380,6 +380,9 @@ dma_fence_default_wait(struct dma_fence *fence, bool intr, signed long timeout)
> if (test_bit(DMA_FENCE_FLAG_SIGNALED_BIT, &fence->flags))
> return ret;
>
> + if (!timeout)
> + return 0;
> +
> spin_lock_irqsave(fence->lock, flags);
>
> if (intr && signal_pending(current)) {
>
Hi,
This is v4 of the series to clean up Ion. Greg already took some of the patches
that weren't CMA related. There was a minor bisectability problem
with the CMA APIs, so this is a new version to address that. I also
addressed some minor comments on the patch that collapses the internal header files.
Thanks,
Laura
Laura Abbott (12):
cma: Store a name in the cma structure
cma: Introduce cma_for_each_area
staging: android: ion: Use CMA APIs directly
staging: android: ion: Stop butchering the DMA address
staging: android: ion: Break the ABI in the name of forward progress
staging: android: ion: Get rid of ion_phys_addr_t
staging: android: ion: Collapse internal header files
staging: android: ion: Rework heap registration/enumeration
staging: android: ion: Drop ion_map_kernel interface
staging: android: ion: Remove ion_handle and ion_client
staging: android: ion: Set query return value
staging/android: Update Ion TODO list
arch/powerpc/kvm/book3s_hv_builtin.c | 3 +-
drivers/base/dma-contiguous.c | 5 +-
drivers/staging/android/TODO | 21 +-
drivers/staging/android/ion/Kconfig | 32 +
drivers/staging/android/ion/Makefile | 11 +-
drivers/staging/android/ion/compat_ion.c | 152 -----
drivers/staging/android/ion/compat_ion.h | 29 -
drivers/staging/android/ion/ion-ioctl.c | 55 +-
drivers/staging/android/ion/ion.c | 812 ++----------------------
drivers/staging/android/ion/ion.h | 386 ++++++++---
drivers/staging/android/ion/ion_carveout_heap.c | 21 +-
drivers/staging/android/ion/ion_chunk_heap.c | 16 +-
drivers/staging/android/ion/ion_cma_heap.c | 120 ++--
drivers/staging/android/ion/ion_heap.c | 68 --
drivers/staging/android/ion/ion_page_pool.c | 3 +-
drivers/staging/android/ion/ion_priv.h | 453 -------------
drivers/staging/android/ion/ion_system_heap.c | 39 +-
drivers/staging/android/uapi/ion.h | 36 +-
include/linux/cma.h | 6 +-
mm/cma.c | 31 +-
mm/cma.h | 1 +
mm/cma_debug.c | 2 +-
22 files changed, 524 insertions(+), 1778 deletions(-)
delete mode 100644 drivers/staging/android/ion/compat_ion.c
delete mode 100644 drivers/staging/android/ion/compat_ion.h
delete mode 100644 drivers/staging/android/ion/ion_priv.h
--
2.7.4