On 22/09/2021 13:15, Christian König wrote:
Am 22.09.21 um 13:46 schrieb Tvrtko Ursulin:
On 22/09/2021 11:21, Tvrtko Ursulin wrote:
On 22/09/2021 10:10, Christian König wrote:
This makes the function much simpler since the complex retry logic is now handled else where.
Signed-off-by: Christian König christian.koenig@amd.com
drivers/gpu/drm/i915/gem/i915_gem_busy.c | 35 ++++++++++-------------- 1 file changed, 14 insertions(+), 21 deletions(-)
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_busy.c b/drivers/gpu/drm/i915/gem/i915_gem_busy.c index 6234e17259c1..313afb4a11c7 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_busy.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_busy.c @@ -82,8 +82,8 @@ i915_gem_busy_ioctl(struct drm_device *dev, void *data, { struct drm_i915_gem_busy *args = data; struct drm_i915_gem_object *obj; - struct dma_resv_list *list; - unsigned int seq; + struct dma_resv_iter cursor; + struct dma_fence *fence; int err; err = -ENOENT; @@ -109,27 +109,20 @@ i915_gem_busy_ioctl(struct drm_device *dev, void *data, * to report the overall busyness. This is what the wait-ioctl does. * */ -retry: - seq = raw_read_seqcount(&obj->base.resv->seq);
- /* Translate the exclusive fence to the READ *and* WRITE engine */ - args->busy = busy_check_writer(dma_resv_excl_fence(obj->base.resv));
- /* Translate shared fences to READ set of engines */ - list = dma_resv_shared_list(obj->base.resv); - if (list) { - unsigned int shared_count = list->shared_count, i;
- for (i = 0; i < shared_count; ++i) { - struct dma_fence *fence = - rcu_dereference(list->shared[i]);
+ args->busy = false;
You can drop this line, especially since it is not a boolean. With that:
Reviewed-by: Tvrtko Ursulin tvrtko.ursulin@intel.com
Having said this, one thing to add in the commit message is some commentary that although simpler in code, the new implementation has a lot more atomic instructions due all the extra fence get/put.
Saying this because I remembered busy ioctl is quite an over-popular one. Thinking about traces from some real userspaces I looked at in the past.
So I think ack from maintainers will be required here. Because I just don't know if any performance impact will be visible or not. So view my r-b as "code looks fine" but I am on the fence if it should actually be merged. Probably leaning towards no actually - given how the code is localised here and I dislike burdening old platforms with more CPU time it could be cheaply left as is.
Well previously we would have allocated memory, which as far as I know has more overhead than a few extra atomic operations.
It doesn't, it only uses dma_resv_excl_fence and dma_resv_shared_list.
Regards,
Tvrtko
On the other hand I could as well stick with dma_resv_get_fences() here.
Regards, Christian.
Regards,
Tvrtko
Regards,
Tvrtko
+ dma_resv_iter_begin(&cursor, obj->base.resv, true); + dma_resv_for_each_fence_unlocked(&cursor, fence) { + if (dma_resv_iter_is_restarted(&cursor)) + args->busy = 0;
+ if (dma_resv_iter_is_exclusive(&cursor)) + /* Translate the exclusive fence to the READ *and* WRITE engine */ + args->busy |= busy_check_writer(fence); + else + /* Translate shared fences to READ set of engines */ args->busy |= busy_check_reader(fence); - } }
- if (args->busy && read_seqcount_retry(&obj->base.resv->seq, seq)) - goto retry; + dma_resv_iter_end(&cursor); err = 0; out: