On 20.09.2017 at 20:20, Daniel Vetter wrote:
On Mon, Sep 11, 2017 at 01:06:32PM +0200, Christian König wrote:
On 11.09.2017 at 12:01, Chris Wilson wrote:
[SNIP]
Yeah, but that is illegal with fence objects.
When anybody allocates fences this way it breaks at least reservation_object_get_fences_rcu(), reservation_object_wait_timeout_rcu() and reservation_object_test_signaled_single().
Many, many months ago I sent patches to fix them all.
Found those after a bit of searching. Yeah, those patches were proposed more than a year ago, but never pushed upstream.
Not sure if we really should go this way. dma_fence objects are shared between drivers, and since we can't judge whether it's the correct fence based on a criterion in the object (only on the read counter, which is outside), all drivers need to get this right.
I would rather change dma_fence_release() to wrap fence->ops->release in call_rcu(), so that the whole RCU handling stays outside of the individual drivers.
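To illustrate the idea, here is a rough userspace sketch, not kernel code and not an actual patch. Everything prefixed with mock_ is a hypothetical stand-in: mock_call_rcu() only queues the callback, and mock_grace_period() pretends a grace period has elapsed. The point is only that the common put path defers the driver's release callback, so a driver never has to care about RCU itself.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for the kernel's struct rcu_head. */
struct mock_rcu_head {
    struct mock_rcu_head *next;
    void (*func)(struct mock_rcu_head *head);
};

/* Callbacks that are still waiting for a (pretend) grace period. */
static struct mock_rcu_head *mock_pending;

/* Stand-in for call_rcu(): queue the callback instead of running it now. */
static void mock_call_rcu(struct mock_rcu_head *head,
                          void (*func)(struct mock_rcu_head *head))
{
    head->func = func;
    head->next = mock_pending;
    mock_pending = head;
}

/* Pretend a grace period has elapsed: run every queued callback. */
static void mock_grace_period(void)
{
    while (mock_pending) {
        struct mock_rcu_head *head = mock_pending;

        mock_pending = head->next;
        head->func(head);
    }
}

struct mock_fence;

struct mock_fence_ops {
    void (*release)(struct mock_fence *fence);
};

struct mock_fence {
    const struct mock_fence_ops *ops;
    struct mock_rcu_head rcu;
    int refcount;           /* plain int; the kernel uses a kref */
};

#define mock_container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/* Runs after the grace period and only then calls into the driver. */
static void mock_fence_release_rcu(struct mock_rcu_head *head)
{
    struct mock_fence *fence =
        mock_container_of(head, struct mock_fence, rcu);

    fence->ops->release(fence);
}

/* Common code: the driver's release is always deferred through RCU. */
static void mock_fence_put(struct mock_fence *fence)
{
    if (--fence->refcount == 0)
        mock_call_rcu(&fence->rcu, mock_fence_release_rcu);
}

static int mock_released;

/* A driver's release callback; it contains no RCU handling at all. */
static void mock_driver_release(struct mock_fence *fence)
{
    (void)fence;
    mock_released++;
}

/* Drops the last reference and checks that the release is deferred. */
int mock_demo(void)
{
    static const struct mock_fence_ops ops = {
        .release = mock_driver_release,
    };
    struct mock_fence fence = { .ops = &ops, .refcount = 1 };

    mock_released = 0;
    mock_fence_put(&fence);
    assert(mock_released == 0);   /* not released before the grace period */
    mock_grace_period();
    assert(mock_released == 1);   /* released exactly once afterwards */
    return mock_released;
}
```

In this sketch readers that still hold a pointer under the (pretend) RCU read lock would see valid memory until mock_grace_period() runs, regardless of how the driver implements its release callback.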
Hm, I entirely dropped the ball on this, I kinda assumed that we managed to get some agreement on this between i915 and dma_fence. Adding a pile more people.
In the meantime I've sent a v2 of this patch that fixes at least the buggy return of NULL when we fail to grab the RCU reference, but keeps the extra checking for now.
Can I get an rb on this please so that we fix at least the bug at hand?
Thanks, Christian.
Joonas, Tvrtko, I guess we need to fix this one way or the other. -Daniel