Hi guys,
we are currently working on FreeSync and direct scanout from system
memory on AMD APUs in A+A laptops.
One problem we stumbled over is that our display hardware needs to scan
out from uncached system memory, and we currently don't have a way to
communicate that through DMA-buf.
For our specific use case at hand we are going to implement something
driver specific, but the question is: should we have something more
generic for this?
After all, the system memory access pattern is a PCIe extension and as
such something generic.
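
To make the question more concrete, here is a purely hypothetical sketch
of what such a generic hint could look like. Nothing like this exists in
DMA-buf today and the names are made up for discussion only:

/* Hypothetical hint an importer could set on its attachment so the
 * exporter knows scanout requires uncached system memory.
 */
enum dma_buf_mem_hint {
        DMA_BUF_MEM_HINT_NONE,
        DMA_BUF_MEM_HINT_UNCACHED,
};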
Regards,
Christian.
Hammer it in a bit more that iterators can be restarted and when that
matters, plus suggest preferring the locked version whenever possible.
Also delete the two leftover kerneldoc comments for static functions,
plus sprinkle some more links while at it.
Signed-off-by: Daniel Vetter <daniel.vetter(a)intel.com>
Cc: Sumit Semwal <sumit.semwal(a)linaro.org>
Cc: "Christian König" <christian.koenig(a)amd.com>
Cc: linux-media(a)vger.kernel.org
Cc: linaro-mm-sig(a)lists.linaro.org
---
drivers/dma-buf/dma-resv.c | 26 ++++++++++++--------------
include/linux/dma-resv.h | 13 ++++++++++++-
2 files changed, 24 insertions(+), 15 deletions(-)
diff --git a/drivers/dma-buf/dma-resv.c b/drivers/dma-buf/dma-resv.c
index 9eb2baa387d4..1453b664c405 100644
--- a/drivers/dma-buf/dma-resv.c
+++ b/drivers/dma-buf/dma-resv.c
@@ -323,12 +323,6 @@ void dma_resv_add_excl_fence(struct dma_resv *obj, struct dma_fence *fence)
}
EXPORT_SYMBOL(dma_resv_add_excl_fence);
-/**
- * dma_resv_iter_restart_unlocked - restart the unlocked iterator
- * @cursor: The dma_resv_iter object to restart
- *
- * Restart the unlocked iteration by initializing the cursor object.
- */
static void dma_resv_iter_restart_unlocked(struct dma_resv_iter *cursor)
{
cursor->seq = read_seqcount_begin(&cursor->obj->seq);
@@ -344,14 +338,6 @@ static void dma_resv_iter_restart_unlocked(struct dma_resv_iter *cursor)
cursor->is_restarted = true;
}
-/**
- * dma_resv_iter_walk_unlocked - walk over fences in a dma_resv obj
- * @cursor: cursor to record the current position
- *
- * Return all the fences in the dma_resv object which are not yet signaled.
- * The returned fence has an extra local reference so will stay alive.
- * If a concurrent modify is detected the whole iteration is started over again.
- */
static void dma_resv_iter_walk_unlocked(struct dma_resv_iter *cursor)
{
struct dma_resv *obj = cursor->obj;
@@ -387,6 +373,12 @@ static void dma_resv_iter_walk_unlocked(struct dma_resv_iter *cursor)
* dma_resv_iter_first_unlocked - first fence in an unlocked dma_resv obj.
* @cursor: the cursor with the current position
*
+ * Subsequent fences are iterated with dma_resv_iter_next_unlocked().
+ *
+ * Beware that the iterator can be restarted. Code which accumulates statistics
+ * or similar needs to check for this with dma_resv_iter_is_restarted(). For
+ * this reason prefer the locked dma_resv_iter_first() whenever possible.
+ *
* Returns the first fence from an unlocked dma_resv obj.
*/
struct dma_fence *dma_resv_iter_first_unlocked(struct dma_resv_iter *cursor)
@@ -406,6 +398,10 @@ EXPORT_SYMBOL(dma_resv_iter_first_unlocked);
* dma_resv_iter_next_unlocked - next fence in an unlocked dma_resv obj.
* @cursor: the cursor with the current position
*
+ * Beware that the iterator can be restarted. Code which accumulates statistics
+ * or similar needs to check for this with dma_resv_iter_is_restarted(). For
+ * this reason prefer the locked dma_resv_iter_next() whenever possible.
+ *
* Returns the next fence from an unlocked dma_resv obj.
*/
struct dma_fence *dma_resv_iter_next_unlocked(struct dma_resv_iter *cursor)
@@ -431,6 +427,8 @@ EXPORT_SYMBOL(dma_resv_iter_next_unlocked);
* dma_resv_iter_first - first fence from a locked dma_resv object
* @cursor: cursor to record the current position
*
+ * Subsequent fences are iterated with dma_resv_iter_next().
+ *
* Return the first fence in the dma_resv object while holding the
* &dma_resv.lock.
*/
diff --git a/include/linux/dma-resv.h b/include/linux/dma-resv.h
index dbd235ab447f..ebe908592ac3 100644
--- a/include/linux/dma-resv.h
+++ b/include/linux/dma-resv.h
@@ -153,6 +153,13 @@ struct dma_resv {
* struct dma_resv_iter - current position into the dma_resv fences
*
* Don't touch this directly in the driver, use the accessor function instead.
+ *
+ * IMPORTANT
+ *
+ * When using the lockless iterators like dma_resv_iter_next_unlocked() or
+ * dma_resv_for_each_fence_unlocked(), beware that the iterator can be restarted.
+ * Code which accumulates statistics or similar needs to check for this with
+ * dma_resv_iter_is_restarted().
*/
struct dma_resv_iter {
/** @obj: The dma_resv object we iterate over */
@@ -243,7 +250,11 @@ static inline bool dma_resv_iter_is_restarted(struct dma_resv_iter *cursor)
* &dma_resv.lock and using RCU instead. The cursor needs to be initialized
* with dma_resv_iter_begin() and cleaned up with dma_resv_iter_end(). Inside
* the iterator a reference to the dma_fence is held and the RCU lock dropped.
- * When the dma_resv is modified the iteration starts over again.
+ *
+ * Beware that the iterator can be restarted when the struct dma_resv for
+ * @cursor is modified. Code which accumulates statistics or similar needs to
+ * check for this with dma_resv_iter_is_restarted(). For this reason prefer the
+ * locked iterator dma_resv_for_each_fence() whenever possible.
*/
#define dma_resv_for_each_fence_unlocked(cursor, fence) \
for (fence = dma_resv_iter_first_unlocked(cursor); \
--
2.33.0
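
To illustrate the pattern the new kerneldoc pushes for, here is a
minimal, untested sketch (assuming obj is a struct dma_resv pointer the
caller holds a reference on). Any state accumulated across an unlocked
walk has to be thrown away when the iterator restarts:

struct dma_resv_iter cursor;
struct dma_fence *fence;
unsigned int num_fences = 0;

dma_resv_iter_begin(&cursor, obj, true);
dma_resv_for_each_fence_unlocked(&cursor, fence) {
        /* A concurrent modification restarted the walk, drop our state. */
        if (dma_resv_iter_is_restarted(&cursor))
                num_fences = 0;
        num_fences++;
}
dma_resv_iter_end(&cursor);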
On Thu, Nov 25, 2021 at 11:48 PM <guangming.cao(a)mediatek.com> wrote:
>
> From: Guangming <Guangming.Cao(a)mediatek.com>
>
> The previous version used 'sg_table.nents' to traverse the sg_table in the
> pages free flow.
> However, 'sg_table.nents' is reassigned by 'dma_map_sg' and then holds the
> number of entries created in the DMA address space.
> So using 'sg_table.nents' in the pages free flow causes some pages to never
> be freed.
>
> We should use 'sg_table.orig_nents' to free the pages here, and using the
> sgtable helper 'for_each_sgtable_sg' instead of the rather common
> 'for_each_sg' helper (which caused the memory leak) is much better.
>
> Fixes: d963ab0f15fb0 ("dma-buf: system_heap: Allocate higher order pages if available")
> Signed-off-by: Guangming <Guangming.Cao(a)mediatek.com>
> Reviewed-by: Robin Murphy <robin.murphy(a)arm.com>
> Cc: <stable(a)vger.kernel.org> # 5.11.*
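
For reference, the fixed free path then walks all originally allocated
entries, roughly like this (a sketch reconstructed from the description
above for the system_heap release path, not the verbatim patch):

struct sg_table *table = &buffer->sg_table;
struct scatterlist *sg;
int i;

/* Walks table->orig_nents entries, not the dma_map_sg() count. */
for_each_sgtable_sg(table, sg, i) {
        struct page *page = sg_page(sg);

        __free_pages(page, compound_order(page));
}
sg_free_table(table);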
Thanks so much for catching this and sending in all the revisions!
Reviewed-by: John Stultz <john.stultz(a)linaro.org>
On 30.11.21 at 15:35, Thomas Hellström wrote:
> On Tue, 2021-11-30 at 14:26 +0100, Christian König wrote:
>> On 30.11.21 at 13:56, Thomas Hellström wrote:
>>> On 11/30/21 13:42, Christian König wrote:
>>>> On 30.11.21 at 13:31, Thomas Hellström wrote:
>>>>> [SNIP]
>>>>>> Other than that, I didn't investigate the nesting fails enough to
>>>>>> say I can accurately review this. :)
>>>>> Basically the problem is that within enable_signaling(), which is
>>>>> called with the dma_fence lock held, we take the dma_fence lock of
>>>>> another fence. If that other fence is a dma_fence_array, or a
>>>>> dma_fence_chain which in turn tries to lock a dma_fence_array, we
>>>>> hit a splat.
>>>> Yeah, I already thought that you had constructed something like that.
>>>>
>>>> You get the splat because what you do here is illegal: you can't mix
>>>> dma_fence_array and dma_fence_chain like this, or you can end up with
>>>> stack corruption.
>>> Hmm. Ok, so what is the stack corruption? Is it that
>>> enable_signaling() will end up with endless recursion? If so,
>>> wouldn't it be more usable if we broke that recursion chain and
>>> allowed more general use?
>> The problem is that this is not easily possible for dma_fence_array
>> containers. Just imagine that you drop the last reference to the
>> contained fences during dma_fence_array destruction: if any of the
>> contained fences is another container, you can easily run into
>> recursion and with that stack corruption.
> Indeed, that would require some deeper surgery.
>
>> That's one of the major reasons I came up with the dma_fence_chain
>> container. This one you can chain any number of elements together
>> without running into any recursion.
>>
>>> Also what are the mixing rules between these? Never use a
>>> dma-fence-chain as one of the array fences and never use a
>>> dma-fence-array as a dma-fence-chain fence?
>> You can't add any other container to a dma_fence_array, neither other
>> dma_fence_array instances nor dma_fence_chain instances.
>>
>> IIRC at least technically a dma_fence_chain can contain a
>> dma_fence_array if you absolutely need that, but Daniel, Jason and I
>> already had the same discussion a while back and came to the
>> conclusion to avoid that as well if possible.
> Yes, this is actually the use-case. But what I can't easily guarantee
> is that that dma_fence_chain isn't fed into a dma_fence_array somewhere
> else. How do you typically avoid that?
>
> Meanwhile I guess I need to take a different approach in the driver to
> avoid this altogether.
Jason and I came up with a deep dive iterator for his use case, but I
think we don't want to use that any more after my dma_resv rework.
In other words, when you need to create a new dma_fence_array you flatten
out the existing construct, which is in the worst case
dma_fence_chain->dma_fence_array->dma_fence.
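
An untested sketch of what that flattening could look like; only the
chain level is expanded here (a contained dma_fence_array would need the
same treatment) and error handling is mostly elided:

static struct dma_fence_array *flatten(struct dma_fence *head)
{
        struct dma_fence **fences;
        struct dma_fence *iter;
        unsigned int count = 0, i = 0;

        /* First pass: count the chain links. */
        dma_fence_chain_for_each(iter, head)
                count++;

        fences = kmalloc_array(count, sizeof(*fences), GFP_KERNEL);
        if (!fences)
                return NULL;

        /* Second pass: collect the contained fences so the resulting
         * array holds no other containers.
         */
        dma_fence_chain_for_each(iter, head) {
                struct dma_fence_chain *chain = to_dma_fence_chain(iter);

                fences[i++] = dma_fence_get(chain ? chain->fence : iter);
        }

        return dma_fence_array_create(count, fences,
                                      dma_fence_context_alloc(1), 1,
                                      false);
}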
Regards,
Christian.
>
> /Thomas
>
>
>> Regards,
>> Christian.
>>
>>> /Thomas
>>>
>>>
>>>
>>>
>>>> Regards,
>>>> Christian.
>>>>
>>>>> But I'll update the commit message with a typical splat.
>>>>>
>>>>> /Thomas
>
On 30.11.21 at 13:56, Thomas Hellström wrote:
>
> On 11/30/21 13:42, Christian König wrote:
>>> On 30.11.21 at 13:31, Thomas Hellström wrote:
>>> [SNIP]
>>>> Other than that, I didn't investigate the nesting fails enough to
>>>> say I can accurately review this. :)
>>>
>>> Basically the problem is that within enable_signaling(), which is
>>> called with the dma_fence lock held, we take the dma_fence lock of
>>> another fence. If that other fence is a dma_fence_array, or a
>>> dma_fence_chain which in turn tries to lock a dma_fence_array, we hit
>>> a splat.
>>
>> Yeah, I already thought that you had constructed something like that.
>>
>> You get the splat because what you do here is illegal: you can't mix
>> dma_fence_array and dma_fence_chain like this, or you can end up with
>> stack corruption.
>
> Hmm. Ok, so what is the stack corruption? Is it that
> enable_signaling() will end up with endless recursion? If so, wouldn't
> it be more usable if we broke that recursion chain and allowed more
> general use?
The problem is that this is not easily possible for dma_fence_array
containers. Just imagine that you drop the last reference to the
contained fences during dma_fence_array destruction: if any of the
contained fences is another container, you can easily run into recursion
and with that stack corruption.
That's one of the major reasons I came up with the dma_fence_chain
container. This one you can chain any number of elements together
without running into any recursion.
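
For illustration, an untested sketch of that pattern: appending a fence
allocates a new chain link up front, so both signaling and release stay
iterative (dma_fence_chain_init() takes over the references to prev and
fence):

static struct dma_fence *chain_append(struct dma_fence *prev,
                                      struct dma_fence *fence, u64 seqno)
{
        struct dma_fence_chain *link = dma_fence_chain_alloc();

        if (!link) {
                dma_fence_put(prev);
                dma_fence_put(fence);
                return NULL;
        }

        dma_fence_chain_init(link, prev, fence, seqno);
        return &link->base; /* the new tail of the chain */
}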
> Also what are the mixing rules between these? Never use a
> dma-fence-chain as one of the array fences and never use a
> dma-fence-array as a dma-fence-chain fence?
You can't add any other container to a dma_fence_array, neither other
dma_fence_array instances nor dma_fence_chain instances.
IIRC at least technically a dma_fence_chain can contain a
dma_fence_array if you absolutely need that, but Daniel, Jason and I
already had the same discussion a while back and came to the conclusion
to avoid that as well if possible.
Regards,
Christian.
>
> /Thomas
>
>
>
>
>>
>> Regards,
>> Christian.
>>
>>>
>>> But I'll update the commit message with a typical splat.
>>>
>>> /Thomas
>>
On 30.11.21 at 13:31, Thomas Hellström wrote:
> [SNIP]
>> Other than that, I didn't investigate the nesting fails enough to say
>> I can accurately review this. :)
>
> Basically the problem is that within enable_signaling(), which is
> called with the dma_fence lock held, we take the dma_fence lock of
> another fence. If that other fence is a dma_fence_array, or a
> dma_fence_chain which in turn tries to lock a dma_fence_array, we hit a
> splat.
Yeah, I already thought that you had constructed something like that.
You get the splat because what you do here is illegal: you can't mix
dma_fence_array and dma_fence_chain like this, or you can end up with
stack corruption.
Regards,
Christian.
>
> But I'll update the commit message with a typical splat.
>
> /Thomas
On 30.11.21 at 13:19, Thomas Hellström wrote:
> As we introduce more usage of dma-fence-chain and dma-fence-array in the
> i915 driver, we start to hit lockdep splats due to the recursive fence
> locking in the dma-fence-chain and dma-fence-array containers.
> This is a humble suggestion to try to establish a dma-fence locking order
> (patch 1) and to avoid excessive recursive locking in these containers
> (patch 2)
Well completely NAK to this.
These splats are intentional notes that something in the driver code is
wrong (or that we messed up the chain and array containers somehow).
Those two containers are intentionally crafted in a way which avoids any
dependency between their spinlocks. So as long as you use them correctly
you should never see a splat.
Please provide the lockdep splat so that we can analyze the problem.
Thanks,
Christian.
>
> Thomas Hellström (2):
> dma-fence: Avoid establishing a locking order between fence classes
> dma-fence: Avoid excessive recursive fence locking from
> enable_signaling() callbacks
>
> drivers/dma-buf/dma-fence-array.c | 23 +++++++--
> drivers/dma-buf/dma-fence-chain.c | 12 ++++-
> drivers/dma-buf/dma-fence.c | 79 +++++++++++++++++++++----------
> include/linux/dma-fence.h | 4 ++
> 4 files changed, 89 insertions(+), 29 deletions(-)
>
> Cc: linaro-mm-sig(a)lists.linaro.org
> Cc: dri-devel(a)lists.freedesktop.org
> Cc: Christian König <christian.koenig(a)amd.com>
>
Hi everyone,
compared to the last version I've dropped the pruning as suggested by
Maarten, split the new DMA_RESV_USAGE_* patches from the general
introduction as suggested by Daniel, and renamed OTHERS to BOOKKEEP as
suggested by Pekka.
Please take a look and review,
Christian.
On 29.11.21 at 13:46, Thomas Hellström wrote:
> On Mon, 2021-11-29 at 13:33 +0100, Christian König wrote:
>> On 29.11.21 at 13:23, Thomas Hellström wrote:
>>> Hi, Christian,
>>>
>>> On Mon, 2021-11-29 at 09:21 +0100, Christian König wrote:
>>>> On 29.11.21 at 08:35, Thomas Hellström wrote:
>>>>> If a dma_fence_array is reported signaled by a call to
>>>>> dma_fence_is_signaled(), it may leak the PENDING_ERROR status.
>>>>>
>>>>> Fix this by clearing the PENDING_ERROR status if we return true in
>>>>> dma_fence_array_signaled().
>>>>>
>>>>> Fixes: 1f70b8b812f3 ("dma-fence: Propagate errors to dma-fence-array container")
>>>>> Cc: linaro-mm-sig(a)lists.linaro.org
>>>>> Cc: Christian König <ckoenig.leichtzumerken(a)gmail.com>
>>>>> Cc: Chris Wilson <chris(a)chris-wilson.co.uk>
>>>>> Signed-off-by: Thomas Hellström <thomas.hellstrom(a)linux.intel.com>
>>>> Reviewed-by: Christian König <christian.koenig(a)amd.com>
>>> How are the dma-buf / dma-fence patches typically merged? If i915 is
>>> the only fence->error user, could we take this through drm-intel to
>>> avoid a backmerge for upcoming i915 work?
>> Well that one here looks like a bugfix to me, so either through
>> drm-misc-fixes or some i915 -fixes branch sounds fine to me.
>>
>> If you have any new development based on that, a backmerge of the
>> -fixes into your -next branch is unavoidable anyway.
> Ok, I'll check with Joonas if I can take it through
> drm-intel-gt-next, since fixes are cherry-picked from that one. The
> patch will then appear in both the -fixes and the -next branch.
Well, that's exactly the stuff Daniel told me to avoid :)
But maybe your i915 workflow is somehow better at handling that than the
AMD workflow.
Christian.
>
> Thanks,
> /Thomas
>
>
>> Regards,
>> Christian.
>>
>>> /Thomas
>>>
>>>
>>>>> ---
>>>>> drivers/dma-buf/dma-fence-array.c | 6 +++++-
>>>>> 1 file changed, 5 insertions(+), 1 deletion(-)
>>>>>
>>>>> diff --git a/drivers/dma-buf/dma-fence-array.c b/drivers/dma-buf/dma-fence-array.c
>>>>> index d3fbd950be94..3e07f961e2f3 100644
>>>>> --- a/drivers/dma-buf/dma-fence-array.c
>>>>> +++ b/drivers/dma-buf/dma-fence-array.c
>>>>> @@ -104,7 +104,11 @@ static bool dma_fence_array_signaled(struct dma_fence *fence)
>>>>> {
>>>>> 	struct dma_fence_array *array = to_dma_fence_array(fence);
>>>>>
>>>>> -	return atomic_read(&array->num_pending) <= 0;
>>>>> +	if (atomic_read(&array->num_pending) > 0)
>>>>> +		return false;
>>>>> +
>>>>> +	dma_fence_array_clear_pending_error(array);
>>>>> +	return true;
>>>>> }
>>>>>
>>>>> static void dma_fence_array_release(struct dma_fence *fence)
>