On Tue, Nov 23, 2021 at 03:21:07PM +0100, Christian König wrote:
> This change adds the dma_resv_usage enum and allows us to specify why a
> dma_resv object is queried for its containing fences.
>
> In addition to that, a dma_resv_usage_rw() helper function is added to aid
> retrieving the fences for a read or write userspace submission.
>
> This is then deployed to the different query functions of the dma_resv
> object and all of their users.
>
> Signed-off-by: Christian König <christian.koenig(a)amd.com>
Just a few comments on the kerneldoc while I scroll through.
> EXPORT_SYMBOL(ib_umem_dmabuf_map_pages);
> diff --git a/include/linux/dma-resv.h b/include/linux/dma-resv.h
> index 062571c04bca..37552935bca6 100644
> --- a/include/linux/dma-resv.h
> +++ b/include/linux/dma-resv.h
> @@ -49,6 +49,86 @@ extern struct ww_class reservation_ww_class;
>
> struct dma_resv_list;
>
> +/**
> + * enum dma_resv_usage - how the fences from a dma_resv obj are used
> + *
> + * This enum describes the different use cases for a dma_resv object and
> + * controls which fences are returned when queried.
> + *
> + * An important fact is that there is the order KERNEL<WRITE<READ<OTHER and
> + * when the dma_resv object is asked for fences for one use case the fences
> + * for the lower use case are returned as well.
Might be good to replicate this to all functions that take a
dma_resv_usage flag, and then also add a "See enum dma_resv_usage for more
information." so we get a clickable hyperlink too.
> + *
> + * For example when asking for WRITE fences then the KERNEL fences are returned
> + * as well. Similar when asked for READ fences then both WRITE and KERNEL
> + * fences are returned as well.
> + */
> +enum dma_resv_usage {
> + /**
> + * @DMA_RESV_USAGE_KERNEL: For in kernel memory management only.
> + *
> + * This should only be used for things like copying or clearing memory
> + * with a DMA hardware engine for the purpose of kernel memory
> + * management.
> + *
> + * Drivers *always* need to wait for those fences before accessing the
> + * resource protected by the dma_resv object. The only exception for
> + * that is when the resource is known to be locked down in place by
> + * pinning it previously.
Should dma_buf_pin also do the wait for kernel fences? I think that would
also make the semantics a bit clearer in the dma-buf patch which does these
waits for us.
Or should dma_buf_pin be pipelined and it's up to callers to wait? For kms
that's definitely the semantics we want, but it's a bit of a playing-with-fire
situation, so maybe dma-buf should get the more idiot-proof semantics?
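E.g. with the pipelined semantics a caller would end up doing roughly this
(sketch only, based on the signatures introduced in this patch, error paths
and locking abbreviated):

    long lret;
    int ret;

    /* with the reservation lock held */
    ret = dma_buf_pin(attach);
    if (ret)
            return ret;

    /* Caller waits for the kernel fences itself before touching the
     * now-pinned resource.
     */
    lret = dma_resv_wait_timeout(attach->dmabuf->resv, DMA_RESV_USAGE_KERNEL,
                                 false, MAX_SCHEDULE_TIMEOUT);
    if (lret < 0)
            goto err_unpin;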
> + */
> + DMA_RESV_USAGE_KERNEL,
> +
> + /**
> + * @DMA_RESV_USAGE_WRITE: Implicit write synchronization.
> + *
> + * This should only be used for userspace command submissions which add
> + * an implicit write dependency.
> + */
> + DMA_RESV_USAGE_WRITE,
> +
> + /**
> + * @DMA_RESV_USAGE_READ: Implicit read synchronization.
> + *
> + * This should only be used for userspace command submissions which add
> + * an implicit read dependency.
> + */
> + DMA_RESV_USAGE_READ,
> +
> + /**
> + * @DMA_RESV_USAGE_OTHER: No implicit sync.
> + *
> + * This should be used for operations which don't want to add an
> + * implicit dependency at all, but still have a dependency on memory
> + * management.
> + *
> + * This might include things like preemption fences as well as device
> + * page table updates or even userspace command submissions.
I think we should highlight a bit more that for explicitly synchronized
userspace like vk OTHER is the normal case. So really not an exception.
Ofc aside from amdkfd there's currently no driver doing this, but really
we should have lots of them ...
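Maybe also an example from the memory-management side, since that's the one
place which must see these fences (rough sketch using the interfaces from
this patch; bo->base.resv as in TTM-on-GEM):

    /* Eviction/free must wait for *all* fences, OTHER included. */
    long lret = dma_resv_wait_timeout(bo->base.resv, DMA_RESV_USAGE_OTHER,
                                      false, MAX_SCHEDULE_TIMEOUT);
    if (lret < 0)
            return lret;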
> + *
> + * The kernel memory management *always* needs to wait for those fences
> + * before moving or freeing the resource protected by the dma_resv
> + * object.
> + */
> + DMA_RESV_USAGE_OTHER
> +};
> +
> +/**
> + * dma_resv_usage_rw - helper for implicit sync
> + * @write: true if we create a new implicit sync write
> + *
> + * This returns the implicit synchronization usage for write or read accesses.
Pls add "See enum dma_resv_usage for more details." or so. Never hurts to
be plentiful with links :-)
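Maybe even with a small example of the intended use in a CS ioctl (sketch
only, with obj locked; the job helper is made up):

    struct dma_resv_iter cursor;
    struct dma_fence *fence;

    /* A write waits for existing readers and writers, a read only for
     * existing writers - which is what dma_resv_usage_rw() maps to.
     */
    dma_resv_for_each_fence(&cursor, obj, dma_resv_usage_rw(is_write), fence)
            driver_job_add_dependency(job, fence); /* hypothetical helper */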
> + */
> +static inline enum dma_resv_usage dma_resv_usage_rw(bool write)
> +{
> + /* This looks confusing at first sight, but is indeed correct.
> + *
> + * The rationale is that new write operations need to wait for the
> + * existing read and write operations to finish.
> + * But a new read operation only needs to wait for the existing write
> + * operations to finish.
> + */
> + return write ? DMA_RESV_USAGE_READ : DMA_RESV_USAGE_WRITE;
> +}
> +
> /**
> * struct dma_resv - a reservation object manages fences for a buffer
> *
> @@ -147,8 +227,8 @@ struct dma_resv_iter {
> /** @obj: The dma_resv object we iterate over */
> struct dma_resv *obj;
>
> - /** @all_fences: If all fences should be returned */
> - bool all_fences;
> + /** @usage: Controls which fences are returned */
> + enum dma_resv_usage usage;
>
> /** @fence: the currently handled fence */
> struct dma_fence *fence;
> @@ -178,14 +258,14 @@ struct dma_fence *dma_resv_iter_next(struct dma_resv_iter *cursor);
> * dma_resv_iter_begin - initialize a dma_resv_iter object
> * @cursor: The dma_resv_iter object to initialize
> * @obj: The dma_resv object which we want to iterate over
> - * @all_fences: If all fences should be returned or just the exclusive one
> + * @usage: controls which fences to return
Please add the blurb here I mentioned above. Maybe adjust the text to use
the neatly highlighted @usage.
> */
> static inline void dma_resv_iter_begin(struct dma_resv_iter *cursor,
> struct dma_resv *obj,
> - bool all_fences)
> + enum dma_resv_usage usage)
> {
> cursor->obj = obj;
> - cursor->all_fences = all_fences;
> + cursor->usage = usage;
> cursor->fence = NULL;
> }
>
> @@ -242,7 +322,7 @@ static inline bool dma_resv_iter_is_restarted(struct dma_resv_iter *cursor)
> * dma_resv_for_each_fence - fence iterator
> * @cursor: a struct dma_resv_iter pointer
> * @obj: a dma_resv object pointer
> - * @all_fences: true if all fences should be returned
> + * @usage: controls which fences to return
> * @fence: the current fence
> *
Same, another place that needs the @usage clarification.
> * Iterate over the fences in a struct dma_resv object while holding the
> @@ -251,8 +331,8 @@ static inline bool dma_resv_iter_is_restarted(struct dma_resv_iter *cursor)
> * valid as long as the lock is held and so no extra reference to the fence is
> * taken.
> */
> -#define dma_resv_for_each_fence(cursor, obj, all_fences, fence) \
> - for (dma_resv_iter_begin(cursor, obj, all_fences), \
> +#define dma_resv_for_each_fence(cursor, obj, usage, fence) \
> + for (dma_resv_iter_begin(cursor, obj, usage), \
> fence = dma_resv_iter_first(cursor); fence; \
> fence = dma_resv_iter_next(cursor))
>
> @@ -421,14 +501,14 @@ void dma_resv_replace_fences(struct dma_resv *obj, uint64_t context,
> void dma_resv_add_excl_fence(struct dma_resv *obj, struct dma_fence *fence);
> void dma_resv_prune(struct dma_resv *obj);
> void dma_resv_prune_unlocked(struct dma_resv *obj);
> -int dma_resv_get_fences(struct dma_resv *obj, bool write,
> +int dma_resv_get_fences(struct dma_resv *obj, enum dma_resv_usage usage,
> unsigned int *num_fences, struct dma_fence ***fences);
> -int dma_resv_get_singleton(struct dma_resv *obj, bool write,
> +int dma_resv_get_singleton(struct dma_resv *obj, enum dma_resv_usage usage,
> struct dma_fence **fence);
> int dma_resv_copy_fences(struct dma_resv *dst, struct dma_resv *src);
> -long dma_resv_wait_timeout(struct dma_resv *obj, bool wait_all, bool intr,
> - unsigned long timeout);
> -bool dma_resv_test_signaled(struct dma_resv *obj, bool test_all);
> +long dma_resv_wait_timeout(struct dma_resv *obj, enum dma_resv_usage usage,
> + bool intr, unsigned long timeout);
> +bool dma_resv_test_signaled(struct dma_resv *obj, enum dma_resv_usage usage);
> void dma_resv_describe(struct dma_resv *obj, struct seq_file *seq);
It took endless amounts of discussion, but I think we're finally arriving at
something really neat and tiny here. Both the semantics, and how
drivers use them.
-Daniel
--
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
On 29.11.21 at 13:23, Thomas Hellström wrote:
> Hi, Christian,
>
> On Mon, 2021-11-29 at 09:21 +0100, Christian König wrote:
>> On 29.11.21 at 08:35, Thomas Hellström wrote:
>>> If a dma_fence_array is reported signaled by a call to
>>> dma_fence_is_signaled(), it may leak the PENDING_ERROR status.
>>>
>>> Fix this by clearing the PENDING_ERROR status if we return true in
>>> dma_fence_array_signaled().
>>>
>>> Fixes: 1f70b8b812f3 ("dma-fence: Propagate errors to dma-fence-
>>> array container")
>>> Cc: linaro-mm-sig(a)lists.linaro.org
>>> Cc: Christian König <ckoenig.leichtzumerken(a)gmail.com>
>>> Cc: Chris Wilson <chris(a)chris-wilson.co.uk>
>>> Signed-off-by: Thomas Hellström <thomas.hellstrom(a)linux.intel.com>
>> Reviewed-by: Christian König <christian.koenig(a)amd.com>
> How are the dma-buf / dma-fence patches typically merged? If i915 is
> the only fence->error user, could we take this through drm-intel to
> avoid a backmerge for upcoming i915 work?
Well, that one here looks like a bugfix to me, so either through
drm-misc-fixes or some i915 -fixes branch sounds fine to me.
If you have any new development based on that, a backmerge of the -fixes
into your -next branch is unavoidable anyway.
Regards,
Christian.
>
> /Thomas
>
>
>>> ---
>>> drivers/dma-buf/dma-fence-array.c | 6 +++++-
>>> 1 file changed, 5 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/drivers/dma-buf/dma-fence-array.c b/drivers/dma-
>>> buf/dma-fence-array.c
>>> index d3fbd950be94..3e07f961e2f3 100644
>>> --- a/drivers/dma-buf/dma-fence-array.c
>>> +++ b/drivers/dma-buf/dma-fence-array.c
>>> @@ -104,7 +104,11 @@ static bool dma_fence_array_signaled(struct
>>> dma_fence *fence)
>>> {
>>> struct dma_fence_array *array = to_dma_fence_array(fence);
>>>
>>> - return atomic_read(&array->num_pending) <= 0;
>>> + if (atomic_read(&array->num_pending) > 0)
>>> + return false;
>>> +
>>> + dma_fence_array_clear_pending_error(array);
>>> + return true;
>>> }
>>>
>>> static void dma_fence_array_release(struct dma_fence *fence)
>
On 11/27/21 5:05 PM, Jonathan Cameron wrote:
>> Non-coherent mapping with no cache sync:
>> - fileio:
>> read: 156 MiB/s
>> write: 123 MiB/s
>> - dmabuf:
>> read: 234 MiB/s (capped by sample rate)
>> write: 182 MiB/s
>>
>> Non-coherent reads with no cache sync + write-combine writes:
>> - fileio:
>> read: 156 MiB/s
>> write: 140 MiB/s
>> - dmabuf:
>> read: 234 MiB/s (capped by sample rate)
>> write: 210 MiB/s
>>
>>
>> A few things we can deduce from this:
>>
>> * Write-combine is not available on Zynq/ARM? If it was working, it
>> should give a better performance than the coherent mapping, but it
>> doesn't seem to do anything at all. At least it doesn't harm
>> performance.
> I'm not sure it's very relevant to this sort of streaming write.
> If you write a sequence of addresses then nothing stops them getting combined
> into a single write whether or not it is write-combining.
The difference is at which point they can get combined. With
write-combine they can be coalesced into a single transaction anywhere
in the interconnect, as early as the CPU itself. Without write-combine
the DDR controller might decide to combine them, but not earlier. This
can make a difference especially if the write is a narrow write, i.e.
the access size is smaller than the bus width.
Let's say you do 32-bit writes, but your bus is 64 bits wide. With WC two
32-bit writes can be combined into a 64-bit write. Without WC that is not
possible and you are potentially not using the bus to its fullest
capacity. This is especially true if the memory bus is wider than the
widest access size of the CPU.
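For reference, on the kernel side picking a write-combined CPU mapping is
just a different allocation/mapping flavour, e.g. (sketch, assuming the
dma_alloc_wc() helper; not code from this series):

    #include <linux/dma-mapping.h>

    void *vaddr;
    dma_addr_t dma_handle;

    /* CPU writes through this mapping may be merged before they reach
     * the memory bus, unlike a plain non-cached mapping.
     */
    vaddr = dma_alloc_wc(dev, size, &dma_handle, GFP_KERNEL);
    if (!vaddr)
            return -ENOMEM;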
Hi guys,
as discussed before, this set of patches completely reworks the dma_resv semantics
and spreads the new handling over all the existing drivers and users.
First of all, this drops the DAG approach, because it requires that every single
driver implement those relatively complicated rules correctly; any
violation immediately leads to either corruption of freed memory or
even more severe security problems.
Instead we just keep all fences around all the time until they are signaled.
Only fences with the same context are assumed to be signaled in the correct
order since this is exercised elsewhere as well. Replacing fences is now only
supported for hardware mechanisms like VM page table updates where the hardware
can guarantee that the resource can't be accessed any more.
Then the concept of a single exclusive fence and multiple shared fences is
dropped as well.
Instead the dma_resv object is now just a container for dma_fence objects where
each fence has associated usage flags. Those usage flags describe how the
operation represented by the dma_fence object is using the resource protected
by the dma_resv object. This allows us to add multiple fences for each usage
type.
In addition to the existing WRITE/READ usages, this patch set also adds the new
KERNEL and OTHER usages. The KERNEL usage is used in cases where the kernel
needs to do some operation with the resource protected by the dma_resv object,
like copies or clears. Those are mandatory to wait for when dynamic memory
management is used.
The OTHER usage is for cases where we don't want the operation represented
by the dma_fence object to participate in any implicit sync, but it still needs to be
respected by the kernel memory management. Examples for those are VM page table
updates and preemption fences.
While doing this the new implementation cleans up existing workarounds all over
the place, but especially in amdgpu and TTM. Surprisingly I also found two use
cases for the KERNEL/OTHER usage in i915 and Nouveau; those might need more
thought.
In general the existing functionality should be preserved; the only downside
is that we now always need to reserve a slot before adding a fence. The newly
added call to the reservation function can probably use some more cleanup.
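The add path in drivers then looks roughly like this (sketch only; the exact
reservation helper names may differ from what the series ends up with):

    ret = dma_resv_lock(obj, NULL);
    if (ret)
            return ret;

    /* A fence slot must now always be reserved up front. */
    ret = dma_resv_reserve_shared(obj, 1);
    if (!ret)
            dma_resv_add_shared_fence(obj, fence);

    dma_resv_unlock(obj);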
TODOs: Testing, testing, testing, double-checking the newly added
kerneldoc for any typos.
Please review and/or comment,
Christian.
On 26.11.21 at 08:49, guangming.cao(a)mediatek.com wrote:
> From: Guangming <Guangming.Cao(a)mediatek.com>
>
> In the previous version, 'sg_table.nents' was used to traverse the sg_table in
> the page free flow.
> However, 'sg_table.nents' is reassigned in 'dma_map_sg' and then holds the number
> of entries created in the DMA address space.
> So, using 'sg_table.nents' in the page free flow can cause some pages to never be freed.
>
> Here we should use sg_table.orig_nents to free the pages, but using the
> sgtable helper 'for_each_sgtable_sg' (instead of the previous, rather common
> helper 'for_each_sg', which may cause a memory leak) is much better.
>
> Fixes: d963ab0f15fb0 ("dma-buf: system_heap: Allocate higher order pages if available")
> Signed-off-by: Guangming <Guangming.Cao(a)mediatek.com>
> Reviewed-by: Robin Murphy <robin.murphy(a)arm.com>
Reviewed-by: Christian König <christian.koenig(a)amd.com>
> Cc: <stable(a)vger.kernel.org> # 5.11.*
> ---
> v4: Correct commit message
> 1. Cc stable(a)vger.kernel.org in commit message and add required kernel version.
> 2. Add reviewed-by since patch V2 and V4 are same and V2 is reviewed by Robin.
> 3. There is no new code change in V4.
> V3: Cc stable(a)vger.kernel.org
> 1. This patch needs to be merged stable branch, add stable(a)vger.kernel.org
> in mail list.
> 2. Correct some spelling mistake.
> 3. There is No new code change in V3.
> V2: use 'for_each_sgtable_sg' to replace 'for_each_sg' as suggested by Robin.
>
> ---
> drivers/dma-buf/heaps/system_heap.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/dma-buf/heaps/system_heap.c b/drivers/dma-buf/heaps/system_heap.c
> index 23a7e74ef966..8660508f3684 100644
> --- a/drivers/dma-buf/heaps/system_heap.c
> +++ b/drivers/dma-buf/heaps/system_heap.c
> @@ -289,7 +289,7 @@ static void system_heap_dma_buf_release(struct dma_buf *dmabuf)
> int i;
>
> table = &buffer->sg_table;
> - for_each_sg(table->sgl, sg, table->nents, i) {
> + for_each_sgtable_sg(table, sg, i) {
> struct page *page = sg_page(sg);
>
> __free_pages(page, compound_order(page));
On Fri, Nov 26, 2021 at 11:16:05AM +0800, guangming.cao(a)mediatek.com wrote:
> From: Guangming <Guangming.Cao(a)mediatek.com>
>
> In the previous version, 'sg_table.nents' was used to traverse the sg_table in
> the page free flow.
> However, 'sg_table.nents' is reassigned in 'dma_map_sg' and then holds the number
> of entries created in the DMA address space.
> So, using 'sg_table.nents' in the page free flow can cause some pages to never be freed.
>
> Here we should use sg_table.orig_nents to free the pages, but using the
> sgtable helper 'for_each_sgtable_sg' (instead of the previous, rather common
> helper 'for_each_sg', which may cause a memory leak) is much better.
>
> Fixes: d963ab0f15fb0 ("dma-buf: system_heap: Allocate higher order pages if available")
>
> Signed-off-by: Guangming <Guangming.Cao(a)mediatek.com>
> ---
> drivers/dma-buf/heaps/system_heap.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/dma-buf/heaps/system_heap.c b/drivers/dma-buf/heaps/system_heap.c
> index 23a7e74ef966..8660508f3684 100644
> --- a/drivers/dma-buf/heaps/system_heap.c
> +++ b/drivers/dma-buf/heaps/system_heap.c
> @@ -289,7 +289,7 @@ static void system_heap_dma_buf_release(struct dma_buf *dmabuf)
> int i;
>
> table = &buffer->sg_table;
> - for_each_sg(table->sgl, sg, table->nents, i) {
> + for_each_sgtable_sg(table, sg, i) {
> struct page *page = sg_page(sg);
>
> __free_pages(page, compound_order(page));
> --
> 2.17.1
>
<formletter>
This is not the correct way to submit patches for inclusion in the
stable kernel tree. Please read:
https://www.kernel.org/doc/html/latest/process/stable-kernel-rules.html
for how to do this properly.
</formletter>
25.11.2021 19:53, Akhil R wrote:
> Add support for the ACPI based device registration so that the driver
> can be also enabled through ACPI table.
>
> This does not include the ACPI support for Tegra VI and DVC I2C.
>
> Signed-off-by: Akhil R <akhilrajeev(a)nvidia.com>
> ---
> drivers/i2c/busses/i2c-tegra.c | 52 ++++++++++++++++++++++++++++++++----------
> 1 file changed, 40 insertions(+), 12 deletions(-)
>
> v4 changes:
> * changed has_acpi_companion to ACPI_HANDLE for consistency across
> all functions
> v3 - https://lkml.org/lkml/2021/11/25/173
> v2 - https://lkml.org/lkml/2021/11/23/82
> v1 - https://lkml.org/lkml/2021/11/19/393
Andy suggested making a v5 with an extra patch that makes use of the
generic i2c_timings, but that should be a separate change.
This patch looks good to me. Please provide the full changelog for each
version next time, instead of the links. Thanks!
If you'll make v5, then feel free to add my r-b:
Reviewed-by: Dmitry Osipenko <digetx(a)gmail.com>
25.11.2021 22:28, Andy Shevchenko wrote:
>> - err = reset_control_reset(i2c_dev->rst);
>> + if (handle)
>> + err = acpi_evaluate_object(handle, "_RST", NULL, NULL);
> Does it compile for CONFIG_ACPI=n case?
>
It compiles and works fine with CONFIG_ACPI=n.
On 2021-11-25 13:49, guangming.cao(a)mediatek.com wrote:
> From: Guangming <Guangming.Cao(a)mediatek.com>
>
> Use 'for_each_sgtable_sg' rather than 'for_each_sg' to traverse the
> sg_table when freeing it.
> Using 'for_each_sg' may cause some pages not to be freed
> when the wrong nents number is passed.
It's still worth spelling out that this is fixing a bug where the
current code should have been using table->orig_nents - it's just that
switching to the sgtable helper is the best way to make the fix, since
it almost entirely removes the possibility of making that (previously
rather common) mistake.
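To spell out the underlying relationship (illustrative sketch only, not the
heap code itself; error handling omitted):

    struct sg_table sgt;
    struct scatterlist *sg;
    int i;

    sg_alloc_table(&sgt, npages, GFP_KERNEL);   /* sgt.orig_nents == npages */
    sgt.nents = dma_map_sg(dev, sgt.sgl, sgt.orig_nents, DMA_BIDIRECTIONAL);
    /* sgt.nents may now be smaller than sgt.orig_nents if entries got merged */

    /* Freeing the pages must walk all orig_nents entries, which is what
     * for_each_sgtable_sg() does internally.
     */
    for_each_sgtable_sg(&sgt, sg, i)
            __free_pages(sg_page(sg), compound_order(sg_page(sg)));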
If it helps, for the change itself:
Reviewed-by: Robin Murphy <robin.murphy(a)arm.com>
Thanks,
Robin.
> Signed-off-by: Guangming <Guangming.Cao(a)mediatek.com>
> ---
> drivers/dma-buf/heaps/system_heap.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/dma-buf/heaps/system_heap.c b/drivers/dma-buf/heaps/system_heap.c
> index 23a7e74ef966..8660508f3684 100644
> --- a/drivers/dma-buf/heaps/system_heap.c
> +++ b/drivers/dma-buf/heaps/system_heap.c
> @@ -289,7 +289,7 @@ static void system_heap_dma_buf_release(struct dma_buf *dmabuf)
> int i;
>
> table = &buffer->sg_table;
> - for_each_sg(table->sgl, sg, table->nents, i) {
> + for_each_sgtable_sg(table, sg, i) {
> struct page *page = sg_page(sg);
>
> __free_pages(page, compound_order(page));
>