On Wed, Sep 30, 2020 at 11:39:06AM +0200, Michel Dänzer wrote:
> On 2020-03-17 10:21 p.m., Jason Ekstrand wrote:
> > Explicit synchronization is the future. At least, that seems to be what
> > most userspace APIs are agreeing on at this point. However, most of our
> > Linux APIs (both userspace and kernel UAPI) are currently built around
> > implicit synchronization with dma-buf. While work is ongoing to change
> > many of the userspace APIs and protocols to an explicit synchronization
> > model, switching over piecemeal is difficult due to the number of
> > potential components involved. On the kernel side, many drivers use
> > dma-buf including GPU (3D/compute), display, v4l, and others. In
> > userspace, we have X11, several Wayland compositors, 3D drivers, compute
> > drivers (OpenCL etc.), media encode/decode, and the list goes on.
> >
> > This patch provides a path forward by allowing userspace to manually
> > manage the fences attached to a dma-buf. Alternatively, one can think
> > of this as making dma-buf's implicit synchronization simply a carrier
> > for an explicit fence. This is accomplished by adding two IOCTLs to
> > dma-buf for importing and exporting a sync file to/from the dma-buf.
> > This way a userspace component which uses explicit synchronization,
> > such as a Vulkan driver, can manually set the write fence on a buffer
> > before handing it off to an implicitly synchronized component such as a
> > Wayland compositor or video encoder. In this way, each of the different
> > components can be upgraded to an explicit synchronization model one at a
> > time as long as the userspace pieces connecting them are aware of it and
> > import/export fences at the right times.
> >
> > There is a potential race condition with this API if userspace is not
> > careful. A typical use case for implicit synchronization is to wait for
> > the dma-buf to be ready, use it, and then signal it for some other
> > component. Because a sync_file cannot be created until it is guaranteed
> > to complete in finite time, userspace can only signal the dma-buf after
> > it has already submitted the work which uses it to the kernel and has
> > received a sync_file back. There is no way to atomically submit a
> > wait-use-signal operation. This is not, however, really a problem with
> > this API so much as it is a problem with explicit synchronization
> > itself. The way this is typically handled is to have very explicit
> > ownership transfer points in the API or protocol which ensure that only
> > one component is using it at any given time. Both X11 (via the PRESENT
> > extension) and Wayland provide such ownership transfer points via
> > explicit present and idle messages.
> >
> > The decision was intentionally made in this patch to make the import and
> > export operations IOCTLs on the dma-buf itself rather than as a DRM
> > IOCTL. This makes the import/export operation universal across all
> > components which use dma-buf including GPU, display, v4l, and others.
> > It also means that a userspace component can do the import/export
> > without access to the DRM fd which may be tricky to get in cases where
> > the client communicates with DRM via a userspace API such as OpenGL or
> > Vulkan. At a future date we may choose to add direct import/export APIs
> > to components such as drm_syncobj to avoid allocating a file descriptor
> > and going through two ioctls. However, that seems to be something of a
> > micro-optimization as import/export operations are likely to happen at a
> > rate of a few per frame of rendered or decoded video.
> >
> > v2 (Jason Ekstrand):
> > - Use a wrapper dma_fence_array of all fences including the new one
> > when importing an exclusive fence.
> >
> > v3 (Jason Ekstrand):
> > - Lock around setting shared fences as well as exclusive
> > - Mark SIGNAL_SYNC_FILE as a read-write ioctl.
> > - Initialize ret to 0 in dma_buf_wait_sync_file
> >
> > v4 (Jason Ekstrand):
> > - Use the new dma_resv_get_singleton helper
> >
> > v5 (Jason Ekstrand):
> > - Rename the IOCTLs to import/export rather than wait/signal
> > - Drop the WRITE flag and always get/set the exclusive fence
> >
> > Signed-off-by: Jason Ekstrand <jason(a)jlekstrand.net>
>
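The import side of the flow described in the quoted commit message can be illustrated with a short userspace sketch. It assumes a UAPI along the lines this series proposes (a DMA_BUF_IOCTL_IMPORT_SYNC_FILE ioctl taking a struct carrying a sync_file fd); the struct and flag names follow the form later revisions took and are illustrative only, not quoted from this thread.

#include <errno.h>
#include <sys/ioctl.h>
#include <linux/dma-buf.h>      /* assumed to provide the new ioctl definitions */

/*
 * Sketch only: set the dma-buf's exclusive (write) fence from a sync_file
 * fd that the driver returned for the submitted work. Struct and flag
 * names may differ from the v5 patch quoted above.
 */
static int dmabuf_attach_write_fence(int dmabuf_fd, int sync_file_fd)
{
        struct dma_buf_import_sync_file args = {
                .flags = DMA_BUF_SYNC_WRITE,    /* v5 always targets the exclusive fence */
                .fd = sync_file_fd,
        };

        if (ioctl(dmabuf_fd, DMA_BUF_IOCTL_IMPORT_SYNC_FILE, &args))
                return -errno;
        return 0;
}
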
> What's the status of this? DMA_BUF_IOCTL_EXPORT_SYNC_FILE would be useful
> for Wayland compositors to wait for client buffers to become ready without
> being prone to getting delayed by later HW access to them, so it would be
> nice to merge that at least (if DMA_BUF_IOCTL_IMPORT_SYNC_FILE is still
> controversial).
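For the compositor case Michel describes, here is a minimal sketch of how DMA_BUF_IOCTL_EXPORT_SYNC_FILE could be used to wait for a client buffer without being delayed by work submitted later (again assuming the struct and flag layout of later revisions of this series; illustrative only):

#include <errno.h>
#include <poll.h>
#include <sys/ioctl.h>
#include <unistd.h>
#include <linux/dma-buf.h>      /* assumed to provide the export ioctl */

/*
 * Sketch only: snapshot the fences currently attached to the dma-buf as a
 * sync_file and poll on it. Work submitted to the buffer afterwards does
 * not delay this wait, which is the property described above.
 */
static int dmabuf_wait_ready(int dmabuf_fd)
{
        struct dma_buf_export_sync_file args = {
                .flags = DMA_BUF_SYNC_READ,     /* we only read, so wait for writers */
        };
        struct pollfd pfd;
        int ret;

        if (ioctl(dmabuf_fd, DMA_BUF_IOCTL_EXPORT_SYNC_FILE, &args))
                return -errno;

        pfd.fd = args.fd;               /* sync_file fd returned by the kernel */
        pfd.events = POLLIN;            /* becomes readable once all fences signal */
        do {
                ret = poll(&pfd, 1, -1);
        } while (ret < 0 && errno == EINTR);
        ret = ret < 0 ? -errno : 0;

        close(args.fd);
        return ret;
}
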
I think the missing bits are just the usual stuff:
- igt testcases
- userspace using the new ioctls
- review of the entire pile
I don't think there's any fundamental objections aside from "no one ever
pushed this over the finish line".
Cheers, Daniel
--
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
Hi Alex,
On 22.09.2020 01:15, Alex Goins wrote:
> Tested-by: Alex Goins <agoins(a)nvidia.com>
>
> This change fixes a regression with drm_prime_sg_to_page_addr_arrays() and
> AMDGPU in v5.9.
Thanks for testing!
> Commit 39913934 similarly revamped AMDGPU to use sgtable helper functions. When
> it changed from dma_map_sg_attrs() to dma_map_sgtable(), as a side effect it
> started correctly updating sgt->nents to the return value of dma_map_sg_attrs().
> However, drm_prime_sg_to_page_addr_arrays() incorrectly uses sgt->nents to
> iterate over pages, rather than sgt->orig_nents, resulting in it now returning
> the incorrect number of pages on AMDGPU.
>
> I had written a patch that changes drm_prime_sg_to_page_addr_arrays() to use
> for_each_sgtable_sg() instead of for_each_sg(), iterating using sgt->orig_nents:
>
> - for_each_sg(sgt->sgl, sg, sgt->nents, count) {
> + for_each_sgtable_sg(sgt, sg, count) {
>
> This patch takes it further, but still has the effect of fixing the number of
> pages that drm_prime_sg_to_page_addr_arrays() returns. Something like this
> should be included in v5.9 to prevent a regression with AMDGPU.
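To illustrate the nents/orig_nents distinction Alex describes (a sketch, not part of the patch under discussion): after dma_map_sgtable(), sgt->nents counts DMA-mapped segments while sgt->orig_nents still counts the CPU-side entries, so page walks must use the latter, e.g. via the for_each_sgtable_*() helpers:

#include <linux/scatterlist.h>

/*
 * Sketch only: count the CPU pages of an sg_table. The walk is based on
 * sgt->orig_nents (via for_each_sgtable_page()), so it returns the same
 * result whether or not the table has been DMA-mapped.
 */
static unsigned int count_cpu_pages(struct sg_table *sgt)
{
        struct sg_page_iter piter;
        unsigned int npages = 0;

        for_each_sgtable_page(sgt, &piter, 0)
                npages++;

        return npages;
}
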
Probably the easiest way to handle a fix for v5.9 would be to simply
merge the latest version of this patch also to v5.9-rcX:
https://lore.kernel.org/dri-devel/20200904131711.12950-3-m.szyprowski@samsu…
This way we would get it fixed and avoid a possible conflict in -next.
Do you have any AMDGPU fixes for v5.9 in the queue? Maybe you can add
that patch to the queue? Dave: would it be okay that way?
Best regards
--
Marek Szyprowski, PhD
Samsung R&D Institute Poland
Am 28.09.20 um 09:37 schrieb Thomas Zimmermann:
> Hi
>
> Am 28.09.20 um 08:50 schrieb Christian König:
>> Am 27.09.20 um 21:16 schrieb Sam Ravnborg:
>>> Hi Thomas.
>>>
>>>>> struct simap {
>>>>>         union {
>>>>>                 void __iomem *vaddr_iomem;
>>>>>                 void *vaddr;
>>>>>         };
>>>>>         bool is_iomem;
>>>>> };
>>>>>
>>>>> Where simap is a shorthand for system_iomem_map
>>>>> And it could all be stuffed into an include/linux/simap.h file.
>>>>>
>>>>> Not totally sold on the simap name - but wanted to come up with
>>>>> something.
>>>> Yes. Others, myself included, have suggested using a name that does not
>>>> imply a connection to the dma-buf framework, but no one has come up with
>>>> a good name.
>>>>
>>>> I strongly dislike simap, as it's entirely non-obvious what it does.
>>>> dma-buf-map is not actually wrong. The structure represents the mapping
>>>> of a dma-able buffer in most cases.
>>>>
>>>>> With this approach users do not have to pull in dma-buf to use it and
>>>>> users will not confuse that this is only for dma-buf usage.
>>>> There's no need to enable dma-buf. It's all in the header file without
>>>> dependencies on dma-buf. It's really just the name.
>>>>
>>>> But there's something else to take into account. The whole issue here is
>>>> that the buffer is disconnected from its originating driver, so we don't
>>>> know which kind of memory ops we have to use. Thinking about it, I
>>>> realized that no one else seemed to have this problem until now.
>>>> Otherwise there would be a solution already. So maybe the dma-buf
>>>> framework *is* the native use case for this data structure.
>>> We have at least:
>>> linux/fb.h:
>>>         union {
>>>                 char __iomem *screen_base;      /* Virtual address */
>>>                 char *screen_buffer;
>>>         };
>>>
>>> Which solves more or less the same problem.
> I thought this was for convenience. The important is_iomem bit is missing.
>
>> I also already noted that in TTM we have exactly the same problem and a
>> whole bunch of helpers to allow operations on those pointers.
> What do you call this within TTM?
ttm_bus_placement, but I really don't like that name.
>
> The data structure represents a pointer to either system or I/O memory,
> but not necessarily device memory. It contains raw data. That would
> give something like
>
> struct databuf_map
> struct databuf_ptr
> struct dbuf_map
> struct dbuf_ptr
>
> My favorite would be dbuf_ptr. It's short and the API names would make
> sense: dbuf_ptr_clear() for clearing, dbuf_ptr_set_vaddr() to set an
> address, dbuf_ptr_incr() to increment, etc. Also, the _ptr indicates
> that it's a single address; not an offset with length.
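For illustration, a minimal sketch of what such a type and its helpers could look like, using the dbuf_ptr naming floated above (names and semantics are illustrative only, not a merged API):

#include <linux/types.h>

/* Illustrative only: a pointer to either system or I/O memory. */
struct dbuf_ptr {
        union {
                void __iomem *vaddr_iomem;      /* I/O memory mapping */
                void *vaddr;                    /* system memory mapping */
        };
        bool is_iomem;
};

static inline void dbuf_ptr_clear(struct dbuf_ptr *ptr)
{
        ptr->vaddr = NULL;
        ptr->is_iomem = false;
}

static inline void dbuf_ptr_set_vaddr(struct dbuf_ptr *ptr, void *vaddr)
{
        ptr->vaddr = vaddr;
        ptr->is_iomem = false;
}

static inline void dbuf_ptr_incr(struct dbuf_ptr *ptr, size_t incr)
{
        if (ptr->is_iomem)
                ptr->vaddr_iomem += incr;
        else
                ptr->vaddr += incr;
}
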
Puh, no idea. None of that sounds like it 100% hits the underlying
meaning of the structure.
Christian.
>
> Best regards
> Thomas
>
>> Christian.
>>
>>>
>>>> Anyway, if a better name than dma-buf-map comes in, I'm willing to
>>>> rename the thing. Otherwise I intend to merge the patchset by the end of
>>>> the week.
>>> Well, the main thing is that I think this should be moved away from
>>> dma-buf. But if indeed dma-buf is the only relevant user in drm then
>>> I am totally fine with the current naming.
>>>
>>> One alternative name that popped up in my head: struct sys_io_map {}
>>> But again, if this is kept in dma-buf then I am fine with the current
>>> naming.
>>>
>>> In other words, if you continue to think this is mostly a dma-buf
>>> thing all three patches are:
>>> Acked-by: Sam Ravnborg <sam(a)ravnborg.org>
>>>
>>> Sam
Hi Thomas.
> >
> > struct simap {
> >         union {
> >                 void __iomem *vaddr_iomem;
> >                 void *vaddr;
> >         };
> >         bool is_iomem;
> > };
> >
> > Where simap is a shorthand for system_iomem_map
> > And it could all be stuffed into an include/linux/simap.h file.
> >
> > Not totally sold on the simap name - but wanted to come up with
> > something.
>
> Yes. Others, myself included, have suggested using a name that does not
> imply a connection to the dma-buf framework, but no one has come up with
> a good name.
>
> I strongly dislike simap, as it's entirely non-obvious what it does.
> dma-buf-map is not actually wrong. The structure represents the mapping
> of a dma-able buffer in most cases.
>
> >
> > With this approach users do not have to pull in dma-buf to use it and
> > users will not confuse that this is only for dma-buf usage.
>
> There's no need to enable dma-buf. It's all in the header file without
> dependencies on dma-buf. It's really just the name.
>
> But there's something else to take into account. The whole issue here is
> that the buffer is disconnected from its originating driver, so we don't
> know which kind of memory ops we have to use. Thinking about it, I
> realized that no one else seemed to have this problem until now.
> Otherwise there would be a solution already. So maybe the dma-buf
> framework *is* the native use case for this data structure.
We have at least:
linux/fb.h:
        union {
                char __iomem *screen_base;      /* Virtual address */
                char *screen_buffer;
        };
Which solves more or less the same problem.
> Anyway, if a better name than dma-buf-map comes in, I'm willing to
> rename the thing. Otherwise I intend to merge the patchset by the end of
> the week.
Well, the main thing is that I think this should be moved away from
dma-buf. But if indeed dma-buf is the only relevant user in drm then
I am totally fine with the current naming.
One alternative name that popped up in my head: struct sys_io_map {}
But again, if this is kept in dma-buf then I am fine with the current
naming.
In other words, if you continue to think this is mostly a dma-buf
thing all three patches are:
Acked-by: Sam Ravnborg <sam(a)ravnborg.org>
Sam
On Fri, Sep 25, 2020 at 04:51:38PM +0800, Tian Tao wrote:
> drm_modeset_lock.h already declares struct drm_device, so there's no
> need to declare it in vc4_drv.h
>
> Signed-off-by: Tian Tao <tiantao6(a)hisilicon.com>
Just an aside, when submitting patches please use
scripts/get_maintainer.pl to generate the recipient list. Looking through
the past few patches from you, the recipient list seems fairly arbitrary and
often misses the actual maintainers for a given piece of code, which greatly
increases the odds that the patch will get lost.
E.g. for this one I'm only like the 5th or so fallback person, and the
main maintainer isn't on the recipient list.
Cheers, Daniel
> ---
> drivers/gpu/drm/vc4/vc4_drv.h | 1 -
> 1 file changed, 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/vc4/vc4_drv.h b/drivers/gpu/drm/vc4/vc4_drv.h
> index 8c8d96b..8717a1c 100644
> --- a/drivers/gpu/drm/vc4/vc4_drv.h
> +++ b/drivers/gpu/drm/vc4/vc4_drv.h
> @@ -19,7 +19,6 @@
>
> #include "uapi/drm/vc4_drm.h"
>
> -struct drm_device;
> struct drm_gem_object;
>
> /* Don't forget to update vc4_bo.c: bo_type_names[] when adding to
> --
> 2.7.4
>
--
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
A NULL pointer dereference is observed when exporting a dmabuf and the
allocation of the 'struct file' fails. Dropping the allocated dentry
corresponding to this file in the dmabuf fs then ends up in
dma_buf_release(), which accesses the uninitialized dentry->d_fsdata.
Call stack on 5.4 is below:
dma_buf_release+0x2c/0x254 drivers/dma-buf/dma-buf.c:88
__dentry_kill+0x294/0x31c fs/dcache.c:584
dentry_kill fs/dcache.c:673 [inline]
dput+0x250/0x380 fs/dcache.c:859
path_put+0x24/0x40 fs/namei.c:485
alloc_file_pseudo+0x1a4/0x200 fs/file_table.c:235
dma_buf_getfile drivers/dma-buf/dma-buf.c:473 [inline]
dma_buf_export+0x25c/0x3ec drivers/dma-buf/dma-buf.c:585
Fix this by checking for a valid pointer in dentry->d_fsdata.
Fixes: 4ab59c3c638c ("dma-buf: Move dma_buf_release() from fops to dentry_ops")
Cc: <stable(a)vger.kernel.org> [5.7+]
Signed-off-by: Charan Teja Reddy <charante(a)codeaurora.org>
---
drivers/dma-buf/dma-buf.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c
index 58564d82..844967f 100644
--- a/drivers/dma-buf/dma-buf.c
+++ b/drivers/dma-buf/dma-buf.c
@@ -59,6 +59,8 @@ static void dma_buf_release(struct dentry *dentry)
         struct dma_buf *dmabuf;
 
         dmabuf = dentry->d_fsdata;
+        if (unlikely(!dmabuf))
+                return;
 
         BUG_ON(dmabuf->vmapping_counter);
--
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a
member of the Code Aurora Forum, hosted by The Linux Foundation
Hi Andrew,
I'm the new DMA-buf maintainer and Daniel and others came up with patches extending the use of the dma_buf_mmap() function.
Now this function is doing something a bit odd by changing vma->vm_file while installing a VMA in the mmap() system call.
The background here is that DMA-buf allows device drivers to export buffers which are then imported into another device driver. The mmap() handler of the importing device driver then finds that the pgoff belongs to the exporting device and so redirects the mmap() call there.
In other words, user space calls mmap() on one file descriptor but gets a different one mapped into its virtual address space.
My question is now: Is that legal or can you think of something which breaks here?
If it's not legal we should probably block any new users of the dma_buf_mmap() function and consider what should happen with the two existing ones.
If that is legal I would like to document this by adding a new vma_set_file() function which does the necessary reference count dance.
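For reference, a minimal sketch of the reference count dance such a helper would have to do (purely illustrative; the exact semantics would need to be settled in the actual patch):

#include <linux/fs.h>
#include <linux/mm_types.h>

/*
 * Illustrative only: swap the file a VMA points at while keeping the
 * reference counts balanced, pinning the new file before the old
 * reference is dropped.
 */
static void vma_set_file_sketch(struct vm_area_struct *vma, struct file *file)
{
        struct file *old_file = vma->vm_file;

        get_file(file);                 /* pin the new file first */
        vma->vm_file = file;
        if (old_file)
                fput(old_file);         /* then drop the VMA's old reference */
}
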
Thanks in advance,
Christian.