On Fri, May 15, 2020 at 02:07:06PM +0900, David Stevens wrote:
> On Thu, May 14, 2020 at 9:30 PM Daniel Vetter <daniel(a)ffwll.ch> wrote:
> > On Thu, May 14, 2020 at 05:19:40PM +0900, David Stevens wrote:
> > > Sorry for the duplicate reply, didn't notice this until now.
> > >
> > > > Just storing
> > > > the uuid should be doable (assuming this doesn't change during the
> > > > lifetime of the buffer), so no need for a callback.
> > >
> > > Directly storing the uuid doesn't work that well because of
> > > synchronization issues. The uuid needs to be shared between multiple
> > > virtio devices with independent command streams, so to prevent races
> > > between importing and exporting, the exporting driver can't share the
> > > uuid with other drivers until it knows that the device has finished
> > > registering the uuid. That requires a round trip to the device and back.
> > > Using a callback allows the latency of that round-trip registration to
> > > be hidden.
> >
> > Uh, that means you actually do something and there's locking involved.
> > That makes stuff more complicated; invariant attributes are a lot easier
> > generally. Wouldn't it work to just always register that uuid, and block
> > when you're exporting?
>
> Registering the id at creation and blocking in gem export is feasible,
> but it doesn't work well for systems with a centralized buffer
> allocator that doesn't support batch allocations (e.g. gralloc). In
> such a system, the round trip latency would almost certainly be
> included in the buffer allocation time. At least on the system I'm
> working on, I suspect that would add 10s of milliseconds of startup
> latency to video pipelines (although I haven't benchmarked the
> difference). Doing the blocking as late as possible means most or all
> of the latency can be hidden behind other pipeline setup work.
>
> In terms of complexity, I think the synchronization would be basically
> the same in either approach, just in different locations. All it would
> do is alleviate the need for a callback to fetch the UUID.
Hm ok. I guess if we go with the older patch, where all of this is mostly
just code in virtio, adding an extra function to allocate the uuid sounds
fine. Then synchronization is entirely up to the virtio subsystem and not
a dma-buf problem (and hence not mine). You can use dma_resv_lock or so,
but there's no need to. But with callbacks potentially going both ways,
things always get a bit interesting wrt locking - this is what makes
peer2peer dma-buf so painful right now. Hence I'd like to avoid that if
possible, at least at the dma-buf level. For the virtio code I don't mind
what you do there :-)
Cheers, Daniel
--
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
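For illustration, here is a minimal sketch of the deferred registration David
describes above (all my_* names are hypothetical, not taken from the posted
patches): the exporter queues the "register uuid" command when the buffer is
created and only blocks in ->get_uuid(), i.e. when an importer actually asks
for the uuid.

#include <linux/completion.h>
#include <linux/dma-buf.h>
#include <linux/uuid.h>

struct my_exported_buf {
	uuid_t uuid;
	struct completion uuid_registered;	/* completed by the virtqueue
						 * callback once the device has
						 * acked the registration */
};

static int my_export(struct my_exported_buf *buf)
{
	uuid_gen(&buf->uuid);
	init_completion(&buf->uuid_registered);
	/* asynchronous: queue the command, do not wait for the response */
	return my_queue_register_uuid_cmd(buf);
}

static int my_get_uuid(struct dma_buf *dmabuf, uuid_t *uuid)
{
	struct my_exported_buf *buf = dmabuf->priv;

	/* block only here, typically long after the command was queued */
	wait_for_completion(&buf->uuid_registered);
	uuid_copy(uuid, &buf->uuid);
	return 0;
}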
On Wed, Mar 11, 2020 at 12:20 PM David Stevens <stevensd(a)chromium.org> wrote:
>
> This change adds a new dma-buf operation that allows dma-bufs to be used
> by virtio drivers to share exported objects. The new operation allows
> the importing driver to query the exporting driver for the UUID which
> identifies the underlying exported object.
>
> Signed-off-by: David Stevens <stevensd(a)chromium.org>
Adding Tomasz Figa - I've discussed this with him at ELCE last year, I
think. Just to make sure.
Bunch of things:
- obviously we need the users of this in a few drivers, can't really
review anything stand-alone
- adding very specific ops to the generic interface is rather awkward,
eventually everyone wants that and we end up in a mess. I think the
best solution here would be if we create a struct virtio_dma_buf which
subclasses dma-buf, add a (hopefully safe) runtime upcasting function,
and then a virtio_dma_buf_get_uuid() function. Just storing
the uuid should be doable (assuming this doesn't change during the
lifetime of the buffer), so no need for a callback.
- for the runtime upcasting the usual approach is to check the ->ops
pointer. Which means that would need to be the same for all virtio
dma_bufs, which might get a bit awkward. But I'd really prefer we not
add allocator specific stuff like this to dma-buf.
-Daniel
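A rough sketch of the subclassing idea above (hypothetical, not necessarily
how an eventual implementation would look): all virtio exporters share one
dma_buf_ops table, so the ->ops pointer doubles as the runtime type check,
and the uuid itself is plain stored data, no callback needed.

#include <linux/dma-buf.h>
#include <linux/uuid.h>

/* shared by every virtio exporter, so the pointer identifies virtio
 * dma-bufs; the actual callbacks are omitted in this sketch */
static const struct dma_buf_ops virtio_dmabuf_ops;

struct virtio_dma_buf_priv {
	uuid_t uuid;
	/* exporter-specific state */
};

static bool is_virtio_dma_buf(struct dma_buf *dmabuf)
{
	return dmabuf->ops == &virtio_dmabuf_ops;
}

int virtio_dma_buf_get_uuid(struct dma_buf *dmabuf, uuid_t *uuid)
{
	struct virtio_dma_buf_priv *priv = dmabuf->priv;

	if (!is_virtio_dma_buf(dmabuf))
		return -EINVAL;

	uuid_copy(uuid, &priv->uuid);
	return 0;
}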
> ---
> drivers/dma-buf/dma-buf.c | 12 ++++++++++++
> include/linux/dma-buf.h | 18 ++++++++++++++++++
> 2 files changed, 30 insertions(+)
>
> diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c
> index d4097856c86b..fa5210ba6aaa 100644
> --- a/drivers/dma-buf/dma-buf.c
> +++ b/drivers/dma-buf/dma-buf.c
> @@ -1158,6 +1158,18 @@ void dma_buf_vunmap(struct dma_buf *dmabuf, void *vaddr)
> }
> EXPORT_SYMBOL_GPL(dma_buf_vunmap);
>
> +int dma_buf_get_uuid(struct dma_buf *dmabuf, uuid_t *uuid)
> +{
> + if (WARN_ON(!dmabuf) || !uuid)
> + return -EINVAL;
> +
> + if (!dmabuf->ops->get_uuid)
> + return -ENODEV;
> +
> + return dmabuf->ops->get_uuid(dmabuf, uuid);
> +}
> +EXPORT_SYMBOL_GPL(dma_buf_get_uuid);
> +
> #ifdef CONFIG_DEBUG_FS
> static int dma_buf_debug_show(struct seq_file *s, void *unused)
> {
> diff --git a/include/linux/dma-buf.h b/include/linux/dma-buf.h
> index abf5459a5b9d..00758523597d 100644
> --- a/include/linux/dma-buf.h
> +++ b/include/linux/dma-buf.h
> @@ -251,6 +251,21 @@ struct dma_buf_ops {
>
> void *(*vmap)(struct dma_buf *);
> void (*vunmap)(struct dma_buf *, void *vaddr);
> +
> + /**
> + * @get_uuid
> + *
> + * This is called by dma_buf_get_uuid to get the UUID which identifies
> + * the buffer to virtio devices.
> + *
> + * This callback is optional.
> + *
> + * Returns:
> + *
> + * 0 on success or a negative error code on failure. On success uuid
> + * will be populated with the buffer's UUID.
> + */
> + int (*get_uuid)(struct dma_buf *dmabuf, uuid_t *uuid);
> };
>
> /**
> @@ -444,4 +459,7 @@ int dma_buf_mmap(struct dma_buf *, struct vm_area_struct *,
> unsigned long);
> void *dma_buf_vmap(struct dma_buf *);
> void dma_buf_vunmap(struct dma_buf *, void *vaddr);
> +
> +int dma_buf_get_uuid(struct dma_buf *dmabuf, uuid_t *uuid);
> +
> #endif /* __DMA_BUF_H__ */
> --
> 2.25.1.481.gfbce0eb801-goog
>
--
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch
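For context, roughly how an importer would consume the op as posted above
(sketch only; the my_* helpers are placeholders, and an actual user in the
virtio-video driver is linked later in the thread):

static int my_import(struct my_virtio_dev *vdev, struct dma_buf *dmabuf)
{
	uuid_t uuid;
	int ret;

	ret = dma_buf_get_uuid(dmabuf, &uuid);
	if (ret == -ENODEV)
		/* exporter has no ->get_uuid, fall back to mapping the pages */
		return my_import_by_mapping(vdev, dmabuf);
	if (ret)
		return ret;

	/* refer to the already-exported object by uuid instead of its pages */
	return my_send_attach_backing_by_uuid(vdev, &uuid);
}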
On Thu, May 14, 2020 at 05:19:40PM +0900, David Stevens wrote:
> Sorry for the duplicate reply, didn't notice this until now.
>
> > Just storing
> > the uuid should be doable (assuming this doesn't change during the
> > lifetime of the buffer), so no need for a callback.
>
> Directly storing the uuid doesn't work that well because of
> synchronization issues. The uuid needs to be shared between multiple
> virtio devices with independent command streams, so to prevent races
> between importing and exporting, the exporting driver can't share the
> uuid with other drivers until it knows that the device has finished
> registering the uuid. That requires a round trip to the device and back.
> Using a callback allows the latency of that round-trip registration to
> be hidden.
Uh, that means you actually do something and there's locking involved.
That makes stuff more complicated; invariant attributes are a lot easier
generally. Wouldn't it work to just always register that uuid, and block
when you're exporting?
-Daniel
--
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
On Thu, May 14, 2020 at 11:08:52AM +0900, David Stevens wrote:
> On Thu, May 14, 2020 at 12:45 AM Daniel Vetter <daniel(a)ffwll.ch> wrote:
> > On Wed, Mar 11, 2020 at 12:20 PM David Stevens <stevensd(a)chromium.org> wrote:
> > >
> > > This change adds a new dma-buf operation that allows dma-bufs to be used
> > > by virtio drivers to share exported objects. The new operation allows
> > > the importing driver to query the exporting driver for the UUID which
> > > identifies the underlying exported object.
> > >
> > > Signed-off-by: David Stevens <stevensd(a)chromium.org>
> >
> > Adding Tomasz Figa - I've discussed this with him at ELCE last year, I
> > think. Just to make sure.
> >
> > Bunch of things:
> > - obviously we need the users of this in a few drivers, can't really
> > review anything stand-alone
>
> Here is a link to the usage of this feature by the currently under
> development virtio-video driver:
> https://markmail.org/thread/j4xlqaaim266qpks
>
> > - adding very specific ops to the generic interface is rather awkward,
> > eventually everyone wants that and we end up in a mess. I think the
> > best solution here would be if we create a struct virtio_dma_buf which
> > subclasses dma-buf, add a (hopefully safe) runtime upcasting function,
> > and then a virtio_dma_buf_get_uuid() function. Just storing
> > the uuid should be doable (assuming this doesn't change during the
> > lifetime of the buffer), so no need for a callback.
>
> So you would prefer a solution similar to the original version of this
> patchset? https://markmail.org/message/z7if4u56q5fmaok4
yup.
-Daniel
--
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
On Wed, May 13, 2020 at 05:40:26PM +0530, Charan Teja Kalla wrote:
>
> Thank you Greg for the comments.
> On 5/12/2020 2:22 PM, Greg KH wrote:
> > On Fri, May 08, 2020 at 12:11:03PM +0530, Charan Teja Reddy wrote:
> >> The following race occurs while accessing the dmabuf object exported as
> >> file:
> >> P1                                P2
> >> dma_buf_release()                 dmabuffs_dname()
> >>                                   [say lsof reading /proc/<P1 pid>/fd/<num>]
> >>
> >>                                   read dmabuf stored in dentry->d_fsdata
> >> Free the dmabuf object
> >>                                   Start accessing the dmabuf structure
> >>
> >> In the above description, the dmabuf object freed in P1 is being
> >> accessed from P2, which results in a use-after-free. Below is the
> >> reported dump stack.
> >>
> >> We read the dmabuf object stored in dentry->d_fsdata, but there is no
> >> binding between the dentry and the dmabuf, which means the dmabuf can
> >> be freed while it is being read from ->d_fsdata and still in use.
> >> Reviews on patch V1 say that protecting the in-use dmabuf with an extra
> >> refcount is not a viable solution, as the exported dmabuf is already
> >> under the file's refcount, and keeping multiple refcounts on the same
> >> object coordinated is not possible.
> >>
> >> As we read the dmabuf in ->d_fsdata just to get the user-passed name,
> >> we can store the name directly in d_fsdata and thus avoid reading the
> >> dmabuf altogether.
> >>
> >> Call Trace:
> >> kasan_report+0x12/0x20
> >> __asan_report_load8_noabort+0x14/0x20
> >> dmabuffs_dname+0x4f4/0x560
> >> tomoyo_realpath_from_path+0x165/0x660
> >> tomoyo_get_realpath
> >> tomoyo_check_open_permission+0x2a3/0x3e0
> >> tomoyo_file_open
> >> tomoyo_file_open+0xa9/0xd0
> >> security_file_open+0x71/0x300
> >> do_dentry_open+0x37a/0x1380
> >> vfs_open+0xa0/0xd0
> >> path_openat+0x12ee/0x3490
> >> do_filp_open+0x192/0x260
> >> do_sys_openat2+0x5eb/0x7e0
> >> do_sys_open+0xf2/0x180
> >>
> >> Fixes: bb2bb9030425 ("dma-buf: add DMA_BUF_SET_NAME ioctls")
> >> Reported-by: syzbot+3643a18836bce555bff6(a)syzkaller.appspotmail.com
> >> Cc: <stable(a)vger.kernel.org> [5.3+]
> >> Signed-off-by: Charan Teja Reddy <charante(a)codeaurora.org>
> >> ---
> >>
> >> Changes in v2:
> >>
> >> - Pass the user passed name in ->d_fsdata instead of dmabuf
> >> - Improve the commit message
> >>
> >> Changes in v1: (https://patchwork.kernel.org/patch/11514063/)
> >>
> >> drivers/dma-buf/dma-buf.c | 17 ++++++++++-------
> >> 1 file changed, 10 insertions(+), 7 deletions(-)
> >>
> >> diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c
> >> index 01ce125..0071f7d 100644
> >> --- a/drivers/dma-buf/dma-buf.c
> >> +++ b/drivers/dma-buf/dma-buf.c
> >> @@ -25,6 +25,7 @@
> >> #include <linux/mm.h>
> >> #include <linux/mount.h>
> >> #include <linux/pseudo_fs.h>
> >> +#include <linux/dcache.h>
> >>
> >> #include <uapi/linux/dma-buf.h>
> >> #include <uapi/linux/magic.h>
> >> @@ -40,15 +41,13 @@ struct dma_buf_list {
> >>
> >> static char *dmabuffs_dname(struct dentry *dentry, char *buffer, int buflen)
> >> {
> >> - struct dma_buf *dmabuf;
> >> char name[DMA_BUF_NAME_LEN];
> >> size_t ret = 0;
> >>
> >> - dmabuf = dentry->d_fsdata;
> >> - dma_resv_lock(dmabuf->resv, NULL);
> >> - if (dmabuf->name)
> >> - ret = strlcpy(name, dmabuf->name, DMA_BUF_NAME_LEN);
> >> - dma_resv_unlock(dmabuf->resv);
> >> + spin_lock(&dentry->d_lock);
> >
> > Are you sure this lock always protects d_fsdata?
>
> I think yes. In dma-buf.c, I have to make sure that d_fsdata is always
> accessed under d_lock, so it will be protected. (In this posted patch
> there is one place, in dma_buf_set_name(), that is missed; I will update
> this in V3.)
>
> >
> >> + if (dentry->d_fsdata)
> >> + ret = strlcpy(name, dentry->d_fsdata, DMA_BUF_NAME_LEN);
> >> + spin_unlock(&dentry->d_lock);
> >>
> >> return dynamic_dname(dentry, buffer, buflen, "/%s:%s",
> >> dentry->d_name.name, ret > 0 ? name : "");
> >
> > If the above check fails the name will be what? How could d_name.name
> > be valid but d_fsdata not be valid?
>
> If the check fails, the empty string "" is appended to the name by the
> code (ret > 0 ? name : ""; ret is initialized to zero). Thus the name
> string will look like "/dmabuf:".
So multiple objects can have the same "name" if this happens to multiple
ones at once?
> Regarding the validity of d_fsdata, we set the dmabuf's dentry->d_fsdata
> to NULL in dma_buf_release(), so it can become invalid if that dmabuf is
> on the free path.
Why are we allowing the name to be set if the dmabuf is on the free path
at all? Shouldn't that be the real fix here?
> >> @@ -80,12 +79,16 @@ static int dma_buf_fs_init_context(struct fs_context *fc)
> >> static int dma_buf_release(struct inode *inode, struct file *file)
> >> {
> >> struct dma_buf *dmabuf;
> >> + struct dentry *dentry = file->f_path.dentry;
> >>
> >> if (!is_dma_buf_file(file))
> >> return -EINVAL;
> >>
> >> dmabuf = file->private_data;
> >>
> >> + spin_lock(&dentry->d_lock);
> >> + dentry->d_fsdata = NULL;
> >> + spin_unlock(&dentry->d_lock);
> >> BUG_ON(dmabuf->vmapping_counter);
> >>
> >> /*
> >> @@ -343,6 +346,7 @@ static long dma_buf_set_name(struct dma_buf *dmabuf, const char __user *buf)
> >> }
> >> kfree(dmabuf->name);
> >> dmabuf->name = name;
> >> + dmabuf->file->f_path.dentry->d_fsdata = name;
> >
> > You are just changing the use of d_fsdata from being a pointer to the
> > dmabuf to being a pointer to the name string? What's to keep that name
> > string around and not have the same reference counting issues that the
> > dmabuf structure itself has? Who frees that string memory?
> >
>
> Yes, I am just storing the name string in d_fsdata in place of the dmabuf,
> which removes the need for any extra refcount. The user-passed name
> carried in d_fsdata is copied to the local buffer in dmabuffs_dname()
> under spin_lock(&dentry->d_lock), and the same d_fsdata is set to NULL
> (under the d_lock as well) when that dmabuf is on the release path. So,
> when d_fsdata is NULL, the name string is not accessed from
> dmabuffs_dname() and an extra refcount is not required.
>
> The string memory, stored in dmabuf->name, is released from
> dma_buf_release(). The flow is: it first sets d_fsdata = NULL and then
> frees dmabuf->name.
>
> However, from your comments I have realized that there is a race in this
> patch between dma_buf_set_name() and dmabuffs_dname() when using the name
> string. If the idea of storing the name string in place of the dmabuf in
> d_fsdata looks fine, I can address that in the next patch.
I'll leave that to the dmabuf authors/maintainers, but it feels odd to
me...
thanks,
greg k-h
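For reference, a minimal sketch (not the posted patch, just the direction
discussed above) of how dma_buf_set_name() could publish the new string under
d_lock, so that dmabuffs_dname() never observes a pointer that is about to be
freed; the attachment checks of the real function are omitted:

static long dma_buf_set_name(struct dma_buf *dmabuf, const char __user *buf)
{
	struct dentry *dentry = dmabuf->file->f_path.dentry;
	char *name, *old;

	name = strndup_user(buf, DMA_BUF_NAME_LEN);
	if (IS_ERR(name))
		return PTR_ERR(name);

	/* checks against existing attachments omitted for brevity */

	spin_lock(&dentry->d_lock);
	old = dentry->d_fsdata;	/* assumes d_fsdata only ever holds the
				 * name string or NULL */
	dentry->d_fsdata = name;
	spin_unlock(&dentry->d_lock);

	/*
	 * Safe to free outside the lock: dmabuffs_dname() copies the string
	 * while holding d_lock, so it can no longer observe 'old'.
	 */
	kfree(old);
	dmabuf->name = name;
	return 0;
}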
Dear All,
During the Exynos DRM GEM rework and fixing the issues in the
drm_prime_sg_to_page_addr_arrays() function [1], I've noticed that most
drivers in the DRM framework incorrectly use the nents and orig_nents
entries of struct sg_table.
In the case of most DMA-mapping implementations, exchanging those two
entries or using nents for all loops over the scatterlist is harmless,
because they both have the same value. There exist, however, DMA-mapping
implementations for which such incorrect usage breaks things. The nents
returned by dma_map_sg() might be lower than the nents passed as its
parameter, and this is perfectly fine. The DMA framework or IOMMU is
allowed to join consecutive chunks while mapping if such an operation is
supported by the underlying HW (bus, bridge, IOMMU, etc). An example of a
case where dma_map_sg() might return 1 'DMA' chunk for 4 'physical' pages
is described here [2].
The DMA-mapping framework documentation [3] states that dma_map_sg()
returns the number of created entries in the DMA address space. However,
the subsequent calls to dma_sync_sg_for_{device,cpu} and dma_unmap_sg must
be made with the original number of entries passed to dma_map_sg(). The
common pattern in DRM drivers was to assign the dma_map_sg() return value
to sg_table->nents and use that value for the subsequent calls to the
dma_sync_sg_* or dma_unmap_sg functions. The code also iterated nents
times to access the pages stored in the processed scatterlist, while it
should use orig_nents as the number of page entries.
I've tried to identify all such incorrect usage of sg_table->nents, and
this is the result of my research. It looks like the incorrect pattern has
been copied over many drivers, mainly in the DRM subsystem. Too bad that
in most cases it even worked correctly, as long as the system used a
simple, linear DMA-mapping implementation, for which swapping nents and
orig_nents doesn't make any difference. To avoid similar issues in the
future, I've introduced common wrappers for DMA-mapping calls, which
operate directly on sg_table objects. I've also added wrappers for
iterating over the scatterlists stored in sg_table objects and applied
them where possible. This, together with some common DRM prime helpers,
allowed me to almost get rid of all nents/orig_nents usage in the drivers.
I hope that such a change makes the code robust, easier to follow and
copy/paste safe.
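As an illustration, a sketch of the before/after pattern, using the
dma_map_sgtable()/for_each_sgtable_dma_sg() helpers added by the first two
patches; my_program_hw() is just a placeholder for whatever the driver does
with each DMA chunk:

/* Before: the common, subtly wrong pattern described above. */
sgt->nents = dma_map_sg(dev, sgt->sgl, sgt->orig_nents, DMA_BIDIRECTIONAL);
/* ... use the mapping ... */
dma_unmap_sg(dev, sgt->sgl, sgt->nents, DMA_BIDIRECTIONAL); /* must be orig_nents */

/* After: the wrappers keep the nents/orig_nents distinction internal. */
struct scatterlist *sg;
unsigned int i;
int ret;

ret = dma_map_sgtable(dev, sgt, DMA_BIDIRECTIONAL, 0);
if (ret)
	return ret;

for_each_sgtable_dma_sg(sgt, sg, i)	/* walks only the DMA-mapped chunks */
	my_program_hw(sg_dma_address(sg), sg_dma_len(sg));

dma_unmap_sgtable(dev, sgt, DMA_BIDIRECTIONAL, 0);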
The biggest TODO is the DRM/i915 driver and I don't feel brave enough to
fix it fully. The driver creatively uses sg_table->orig_nents to store the
size of the allocated scatterlist and ignores the number of entries
returned by the dma_map_sg function. In this patchset I only fixed the
sg_table objects exported by the dmabuf related functions. I hope that I
didn't break anything there.
Patches are based on top of Linux next-20200511.
Best regards,
Marek Szyprowski
References:
[1] https://lkml.org/lkml/2020/3/27/555
[2] https://lkml.org/lkml/2020/3/29/65
[3] Documentation/DMA-API-HOWTO.txt
Changelog:
v4:
- added for_each_sgtable_* wrappers and applied where possible
- added drm_prime_get_contiguous_size() and applied where possible
- applied drm_prime_sg_to_page_addr_arrays() where possible to remove page
extraction from sg_table objects
- added documentation for the introduced wrappers
- improved patches description a bit
v3: https://lore.kernel.org/dri-devel/20200505083926.28503-1-m.szyprowski@samsu…
- introduce dma_*_sgtable_* wrappers and use them in all patches
v2: https://lore.kernel.org/linux-iommu/c01c9766-9778-fd1f-f36e-2dc7bd376ba4@ar…
- dropped most of the changes to drm/i915
- added fixes for rcar-du, xen, media and ion
- fixed a few issues pointed by kbuild test robot
- added wide cc: list for each patch
v1: https://lore.kernel.org/linux-iommu/c01c9766-9778-fd1f-f36e-2dc7bd376ba4@ar…
- initial version
Patch summary:
Marek Szyprowski (38):
dma-mapping: add generic helpers for mapping sgtable objects
scatterlist: add generic wrappers for iterating over sgtable objects
iommu: add generic helper for mapping sgtable objects
drm: prime: add common helper to check scatterlist contiguity
drm: prime: use sgtable iterators in
drm_prime_sg_to_page_addr_arrays()
drm: core: fix common struct sg_table related issues
drm: amdgpu: fix common struct sg_table related issues
drm: armada: fix common struct sg_table related issues
drm: etnaviv: fix common struct sg_table related issues
drm: exynos: use common helper for a scatterlist contiguity check
drm: exynos: fix common struct sg_table related issues
drm: i915: fix common struct sg_table related issues
drm: lima: fix common struct sg_table related issues
drm: mediatek: use common helper for a scatterlist contiguity check
drm: mediatek: use common helper for extracting pages array
drm: msm: fix common struct sg_table related issues
drm: omapdrm: use common helper for extracting pages array
drm: omapdrm: fix common struct sg_table related issues
drm: panfrost: fix common struct sg_table related issues
drm: radeon: fix common struct sg_table related issues
drm: rockchip: use common helper for a scatterlist contiguity check
drm: rockchip: fix common struct sg_table related issues
drm: tegra: fix common struct sg_table related issues
drm: v3d: fix common struct sg_table related issues
drm: virtio: fix common struct sg_table related issues
drm: vmwgfx: fix common struct sg_table related issues
xen: gntdev: fix common struct sg_table related issues
drm: host1x: fix common struct sg_table related issues
drm: rcar-du: fix common struct sg_table related issues
dmabuf: fix common struct sg_table related issues
staging: ion: remove dead code
staging: ion: fix common struct sg_table related issues
staging: tegra-vde: fix common struct sg_table related issues
misc: fastrpc: fix common struct sg_table related issues
rapidio: fix common struct sg_table related issues
samples: vfio-mdev/mbochs: fix common struct sg_table related issues
media: pci: fix common ALSA DMA-mapping related codes
videobuf2: use sgtable-based scatterlist wrappers
drivers/dma-buf/heaps/heap-helpers.c | 13 ++--
drivers/dma-buf/udmabuf.c | 7 +-
drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c | 6 +-
drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 9 +--
drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c | 8 +-
drivers/gpu/drm/armada/armada_gem.c | 12 +--
drivers/gpu/drm/drm_cache.c | 2 +-
drivers/gpu/drm/drm_gem_cma_helper.c | 23 +-----
drivers/gpu/drm/drm_gem_shmem_helper.c | 14 ++--
drivers/gpu/drm/drm_prime.c | 86 ++++++++++++----------
drivers/gpu/drm/etnaviv/etnaviv_gem.c | 12 ++-
drivers/gpu/drm/etnaviv/etnaviv_mmu.c | 13 +---
drivers/gpu/drm/exynos/exynos_drm_g2d.c | 10 +--
drivers/gpu/drm/exynos/exynos_drm_gem.c | 23 +-----
drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c | 11 +--
drivers/gpu/drm/i915/gem/selftests/mock_dmabuf.c | 7 +-
drivers/gpu/drm/lima/lima_gem.c | 11 ++-
drivers/gpu/drm/lima/lima_vm.c | 5 +-
drivers/gpu/drm/mediatek/mtk_drm_gem.c | 34 ++-------
drivers/gpu/drm/msm/msm_gem.c | 13 ++--
drivers/gpu/drm/msm/msm_gpummu.c | 14 ++--
drivers/gpu/drm/msm/msm_iommu.c | 2 +-
drivers/gpu/drm/omapdrm/omap_gem.c | 20 ++---
drivers/gpu/drm/panfrost/panfrost_gem.c | 4 +-
drivers/gpu/drm/panfrost/panfrost_mmu.c | 7 +-
drivers/gpu/drm/radeon/radeon_ttm.c | 11 ++-
drivers/gpu/drm/rcar-du/rcar_du_vsp.c | 3 +-
drivers/gpu/drm/rockchip/rockchip_drm_gem.c | 42 +++--------
drivers/gpu/drm/tegra/gem.c | 27 +++----
drivers/gpu/drm/tegra/plane.c | 15 ++--
drivers/gpu/drm/v3d/v3d_mmu.c | 17 ++---
drivers/gpu/drm/virtio/virtgpu_object.c | 36 +++++----
drivers/gpu/drm/virtio/virtgpu_vq.c | 12 ++-
drivers/gpu/drm/vmwgfx/vmwgfx_ttm_buffer.c | 17 +----
drivers/gpu/host1x/job.c | 22 ++----
.../media/common/videobuf2/videobuf2-dma-contig.c | 41 +++++------
drivers/media/common/videobuf2/videobuf2-dma-sg.c | 32 +++-----
drivers/media/common/videobuf2/videobuf2-vmalloc.c | 12 +--
drivers/media/pci/cx23885/cx23885-alsa.c | 2 +-
drivers/media/pci/cx25821/cx25821-alsa.c | 2 +-
drivers/media/pci/cx88/cx88-alsa.c | 2 +-
drivers/media/pci/saa7134/saa7134-alsa.c | 2 +-
drivers/media/platform/vsp1/vsp1_drm.c | 8 +-
drivers/misc/fastrpc.c | 4 +-
drivers/rapidio/devices/rio_mport_cdev.c | 8 +-
drivers/staging/android/ion/ion.c | 25 +++----
drivers/staging/android/ion/ion.h | 1 -
drivers/staging/android/ion/ion_heap.c | 53 ++++---------
drivers/staging/android/ion/ion_system_heap.c | 2 +-
drivers/staging/media/tegra-vde/iommu.c | 4 +-
drivers/xen/gntdev-dmabuf.c | 13 ++--
include/drm/drm_prime.h | 2 +
include/linux/dma-mapping.h | 79 ++++++++++++++++++++
include/linux/iommu.h | 16 ++++
include/linux/scatterlist.h | 50 ++++++++++++-
samples/vfio-mdev/mbochs.c | 3 +-
56 files changed, 452 insertions(+), 477 deletions(-)
--
1.9.1
On Tue, May 12, 2020 at 10:10 PM Kazlauskas, Nicholas
<nicholas.kazlauskas(a)amd.com> wrote:
>
> On 2020-05-12 12:12 p.m., Daniel Vetter wrote:
> > On Tue, May 12, 2020 at 4:24 PM Alex Deucher <alexdeucher(a)gmail.com> wrote:
> >>
> >> On Tue, May 12, 2020 at 9:45 AM Daniel Vetter <daniel.vetter(a)ffwll.ch> wrote:
> >>>
> >>> On Tue, May 12, 2020 at 3:29 PM Alex Deucher <alexdeucher(a)gmail.com> wrote:
> >>>>
> >>>> On Tue, May 12, 2020 at 9:17 AM Daniel Vetter <daniel.vetter(a)ffwll.ch> wrote:
> >>>>>
> >>>>> On Tue, May 12, 2020 at 3:12 PM Alex Deucher <alexdeucher(a)gmail.com> wrote:
> >>>>>>
> >>>>>> On Tue, May 12, 2020 at 8:58 AM Daniel Vetter <daniel(a)ffwll.ch> wrote:
> >>>>>>>
> >>>>>>> On Tue, May 12, 2020 at 08:54:45AM -0400, Alex Deucher wrote:
> >>>>>>>> On Tue, May 12, 2020 at 5:00 AM Daniel Vetter <daniel.vetter(a)ffwll.ch> wrote:
> >>>>>>>>>
> >>>>>>>>> ...
> >>>>>>>>>
> >>>>>>>>> I think it's time to stop this little exercise.
> >>>>>>>>>
> >>>>>>>>> The lockdep splat, for the record:
> >>>>>>>>>
> >>>>>>>>> [ 132.583381] ======================================================
> >>>>>>>>> [ 132.584091] WARNING: possible circular locking dependency detected
> >>>>>>>>> [ 132.584775] 5.7.0-rc3+ #346 Tainted: G W
> >>>>>>>>> [ 132.585461] ------------------------------------------------------
> >>>>>>>>> [ 132.586184] kworker/2:3/865 is trying to acquire lock:
> >>>>>>>>> [ 132.586857] ffffc90000677c70 (crtc_ww_class_acquire){+.+.}-{0:0}, at: drm_atomic_helper_suspend+0x38/0x120 [drm_kms_helper]
> >>>>>>>>> [ 132.587569]
> >>>>>>>>> but task is already holding lock:
> >>>>>>>>> [ 132.589044] ffffffff82318c80 (dma_fence_map){++++}-{0:0}, at: drm_sched_job_timedout+0x25/0xf0 [gpu_sched]
> >>>>>>>>> [ 132.589803]
> >>>>>>>>> which lock already depends on the new lock.
> >>>>>>>>>
> >>>>>>>>> [ 132.592009]
> >>>>>>>>> the existing dependency chain (in reverse order) is:
> >>>>>>>>> [ 132.593507]
> >>>>>>>>> -> #2 (dma_fence_map){++++}-{0:0}:
> >>>>>>>>> [ 132.595019] dma_fence_begin_signalling+0x50/0x60
> >>>>>>>>> [ 132.595767] drm_atomic_helper_commit+0xa1/0x180 [drm_kms_helper]
> >>>>>>>>> [ 132.596567] drm_client_modeset_commit_atomic+0x1ea/0x250 [drm]
> >>>>>>>>> [ 132.597420] drm_client_modeset_commit_locked+0x55/0x190 [drm]
> >>>>>>>>> [ 132.598178] drm_client_modeset_commit+0x24/0x40 [drm]
> >>>>>>>>> [ 132.598948] drm_fb_helper_restore_fbdev_mode_unlocked+0x4b/0xa0 [drm_kms_helper]
> >>>>>>>>> [ 132.599738] drm_fb_helper_set_par+0x30/0x40 [drm_kms_helper]
> >>>>>>>>> [ 132.600539] fbcon_init+0x2e8/0x660
> >>>>>>>>> [ 132.601344] visual_init+0xce/0x130
> >>>>>>>>> [ 132.602156] do_bind_con_driver+0x1bc/0x2b0
> >>>>>>>>> [ 132.602970] do_take_over_console+0x115/0x180
> >>>>>>>>> [ 132.603763] do_fbcon_takeover+0x58/0xb0
> >>>>>>>>> [ 132.604564] register_framebuffer+0x1ee/0x300
> >>>>>>>>> [ 132.605369] __drm_fb_helper_initial_config_and_unlock+0x36e/0x520 [drm_kms_helper]
> >>>>>>>>> [ 132.606187] amdgpu_fbdev_init+0xb3/0xf0 [amdgpu]
> >>>>>>>>> [ 132.607032] amdgpu_device_init.cold+0xe90/0x1677 [amdgpu]
> >>>>>>>>> [ 132.607862] amdgpu_driver_load_kms+0x5a/0x200 [amdgpu]
> >>>>>>>>> [ 132.608697] amdgpu_pci_probe+0xf7/0x180 [amdgpu]
> >>>>>>>>> [ 132.609511] local_pci_probe+0x42/0x80
> >>>>>>>>> [ 132.610324] pci_device_probe+0x104/0x1a0
> >>>>>>>>> [ 132.611130] really_probe+0x147/0x3c0
> >>>>>>>>> [ 132.611939] driver_probe_device+0xb6/0x100
> >>>>>>>>> [ 132.612766] device_driver_attach+0x53/0x60
> >>>>>>>>> [ 132.613593] __driver_attach+0x8c/0x150
> >>>>>>>>> [ 132.614419] bus_for_each_dev+0x7b/0xc0
> >>>>>>>>> [ 132.615249] bus_add_driver+0x14c/0x1f0
> >>>>>>>>> [ 132.616071] driver_register+0x6c/0xc0
> >>>>>>>>> [ 132.616902] do_one_initcall+0x5d/0x2f0
> >>>>>>>>> [ 132.617731] do_init_module+0x5c/0x230
> >>>>>>>>> [ 132.618560] load_module+0x2981/0x2bc0
> >>>>>>>>> [ 132.619391] __do_sys_finit_module+0xaa/0x110
> >>>>>>>>> [ 132.620228] do_syscall_64+0x5a/0x250
> >>>>>>>>> [ 132.621064] entry_SYSCALL_64_after_hwframe+0x49/0xb3
> >>>>>>>>> [ 132.621903]
> >>>>>>>>> -> #1 (crtc_ww_class_mutex){+.+.}-{3:3}:
> >>>>>>>>> [ 132.623587] __ww_mutex_lock.constprop.0+0xcc/0x10c0
> >>>>>>>>> [ 132.624448] ww_mutex_lock+0x43/0xb0
> >>>>>>>>> [ 132.625315] drm_modeset_lock+0x44/0x120 [drm]
> >>>>>>>>> [ 132.626184] drmm_mode_config_init+0x2db/0x8b0 [drm]
> >>>>>>>>> [ 132.627098] amdgpu_device_init.cold+0xbd1/0x1677 [amdgpu]
> >>>>>>>>> [ 132.628007] amdgpu_driver_load_kms+0x5a/0x200 [amdgpu]
> >>>>>>>>> [ 132.628920] amdgpu_pci_probe+0xf7/0x180 [amdgpu]
> >>>>>>>>> [ 132.629804] local_pci_probe+0x42/0x80
> >>>>>>>>> [ 132.630690] pci_device_probe+0x104/0x1a0
> >>>>>>>>> [ 132.631583] really_probe+0x147/0x3c0
> >>>>>>>>> [ 132.632479] driver_probe_device+0xb6/0x100
> >>>>>>>>> [ 132.633379] device_driver_attach+0x53/0x60
> >>>>>>>>> [ 132.634275] __driver_attach+0x8c/0x150
> >>>>>>>>> [ 132.635170] bus_for_each_dev+0x7b/0xc0
> >>>>>>>>> [ 132.636069] bus_add_driver+0x14c/0x1f0
> >>>>>>>>> [ 132.636974] driver_register+0x6c/0xc0
> >>>>>>>>> [ 132.637870] do_one_initcall+0x5d/0x2f0
> >>>>>>>>> [ 132.638765] do_init_module+0x5c/0x230
> >>>>>>>>> [ 132.639654] load_module+0x2981/0x2bc0
> >>>>>>>>> [ 132.640522] __do_sys_finit_module+0xaa/0x110
> >>>>>>>>> [ 132.641372] do_syscall_64+0x5a/0x250
> >>>>>>>>> [ 132.642203] entry_SYSCALL_64_after_hwframe+0x49/0xb3
> >>>>>>>>> [ 132.643022]
> >>>>>>>>> -> #0 (crtc_ww_class_acquire){+.+.}-{0:0}:
> >>>>>>>>> [ 132.644643] __lock_acquire+0x1241/0x23f0
> >>>>>>>>> [ 132.645469] lock_acquire+0xad/0x370
> >>>>>>>>> [ 132.646274] drm_modeset_acquire_init+0xd2/0x100 [drm]
> >>>>>>>>> [ 132.647071] drm_atomic_helper_suspend+0x38/0x120 [drm_kms_helper]
> >>>>>>>>> [ 132.647902] dm_suspend+0x1c/0x60 [amdgpu]
> >>>>>>>>> [ 132.648698] amdgpu_device_ip_suspend_phase1+0x83/0xe0 [amdgpu]
> >>>>>>>>> [ 132.649498] amdgpu_device_ip_suspend+0x1c/0x60 [amdgpu]
> >>>>>>>>> [ 132.650300] amdgpu_device_gpu_recover.cold+0x4e6/0xe64 [amdgpu]
> >>>>>>>>> [ 132.651084] amdgpu_job_timedout+0xfb/0x150 [amdgpu]
> >>>>>>>>> [ 132.651825] drm_sched_job_timedout+0x8a/0xf0 [gpu_sched]
> >>>>>>>>> [ 132.652594] process_one_work+0x23c/0x580
> >>>>>>>>> [ 132.653402] worker_thread+0x50/0x3b0
> >>>>>>>>> [ 132.654139] kthread+0x12e/0x150
> >>>>>>>>> [ 132.654868] ret_from_fork+0x27/0x50
> >>>>>>>>> [ 132.655598]
> >>>>>>>>> other info that might help us debug this:
> >>>>>>>>>
> >>>>>>>>> [ 132.657739] Chain exists of:
> >>>>>>>>> crtc_ww_class_acquire --> crtc_ww_class_mutex --> dma_fence_map
> >>>>>>>>>
> >>>>>>>>> [ 132.659877] Possible unsafe locking scenario:
> >>>>>>>>>
> >>>>>>>>> [ 132.661416]        CPU0                    CPU1
> >>>>>>>>> [ 132.662126]        ----                    ----
> >>>>>>>>> [ 132.662847]   lock(dma_fence_map);
> >>>>>>>>> [ 132.663574]                                lock(crtc_ww_class_mutex);
> >>>>>>>>> [ 132.664319]                                lock(dma_fence_map);
> >>>>>>>>> [ 132.665063]   lock(crtc_ww_class_acquire);
> >>>>>>>>> [ 132.665799]
> >>>>>>>>> *** DEADLOCK ***
> >>>>>>>>>
> >>>>>>>>> [ 132.667965] 4 locks held by kworker/2:3/865:
> >>>>>>>>> [ 132.668701] #0: ffff8887fb81c938 ((wq_completion)events){+.+.}-{0:0}, at: process_one_work+0x1bc/0x580
> >>>>>>>>> [ 132.669462] #1: ffffc90000677e58 ((work_completion)(&(&sched->work_tdr)->work)){+.+.}-{0:0}, at: process_one_work+0x1bc/0x580
> >>>>>>>>> [ 132.670242] #2: ffffffff82318c80 (dma_fence_map){++++}-{0:0}, at: drm_sched_job_timedout+0x25/0xf0 [gpu_sched]
> >>>>>>>>> [ 132.671039] #3: ffff8887b84a1748 (&adev->lock_reset){+.+.}-{3:3}, at: amdgpu_device_gpu_recover.cold+0x59e/0xe64 [amdgpu]
> >>>>>>>>> [ 132.671902]
> >>>>>>>>> stack backtrace:
> >>>>>>>>> [ 132.673515] CPU: 2 PID: 865 Comm: kworker/2:3 Tainted: G W 5.7.0-rc3+ #346
> >>>>>>>>> [ 132.674347] Hardware name: System manufacturer System Product Name/PRIME X370-PRO, BIOS 4011 04/19/2018
> >>>>>>>>> [ 132.675194] Workqueue: events drm_sched_job_timedout [gpu_sched]
> >>>>>>>>> [ 132.676046] Call Trace:
> >>>>>>>>> [ 132.676897] dump_stack+0x8f/0xd0
> >>>>>>>>> [ 132.677748] check_noncircular+0x162/0x180
> >>>>>>>>> [ 132.678604] ? stack_trace_save+0x4b/0x70
> >>>>>>>>> [ 132.679459] __lock_acquire+0x1241/0x23f0
> >>>>>>>>> [ 132.680311] lock_acquire+0xad/0x370
> >>>>>>>>> [ 132.681163] ? drm_atomic_helper_suspend+0x38/0x120 [drm_kms_helper]
> >>>>>>>>> [ 132.682021] ? cpumask_next+0x16/0x20
> >>>>>>>>> [ 132.682880] ? module_assert_mutex_or_preempt+0x14/0x40
> >>>>>>>>> [ 132.683737] ? __module_address+0x28/0xf0
> >>>>>>>>> [ 132.684601] drm_modeset_acquire_init+0xd2/0x100 [drm]
> >>>>>>>>> [ 132.685466] ? drm_atomic_helper_suspend+0x38/0x120 [drm_kms_helper]
> >>>>>>>>> [ 132.686335] drm_atomic_helper_suspend+0x38/0x120 [drm_kms_helper]
> >>>>>>>>> [ 132.687255] dm_suspend+0x1c/0x60 [amdgpu]
> >>>>>>>>> [ 132.688152] amdgpu_device_ip_suspend_phase1+0x83/0xe0 [amdgpu]
> >>>>>>>>> [ 132.689057] ? amdgpu_fence_process+0x4c/0x150 [amdgpu]
> >>>>>>>>> [ 132.689963] amdgpu_device_ip_suspend+0x1c/0x60 [amdgpu]
> >>>>>>>>> [ 132.690893] amdgpu_device_gpu_recover.cold+0x4e6/0xe64 [amdgpu]
> >>>>>>>>> [ 132.691818] amdgpu_job_timedout+0xfb/0x150 [amdgpu]
> >>>>>>>>> [ 132.692707] drm_sched_job_timedout+0x8a/0xf0 [gpu_sched]
> >>>>>>>>> [ 132.693597] process_one_work+0x23c/0x580
> >>>>>>>>> [ 132.694487] worker_thread+0x50/0x3b0
> >>>>>>>>> [ 132.695373] ? process_one_work+0x580/0x580
> >>>>>>>>> [ 132.696264] kthread+0x12e/0x150
> >>>>>>>>> [ 132.697154] ? kthread_create_worker_on_cpu+0x70/0x70
> >>>>>>>>> [ 132.698057] ret_from_fork+0x27/0x50
> >>>>>>>>>
> >>>>>>>>> Cc: linux-media(a)vger.kernel.org
> >>>>>>>>> Cc: linaro-mm-sig(a)lists.linaro.org
> >>>>>>>>> Cc: linux-rdma(a)vger.kernel.org
> >>>>>>>>> Cc: amd-gfx(a)lists.freedesktop.org
> >>>>>>>>> Cc: intel-gfx(a)lists.freedesktop.org
> >>>>>>>>> Cc: Chris Wilson <chris(a)chris-wilson.co.uk>
> >>>>>>>>> Cc: Maarten Lankhorst <maarten.lankhorst(a)linux.intel.com>
> >>>>>>>>> Cc: Christian König <christian.koenig(a)amd.com>
> >>>>>>>>> Signed-off-by: Daniel Vetter <daniel.vetter(a)intel.com>
> >>>>>>>>> ---
> >>>>>>>>> drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 8 ++++++++
> >>>>>>>>> 1 file changed, 8 insertions(+)
> >>>>>>>>>
> >>>>>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> >>>>>>>>> index 3584e29323c0..b3b84a0d3baf 100644
> >>>>>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> >>>>>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> >>>>>>>>> @@ -2415,6 +2415,14 @@ static int amdgpu_device_ip_suspend_phase1(struct amdgpu_device *adev)
> >>>>>>>>> /* displays are handled separately */
> >>>>>>>>> if (adev->ip_blocks[i].version->type == AMD_IP_BLOCK_TYPE_DCE) {
> >>>>>>>>> /* XXX handle errors */
> >>>>>>>>> +
> >>>>>>>>> + /*
> >>>>>>>>> + * This is dm_suspend, which takes modeset locks, and
> >>>>>>>>> + * that's a pretty good inversion against dma_fence_signal,
> >>>>>>>>> + * which gpu recovery is supposed to guarantee.
> >>>>>>>>> + *
> >>>>>>>>> + * Dont ask me how to fix this.
> >>>>>>>>> + */
> >>>>>>>>
> >>>>>>>> We actually have a fix for this. Will be out shortly.
> >>>>>>>
> >>>>>>> Spoilers? The solid way is to sidestep the entire thing by avoiding
> >>>>>>> resetting the display block entirely. Fixing the locking while still
> >>>>>>> resetting the display is going to be really hard otoh ...
> >>>>>>
> >>>>>> There's no way to avoid that. On dGPUs at least, a full asic reset is
> >>>>>> a full asic reset. We mostly just skip the modeset and do the minimum
> >>>>>> amount necessary to get the display block into a good state for reset.
> >>>>>
> >>>>> But how do you restore the display afterwards? "[RFC 13/17]
> >>>>> drm/scheduler: use dma-fence annotations in tdr work" earlier in the
> >>>>> series has some ideas from me for at least some of the problems for
> >>>>> tdr when the display gets reset along with it.
> >>>>> Whacking the display while a modeset/flip/whatever is ongoing
> >>>>> concurrently doesn't sound like a good idea, so not sure how you can
> >>>>> do that without taking the drm_modeset_locks. And once you do that,
> >>>>> it's deadlock time.
> >>>>
> >>>> We cache the current display hw state and restore it after the reset
> >>>> without going through the atomic interfaces so everything is back the
> >>>> way it was before the reset.
> >>>
> >>> Hm this sounds interesting ... how do you make sure a concurrent
> >>> atomic update doesn't trample over the same mmio registers while you
> >>> do that dance?
> >>
> >> We take the dm->dc_lock across the reset.
> >
> > Ok if that's an innermost lock and you don't do any dma_fence_wait()
> > while holding that (or anything that somehow depends upon that again)
> > I think that should work. From a quick look at current code in
> > drm-next that seems to be the case. But would be good to check with my
> > annotations whether everything is placed correctly (or maybe there's a
> > bug in my annotations).
> >
> > I still think something like I described in the drm/scheduler patch,
> > which would allow us to take drm_modeset_locks in the tdr path, would be a
> > cleaner and more robust solution longer term. Forcing drivers to do
> > their own modeset state handling just doesn't feel that awesome ... I
> > guess that also depends upon how many other drivers have this problem.
> >
>
> I worked a bit on analyzing the original problem we faced with using our
> regular dm_suspend/dm_resume for GPU reset, and I'm not sure how taking
> the drm_modeset_locks really buys us anything.
>
> We ran into a deadlock caused by an async pageflip commit coming in along
> with the TDR suspend commit; the async commit wanted the dma_fence
> associated with the plane. The TDR suspend commit then started waiting
> for the pageflip commit to finish - which never happens since the fence
> hasn't been signaled yet, and DRM spins forever.
>
> If the dma_fence is "force" signaled in suspend then the async commit
> continues and the suspend commit follows after as normal. The async
> commit no longer holds any of the DRM locks because atomic_commit has
> already finished - we had no trouble grabbing the locks for the tdr commit.
>
> Force signaling the fence wasn't really an option for solving this issue
> though, and DRM doesn't support aborting atomic commits nor would most
> drivers be well equipped to handle this I think - it's no longer really
> atomic at that point.
>
> Ideally we don't actually have that pageflip commit programmed during
> the TDR since we'll have to reapply that same state again after, so what
> we're stuck with is the solution that we're planning on putting into the
> driver - hold the internal driver locks across the suspend/reset so
> there's no way for the pageflip commit to start processing after having
> its fence signaled.
>
> We're essentially just faking that the state is the exact same as it was
> before the GPU reset process started.
>
> I'm open to suggestions or improvements for handling this but the
> easiest solution seemed to just bypass the entire DRM stack under the hood.
Yeah, that gets the job done; plus, aside from the locks, it should also
avoid the memory allocations (which I also think are a no-go, but much
harder to hit). The downside is that it's all custom code, plus you need
that internal lock (not really needed in atomic drivers, not sure why you
had that to begin with), and if we have a pile of other drivers with the
same problem, it might be worth fixing this in the helpers themselves.
One idea I sketched is here:
https://patchwork.freedesktop.org/patch/365335/?series=77179&rev=1
But it's incomplete and probably not the best one.
Cheers, Daniel
>
> Regards,
> Nicholas Kazlauskas
>
> >>>> IIRC, when we reset the rest of the GPU, we disconnect the fences, and
> >>>> then re-attach them after the reset.
> >>>
> >>> Where is that code? Since I'm not sure how you can make that work
> >>> without getting stuck in another kind of deadlock in tdr. But maybe
> >>> the code has some clever trick to pull that off somehow.
> >>
> >> The GPU scheduler. drm_sched_stop, drm_sched_resubmit_jobs, and
> >> drm_sched_start.
> >
> > That seems to just be about the scheduler-internal fences. That
> > tracking you kinda have to throw away and restart with a reset. But I
> > don't think these are the fences visible to other places, like in
> > dma_resv or drm_syncobj or wherever.
> > -Daniel
> >
> >> Alex
> >>
> >>
> >>> -Daniel
> >>>
> >>>>
> >>>> Alex
> >>>>
> >>>>> -Daniel
> >>>>> --
> >>>>> Daniel Vetter
> >>>>> Software Engineer, Intel Corporation
> >>>>> +41 (0) 79 365 57 48 - http://blog.ffwll.ch/
> >>>
> >>>
> >>>
> >>> --
> >>> Daniel Vetter
> >>> Software Engineer, Intel Corporation
> >>> +41 (0) 79 365 57 48 - http://blog.ffwll.ch/
> >
> >
> >
>
--
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch
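As a rough sketch of the driver-internal approach described in this thread
(every name here is hypothetical except dm->dc_lock, which is the lock
mentioned above): hold the driver's own lock across the whole reset so no
atomic commit can program display hardware in the middle, then reapply the
cached state directly, bypassing the atomic interfaces.

static void my_reset_with_display(struct my_display_manager *dm)
{
	mutex_lock(&dm->dc_lock);
	my_cache_display_hw_state(dm);		/* snapshot current hw state */
	my_do_full_asic_reset(dm);		/* display block included */
	my_restore_display_hw_state(dm);	/* reapply snapshot directly */
	mutex_unlock(&dm->dc_lock);
}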
Hi Michael,
On 12.05.2020 19:52, Ruhl, Michael J wrote:
>> -----Original Message-----
>> From: dri-devel <dri-devel-bounces(a)lists.freedesktop.org> On Behalf Of
>> Marek Szyprowski
>> Sent: Tuesday, May 12, 2020 5:01 AM
>> To: dri-devel(a)lists.freedesktop.org; iommu(a)lists.linux-foundation.org;
>> linaro-mm-sig(a)lists.linaro.org; linux-kernel(a)vger.kernel.org
>> Cc: Pawel Osciak <pawel(a)osciak.com>; Bartlomiej Zolnierkiewicz
>> <b.zolnierkie(a)samsung.com>; David Airlie <airlied(a)linux.ie>; linux-
>> media(a)vger.kernel.org; Hans Verkuil <hverkuil-cisco(a)xs4all.nl>; Mauro
>> Carvalho Chehab <mchehab(a)kernel.org>; Robin Murphy
>> <robin.murphy(a)arm.com>; Christoph Hellwig <hch(a)lst.de>; linux-arm-
>> kernel(a)lists.infradead.org; Marek Szyprowski
>> <m.szyprowski(a)samsung.com>
>> Subject: [PATCH v4 38/38] videobuf2: use sgtable-based scatterlist wrappers
>>
>> Use recently introduced common wrappers operating directly on the struct
>> sg_table objects and scatterlist page iterators to make the code a bit
>> more compact, robust, easier to follow and copy/paste safe.
>>
>> No functional change, because the code already properly did all the
>> scatterlist related calls.
>>
>> Signed-off-by: Marek Szyprowski <m.szyprowski(a)samsung.com>
>> ---
>> For more information, see '[PATCH v4 00/38] DRM: fix struct sg_table nents
>> vs. orig_nents misuse' thread:
>> https://lore.kernel.org/dri-devel/20200512085710.14688-1-m.szyprowski(a)samsung.com/T/
>> ---
>> .../media/common/videobuf2/videobuf2-dma-contig.c  | 41 ++++++++++------------
>> drivers/media/common/videobuf2/videobuf2-dma-sg.c  | 32 +++++++----------
>> drivers/media/common/videobuf2/videobuf2-vmalloc.c | 12 +++----
>> 3 files changed, 34 insertions(+), 51 deletions(-)
>>
>> diff --git a/drivers/media/common/videobuf2/videobuf2-dma-contig.c b/drivers/media/common/videobuf2/videobuf2-dma-contig.c
>> index d3a3ee5..bf31a9d 100644
>> --- a/drivers/media/common/videobuf2/videobuf2-dma-contig.c
>> +++ b/drivers/media/common/videobuf2/videobuf2-dma-contig.c
>> @@ -48,16 +48,15 @@ struct vb2_dc_buf {
>>
>> static unsigned long vb2_dc_get_contiguous_size(struct sg_table *sgt)
>> {
>> - struct scatterlist *s;
>> dma_addr_t expected = sg_dma_address(sgt->sgl);
>> - unsigned int i;
>> + struct sg_dma_page_iter dma_iter;
>> unsigned long size = 0;
>>
>> - for_each_sg(sgt->sgl, s, sgt->nents, i) {
>> - if (sg_dma_address(s) != expected)
>> + for_each_sgtable_dma_page(sgt, &dma_iter, 0) {
>> + if (sg_page_iter_dma_address(&dma_iter) != expected)
>> break;
>> - expected = sg_dma_address(s) + sg_dma_len(s);
>> - size += sg_dma_len(s);
>> + expected += PAGE_SIZE;
>> + size += PAGE_SIZE;
> This code is in drm_prime_get_contiguous_size() and here. I seem to
> remember seeing the same pattern in other drivers.
>
> Would it be worthwhile to make this a helper as well?
I think I've identified such patterns in all DRM drivers and replaced them
with a common helper. So far I have no idea where to put such a helper to
make it available for media/videobuf2, so those few lines are indeed
duplicated here.
> Also, isn't sg_dma_len() the actual length of the chunk we are looking at?
>
> If it's not PAGE_SIZE (i.e. the dma chunk is 4 * PAGE_SIZE?), does your
> loop/calculation still work?
Scatterlist page iterators (for_each_sg_page/for_each_sg_dma_page and
their sgtable variants) always operate on PAGE_SIZE units. They correctly
handle larger sg_dma_len().
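To spell that out as a small worked example (illustration only, not from the
patch), for a single sg entry mapped to one contiguous DMA chunk of
4 * PAGE_SIZE:

/*
 * old loop:  one iteration,   size += sg_dma_len(s)        -> 4 * PAGE_SIZE
 * new loop:  four iterations, size += PAGE_SIZE each time  -> 4 * PAGE_SIZE
 *
 * and sg_page_iter_dma_address() returns base, base + PAGE_SIZE,
 * base + 2 * PAGE_SIZE, base + 3 * PAGE_SIZE, so the contiguity check in
 * vb2_dc_get_contiguous_size() matches as well.
 */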
Best regards
--
Marek Szyprowski, PhD
Samsung R&D Institute Poland
On Tue, May 12, 2020 at 09:09:52AM -0300, Jason Gunthorpe wrote:
> On Tue, May 12, 2020 at 10:59:29AM +0200, Daniel Vetter wrote:
> > diff --git a/drivers/dma-buf/dma-fence.c b/drivers/dma-buf/dma-fence.c
> > index 6802125349fb..d5c0fd2efc70 100644
> > +++ b/drivers/dma-buf/dma-fence.c
> > @@ -110,6 +110,52 @@ u64 dma_fence_context_alloc(unsigned num)
> > }
> > EXPORT_SYMBOL(dma_fence_context_alloc);
> >
> > +#ifdef CONFIG_LOCKDEP
> > +struct lockdep_map dma_fence_lockdep_map = {
> > + .name = "dma_fence_map"
> > +};
> > +
> > +bool dma_fence_begin_signalling(void)
> > +{
>
> Why is this global? I would have expected it to be connected to a
> single fence?
The rules are the same for all fences, since they can be shared across
drivers in various ways. Lockdep usually achieves that with a static
variable hidden in the macro, but that doesn't work if you have lots of
different ways from different drivers to create a dma_fence. Hence the
single global one that we explicitly allocate.
We have similar stuff for the global dma_resv_lock ww_mutex class, it's
just a bit more standardized there and hidden behind a neat macro. But
really, lockdep needs global lockdep_maps or it doesn't work.
> It would also be alot nicer if this was some general lockdep feature,
> not tied to dmabuf. This exact problem also strikes anyone using
> completions, for instance, and the same solution should be
> applicable??
There was:
https://lwn.net/Articles/709849/
It even got merged, and seems to have worked. Unfortunately (and I'm not
entirely clear on the reasons) it was thrown out again, so we can't use
it. That means wait_event/wake_up dependencies need to be manually
annotated, like e.g. flush_work() already is. flush_work is more or less
where I've stolen this idea from, with some adjustments and tricks on top
to make it work for dma_fence users.
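For readers not following the whole series, roughly how the annotation is
meant to be used (sketch based on the patch quoted above; the my_* names are
placeholders and dma_fence_end_signalling() is the counterpart assumed from
the same series):

static void my_complete_job(struct my_job *job)
{
	bool cookie = dma_fence_begin_signalling();

	/*
	 * Everything from here until the fence is signalled must not block
	 * on anything that itself waits for a dma_fence: no GFP_KERNEL
	 * allocations, no locks that are also held around dma_fence_wait().
	 */
	my_write_back_results(job);
	dma_fence_signal(job->done_fence);

	dma_fence_end_signalling(cookie);
}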
Cheers, Daniel
--
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
Dear All,
During the Exynos DRM GEM rework and fixing the issues in the
drm_prime_sg_to_page_addr_arrays() function [1], I've noticed that most
drivers in the DRM framework incorrectly use the nents and orig_nents
entries of struct sg_table.
In the case of most DMA-mapping implementations, exchanging those two
entries or using nents for all loops over the scatterlist is harmless,
because they both have the same value. There exist, however, DMA-mapping
implementations for which such incorrect usage breaks things. The nents
returned by dma_map_sg() might be lower than the nents passed as its
parameter, and this is perfectly fine. The DMA framework or IOMMU is
allowed to join consecutive chunks while mapping if such an operation is
supported by the underlying HW (bus, bridge, IOMMU, etc). An example of a
case where dma_map_sg() might return 1 'DMA' chunk for 4 'physical' pages
is described here [2].
The DMA-mapping framework documentation [3] states that dma_map_sg()
returns the number of created entries in the DMA address space. However,
the subsequent calls to dma_sync_sg_for_{device,cpu} and dma_unmap_sg must
be made with the original number of entries passed to dma_map_sg(). The
common pattern in DRM drivers was to assign the dma_map_sg() return value
to sg_table->nents and use that value for the subsequent calls to the
dma_sync_sg_* or dma_unmap_sg functions. The code also iterated nents
times to access the pages stored in the processed scatterlist, while it
should use orig_nents as the number of page entries.
I've tried to identify all such incorrect usage of sg_table->nents, and
this is the result of my research. It looks like the incorrect pattern has
been copied over many drivers, mainly in the DRM subsystem. Too bad that
in most cases it even worked correctly, as long as the system used a
simple, linear DMA-mapping implementation, for which swapping nents and
orig_nents doesn't make any difference. To avoid similar issues in the
future, I've introduced common dma-mapping wrappers, which operate
directly on sg_table objects.
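Concretely, the incorrect pattern described above looks roughly like this
(sketch only; my_touch_page() is a placeholder for whatever the driver does
with each page):

struct scatterlist *sg;
unsigned int i;

/* Incorrect (but common): */
sgt->nents = dma_map_sg(dev, sgt->sgl, sgt->orig_nents, DMA_TO_DEVICE);
/* ... */
dma_unmap_sg(dev, sgt->sgl, sgt->nents, DMA_TO_DEVICE);	/* wrong */
for_each_sg(sgt->sgl, sg, sgt->nents, i)		/* wrong for CPU pages */
	my_touch_page(sg_page(sg));

/* Correct: orig_nents for unmap/sync and for walking CPU pages; only a walk
 * over the DMA addresses may use the (possibly smaller) value returned by
 * dma_map_sg(). */
dma_unmap_sg(dev, sgt->sgl, sgt->orig_nents, DMA_TO_DEVICE);
for_each_sg(sgt->sgl, sg, sgt->orig_nents, i)
	my_touch_page(sg_page(sg));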
The biggest TODO is the DRM/i915 driver and I don't feel brave enough to
fix it fully. The driver creatively uses sg_table->orig_nents to store the
size of the allocated scatterlist and ignores the number of entries
returned by the dma_map_sg function. In this patchset I only fixed the
sg_table objects exported by the dmabuf related functions. I hope that I
didn't break anything there.
Patches are based on top of Linux next-20200504.
Best regards,
Marek Szyprowski
References:
[1] https://lkml.org/lkml/2020/3/27/555
[2] https://lkml.org/lkml/2020/3/29/65
[3] Documentation/DMA-API-HOWTO.txt
Changelog:
v3:
- introduce dma_*_sgtable_* wrappers and use them in all patches
v2: https://lore.kernel.org/linux-iommu/c01c9766-9778-fd1f-f36e-2dc7bd376ba4@ar…
- dropped most of the changes to drm/i915
- added fixes for rcar-du, xen, media and ion
- fixed a few issues pointed by kbuild test robot
- added wide cc: list for each patch
v1: https://lore.kernel.org/linux-iommu/c01c9766-9778-fd1f-f36e-2dc7bd376ba4@ar…
- initial version
Patch summary:
Marek Szyprowski (25):
dma-mapping: add generic helpers for mapping sgtable objects
drm: core: fix common struct sg_table related issues
drm: amdgpu: fix common struct sg_table related issues
drm: armada: fix common struct sg_table related issues
drm: etnaviv: fix common struct sg_table related issues
drm: exynos: fix common struct sg_table related issues
drm: i915: fix common struct sg_table related issues
drm: lima: fix common struct sg_table related issues
drm: msm: fix common struct sg_table related issues
drm: panfrost: fix common struct sg_table related issues
drm: radeon: fix common struct sg_table related issues
drm: rockchip: fix common struct sg_table related issues
drm: tegra: fix common struct sg_table related issues
drm: virtio: fix common struct sg_table related issues
drm: vmwgfx: fix common struct sg_table related issues
xen: gntdev: fix common struct sg_table related issues
drm: host1x: fix common struct sg_table related issues
drm: rcar-du: fix common struct sg_table related issues
dmabuf: fix common struct sg_table related issues
staging: ion: fix common struct sg_table related issues
staging: tegra-vde: fix common struct sg_table related issues
misc: fastrpc: fix common struct sg_table related issues
rapidio: fix common struct sg_table related issues
samples: vfio-mdev/mbochs: fix common struct sg_table related issues
media: pci: fix common ALSA DMA-mapping related codes
drivers/dma-buf/heaps/heap-helpers.c | 13 +++++-----
drivers/dma-buf/udmabuf.c | 7 +++---
drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c | 6 ++---
drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 9 +++----
drivers/gpu/drm/armada/armada_gem.c | 10 ++++----
drivers/gpu/drm/drm_cache.c | 2 +-
drivers/gpu/drm/drm_gem_shmem_helper.c | 14 +++++++----
drivers/gpu/drm/drm_prime.c | 13 +++++-----
drivers/gpu/drm/etnaviv/etnaviv_gem.c | 12 ++++-----
drivers/gpu/drm/exynos/exynos_drm_g2d.c | 9 +++----
drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c | 13 ++++------
drivers/gpu/drm/i915/gem/selftests/mock_dmabuf.c | 7 +++---
drivers/gpu/drm/lima/lima_gem.c | 11 +++++---
drivers/gpu/drm/msm/msm_gem.c | 13 ++++------
drivers/gpu/drm/msm/msm_iommu.c | 2 +-
drivers/gpu/drm/panfrost/panfrost_gem.c | 4 +--
drivers/gpu/drm/panfrost/panfrost_mmu.c | 5 ++--
drivers/gpu/drm/radeon/radeon_ttm.c | 11 ++++----
drivers/gpu/drm/rockchip/rockchip_drm_gem.c | 26 +++++++++----------
drivers/gpu/drm/tegra/gem.c | 27 ++++++++------------
drivers/gpu/drm/tegra/plane.c | 15 ++++-------
drivers/gpu/drm/virtio/virtgpu_object.c | 17 ++++++-------
drivers/gpu/drm/virtio/virtgpu_vq.c | 10 +++-----
drivers/gpu/drm/vmwgfx/vmwgfx_ttm_buffer.c | 17 +++----------
drivers/gpu/host1x/job.c | 22 ++++++----------
drivers/media/pci/cx23885/cx23885-alsa.c | 2 +-
drivers/media/pci/cx25821/cx25821-alsa.c | 2 +-
drivers/media/pci/cx88/cx88-alsa.c | 2 +-
drivers/media/pci/saa7134/saa7134-alsa.c | 2 +-
drivers/media/platform/vsp1/vsp1_drm.c | 9 ++++---
drivers/misc/fastrpc.c | 4 +--
drivers/rapidio/devices/rio_mport_cdev.c | 8 +++---
drivers/staging/android/ion/ion.c | 25 ++++++++----------
drivers/staging/android/ion/ion_heap.c | 6 ++---
drivers/staging/android/ion/ion_system_heap.c | 2 +-
drivers/staging/media/tegra-vde/iommu.c | 4 +--
drivers/xen/gntdev-dmabuf.c | 7 +++---
include/linux/dma-mapping.h | 32 ++++++++++++++++++++++++
include/linux/iommu.h | 6 +++++
samples/vfio-mdev/mbochs.c | 3 ++-
40 files changed, 202 insertions(+), 207 deletions(-)
--
1.9.1