Hi Michael,
On 12.05.2020 19:52, Ruhl, Michael J wrote:
>> -----Original Message-----
>> From: dri-devel <dri-devel-bounces(a)lists.freedesktop.org> On Behalf Of
>> Marek Szyprowski
>> Sent: Tuesday, May 12, 2020 5:01 AM
>> To: dri-devel(a)lists.freedesktop.org; iommu(a)lists.linux-foundation.org;
>> linaro-mm-sig(a)lists.linaro.org; linux-kernel(a)vger.kernel.org
>> Cc: Pawel Osciak <pawel(a)osciak.com>; Bartlomiej Zolnierkiewicz
>> <b.zolnierkie(a)samsung.com>; David Airlie <airlied(a)linux.ie>; linux-
>> media(a)vger.kernel.org; Hans Verkuil <hverkuil-cisco(a)xs4all.nl>; Mauro
>> Carvalho Chehab <mchehab(a)kernel.org>; Robin Murphy
>> <robin.murphy(a)arm.com>; Christoph Hellwig <hch(a)lst.de>; linux-arm-
>> kernel(a)lists.infradead.org; Marek Szyprowski
>> <m.szyprowski(a)samsung.com>
>> Subject: [PATCH v4 38/38] videobuf2: use sgtable-based scatterlist wrappers
>>
>> Use recently introduced common wrappers operating directly on the struct
>> sg_table objects and scatterlist page iterators to make the code a bit
>> more compact, robust, easier to follow and copy/paste safe.
>>
>> No functional change, because the code already properly did all the
>> scaterlist related calls.
>>
>> Signed-off-by: Marek Szyprowski <m.szyprowski(a)samsung.com>
>> ---
>> For more information, see '[PATCH v4 00/38] DRM: fix struct sg_table nents
>> vs. orig_nents misuse' thread:
>> https://lore.kernel.org/dri-devel/20200512085710.14688-1-
>> m.szyprowski(a)samsung.com/T/
>> ---
>> .../media/common/videobuf2/videobuf2-dma-contig.c | 41 ++++++++++----
>> --------
>> drivers/media/common/videobuf2/videobuf2-dma-sg.c | 32 +++++++--------
>> --
>> drivers/media/common/videobuf2/videobuf2-vmalloc.c | 12 +++----
>> 3 files changed, 34 insertions(+), 51 deletions(-)
>>
>> diff --git a/drivers/media/common/videobuf2/videobuf2-dma-contig.c
>> b/drivers/media/common/videobuf2/videobuf2-dma-contig.c
>> index d3a3ee5..bf31a9d 100644
>> --- a/drivers/media/common/videobuf2/videobuf2-dma-contig.c
>> +++ b/drivers/media/common/videobuf2/videobuf2-dma-contig.c
>> @@ -48,16 +48,15 @@ struct vb2_dc_buf {
>>
>> static unsigned long vb2_dc_get_contiguous_size(struct sg_table *sgt)
>> {
>> - struct scatterlist *s;
>> dma_addr_t expected = sg_dma_address(sgt->sgl);
>> - unsigned int i;
>> + struct sg_dma_page_iter dma_iter;
>> unsigned long size = 0;
>>
>> - for_each_sg(sgt->sgl, s, sgt->nents, i) {
>> - if (sg_dma_address(s) != expected)
>> + for_each_sgtable_dma_page(sgt, &dma_iter, 0) {
>> + if (sg_page_iter_dma_address(&dma_iter) != expected)
>> break;
>> - expected = sg_dma_address(s) + sg_dma_len(s);
>> - size += sg_dma_len(s);
>> + expected += PAGE_SIZE;
>> + size += PAGE_SIZE;
> This code in drm_prime_t_contiguous_size and here. I seem to remember seeing
> the same pattern in other drivers.
>
> Would it worthwhile to make this a helper as well?
I think I've identified such patterns in all DRM drivers and replaced
with a common helper. So far I have no idea where to put such helper to
make it available for media/videobuf2, so those a few lines are indeed
duplicated here.
> Also, isn't the sg_dma_len() the actual length of the chunk we are looking at?
>
> If its I not PAGE_SIZE (ie. dma chunk is 4 * PAGE_SIZE?), does your loop/calculation still work?
scaterlist page iterators (for_each_sg_page/for_each_sg_dma_page and
their sgtable variants) always operates on PAGE_SIZE units. They
correctly handle larger sg_dma_len().
Best regards
--
Marek Szyprowski, PhD
Samsung R&D Institute Poland
On Tue, May 12, 2020 at 09:09:52AM -0300, Jason Gunthorpe wrote:
> On Tue, May 12, 2020 at 10:59:29AM +0200, Daniel Vetter wrote:
> > diff --git a/drivers/dma-buf/dma-fence.c b/drivers/dma-buf/dma-fence.c
> > index 6802125349fb..d5c0fd2efc70 100644
> > +++ b/drivers/dma-buf/dma-fence.c
> > @@ -110,6 +110,52 @@ u64 dma_fence_context_alloc(unsigned num)
> > }
> > EXPORT_SYMBOL(dma_fence_context_alloc);
> >
> > +#ifdef CONFIG_LOCKDEP
> > +struct lockdep_map dma_fence_lockdep_map = {
> > + .name = "dma_fence_map"
> > +};
> > +
> > +bool dma_fence_begin_signalling(void)
> > +{
>
> Why is this global? I would have expected it to be connected to a
> single fence?
It's the same rules for all fences, since they can be shared across
drivers in various ways. Lockdep usually achieves that with a static
variable hidden in the macro, but that doesn't work if you have lots of
different ways from different drivers to create a dma_fence. Hence the
unique global one that we explicitly allocate.
We have similar stuff for the global dma_resv_lock ww_mutex class, just
there it's a bit more standardized and hidden behind a neat macro. But
really lockdep needs global lockdep_maps or it doesn't work.
> It would also be alot nicer if this was some general lockdep feature,
> not tied to dmabuf. This exact problem also strikes anyone using
> completions, for instance, and the same solution should be
> applicable??
There was:
https://lwn.net/Articles/709849/
It even got merged, and seems to have worked. Unfortunately (and I'm not
entirely clear on the reasons) it was thrown out again, so we can't use
it. That means wait_event/wake_up dependencies need to be manually
annotated, like e.g. flush_work() already is. flush_work is more or less
where I've stolen this idea from, with some adjustements and tricks on top
to make it work for dma_fence users.
Cheers, Daniel
--
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
Dear All,
During the Exynos DRM GEM rework and fixing the issues in the
drm_prime_sg_to_page_addr_arrays() function [1] I've noticed that most
drivers in DRM framework incorrectly use nents and orig_nents entries of
the struct sg_table.
In case of the most DMA-mapping implementations exchanging those two
entries or using nents for all loops on the scatterlist is harmless,
because they both have the same value. There exists however a DMA-mapping
implementations, for which such incorrect usage breaks things. The nents
returned by dma_map_sg() might be lower than the nents passed as its
parameter and this is perfectly fine. DMA framework or IOMMU is allowed
to join consecutive chunks while mapping if such operation is supported
by the underlying HW (bus, bridge, IOMMU, etc). Example of the case
where dma_map_sg() might return 1 'DMA' chunk for the 4 'physical' pages
is described here [2]
The DMA-mapping framework documentation [3] states that dma_map_sg()
returns the numer of the created entries in the DMA address space.
However the subsequent calls to dma_sync_sg_for_{device,cpu} and
dma_unmap_sg must be called with the original number of entries passed to
dma_map_sg. The common pattern in DRM drivers were to assign the
dma_map_sg() return value to sg_table->nents and use that value for
the subsequent calls to dma_sync_sg_* or dma_unmap_sg functions. Also
the code iterated over nents times to access the pages stored in the
processed scatterlist, while it should use orig_nents as the numer of
the page entries.
I've tried to identify all such incorrect usage of sg_table->nents and
this is a result of my research. It looks that the incorrect pattern has
been copied over the many drivers mainly in the DRM subsystem. Too bad in
most cases it even worked correctly if the system used simple, linear
DMA-mapping implementation, for which swapping nents and orig_nents
doesn't make any difference. To avoid similar issues in the future, I've
introduced a common dma-mapping wrappers, which operate directly on the
sg_table objects.
The biggest TODO is DRM/i915 driver and I don't feel brave enough to fix
it fully. The driver creatively uses sg_table->orig_nents to store the
size of the allocate scatterlist and ignores the number of the entries
returned by dma_map_sg function. In this patchset I only fixed the
sg_table objects exported by dmabuf related functions. I hope that I
didn't break anything there.
Patches are based on top of Linux next-20200504.
Best regards,
Marek Szyprowski
References:
[1] https://lkml.org/lkml/2020/3/27/555
[2] https://lkml.org/lkml/2020/3/29/65
[3] Documentation/DMA-API-HOWTO.txt
Changelog:
v3:
- introduce dma_*_sgtable_* wrappers and use them in all patches
v2: https://lore.kernel.org/linux-iommu/c01c9766-9778-fd1f-f36e-2dc7bd376ba4@ar…
- dropped most of the changes to drm/i915
- added fixes for rcar-du, xen, media and ion
- fixed a few issues pointed by kbuild test robot
- added wide cc: list for each patch
v1: https://lore.kernel.org/linux-iommu/c01c9766-9778-fd1f-f36e-2dc7bd376ba4@ar…
- initial version
Patch summary:
Marek Szyprowski (25):
dma-mapping: add generic helpers for mapping sgtable objects
drm: core: fix common struct sg_table related issues
drm: amdgpu: fix common struct sg_table related issues
drm: armada: fix common struct sg_table related issues
drm: etnaviv: fix common struct sg_table related issues
drm: exynos: fix common struct sg_table related issues
drm: i915: fix common struct sg_table related issues
drm: lima: fix common struct sg_table related issues
drm: msm: fix common struct sg_table related issues
drm: panfrost: fix common struct sg_table related issues
drm: radeon: fix common struct sg_table related issues
drm: rockchip: fix common struct sg_table related issues
drm: tegra: fix common struct sg_table related issues
drm: virtio: fix common struct sg_table related issues
drm: vmwgfx: fix common struct sg_table related issues
xen: gntdev: fix common struct sg_table related issues
drm: host1x: fix common struct sg_table related issues
drm: rcar-du: fix common struct sg_table related issues
dmabuf: fix common struct sg_table related issues
staging: ion: fix common struct sg_table related issues
staging: tegra-vde: fix common struct sg_table related issues
misc: fastrpc: fix common struct sg_table related issues
rapidio: fix common struct sg_table related issues
samples: vfio-mdev/mbochs: fix common struct sg_table related issues
media: pci: fix common ALSA DMA-mapping related codes
drivers/dma-buf/heaps/heap-helpers.c | 13 +++++-----
drivers/dma-buf/udmabuf.c | 7 +++---
drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c | 6 ++---
drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 9 +++----
drivers/gpu/drm/armada/armada_gem.c | 10 ++++----
drivers/gpu/drm/drm_cache.c | 2 +-
drivers/gpu/drm/drm_gem_shmem_helper.c | 14 +++++++----
drivers/gpu/drm/drm_prime.c | 13 +++++-----
drivers/gpu/drm/etnaviv/etnaviv_gem.c | 12 ++++-----
drivers/gpu/drm/exynos/exynos_drm_g2d.c | 9 +++----
drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c | 13 ++++------
drivers/gpu/drm/i915/gem/selftests/mock_dmabuf.c | 7 +++---
drivers/gpu/drm/lima/lima_gem.c | 11 +++++---
drivers/gpu/drm/msm/msm_gem.c | 13 ++++------
drivers/gpu/drm/msm/msm_iommu.c | 2 +-
drivers/gpu/drm/panfrost/panfrost_gem.c | 4 +--
drivers/gpu/drm/panfrost/panfrost_mmu.c | 5 ++--
drivers/gpu/drm/radeon/radeon_ttm.c | 11 ++++----
drivers/gpu/drm/rockchip/rockchip_drm_gem.c | 26 +++++++++----------
drivers/gpu/drm/tegra/gem.c | 27 ++++++++------------
drivers/gpu/drm/tegra/plane.c | 15 ++++-------
drivers/gpu/drm/virtio/virtgpu_object.c | 17 ++++++-------
drivers/gpu/drm/virtio/virtgpu_vq.c | 10 +++-----
drivers/gpu/drm/vmwgfx/vmwgfx_ttm_buffer.c | 17 +++----------
drivers/gpu/host1x/job.c | 22 ++++++----------
drivers/media/pci/cx23885/cx23885-alsa.c | 2 +-
drivers/media/pci/cx25821/cx25821-alsa.c | 2 +-
drivers/media/pci/cx88/cx88-alsa.c | 2 +-
drivers/media/pci/saa7134/saa7134-alsa.c | 2 +-
drivers/media/platform/vsp1/vsp1_drm.c | 9 ++++---
drivers/misc/fastrpc.c | 4 +--
drivers/rapidio/devices/rio_mport_cdev.c | 8 +++---
drivers/staging/android/ion/ion.c | 25 ++++++++----------
drivers/staging/android/ion/ion_heap.c | 6 ++---
drivers/staging/android/ion/ion_system_heap.c | 2 +-
drivers/staging/media/tegra-vde/iommu.c | 4 +--
drivers/xen/gntdev-dmabuf.c | 7 +++---
include/linux/dma-mapping.h | 32 ++++++++++++++++++++++++
include/linux/iommu.h | 6 +++++
samples/vfio-mdev/mbochs.c | 3 ++-
40 files changed, 202 insertions(+), 207 deletions(-)
--
1.9.1
On Fri, May 08, 2020 at 12:11:03PM +0530, Charan Teja Reddy wrote:
> The following race occurs while accessing the dmabuf object exported as
> file:
> P1 P2
> dma_buf_release() dmabuffs_dname()
> [say lsof reading /proc/<P1 pid>/fd/<num>]
>
> read dmabuf stored in dentry->d_fsdata
> Free the dmabuf object
> Start accessing the dmabuf structure
>
> In the above description, the dmabuf object freed in P1 is being
> accessed from P2 which is resulting into the use-after-free. Below is
> the dump stack reported.
>
> We are reading the dmabuf object stored in the dentry->d_fsdata but
> there is no binding between the dentry and the dmabuf which means that
> the dmabuf can be freed while it is being read from ->d_fsdata and
> inuse. Reviews on the patch V1 says that protecting the dmabuf inuse
> with an extra refcount is not a viable solution as the exported dmabuf
> is already under file's refcount and keeping the multiple refcounts on
> the same object coordinated is not possible.
>
> As we are reading the dmabuf in ->d_fsdata just to get the user passed
> name, we can directly store the name in d_fsdata thus can avoid the
> reading of dmabuf altogether.
>
> Call Trace:
> kasan_report+0x12/0x20
> __asan_report_load8_noabort+0x14/0x20
> dmabuffs_dname+0x4f4/0x560
> tomoyo_realpath_from_path+0x165/0x660
> tomoyo_get_realpath
> tomoyo_check_open_permission+0x2a3/0x3e0
> tomoyo_file_open
> tomoyo_file_open+0xa9/0xd0
> security_file_open+0x71/0x300
> do_dentry_open+0x37a/0x1380
> vfs_open+0xa0/0xd0
> path_openat+0x12ee/0x3490
> do_filp_open+0x192/0x260
> do_sys_openat2+0x5eb/0x7e0
> do_sys_open+0xf2/0x180
>
> Fixes: bb2bb9030425 ("dma-buf: add DMA_BUF_SET_NAME ioctls")
> Reported-by: syzbot+3643a18836bce555bff6(a)syzkaller.appspotmail.com
> Cc: <stable(a)vger.kernel.org> [5.3+]
> Signed-off-by: Charan Teja Reddy <charante(a)codeaurora.org>
> ---
>
> Changes in v2:
>
> - Pass the user passed name in ->d_fsdata instead of dmabuf
> - Improve the commit message
>
> Changes in v1: (https://patchwork.kernel.org/patch/11514063/)
>
> drivers/dma-buf/dma-buf.c | 17 ++++++++++-------
> 1 file changed, 10 insertions(+), 7 deletions(-)
>
> diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c
> index 01ce125..0071f7d 100644
> --- a/drivers/dma-buf/dma-buf.c
> +++ b/drivers/dma-buf/dma-buf.c
> @@ -25,6 +25,7 @@
> #include <linux/mm.h>
> #include <linux/mount.h>
> #include <linux/pseudo_fs.h>
> +#include <linux/dcache.h>
>
> #include <uapi/linux/dma-buf.h>
> #include <uapi/linux/magic.h>
> @@ -40,15 +41,13 @@ struct dma_buf_list {
>
> static char *dmabuffs_dname(struct dentry *dentry, char *buffer, int buflen)
> {
> - struct dma_buf *dmabuf;
> char name[DMA_BUF_NAME_LEN];
> size_t ret = 0;
>
> - dmabuf = dentry->d_fsdata;
> - dma_resv_lock(dmabuf->resv, NULL);
> - if (dmabuf->name)
> - ret = strlcpy(name, dmabuf->name, DMA_BUF_NAME_LEN);
> - dma_resv_unlock(dmabuf->resv);
> + spin_lock(&dentry->d_lock);
Are you sure this lock always protects d_fsdata?
> + if (dentry->d_fsdata)
> + ret = strlcpy(name, dentry->d_fsdata, DMA_BUF_NAME_LEN);
> + spin_unlock(&dentry->d_lock);
>
> return dynamic_dname(dentry, buffer, buflen, "/%s:%s",
> dentry->d_name.name, ret > 0 ? name : "");
If the above check fails the name will be what? How could d_name.name
be valid but d_fsdata not be valid?
> @@ -80,12 +79,16 @@ static int dma_buf_fs_init_context(struct fs_context *fc)
> static int dma_buf_release(struct inode *inode, struct file *file)
> {
> struct dma_buf *dmabuf;
> + struct dentry *dentry = file->f_path.dentry;
>
> if (!is_dma_buf_file(file))
> return -EINVAL;
>
> dmabuf = file->private_data;
>
> + spin_lock(&dentry->d_lock);
> + dentry->d_fsdata = NULL;
> + spin_unlock(&dentry->d_lock);
> BUG_ON(dmabuf->vmapping_counter);
>
> /*
> @@ -343,6 +346,7 @@ static long dma_buf_set_name(struct dma_buf *dmabuf, const char __user *buf)
> }
> kfree(dmabuf->name);
> dmabuf->name = name;
> + dmabuf->file->f_path.dentry->d_fsdata = name;
You are just changing the use of d_fsdata from being a pointer to the
dmabuf to being a pointer to the name string? What's to keep that name
string around and not have the same reference counting issues that the
dmabuf structure itself has? Who frees that string memory?
thanks,
greg k-h
On Tue, May 12, 2020 at 10:43:18AM +0530, Charan Teja Kalla wrote:
> > Ok, but watch out, now you have 2 different reference counts for the
> > same structure. Keeping them coordinated is almost always an impossible
> > task so you need to only rely on one. If you can't use the file api,
> > just drop all of the reference counting logic in there and only use the
> > kref one.
>
> I feel that changing the refcount logic now to dma-buf objects involve
> changes in
>
> the core dma-buf framework. NO? Instead, how about passing the user passed
> name directly
>
> in the ->d_fsdata inplace of dmabuf object? Because we just need user passed
> name in the
>
> dmabuffs_dname(). With this we can avoid the need for extra refcount on
> dmabuf.
Odd formatting :(
> Posted patch-V2: https://lkml.org/lkml/2020/5/8/158
Please just post links to lore.kernel.org, we have no control over
lkml.org at all.
I'll go review that patch now...
greg k-h
On Mon, 11 May 2020 at 19:37, Oded Gabbay <oded.gabbay(a)gmail.com> wrote:
>
> On Mon, May 11, 2020 at 12:11 PM Daniel Vetter <daniel.vetter(a)ffwll.ch> wrote:
> >
> > It's the default.
> Thanks for catching that.
>
> >
> > Also so much for "we're not going to tell the graphics people how to
> > review their code", dma_fence is a pretty core piece of gpu driver
> > infrastructure. And it's very much uapi relevant, including piles of
> > corresponding userspace protocols and libraries for how to pass these
> > around.
> >
> > Would be great if habanalabs would not use this (from a quick look
> > it's not needed at all), since open source the userspace and playing
> > by the usual rules isn't on the table. If that's not possible (because
> > it's actually using the uapi part of dma_fence to interact with gpu
> > drivers) then we have exactly what everyone promised we'd want to
> > avoid.
>
> We don't use the uapi parts, we currently only using the fencing and
> signaling ability of this module inside our kernel code. But maybe I
> didn't understand what you request. You want us *not* to use this
> well-written piece of kernel code because it is only used by graphics
> drivers ?
> I'm sorry but I don't get this argument, if this is indeed what you meant.
We would rather drivers using a feature that has requirements on
correct userspace implementations of the feature have a userspace that
is open source and auditable.
Fencing is tricky, cross-device fencing is really tricky, and having
the ability for a closed userspace component to mess up other people's
drivers, think i915 shared with closed habana userspace and shared
fences, decreases ability to debug things.
Ideally we wouldn't offer users known untested/broken scenarios, so
yes we'd prefer that drivers that intend to expose a userspace fencing
api around dma-fence would adhere to the rules of the gpu drivers.
I'm not say you have to drop using dma-fence, but if you move towards
cross-device stuff I believe other drivers would be correct in
refusing to interact with fences from here.
Dave.
It's the default.
Also so much for "we're not going to tell the graphics people how to
review their code", dma_fence is a pretty core piece of gpu driver
infrastructure. And it's very much uapi relevant, including piles of
corresponding userspace protocols and libraries for how to pass these
around.
Would be great if habanalabs would not use this (from a quick look
it's not needed at all), since open source the userspace and playing
by the usual rules isn't on the table. If that's not possible (because
it's actually using the uapi part of dma_fence to interact with gpu
drivers) then we have exactly what everyone promised we'd want to
avoid.
Signed-off-by: Daniel Vetter <daniel.vetter(a)intel.com>
Cc: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Cc: Olof Johansson <olof(a)lixom.net>
Cc: Oded Gabbay <oded.gabbay(a)gmail.com>
Cc: Sumit Semwal <sumit.semwal(a)linaro.org>
Cc: linux-media(a)vger.kernel.org
Cc: linaro-mm-sig(a)lists.linaro.org
---
drivers/misc/habanalabs/command_submission.c | 1 -
1 file changed, 1 deletion(-)
diff --git a/drivers/misc/habanalabs/command_submission.c b/drivers/misc/habanalabs/command_submission.c
index 409276b6374d..cc3ce759b6c3 100644
--- a/drivers/misc/habanalabs/command_submission.c
+++ b/drivers/misc/habanalabs/command_submission.c
@@ -46,7 +46,6 @@ static const struct dma_fence_ops hl_fence_ops = {
.get_driver_name = hl_fence_get_driver_name,
.get_timeline_name = hl_fence_get_timeline_name,
.enable_signaling = hl_fence_enable_signaling,
- .wait = dma_fence_default_wait,
.release = hl_fence_release
};
--
2.26.2