Hi Andrew,
I'm the new DMA-buf maintainer and Daniel and others came up with patches extending the use of the dma_buf_mmap() function.
Now this function is doing something a bit odd by changing the vma->vm_file while installing a VMA in the mmap() system call.
The background here is that DMA-buf allows device drivers to export buffers which are then imported into another device driver. The mmap() handler of the importing device driver then finds that the pgoff belongs to the exporting device and so redirects the mmap() call there.
In other words user space calls mmap() on one file descriptor, but gets a different one mapped into its virtual address space.
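(That forwarding looks roughly like this in an importing GEM driver - a minimal sketch modeled on the shmem helper discussed below; importer_map_locally() is a hypothetical stand-in for the driver's own mapping path:)

static int importer_gem_mmap(struct drm_gem_object *obj, struct vm_area_struct *vma)
{
	/* Imported buffer: hand the whole mmap() over to the exporter,
	 * which installs its own vm_ops and, inside dma_buf_mmap(), its
	 * own vma->vm_file. */
	if (obj->import_attach)
		return dma_buf_mmap(obj->dma_buf, vma, 0);

	/* Locally allocated buffer: map our own backing store. */
	return importer_map_locally(obj, vma);	/* hypothetical */
}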
My question is now: Is that legal or can you think of something which breaks here?
If it's not legal we should probably block any new users of the dma_buf_mmap() function and consider what should happen with the two existing ones.
If that is legal I would like to document this by adding a new vma_set_file() function which does the necessary reference count dance.
Thanks in advance, Christian.
This reverts commit 26d3ac3cb04d171a861952e89324e347598a347f.
We need to figure out if dma_buf_mmap() is valid or not first.
Signed-off-by: Christian König <christian.koenig@amd.com>
---
 drivers/gpu/drm/drm_gem_shmem_helper.c | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/drivers/gpu/drm/drm_gem_shmem_helper.c b/drivers/gpu/drm/drm_gem_shmem_helper.c
index 0a952f27c184..cd727343f72b 100644
--- a/drivers/gpu/drm/drm_gem_shmem_helper.c
+++ b/drivers/gpu/drm/drm_gem_shmem_helper.c
@@ -594,9 +594,6 @@ int drm_gem_shmem_mmap(struct drm_gem_object *obj, struct vm_area_struct *vma)
 	/* Remove the fake offset */
 	vma->vm_pgoff -= drm_vma_node_start(&obj->vma_node);
 
-	if (obj->import_attach)
-		return dma_buf_mmap(obj->dma_buf, vma, 0);
-
 	shmem = to_drm_gem_shmem_obj(obj);
 
 	ret = drm_gem_shmem_get_pages(shmem);
On Mon, Sep 14, 2020 at 3:29 PM Christian König ckoenig.leichtzumerken@gmail.com wrote:
This reverts commit 26d3ac3cb04d171a861952e89324e347598a347f.
We need to figure out if dma_buf_mmap() is valid or not first.
Signed-off-by: Christian König christian.koenig@amd.com
The trouble is that doing dma-buf mmap by looking at the struct pages behind the sg list and just inserting those into userspace doesn't really work any better. You still won't get the unmap_mapping_range and hence pte shoot-down. So maybe dma_buf_mmap forwarding doesn't work, but this doesn't make it any better.
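(For reference, "looking at the struct pages behind the sg list and just inserting those into userspace" would amount to something like the sketch below; the resulting PTEs live under the importer's file and offset, so the exporter's unmap_mapping_range() never finds them:)

static int mmap_sgt_pages(struct sg_table *sgt, struct vm_area_struct *vma)
{
	unsigned long addr = vma->vm_start;
	struct sg_page_iter piter;
	int ret;

	/* Map each backing page directly into the importer's vma. */
	for_each_sgtable_page(sgt, &piter, vma->vm_pgoff) {
		ret = vm_insert_page(vma, addr, sg_page_iter_page(&piter));
		if (ret)
			return ret;
		addr += PAGE_SIZE;
		if (addr == vma->vm_end)
			break;
	}
	return 0;
}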
Also, the commit message should probably explain the context a bit; not a lot of people have been part of our private discussion on this. -Daniel
 drivers/gpu/drm/drm_gem_shmem_helper.c | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/drivers/gpu/drm/drm_gem_shmem_helper.c b/drivers/gpu/drm/drm_gem_shmem_helper.c
index 0a952f27c184..cd727343f72b 100644
--- a/drivers/gpu/drm/drm_gem_shmem_helper.c
+++ b/drivers/gpu/drm/drm_gem_shmem_helper.c
@@ -594,9 +594,6 @@ int drm_gem_shmem_mmap(struct drm_gem_object *obj, struct vm_area_struct *vma)
 	/* Remove the fake offset */
 	vma->vm_pgoff -= drm_vma_node_start(&obj->vma_node);
 
-	if (obj->import_attach)
-		return dma_buf_mmap(obj->dma_buf, vma, 0);
-
 	shmem = to_drm_gem_shmem_obj(obj);
 	ret = drm_gem_shmem_get_pages(shmem);
-- 2.17.1
On 15.09.20 at 12:39, Daniel Vetter wrote:
On Mon, Sep 14, 2020 at 3:29 PM Christian König ckoenig.leichtzumerken@gmail.com wrote:
This reverts commit 26d3ac3cb04d171a861952e89324e347598a347f.
We need to figure out if dma_buf_mmap() is valid or not first.
Signed-off-by: Christian König christian.koenig@amd.com
The trouble is that doing dma-buf mmap by looking at the struct pages behind the sg list and just inserting those into userspace doesn't really work any better. You still won't get the unmap_mapping_range and hence pte shoot-down. So maybe dma_buf_mmap forwarding doesn't work, but this doesn't make it any better.
Good point. Question is what should we do? Return -EPERM?
Also, the commit message should probably explain the context a bit; not a lot of people have been part of our private discussion on this.
Well, that's certainly true.
Christian.
-Daniel
 drivers/gpu/drm/drm_gem_shmem_helper.c | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/drivers/gpu/drm/drm_gem_shmem_helper.c b/drivers/gpu/drm/drm_gem_shmem_helper.c
index 0a952f27c184..cd727343f72b 100644
--- a/drivers/gpu/drm/drm_gem_shmem_helper.c
+++ b/drivers/gpu/drm/drm_gem_shmem_helper.c
@@ -594,9 +594,6 @@ int drm_gem_shmem_mmap(struct drm_gem_object *obj, struct vm_area_struct *vma)
 	/* Remove the fake offset */
 	vma->vm_pgoff -= drm_vma_node_start(&obj->vma_node);
 
-	if (obj->import_attach)
-		return dma_buf_mmap(obj->dma_buf, vma, 0);
-
 	shmem = to_drm_gem_shmem_obj(obj);
 	ret = drm_gem_shmem_get_pages(shmem);
-- 2.17.1
On Tue, Sep 15, 2020 at 1:03 PM Christian König ckoenig.leichtzumerken@gmail.com wrote:
On 15.09.20 at 12:39, Daniel Vetter wrote:
On Mon, Sep 14, 2020 at 3:29 PM Christian König ckoenig.leichtzumerken@gmail.com wrote:
This reverts commit 26d3ac3cb04d171a861952e89324e347598a347f.
We need to figure out if dma_buf_mmap() is valid or not first.
Signed-off-by: Christian König christian.koenig@amd.com
The trouble is that doing dma-buf mmap by looking at the struct pages behind the sg list and just inserting those into userspace doesn't really work any better. You still won't get the unmap_mapping_range and hence pte shoot-down. So maybe dma_buf_mmap forwarding doesn't work, but this doesn't make it any better.
Good point. Question is what should we do? Return -EPERM?
IIRC there's userspace which expects this to work, but I think it's also only trying to do this with simpler drivers that don't need unmap_mapping_range to work correctly. Or it's simply that no one ever reported the bugs. It's a bit of a mess :-/ -Daniel
Also, the commit message should probably explain the context a bit; not a lot of people have been part of our private discussion on this.
Well, that's certainly true.
Christian.
-Daniel
 drivers/gpu/drm/drm_gem_shmem_helper.c | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/drivers/gpu/drm/drm_gem_shmem_helper.c b/drivers/gpu/drm/drm_gem_shmem_helper.c
index 0a952f27c184..cd727343f72b 100644
--- a/drivers/gpu/drm/drm_gem_shmem_helper.c
+++ b/drivers/gpu/drm/drm_gem_shmem_helper.c
@@ -594,9 +594,6 @@ int drm_gem_shmem_mmap(struct drm_gem_object *obj, struct vm_area_struct *vma)
 	/* Remove the fake offset */
 	vma->vm_pgoff -= drm_vma_node_start(&obj->vma_node);
 
-	if (obj->import_attach)
-		return dma_buf_mmap(obj->dma_buf, vma, 0);
-
 	shmem = to_drm_gem_shmem_obj(obj);
 	ret = drm_gem_shmem_get_pages(shmem);
-- 2.17.1
Add the new vma_set_file() function to allow changing vma->vm_file with the necessary refcount dance.
Signed-off-by: Christian König <christian.koenig@amd.com>
---
 drivers/dma-buf/dma-buf.c | 16 +++++-----------
 include/linux/mm.h        |  2 ++
 mm/mmap.c                 | 16 ++++++++++++++++
 3 files changed, 23 insertions(+), 11 deletions(-)

diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c
index 1699a8e309ef..672f3525ba74 100644
--- a/drivers/dma-buf/dma-buf.c
+++ b/drivers/dma-buf/dma-buf.c
@@ -1163,20 +1163,14 @@ int dma_buf_mmap(struct dma_buf *dmabuf, struct vm_area_struct *vma,
 		return -EINVAL;
 
 	/* readjust the vma */
-	get_file(dmabuf->file);
-	oldfile = vma->vm_file;
-	vma->vm_file = dmabuf->file;
+	oldfile = vma_set_file(vma, dmabuf->file);
 	vma->vm_pgoff = pgoff;
 
 	ret = dmabuf->ops->mmap(dmabuf, vma);
-	if (ret) {
-		/* restore old parameters on failure */
-		vma->vm_file = oldfile;
-		fput(dmabuf->file);
-	} else {
-		if (oldfile)
-			fput(oldfile);
-	}
+	/* restore old parameters on failure */
+	if (ret)
+		vma_set_file(vma, oldfile);
+
 	return ret;
 }
diff --git a/include/linux/mm.h b/include/linux/mm.h
index 1983e08f5906..398a6fdaad1e 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -2688,6 +2688,8 @@ static inline void vma_set_page_prot(struct vm_area_struct *vma)
 }
 #endif
 
+struct file *vma_set_file(struct vm_area_struct *vma, struct file *file);
+
 #ifdef CONFIG_NUMA_BALANCING
 unsigned long change_prot_numa(struct vm_area_struct *vma,
 			unsigned long start, unsigned long end);
diff --git a/mm/mmap.c b/mm/mmap.c
index 40248d84ad5f..d3c3c510f643 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -136,6 +136,22 @@ void vma_set_page_prot(struct vm_area_struct *vma)
 	WRITE_ONCE(vma->vm_page_prot, vm_page_prot);
 }
 
+/*
+ * Change backing file, only valid to use during initial VMA setup.
+ */
+struct file *vma_set_file(struct vm_area_struct *vma, struct file *file)
+{
+	if (file)
+		get_file(file);
+
+	swap(vma->vm_file, file);
+
+	if (file)
+		fput(file);
+
+	return file;
+}
+
 /*
  * Requires inode->i_mapping->i_mmap_rwsem
  */
Hi "Christian,
I love your patch! Yet something to improve:
[auto build test ERROR on hnaz-linux-mm/master]
[also build test ERROR on drm-intel/for-linux-next drm-tip/drm-tip linus/master v5.9-rc5 next-20200914]
[cannot apply to tegra-drm/drm/tegra/for-next drm-exynos/exynos-drm-next drm/drm-next]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url:    https://github.com/0day-ci/linux/commits/Christian-K-nig/drm-shmem-helpers-r...
base:   https://github.com/hnaz/linux-mm master
config: h8300-randconfig-r023-20200914 (attached as .config)
compiler: h8300-linux-gcc (GCC) 9.3.0
reproduce (this is a W=1 build):
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # save the attached .config to linux build tree
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-9.3.0 make.cross ARCH=h8300
If you fix the issue, kindly add following tag as appropriate Reported-by: kernel test robot lkp@intel.com
All errors (new ones prefixed by >>):
h8300-linux-ld: drivers/dma-buf/dma-buf.o: in function `dma_buf_mmap':
>> drivers/dma-buf/dma-buf.c:1166: undefined reference to `vma_set_file'
>> h8300-linux-ld: drivers/dma-buf/dma-buf.c:1172: undefined reference to `vma_set_file'
h8300-linux-ld: drivers/leds/leds-lp55xx-common.o: in function `devm_led_classdev_register': include/linux/leds.h:200: undefined reference to `devm_led_classdev_register_ext'
# https://github.com/0day-ci/linux/commit/c558278651bbea7cb67487890a983608764c...
git remote add linux-review https://github.com/0day-ci/linux
git fetch --no-tags linux-review Christian-K-nig/drm-shmem-helpers-revert-Redirect-mmap-for-imported-dma-buf/20200914-222921
git checkout c558278651bbea7cb67487890a983608764cc7f4
vim +1166 drivers/dma-buf/dma-buf.c

  1127
  1128
  1129	/**
  1130	 * dma_buf_mmap - Setup up a userspace mmap with the given vma
  1131	 * @dmabuf:	[in]	buffer that should back the vma
  1132	 * @vma:	[in]	vma for the mmap
  1133	 * @pgoff:	[in]	offset in pages where this mmap should start within the
  1134	 *			dma-buf buffer.
  1135	 *
  1136	 * This function adjusts the passed in vma so that it points at the file of the
  1137	 * dma_buf operation. It also adjusts the starting pgoff and does bounds
  1138	 * checking on the size of the vma. Then it calls the exporters mmap function to
  1139	 * set up the mapping.
  1140	 *
  1141	 * Can return negative error values, returns 0 on success.
  1142	 */
  1143	int dma_buf_mmap(struct dma_buf *dmabuf, struct vm_area_struct *vma,
  1144			 unsigned long pgoff)
  1145	{
  1146		struct file *oldfile;
  1147		int ret;
  1148
  1149		if (WARN_ON(!dmabuf || !vma))
  1150			return -EINVAL;
  1151
  1152		/* check if buffer supports mmap */
  1153		if (!dmabuf->ops->mmap)
  1154			return -EINVAL;
  1155
  1156		/* check for offset overflow */
  1157		if (pgoff + vma_pages(vma) < pgoff)
  1158			return -EOVERFLOW;
  1159
  1160		/* check for overflowing the buffer's size */
  1161		if (pgoff + vma_pages(vma) >
  1162		    dmabuf->size >> PAGE_SHIFT)
  1163			return -EINVAL;
  1164
  1165		/* readjust the vma */
> 1166		oldfile = vma_set_file(vma, dmabuf->file);
  1167		vma->vm_pgoff = pgoff;
  1168
  1169		ret = dmabuf->ops->mmap(dmabuf, vma);
  1170		/* restore old parameters on failure */
  1171		if (ret)
> 1172			vma_set_file(vma, oldfile);
  1173
  1174		return ret;
  1175	}
---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org
On 14.09.20 at 15:29, Christian König wrote:
Hi Andrew,
Sorry forgot to add Daniel as well.
I'm the new DMA-buf maintainer and Daniel and others came up with patches extending the use of the dma_buf_mmap() function.
Now this function is doing something a bit odd by changing the vma->vm_file while installing a VMA in the mmap() system call.
The background here is that DMA-buf allows device drivers to export buffers which are then imported into another device driver. The mmap() handler of the importing device driver then finds that the pgoff belongs to the exporting device and so redirects the mmap() call there.
In other words user space calls mmap() on one file descriptor, but gets a different one mapped into its virtual address space.
My question is now: Is that legal or can you think of something which breaks here?
If it's not legal we should probably block any new users of the dma_buf_mmap() function and consider what should happen with the two existing ones.
If that is legal I would like to document this by adding a new vma_set_file() function which does the necessary reference count dance.
Thanks in advance, Christian.
On Mon, Sep 14, 2020 at 03:30:47PM +0200, Christian König wrote:
On 14.09.20 at 15:29, Christian König wrote:
Hi Andrew,
I'm the new DMA-buf maintainer and Daniel and others came up with patches extending the use of the dma_buf_mmap() function.
Now this function is doing something a bit odd by changing the vma->vm_file while installing a VMA in the mmap() system call.
It doesn't look obviously safe, as mmap_region() has an interesting mix of file and vma->vm_file.
E.g. it calls mapping_unmap_writable() using both routes.
What about security? Is it OK that some other random file, maybe in another process, is being linked to this mmap?
The background here is that DMA-buf allows device drivers to export buffers which are then imported into another device driver. The mmap() handler of the importing device driver then finds that the pgoff belongs to the exporting device and so redirects the mmap() call there.
So the pgoff is some virtualized thing?
Jason
On 14.09.20 at 16:06, Jason Gunthorpe wrote:
On Mon, Sep 14, 2020 at 03:30:47PM +0200, Christian König wrote:
On 14.09.20 at 15:29, Christian König wrote:
Hi Andrew,
I'm the new DMA-buf maintainer and Daniel and others came up with patches extending the use of the dma_buf_mmap() function.
Now this function is doing something a bit odd by changing the vma->vm_file while installing a VMA in the mmap() system call.
It doesn't look obviously safe, as mmap_region() has an interesting mix of file and vma->vm_file.
E.g. it calls mapping_unmap_writable() using both routes.
Thanks for the hint, going to take a look at that code tomorrow.
What about security? Is it OK that some other random file, maybe in another process, is being linked to this mmap?
Good question, I have no idea. That's why I sent out this mail.
The background here is that DMA-buf allows device drivers to export buffers which are then imported into another device driver. The mmap() handler of the importing device driver then finds that the pgoff belongs to the exporting device and so redirects the mmap() call there.
So the pgoff is some virtualized thing?
Yes, absolutely.
Christian.
Jason
On Mon, Sep 14, 2020 at 08:26:47PM +0200, Christian König wrote:
On 14.09.20 at 16:06, Jason Gunthorpe wrote:
On Mon, Sep 14, 2020 at 03:30:47PM +0200, Christian König wrote:
On 14.09.20 at 15:29, Christian König wrote:
Hi Andrew,
I'm the new DMA-buf maintainer and Daniel and others came up with patches extending the use of the dma_buf_mmap() function.
Now this function is doing something a bit odd by changing the vma->vm_file while installing a VMA in the mmap() system call.
It doesn't look obviously safe, as mmap_region() has an interesting mix of file and vma->vm_file.
E.g. it calls mapping_unmap_writable() using both routes.
Thanks for the hint, going to take a look at that code tomorrow.
What about security? Is it OK that some other random file, maybe in another process, is being linked to this mmap?
Good question, I have no idea. That's why I sent out this mail.
The background here is that DMA-buf allows device drivers to export buffers which are then imported into another device driver. The mmap() handler of the importing device driver then finds that the pgoff belongs to the exporting device and so redirects the mmap() call there.
So the pgoff is some virtualized thing?
Yes, absolutely.
Maybe a notch more context. Conceptually the buffer objects we use to manage gpu memory are all stand-alone objects, internally refcounted and everything. And if you export them as a dma-buf, then they are indeed stand-alone file descriptors like any other.
But within the driver, we generally need thousands of these, and that tends to bring fd exhaustion problems with it. That's why all the private buffer objects which aren't shared with other processes or other drivers are handles only valid for a specific fd instance of the drm chardev (each open gets its own namespace), and only for ioctls done on that chardev. And for mmap we assign fake (but unique across all open fds on it) offsets within the overall chardev. Hence all the pgoff mangling and re-mangling.
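(For reference, the fake offsets come from the drm_vma_offset_manager; a minimal sketch using the existing GEM helpers, assuming obj is a struct drm_gem_object:)

	/* Reserve a unique, fake mmap offset range for this object
	 * within the chardev's overall mmap space. */
	int ret = drm_gem_create_mmap_offset(obj);
	if (ret)
		return ret;

	/* The fake offset userspace must pass to mmap() on the drm
	 * chardev to hit this object. */
	u64 fake_offset = drm_vma_node_offset_addr(&obj->vma_node);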
Now for unmap_mapping_range we'd like it to find all such fake offset aliases pointing at the one underlying buffer object:
- mmap on the dma-buf fd, at offset 0
- mmap on the drm chardev where the buffer was originally allocated, at some unique offset
- mmap on the drm chardev where the buffer was imported, again at some (likely) different unique (for that chardev) offset.
So to make unmap_mapping_range work across the entire delegation chain we'd actually need to change the vma->vm_file and pgoff twice:
- once when forwarding from the importing drm chardev to the dma-buf
- once when forwarding from the dma-buf to the exporting drm chardev's fake offset, which (mostly for historical reasons) is considered the canonical fake offset.
We can't really do the delegation in userspace because:
- the importer might not have access to the exporter's drm chardev, it only gets the dma-buf. If we'd give it the underlying drm chardev it could do stuff like issue rendering commands, breaking the access model.
- the dma-buf fd is only used to establish the sharing; once it's imported everywhere it generally gets closed. Userspace could re-export it and then call mmap on that, but that feels a bit contrived.
- especially on SoC platforms this has already become uapi. It's not a big problem because the drivers that really need unmap_mapping_range to work are the big gpu drivers with discrete vram, where mappings need to be invalidated when moving buffer objects in/out of vram.
Hence why we'd like to be able to forward aliasing mappings and adjust the file and pgoff, while hopefully everything keeps working. I thought this would work, but Christian noticed it doesn't really.
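(For context, the shoot-down in question is done with unmap_mapping_range(), usually via the existing drm_vma_node_unmap() helper; a sketch, assuming obj is a GEM object on the drm_device dev. It only finds PTEs installed under that exact address_space and offset range, which is why all the mmap aliases have to agree on both:)

	/* Zap every CPU PTE pointing at this object's fake offset range,
	 * e.g. before moving the buffer out of vram. Aliases installed
	 * under a different file or offset are not found. */
	drm_vma_node_unmap(&obj->vma_node, dev->anon_inode->i_mapping);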
Cheers, Daniel
Christian.
Jason
On 16.09.20 at 11:53, Daniel Vetter wrote:
On Mon, Sep 14, 2020 at 08:26:47PM +0200, Christian König wrote:
On 14.09.20 at 16:06, Jason Gunthorpe wrote:
On Mon, Sep 14, 2020 at 03:30:47PM +0200, Christian König wrote:
On 14.09.20 at 15:29, Christian König wrote:
Hi Andrew,
I'm the new DMA-buf maintainer and Daniel and others came up with patches extending the use of the dma_buf_mmap() function.
Now this function is doing something a bit odd by changing the vma->vm_file while installing a VMA in the mmap() system call.
It doesn't look obviously safe, as mmap_region() has an interesting mix of file and vma->vm_file.
E.g. it calls mapping_unmap_writable() using both routes.
Thanks for the hint, going to take a look at that code tomorrow.
What about security? Is it OK that some other random file, maybe in another process, is being linked to this mmap?
Good question, I have no idea. That's why I sent out this mail.
The background here is that DMA-buf allows device drivers to export buffers which are then imported into another device driver. The mmap() handler of the importing device driver then finds that the pgoff belongs to the exporting device and so redirects the mmap() call there.
So the pgoff is some virtualized thing?
Yes, absolutely.
Maybe a notch more context. Conceptually the buffer objects we use to manage gpu memory are all stand-alone objects, internally refcounted and everything. And if you export them as a dma-buf, then they are indeed stand-alone file descriptors like any other.
But within the driver, we generally need thousands of these, and that tends to bring fd exhaustion problems with it. That's why all the private buffer objects which aren't shared with other processes or other drivers are handles only valid for a specific fd instance of the drm chardev (each open gets its own namespace), and only for ioctls done on that chardev. And for mmap we assign fake (but unique across all open fds on it) offsets within the overall chardev. Hence all the pgoff mangling and re-mangling.
Now for unmap_mapping_range we'd like it to find all such fake offset aliases pointing at the one underlying buffer object:
- mmap on the dma-buf fd, at offset 0
- mmap on the drm chardev where the buffer was originally allocated, at some unique offset
- mmap on the drm chardev where the buffer was imported, again at some (likely) different unique (for that chardev) offset.
So to make unmap_mapping_range work across the entire delegation chain we'd actually need to change the vma->vm_file and pgoff twice:
- once when forwarding from the importing drm chardev to the dma-buf
- once when forwarding from the dma-buf to the exporting drm chardev's fake offset, which (mostly for historical reasons) is considered the canonical fake offset.
We can't really do the delegation in userspace because:
- the importer might not have access to the exporter's drm chardev, it only gets the dma-buf. If we'd give it the underlying drm chardev it could do stuff like issue rendering commands, breaking the access model.
- the dma-buf fd is only used to establish the sharing; once it's imported everywhere it generally gets closed. Userspace could re-export it and then call mmap on that, but that feels a bit contrived.
- especially on SoC platforms this has already become uapi. It's not a big problem because the drivers that really need unmap_mapping_range to work are the big gpu drivers with discrete vram, where mappings need to be invalidated when moving buffer objects in/out of vram.
Hence why we'd like to be able to forward aliasing mappings and adjust the file and pgoff, while hopefully everything keeps working. I thought this would work, but Christian noticed it doesn't really.
Well to be clear I'm still not sure if that works or not :)
But Jason pointed me to the right piece of code. See this comment in mmap_region():
/* ->mmap() can change vma->vm_file, but must guarantee that
 * vma_link() below can deny write-access if VM_DENYWRITE is set
 * and map writably if VM_SHARED is set. This usually means the
 * new file must not have been exposed to user-space, yet.
 */
vma->vm_file = get_file(file);
error = call_mmap(file, vma);
So changing vma->vm_file is allowed at least under certain circumstances.
Only the "file must not have been exposed to user-space, yet" part still needs double checking. Currently working on that.
Regards, Christian.
Cheers, Daniel
Christian.
Jason
[SNIP]
But Jason pointed me to the right piece of code. See this comment in mmap_region():
/* ->mmap() can change vma->vm_file, but must guarantee that
 * vma_link() below can deny write-access if VM_DENYWRITE is set
 * and map writably if VM_SHARED is set. This usually means the
 * new file must not have been exposed to user-space, yet.
 */
vma->vm_file = get_file(file);
error = call_mmap(file, vma);
So changing vma->vm_file is allowed at least under certain circumstances.
Only the "file must not have been exposed to user-space, yet" part still needs double checking. Currently working on that.
Ok, I think we can guarantee for all DMA-bufs what is required here.
While searching the code I've found that at least vgem_prime_mmap() and i915_gem_dmabuf_mmap() are doing the same thing of modifying vma->vm_file.
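(For reference, the vgem variant is roughly the following, simplified from vgem_prime_mmap(): forward the mmap to the GEM object's shmem file, then re-point the vma at that file - the same get/put dance vma_set_file() would encapsulate:)

static int vgem_style_prime_mmap(struct drm_gem_object *obj,
				 struct vm_area_struct *vma)
{
	int ret;

	if (obj->size < vma->vm_end - vma->vm_start)
		return -EINVAL;

	if (!obj->filp)
		return -ENODEV;

	/* Let the shmem file's own mmap handler set up the mapping... */
	ret = call_mmap(obj->filp, vma);
	if (ret)
		return ret;

	/* ...then swap the vma's backing file over to it. */
	fput(vma->vm_file);
	vma->vm_file = get_file(obj->filp);

	return 0;
}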
So I'm leaning towards this working as expected, and we should just document it properly.
Daniel and Jason what do you think?
Christian.
On Wed, Sep 16, 2020 at 1:45 PM Christian König christian.koenig@amd.com wrote:
[SNIP]
But Jason pointed me to the right piece of code. See this comment in mmap_region():
/* ->mmap() can change vma->vm_file, but must guarantee that
 * vma_link() below can deny write-access if VM_DENYWRITE is set
 * and map writably if VM_SHARED is set. This usually means the
 * new file must not have been exposed to user-space, yet.
 */
vma->vm_file = get_file(file);
error = call_mmap(file, vma);
So changing vma->vm_file is allowed at least under certain circumstances.
Only the "file must not have been exposed to user-space, yet" part still needs double checking. Currently working on that.
Ok, I think we can guarantee for all DMA-bufs what is required here.
While searching the code I've found that at least vgem_prime_mmap() and i915_gem_dmabuf_mmap() are doing the same thing of modifying vma->vm_file.
So I'm leaning towards this working as expected, and we should just document it properly.
Daniel and Jason what do you think?
Well I can claim I started this, so I started out by naively assuming that it Just Works :-)
I think we already have vgem/i915 prime testcases in igt which try to exercise this mmap forwarding, including provoking pte shoot-downs. So I think best would be if you could also add a variant for amdgpu, to make sure this really works and keeps working.
Plus ofc the documentation patch so that core mm folks can officially ack this as legit. -Daniel
On Wed, Sep 16, 2020 at 11:53:59AM +0200, Daniel Vetter wrote:
But within the driver, we generally need thousands of these, and that tends to bring fd exhaustion problems with it. That's why all the private buffer objects which aren't shared with other processes or other drivers are handles only valid for a specific fd instance of the drm chardev (each open gets its own namespace), and only for ioctls done on that chardev. And for mmap we assign fake (but unique across all open fds on it) offsets within the overall chardev. Hence all the pgoff mangling and re-mangling.
Are they still unique struct files? Just without a fdno?
Hence why we'd like to be able to forward aliasing mappings and adjust the file and pgoff, while hopefully everything keeps working. I thought this would work, but Christian noticed it doesn't really.
It seems reasonable to me that the dma buf should be the owner of the VMA, otherwise like you say, there is a big mess attaching the custom vma ops and what not to the proper dma buf.
I don't see anything obviously against this in mmap_region() - why did Christian notice it doesn't really work?
Jason
On 16.09.20 at 16:07, Jason Gunthorpe wrote:
On Wed, Sep 16, 2020 at 11:53:59AM +0200, Daniel Vetter wrote:
But within the driver, we generally need thousands of these, and that tends to bring fd exhaustion problems with it. That's why all the private buffer objects which aren't shared with other processes or other drivers are handles only valid for a specific fd instance of the drm chardev (each open gets its own namespace), and only for ioctls done on that chardev. And for mmap we assign fake (but unique across all open fds on it) offsets within the overall chardev. Hence all the pgoff mangling and re-mangling.
Are they still unique struct files? Just without a fdno?
Yes, exactly.
Hence why we'd like to be able to forward aliasing mappings and adjust the file and pgoff, while hopefully everything keeps working. I thought this would work, but Christian noticed it doesn't really.
It seems reasonable to me that the dma buf should be the owner of the VMA, otherwise like you say, there is a big mess attaching the custom vma ops and what not to the proper dma buf.
I don't see anything obviously against this in mmap_region() - why did Christian notice it doesn't really work?
To clarify I think this might work.
I just had the same "Is that legal?", "What about security?", etc.. questions you raised as well.
It seems like a source of trouble, so I thought I'd better ask somebody more familiar with that.
Christian.
Jason
On Wed, Sep 16, 2020 at 4:14 PM Christian König christian.koenig@amd.com wrote:
On 16.09.20 at 16:07, Jason Gunthorpe wrote:
On Wed, Sep 16, 2020 at 11:53:59AM +0200, Daniel Vetter wrote:
But within the driver, we generally need thousands of these, and that tends to bring fd exhaustion problems with it. That's why all the private buffer objects which aren't shared with other processes or other drivers are handles only valid for a specific fd instance of the drm chardev (each open gets its own namespace), and only for ioctls done on that chardev. And for mmap we assign fake (but unique across all open fds on it) offsets within the overall chardev. Hence all the pgoff mangling and re-mangling.
Are they still unique struct files? Just without a fdno?
Yes, exactly.
Not entirely, since dma-buf happened after drm chardev, so for that historical reason the underlying struct file is shared, since it's the drm chardev. But since that's per-device we don't have a problem in practice with different vm_ops, since those are also per-device. But yeah we could fish out some entirely hidden per-object struct file if that's required for some mm internal reasons. -Daniel
Hence why we'd like to be able to forward aliasing mappings and adjust the file and pgoff, while hopefully everything keeps working. I thought this would work, but Christian noticed it doesn't really.
It seems reasonable to me that the dma buf should be the owner of the VMA, otherwise like you say, there is a big mess attaching the custom vma ops and what not to the proper dma buf.
I don't see anything obviously against this in mmap_region() - why did Christian notice it doesn't really work?
To clarify I think this might work.
I just had the same "Is that legal?", "What about security?", etc.. questions you raised as well.
It seems like a source of trouble, so I thought I'd better ask somebody more familiar with that.
Christian.
Jason
On 16.09.20 at 17:24, Daniel Vetter wrote:
On Wed, Sep 16, 2020 at 4:14 PM Christian König christian.koenig@amd.com wrote:
On 16.09.20 at 16:07, Jason Gunthorpe wrote:
On Wed, Sep 16, 2020 at 11:53:59AM +0200, Daniel Vetter wrote:
But within the driver, we generally need thousands of these, and that tends to bring fd exhaustion problems with it. That's why all the private buffer objects which aren't shared with other processes or other drivers are handles only valid for a specific fd instance of the drm chardev (each open gets its own namespace), and only for ioctls done on that chardev. And for mmap we assign fake (but unique across all open fds on it) offsets within the overall chardev. Hence all the pgoff mangling and re-mangling.
Are they still unique struct files? Just without a fdno?
Yes, exactly.
Not entirely, since dma-buf happened after drm chardev, so for that historical reason the underlying struct file is shared, since it's the drm chardev. But since that's per-device we don't have a problem in practice with different vm_ops, since those are also per-device. But yeah we could fish out some entirely hidden per-object struct file if that's required for some mm internal reasons.
Huh? OK, that is just the handling in i915, isn't it?
As far as I know we create a unique struct file for each DMA-buf.
Regards, Christian.
-Daniel
Hence why we'd like to be able to forward aliasing mappings and adjust the file and pgoff, while hopefully everything keeps working. I thought this would work, but Christian noticed it doesn't really.
It seems reasonable to me that the dma buf should be the owner of the VMA, otherwise like you say, there is a big mess attaching the custom vma ops and what not to the proper dma buf.
I don't see anything obviously against this in mmap_region() - why did Christian notice it doesn't really work?
To clarify I think this might work.
I just had the same "Is that legal?", "What about security?", etc.. questions you raised as well.
It seems like a source of trouble, so I thought I'd better ask somebody more familiar with that.
Christian.
Jason
On Wed, Sep 16, 2020 at 5:31 PM Christian König ckoenig.leichtzumerken@gmail.com wrote:
On 16.09.20 at 17:24, Daniel Vetter wrote:
On Wed, Sep 16, 2020 at 4:14 PM Christian König christian.koenig@amd.com wrote:
On 16.09.20 at 16:07, Jason Gunthorpe wrote:
On Wed, Sep 16, 2020 at 11:53:59AM +0200, Daniel Vetter wrote:
But within the driver, we generally need thousands of these, and that tends to bring fd exhaustion problems with it. That's why all the private buffer objects which aren't shared with other processes or other drivers are handles only valid for a specific fd instance of the drm chardev (each open gets its own namespace), and only for ioctls done on that chardev. And for mmap we assign fake (but unique across all open fds on it) offsets within the overall chardev. Hence all the pgoff mangling and re-mangling.
Are they still unique struct files? Just without a fdno?
Yes, exactly.
Not entirely, since dma-buf happened after drm chardev, so for that historical reason the underlying struct file is shared, since it's the drm chardev. But since that's per-device we don't have a problem in practice with different vm_ops, since those are also per-device. But yeah we could fish out some entirely hidden per-object struct file if that's required for some mm internal reasons.
Huh? OK, that is just the handling in i915, isn't it?
As far as I know we create a unique struct file for each DMA-buf.
Yes dma-buf, but that gets forwarded to the original drm chardev which originally exported the buffer. It's only there where the forwarding chain stops. The other thing is that iirc we have a singleton anon_inode behind all the dma-buf, so they'd share all the same address_space and so would all alias for unmap_mapping_range (I think at least). -Daniel
Regards, Christian.
-Daniel
Hence why we'd like to be able to forward aliasing mappings and adjust the file and pgoff, while hopefully everything keeps working. I thought this would work, but Christian noticed it doesn't really.
It seems reasonable to me that the dma buf should be the owner of the VMA, otherwise like you say, there is a big mess attaching the custom vma ops and what not to the proper dma buf.
I don't see anything obviously against this in mmap_region() - why did Christian notice it doesn't really work?
To clarify I think this might work.
I just had the same "Is that legal?", "What about security?", etc.. questions you raised as well.
It seems like a source of trouble, so I thought I'd better ask somebody more familiar with that.
Christian.
Jason
On 17.09.20 at 08:23, Daniel Vetter wrote:
On Wed, Sep 16, 2020 at 5:31 PM Christian König ckoenig.leichtzumerken@gmail.com wrote:
On 16.09.20 at 17:24, Daniel Vetter wrote:
On Wed, Sep 16, 2020 at 4:14 PM Christian König christian.koenig@amd.com wrote:
On 16.09.20 at 16:07, Jason Gunthorpe wrote:
On Wed, Sep 16, 2020 at 11:53:59AM +0200, Daniel Vetter wrote:
But within the driver, we generally need thousands of these, and that tends to bring fd exhaustion problems with it. That's why all the private buffer objects which aren't shared with other processes or other drivers are handles only valid for a specific fd instance of the drm chardev (each open gets its own namespace), and only for ioctls done on that chardev. And for mmap we assign fake (but unique across all open fds on it) offsets within the overall chardev. Hence all the pgoff mangling and re-mangling.
Are they still unique struct files? Just without a fdno?
Yes, exactly.
Not entirely, since dma-buf happened after drm chardev, so for that historical reason the underlying struct file is shared, since it's the drm chardev. But since that's per-device we don't have a problem in practice with different vm_ops, since those are also per-device. But yeah we could fish out some entirely hidden per-object struct file if that's required for some mm internal reasons.
Huh? OK, that is just the handling in i915, isn't it?
As far as I know we create a unique struct file for each DMA-buf.
Yes dma-buf, but that gets forwarded to the original drm chardev which originally exported the buffer. It's only there where the forwarding chain stops. The other thing is that iirc we have a singleton anon_inode behind all the dma-buf, so they'd share all the same address_space and so would all alias for unmap_mapping_range (I think at least).
Amdgpu works by plugging the address_space of the drm chardev into the struct file of the DMA-buf instead.
I think that this is cleaner, but only by a little bit :)
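(Concretely, the amdgpu approach is a one-liner at export time; a sketch, assuming ddev is the struct drm_device the object belongs to:)

	struct dma_buf *buf = drm_gem_prime_export(gobj, flags);

	/* Account all mmaps of this dma-buf to the drm chardev's
	 * address_space, so the exporter's unmap_mapping_range() also
	 * hits mappings made through the dma-buf fd. */
	if (!IS_ERR(buf))
		buf->file->f_mapping = ddev->anon_inode->i_mapping;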
Anyway I'm a bit concerned that we have so many different approaches for the same problem.
Christian.
-Daniel
Regards, Christian.
-Daniel
Hence why we'd like to be able to forward aliasing mappings and adjust the file and pgoff, while hopefully everything keeps working. I thought this would work, but Christian noticed it doesn't really.
It seems reasonable to me that the dma buf should be the owner of the VMA, otherwise like you say, there is a big mess attaching the custom vma ops and what not to the proper dma buf.
I don't see anything obviously against this in mmap_region() - why did Christian notice it doesn't really work?
To clarify I think this might work.
I just had the same "Is that legal?", "What about security?", etc.. questions you raised as well.
It seems like a source of trouble, so I thought I'd better ask somebody more familiar with that.
Christian.
Jason
On Thu, Sep 17, 2020 at 9:11 AM Christian König christian.koenig@amd.com wrote:
On 17.09.20 at 08:23, Daniel Vetter wrote:
On Wed, Sep 16, 2020 at 5:31 PM Christian König ckoenig.leichtzumerken@gmail.com wrote:
On 16.09.20 at 17:24, Daniel Vetter wrote:
On Wed, Sep 16, 2020 at 4:14 PM Christian König christian.koenig@amd.com wrote:
On 16.09.20 at 16:07, Jason Gunthorpe wrote:
On Wed, Sep 16, 2020 at 11:53:59AM +0200, Daniel Vetter wrote:
But within the driver, we generally need thousands of these, and that tends to bring fd exhaustion problems with it. That's why all the private buffer objects which aren't shared with other processes or other drivers are handles only valid for a specific fd instance of the drm chardev (each open gets its own namespace), and only for ioctls done on that chardev. And for mmap we assign fake (but unique across all open fds on it) offsets within the overall chardev. Hence all the pgoff mangling and re-mangling.
Are they still unique struct files? Just without a fdno?
Yes, exactly.
Not entirely, since dma-buf happened after drm chardev, so for that historical reason the underlying struct file is shared, since it's the drm chardev. But since that's per-device we don't have a problem in practice with different vm_ops, since those are also per-device. But yeah we could fish out some entirely hidden per-object struct file if that's required for some mm internal reasons.
Huh? OK, that is just the handling in i915, isn't it?
As far as I know we create a unique struct file for each DMA-buf.
Yes dma-buf, but that gets forwarded to the original drm chardev which originally exported the buffer. It's only there where the forwarding chain stops. The other thing is that iirc we have a singleton anon_inode behind all the dma-buf, so they'd share all the same address_space and so would all alias for unmap_mapping_range (I think at least).
Amdgpu works by plugging the address_space of the drm chardev into the struct file of the DMA-buf instead.
I think that this is cleaner, but only by a little bit :)
Yeah, but it doesn't work when forwarding from the drm chardev to the dma-buf on the importer side, since you'd need a ton of different address spaces. And you still rely on the core code picking up your pgoff mangling, which feels about as risky to me as the vma file pointer wrangling - if it's not consistently applied the reverse map is toast and unmap_mapping_range doesn't work correctly for our needs.
Anyway I'm a bit concerned that we have so many different approaches for the same problem.
Yeah, I think if we can standardize this then that would be really good. -Daniel
Christian.
-Daniel
Regards, Christian.
-Daniel
Hence why we'd like to be able to forward aliasing mappings and adjust the file and pgoff, while hopefully everything keeps working. I thought this would work, but Christian noticed it doesn't really.
It seems reasonable to me that the dma buf should be the owner of the VMA, otherwise like you say, there is a big mess attaching the custom vma ops and what not to the proper dma buf.
I don't see anything obviously against this in mmap_region() - why did Christian notice it doesn't really work?
To clarify I think this might work.
I just had the same "Is that legal?", "What about security?", etc.. questions you raised as well.
It seems like a source of trouble, so I thought I'd better ask somebody more familiar with that.
Christian.
Jason
On Thu, Sep 17, 2020 at 10:09:12AM +0200, Daniel Vetter wrote:
Yeah, but it doesn't work when forwarding from the drm chardev to the dma-buf on the importer side, since you'd need a ton of different address spaces. And you still rely on the core code picking up your pgoff mangling, which feels about as risky to me as the vma file pointer wrangling - if it's not consistently applied the reverse map is toast and unmap_mapping_range doesn't work correctly for our needs.
I would think the pgoff has to be translated at the same time the vma->vm_file is changed?
The owner of the dma_buf should have one virtual address space and FD, all its dma bufs should be linked to it, and all pgoffs translated to that space.
Jason
On 17.09.20 at 13:31, Jason Gunthorpe wrote:
On Thu, Sep 17, 2020 at 10:09:12AM +0200, Daniel Vetter wrote:
Yeah, but it doesn't work when forwarding from the drm chardev to the dma-buf on the importer side, since you'd need a ton of different address spaces. And you still rely on the core code picking up your pgoff mangling, which feels about as risky to me as the vma file pointer wrangling - if it's not consistently applied the reverse map is toast and unmap_mapping_range doesn't work correctly for our needs.
I would think the pgoff has to be translated at the same time the vma->vm_file is changed?
The owner of the dma_buf should have one virtual address space and FD, all its dma bufs should be linked to it, and all pgoffs translated to that space.
Yeah, that is exactly like amdgpu is doing it.
Going to document that somehow when I'm done with TTM cleanups.
Christian.
Jason
On Thu, Sep 17, 2020 at 02:03:48PM +0200, Christian König wrote:
On 17.09.20 at 13:31, Jason Gunthorpe wrote:
On Thu, Sep 17, 2020 at 10:09:12AM +0200, Daniel Vetter wrote:
Yeah, but it doesn't work when forwarding from the drm chardev to the dma-buf on the importer side, since you'd need a ton of different address spaces. And you still rely on the core code picking up your pgoff mangling, which feels about as risky to me as the vma file pointer wrangling - if it's not consistently applied the reverse map is toast and unmap_mapping_range doesn't work correctly for our needs.
I would think the pgoff has to be translated at the same time the vma->vm_file is changed?
The owner of the dma_buf should have one virtual address space and FD, all its dma bufs should be linked to it, and all pgoffs translated to that space.
Yeah, that is exactly like amdgpu is doing it.
Going to document that somehow when I'm done with TTM cleanups.
BTW, while people are looking at this, is there a way to go from a VMA to a dma_buf that owns it?
Jason
On 17.09.20 at 14:18, Jason Gunthorpe wrote:
On Thu, Sep 17, 2020 at 02:03:48PM +0200, Christian König wrote:
On 17.09.20 at 13:31, Jason Gunthorpe wrote:
On Thu, Sep 17, 2020 at 10:09:12AM +0200, Daniel Vetter wrote:
Yeah, but it doesn't work when forwarding from the drm chardev to the dma-buf on the importer side, since you'd need a ton of different address spaces. And you still rely on the core code picking up your pgoff mangling, which feels about as risky to me as the vma file pointer wrangling - if it's not consistently applied the reverse map is toast and unmap_mapping_range doesn't work correctly for our needs.
I would think the pgoff has to be translated at the same time the vma->vm_file is changed?
The owner of the dma_buf should have one virtual address space and FD, all its dma bufs should be linked to it, and all pgoffs translated to that space.
Yeah, that is exactly like amdgpu is doing it.
Going to document that somehow when I'm done with TTM cleanups.
BTW, while people are looking at this, is there a way to go from a VMA to a dma_buf that owns it?
Only a driver specific one.
For TTM drivers vma->vm_private_data points to the buffer object. Not sure about the drivers using GEM only.
Why are you asking?
Regards, Christian.
Jason
On Thu, Sep 17, 2020 at 02:24:29PM +0200, Christian König wrote:
On 17.09.20 at 14:18, Jason Gunthorpe wrote:
On Thu, Sep 17, 2020 at 02:03:48PM +0200, Christian König wrote:
On 17.09.20 at 13:31, Jason Gunthorpe wrote:
On Thu, Sep 17, 2020 at 10:09:12AM +0200, Daniel Vetter wrote:
Yeah, but it doesn't work when forwarding from the drm chardev to the dma-buf on the importer side, since you'd need a ton of different address spaces. And you still rely on the core code picking up your pgoff mangling, which feels about as risky to me as the vma file pointer wrangling - if it's not consistently applied the reverse map is toast and unmap_mapping_range doesn't work correctly for our needs.
I would think the pgoff has to be translated at the same time the vma->vm_file is changed?
The owner of the dma_buf should have one virtual address space and FD, all its dma bufs should be linked to it, and all pgoffs translated to that space.
Yeah, that is exactly like amdgpu is doing it.
Going to document that somehow when I'm done with TTM cleanups.
BTW, while people are looking at this, is there a way to go from a VMA to a dma_buf that owns it?
Only a driver specific one.
For TTM drivers vma->vm_private_data points to the buffer object. Not sure about the drivers using GEM only.
For gem drivers (which includes the ones using vram helpers, which uses ttm internally) that points at the drm_gem_object pointer. I guess that might be something we can unify a bit on the ttm mmap paths of the remaining driver, now that there's a drm_gem_object embedded anyway. -Daniel
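(So the lookup is a driver-specific cast of vm_private_data; a sketch of the two cases, with nothing generic guaranteeing which type is actually behind it:)

	/* TTM-based drivers: */
	struct ttm_buffer_object *bo = vma->vm_private_data;

	/* GEM drivers (including the vram helpers): */
	struct drm_gem_object *gobj = vma->vm_private_data;
	struct dma_buf *dmabuf = gobj->dma_buf;	/* non-NULL only if exported */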
Why are you asking?
Regards, Christian.
Jason
On Thu, Sep 17, 2020 at 02:24:29PM +0200, Christian König wrote:
On 17.09.20 at 14:18, Jason Gunthorpe wrote:
On Thu, Sep 17, 2020 at 02:03:48PM +0200, Christian König wrote:
On 17.09.20 at 13:31, Jason Gunthorpe wrote:
On Thu, Sep 17, 2020 at 10:09:12AM +0200, Daniel Vetter wrote:
Yeah, but it doesn't work when forwarding from the drm chardev to the dma-buf on the importer side, since you'd need a ton of different address spaces. And you still rely on the core code picking up your pgoff mangling, which feels about as risky to me as the vma file pointer wrangling - if it's not consistently applied the reverse map is toast and unmap_mapping_range doesn't work correctly for our needs.
I would think the pgoff has to be translated at the same time the vma->vm_file is changed?
The owner of the dma_buf should have one virtual address space and FD, all its dma bufs should be linked to it, and all pgoffs translated to that space.
Yeah, that is exactly like amdgpu is doing it.
Going to document that somehow when I'm done with TTM cleanups.
BTW, while people are looking at this, is there a way to go from a VMA to a dma_buf that owns it?
Only a driver specific one.
Sounds OK
For TTM drivers vma->vm_private_data points to the buffer object. Not sure about the drivers using GEM only.
Why are drivers in control of the vma? I would think dma_buf should be the vma owner. IIRC module lifetime correctness essentially hinges on the module owner of the struct file.
Why are you asking?
I'm thinking about using find_vma on something that is not get_user_pages()'able to go to the underlying object, in this case dma buf.
So, user VA -> find_vma -> dma_buf object -> dma_buf operations on the memory it represents
Jason
On 17.09.20 at 16:35, Jason Gunthorpe wrote:
On Thu, Sep 17, 2020 at 02:24:29PM +0200, Christian König wrote:
On 17.09.20 at 14:18, Jason Gunthorpe wrote:
On Thu, Sep 17, 2020 at 02:03:48PM +0200, Christian König wrote:
On 17.09.20 at 13:31, Jason Gunthorpe wrote:
On Thu, Sep 17, 2020 at 10:09:12AM +0200, Daniel Vetter wrote:
Yeah, but it doesn't work when forwarding from the drm chardev to the dma-buf on the importer side, since you'd need a ton of different address spaces. And you still rely on the core code picking up your pgoff mangling, which feels about as risky to me as the vma file pointer wrangling - if it's not consistently applied the reverse map is toast and unmap_mapping_range doesn't work correctly for our needs.
I would think the pgoff has to be translated at the same time the vma->vm_file is changed?
The owner of the dma_buf should have one virtual address space and FD, all its dma bufs should be linked to it, and all pgoffs translated to that space.
Yeah, that is exactly like amdgpu is doing it.
Going to document that somehow when I'm done with TTM cleanups.
BTW, while people are looking at this, is there a way to go from a VMA to a dma_buf that owns it?
Only a driver specific one.
Sounds OK
For TTM drivers vma->vm_private_data points to the buffer object. Not sure about the drivers using GEM only.
Why are drivers in control of the vma? I would think dma_buf should be the vma owner. IIRC module lifetime correctness essentially hinges on the module owner of the struct file.
Because the page fault handling is completely driver specific.
We could install some DMA-buf vmops, but that would just be another layer of redirection.
Why are you asking?
I'm thinking about using find_vma on something that is not get_user_pages()'able to go to the underlying object, in this case dma buf.
So, user VA -> find_vma -> dma_buf object -> dma_buf operations on the memory it represents
Ah, yes we are already doing this in amdgpu as well. But only for DMA-bufs or more generally buffers which are mmaped by this driver instance.
Some applications are braindead enough to mmap() a buffer and then give us back the CPU pointer and request to make it a handle (userptr) again.
That is clearly forbidden by OpenGL, OpenCL and Vulkan, but they use it anyway.
Christian.
Jason
On Thu, Sep 17, 2020 at 04:54:44PM +0200, Christian König wrote:
On 17.09.20 at 16:35, Jason Gunthorpe wrote:
On Thu, Sep 17, 2020 at 02:24:29PM +0200, Christian König wrote:
On 17.09.20 at 14:18, Jason Gunthorpe wrote:
On Thu, Sep 17, 2020 at 02:03:48PM +0200, Christian König wrote:
On 17.09.20 at 13:31, Jason Gunthorpe wrote:
On Thu, Sep 17, 2020 at 10:09:12AM +0200, Daniel Vetter wrote:
Yeah, but it doesn't work when forwarding from the drm chardev to the dma-buf on the importer side, since you'd need a ton of different address spaces. And you still rely on the core code picking up your pgoff mangling, which feels about as risky to me as the vma file pointer wrangling - if it's not consistently applied the reverse map is toast and unmap_mapping_range doesn't work correctly for our needs.
I would think the pgoff has to be translated at the same time the vma->vm_file is changed?
The owner of the dma_buf should have one virtual address space and FD, all its dma bufs should be linked to it, and all pgoffs translated to that space.
Yeah, that is exactly like amdgpu is doing it.
Going to document that somehow when I'm done with TTM cleanups.
BTW, while people are looking at this, is there a way to go from a VMA to a dma_buf that owns it?
Only a driver specific one.
Sounds OK
For TTM drivers vma->vm_private_data points to the buffer object. Not sure about the drivers using GEM only.
Why are drivers in control of the vma? I would think dma_buf should be the vma owner. IIRC module lifetime correctness essentially hinges on the module owner of the struct file.
Because the page fault handling is completely driver specific.
We could install some DMA-buf vmops, but that would just be another layer of redirection.
If it is already taking a page fault I'm not sure the extra function call indirection is going to be a big deal. Having a uniform VMA sounds saner than every driver custom rolling something.
When I unwound a similar mess in RDMA all the custom VMA stuff in the drivers turned out to be generally buggy, at least.
Is vma->vm_file->private_data universally a dma_buf pointer at least?
So, user VA -> find_vma -> dma_buf object -> dma_buf operations on the memory it represents
Ah, yes we are already doing this in amdgpu as well. But only for DMA-bufs or more generally buffers which are mmaped by this driver instance.
So there is no general dma_buf service? That is a real bummer
Jason
On Thu, Sep 17, 2020 at 5:24 PM Jason Gunthorpe jgg@ziepe.ca wrote:
On Thu, Sep 17, 2020 at 04:54:44PM +0200, Christian König wrote:
On 17.09.20 at 16:35, Jason Gunthorpe wrote:
On Thu, Sep 17, 2020 at 02:24:29PM +0200, Christian König wrote:
On 17.09.20 at 14:18, Jason Gunthorpe wrote:
On Thu, Sep 17, 2020 at 02:03:48PM +0200, Christian König wrote:
On 17.09.20 at 13:31, Jason Gunthorpe wrote:
On Thu, Sep 17, 2020 at 10:09:12AM +0200, Daniel Vetter wrote:
Yeah, but it doesn't work when forwarding from the drm chardev to the dma-buf on the importer side, since you'd need a ton of different address spaces. And you still rely on the core code picking up your pgoff mangling, which feels about as risky to me as the vma file pointer wrangling - if it's not consistently applied the reverse map is toast and unmap_mapping_range doesn't work correctly for our needs.
I would think the pgoff has to be translated at the same time the vma->vm_file is changed?
The owner of the dma_buf should have one virtual address space and FD, all its dma bufs should be linked to it, and all pgoffs translated to that space.
Yeah, that is exactly like amdgpu is doing it.
Going to document that somehow when I'm done with TTM cleanups.
BTW, while people are looking at this, is there a way to go from a VMA to a dma_buf that owns it?
Only a driver specific one.
Sounds OK
For TTM drivers vma->vm_private_data points to the buffer object. Not sure about the drivers using GEM only.
Why are drivers in control of the vma? I would think dma_buf should be the vma owner. IIRC module lifetime correctness essentially hinges on the module owner of the struct file.
Because the page fault handling is completely driver specific.
We could install some DMA-buf vmops, but that would just be another layer of redirection.
Uh geez I didn't know amdgpu was doing that :-/
Since this is on, I guess the inverse of trying to convert a userptr into a dma-buf is properly rejected?
If it is already taking a page fault, I'm not sure the extra function call indirection is going to be a big deal. Having a uniform VMA sounds saner than every driver rolling something custom.
When I unwound a similar mess in RDMA all the custom VMA stuff in the drivers turned out to be generally buggy, at least.
Is vma->vm_file->private_data universally a dma_buf pointer at least?
Nope. I think if you want this without some large scale rewrite of a lot of code we'd need a vmops->get_dmabuf or similar. Not pretty, but would get the job done.
So, user VA -> find_vma -> dma_buf object -> dma_buf operations on the memory it represents
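A sketch of what that lookup chain could look like if the hypothetical vmops->get_dmabuf callback existed (it is not in any tree; everything here is assumed):

/*
 * Hypothetical: assumes vm_operations_struct grew
 *     struct dma_buf *(*get_dmabuf)(struct vm_area_struct *vma);
 * implemented by every dma-buf exporting driver. Caller must hold
 * mmap_read_lock(mm) across the lookup and any use of the result.
 */
static struct dma_buf *dmabuf_from_user_va(struct mm_struct *mm,
					   unsigned long addr)
{
	struct vm_area_struct *vma = find_vma(mm, addr);

	if (!vma || addr < vma->vm_start)
		return NULL;
	if (!vma->vm_ops || !vma->vm_ops->get_dmabuf)
		return NULL;
	return vma->vm_ops->get_dmabuf(vma);
}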
Ah, yes we are already doing this in amdgpu as well. But only for DMA-bufs, or more generally buffers which are mmapped by this driver instance.
So there is no general dma_buf service? That is a real bummer
Mostly historical reasons and "it's complicated". One problem is that dma-buf isn't a powerful enough interface for drivers to use for all their native objects: e.g. userptr doesn't pass through it, clever cache flushing tricks aren't allowed, and a bunch of other things are missing. So there are some serious roadblocks before we could have a common allocator (or set of allocators) behind dma-buf. -Daniel
On 17.09.20 at 17:37, Daniel Vetter wrote:
On Thu, Sep 17, 2020 at 5:24 PM Jason Gunthorpe jgg@ziepe.ca wrote:
On Thu, Sep 17, 2020 at 04:54:44PM +0200, Christian König wrote:
On 17.09.20 at 16:35, Jason Gunthorpe wrote:
On Thu, Sep 17, 2020 at 02:24:29PM +0200, Christian König wrote:
On 17.09.20 at 14:18, Jason Gunthorpe wrote:
On Thu, Sep 17, 2020 at 02:03:48PM +0200, Christian König wrote:
> On 17.09.20 at 13:31, Jason Gunthorpe wrote:
>> On Thu, Sep 17, 2020 at 10:09:12AM +0200, Daniel Vetter wrote:
>>
>>> Yeah, but it doesn't work when forwarding from the drm chardev to the
>>> dma-buf on the importer side, since you'd need a ton of different
>>> address spaces. And you still rely on the core code picking up your
>>> pgoff mangling, which feels about as risky to me as the vma file
>>> pointer wrangling - if it's not consistently applied the reverse map
>>> is toast and unmap_mapping_range doesn't work correctly for our needs.
>> I would think the pgoff has to be translated at the same time the
>> vm->vm_file is changed?
>>
>> The owner of the dma_buf should have one virtual address space and FD,
>> all its dma bufs should be linked to it, and all pgoffs translated to
>> that space.
> Yeah, that is exactly like amdgpu is doing it.
>
> Going to document that somehow when I'm done with TTM cleanups.

BTW, while people are looking at this, is there a way to go from a VMA
to a dma_buf that owns it?
Only a driver specific one.
Sounds OK
For TTM drivers vma->vm_private_data points to the buffer object. Not sure about the drivers using GEM only.
Why are drivers in control of the vma? I would think dma_buf should be the vma owner. IIRC module lifetime correctness essentially hinges on the module owner of the struct file.
Because the page fault handling is completely driver specific.
We could install some DMA-buf vmops, but that would just be another layer of redirection.
Uh geez I didn't know amdgpu was doing that :-/
Since this is on, I guess the inverse of trying to convert a userptr into a dma-buf is properly rejected?
My fault, I wasn't specific enough in my description :)
Amdgpu is NOT doing this with mmapped DMA-bufs, but rather with its own mmapped BOs.
In other words, when userspace calls the userptr IOCTL and we get an error because we can't make a userptr from some random device memory, we instead check all CPU mappings to see whether the application was brain dead enough to hand us one of our own pointers back.
IIRC this is even done in userspace and not the kernel. But we talked about doing it in the kernel with the private_data as well.
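A kernel-side variant of that check could look roughly like this; it is a sketch only (amdgpu's real check lives in userspace per the above), assuming a kernel of that era with the vma->vm_next list walk:

/*
 * Illustrative only: reject a userptr range when any VMA in it is
 * backed by our own chardev file, i.e. the application handed us a
 * pointer into one of our own BO mappings. drm_filp is assumed to be
 * the struct file of our char device.
 */
static bool range_maps_our_bos(struct mm_struct *mm, struct file *drm_filp,
			       unsigned long start, unsigned long end)
{
	struct vm_area_struct *vma;
	bool hit = false;

	mmap_read_lock(mm);
	for (vma = find_vma(mm, start); vma && vma->vm_start < end;
	     vma = vma->vm_next) {
		if (vma->vm_file == drm_filp) {
			hit = true;
			break;
		}
	}
	mmap_read_unlock(mm);
	return hit;
}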
If it is already taking a page fault, I'm not sure the extra function call indirection is going to be a big deal. Having a uniform VMA sounds saner than every driver rolling something custom.
When I unwound a similar mess in RDMA all the custom VMA stuff in the drivers turned out to be generally buggy, at least.
Is vma->vm_file->private_data universally a dma_buf pointer at least?
Nope. I think if you want this without some large scale rewrite of a lot of code we'd need a vmops->get_dmabuf or similar. Not pretty, but would get the job done.
Yeah, agree that sounds like the simplest approach.
Regards, Christian.
So, user VA -> find_vma -> dma_buf object -> dma_buf operations on the memory it represents
Ah, yes we are already doing this in amdgpu as well. But only for DMA-bufs, or more generally buffers which are mmapped by this driver instance.
So there is no general dma_buf service? That is a real bummer
Mostly historical reasons and "it's complicated". One problem is that dma-buf isn't a powerful enough interface for drivers to use for all their native objects: e.g. userptr doesn't pass through it, clever cache flushing tricks aren't allowed, and a bunch of other things are missing. So there are some serious roadblocks before we could have a common allocator (or set of allocators) behind dma-buf. -Daniel
On Thu, Sep 17, 2020 at 06:06:14PM +0200, Christian König wrote:
If it is already taking a page fault, I'm not sure the extra function call indirection is going to be a big deal. Having a uniform VMA sounds saner than every driver rolling something custom.
When I unwound a similar mess in RDMA all the custom VMA stuff in the drivers turned out to be generally buggy, at least.
Is vma->vm_file->private_data universally a dma_buf pointer at least?
Nope. I think if you want this without some large scale rewrite of a lot of code we'd need a vmops->get_dmabuf or similar. Not pretty, but would get the job done.
Yeah, agree that sounds like the simplest approach.
I don't think that will fly, it is clearly only papering over a mess inside DRM/dma buf :\
Jason
On Thu, Sep 17, 2020 at 6:39 PM Jason Gunthorpe jgg@ziepe.ca wrote:
On Thu, Sep 17, 2020 at 06:06:14PM +0200, Christian König wrote:
If it is already taking a page fault, I'm not sure the extra function call indirection is going to be a big deal. Having a uniform VMA sounds saner than every driver rolling something custom.
When I unwound a similar mess in RDMA all the custom VMA stuff in the drivers turned out to be generally buggy, at least.
Is vma->vm_file->private_data universally a dma_buf pointer at least?
Nope. I think if you want this without some large scale rewrite of a lot of code we'd need a vmops->get_dmabuf or similar. Not pretty, but would get the job done.
Yeah, agree that sounds like the simplest approach.
I don't think that will fly, it is clearly only papering over a mess inside DRM/dma buf :\
dma-buf started out as something to paper over the disjoint mess of allocators, since it was pretty clear to anyone involved that we're not going to unify them anytime soon, if ever. So the mess is pretty much bound to stay.
I think the most reasonable thing would be to throw a common vmops in there somehow, but even that means some large-scale surgery across every driver/subsystem involved in dma-buf. It wouldn't unify anything; all it would give you is a constant vma->vm_ops to do some kind of upcasting. And that would still only give you a slightly less opaque pointer, with a callback to upcast to a dma-buf in some driver/subsystem-specific way. -Daniel