Hello,
Let me introduce the new ioctls, which are intended to allow gntdev to map a scatter-gather table on top of an existing dma-buf referenced by a file descriptor.
When using the dma-buf exporter to create a dma-buf with backing storage and map it to the grant refs provided by the remote domain, we ran into a problem: some hardware (the i.MX8 GPU in our case) does not support external buffers and requires the backing storage to be created using its native tools. That's why new ioctls were added that accept an existing dma-buf fd as an input parameter and use it as the backing storage to export to the refs.
The following calls were added:
- IOCTL_GNTDEV_DMABUF_MAP_REFS_TO_BUF: map an existing buffer as the backing storage and export it to the provided grant refs;
- IOCTL_GNTDEV_DMABUF_MAP_RELEASE: detach the buffer from the grant table and set up a notification to unmap the grant refs before the external buffer is released. After this call the external buffer should be destroyed;
- IOCTL_GNTDEV_DMABUF_MAP_WAIT_RELEASED: wait (with a timeout) until the buffer is completely destroyed and the grant refs unmapped, so the remote domain can free the granted pages. Should be called after the buffer has been destroyed.
Our setup is based on an IMX8QM board. We're trying to implement zero-copy support for DomU graphics using the Wayland zwp_linux_dmabuf_v1_interface implementation.
For the dma-buf exporter we used the i.MX8 GPU native tools to create the backing storage for the grant refs received from DomU. The buffer for the backing storage was allocated with the gbm_bo_create call, because the GPU does not support external buffers and requires the backing storage to be created using its native tools (eglCreateImageKHR returns EGL_NO_IMAGE_KHR for buffers which were not created via gbm_bo_create).
This behaviour was also tested on a QEMU setup, using the DRM_IOCTL_MODE_CREATE_DUMB call to create the backing storage buffer.
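To make the intended usage concrete, here is a rough userspace sketch of the calling sequence (the fd names and argument structs are illustrative only; the structs are defined in patch 3, and all error handling is elided):

	/* 1) map the remote domain's grant refs onto the existing dma-buf */
	ioctl(gntdev_fd, IOCTL_GNTDEV_DMABUF_MAP_REFS_TO_BUF, map_op);

	/* ... use the buffer, e.g. attach it via zwp_linux_dmabuf_v1 ... */

	/* 2) detach gntdev; grant refs are unmapped when the buffer is released */
	ioctl(gntdev_fd, IOCTL_GNTDEV_DMABUF_MAP_RELEASE, &release_op);

	/* 3) destroy the buffer (e.g. gbm_bo_destroy()), then wait for the unmap */
	ioctl(gntdev_fd, IOCTL_GNTDEV_DMABUF_MAP_WAIT_RELEASED, &wait_op);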
---
Oleksii Moisieiev (3):
  xen/grant-table: save page_count on map and use it during async unmapping
  dma-buf: add dma buffer release notifier callback
  xen/grant-table: add new ioctls to map dmabuf to existing fd
 drivers/dma-buf/dma-buf.c   |  44 ++++
 drivers/xen/gntdev-common.h |   8 +-
 drivers/xen/gntdev-dmabuf.c | 416 +++++++++++++++++++++++++++++++++++-
 drivers/xen/gntdev-dmabuf.h |   7 +
 drivers/xen/gntdev.c        | 101 ++++++++-
 drivers/xen/grant-table.c   |  73 +++++--
 include/linux/dma-buf.h     |  15 ++
 include/uapi/xen/gntdev.h   |  62 ++++++
 include/xen/grant_table.h   |   8 +
 9 files changed, 703 insertions(+), 31 deletions(-)
Save the reference count of the page before mapping and use this value in the gntdev_unmap_refs_async() call. This is an enhancement of commit 3f9f1c67572f5e5e6dc84216d48d1480f3c4fcf6 ("xen/grant-table: add a mechanism to safely unmap pages that are in use"), whose safe unmapping mechanism defers unmapping of any page that may still be in use (ref count > 1).
This is needed to allow mapping/unmapping pages which have more than one reference. For example, DRM_IOCTL_MODE_CREATE_DUMB creates a dma buffer with page_count = 2, so its unmap call would be deferred for as long as the buffer exists, because the ref count never drops to 1. This means the buffer remains mapped during the DRM_IOCTL_MODE_DESTROY_DUMB call, which causes an error:
Unable to handle kernel paging request at virtual address <addr>
...
Call trace:
 check_move_unevictable_folios+0xb8/0x4d0
 check_move_unevictable_pages+0x8c/0x110
 drm_gem_put_pages+0x118/0x198
 drm_gem_shmem_put_pages_locked+0x4c/0x70
 drm_gem_shmem_unpin+0x30/0x50
 virtio_gpu_cleanup_object+0x84/0x130
 virtio_gpu_cmd_unref_cb+0x18/0x2c
 virtio_gpu_dequeue_ctrl_func+0x124/0x290
 process_one_work+0x1d0/0x320
 worker_thread+0x14c/0x444
 kthread+0x10c/0x110
This enhancement allows the caller to provide the expected page_count at map time, so the refs can be unmapped properly without needless deferrals.
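Schematically, the deferral test in __gnttab_unmap_refs_async() changes like this (simplified from the diff below):

	/* before: defer while anyone else still holds a reference */
	if (page_count(page) > 1)
		/* defer unmap */;

	/* after: defer only while the count exceeds the value saved at map time */
	foreign = xen_page_foreign(page);
	if (page_count(page) > foreign->private)
		/* defer unmap */;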
Signed-off-by: Oleksii Moisieiev <oleksii_moisieiev@epam.com>
---
 drivers/xen/grant-table.c | 16 +++++++++++++++-
 include/xen/grant_table.h |  3 +++
 2 files changed, 18 insertions(+), 1 deletion(-)
diff --git a/drivers/xen/grant-table.c b/drivers/xen/grant-table.c
index e1ec725c2819..d6576c8b6f0f 100644
--- a/drivers/xen/grant-table.c
+++ b/drivers/xen/grant-table.c
@@ -1241,11 +1241,23 @@ int gnttab_map_refs(struct gnttab_map_grant_ref *map_ops,
 		case GNTST_okay:
 		{
 			struct xen_page_foreign *foreign;
+			int page_cnt;
 			SetPageForeign(pages[i]);
 			foreign = xen_page_foreign(pages[i]);
 			foreign->domid = map_ops[i].dom;
 			foreign->gref = map_ops[i].ref;
+			page_cnt = page_count(pages[i]);
+			if (page_cnt > FOREIGN_MAX_PAGE_COUNT) {
+				/*
+				 * The foreign structure can't hold more than
+				 * FOREIGN_MAX_PAGE_COUNT, so save page_count = 1:
+				 * the safe unmap mechanism will then defer
+				 * unmapping until all users stop using this page,
+				 * and the caller has to handle the page users.
+				 */
+				pr_warn_ratelimited("page has too many users, will wait for 0 on unmap\n");
+				foreign->private = 1;
+			} else
+				foreign->private = page_cnt;
 			break;
 		}
@@ -1308,9 +1320,11 @@ static void __gnttab_unmap_refs_async(struct gntab_unmap_queue_data* item)
 {
 	int ret;
 	int pc;
+	struct xen_page_foreign *foreign;
 	for (pc = 0; pc < item->count; pc++) {
-		if (page_count(item->pages[pc]) > 1) {
+		foreign = xen_page_foreign(item->pages[pc]);
+		if (page_count(item->pages[pc]) > foreign->private) {
 			unsigned long delay = GNTTAB_UNMAP_REFS_DELAY * (item->age + 1);
 			schedule_delayed_work(&item->gnttab_work,
 					      msecs_to_jiffies(delay));
diff --git a/include/xen/grant_table.h b/include/xen/grant_table.h
index e279be353e3f..8e220edf44ab 100644
--- a/include/xen/grant_table.h
+++ b/include/xen/grant_table.h
@@ -49,6 +49,7 @@
 #include <linux/mm_types.h>
 #include <linux/page-flags.h>
 #include <linux/kernel.h>
+#include <linux/limits.h>
 /*
  * Technically there's no reliably invalid grant reference or grant handle,
@@ -274,9 +275,11 @@ int gnttab_unmap_refs_sync(struct gntab_unmap_queue_data *item);
 void gnttab_batch_map(struct gnttab_map_grant_ref *batch, unsigned count);
 void gnttab_batch_copy(struct gnttab_copy *batch, unsigned count);
+#define FOREIGN_MAX_PAGE_COUNT U16_MAX
 struct xen_page_foreign {
 	domid_t domid;
+	uint16_t private;
 	grant_ref_t gref;
 };
Add the possibility to register a callback on a dma-buf which is called before the dma_buf->ops->release call. This helps when an external user of the dma-buf has to be notified before the buffer is released, without changing the dma-buf ops. It is needed when an external dma-buf is used as backing storage for a gntdev refs export and the grant refs have to be unmapped before the dma-buf release.
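For illustration, a minimal sketch of how an in-kernel user would consume this API (the callback body and names are hypothetical):

	static void my_release_cb(struct dma_buf *dmabuf, void *priv)
	{
		/* hypothetical cleanup tied to the buffer's lifetime */
	}

	/* returns -EEXIST if a notifier is already registered */
	ret = dma_buf_register_release_notifier(dmabuf, my_release_cb, ctx);
	...
	/* optional early removal; otherwise it fires once before ops->release() */
	dma_buf_unregister_release_notifier(dmabuf);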
Signed-off-by: Oleksii Moisieiev <oleksii_moisieiev@epam.com>
---
 drivers/dma-buf/dma-buf.c | 44 +++++++++++++++++++++++++++++++++++++++
 include/linux/dma-buf.h   | 15 +++++++++++++
 2 files changed, 59 insertions(+)
diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c
index efb4990b29e1..3e663ef92e1f 100644
--- a/drivers/dma-buf/dma-buf.c
+++ b/drivers/dma-buf/dma-buf.c
@@ -25,6 +25,7 @@
 #include <linux/dma-resv.h>
 #include <linux/mm.h>
 #include <linux/mount.h>
+#include <linux/notifier.h>
 #include <linux/pseudo_fs.h>
 #include <uapi/linux/dma-buf.h>
@@ -57,6 +58,46 @@ static char *dmabuffs_dname(struct dentry *dentry, char *buffer, int buflen)
 			     dentry->d_name.name, ret > 0 ? name : "");
 }
+int dma_buf_register_release_notifier(struct dma_buf *dmabuf,
+		ext_release_notifier_cb ext_release_cb, void *priv)
+{
+	int ret = 0;
+
+	spin_lock(&dmabuf->ext_release_lock);
+
+	if (dmabuf->ext_release_cb) {
+		ret = -EEXIST;
+		goto unlock;
+	}
+
+	dmabuf->ext_release_cb = ext_release_cb;
+	dmabuf->ext_release_priv = priv;
+unlock:
+	spin_unlock(&dmabuf->ext_release_lock);
+	return ret;
+}
+EXPORT_SYMBOL_GPL(dma_buf_register_release_notifier);
+
+void dma_buf_unregister_release_notifier(struct dma_buf *dmabuf)
+{
+	spin_lock(&dmabuf->ext_release_lock);
+	dmabuf->ext_release_cb = NULL;
+	spin_unlock(&dmabuf->ext_release_lock);
+}
+EXPORT_SYMBOL_GPL(dma_buf_unregister_release_notifier);
+
+static void dma_buf_call_release_notifier(struct dma_buf *dmabuf)
+{
+	if (!dmabuf->ext_release_cb)
+		return;
+
+	spin_lock(&dmabuf->ext_release_lock);
+	dmabuf->ext_release_cb(dmabuf, dmabuf->ext_release_priv);
+	spin_unlock(&dmabuf->ext_release_lock);
+
+	dma_buf_unregister_release_notifier(dmabuf);
+}
+
 static void dma_buf_release(struct dentry *dentry)
 {
 	struct dma_buf *dmabuf;
@@ -75,6 +116,8 @@ static void dma_buf_release(struct dentry *dentry)
 	BUG_ON(dmabuf->cb_in.active || dmabuf->cb_out.active);
 	dma_buf_stats_teardown(dmabuf);
+	dma_buf_call_release_notifier(dmabuf);
+
 	dmabuf->ops->release(dmabuf);
 	if (dmabuf->resv == (struct dma_resv *)&dmabuf[1])
@@ -642,6 +685,7 @@ struct dma_buf *dma_buf_export(const struct dma_buf_export_info *exp_info)
 	init_waitqueue_head(&dmabuf->poll);
 	dmabuf->cb_in.poll = dmabuf->cb_out.poll = &dmabuf->poll;
 	dmabuf->cb_in.active = dmabuf->cb_out.active = 0;
+	spin_lock_init(&dmabuf->ext_release_lock);
 	if (!resv) {
 		resv = (struct dma_resv *)&dmabuf[1];
diff --git a/include/linux/dma-buf.h b/include/linux/dma-buf.h
index 71731796c8c3..6282d56ac040 100644
--- a/include/linux/dma-buf.h
+++ b/include/linux/dma-buf.h
@@ -287,6 +287,8 @@ struct dma_buf_ops {
 	void (*vunmap)(struct dma_buf *dmabuf, struct iosys_map *map);
 };
+typedef void (*ext_release_notifier_cb)(struct dma_buf *dmabuf, void *priv);
+
 /**
  * struct dma_buf - shared buffer object
  *
@@ -432,6 +434,15 @@ struct dma_buf {
 	 */
 	struct dma_resv *resv;
+	/** @ext_release_cb: notifier callback to call on release */
+	ext_release_notifier_cb ext_release_cb;
+
+	/** @ext_release_priv: private data for the callback */
+	void *ext_release_priv;
+
+	/** @ext_release_lock: spinlock protecting the notifier fields */
+	spinlock_t ext_release_lock;
+
 	/** @poll: for userspace poll support */
 	wait_queue_head_t poll;
@@ -632,4 +643,8 @@ int dma_buf_mmap(struct dma_buf *, struct vm_area_struct *, unsigned long);
 int dma_buf_vmap(struct dma_buf *dmabuf, struct iosys_map *map);
 void dma_buf_vunmap(struct dma_buf *dmabuf, struct iosys_map *map);
+
+int dma_buf_register_release_notifier(struct dma_buf *dmabuf,
+		ext_release_notifier_cb ext_release_cb, void *priv);
+void dma_buf_unregister_release_notifier(struct dma_buf *dmabuf);
 #endif /* __DMA_BUF_H__ */
Add new ioctls to allow gntdev to map a scatter-gather table on top of an existing dma-buf referenced by a file descriptor.
When using the dma-buf exporter to create a dma-buf with backing storage and map it to the grant refs provided by the remote domain, we ran into a problem: some hardware (the i.MX8 GPU in our case) does not support external buffers and requires the backing storage to be created using its native tools. That's why new ioctls were added that accept an existing dma-buf fd as an input parameter and use it as the backing storage to export to the refs.
The following calls were added:
- IOCTL_GNTDEV_DMABUF_MAP_REFS_TO_BUF: map an existing buffer as the backing storage and export it to the provided grant refs;
- IOCTL_GNTDEV_DMABUF_MAP_RELEASE: detach the buffer from the grant table and set up a notification to unmap the grant refs before the external buffer is released. After this call the external buffer should be destroyed;
- IOCTL_GNTDEV_DMABUF_MAP_WAIT_RELEASED: wait (with a timeout) until the buffer is completely destroyed and the grant refs unmapped, so the remote domain can free the granted pages.
Signed-off-by: Oleksii Moisieiev <oleksii_moisieiev@epam.com>
---
 drivers/xen/gntdev-common.h |   8 +-
 drivers/xen/gntdev-dmabuf.c | 416 +++++++++++++++++++++++++++++++++++-
 drivers/xen/gntdev-dmabuf.h |   7 +
 drivers/xen/gntdev.c        | 101 ++++++++-
 drivers/xen/grant-table.c   |  57 +++--
 include/uapi/xen/gntdev.h   |  62 ++++++
 include/xen/grant_table.h   |   5 +
 7 files changed, 626 insertions(+), 30 deletions(-)
diff --git a/drivers/xen/gntdev-common.h b/drivers/xen/gntdev-common.h
index 9c286b2a1900..3b6980df3f9d 100644
--- a/drivers/xen/gntdev-common.h
+++ b/drivers/xen/gntdev-common.h
@@ -61,6 +61,10 @@ struct gntdev_grant_map {
 	bool *being_removed;
 	struct page **pages;
 	unsigned long pages_vm_start;
+	unsigned int preserve_pages;
+
+	/* Needed to avoid allocation in gnttab_dma_free_pages(). */
+	xen_pfn_t *frames;
 #ifdef CONFIG_XEN_GRANT_DMA_ALLOC
 	/*
@@ -73,8 +77,6 @@ struct gntdev_grant_map {
 	int dma_flags;
 	void *dma_vaddr;
 	dma_addr_t dma_bus_addr;
-	/* Needed to avoid allocation in gnttab_dma_free_pages(). */
-	xen_pfn_t *frames;
 #endif
 	/* Number of live grants */
@@ -85,6 +87,8 @@ struct gntdev_grant_map {
 struct gntdev_grant_map *gntdev_alloc_map(struct gntdev_priv *priv, int count,
 					  int dma_flags);
+struct gntdev_grant_map *gntdev_get_alloc_from_fd(struct gntdev_priv *priv,
+		struct sg_table *sgt, int count, int dma_flags);
 
 void gntdev_add_map(struct gntdev_priv *priv, struct gntdev_grant_map *add);
diff --git a/drivers/xen/gntdev-dmabuf.c b/drivers/xen/gntdev-dmabuf.c
index 940e5e9e8a54..71d3bfee72aa 100644
--- a/drivers/xen/gntdev-dmabuf.c
+++ b/drivers/xen/gntdev-dmabuf.c
@@ -10,14 +10,18 @@
 #include <linux/kernel.h>
 #include <linux/errno.h>
+#include <linux/delay.h>
 #include <linux/dma-buf.h>
+#include <linux/dma-resv.h>
 #include <linux/slab.h>
 #include <linux/types.h>
 #include <linux/uaccess.h>
 #include <linux/module.h>
+#include <linux/notifier.h>
 #include <xen/xen.h>
 #include <xen/grant_table.h>
+#include <xen/mem-reservation.h>
#include "gntdev-common.h" #include "gntdev-dmabuf.h" @@ -46,6 +50,18 @@ struct gntdev_dmabuf { /* dma-buf attachment of the imported buffer. */ struct dma_buf_attachment *attach; } imp; + struct { + /* Scatter-gather table of the mapped buffer. */ + struct sg_table *sgt; + /* dma-buf attachment of the mapped buffer. */ + struct dma_buf_attachment *attach; + /* map table */ + struct gntdev_grant_map *map; + /* frames table for memory reservation */ + xen_pfn_t *frames; + + struct gntdev_priv *priv; + } map; } u;
 /* Number of pages this buffer has. */
@@ -57,6 +73,7 @@ struct gntdev_dmabuf {
 struct gntdev_dmabuf_wait_obj {
 	struct list_head next;
 	struct gntdev_dmabuf *gntdev_dmabuf;
+	int fd;
 	struct completion completion;
 };
@@ -72,6 +89,10 @@ struct gntdev_dmabuf_priv {
 	struct list_head exp_wait_list;
 	/* List of imported DMA buffers. */
 	struct list_head imp_list;
+	/* List of mapped DMA buffers. */
+	struct list_head map_list;
+	/* List of wait objects. */
+	struct list_head map_wait_list;
 	/* This is the lock which protects dma_buf_xxx lists. */
 	struct mutex lock;
 	/*
@@ -88,6 +109,64 @@ struct gntdev_dmabuf_priv {
 static void dmabuf_exp_release(struct kref *kref);
+static struct gntdev_dmabuf_wait_obj *
+dmabuf_map_wait_obj_find(struct gntdev_dmabuf_priv *priv, int fd)
+{
+	struct gntdev_dmabuf_wait_obj *obj, *ret = ERR_PTR(-ENOENT);
+
+	mutex_lock(&priv->lock);
+	list_for_each_entry(obj, &priv->map_wait_list, next)
+		if (obj->fd == fd) {
+			pr_debug("Found gntdev_dmabuf in the wait list\n");
+			ret = obj;
+			break;
+		}
+	mutex_unlock(&priv->lock);
+
+	return ret;
+}
+
+static void dmabuf_exp_wait_obj_free(struct gntdev_dmabuf_priv *priv,
+				     struct gntdev_dmabuf_wait_obj *obj);
+
+static int
+dmabuf_map_wait_obj_set(struct gntdev_dmabuf_priv *priv,
+			struct gntdev_dmabuf *gntdev_dmabuf, int fd)
+{
+	struct gntdev_dmabuf_wait_obj *obj;
+
+	obj = dmabuf_map_wait_obj_find(gntdev_dmabuf->priv, fd);
+	if ((!obj) || (PTR_ERR(obj) == -ENOENT)) {
+		obj = kzalloc(sizeof(*obj), GFP_KERNEL);
+		if (!obj)
+			return -ENOMEM;
+	}
+
+	init_completion(&obj->completion);
+	obj->gntdev_dmabuf = gntdev_dmabuf;
+	obj->fd = fd;
+	mutex_lock(&priv->lock);
+	list_add(&obj->next, &priv->map_wait_list);
+	mutex_unlock(&priv->lock);
+	return 0;
+}
+
+static void dmabuf_map_wait_obj_signal(struct gntdev_dmabuf_priv *priv,
+				       struct gntdev_dmabuf *gntdev_dmabuf)
+{
+	struct gntdev_dmabuf_wait_obj *obj;
+
+	mutex_lock(&priv->lock);
+	list_for_each_entry(obj, &priv->map_wait_list, next)
+		if (obj->gntdev_dmabuf == gntdev_dmabuf) {
+			pr_debug("Found gntdev_dmabuf in the wait list, wake\n");
+			complete_all(&obj->completion);
+			break;
+		}
+	mutex_unlock(&priv->lock);
+}
+
 static struct gntdev_dmabuf_wait_obj *
 dmabuf_exp_wait_obj_new(struct gntdev_dmabuf_priv *priv,
 			struct gntdev_dmabuf *gntdev_dmabuf)
@@ -410,6 +489,18 @@ static int dmabuf_exp_from_pages(struct gntdev_dmabuf_export_args *args)
 	return ret;
 }
+static void dmabuf_map_free_gntdev_dmabuf(struct gntdev_dmabuf *gntdev_dmabuf)
+{
+	if (!gntdev_dmabuf)
+		return;
+
+	kfree(gntdev_dmabuf->pages);
+
+	kvfree(gntdev_dmabuf->u.map.frames);
+	kfree(gntdev_dmabuf);
+}
+
 static struct gntdev_grant_map *
 dmabuf_exp_alloc_backing_storage(struct gntdev_priv *priv, int dmabuf_flags,
 				 int count)
@@ -432,6 +523,113 @@ dmabuf_exp_alloc_backing_storage(struct gntdev_priv *priv, int dmabuf_flags,
 	return map;
 }
+static void dmabuf_map_remove(struct gntdev_priv *priv,
+			      struct gntdev_dmabuf *gntdev_dmabuf)
+{
+	dmabuf_exp_remove_map(priv, gntdev_dmabuf->u.map.map);
+	dmabuf_map_free_gntdev_dmabuf(gntdev_dmabuf);
+}
+
+static struct gntdev_dmabuf *
+dmabuf_alloc_gntdev_from_buf(struct gntdev_priv *priv, int fd, int dmabuf_flags,
+			     int count, unsigned int data_ofs)
+{
+	struct gntdev_dmabuf *gntdev_dmabuf;
+	struct dma_buf_attachment *attach;
+	struct dma_buf *dma_buf;
+	struct sg_table *sgt;
+	int ret = 0;
+
+	gntdev_dmabuf = kzalloc(sizeof(*gntdev_dmabuf), GFP_KERNEL);
+	if (!gntdev_dmabuf)
+		return ERR_PTR(-ENOMEM);
+
+	gntdev_dmabuf->pages = kcalloc(count, sizeof(gntdev_dmabuf->pages[0]),
+				       GFP_KERNEL);
+	if (!gntdev_dmabuf->pages) {
+		ret = -ENOMEM;
+		goto free;
+	}
+
+	gntdev_dmabuf->u.map.frames = kvcalloc(count,
+			sizeof(gntdev_dmabuf->u.map.frames[0]), GFP_KERNEL);
+	if (!gntdev_dmabuf->u.map.frames) {
+		ret = -ENOMEM;
+		goto free;
+	}
+
+	if (gntdev_test_page_count(count)) {
+		ret = -EINVAL;
+		goto free;
+	}
+
+	dma_buf = dma_buf_get(fd);
+	if (IS_ERR_OR_NULL(dma_buf)) {
+		pr_debug("Unable to get dmabuf from fd\n");
+		ret = PTR_ERR(dma_buf);
+		goto free;
+	}
+
+	attach = dma_buf_attach(dma_buf, priv->dma_dev);
+	if (IS_ERR_OR_NULL(attach)) {
+		ret = PTR_ERR(attach);
+		goto fail_put;
+	}
+
+	sgt = dma_buf_map_attachment(attach, DMA_BIDIRECTIONAL);
+	if (IS_ERR_OR_NULL(sgt)) {
+		ret = PTR_ERR(sgt);
+		goto fail_detach;
+	}
+
+	if (sgt->sgl->offset != data_ofs) {
+		pr_debug("DMA buffer offset %d, user-space expects %d\n",
+			 sgt->sgl->offset, data_ofs);
+		ret = -EINVAL;
+		goto fail_unmap;
+	}
+
+	/* Check number of pages that imported buffer has. */
+	if (attach->dmabuf->size < count << PAGE_SHIFT) {
+		pr_debug("DMA buffer has %zu bytes, user-space expects %d\n",
+			 attach->dmabuf->size, count << PAGE_SHIFT);
+		ret = -EINVAL;
+		goto fail_unmap;
+	}
+
+	gntdev_dmabuf->u.map.map = gntdev_get_alloc_from_fd(priv, sgt, count,
+							    dmabuf_flags);
+	if (IS_ERR_OR_NULL(gntdev_dmabuf->u.map.map)) {
+		ret = -ENOMEM;
+		goto fail_unmap;
+	}
+
+	gntdev_dmabuf->priv = priv->dmabuf_priv;
+	gntdev_dmabuf->fd = fd;
+	gntdev_dmabuf->u.map.attach = attach;
+	gntdev_dmabuf->u.map.sgt = sgt;
+	gntdev_dmabuf->dmabuf = dma_buf;
+	gntdev_dmabuf->nr_pages = count;
+	gntdev_dmabuf->u.map.priv = priv;
+
+	memcpy(gntdev_dmabuf->pages, gntdev_dmabuf->u.map.map->pages,
+	       count * sizeof(gntdev_dmabuf->u.map.map->pages[0]));
+	memcpy(gntdev_dmabuf->u.map.frames, gntdev_dmabuf->u.map.map->frames,
+	       count * sizeof(gntdev_dmabuf->u.map.map->frames[0]));
+
+	return gntdev_dmabuf;
+
+fail_unmap:
+	dma_buf_unmap_attachment(attach, sgt, DMA_BIDIRECTIONAL);
+fail_detach:
+	dma_buf_detach(dma_buf, attach);
+fail_put:
+	dma_buf_put(dma_buf);
+free:
+	dmabuf_map_free_gntdev_dmabuf(gntdev_dmabuf);
+	return ERR_PTR(ret);
+}
+
 static int dmabuf_exp_from_refs(struct gntdev_priv *priv, int flags,
 				int count, u32 domid, u32 *refs, u32 *fd)
 {
@@ -481,6 +679,117 @@ static int dmabuf_exp_from_refs(struct gntdev_priv *priv, int flags,
 	return ret;
 }
+static void dmabuf_release_notifier_cb(struct dma_buf *dmabuf, void *priv)
+{
+	struct gntdev_dmabuf *gntdev_dmabuf = priv;
+
+	if (!gntdev_dmabuf)
+		return;
+
+	dmabuf_map_remove(gntdev_dmabuf->u.map.priv, gntdev_dmabuf);
+	dmabuf_map_wait_obj_signal(gntdev_dmabuf->priv, gntdev_dmabuf);
+}
+
+static int dmabuf_detach_map(struct gntdev_dmabuf *gntdev_dmabuf)
+{
+	struct dma_buf *dma_buf = gntdev_dmabuf->dmabuf;
+	long lret;
+
+	/* Wait on any implicit fences */
+	lret = dma_resv_wait_timeout(dma_buf->resv, dma_resv_usage_rw(true),
+				     true, MAX_SCHEDULE_TIMEOUT);
+	if (lret == 0)
+		return -ETIME;
+	else if (lret < 0)
+		return lret;
+
+	if (gntdev_dmabuf->u.map.sgt)
+		dma_buf_unmap_attachment(gntdev_dmabuf->u.map.attach,
+					 gntdev_dmabuf->u.map.sgt,
+					 DMA_BIDIRECTIONAL);
+
+	dma_buf_detach(dma_buf, gntdev_dmabuf->u.map.attach);
+	dma_buf_put(dma_buf);
+
+	return 0;
+}
+
+static int dmabuf_map_release(struct gntdev_dmabuf *gntdev_dmabuf, bool sync)
+{
+	int ret;
+
+	if (!sync) {
+		ret = dmabuf_map_wait_obj_set(gntdev_dmabuf->priv,
+					      gntdev_dmabuf, gntdev_dmabuf->fd);
+		if (ret)
+			return ret;
+	}
+
+	ret = dmabuf_detach_map(gntdev_dmabuf);
+	if (ret)
+		return ret;
+
+	if (!sync) {
+		ret = dma_buf_register_release_notifier(gntdev_dmabuf->dmabuf,
+				&dmabuf_release_notifier_cb, gntdev_dmabuf);
+		if (ret)
+			return ret;
+	} else {
+		dmabuf_map_remove(gntdev_dmabuf->u.map.priv, gntdev_dmabuf);
+	}
+
+	return 0;
+}
+
+static int dmabuf_map_refs_to_fd(struct gntdev_priv *priv, int flags,
+				 int count, u32 domid, u32 *refs, u32 fd,
+				 unsigned int data_ofs)
+{
+	struct gntdev_dmabuf *gntdev_dmabuf;
+	int i, ret;
+
+	gntdev_dmabuf = dmabuf_alloc_gntdev_from_buf(priv, fd, flags, count,
+						     data_ofs);
+	if (IS_ERR_OR_NULL(gntdev_dmabuf)) {
+		ret = PTR_ERR(gntdev_dmabuf);
+		goto fail_gntdev;
+	}
+
+	for (i = 0; i < count; i++) {
+		gntdev_dmabuf->u.map.map->grants[i].domid = domid;
+		gntdev_dmabuf->u.map.map->grants[i].ref = refs[i];
+	}
+
+	mutex_lock(&priv->lock);
+	gntdev_add_map(priv, gntdev_dmabuf->u.map.map);
+	mutex_unlock(&priv->lock);
+
+	gntdev_dmabuf->u.map.map->flags |= GNTMAP_host_map;
+#if defined(CONFIG_X86)
+	gntdev_dmabuf->u.map.map->flags |= GNTMAP_device_map;
+#endif
+
+	ret = gntdev_map_grant_pages(gntdev_dmabuf->u.map.map);
+	if (ret < 0)
+		goto fail;
+
+	mutex_lock(&priv->lock);
+	list_add(&gntdev_dmabuf->next, &priv->dmabuf_priv->map_list);
+	mutex_unlock(&priv->lock);
+
+	return 0;
+
+fail:
+	mutex_lock(&priv->lock);
+	list_del(&gntdev_dmabuf->u.map.map->next);
+	mutex_unlock(&priv->lock);
+	dmabuf_detach_map(gntdev_dmabuf);
+	dmabuf_map_free_gntdev_dmabuf(gntdev_dmabuf);
+fail_gntdev:
+	return ret;
+}
+
 /* DMA buffer import support. */
 static int
@@ -673,14 +982,15 @@ dmabuf_imp_to_refs(struct gntdev_dmabuf_priv *priv, struct device *dev,
  * it from the buffer's list.
  */
 static struct gntdev_dmabuf *
-dmabuf_imp_find_unlink(struct gntdev_dmabuf_priv *priv, int fd)
+dmabuf_list_find_unlink(struct gntdev_dmabuf_priv *priv, struct list_head *list,
+			int fd)
 {
 	struct gntdev_dmabuf *q, *gntdev_dmabuf, *ret = ERR_PTR(-ENOENT);
 	mutex_lock(&priv->lock);
-	list_for_each_entry_safe(gntdev_dmabuf, q, &priv->imp_list, next) {
+	list_for_each_entry_safe(gntdev_dmabuf, q, list, next) {
 		if (gntdev_dmabuf->fd == fd) {
-			pr_debug("Found gntdev_dmabuf in the import list\n");
+			pr_debug("Found gntdev_dmabuf in the list\n");
 			ret = gntdev_dmabuf;
 			list_del(&gntdev_dmabuf->next);
 			break;
@@ -696,7 +1006,7 @@ static int dmabuf_imp_release(struct gntdev_dmabuf_priv *priv, u32 fd)
 	struct dma_buf_attachment *attach;
 	struct dma_buf *dma_buf;
-	gntdev_dmabuf = dmabuf_imp_find_unlink(priv, fd);
+	gntdev_dmabuf = dmabuf_list_find_unlink(priv, &priv->imp_list, fd);
 	if (IS_ERR(gntdev_dmabuf))
 		return PTR_ERR(gntdev_dmabuf);
@@ -726,6 +1036,21 @@ static void dmabuf_imp_release_all(struct gntdev_dmabuf_priv *priv)
 		dmabuf_imp_release(priv, gntdev_dmabuf->fd);
 }
+static void dmabuf_map_release_all(struct gntdev_dmabuf_priv *priv)
+{
+	struct gntdev_dmabuf *q, *gntdev_dmabuf;
+	struct gntdev_dmabuf_wait_obj *o, *obj;
+
+	list_for_each_entry_safe(obj, o, &priv->map_wait_list, next)
+		dmabuf_exp_wait_obj_free(priv, obj);
+
+	list_for_each_entry_safe(gntdev_dmabuf, q, &priv->map_list, next)
+		dmabuf_map_release(gntdev_dmabuf, true);
+}
+
 /* DMA buffer IOCTL support. */
 long gntdev_ioctl_dmabuf_exp_from_refs(struct gntdev_priv *priv, int use_ptemod,
@@ -769,6 +1094,47 @@ long gntdev_ioctl_dmabuf_exp_from_refs(struct gntdev_priv *priv, int use_ptemod,
 	return ret;
 }
+long gntdev_ioctl_dmabuf_map_refs_to_buf(struct gntdev_priv *priv, int use_ptemod,
+		struct ioctl_gntdev_dmabuf_map_refs_to_buf __user *u)
+{
+	struct ioctl_gntdev_dmabuf_map_refs_to_buf op;
+	u32 *refs;
+	long ret;
+
+	if (use_ptemod) {
+		pr_debug("Cannot provide dma-buf: use_ptemod %d\n",
+			 use_ptemod);
+		return -EINVAL;
+	}
+
+	if (copy_from_user(&op, u, sizeof(op)) != 0)
+		return -EFAULT;
+
+	if (op.count <= 0)
+		return -EINVAL;
+
+	refs = kcalloc(op.count, sizeof(*refs), GFP_KERNEL);
+	if (!refs)
+		return -ENOMEM;
+
+	if (copy_from_user(refs, u->refs, sizeof(*refs) * op.count) != 0) {
+		ret = -EFAULT;
+		goto out;
+	}
+
+	ret = dmabuf_map_refs_to_fd(priv, op.flags, op.count,
+				    op.domid, refs, op.fd, op.data_ofs);
+	if (ret)
+		goto out;
+
+	if (copy_to_user(u, &op, sizeof(op)) != 0)
+		ret = -EFAULT;
+
+out:
+	kfree(refs);
+	return ret;
+}
+
 long gntdev_ioctl_dmabuf_exp_wait_released(struct gntdev_priv *priv,
 					   struct ioctl_gntdev_dmabuf_exp_wait_released __user *u)
 {
@@ -823,6 +1189,45 @@ long gntdev_ioctl_dmabuf_imp_release(struct gntdev_priv *priv,
 	return dmabuf_imp_release(priv->dmabuf_priv, op.fd);
 }
+long gntdev_ioctl_dmabuf_map_release(struct gntdev_priv *priv,
+		struct ioctl_gntdev_dmabuf_map_release __user *u)
+{
+	struct ioctl_gntdev_dmabuf_map_release op;
+	struct gntdev_dmabuf *gntdev_dmabuf;
+
+	if (copy_from_user(&op, u, sizeof(op)) != 0)
+		return -EFAULT;
+
+	gntdev_dmabuf = dmabuf_list_find_unlink(priv->dmabuf_priv,
+			&priv->dmabuf_priv->map_list, op.fd);
+	if (IS_ERR(gntdev_dmabuf))
+		return PTR_ERR(gntdev_dmabuf);
+
+	return dmabuf_map_release(gntdev_dmabuf, false);
+}
+
+long gntdev_ioctl_dmabuf_map_wait_released(struct gntdev_priv *priv,
+		struct ioctl_gntdev_dmabuf_map_wait_released __user *u)
+{
+	struct ioctl_gntdev_dmabuf_map_wait_released op;
+	struct gntdev_dmabuf_wait_obj *obj;
+	int ret = 0;
+
+	if (copy_from_user(&op, u, sizeof(op)) != 0)
+		return -EFAULT;
+
+	obj = dmabuf_map_wait_obj_find(priv->dmabuf_priv, op.fd);
+	if (IS_ERR_OR_NULL(obj))
+		return (PTR_ERR(obj) == -ENOENT) ? 0 : PTR_ERR(obj);
+
+	if (!completion_done(&obj->completion))
+		ret = dmabuf_exp_wait_obj_wait(obj, op.wait_to_ms);
+
+	if (!ret && ret != -ETIMEDOUT)
+		dmabuf_exp_wait_obj_free(priv->dmabuf_priv, obj);
+
+	return ret;
+}
+
 struct gntdev_dmabuf_priv *gntdev_dmabuf_init(struct file *filp)
 {
 	struct gntdev_dmabuf_priv *priv;
@@ -835,6 +1240,8 @@ struct gntdev_dmabuf_priv *gntdev_dmabuf_init(struct file *filp)
 	INIT_LIST_HEAD(&priv->exp_list);
 	INIT_LIST_HEAD(&priv->exp_wait_list);
 	INIT_LIST_HEAD(&priv->imp_list);
+	INIT_LIST_HEAD(&priv->map_list);
+	INIT_LIST_HEAD(&priv->map_wait_list);
 	priv->filp = filp;
@@ -844,5 +1251,6 @@ struct gntdev_dmabuf_priv *gntdev_dmabuf_init(struct file *filp)
 void gntdev_dmabuf_fini(struct gntdev_dmabuf_priv *priv)
 {
 	dmabuf_imp_release_all(priv);
+	dmabuf_map_release_all(priv);
 	kfree(priv);
 }
diff --git a/drivers/xen/gntdev-dmabuf.h b/drivers/xen/gntdev-dmabuf.h
index 3d9b9cf9d5a1..07301f12ac52 100644
--- a/drivers/xen/gntdev-dmabuf.h
+++ b/drivers/xen/gntdev-dmabuf.h
@@ -21,6 +21,9 @@ void gntdev_dmabuf_fini(struct gntdev_dmabuf_priv *priv);
 long gntdev_ioctl_dmabuf_exp_from_refs(struct gntdev_priv *priv, int use_ptemod,
 				       struct ioctl_gntdev_dmabuf_exp_from_refs __user *u);
+long gntdev_ioctl_dmabuf_map_refs_to_buf(struct gntdev_priv *priv, int use_ptemod,
+		struct ioctl_gntdev_dmabuf_map_refs_to_buf __user *u);
+
 long gntdev_ioctl_dmabuf_exp_wait_released(struct gntdev_priv *priv,
 					   struct ioctl_gntdev_dmabuf_exp_wait_released __user *u);
@@ -30,4 +33,8 @@ long gntdev_ioctl_dmabuf_imp_to_refs(struct gntdev_priv *priv,
 long gntdev_ioctl_dmabuf_imp_release(struct gntdev_priv *priv,
 				     struct ioctl_gntdev_dmabuf_imp_release __user *u);
+long gntdev_ioctl_dmabuf_map_release(struct gntdev_priv *priv,
+		struct ioctl_gntdev_dmabuf_map_release __user *u);
+long gntdev_ioctl_dmabuf_map_wait_released(struct gntdev_priv *priv,
+		struct ioctl_gntdev_dmabuf_map_wait_released __user *u);
 #endif
diff --git a/drivers/xen/gntdev.c b/drivers/xen/gntdev.c
index 4d9a3050de6a..677a51244bb2 100644
--- a/drivers/xen/gntdev.c
+++ b/drivers/xen/gntdev.c
@@ -22,6 +22,7 @@
 #define pr_fmt(fmt) "xen:" KBUILD_MODNAME ": " fmt
+#include <linux/dma-buf.h>
 #include <linux/dma-mapping.h>
 #include <linux/module.h>
 #include <linux/kernel.h>
@@ -43,6 +44,7 @@
 #include <xen/gntdev.h>
 #include <xen/events.h>
 #include <xen/page.h>
+#include <xen/mem-reservation.h>
 #include <asm/xen/hypervisor.h>
 #include <asm/xen/hypercall.h>
@@ -96,7 +98,11 @@ static void gntdev_free_map(struct gntdev_grant_map *map)
 		return;
 #ifdef CONFIG_XEN_GRANT_DMA_ALLOC
-	if (map->dma_vaddr) {
+	if (map->pages && map->preserve_pages) {
+		gnttab_dma_clean_page_reservation(map->count, map->pages,
+						  map->frames);
+	} else if (map->dma_vaddr) {
 		struct gnttab_dma_alloc_args args;
 		args.dev = map->dma_dev;
@@ -216,6 +222,82 @@ struct gntdev_grant_map *gntdev_alloc_map(struct gntdev_priv *priv, int count,
 	return NULL;
 }
+struct gntdev_grant_map *gntdev_get_alloc_from_fd(struct gntdev_priv *priv,
+		struct sg_table *sgt, int count, int dma_flags)
+{
+	struct gntdev_grant_map *add;
+	int i = 0;
+	struct sg_page_iter sg_iter;
+
+	add = kzalloc(sizeof(*add), GFP_KERNEL);
+	if (!add)
+		return NULL;
+
+	add->grants = kvcalloc(count, sizeof(add->grants[0]), GFP_KERNEL);
+	add->map_ops = kvcalloc(count, sizeof(add->map_ops[0]), GFP_KERNEL);
+	add->unmap_ops = kvcalloc(count, sizeof(add->unmap_ops[0]), GFP_KERNEL);
+	add->pages = kvcalloc(count, sizeof(add->pages[0]), GFP_KERNEL);
+	add->frames = kvcalloc(count, sizeof(add->frames[0]), GFP_KERNEL);
+	add->being_removed = kvcalloc(count, sizeof(add->being_removed[0]),
+				      GFP_KERNEL);
+	add->preserve_pages = 1;
+
+	if (add->grants == NULL ||
+	    add->map_ops == NULL ||
+	    add->unmap_ops == NULL ||
+	    add->pages == NULL ||
+	    add->frames == NULL ||
+	    add->being_removed == NULL)
+		goto err;
+
+	if (use_ptemod) {
+		add->kmap_ops = kvmalloc_array(count, sizeof(add->kmap_ops[0]),
+					       GFP_KERNEL);
+		add->kunmap_ops = kvmalloc_array(count, sizeof(add->kunmap_ops[0]),
+						 GFP_KERNEL);
+		if (NULL == add->kmap_ops || NULL == add->kunmap_ops)
+			goto err;
+	}
+
+	for_each_sgtable_page(sgt, &sg_iter, 0) {
+		struct page *page = sg_page_iter_page(&sg_iter);
+
+		add->pages[i] = page;
+		add->frames[i] = xen_page_to_gfn(page);
+		i++;
+		if (i >= count)
+			break;
+	}
+
+	if (i < count) {
+		pr_debug("Provided buffer is too small\n");
+		goto err;
+	}
+
+	if (gnttab_dma_reserve_pages(count, add->pages, add->frames))
+		goto err;
+
+	for (i = 0; i < count; i++) {
+		add->map_ops[i].handle = -1;
+		add->unmap_ops[i].handle = -1;
+		if (use_ptemod) {
+			add->kmap_ops[i].handle = -1;
+			add->kunmap_ops[i].handle = -1;
+		}
+	}
+
+	add->index = 0;
+	add->count = count;
+	refcount_set(&add->users, 1);
+
+	return add;
+
+err:
+	gntdev_free_map(add);
+	return NULL;
+}
+
 void gntdev_add_map(struct gntdev_priv *priv, struct gntdev_grant_map *add)
 {
 	struct gntdev_grant_map *map;
@@ -610,6 +692,9 @@ static int gntdev_release(struct inode *inode, struct file *flip)
 	struct gntdev_grant_map *map;
pr_debug("priv %p\n", priv); +#ifdef CONFIG_XEN_GNTDEV_DMABUF + gntdev_dmabuf_fini(priv->dmabuf_priv); +#endif
 	mutex_lock(&priv->lock);
 	while (!list_empty(&priv->maps)) {
@@ -620,10 +705,6 @@ static int gntdev_release(struct inode *inode, struct file *flip)
 	}
 	mutex_unlock(&priv->lock);
-#ifdef CONFIG_XEN_GNTDEV_DMABUF
-	gntdev_dmabuf_fini(priv->dmabuf_priv);
-#endif
-
 	kfree(priv);
 	return 0;
 }
@@ -1020,6 +1101,16 @@ static long gntdev_ioctl(struct file *flip,
 	case IOCTL_GNTDEV_DMABUF_IMP_RELEASE:
 		return gntdev_ioctl_dmabuf_imp_release(priv, ptr);
+
+	case IOCTL_GNTDEV_DMABUF_MAP_REFS_TO_BUF:
+		return gntdev_ioctl_dmabuf_map_refs_to_buf(priv, use_ptemod, ptr);
+
+	case IOCTL_GNTDEV_DMABUF_MAP_RELEASE:
+		return gntdev_ioctl_dmabuf_map_release(priv, ptr);
+
+	case IOCTL_GNTDEV_DMABUF_MAP_WAIT_RELEASED:
+		return gntdev_ioctl_dmabuf_map_wait_released(priv, ptr);
+
 #endif
 	default:
diff --git a/drivers/xen/grant-table.c b/drivers/xen/grant-table.c
index d6576c8b6f0f..257e335bc65b 100644
--- a/drivers/xen/grant-table.c
+++ b/drivers/xen/grant-table.c
@@ -1036,6 +1036,40 @@ void gnttab_free_pages(int nr_pages, struct page **pages)
 }
 EXPORT_SYMBOL_GPL(gnttab_free_pages);
+int gnttab_dma_reserve_pages(int nr_pages, struct page **pages,
+			     xen_pfn_t *frames)
+{
+	int ret, i;
+
+	for (i = 0; i < nr_pages; i++)
+		xenmem_reservation_scrub_page(pages[i]);
+
+	xenmem_reservation_va_mapping_reset(nr_pages, pages);
+
+	ret = xenmem_reservation_decrease(nr_pages, frames);
+	if (ret != nr_pages)
+		return ret;
+
+	return 0;
+}
+EXPORT_SYMBOL_GPL(gnttab_dma_reserve_pages);
+
+int gnttab_dma_clean_page_reservation(int nr_pages, struct page **pages,
+				      xen_pfn_t *frames)
+{
+	int ret;
+
+	ret = xenmem_reservation_increase(nr_pages, frames);
+	if (ret != nr_pages) {
+		pr_debug("Failed to increase reservation for DMA buffer\n");
+		return -EFAULT;
+	}
+
+	xenmem_reservation_va_mapping_update(nr_pages, pages, frames);
+	return 0;
+}
+EXPORT_SYMBOL_GPL(gnttab_dma_clean_page_reservation);
+
 #ifdef CONFIG_XEN_GRANT_DMA_ALLOC
 /**
  * gnttab_dma_alloc_pages - alloc DMAable pages suitable for grant mapping into
@@ -1071,17 +1105,11 @@ int gnttab_dma_alloc_pages(struct gnttab_dma_alloc_args *args)
 		args->pages[i] = page;
 		args->frames[i] = xen_page_to_gfn(page);
-		xenmem_reservation_scrub_page(page);
 	}
-	xenmem_reservation_va_mapping_reset(args->nr_pages, args->pages);
-
-	ret = xenmem_reservation_decrease(args->nr_pages, args->frames);
-	if (ret != args->nr_pages) {
-		pr_debug("Failed to decrease reservation for DMA buffer\n");
-		ret = -EFAULT;
+	ret = gnttab_dma_reserve_pages(args->nr_pages, args->pages, args->frames);
+	if (ret)
 		goto fail;
-	}
 	ret = gnttab_pages_set_private(args->nr_pages, args->pages);
 	if (ret < 0)
@@ -1109,17 +1137,8 @@ int gnttab_dma_free_pages(struct gnttab_dma_alloc_args *args)
 	for (i = 0; i < args->nr_pages; i++)
 		args->frames[i] = page_to_xen_pfn(args->pages[i]);
-	ret = xenmem_reservation_increase(args->nr_pages, args->frames);
-	if (ret != args->nr_pages) {
-		pr_debug("Failed to increase reservation for DMA buffer\n");
-		ret = -EFAULT;
-	} else {
-		ret = 0;
-	}
-
-	xenmem_reservation_va_mapping_update(args->nr_pages, args->pages,
-					     args->frames);
-
+	ret = gnttab_dma_clean_page_reservation(args->nr_pages, args->pages,
+						args->frames);
 	size = args->nr_pages << PAGE_SHIFT;
 	if (args->coherent)
 		dma_free_coherent(args->dev, size,
diff --git a/include/uapi/xen/gntdev.h b/include/uapi/xen/gntdev.h
index 7a7145395c09..cadc7fd9bc9c 100644
--- a/include/uapi/xen/gntdev.h
+++ b/include/uapi/xen/gntdev.h
@@ -312,4 +312,66 @@ struct ioctl_gntdev_dmabuf_imp_release {
 	__u32 reserved;
 };
+/*
+ * Fd mapping ioctls allow mapping @fd to @refs.
+ *
+ * They allow gntdev to map a scatter-gather table to an existing dma-buf
+ * file descriptor. This provides the same functionality as the
+ * DMABUF_EXP_FROM_REFS_V2 ioctls, but maps the sg table on top of the
+ * existing buffer memory instead of allocating memory. This is useful
+ * when the exporter has to work with an external buffer.
+ */
+
+#define IOCTL_GNTDEV_DMABUF_MAP_REFS_TO_BUF \
+	_IOC(_IOC_NONE, 'G', 15, \
+	     sizeof(struct ioctl_gntdev_dmabuf_map_refs_to_buf))
+struct ioctl_gntdev_dmabuf_map_refs_to_buf {
+	/* IN parameters. */
+	/* Specific options for this dma-buf: see GNTDEV_DMA_FLAG_XXX. */
+	__u32 flags;
+	/* Number of grant references in @refs array. */
+	__u32 count;
+	/* Offset of the data in the dma-buf. */
+	__u32 data_ofs;
+	/* File descriptor of the dma-buf. */
+	__u32 fd;
+	/* The domain ID of the grant references to be mapped. */
+	__u32 domid;
+	/* Variable IN parameter. */
+	/* Array of grant references of size @count. */
+	__u32 refs[1];
+};
+
+/*
+ * This will release the gntdev attachment to the provided buffer with file
+ * descriptor @fd, so it can be released by its owner. This is only valid
+ * for buffers created with IOCTL_GNTDEV_DMABUF_MAP_REFS_TO_BUF.
+ * Returns 0 on success, or -ETIME when waiting for the dma-buf fences to
+ * be cleaned reached the timeout. In that case the release call should be
+ * repeated after the dma-buf fences have been released.
+ */
+#define IOCTL_GNTDEV_DMABUF_MAP_RELEASE \
+	_IOC(_IOC_NONE, 'G', 16, \
+	     sizeof(struct ioctl_gntdev_dmabuf_map_release))
+struct ioctl_gntdev_dmabuf_map_release {
+	/* IN parameters */
+	__u32 fd;
+	__u32 reserved;
+};
+
+/*
+ * This will wait until the gntdev release procedure is finished and the
+ * buffer has been released completely. This is only valid for buffers
+ * created with IOCTL_GNTDEV_DMABUF_MAP_REFS_TO_BUF.
+ */
+#define IOCTL_GNTDEV_DMABUF_MAP_WAIT_RELEASED \
+	_IOC(_IOC_NONE, 'G', 17, \
+	     sizeof(struct ioctl_gntdev_dmabuf_map_wait_released))
+struct ioctl_gntdev_dmabuf_map_wait_released {
+	/* IN parameters */
+	__u32 fd;
+	__u32 wait_to_ms;
+};
+
 #endif /* __LINUX_PUBLIC_GNTDEV_H__ */
diff --git a/include/xen/grant_table.h b/include/xen/grant_table.h
index 8e220edf44ab..73b473474ac9 100644
--- a/include/xen/grant_table.h
+++ b/include/xen/grant_table.h
@@ -250,6 +250,11 @@ int gnttab_dma_alloc_pages(struct gnttab_dma_alloc_args *args);
 int gnttab_dma_free_pages(struct gnttab_dma_alloc_args *args);
 #endif
+int gnttab_dma_reserve_pages(int nr_pages, struct page **pages,
+			     xen_pfn_t *frames);
+int gnttab_dma_clean_page_reservation(int nr_pages, struct page **pages,
+				      xen_pfn_t *frames);
+
 int gnttab_pages_set_private(int nr_pages, struct page **pages);
 void gnttab_pages_clear_private(int nr_pages, struct page **pages);
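Since struct ioctl_gntdev_dmabuf_map_refs_to_buf ends in a variable-length refs[] array, userspace has to size the allocation itself. A hedged example (nr_refs, refs[], domid, dmabuf_fd and gntdev_fd are assumed to come from the application; <stddef.h>, <string.h>, <sys/ioctl.h> and the gntdev uapi header are assumed included):

	size_t sz = offsetof(struct ioctl_gntdev_dmabuf_map_refs_to_buf, refs) +
		    nr_refs * sizeof(__u32);
	struct ioctl_gntdev_dmabuf_map_refs_to_buf *op = calloc(1, sz);

	op->fd = dmabuf_fd;	/* existing dma-buf used as backing storage */
	op->count = nr_refs;
	op->domid = domid;
	memcpy(op->refs, refs, nr_refs * sizeof(__u32));

	if (ioctl(gntdev_fd, IOCTL_GNTDEV_DMABUF_MAP_REFS_TO_BUF, op) < 0)
		/* handle error */;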
Sorry for the messed up mail. We currently have mail problems here at AMD.
________________________________
From: Oleksii Moisieiev <Oleksii_Moisieiev@epam.com>
Sent: Monday, 2 January 2023, 14:41
To: jgross@suse.com
Cc: Oleksii Moisieiev <Oleksii_Moisieiev@epam.com>; Stefano Stabellini <sstabellini@kernel.org>; Oleksandr Tyshchenko <Oleksandr_Tyshchenko@epam.com>; xen-devel@lists.xenproject.org; linux-kernel@vger.kernel.org; Sumit Semwal <sumit.semwal@linaro.org>; Koenig, Christian <Christian.Koenig@amd.com>; linux-media@vger.kernel.org; dri-devel@lists.freedesktop.org; linaro-mm-sig@lists.linaro.org
Subject: [PATCH v1 0/3] Add ioctls to map grant refs on the external backing storage
This is a pretty big NAK from my side to this approach.
If you need to replace a file descriptor number local to your process then you can simply use dup2() from userspace.
If your intention here is to replace the backing store of the fd on all processes which currently have it open then please just completely forget that. This will *NEVER* ever work correctly.
Regards, Christian.
On 02.01.23 18:36, Koenig, Christian wrote:
This is a pretty big NAK from my side to this approach.
If you need to replace a file descriptor number local to your process then you can simply use dup2() from userspace.
If your intention here is to replace the backing store of the fd on all processes which currently have it open then please just completely forget that. This will *NEVER* ever work correctly.
Regards, Christian.
Hello Christian,
Thank you for the quick response.
My goal is to provide a correct buffer to interfaces such as zwp_linux_dmabuf_v1_interface, so the zero-copy feature will work. My suggestion is to give the possibility to use a specific buffer as the backing storage, where the caller takes responsibility for the provided buffer having the correct format.
In our case we are using these calls in the following way:

1) Get grefs from another domain (we're working on a virtualized system where different domains run as standalone VMs and share resources);
2) Create a buffer using gbm_bo_create and receive an fd (i.MX8 requires the EGL API to be used for buffer allocation, otherwise eglCreateImageKHR returns EGL_NO_IMAGE_KHR during param setting for zwp_linux_dmabuf);
3) Call IOCTL_GNTDEV_DMABUF_MAP_REFS_TO_BUF to map the grefs using the fd as the backing storage;
4) Use zwp_linux_dmabuf_v1_interface and the zero-copy feature with Wayland;
5) After work is finished, call IOCTL_GNTDEV_DMABUF_MAP_RELEASE to unmap the grant refs (see the sketch after this list);
6) Call gbm_bo_destroy to release the allocated BO object;
7) Call IOCTL_GNTDEV_DMABUF_MAP_WAIT_RELEASED to wait until the mapping is released completely (this includes releasing the grant refs on the domain side).
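A rough sketch of steps 5)-7) (the variable names and timeout values are illustrative; the -ETIME retry follows the IOCTL_GNTDEV_DMABUF_MAP_RELEASE description in patch 3):

	/* 5) detach gntdev; retry while implicit fences are still pending */
	while (ioctl(gntdev_fd, IOCTL_GNTDEV_DMABUF_MAP_RELEASE, &release_op) &&
	       errno == ETIME)
		usleep(1000);

	/* 6) drop the last local reference to the buffer */
	gbm_bo_destroy(bo);

	/* 7) wait until the grant refs are unmapped */
	wait_op.fd = release_op.fd;
	wait_op.wait_to_ms = 1000;
	ioctl(gntdev_fd, IOCTL_GNTDEV_DMABUF_MAP_WAIT_RELEASED, &wait_op);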
I've tested my changes on an IMX8QM board, using gbm_bo_create to allocate the buffer, and on a QEMU setup, using DRM_IOCTL_MODE_CREATE_DUMB for buffer allocation. I can provide the test applications I used for testing.
Best regards,
Oleksii.
Hello,
Just wanted to add some additional information, hoping it will help to start the discussion and find a correct approach:

My goal is to provide a correct buffer to interfaces such as zwp_linux_dmabuf_v1_interface, so the zero-copy feature will work. My suggestion is to give the possibility to use a specific buffer as the backing storage, where the caller takes responsibility for the provided buffer having the correct format.

In our case we are using these calls in the following way:

1) Get grefs from another domain (we're working on a virtualized system where different domains run as standalone VMs and share resources);
2) Create a buffer using gbm_bo_create and receive an fd (i.MX8 requires the EGL API to be used for buffer allocation, otherwise eglCreateImageKHR returns EGL_NO_IMAGE_KHR during param setting for zwp_linux_dmabuf);
3) Call IOCTL_GNTDEV_DMABUF_MAP_REFS_TO_BUF to map the grefs using the fd as the backing storage;
4) Use zwp_linux_dmabuf_v1_interface and the zero-copy feature with Wayland;
5) After work is finished, call IOCTL_GNTDEV_DMABUF_MAP_RELEASE to unmap the grant refs;
6) Call gbm_bo_destroy to release the allocated BO object;
7) Call IOCTL_GNTDEV_DMABUF_MAP_WAIT_RELEASED to wait until the mapping is released completely (this includes releasing the grant refs on the domain side).

I've tested my changes on an IMX8QM board, using gbm_bo_create to allocate the buffer, and on a QEMU setup, using DRM_IOCTL_MODE_CREATE_DUMB for buffer allocation.
There was a suggestion to reverse the roles between exporter and importer and allocate the buffer on the other side, but this approach doesn't fit our setup.

I have a board which starts the Xen hypervisor with several virtual domains: one domain (Dom0) has access to the hardware and is the backend side for graphics sharing, while another domain (DomU) is the frontend side and runs its own Weston instance with zero-copy on the buffer provided by the backend.

The configuration where Dom0, as the driver domain, allocates the dma-buf and then exports it to DomU is not meant to be supported; see https://nvd.nist.gov/vuln/detail/CVE-2021-26934. The vulnerability is that there is no way to free a buffer which was exported to DomU if DomU crashes. That's why I'm trying to come up with a solution which works without switching importer and exporter.

I'm aware that my solution may be (and is) not correct, but my intent is to start the discussion in order to produce an appropriate solution.
Best regards,
Oleksii.