Hi everybody,
The core idea of this patch set is that DMA-buf importers can now provide an optional invalidate callback. Using this callback and the reservation object, exporters can avoid pinning DMA-buf memory for long periods while sharing it between devices.
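Roughly sketched, the idea looks like the following. This is an illustration only: the struct name is borrowed from a later patch in the series, and the invalidate member here is purely illustrative, not the exact interface.

/* Illustrative sketch only -- not the exact interface from this series. */
struct dma_buf_attach_info {
	struct dma_buf *dmabuf;	/* buffer to attach to */
	struct device *dev;	/* importing device */
	/*
	 * Hypothetical importer callback: called by the exporter with
	 * dmabuf->resv locked before the backing storage moves. The
	 * importer must tear down its mappings and re-map on next use.
	 */
	void (*invalidate)(struct dma_buf_attachment *attach);
};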
I already sent out an older version roughly a year ago, but didn't have time to look further into cleaning this up.

Last time the major problem was that we would have had to fix up all drivers implementing DMA-buf at once.

Now I avoid this by allowing mappings to be cached in the DMA-buf attachment, so drivers can optionally move over to the new interface one by one.
This is also a prerequisite for my patch set enabling sharing of device memory with DMA-buf.

Please review and/or comment,
Christian.
To allow a smooth transition from pinning buffer objects to dynamic invalidation, we first start to cache the sg_table for an attachment unless the driver explicitly says not to do so.
Signed-off-by: Christian König <christian.koenig@amd.com>
---
 drivers/dma-buf/dma-buf.c | 24 ++++++++++++++++++++++++
 include/linux/dma-buf.h   | 11 +++++++++++
 2 files changed, 35 insertions(+)
diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c
index 7c858020d14b..65161a82d4d5 100644
--- a/drivers/dma-buf/dma-buf.c
+++ b/drivers/dma-buf/dma-buf.c
@@ -573,6 +573,20 @@ struct dma_buf_attachment *dma_buf_attach(struct dma_buf *dmabuf,
 	list_add(&attach->node, &dmabuf->attachments);

 	mutex_unlock(&dmabuf->lock);
+
+	if (!dmabuf->ops->dynamic_sgt_mapping) {
+		struct sg_table *sgt;
+
+		sgt = dmabuf->ops->map_dma_buf(attach, DMA_BIDIRECTIONAL);
+		if (!sgt)
+			sgt = ERR_PTR(-ENOMEM);
+		if (IS_ERR(sgt)) {
+			dma_buf_detach(dmabuf, attach);
+			return ERR_CAST(sgt);
+		}
+		attach->sgt = sgt;
+	}
+
 	return attach;

 err_attach:
@@ -595,6 +609,10 @@ void dma_buf_detach(struct dma_buf *dmabuf, struct dma_buf_attachment *attach)
 	if (WARN_ON(!dmabuf || !attach))
 		return;

+	if (attach->sgt)
+		dmabuf->ops->unmap_dma_buf(attach, attach->sgt,
+					   DMA_BIDIRECTIONAL);
+
 	mutex_lock(&dmabuf->lock);
 	list_del(&attach->node);
 	if (dmabuf->ops->detach)
@@ -630,6 +648,9 @@ struct sg_table *dma_buf_map_attachment(struct dma_buf_attachment *attach,
 	if (WARN_ON(!attach || !attach->dmabuf))
 		return ERR_PTR(-EINVAL);

+	if (attach->sgt)
+		return attach->sgt;
+
 	sg_table = attach->dmabuf->ops->map_dma_buf(attach, direction);
 	if (!sg_table)
 		sg_table = ERR_PTR(-ENOMEM);
@@ -657,6 +678,9 @@ void dma_buf_unmap_attachment(struct dma_buf_attachment *attach,
 	if (WARN_ON(!attach || !attach->dmabuf || !sg_table))
 		return;

+	if (attach->sgt == sg_table)
+		return;
+
 	attach->dmabuf->ops->unmap_dma_buf(attach, sg_table, direction);
 }

diff --git a/include/linux/dma-buf.h b/include/linux/dma-buf.h
index 58725f890b5b..0d9c3c13c9fb 100644
--- a/include/linux/dma-buf.h
+++ b/include/linux/dma-buf.h
@@ -51,6 +51,16 @@ struct dma_buf_attachment;
  * @vunmap: [optional] unmaps a vmap from the buffer
  */
 struct dma_buf_ops {
+	/**
+	 * @dynamic_sgt_mapping:
+	 *
+	 * Flag controlling the caching of the sg_table in the DMA-buf helpers.
+	 * If not set the sg_table is created during device attaching, if set
+	 * the sg_table is created dynamically when dma_buf_map_attachment() is
+	 * called.
+	 */
+	bool dynamic_sgt_mapping;
+
 	/**
 	 * @attach:
 	 *
@@ -323,6 +333,7 @@ struct dma_buf_attachment {
 	struct device *dev;
 	struct list_head node;
 	void *priv;
+	struct sg_table *sgt;
 };

 /**
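To make the transition concrete, here is a minimal sketch of an exporter opting in to the new dynamic path; all my_* names below are placeholders, not taken from an actual driver.

/* Sketch only: an exporter opting in to the dynamic path. With
 * dynamic_sgt_mapping set, dma_buf_attach() no longer calls
 * map_dma_buf() up front; the mapping is created on demand from
 * dma_buf_map_attachment() instead. */

static struct sg_table *my_map_dma_buf(struct dma_buf_attachment *attach,
				       enum dma_data_direction dir)
{
	/* placeholder: create and DMA-map an sg_table for attach->dev */
	return my_create_sgt(attach->dmabuf->priv, attach->dev, dir);
}

static void my_unmap_dma_buf(struct dma_buf_attachment *attach,
			     struct sg_table *sgt,
			     enum dma_data_direction dir)
{
	/* placeholder: unmap and free the sg_table again */
	my_free_sgt(attach->dev, sgt, dir);
}

static const struct dma_buf_ops my_dmabuf_ops = {
	.dynamic_sgt_mapping	= true,	/* opt out of the cached sgt */
	.map_dma_buf		= my_map_dma_buf,
	.unmap_dma_buf		= my_unmap_dma_buf,
	.release		= my_release,
};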
Add function variants which can be called with the reservation lock already held.
v2: reordered, add lockdep asserts, fix kerneldoc
v3: rebased on sgt caching

Signed-off-by: Christian König <christian.koenig@amd.com>
---
 drivers/dma-buf/dma-buf.c | 63 +++++++++++++++++++++++++++++++++++++++
 include/linux/dma-buf.h   |  5 ++++
 2 files changed, 68 insertions(+)
diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c
index 65161a82d4d5..ef480e5fb239 100644
--- a/drivers/dma-buf/dma-buf.c
+++ b/drivers/dma-buf/dma-buf.c
@@ -623,6 +623,43 @@ void dma_buf_detach(struct dma_buf *dmabuf, struct dma_buf_attachment *attach)
 }
 EXPORT_SYMBOL_GPL(dma_buf_detach);

+/**
+ * dma_buf_map_attachment_locked - Maps the buffer into _device_ address space
+ * with the reservation lock held. Is a wrapper for map_dma_buf() of the
+ *
+ * Returns the scatterlist table of the attachment;
+ * dma_buf_ops.
+ * @attach:	[in]	attachment whose scatterlist is to be returned
+ * @direction:	[in]	direction of DMA transfer
+ *
+ * Returns sg_table containing the scatterlist to be returned; returns ERR_PTR
+ * on error. May return -EINTR if it is interrupted by a signal.
+ *
+ * A mapping must be unmapped by using dma_buf_unmap_attachment_locked(). Note
+ * that the underlying backing storage is pinned for as long as a mapping
+ * exists, therefore users/importers should not hold onto a mapping for undue
+ * amounts of time.
+ */
+struct sg_table *
+dma_buf_map_attachment_locked(struct dma_buf_attachment *attach,
+			      enum dma_data_direction direction)
+{
+	struct sg_table *sg_table;
+
+	might_sleep();
+	reservation_object_assert_held(attach->dmabuf->resv);
+
+	if (attach->sgt)
+		return attach->sgt;
+
+	sg_table = attach->dmabuf->ops->map_dma_buf(attach, direction);
+	if (!sg_table)
+		sg_table = ERR_PTR(-ENOMEM);
+
+	return sg_table;
+}
+EXPORT_SYMBOL_GPL(dma_buf_map_attachment_locked);
+
 /**
  * dma_buf_map_attachment - Returns the scatterlist table of the attachment;
  * mapped into _device_ address space. Is a wrapper for map_dma_buf() of the
@@ -659,6 +696,32 @@ struct sg_table *dma_buf_map_attachment(struct dma_buf_attachment *attach,
 }
 EXPORT_SYMBOL_GPL(dma_buf_map_attachment);

+/**
+ * dma_buf_unmap_attachment_locked - unmaps the buffer with reservation lock
+ * held, should deallocate the associated scatterlist. Is a wrapper for
+ * unmap_dma_buf() of dma_buf_ops.
+ * @attach:	[in]	attachment to unmap buffer from
+ * @sg_table:	[in]	scatterlist info of the buffer to unmap
+ * @direction:	[in]	direction of DMA transfer
+ *
+ * This unmaps a DMA mapping for @attached obtained by
+ * dma_buf_map_attachment_locked().
+ */
+void dma_buf_unmap_attachment_locked(struct dma_buf_attachment *attach,
+				     struct sg_table *sg_table,
+				     enum dma_data_direction direction)
+{
+	might_sleep();
+	reservation_object_assert_held(attach->dmabuf->resv);
+
+	if (attach->sgt == sg_table)
+		return;
+
+	attach->dmabuf->ops->unmap_dma_buf(attach, sg_table,
+					   direction);
+}
+EXPORT_SYMBOL_GPL(dma_buf_unmap_attachment_locked);
+
 /**
  * dma_buf_unmap_attachment - unmaps and decreases usecount of the buffer;might
  * deallocate the scatterlist associated. Is a wrapper for unmap_dma_buf() of

diff --git a/include/linux/dma-buf.h b/include/linux/dma-buf.h
index 0d9c3c13c9fb..18a78be53541 100644
--- a/include/linux/dma-buf.h
+++ b/include/linux/dma-buf.h
@@ -395,8 +395,13 @@ int dma_buf_fd(struct dma_buf *dmabuf, int flags);
 struct dma_buf *dma_buf_get(int fd);
 void dma_buf_put(struct dma_buf *dmabuf);

+struct sg_table *dma_buf_map_attachment_locked(struct dma_buf_attachment *,
+					       enum dma_data_direction);
 struct sg_table *dma_buf_map_attachment(struct dma_buf_attachment *,
 					enum dma_data_direction);
+void dma_buf_unmap_attachment_locked(struct dma_buf_attachment *,
+				     struct sg_table *,
+				     enum dma_data_direction);
 void dma_buf_unmap_attachment(struct dma_buf_attachment *, struct sg_table *,
 			      enum dma_data_direction);
 int dma_buf_begin_cpu_access(struct dma_buf *dma_buf,
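As a usage sketch (my_import_buffer and the device-programming step are placeholders), an importer that wants to map under the reservation lock would do roughly:

static int my_import_buffer(struct dma_buf_attachment *attach,
			    enum dma_data_direction dir)
{
	struct sg_table *sgt;
	int ret;

	ret = reservation_object_lock(attach->dmabuf->resv, NULL);
	if (ret)
		return ret;

	sgt = dma_buf_map_attachment_locked(attach, dir);
	if (IS_ERR(sgt)) {
		reservation_object_unlock(attach->dmabuf->resv);
		return PTR_ERR(sgt);
	}

	/* placeholder: program the device with sgt here */

	dma_buf_unmap_attachment_locked(attach, sgt, dir);
	reservation_object_unlock(attach->dmabuf->resv);
	return 0;
}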
Thanks for the comments, but you are looking at a completely outdated patchset.
If you are interested in the newest one, please ping me and I'm going to CC you when I send out the next version.
Christian.
On 25.05.19 at 03:04, Hillf Danton wrote:
On Tue, 16 Apr 2019 20:38:31 +0200 Christian König wrote:
Add function variants which can be called with the reservation lock already held.
v2: reordered, add lockdep asserts, fix kerneldoc v3: rebased on sgt caching
Signed-off-by: Christian König christian.koenig@amd.com
 drivers/dma-buf/dma-buf.c | 63 +++++++++++++++++++++++++++++++++++++++
 include/linux/dma-buf.h   |  5 ++++
 2 files changed, 68 insertions(+)

diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c
index 65161a82d4d5..ef480e5fb239 100644
--- a/drivers/dma-buf/dma-buf.c
+++ b/drivers/dma-buf/dma-buf.c
@@ -623,6 +623,43 @@ void dma_buf_detach(struct dma_buf *dmabuf, struct dma_buf_attachment *attach)
 }
 EXPORT_SYMBOL_GPL(dma_buf_detach);

+/**
+ * dma_buf_map_attachment_locked - Maps the buffer into _device_ address space
+ * with the reservation lock held. Is a wrapper for map_dma_buf() of the

Something is missing; seems to be s/of the/of the dma_buf_ops./

+ *
+ * Returns the scatterlist table of the attachment;
+ * dma_buf_ops.

Oh it is sitting here!

+ * @attach:	[in]	attachment whose scatterlist is to be returned
+ * @direction:	[in]	direction of DMA transfer
+ *
+ * Returns sg_table containing the scatterlist to be returned; returns ERR_PTR
+ * on error. May return -EINTR if it is interrupted by a signal.

EINTR looks impossible in the code.

+ *
+ * A mapping must be unmapped by using dma_buf_unmap_attachment_locked(). Note
+ * that the underlying backing storage is pinned for as long as a mapping
+ * exists, therefore users/importers should not hold onto a mapping for undue
+ * amounts of time.
+ */
+struct sg_table *
+dma_buf_map_attachment_locked(struct dma_buf_attachment *attach,
+			      enum dma_data_direction direction)
+{
+	struct sg_table *sg_table;
+
+	might_sleep();
+	reservation_object_assert_held(attach->dmabuf->resv);
+
+	if (attach->sgt)
+		return attach->sgt;
+
+	sg_table = attach->dmabuf->ops->map_dma_buf(attach, direction);
+	if (!sg_table)
+		sg_table = ERR_PTR(-ENOMEM);
+
+	return sg_table;
+}
+EXPORT_SYMBOL_GPL(dma_buf_map_attachment_locked);
Best Regards
Hillf
Make it mandatory for dynamic dma-buf callbacks to be called with the reservation lock held.
For static dma-buf exporters we still have the fallback of using cached sgt.
v2: reordered
v3: rebased on sgt caching
v4: use the cached sgt when possible

Signed-off-by: Christian König <christian.koenig@amd.com>
---
 drivers/dma-buf/dma-buf.c                     | 24 ++++++++++---------
 drivers/gpu/drm/armada/armada_gem.c           |  6 ++++-
 drivers/gpu/drm/drm_prime.c                   |  6 ++++-
 drivers/gpu/drm/i915/i915_gem_dmabuf.c        |  6 ++++-
 drivers/gpu/drm/tegra/gem.c                   |  6 ++++-
 drivers/gpu/drm/udl/udl_dmabuf.c              |  6 ++++-
 .../common/videobuf2/videobuf2-dma-contig.c   |  6 ++++-
 .../media/common/videobuf2/videobuf2-dma-sg.c |  6 ++++-
 drivers/staging/media/tegra-vde/tegra-vde.c   |  6 ++++-
 include/linux/dma-buf.h                       | 23 ++++++++++++++++--
 10 files changed, 74 insertions(+), 21 deletions(-)
diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c
index ef480e5fb239..83c92bfd964c 100644
--- a/drivers/dma-buf/dma-buf.c
+++ b/drivers/dma-buf/dma-buf.c
@@ -532,8 +532,9 @@ EXPORT_SYMBOL_GPL(dma_buf_put);
 /**
  * dma_buf_attach - Add the device to dma_buf's attachments list; optionally,
  * calls attach() of dma_buf_ops to allow device-specific attach functionality
- * @dmabuf:	[in]	buffer to attach device to.
- * @dev:	[in]	device to be attached.
+ * @info:	[in]	holds all the attach related information provided
+ *			by the importer. see &struct dma_buf_attach_info
+ *			for further details.
  *
  * Returns struct dma_buf_attachment pointer for this attachment. Attachments
  * must be cleaned up by calling dma_buf_detach().
@@ -547,20 +548,20 @@ EXPORT_SYMBOL_GPL(dma_buf_put);
  * accessible to @dev, and cannot be moved to a more suitable place. This is
  * indicated with the error code -EBUSY.
  */
-struct dma_buf_attachment *dma_buf_attach(struct dma_buf *dmabuf,
-					  struct device *dev)
+struct dma_buf_attachment *dma_buf_attach(const struct dma_buf_attach_info *info)
 {
+	struct dma_buf *dmabuf = info->dmabuf;
 	struct dma_buf_attachment *attach;
 	int ret;

-	if (WARN_ON(!dmabuf || !dev))
+	if (WARN_ON(!dmabuf || !info->dev))
 		return ERR_PTR(-EINVAL);

 	attach = kzalloc(sizeof(*attach), GFP_KERNEL);
 	if (!attach)
 		return ERR_PTR(-ENOMEM);

-	attach->dev = dev;
+	attach->dev = info->dev;
 	attach->dmabuf = dmabuf;

 	mutex_lock(&dmabuf->lock);
@@ -688,9 +689,9 @@ struct sg_table *dma_buf_map_attachment(struct dma_buf_attachment *attach,
 	if (attach->sgt)
 		return attach->sgt;

-	sg_table = attach->dmabuf->ops->map_dma_buf(attach, direction);
-	if (!sg_table)
-		sg_table = ERR_PTR(-ENOMEM);
+	reservation_object_lock(attach->dmabuf->resv, NULL);
+	sg_table = dma_buf_map_attachment_locked(attach, direction);
+	reservation_object_unlock(attach->dmabuf->resv);

 	return sg_table;
 }
@@ -744,8 +745,9 @@ void dma_buf_unmap_attachment(struct dma_buf_attachment *attach,
 	if (attach->sgt == sg_table)
 		return;

-	attach->dmabuf->ops->unmap_dma_buf(attach, sg_table,
-					   direction);
+	reservation_object_lock(attach->dmabuf->resv, NULL);
+	dma_buf_unmap_attachment_locked(attach, sg_table, direction);
+	reservation_object_unlock(attach->dmabuf->resv);
 }
 EXPORT_SYMBOL_GPL(dma_buf_unmap_attachment);

diff --git a/drivers/gpu/drm/armada/armada_gem.c b/drivers/gpu/drm/armada/armada_gem.c
index 642d0e70d0f8..19c47821032f 100644
--- a/drivers/gpu/drm/armada/armada_gem.c
+++ b/drivers/gpu/drm/armada/armada_gem.c
@@ -501,6 +501,10 @@ armada_gem_prime_export(struct drm_device *dev, struct drm_gem_object *obj,
 struct drm_gem_object *
 armada_gem_prime_import(struct drm_device *dev, struct dma_buf *buf)
 {
+	struct dma_buf_attach_info attach_info = {
+		.dev = dev->dev,
+		.dmabuf = buf
+	};
 	struct dma_buf_attachment *attach;
 	struct armada_gem_object *dobj;

@@ -516,7 +520,7 @@ armada_gem_prime_import(struct drm_device *dev, struct dma_buf *buf)
 		}
 	}

-	attach = dma_buf_attach(buf, dev->dev);
+	attach = dma_buf_attach(&attach_info);
 	if (IS_ERR(attach))
 		return ERR_CAST(attach);

diff --git a/drivers/gpu/drm/drm_prime.c b/drivers/gpu/drm/drm_prime.c
index 231e3f6d5f41..1fadf5d5ed33 100644
--- a/drivers/gpu/drm/drm_prime.c
+++ b/drivers/gpu/drm/drm_prime.c
@@ -709,6 +709,10 @@ struct drm_gem_object *drm_gem_prime_import_dev(struct drm_device *dev,
 					    struct dma_buf *dma_buf,
 					    struct device *attach_dev)
 {
+	struct dma_buf_attach_info attach_info = {
+		.dev = attach_dev,
+		.dmabuf = dma_buf
+	};
 	struct dma_buf_attachment *attach;
 	struct sg_table *sgt;
 	struct drm_gem_object *obj;
@@ -729,7 +733,7 @@ struct drm_gem_object *drm_gem_prime_import_dev(struct drm_device *dev,
 	if (!dev->driver->gem_prime_import_sg_table)
 		return ERR_PTR(-EINVAL);

-	attach = dma_buf_attach(dma_buf, attach_dev);
+	attach = dma_buf_attach(&attach_info);
 	if (IS_ERR(attach))
 		return ERR_CAST(attach);

diff --git a/drivers/gpu/drm/i915/i915_gem_dmabuf.c b/drivers/gpu/drm/i915/i915_gem_dmabuf.c
index 82e2ca17a441..aa7f685bd6ca 100644
--- a/drivers/gpu/drm/i915/i915_gem_dmabuf.c
+++ b/drivers/gpu/drm/i915/i915_gem_dmabuf.c
@@ -277,6 +277,10 @@ static const struct drm_i915_gem_object_ops i915_gem_object_dmabuf_ops = {
 struct drm_gem_object *i915_gem_prime_import(struct drm_device *dev,
 					     struct dma_buf *dma_buf)
 {
+	struct dma_buf_attach_info attach_info = {
+		.dev = dev->dev,
+		.dmabuf = dma_buf
+	};
 	struct dma_buf_attachment *attach;
 	struct drm_i915_gem_object *obj;
 	int ret;
@@ -295,7 +299,7 @@ struct drm_gem_object *i915_gem_prime_import(struct drm_device *dev,
 	}

 	/* need to attach */
-	attach = dma_buf_attach(dma_buf, dev->dev);
+	attach = dma_buf_attach(&attach_info);
 	if (IS_ERR(attach))
 		return ERR_CAST(attach);

diff --git a/drivers/gpu/drm/tegra/gem.c b/drivers/gpu/drm/tegra/gem.c
index 4f80100ff5f3..8e6b6c879add 100644
--- a/drivers/gpu/drm/tegra/gem.c
+++ b/drivers/gpu/drm/tegra/gem.c
@@ -332,6 +332,10 @@ struct tegra_bo *tegra_bo_create_with_handle(struct drm_file *file,
 static struct tegra_bo *tegra_bo_import(struct drm_device *drm,
 					struct dma_buf *buf)
 {
+	struct dma_buf_attach_info attach_info = {
+		.dev = drm->dev,
+		.dmabuf = buf
+	};
 	struct tegra_drm *tegra = drm->dev_private;
 	struct dma_buf_attachment *attach;
 	struct tegra_bo *bo;
@@ -341,7 +345,7 @@ static struct tegra_bo *tegra_bo_import(struct drm_device *drm,
 	if (IS_ERR(bo))
 		return bo;

-	attach = dma_buf_attach(buf, drm->dev);
+	attach = dma_buf_attach(&attach_info);
 	if (IS_ERR(attach)) {
 		err = PTR_ERR(attach);
 		goto free;

diff --git a/drivers/gpu/drm/udl/udl_dmabuf.c b/drivers/gpu/drm/udl/udl_dmabuf.c
index 556f62662aa9..86b928f9742f 100644
--- a/drivers/gpu/drm/udl/udl_dmabuf.c
+++ b/drivers/gpu/drm/udl/udl_dmabuf.c
@@ -226,6 +226,10 @@ static int udl_prime_create(struct drm_device *dev,
 struct drm_gem_object *udl_gem_prime_import(struct drm_device *dev,
 					    struct dma_buf *dma_buf)
 {
+	struct dma_buf_attach_info attach_info = {
+		.dev = dev->dev,
+		.dmabuf = dma_buf
+	};
 	struct dma_buf_attachment *attach;
 	struct sg_table *sg;
 	struct udl_gem_object *uobj;
@@ -233,7 +237,7 @@ struct drm_gem_object *udl_gem_prime_import(struct drm_device *dev,

 	/* need to attach */
 	get_device(dev->dev);
-	attach = dma_buf_attach(dma_buf, dev->dev);
+	attach = dma_buf_attach(&attach_info);
 	if (IS_ERR(attach)) {
 		put_device(dev->dev);
 		return ERR_CAST(attach);

diff --git a/drivers/media/common/videobuf2/videobuf2-dma-contig.c b/drivers/media/common/videobuf2/videobuf2-dma-contig.c
index aff0ab7bf83d..1f2687b5eb0e 100644
--- a/drivers/media/common/videobuf2/videobuf2-dma-contig.c
+++ b/drivers/media/common/videobuf2/videobuf2-dma-contig.c
@@ -676,6 +676,10 @@ static void vb2_dc_detach_dmabuf(void *mem_priv)
 static void *vb2_dc_attach_dmabuf(struct device *dev, struct dma_buf *dbuf,
 	unsigned long size, enum dma_data_direction dma_dir)
 {
+	struct dma_buf_attach_info attach_info = {
+		.dev = dev,
+		.dmabuf = dbuf
+	};
 	struct vb2_dc_buf *buf;
 	struct dma_buf_attachment *dba;

@@ -691,7 +695,7 @@ static void *vb2_dc_attach_dmabuf(struct device *dev, struct dma_buf *dbuf,

 	buf->dev = dev;
 	/* create attachment for the dmabuf with the user device */
-	dba = dma_buf_attach(dbuf, buf->dev);
+	dba = dma_buf_attach(&attach_info);
 	if (IS_ERR(dba)) {
 		pr_err("failed to attach dmabuf\n");
 		kfree(buf);

diff --git a/drivers/media/common/videobuf2/videobuf2-dma-sg.c b/drivers/media/common/videobuf2/videobuf2-dma-sg.c
index 015e737095cd..cbd626d2393a 100644
--- a/drivers/media/common/videobuf2/videobuf2-dma-sg.c
+++ b/drivers/media/common/videobuf2/videobuf2-dma-sg.c
@@ -608,6 +608,10 @@ static void vb2_dma_sg_detach_dmabuf(void *mem_priv)
 static void *vb2_dma_sg_attach_dmabuf(struct device *dev, struct dma_buf *dbuf,
 	unsigned long size, enum dma_data_direction dma_dir)
 {
+	struct dma_buf_attach_info attach_info = {
+		.dev = dev,
+		.dmabuf = dbuf
+	};
 	struct vb2_dma_sg_buf *buf;
 	struct dma_buf_attachment *dba;

@@ -623,7 +627,7 @@ static void *vb2_dma_sg_attach_dmabuf(struct device *dev, struct dma_buf *dbuf,

 	buf->dev = dev;
 	/* create attachment for the dmabuf with the user device */
-	dba = dma_buf_attach(dbuf, buf->dev);
+	dba = dma_buf_attach(&attach_info);
 	if (IS_ERR(dba)) {
 		pr_err("failed to attach dmabuf\n");
 		kfree(buf);

diff --git a/drivers/staging/media/tegra-vde/tegra-vde.c b/drivers/staging/media/tegra-vde/tegra-vde.c
index aa6c6bba961e..5a10c1facc27 100644
--- a/drivers/staging/media/tegra-vde/tegra-vde.c
+++ b/drivers/staging/media/tegra-vde/tegra-vde.c
@@ -568,6 +568,10 @@ static int tegra_vde_attach_dmabuf(struct device *dev,
 				   size_t *size,
 				   enum dma_data_direction dma_dir)
 {
+	struct dma_buf_attach_info attach_info = {
+		.dev = dev,
+		.dmabuf = dmabuf
+	};
 	struct dma_buf_attachment *attachment;
 	struct dma_buf *dmabuf;
 	struct sg_table *sgt;
@@ -591,7 +595,7 @@ static int tegra_vde_attach_dmabuf(struct device *dev,
 		return -EINVAL;
 	}

-	attachment = dma_buf_attach(dmabuf, dev);
+	attachment = dma_buf_attach(&attach_info);
 	if (IS_ERR(attachment)) {
 		dev_err(dev, "Failed to attach dmabuf\n");
 		err = PTR_ERR(attachment);

diff --git a/include/linux/dma-buf.h b/include/linux/dma-buf.h
index 18a78be53541..7e23758db3a4 100644
--- a/include/linux/dma-buf.h
+++ b/include/linux/dma-buf.h
@@ -128,6 +128,9 @@ struct dma_buf_ops {
 	 * any other kind of sharing that the exporter might wish to make
 	 * available to buffer-users.
 	 *
+	 * This is always called with the dmabuf->resv object locked when
+	 * no_sgt_cache is true.
+	 *
 	 * Returns:
 	 *
 	 * A &sg_table scatter list of or the backing storage of the DMA buffer,
@@ -148,6 +151,9 @@ struct dma_buf_ops {
 	 * It should also unpin the backing storage if this is the last mapping
 	 * of the DMA buffer, it the exporter supports backing storage
 	 * migration.
+	 *
+	 * This is always called with the dmabuf->resv object locked when
+	 * no_sgt_cache is true.
 	 */
 	void (*unmap_dma_buf)(struct dma_buf_attachment *,
 			      struct sg_table *,
@@ -370,6 +376,19 @@ struct dma_buf_export_info {
 	struct dma_buf_export_info name = { .exp_name = KBUILD_MODNAME, \
 					 .owner = THIS_MODULE }

+/**
+ * struct dma_buf_attach_info - holds information needed to attach to a dma_buf
+ * @dmabuf:	the exported dma_buf
+ * @dev:	the device which wants to import the attachment
+ *
+ * This structure holds the information required to attach to a buffer. Used
+ * with dma_buf_attach() only.
+ */
+struct dma_buf_attach_info {
+	struct dma_buf *dmabuf;
+	struct device *dev;
+};
+
 /**
  * get_dma_buf - convenience wrapper for get_file.
  * @dmabuf:	[in]	pointer to dma_buf
@@ -384,8 +403,8 @@ static inline void get_dma_buf(struct dma_buf *dmabuf)
 	get_file(dmabuf->file);
 }

-struct dma_buf_attachment *dma_buf_attach(struct dma_buf *dmabuf,
-					  struct device *dev);
+struct dma_buf_attachment *
+dma_buf_attach(const struct dma_buf_attach_info *info);
 void dma_buf_detach(struct dma_buf *dmabuf,
 		    struct dma_buf_attachment *dmabuf_attach);
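Condensed from the diff above, the unlocked map wrapper now boils down to the following; note the cached-sgt fastpath returns before any locking, which matters for the discussion below:

struct sg_table *dma_buf_map_attachment(struct dma_buf_attachment *attach,
					enum dma_data_direction direction)
{
	struct sg_table *sg_table;

	might_sleep();

	if (WARN_ON(!attach || !attach->dmabuf))
		return ERR_PTR(-EINVAL);

	/* fastpath for static exporters: cached sgt, no locking at all */
	if (attach->sgt)
		return attach->sgt;

	reservation_object_lock(attach->dmabuf->resv, NULL);
	sg_table = dma_buf_map_attachment_locked(attach, direction);
	reservation_object_unlock(attach->dmabuf->resv);

	return sg_table;
}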
On Tue, Apr 16, 2019 at 08:38:32PM +0200, Christian König wrote:
Make it mandatory for dynamic dma-buf callbacks to be called with the reservation lock held.
For static dma-buf exporters we still have the fallback of using cached sgt.
v2: reordered v3: rebased on sgt caching v4: use the cached sgt when possible
Signed-off-by: Christian König christian.koenig@amd.com
I think there's a bit of rebase chaos going on:
- some comments left behind with no_sgt_cache, which I can't find anymore
- the function signature rework of dma_buf_attach should imo be split out
Next issue is that the reservation object locking is still in the path of dma_buf_map, so probably still going to result in tons of lockdep splats. Except the i915+amdgpu path should now work due to the fastpath.
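The kind of inversion meant here is roughly the following (schematic only, not taken from a real lockdep report):

/*
 * Schematic of a cross-driver lock-order inversion:
 *
 *   thread A (importer): mmap_sem -> dmabuf->resv  via dma_buf_map_attachment()
 *   thread B (exporter): dmabuf->resv -> mmap_sem  via its own fault/alloc path
 *
 * Once lockdep has observed both chains it flags a potential deadlock,
 * even if the two paths never actually race against each other.
 */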
Not sure that's a solution that really works, just hides that fundamentally we still have that issue of incompatible locking chains between different drivers.
-Daniel
On 17.04.19 at 16:08, Daniel Vetter wrote:
On Tue, Apr 16, 2019 at 08:38:32PM +0200, Christian König wrote:
Make it mandatory for dynamic dma-buf callbacks to be called with the reservation lock held.
For static dma-buf exporters we still have the fallback of using cached sgt.
v2: reordered v3: rebased on sgt caching v4: use the cached sgt when possible
Signed-off-by: Christian König christian.koenig@amd.com
I think there's a bit a rebase chaos going on:
- some comments left behind with no_sgt_cache, which I can't find anymore
- the function signature rework of dma_buf_attach should imo be split out
Ah, crap, thought I had got all of those. Going to fix that.
Next issue is that the reservation object locking is still in the path of dma_buf_map, so probably still going to result in tons of lockdep splats. Except the i915+amdgpu path should now work due to the fastpath.
I actually found a solution for that :)
The idea is now that we always cache the sgt in the attachment unless the dynamic flag (previously no_sgt_cache flag) is set. And this cached sgt is created without holding the lock.
We either need to document that really well or maybe split the mapping callbacks into map/unmap and map_lock/unmap_locked. Opinions?
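Such a split could look roughly like this (purely hypothetical sketch, nothing like it is in the series yet):

/* Hypothetical sketch of split callbacks, not part of the series: */
struct dma_buf_ops {
	/* ... */

	/* called without dmabuf->resv held; may take it internally */
	struct sg_table *(*map_dma_buf)(struct dma_buf_attachment *,
					enum dma_data_direction);

	/* called with dmabuf->resv already held by the importer */
	struct sg_table *(*map_dma_buf_locked)(struct dma_buf_attachment *,
					       enum dma_data_direction);

	/* ... plus a matching unmap_dma_buf/unmap_dma_buf_locked pair ... */
};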
Not sure that's a solution that really works, just hides that fundamentally we still have that issue of incompatible locking chains between different drivers.
We can now make a slow transition between static and dynamic DMA-buf handling, so only drivers which can do the locking will be affected.
Christian.
On Wed, Apr 17, 2019 at 04:14:32PM +0200, Christian König wrote:
On 17.04.19 at 16:08, Daniel Vetter wrote:
On Tue, Apr 16, 2019 at 08:38:32PM +0200, Christian König wrote:
Make it mandatory for dynamic dma-buf callbacks to be called with the reservation lock held.
For static dma-buf exporters we still have the fallback of using cached sgt.
v2: reordered v3: rebased on sgt caching v4: use the cached sgt when possible
Signed-off-by: Christian König christian.koenig@amd.com
I think there's a bit a rebase chaos going on:
- some comments left behind with no_sgt_cache, which I can't find anymore
- the function signature rework of dma_buf_attach should imo be split out
Ah, crap thought I've got all of those. Going to fix that.
Next issue is that the reservation object locking is still in the path of dma_buf_map, so probably still going to result in tons of lockdep splats. Except the i915+amdgpu path should now work due to the fastpath.
I actually found a solution for that :)
The idea is now that we always cache the sgt in the attachment unless the dynamic flag (previously no_sgt_cache flag) is set. And this cached sgt is created without holding the lock.
Yeah I think that idea works.
We either need to document that really well or maybe split the mapping callbacks into map/unmap and map_lock/unmap_locked. Opinions?
I think the implementation doesn't. The exporter can't decide on its own whether dynamic or static is what's needed; we need to decide that for each attachment, taking both exporter and importer capabilities into account.
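A per-attachment decision could be sketched like this (hypothetical helper and field, nothing of this exists in the series):

/* Hypothetical sketch, no such helper exists in this series: */
static bool dma_buf_attachment_is_dynamic(struct dma_buf_attachment *attach)
{
	/*
	 * Only treat the attachment as dynamic when the exporter can move
	 * the buffer AND the importer has registered an invalidate
	 * callback; fall back to the pinned/cached-sgt path otherwise.
	 */
	return attach->dmabuf->ops->dynamic_sgt_mapping &&
	       attach->invalidate;	/* hypothetical importer callback */
}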
I think if we do that, then it should work out. I replied on the pin/unpin interface patch with some more concrete thoughts. Maybe best to continue that discussion there, with more context.
-Daniel
Not sure that's a solution that really works, just hides that fundamentally we still have that issue of incompatible locking chains between different drivers.
We can now make a slow transition between static and dynamic DMA-buf handling, so only driver who can do the locking will be affected.
Christian.
-Daniel
drivers/dma-buf/dma-buf.c | 24 ++++++++++--------- drivers/gpu/drm/armada/armada_gem.c | 6 ++++- drivers/gpu/drm/drm_prime.c | 6 ++++- drivers/gpu/drm/i915/i915_gem_dmabuf.c | 6 ++++- drivers/gpu/drm/tegra/gem.c | 6 ++++- drivers/gpu/drm/udl/udl_dmabuf.c | 6 ++++- .../common/videobuf2/videobuf2-dma-contig.c | 6 ++++- .../media/common/videobuf2/videobuf2-dma-sg.c | 6 ++++- drivers/staging/media/tegra-vde/tegra-vde.c | 6 ++++- include/linux/dma-buf.h | 23 ++++++++++++++++-- 10 files changed, 74 insertions(+), 21 deletions(-)
diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c index ef480e5fb239..83c92bfd964c 100644 --- a/drivers/dma-buf/dma-buf.c +++ b/drivers/dma-buf/dma-buf.c @@ -532,8 +532,9 @@ EXPORT_SYMBOL_GPL(dma_buf_put); /**
- dma_buf_attach - Add the device to dma_buf's attachments list; optionally,
- calls attach() of dma_buf_ops to allow device-specific attach functionality
- @dmabuf: [in] buffer to attach device to.
- @dev: [in] device to be attached.
- @info: [in] holds all the attach related information provided
by the importer. see &struct dma_buf_attach_info
for further details.
- Returns struct dma_buf_attachment pointer for this attachment. Attachments
- must be cleaned up by calling dma_buf_detach().
@@ -547,20 +548,20 @@ EXPORT_SYMBOL_GPL(dma_buf_put);
- accessible to @dev, and cannot be moved to a more suitable place. This is
- indicated with the error code -EBUSY.
*/ -struct dma_buf_attachment *dma_buf_attach(struct dma_buf *dmabuf,
struct device *dev)
+struct dma_buf_attachment *dma_buf_attach(const struct dma_buf_attach_info *info) {
- struct dma_buf *dmabuf = info->dmabuf; struct dma_buf_attachment *attach; int ret;
- if (WARN_ON(!dmabuf || !dev))
- if (WARN_ON(!dmabuf || !info->dev)) return ERR_PTR(-EINVAL); attach = kzalloc(sizeof(*attach), GFP_KERNEL); if (!attach) return ERR_PTR(-ENOMEM);
- attach->dev = dev;
- attach->dev = info->dev; attach->dmabuf = dmabuf; mutex_lock(&dmabuf->lock);
@@ -688,9 +689,9 @@ struct sg_table *dma_buf_map_attachment(struct dma_buf_attachment *attach, if (attach->sgt) return attach->sgt;
- sg_table = attach->dmabuf->ops->map_dma_buf(attach, direction);
- if (!sg_table)
sg_table = ERR_PTR(-ENOMEM);
- reservation_object_lock(attach->dmabuf->resv, NULL);
- sg_table = dma_buf_map_attachment_locked(attach, direction);
- reservation_object_unlock(attach->dmabuf->resv); return sg_table; }
@@ -744,8 +745,9 @@ void dma_buf_unmap_attachment(struct dma_buf_attachment *attach, if (attach->sgt == sg_table) return;
- attach->dmabuf->ops->unmap_dma_buf(attach, sg_table,
direction);
- reservation_object_lock(attach->dmabuf->resv, NULL);
- dma_buf_unmap_attachment_locked(attach, sg_table, direction);
- reservation_object_unlock(attach->dmabuf->resv); } EXPORT_SYMBOL_GPL(dma_buf_unmap_attachment);
diff --git a/drivers/gpu/drm/armada/armada_gem.c b/drivers/gpu/drm/armada/armada_gem.c index 642d0e70d0f8..19c47821032f 100644 --- a/drivers/gpu/drm/armada/armada_gem.c +++ b/drivers/gpu/drm/armada/armada_gem.c @@ -501,6 +501,10 @@ armada_gem_prime_export(struct drm_device *dev, struct drm_gem_object *obj, struct drm_gem_object * armada_gem_prime_import(struct drm_device *dev, struct dma_buf *buf) {
- struct dma_buf_attach_info attach_info = {
.dev = dev->dev,
.dmabuf = buf
- }; struct dma_buf_attachment *attach; struct armada_gem_object *dobj;
@@ -516,7 +520,7 @@ armada_gem_prime_import(struct drm_device *dev, struct dma_buf *buf) } }
- attach = dma_buf_attach(buf, dev->dev);
- attach = dma_buf_attach(&attach_info); if (IS_ERR(attach)) return ERR_CAST(attach);
diff --git a/drivers/gpu/drm/drm_prime.c b/drivers/gpu/drm/drm_prime.c index 231e3f6d5f41..1fadf5d5ed33 100644 --- a/drivers/gpu/drm/drm_prime.c +++ b/drivers/gpu/drm/drm_prime.c @@ -709,6 +709,10 @@ struct drm_gem_object *drm_gem_prime_import_dev(struct drm_device *dev, struct dma_buf *dma_buf, struct device *attach_dev) {
- struct dma_buf_attach_info attach_info = {
.dev = attach_dev,
.dmabuf = dma_buf
- }; struct dma_buf_attachment *attach; struct sg_table *sgt; struct drm_gem_object *obj;
@@ -729,7 +733,7 @@ struct drm_gem_object *drm_gem_prime_import_dev(struct drm_device *dev, if (!dev->driver->gem_prime_import_sg_table) return ERR_PTR(-EINVAL);
- attach = dma_buf_attach(dma_buf, attach_dev);
- attach = dma_buf_attach(&attach_info); if (IS_ERR(attach)) return ERR_CAST(attach);
diff --git a/drivers/gpu/drm/i915/i915_gem_dmabuf.c b/drivers/gpu/drm/i915/i915_gem_dmabuf.c
index 82e2ca17a441..aa7f685bd6ca 100644
--- a/drivers/gpu/drm/i915/i915_gem_dmabuf.c
+++ b/drivers/gpu/drm/i915/i915_gem_dmabuf.c
@@ -277,6 +277,10 @@ static const struct drm_i915_gem_object_ops i915_gem_object_dmabuf_ops = {
 struct drm_gem_object *i915_gem_prime_import(struct drm_device *dev,
					     struct dma_buf *dma_buf)
 {
+	struct dma_buf_attach_info attach_info = {
+		.dev = dev->dev,
+		.dmabuf = dma_buf
+	};
	struct dma_buf_attachment *attach;
	struct drm_i915_gem_object *obj;
	int ret;
@@ -295,7 +299,7 @@ struct drm_gem_object *i915_gem_prime_import(struct drm_device *dev,
	}

	/* need to attach */
-	attach = dma_buf_attach(dma_buf, dev->dev);
+	attach = dma_buf_attach(&attach_info);
	if (IS_ERR(attach))
		return ERR_CAST(attach);
diff --git a/drivers/gpu/drm/tegra/gem.c b/drivers/gpu/drm/tegra/gem.c
index 4f80100ff5f3..8e6b6c879add 100644
--- a/drivers/gpu/drm/tegra/gem.c
+++ b/drivers/gpu/drm/tegra/gem.c
@@ -332,6 +332,10 @@ struct tegra_bo *tegra_bo_create_with_handle(struct drm_file *file,
 static struct tegra_bo *tegra_bo_import(struct drm_device *drm,
					struct dma_buf *buf)
 {
+	struct dma_buf_attach_info attach_info = {
+		.dev = drm->dev,
+		.dmabuf = buf
+	};
	struct tegra_drm *tegra = drm->dev_private;
	struct dma_buf_attachment *attach;
	struct tegra_bo *bo;
@@ -341,7 +345,7 @@ static struct tegra_bo *tegra_bo_import(struct drm_device *drm,
	if (IS_ERR(bo))
		return bo;

-	attach = dma_buf_attach(buf, drm->dev);
+	attach = dma_buf_attach(&attach_info);
	if (IS_ERR(attach)) {
		err = PTR_ERR(attach);
		goto free;
diff --git a/drivers/gpu/drm/udl/udl_dmabuf.c b/drivers/gpu/drm/udl/udl_dmabuf.c
index 556f62662aa9..86b928f9742f 100644
--- a/drivers/gpu/drm/udl/udl_dmabuf.c
+++ b/drivers/gpu/drm/udl/udl_dmabuf.c
@@ -226,6 +226,10 @@ static int udl_prime_create(struct drm_device *dev,
 struct drm_gem_object *udl_gem_prime_import(struct drm_device *dev,
					    struct dma_buf *dma_buf)
 {
+	struct dma_buf_attach_info attach_info = {
+		.dev = dev->dev,
+		.dmabuf = dma_buf
+	};
	struct dma_buf_attachment *attach;
	struct sg_table *sg;
	struct udl_gem_object *uobj;
@@ -233,7 +237,7 @@ struct drm_gem_object *udl_gem_prime_import(struct drm_device *dev,

	/* need to attach */
	get_device(dev->dev);
-	attach = dma_buf_attach(dma_buf, dev->dev);
+	attach = dma_buf_attach(&attach_info);
	if (IS_ERR(attach)) {
		put_device(dev->dev);
		return ERR_CAST(attach);
diff --git a/drivers/media/common/videobuf2/videobuf2-dma-contig.c b/drivers/media/common/videobuf2/videobuf2-dma-contig.c
index aff0ab7bf83d..1f2687b5eb0e 100644
--- a/drivers/media/common/videobuf2/videobuf2-dma-contig.c
+++ b/drivers/media/common/videobuf2/videobuf2-dma-contig.c
@@ -676,6 +676,10 @@ static void vb2_dc_detach_dmabuf(void *mem_priv)
 static void *vb2_dc_attach_dmabuf(struct device *dev, struct dma_buf *dbuf,
				  unsigned long size, enum dma_data_direction dma_dir)
 {
+	struct dma_buf_attach_info attach_info = {
+		.dev = dev,
+		.dmabuf = dbuf
+	};
	struct vb2_dc_buf *buf;
	struct dma_buf_attachment *dba;
@@ -691,7 +695,7 @@ static void *vb2_dc_attach_dmabuf(struct device *dev, struct dma_buf *dbuf,
	buf->dev = dev;

	/* create attachment for the dmabuf with the user device */
-	dba = dma_buf_attach(dbuf, buf->dev);
+	dba = dma_buf_attach(&attach_info);
	if (IS_ERR(dba)) {
		pr_err("failed to attach dmabuf\n");
		kfree(buf);
diff --git a/drivers/media/common/videobuf2/videobuf2-dma-sg.c b/drivers/media/common/videobuf2/videobuf2-dma-sg.c
index 015e737095cd..cbd626d2393a 100644
--- a/drivers/media/common/videobuf2/videobuf2-dma-sg.c
+++ b/drivers/media/common/videobuf2/videobuf2-dma-sg.c
@@ -608,6 +608,10 @@ static void vb2_dma_sg_detach_dmabuf(void *mem_priv)
 static void *vb2_dma_sg_attach_dmabuf(struct device *dev, struct dma_buf *dbuf,
				      unsigned long size, enum dma_data_direction dma_dir)
 {
+	struct dma_buf_attach_info attach_info = {
+		.dev = dev,
+		.dmabuf = dbuf
+	};
	struct vb2_dma_sg_buf *buf;
	struct dma_buf_attachment *dba;
@@ -623,7 +627,7 @@ static void *vb2_dma_sg_attach_dmabuf(struct device *dev, struct dma_buf *dbuf,
	buf->dev = dev;

	/* create attachment for the dmabuf with the user device */
-	dba = dma_buf_attach(dbuf, buf->dev);
+	dba = dma_buf_attach(&attach_info);
	if (IS_ERR(dba)) {
		pr_err("failed to attach dmabuf\n");
		kfree(buf);
diff --git a/drivers/staging/media/tegra-vde/tegra-vde.c b/drivers/staging/media/tegra-vde/tegra-vde.c
index aa6c6bba961e..5a10c1facc27 100644
--- a/drivers/staging/media/tegra-vde/tegra-vde.c
+++ b/drivers/staging/media/tegra-vde/tegra-vde.c
@@ -568,6 +568,10 @@ static int tegra_vde_attach_dmabuf(struct device *dev,
				   size_t *size,
				   enum dma_data_direction dma_dir)
 {
+	struct dma_buf_attach_info attach_info = {
+		.dev = dev,
+		.dmabuf = dmabuf
+	};
	struct dma_buf_attachment *attachment;
	struct dma_buf *dmabuf;
	struct sg_table *sgt;
@@ -591,7 +595,7 @@ static int tegra_vde_attach_dmabuf(struct device *dev,
		return -EINVAL;
	}

-	attachment = dma_buf_attach(dmabuf, dev);
+	attachment = dma_buf_attach(&attach_info);
	if (IS_ERR(attachment)) {
		dev_err(dev, "Failed to attach dmabuf\n");
		err = PTR_ERR(attachment);
diff --git a/include/linux/dma-buf.h b/include/linux/dma-buf.h
index 18a78be53541..7e23758db3a4 100644
--- a/include/linux/dma-buf.h
+++ b/include/linux/dma-buf.h
@@ -128,6 +128,9 @@ struct dma_buf_ops {
	 * any other kind of sharing that the exporter might wish to make
	 * available to buffer-users.
	 *
+	 * This is always called with the dmabuf->resv object locked when
+	 * no_sgt_cache is true.
+	 *
	 * Returns:
	 *
	 * A &sg_table scatter list of the backing storage of the DMA buffer,
@@ -148,6 +151,9 @@ struct dma_buf_ops {
	 * It should also unpin the backing storage if this is the last mapping
	 * of the DMA buffer, if the exporter supports backing storage
	 * migration.
+	 *
+	 * This is always called with the dmabuf->resv object locked when
+	 * no_sgt_cache is true.
	 */
	void (*unmap_dma_buf)(struct dma_buf_attachment *,
			      struct sg_table *,
@@ -370,6 +376,19 @@ struct dma_buf_export_info {
	struct dma_buf_export_info name = { .exp_name = KBUILD_MODNAME, \
					 .owner = THIS_MODULE }

+/**
+ * struct dma_buf_attach_info - holds information needed to attach to a dma_buf
+ * @dmabuf: the exported dma_buf
+ * @dev: the device which wants to import the attachment
+ *
+ * This structure holds the information required to attach to a buffer. Used
+ * with dma_buf_attach() only.
+ */
+struct dma_buf_attach_info {
+	struct dma_buf *dmabuf;
+	struct device *dev;
+};
+
 /**
  * get_dma_buf - convenience wrapper for get_file.
  * @dmabuf:	[in]	pointer to dma_buf
@@ -384,8 +403,8 @@ static inline void get_dma_buf(struct dma_buf *dmabuf)
	get_file(dmabuf->file);
 }

-struct dma_buf_attachment *dma_buf_attach(struct dma_buf *dmabuf,
-					  struct device *dev);
+struct dma_buf_attachment *
+dma_buf_attach(const struct dma_buf_attach_info *info);
 void dma_buf_detach(struct dma_buf *dmabuf,
		    struct dma_buf_attachment *dmabuf_attach);
--
2.17.1
dri-devel mailing list dri-devel@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/dri-devel
On 17.04.19 at 16:26, Daniel Vetter wrote:
On Wed, Apr 17, 2019 at 04:14:32PM +0200, Christian König wrote:
On 17.04.19 at 16:08, Daniel Vetter wrote:
On Tue, Apr 16, 2019 at 08:38:32PM +0200, Christian König wrote:
Make it mandatory for dynamic dma-buf callbacks to be called with the reservation lock held.
For static dma-buf exporters we still have the fallback of using a cached sgt.
v2: reordered v3: rebased on sgt caching v4: use the cached sgt when possible
Signed-off-by: Christian König christian.koenig@amd.com
I think there's a bit of rebase chaos going on:
- some comments left behind with no_sgt_cache, which I can't find anymore
- the function signature rework of dma_buf_attach should imo be split out
Ah, crap, thought I'd got all of those. Going to fix that.
Next issue is that the reservation object locking is still in the path of dma_buf_map, so probably still going to result in tons of lockdep splats. Except the i915+amdgpu path should now work due to the fastpath.
I actually found a solution for that :)
The idea is now that we always cache the sgt in the attachment unless the dynamic flag (previously no_sgt_cache flag) is set. And this cached sgt is created without holding the lock.
Yeah I think that idea works.
We either need to document that really well or maybe split the mapping callbacks into map/unmap and map_lock/unmap_locked. Opinions?
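For reference, a minimal sketch of such a split, matching the wrapper shape the dma-buf.c changes in this patch already use (the cached-sgt fast path stays lockless):

	/* Unlocked wrapper around the _locked mapping call. */
	struct sg_table *dma_buf_map_attachment(struct dma_buf_attachment *attach,
						enum dma_data_direction direction)
	{
		struct sg_table *sg_table;

		/* The sgt cached at attach time needs no locking. */
		if (attach->sgt)
			return attach->sgt;

		reservation_object_lock(attach->dmabuf->resv, NULL);
		sg_table = dma_buf_map_attachment_locked(attach, direction);
		reservation_object_unlock(attach->dmabuf->resv);

		return sg_table;
	}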
I think the implementation doesn't. The exporter can't decide on its own whether dynamic or static is what's needed; we need to decide that for each attachment, taking both exporter and importer capabilities into account.
Well that's what the pin/unpin callbacks are good for :)
Essentially we have to handle the following cases:
a) dynamic exporter and dynamic importer Nothing special here and we need neither the sgt caching nor pinning.
b) dynamic exporter and static importer The pin/unpin callbacks are used to inform the exporter that it needs to keep the buffer in the current place.
c) static exporter and dynamic importer We use the sgt caching to avoid calling the exporter with the common lock held.
d) static exporter and static importer We use the sgt caching, but that is actually only optional.
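As a sketch only, the attach-time decision for the four cases could look roughly like this (the pin() callback comes from the later patch in this series; its signature and the helper name here are assumptions, not the final interface):

	/* Hedged sketch of the four cases above. */
	static int dma_buf_attach_decide(struct dma_buf *dmabuf,
					 struct dma_buf_attachment *attach)
	{
		bool dyn_exporter = dmabuf->ops->dynamic_sgt_mapping;
		bool dyn_importer = attach->invalidate != NULL;

		/* a) dynamic exporter, dynamic importer: nothing special */
		if (dyn_exporter && dyn_importer)
			return 0;

		/* b) dynamic exporter, static importer: pin the buffer */
		if (dyn_exporter)
			return dmabuf->ops->pin(dmabuf);

		/* c) + d) static exporter: cache the sgt so the exporter is
		 * never called with the common reservation lock held */
		attach->sgt = dmabuf->ops->map_dma_buf(attach,
						       DMA_BIDIRECTIONAL);
		return attach->sgt ? 0 : -ENOMEM;
	}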
I think if we do that, then it should work out. I replied on the pin/unpin interface patch with some more concrete thoughts. Maybe best to continue that discussion there, with more context.
Yeah, that is probably a good idea.
Regards, Christian.
-Daniel
Not sure that's a solution that really works, just hides that fundamentally we still have that issue of incompatible locking chains between different drivers.
We can now make a slow transition between static and dynamic DMA-buf handling, so only drivers which can do the locking will be affected.
Christian.
-Daniel
[...]
dri-devel mailing list dri-devel@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/dri-devel
Each importer can now provide an invalidate_mappings callback.
This allows the exporter to provide the mappings without the need to pin the backing store.
v2: don't try to invalidate mappings when the callback is NULL, lock the reservation obj while using the attachments, add helper to set the callback v3: move flag for invalidation support into the DMA-buf, use new attach_info structure to set the callback v4: use importer_priv field instead of mangling exporter priv. v5: drop invalidation_supported flag
Signed-off-by: Christian König christian.koenig@amd.com --- drivers/dma-buf/dma-buf.c | 37 +++++++++++++++++++++++++++++++++++++ include/linux/dma-buf.h | 33 +++++++++++++++++++++++++++++++-- 2 files changed, 68 insertions(+), 2 deletions(-)
diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c
index 83c92bfd964c..a3738fab3927 100644
--- a/drivers/dma-buf/dma-buf.c
+++ b/drivers/dma-buf/dma-buf.c
@@ -563,6 +563,8 @@ struct dma_buf_attachment *dma_buf_attach(const struct dma_buf_attach_info *info
	attach->dev = info->dev;
	attach->dmabuf = dmabuf;
+	attach->importer_priv = info->importer_priv;
+	attach->invalidate = info->invalidate;

	mutex_lock(&dmabuf->lock);
@@ -571,7 +573,9 @@ struct dma_buf_attachment *dma_buf_attach(const struct dma_buf_attach_info *info
		if (ret)
			goto err_attach;
	}
+	reservation_object_lock(dmabuf->resv, NULL);
	list_add(&attach->node, &dmabuf->attachments);
+	reservation_object_unlock(dmabuf->resv);

	mutex_unlock(&dmabuf->lock);
@@ -615,7 +619,9 @@ void dma_buf_detach(struct dma_buf *dmabuf, struct dma_buf_attachment *attach)
					   DMA_BIDIRECTIONAL);

	mutex_lock(&dmabuf->lock);
+	reservation_object_lock(dmabuf->resv, NULL);
	list_del(&attach->node);
+	reservation_object_unlock(dmabuf->resv);
	if (dmabuf->ops->detach)
		dmabuf->ops->detach(dmabuf, attach);
@@ -653,7 +659,16 @@ dma_buf_map_attachment_locked(struct dma_buf_attachment *attach,
	if (attach->sgt)
		return attach->sgt;

+	/*
+	 * Mapping a DMA-buf can trigger its invalidation, prevent sending this
+	 * event to the caller by temporarily removing this attachment from the
+	 * list.
+	 */
+	if (attach->invalidate)
+		list_del(&attach->node);
	sg_table = attach->dmabuf->ops->map_dma_buf(attach, direction);
+	if (attach->invalidate)
+		list_add(&attach->node, &attach->dmabuf->attachments);
	if (!sg_table)
		sg_table = ERR_PTR(-ENOMEM);
@@ -751,6 +766,26 @@ void dma_buf_unmap_attachment(struct dma_buf_attachment *attach,
 }
 EXPORT_SYMBOL_GPL(dma_buf_unmap_attachment);

+/**
+ * dma_buf_invalidate_mappings - invalidate all mappings of this dma_buf
+ *
+ * @dmabuf:	[in]	buffer whose mappings should be invalidated
+ *
+ * Informs all attachments that they need to destroy and recreate all their
+ * mappings.
+ */
+void dma_buf_invalidate_mappings(struct dma_buf *dmabuf)
+{
+	struct dma_buf_attachment *attach;
+
+	reservation_object_assert_held(dmabuf->resv);
+
+	list_for_each_entry(attach, &dmabuf->attachments, node)
+		if (attach->invalidate)
+			attach->invalidate(attach);
+}
+EXPORT_SYMBOL_GPL(dma_buf_invalidate_mappings);
+
 /**
  * DOC: cpu access
@@ -1163,10 +1198,12 @@ static int dma_buf_debug_show(struct seq_file *s, void *unused)
	seq_puts(s, "\tAttached Devices:\n");
	attach_count = 0;

+	reservation_object_lock(buf_obj->resv, NULL);
	list_for_each_entry(attach_obj, &buf_obj->attachments, node) {
		seq_printf(s, "\t%s\n", dev_name(attach_obj->dev));
		attach_count++;
	}
+	reservation_object_unlock(buf_obj->resv);

	seq_printf(s, "Total %d devices attached\n\n", attach_count);
diff --git a/include/linux/dma-buf.h b/include/linux/dma-buf.h
index 7e23758db3a4..ece4638359a8 100644
--- a/include/linux/dma-buf.h
+++ b/include/linux/dma-buf.h
@@ -324,6 +324,7 @@ struct dma_buf {
  * @dev: device attached to the buffer.
  * @node: list of dma_buf_attachment.
  * @priv: exporter specific attachment data.
+ * @importer_priv: importer specific attachment data.
  *
  * This structure holds the attachment information between the dma_buf buffer
  * and its user device(s). The list contains one attachment struct per device
@@ -340,6 +341,29 @@ struct dma_buf_attachment {
	struct list_head node;
	void *priv;
	struct sg_table *sgt;
+	void *importer_priv;
+
+	/**
+	 * @invalidate:
+	 *
+	 * Optional callback provided by the importer of the dma-buf.
+	 *
+	 * If provided the exporter can avoid pinning the backing store while
+	 * mappings exist.
+	 *
+	 * The function is called with the lock of the reservation object
+	 * associated with the dma_buf held and the mapping function must be
+	 * called with this lock held as well. This makes sure that no mapping
+	 * is created concurrently with an ongoing invalidation.
+	 *
+	 * After the callback all existing mappings are still valid until all
+	 * fences in the dma_buf's reservation object are signaled, but should
+	 * be destroyed by the importer as soon as possible.
+	 *
+	 * New mappings can be created immediately, but can't be used before
+	 * the exclusive fence in the dma_buf's reservation object is signaled.
+	 */
+	void (*invalidate)(struct dma_buf_attachment *attach);
 };

 /**
@@ -378,8 +402,10 @@ struct dma_buf_export_info {
 /**
  * struct dma_buf_attach_info - holds information needed to attach to a dma_buf
- * @dmabuf: the exported dma_buf
- * @dev: the device which wants to import the attachment
+ * @dmabuf: the exported dma_buf
+ * @dev: the device which wants to import the attachment
+ * @importer_priv: private data of importer to this attachment
+ * @invalidate: callback to use for invalidating mappings
  *
  * This structure holds the information required to attach to a buffer. Used
  * with dma_buf_attach() only.
@@ -387,6 +413,8 @@ struct dma_buf_export_info {
 struct dma_buf_attach_info {
	struct dma_buf *dmabuf;
	struct device *dev;
+	void *importer_priv;
+	void (*invalidate)(struct dma_buf_attachment *attach);
 };

 /**
@@ -423,6 +451,7 @@ void dma_buf_unmap_attachment_locked(struct dma_buf_attachment *,
				     enum dma_data_direction);
 void dma_buf_unmap_attachment(struct dma_buf_attachment *, struct sg_table *,
			      enum dma_data_direction);
+void dma_buf_invalidate_mappings(struct dma_buf *dma_buf);
 int dma_buf_begin_cpu_access(struct dma_buf *dma_buf,
			     enum dma_data_direction dir);
 int dma_buf_end_cpu_access(struct dma_buf *dma_buf,
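For illustration, a minimal importer-side sketch of using the new info structure; the my_importer type, its fields and the helper below are made up for this example:

	/* Hypothetical importer; all names here are illustrative only. */
	static void my_importer_invalidate(struct dma_buf_attachment *attach)
	{
		struct my_importer *imp = attach->importer_priv;

		/* Called with the reservation object locked; existing
		 * mappings stay valid until the fences in the reservation
		 * object signal, but should be destroyed soon. */
		imp->needs_remap = true;
	}

	static int my_importer_attach(struct my_importer *imp,
				      struct device *dev,
				      struct dma_buf *dmabuf)
	{
		struct dma_buf_attach_info attach_info = {
			.dmabuf = dmabuf,
			.dev = dev,
			.importer_priv = imp,
			.invalidate = my_importer_invalidate,
		};
		struct dma_buf_attachment *attach;

		attach = dma_buf_attach(&attach_info);
		if (IS_ERR(attach))
			return PTR_ERR(attach);

		imp->attach = attach;
		return 0;
	}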
On Tue, Apr 16, 2019 at 08:38:33PM +0200, Christian König wrote:
[...]
@@ -340,6 +341,29 @@ struct dma_buf_attachment {
	struct list_head node;
	void *priv;
	struct sg_table *sgt;
+	void *importer_priv;
+
+	/**
+	 * @invalidate:
+	 *
+	 * Optional callback provided by the importer of the dma-buf.
+	 *
+	 * If provided the exporter can avoid pinning the backing store while
+	 * mappings exist.
+	 *
+	 * The function is called with the lock of the reservation object
+	 * associated with the dma_buf held and the mapping function must be
+	 * called with this lock held as well. This makes sure that no mapping
+	 * is created concurrently with an ongoing invalidation.
+	 *
+	 * After the callback all existing mappings are still valid until all
+	 * fences in the dma_buf's reservation object are signaled, but should
+	 * be destroyed by the importer as soon as possible.
+	 *
+	 * New mappings can be created immediately, but can't be used before
+	 * the exclusive fence in the dma_buf's reservation object is signaled.
+	 */
+	void (*invalidate)(struct dma_buf_attachment *attach);
I would put the long kerneldoc into dma_buf_attach_info (as an inline comment, you can mix the styles). And reference it from here with something like
"Set from &dma_buf_attach_info.invalidate in dma_buf_attach(), see there for more information."
This here feels a bit too much hidden. -Daniel
dri-devel mailing list dri-devel@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/dri-devel
On Wed, Apr 17, 2019 at 04:01:16PM +0200, Daniel Vetter wrote:
[...]
Question on semantics: Is invalidate allowed to add new fences? I think we need that for pipelined buffer moves and stuff perhaps (or pipeline pagetable invalidates or whatever you feel like pipelining). And it should be possible (we already hold the reservation lock), and I think ttm copes (but no idea really).
Either way, docs need to be clear on this. -Daniel
On Tue, Apr 16, 2019 at 08:38:33PM +0200, Christian König wrote:
[...]
@@ -653,7 +659,16 @@ dma_buf_map_attachment_locked(struct dma_buf_attachment *attach,
	if (attach->sgt)
		return attach->sgt;

+	/*
+	 * Mapping a DMA-buf can trigger its invalidation, prevent sending this
+	 * event to the caller by temporarily removing this attachment from the
+	 * list.
+	 */
+	if (attach->invalidate)
+		list_del(&attach->node);
Just noticed this: Why do we need this? invalidate needs the reservation lock, as does map_attachment. It should be impossible to have someone else sneak in here. -Daniel
dri-devel mailing list dri-devel@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/dri-devel
On 17.04.19 at 21:07, Daniel Vetter wrote:
On Tue, Apr 16, 2019 at 08:38:33PM +0200, Christian König wrote:
[...]
@@ -653,7 +659,16 @@ dma_buf_map_attachment_locked(struct dma_buf_attachment *attach,
	if (attach->sgt)
		return attach->sgt;

+	/*
+	 * Mapping a DMA-buf can trigger its invalidation, prevent sending this
+	 * event to the caller by temporarily removing this attachment from the
+	 * list.
+	 */
+	if (attach->invalidate)
+		list_del(&attach->node);
Just noticed this: Why do we need this? invalidate needs the reservation lock, as does map_attachment. It should be impossible to have someone else sneak in here.
I was having problems with self triggered invalidations.
E.g. client A tries to map an attachment; that in turn causes the buffer to move to a new place, and client A is informed about that movement with an invalidation.
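Spelled out, the chain that the temporary list_del in dma_buf_map_attachment_locked() avoids looks roughly like this (the exporter-side move is hypothetical):

	/*
	 * Without the temporary list_del:
	 *
	 * dma_buf_map_attachment_locked(attach_A)
	 *   -> dmabuf->ops->map_dma_buf(attach_A)
	 *        -> exporter moves the buffer to a mappable place
	 *             -> dma_buf_invalidate_mappings(dmabuf)
	 *                  -> attach_A->invalidate(attach_A)  // self-triggered
	 *
	 * With attach_A temporarily off dmabuf->attachments only the other
	 * importers receive the invalidation event.
	 */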
Christian.
dri-devel mailing list dri-devel@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/dri-devel
On Wed, Apr 17, 2019 at 09:13:22PM +0200, Christian König wrote:
On 17.04.19 at 21:07, Daniel Vetter wrote:
On Tue, Apr 16, 2019 at 08:38:33PM +0200, Christian König wrote:
[...]
Just noticed this: Why do we need this? invalidate needs the reservation lock, as does map_attachment. It should be impossible to have someone else sneak in here.
I was having problems with self triggered invalidations.
E.g. client A tries to map an attachment; that in turn causes the buffer to move to a new place, and client A is informed about that movement with an invalidation.
Uh, that sounds like a bug in ttm or somewhere else in the exporter. If you evict the bo that you're trying to map, that's bad.
Or maybe it's a framework bug, and we need to track whether an attachment has a map or not. That would make more sense ... -Daniel
On 18.04.19 at 10:08, Daniel Vetter wrote:
On Wed, Apr 17, 2019 at 09:13:22PM +0200, Christian König wrote:
On 17.04.19 at 21:07, Daniel Vetter wrote:
On Tue, Apr 16, 2019 at 08:38:33PM +0200, Christian König wrote:
[...]
Just noticed this: Why do we need this? invalidate needs the reservation lock, as does map_attachment. It should be impossible to have someone else sneak in here.
I was having problems with self triggered invalidations.
E.g. client A tries to map an attachment; that in turn causes the buffer to move to a new place, and client A is informed about that movement with an invalidation.
Uh, that sounds like a bug in ttm or somewhere else in the exporter. If you evict the bo that you're trying to map, that's bad.
Or maybe it's a framework bug, and we need to track whether an attachment has a map or not. That would make more sense ...
Well neither, as far as I can see this is perfectly normal behavior.
We just don't want any invalidation send to a driver which is currently making a mapping.
If you want I can do this in the driver as well, but at least offhand it looks like a good idea to have that in common code.
Tracking the mappings could work as well, but the problem here is that I actually want the lifetime of old and new mappings to overlap for pipelining.
Regards, Christian.
On Thu, Apr 18, 2019 at 08:28:51AM +0000, Koenig, Christian wrote:
On 18.04.19 at 10:08, Daniel Vetter wrote:
On Wed, Apr 17, 2019 at 09:13:22PM +0200, Christian König wrote:
On 17.04.19 at 21:07, Daniel Vetter wrote:
On Tue, Apr 16, 2019 at 08:38:33PM +0200, Christian König wrote:
Each importer can now provide an invalidate_mappings callback.
This allows the exporter to provide the mappings without the need to pin the backing store.
v2: don't try to invalidate mappings when the callback is NULL, lock the reservation obj while using the attachments, add helper to set the callback v3: move flag for invalidation support into the DMA-buf, use new attach_info structure to set the callback v4: use importer_priv field instead of mangling exporter priv. v5: drop invalidation_supported flag
Signed-off-by: Christian König christian.koenig@amd.com
drivers/dma-buf/dma-buf.c | 37 +++++++++++++++++++++++++++++++++++++ include/linux/dma-buf.h | 33 +++++++++++++++++++++++++++++++-- 2 files changed, 68 insertions(+), 2 deletions(-)
diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c index 83c92bfd964c..a3738fab3927 100644 --- a/drivers/dma-buf/dma-buf.c +++ b/drivers/dma-buf/dma-buf.c
@@ -563,6 +563,8 @@ struct dma_buf_attachment *dma_buf_attach(const struct dma_buf_attach_info *info
 	attach->dev = info->dev;
 	attach->dmabuf = dmabuf;
+	attach->importer_priv = info->importer_priv;
+	attach->invalidate = info->invalidate;

 	mutex_lock(&dmabuf->lock);
@@ -571,7 +573,9 @@ struct dma_buf_attachment *dma_buf_attach(const struct dma_buf_attach_info *info
 		if (ret)
 			goto err_attach;
 	}
+	reservation_object_lock(dmabuf->resv, NULL);
 	list_add(&attach->node, &dmabuf->attachments);
+	reservation_object_unlock(dmabuf->resv);

 	mutex_unlock(&dmabuf->lock);
@@ -615,7 +619,9 @@ void dma_buf_detach(struct dma_buf *dmabuf, struct dma_buf_attachment *attach)
 					   DMA_BIDIRECTIONAL);

 	mutex_lock(&dmabuf->lock);
+	reservation_object_lock(dmabuf->resv, NULL);
 	list_del(&attach->node);
+	reservation_object_unlock(dmabuf->resv);
 	if (dmabuf->ops->detach)
 		dmabuf->ops->detach(dmabuf, attach);
@@ -653,7 +659,16 @@ dma_buf_map_attachment_locked(struct dma_buf_attachment *attach,
 	if (attach->sgt)
 		return attach->sgt;

+	/*
+	 * Mapping a DMA-buf can trigger its invalidation; prevent sending this
+	 * event to the caller by temporarily removing this attachment from the
+	 * list.
+	 */
+	if (attach->invalidate)
+		list_del(&attach->node);
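For readers following along: the hunk above only shows the attach/map side of the change. The exporter-facing helper this patch also adds is not quoted here; based on the rest of the thread it presumably looks roughly like the sketch below (reconstructed for illustration, not copied from the patch):

void dma_buf_invalidate_mappings(struct dma_buf *dmabuf)
{
	struct dma_buf_attachment *attach;

	/* The exporter must hold the reservation lock while invalidating. */
	reservation_object_assert_held(dmabuf->resv);

	/* Tell every dynamic importer that its cached mapping is stale. */
	list_for_each_entry(attach, &dmabuf->attachments, node)
		if (attach->invalidate)
			attach->invalidate(attach);
}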
Just noticed this: Why do we need this? invalidate needs the reservation lock, as does map_attachment. It should be impossible to have someone else sneak in here.
I was having problems with self-triggered invalidations.
E.g. client A tries to map an attachment, that in turn causes the buffer to move to a new place and client A is informed about that movement with an invalidation.
Uh, that sounds like a bug in ttm or somewhere else in the exporter. If you evict the bo that you're trying to map, that's bad.
Or maybe it's a framework bug, and we need to track whether an attachment has a map or not. That would make more sense ...
Well, neither; as far as I can see this is perfectly normal behavior.
We just don't want any invalidation sent to a driver which is currently making a mapping.
If you want I can do this in the driver as well, but at least offhand it looks like a good idea to have that in common code.
Hm. This sounds like we'd want to invalidate a specific mapping.
Tracking the mappings could work as well, but the problem here is that I actually want the lifetime of old and new mappings to overlap for pipelining.
Aside: Overlapping mappings being explicitly allowed should be in the docs. The current kerneldoc for invalidate leaves that up for interpretation. This answers one of the questions I had overnight, about whether we expect ->invalidate to tear down the mapping or not.
Imo a better semantics would be that ->invalidate must tear down the mapping, but the exporter must delay actual unmap until all fences have cleared. Otherwise you could end up with fun stuff where the exporter releases the memory (it wanted to invalidate after all), while the importer still has a mapping around. That's not going to end well I think.
That would also solve the issue of getting an invalidate while you map, at least if we filter per attachment. -Daniel
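As a concrete illustration of the semantics proposed here, an importer-side invalidate callback might look like the following sketch. struct my_import_bo and its sgt member are hypothetical names; the dma_buf_*_locked helpers are the ones added earlier in this series:

static void my_import_invalidate(struct dma_buf_attachment *attach)
{
	struct my_import_bo *bo = attach->importer_priv; /* hypothetical */

	reservation_object_assert_held(attach->dmabuf->resv);

	/* Tear down the cached mapping right away ... */
	if (bo->sgt) {
		dma_buf_unmap_attachment_locked(attach, bo->sgt,
						DMA_BIDIRECTIONAL);
		bo->sgt = NULL;
	}

	/*
	 * ... while the exporter is expected to delay releasing the
	 * backing store until all fences in the reservation object have
	 * signaled. The next use re-creates the mapping via
	 * dma_buf_map_attachment_locked().
	 */
}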
Add optional explicit pinning callbacks instead of implicitly assuming the exporter pins the buffer when a mapping is created.
Signed-off-by: Christian König christian.koenig@amd.com --- drivers/dma-buf/dma-buf.c | 39 +++++++++++++++++++++++++++++++++++++++ include/linux/dma-buf.h | 37 +++++++++++++++++++++++++++++++------ 2 files changed, 70 insertions(+), 6 deletions(-)
diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c index a3738fab3927..f23ff8355505 100644 --- a/drivers/dma-buf/dma-buf.c +++ b/drivers/dma-buf/dma-buf.c @@ -630,6 +630,41 @@ void dma_buf_detach(struct dma_buf *dmabuf, struct dma_buf_attachment *attach) } EXPORT_SYMBOL_GPL(dma_buf_detach);
+/** + * dma_buf_pin - Lock down the DMA-buf + * + * @dmabuf: [in] DMA-buf to lock down. + * + * Returns: + * 0 on success, negative error code on failure. + */ +int dma_buf_pin(struct dma_buf *dmabuf) +{ + int ret = 0; + + reservation_object_assert_held(dmabuf->resv); + + if (dmabuf->ops->pin) + ret = dmabuf->ops->pin(dmabuf); + + return ret; +} +EXPORT_SYMBOL_GPL(dma_buf_pin); + +/** + * dma_buf_unpin - Remove lock from DMA-buf + * + * @dmabuf: [in] DMA-buf to unlock. + */ +void dma_buf_unpin(struct dma_buf *dmabuf) +{ + reservation_object_assert_held(dmabuf->resv); + + if (dmabuf->ops->unpin) + dmabuf->ops->unpin(dmabuf); +} +EXPORT_SYMBOL_GPL(dma_buf_unpin); + /** * dma_buf_map_attachment_locked - Maps the buffer into _device_ address space * with the reservation lock held. Is a wrapper for map_dma_buf() of the @@ -666,6 +701,8 @@ dma_buf_map_attachment_locked(struct dma_buf_attachment *attach, */ if (attach->invalidate) list_del(&attach->node); + else + dma_buf_pin(attach->dmabuf); sg_table = attach->dmabuf->ops->map_dma_buf(attach, direction); if (attach->invalidate) list_add(&attach->node, &attach->dmabuf->attachments); @@ -735,6 +772,8 @@ void dma_buf_unmap_attachment_locked(struct dma_buf_attachment *attach,
attach->dmabuf->ops->unmap_dma_buf(attach, sg_table, direction); + if (!attach->invalidate) + dma_buf_unpin(attach->dmabuf); } EXPORT_SYMBOL_GPL(dma_buf_unmap_attachment_locked);
diff --git a/include/linux/dma-buf.h b/include/linux/dma-buf.h index ece4638359a8..a615b74e5894 100644 --- a/include/linux/dma-buf.h +++ b/include/linux/dma-buf.h @@ -100,14 +100,40 @@ struct dma_buf_ops { */ void (*detach)(struct dma_buf *, struct dma_buf_attachment *);
+ /** + * @pin_dma_buf: + * + * This is called by dma_buf_pin and lets the exporter know that an + * importer assumes that the DMA-buf can't be invalidated any more. + * + * This is called with the dmabuf->resv object locked. + * + * This callback is optional. + * + * Returns: + * + * 0 on success, negative error code on failure. + */ + int (*pin)(struct dma_buf *); + + /** + * @unpin_dma_buf: + * + * This is called by dma_buf_unpin and lets the exporter know that an + * importer doesn't need to the DMA-buf to stay were it is any more. + * + * This is called with the dmabuf->resv object locked. + * + * This callback is optional. + */ + void (*unpin)(struct dma_buf *); + /** * @map_dma_buf: * * This is called by dma_buf_map_attachment() and is used to map a * shared &dma_buf into device address space, and it is mandatory. It - * can only be called if @attach has been called successfully. This - * essentially pins the DMA buffer into place, and it cannot be moved - * any more + * can only be called if @attach has been called successfully. * * This call may sleep, e.g. when the backing storage first needs to be * allocated, or moved to a location suitable for all currently attached @@ -148,9 +174,6 @@ struct dma_buf_ops { * * This is called by dma_buf_unmap_attachment() and should unmap and * release the &sg_table allocated in @map_dma_buf, and it is mandatory. - * It should also unpin the backing storage if this is the last mapping - * of the DMA buffer, it the exporter supports backing storage - * migration. * * This is always called with the dmabuf->resv object locked when * no_sgt_cache is true. @@ -442,6 +465,8 @@ int dma_buf_fd(struct dma_buf *dmabuf, int flags); struct dma_buf *dma_buf_get(int fd); void dma_buf_put(struct dma_buf *dmabuf);
+int dma_buf_pin(struct dma_buf *dmabuf); +void dma_buf_unpin(struct dma_buf *dmabuf); struct sg_table *dma_buf_map_attachment_locked(struct dma_buf_attachment *, enum dma_data_direction); struct sg_table *dma_buf_map_attachment(struct dma_buf_attachment *,
On Tue, Apr 16, 2019 at 08:38:34PM +0200, Christian König wrote:
Add optional explicit pinning callbacks instead of implicitly assuming the exporter pins the buffer when a mapping is created.
Signed-off-by: Christian König christian.koenig@amd.com
Don't we need this together with the invalidate callback and the dynamic stuff? Also I'm assuming that pin/unpin is pretty much required for dynamic bo, so could we look at these callbacks instead of the dynamic flag you add in patch 1.
I'm assuming the following rules hold:
- No pin/unpin from exporter: dma-buf is not dynamic, and pinned for the duration of map/unmap. I'm not 100% sure whether really everyone wants the mapping to be cached for the entire attachment, only drm_prime does that. And that's not the only dma-buf importer. pin/unpin calls are noops.
- pin/unpin exist in the exporter, but importer has not provided an invalidate callback: We map at attach time, and we also have to pin, since the importer can't handle the buffer disappearing, at attach time. We unmap/unpin at detach.
- pin/unpin from exporter, invalidate from importer: Full dynamic mapping. We assume the importer will do caching, attach fences as needed, and pin the underlying bo when it needs it permanently, without attaching fences (i.e. the scanout case).
Assuming I'm not terribly off with my understanding, then I think it'd be best to introduce the entire new dma-buf api in the first patch, and flesh it out later, instead of spreading it over a few patches. Plus the above (maybe prettier) as a nice kerneldoc overview comment for how dynamic dma-buf is really supposed to work. -Daniel
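A minimal sketch of how these three cases could be encoded in the core, assuming pin/unpin live in dma_buf_ops; dma_buf_is_dynamic() and dma_buf_attach_sketch() are hypothetical names illustrating the proposed contract, not the actual patch:

static bool dma_buf_is_dynamic(struct dma_buf *dmabuf)
{
	/* An exporter that provides pin/unpin supports dynamic sharing. */
	return dmabuf->ops->pin != NULL;
}

static int dma_buf_attach_sketch(struct dma_buf *dmabuf,
				 struct dma_buf_attachment *attach)
{
	/* Case 1: exporter always pins, nothing to do at attach time. */
	if (!dma_buf_is_dynamic(dmabuf))
		return 0;

	/* Case 2: dynamic exporter, static importer: pin at attach time. */
	if (!attach->invalidate)
		return dmabuf->ops->pin(dmabuf);

	/* Case 3: fully dynamic, the importer pins explicitly on demand. */
	return 0;
}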
On Wed, Apr 17, 2019 at 04:20:02PM +0200, Daniel Vetter wrote:
On Tue, Apr 16, 2019 at 08:38:34PM +0200, Christian König wrote:
Add optional explicit pinning callbacks instead of implicitly assuming the exporter pins the buffer when a mapping is created.
Signed-off-by: Christian König christian.koenig@amd.com
Don't we need this together with the invalidate callback and the dynamic stuff? Also I'm assuming that pin/unpin is pretty much required for dynamic bo, so could we look at these callbacks instead of the dynamic flag you add in patch 1.
I'm assuming the following rules hold:
- No pin/unpin from exporter: dma-buf is not dynamic, and pinned for the duration of map/unmap. I'm not 100% sure whether really everyone wants the mapping to be cached for the entire attachment, only drm_prime does that. And that's not the only dma-buf importer. pin/unpin calls are noops.
- pin/unpin exist in the exporter, but importer has not provided an invalidate callback: We map at attach time, and we also have to pin, since the importer can't handle the buffer disappearing, at attach time. We unmap/unpin at detach.
For this case we should have a WARN in pin/unpin, to make sure importers don't do something stupid. One more thought below on pin/unpin.
- pin/unpin from exporter, invalidate from importer: Full dynamic mapping. We assume the importer will do caching, attach fences as needed, and pin the underlying bo when it needs it permanently, without attaching fences (i.e. the scanout case).
Assuming I'm not terribly off with my understanding, then I think it'd be best to introduce the entire new dma-buf api in the first patch, and flesh it out later, instead of spreading it over a few patches. Plus the above (maybe prettier) as a nice kerneldoc overview comment for how dynamic dma-buf is really supposed to work. -Daniel
drivers/dma-buf/dma-buf.c | 39 +++++++++++++++++++++++++++++++++++++++ include/linux/dma-buf.h | 37 +++++++++++++++++++++++++++++++------ 2 files changed, 70 insertions(+), 6 deletions(-)
diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c index a3738fab3927..f23ff8355505 100644 --- a/drivers/dma-buf/dma-buf.c +++ b/drivers/dma-buf/dma-buf.c
@@ -630,6 +630,41 @@ void dma_buf_detach(struct dma_buf *dmabuf, struct dma_buf_attachment *attach)
 }
 EXPORT_SYMBOL_GPL(dma_buf_detach);
+/**
+ * dma_buf_pin - Lock down the DMA-buf
+ *
+ * @dmabuf: [in] DMA-buf to lock down.
+ *
+ * Returns:
+ * 0 on success, negative error code on failure.
+ */
+int dma_buf_pin(struct dma_buf *dmabuf)
Hm, I think it'd be better to pin the attachment, not the underlying buffer. The attachment is the thing the importer will have to pin, and it's at attach/detach time where dma-buf needs to pin for importers who don't understand dynamic buffer sharing.
Plus when we put that onto attachments, we can do a
WARN_ON(!attach->invalidate);
sanity check. I think that would be good to have. -Daniel
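A sketch of that suggestion, with dma_buf_attachment_pin() as a hypothetical name for the attachment-based variant:

static int dma_buf_attachment_pin(struct dma_buf_attachment *attach)
{
	struct dma_buf *dmabuf = attach->dmabuf;

	reservation_object_assert_held(dmabuf->resv);

	/*
	 * Static importers are pinned by the core at attach time already;
	 * an explicit pin only makes sense for dynamic importers.
	 */
	if (WARN_ON(!attach->invalidate))
		return -EINVAL;

	return dmabuf->ops->pin ? dmabuf->ops->pin(dmabuf) : 0;
}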
On Wed, Apr 17, 2019 at 04:30:51PM +0200, Daniel Vetter wrote:
On Wed, Apr 17, 2019 at 04:20:02PM +0200, Daniel Vetter wrote:
On Tue, Apr 16, 2019 at 08:38:34PM +0200, Christian König wrote:
Add optional explicit pinning callbacks instead of implicitly assuming the exporter pins the buffer when a mapping is created.
Signed-off-by: Christian König christian.koenig@amd.com
Don't we need this together with the invalidate callback and the dynamic stuff? Also I'm assuming that pin/unpin is pretty much required for dynamic bo, so could we look at these callbacks instead of the dynamic flag you add in patch 1.
I'm assuming the following rules hold:
- No pin/unpin from exporter: dma-buf is not dynamic, and pinned for the duration of map/unmap. I'm not 100% sure whether really everyone wants the mapping to be cached for the entire attachment, only drm_prime does that. And that's not the only dma-buf importer. pin/unpin calls are noops.
- pin/unpin exist in the exporter, but importer has not provided an invalidate callback: We map at attach time, and we also have to pin, since the importer can't handle the buffer disappearing, at attach time. We unmap/unpin at detach.
For this case we should have a WARN in pin/unpin, to make sure importers don't do something stupid. One more thought below on pin/unpin.
- pin/unpin from exporter, invalidate from importer: Full dynamic mapping. We assume the importer will do caching, attach fences as needed, and pin the underlying bo when it needs it permanently, without attaching fences (i.e. the scanout case).
Assuming I'm not terribly off with my understanding, then I think it'd be best to introduce the entire new dma-buf api in the first patch, and flesh it out later, instead of spreading it over a few patches. Plus the above (maybe prettier) as a nice kerneldoc overview comment for how dynamic dma-buf is really supposed to work. -Daniel
Hm, I think it'd be better to pin the attachment, not the underlying buffer. The attachment is the thing the importer will have to pin, and it's at attach/detach time where dma-buf needs to pin for importers who don't understand dynamic buffer sharing.
Plus when we put that onto attachments, we can do a
WARN_ON(!attach->invalidate);
sanity check. I think that would be good to have.
Another validation idea: dma-buf.c could track the pin_count on the struct dma_buf, and if an exporter tries to invalidate while pinned, WARN and bail out, because that's clearly a driver bug.
All in the interest of making the contract between importers and exporters as clear as possible. -Daniel
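A sketch of that validation, assuming a hypothetical pin_count field on struct dma_buf and the invalidate helper from earlier in the series:

int dma_buf_pin(struct dma_buf *dmabuf)
{
	int ret = 0;

	reservation_object_assert_held(dmabuf->resv);

	if (dmabuf->ops->pin)
		ret = dmabuf->ops->pin(dmabuf);
	if (!ret)
		dmabuf->pin_count++; /* hypothetical counter */

	return ret;
}

void dma_buf_invalidate_mappings(struct dma_buf *dmabuf)
{
	struct dma_buf_attachment *attach;

	reservation_object_assert_held(dmabuf->resv);

	/* Invalidating a pinned buffer is clearly an exporter bug. */
	if (WARN_ON(dmabuf->pin_count))
		return;

	list_for_each_entry(attach, &dmabuf->attachments, node)
		if (attach->invalidate)
			attach->invalidate(attach);
}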
On Wed, Apr 17, 2019 at 04:40:11PM +0200, Daniel Vetter wrote:
On Wed, Apr 17, 2019 at 04:30:51PM +0200, Daniel Vetter wrote:
On Wed, Apr 17, 2019 at 04:20:02PM +0200, Daniel Vetter wrote:
On Tue, Apr 16, 2019 at 08:38:34PM +0200, Christian König wrote:
Add optional explicit pinning callbacks instead of implicitly assuming the exporter pins the buffer when a mapping is created.
Signed-off-by: Christian König christian.koenig@amd.com
Don't we need this together with the invalidate callback and the dynamic stuff? Also I'm assuming that pin/unpin is pretty much required for dynamic bo, so could we look at these callbacks instead of the dynamic flag you add in patch 1.
I'm assuming the following rules hold:
- No pin/unpin from exporter: dma-buf is not dynamic, and pinned for the duration of map/unmap. I'm not 100% sure whether really everyone wants the mapping to be cached for the entire attachment, only drm_prime does that. And that's not the only dma-buf importer. pin/unpin calls are noops.
- pin/unpin exist in the exporter, but importer has not provided an invalidate callback: We map at attach time, and we also have to pin, since the importer can't handle the buffer disappearing, at attach time. We unmap/unpin at detach.
For this case we should have a WARN in pin/unpin, to make sure importers don't do something stupid. One more thought below on pin/unpin.
btw just realized that you already have the pin/unpin here. Even more reason, I think, to squash the invalidate/pin bits together, since they don't really work separately. -Daniel
On Tue, Apr 16, 2019 at 2:39 PM Christian König ckoenig.leichtzumerken@gmail.com wrote:
Add optional explicit pinning callbacks instead of implicitly assuming the exporter pins the buffer when a mapping is created.
Signed-off-by: Christian König christian.koenig@amd.com
[...]
/**
* @unpin_dma_buf:
*
* This is called by dma_buf_unpin and lets the exporter know that an
* importer doesn't need to the DMA-buf to stay were it is any more.
This should read: * importer doesn't need the DMA-buf to stay where it is anymore.
The SGT caching and pinning at attach time is now done by the DMA-buf helpers instead.
Signed-off-by: Christian König christian.koenig@amd.com --- drivers/gpu/drm/drm_prime.c | 76 ++++++++----------------------------- 1 file changed, 16 insertions(+), 60 deletions(-)
diff --git a/drivers/gpu/drm/drm_prime.c b/drivers/gpu/drm/drm_prime.c index 1fadf5d5ed33..7e439ea3546a 100644 --- a/drivers/gpu/drm/drm_prime.c +++ b/drivers/gpu/drm/drm_prime.c @@ -86,11 +86,6 @@ struct drm_prime_member { struct rb_node handle_rb; };
-struct drm_prime_attachment { - struct sg_table *sgt; - enum dma_data_direction dir; -}; - static int drm_prime_add_buf_handle(struct drm_prime_file_private *prime_fpriv, struct dma_buf *dma_buf, uint32_t handle) { @@ -188,25 +183,16 @@ static int drm_prime_lookup_buf_handle(struct drm_prime_file_private *prime_fpri * @dma_buf: buffer to attach device to * @attach: buffer attachment data * - * Allocates &drm_prime_attachment and calls &drm_driver.gem_prime_pin for - * device specific attachment. This can be used as the &dma_buf_ops.attach - * callback. + * Calls &drm_driver.gem_prime_pin for device specific handling. This can be + * used as the &dma_buf_ops.attach callback. * * Returns 0 on success, negative error code on failure. */ int drm_gem_map_attach(struct dma_buf *dma_buf, struct dma_buf_attachment *attach) { - struct drm_prime_attachment *prime_attach; struct drm_gem_object *obj = dma_buf->priv;
- prime_attach = kzalloc(sizeof(*prime_attach), GFP_KERNEL); - if (!prime_attach) - return -ENOMEM; - - prime_attach->dir = DMA_NONE; - attach->priv = prime_attach; - return drm_gem_pin(obj); } EXPORT_SYMBOL(drm_gem_map_attach); @@ -222,26 +208,8 @@ EXPORT_SYMBOL(drm_gem_map_attach); void drm_gem_map_detach(struct dma_buf *dma_buf, struct dma_buf_attachment *attach) { - struct drm_prime_attachment *prime_attach = attach->priv; struct drm_gem_object *obj = dma_buf->priv;
- if (prime_attach) { - struct sg_table *sgt = prime_attach->sgt; - - if (sgt) { - if (prime_attach->dir != DMA_NONE) - dma_unmap_sg_attrs(attach->dev, sgt->sgl, - sgt->nents, - prime_attach->dir, - DMA_ATTR_SKIP_CPU_SYNC); - sg_free_table(sgt); - } - - kfree(sgt); - kfree(prime_attach); - attach->priv = NULL; - } - drm_gem_unpin(obj); } EXPORT_SYMBOL(drm_gem_map_detach); @@ -286,39 +254,22 @@ void drm_prime_remove_buf_handle_locked(struct drm_prime_file_private *prime_fpr struct sg_table *drm_gem_map_dma_buf(struct dma_buf_attachment *attach, enum dma_data_direction dir) { - struct drm_prime_attachment *prime_attach = attach->priv; struct drm_gem_object *obj = attach->dmabuf->priv; struct sg_table *sgt;
- if (WARN_ON(dir == DMA_NONE || !prime_attach)) + if (WARN_ON(dir == DMA_NONE)) return ERR_PTR(-EINVAL);
- /* return the cached mapping when possible */ - if (prime_attach->dir == dir) - return prime_attach->sgt; - - /* - * two mappings with different directions for the same attachment are - * not allowed - */ - if (WARN_ON(prime_attach->dir != DMA_NONE)) - return ERR_PTR(-EBUSY); - if (obj->funcs) sgt = obj->funcs->get_sg_table(obj); else sgt = obj->dev->driver->gem_prime_get_sg_table(obj);
- if (!IS_ERR(sgt)) { - if (!dma_map_sg_attrs(attach->dev, sgt->sgl, sgt->nents, dir, - DMA_ATTR_SKIP_CPU_SYNC)) { - sg_free_table(sgt); - kfree(sgt); - sgt = ERR_PTR(-ENOMEM); - } else { - prime_attach->sgt = sgt; - prime_attach->dir = dir; - } + if (!dma_map_sg_attrs(attach->dev, sgt->sgl, sgt->nents, dir, + DMA_ATTR_SKIP_CPU_SYNC)) { + sg_free_table(sgt); + kfree(sgt); + sgt = ERR_PTR(-ENOMEM); }
return sgt; @@ -331,14 +282,19 @@ EXPORT_SYMBOL(drm_gem_map_dma_buf); * @sgt: scatterlist info of the buffer to unmap * @dir: direction of DMA transfer * - * Not implemented. The unmap is done at drm_gem_map_detach(). This can be - * used as the &dma_buf_ops.unmap_dma_buf callback. + * This can be used as the &dma_buf_ops.unmap_dma_buf callback. */ void drm_gem_unmap_dma_buf(struct dma_buf_attachment *attach, struct sg_table *sgt, enum dma_data_direction dir) { - /* nothing to be done here */ + if (!sgt) + return; + + dma_unmap_sg_attrs(attach->dev, sgt->sgl, sgt->nents, dir, + DMA_ATTR_SKIP_CPU_SYNC); + sg_free_table(sgt); + kfree(sgt); } EXPORT_SYMBOL(drm_gem_unmap_dma_buf);
Pipeline removal of the BO's backing store when no placement is given during validation.
Signed-off-by: Christian König christian.koenig@amd.com --- drivers/gpu/drm/ttm/ttm_bo.c | 12 ++++++++++++ 1 file changed, 12 insertions(+)
diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c index 41d07faa2eae..8e7e7caee9d5 100644 --- a/drivers/gpu/drm/ttm/ttm_bo.c +++ b/drivers/gpu/drm/ttm/ttm_bo.c @@ -1161,6 +1161,18 @@ int ttm_bo_validate(struct ttm_buffer_object *bo, uint32_t new_flags;
reservation_object_assert_held(bo->resv); + + /* + * Remove the backing store if no placement is given. + */ + if (!placement->num_placement && !placement->num_busy_placement) { + ret = ttm_bo_pipeline_gutting(bo); + if (ret) + return ret; + + return ttm_tt_create(bo, false); + } + /* * Check whether we need to move buffer. */
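Usage sketch for the hunk above: a caller can now drop a BO's backing store simply by validating against an empty placement, which is what the amdgpu invalidate path later in this series does. evict_backing_store() is a hypothetical wrapper:

static int evict_backing_store(struct ttm_buffer_object *bo)
{
	struct ttm_operation_ctx ctx = { false, false };
	struct ttm_placement placement = {}; /* no placements given */

	reservation_object_assert_held(bo->resv);

	/* Pipelines the eviction and leaves the BO without backing store. */
	return ttm_bo_validate(bo, &placement, &ctx);
}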
This way we can even pipeline imported BO evictions.
Signed-off-by: Christian König christian.koenig@amd.com --- drivers/gpu/drm/ttm/ttm_bo_util.c | 18 +----------------- 1 file changed, 1 insertion(+), 17 deletions(-)
diff --git a/drivers/gpu/drm/ttm/ttm_bo_util.c b/drivers/gpu/drm/ttm/ttm_bo_util.c index 895d77d799e4..97f35c4bda35 100644 --- a/drivers/gpu/drm/ttm/ttm_bo_util.c +++ b/drivers/gpu/drm/ttm/ttm_bo_util.c @@ -486,7 +486,6 @@ static int ttm_buffer_object_transfer(struct ttm_buffer_object *bo, struct ttm_buffer_object **new_obj) { struct ttm_transfer_obj *fbo; - int ret;
fbo = kmalloc(sizeof(*fbo), GFP_KERNEL); if (!fbo) @@ -517,10 +516,7 @@ static int ttm_buffer_object_transfer(struct ttm_buffer_object *bo, kref_init(&fbo->base.kref); fbo->base.destroy = &ttm_transfered_destroy; fbo->base.acc_size = 0; - fbo->base.resv = &fbo->base.ttm_resv; - reservation_object_init(fbo->base.resv); - ret = reservation_object_trylock(fbo->base.resv); - WARN_ON(!ret); + reservation_object_init(&fbo->base.ttm_resv);
*new_obj = &fbo->base; return 0; @@ -716,8 +712,6 @@ int ttm_bo_move_accel_cleanup(struct ttm_buffer_object *bo, if (ret) return ret;
- reservation_object_add_excl_fence(ghost_obj->resv, fence); - /** * If we're not moving to fixed memory, the TTM object * needs to stay alive. Otherwhise hang it on the ghost @@ -729,7 +723,6 @@ int ttm_bo_move_accel_cleanup(struct ttm_buffer_object *bo, else bo->ttm = NULL;
- ttm_bo_unreserve(ghost_obj); ttm_bo_put(ghost_obj); }
@@ -772,8 +765,6 @@ int ttm_bo_pipeline_move(struct ttm_buffer_object *bo, if (ret) return ret;
- reservation_object_add_excl_fence(ghost_obj->resv, fence); - /** * If we're not moving to fixed memory, the TTM object * needs to stay alive. Otherwhise hang it on the ghost @@ -785,7 +776,6 @@ int ttm_bo_pipeline_move(struct ttm_buffer_object *bo, else bo->ttm = NULL;
- ttm_bo_unreserve(ghost_obj); ttm_bo_put(ghost_obj);
} else if (from->flags & TTM_MEMTYPE_FLAG_FIXED) { @@ -841,16 +831,10 @@ int ttm_bo_pipeline_gutting(struct ttm_buffer_object *bo) if (ret) return ret;
- ret = reservation_object_copy_fences(ghost->resv, bo->resv); - /* Last resort, wait for the BO to be idle when we are OOM */ - if (ret) - ttm_bo_wait(bo, false, false); - memset(&bo->mem, 0, sizeof(bo->mem)); bo->mem.mem_type = TTM_PL_SYSTEM; bo->ttm = NULL;
- ttm_bo_unreserve(ghost); ttm_bo_put(ghost);
return 0;
The caching of SGTs is actually quite harmful and should probably be removed altogether when all drivers are audited.
Start by providing a separate DMA-buf export implementation in amdgpu. This is also a prerequisite of unpinned DMA-buf handling.
v2: fix unintended recursion, remove debugging leftovers v3: split out from unpinned DMA-buf work v4: rebase on top of new no_sgt_cache flag
Signed-off-by: Christian König christian.koenig@amd.com --- drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 1 - drivers/gpu/drm/amd/amdgpu/amdgpu_gem.h | 1 - drivers/gpu/drm/amd/amdgpu/amdgpu_prime.c | 94 ++++++++++------------- 3 files changed, 40 insertions(+), 56 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c index 13a68f62bcc8..f1815223a1a1 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c @@ -1254,7 +1254,6 @@ static struct drm_driver kms_driver = { .gem_prime_export = amdgpu_gem_prime_export, .gem_prime_import = amdgpu_gem_prime_import, .gem_prime_res_obj = amdgpu_gem_prime_res_obj, - .gem_prime_get_sg_table = amdgpu_gem_prime_get_sg_table, .gem_prime_import_sg_table = amdgpu_gem_prime_import_sg_table, .gem_prime_vmap = amdgpu_gem_prime_vmap, .gem_prime_vunmap = amdgpu_gem_prime_vunmap, diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.h index f1ddfc50bcc7..0c50d14a9739 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.h +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.h @@ -39,7 +39,6 @@ int amdgpu_gem_object_open(struct drm_gem_object *obj, void amdgpu_gem_object_close(struct drm_gem_object *obj, struct drm_file *file_priv); unsigned long amdgpu_gem_timeout(uint64_t timeout_ns); -struct sg_table *amdgpu_gem_prime_get_sg_table(struct drm_gem_object *obj); struct drm_gem_object * amdgpu_gem_prime_import_sg_table(struct drm_device *dev, struct dma_buf_attachment *attach, diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_prime.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_prime.c index a38e0fb4a6fe..8d748f9d0292 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_prime.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_prime.c @@ -40,22 +40,6 @@ #include <linux/dma-buf.h> #include <linux/dma-fence-array.h>
-/** - * amdgpu_gem_prime_get_sg_table - &drm_driver.gem_prime_get_sg_table - * implementation - * @obj: GEM buffer object (BO) - * - * Returns: - * A scatter/gather table for the pinned pages of the BO's memory. - */ -struct sg_table *amdgpu_gem_prime_get_sg_table(struct drm_gem_object *obj) -{ - struct amdgpu_bo *bo = gem_to_amdgpu_bo(obj); - int npages = bo->tbo.num_pages; - - return drm_prime_pages_to_sg(bo->tbo.ttm->pages, npages); -} - /** * amdgpu_gem_prime_vmap - &dma_buf_ops.vmap implementation * @obj: GEM BO @@ -231,34 +215,29 @@ __reservation_object_make_exclusive(struct reservation_object *obj) }
/** - * amdgpu_gem_map_attach - &dma_buf_ops.attach implementation - * @dma_buf: Shared DMA buffer + * amdgpu_gem_map_dma_buf - &dma_buf_ops.map_dma_buf implementation * @attach: DMA-buf attachment + * @dir: DMA direction * * Makes sure that the shared DMA buffer can be accessed by the target device. * For now, simply pins it to the GTT domain, where it should be accessible by * all DMA devices. * * Returns: - * 0 on success or a negative error code on failure. + * sg_table filled with the DMA addresses to use or ERR_PRT with negative error + * code. */ -static int amdgpu_gem_map_attach(struct dma_buf *dma_buf, - struct dma_buf_attachment *attach) +static struct sg_table * +amdgpu_gem_map_dma_buf(struct dma_buf_attachment *attach, + enum dma_data_direction dir) { + struct dma_buf *dma_buf = attach->dmabuf; struct drm_gem_object *obj = dma_buf->priv; struct amdgpu_bo *bo = gem_to_amdgpu_bo(obj); struct amdgpu_device *adev = amdgpu_ttm_adev(bo->tbo.bdev); + struct sg_table *sgt; long r;
- r = drm_gem_map_attach(dma_buf, attach); - if (r) - return r; - - r = amdgpu_bo_reserve(bo, false); - if (unlikely(r != 0)) - goto error_detach; - - if (attach->dev->driver != adev->dev->driver) { /* * We only create shared fences for internal use, but importers @@ -270,53 +249,61 @@ static int amdgpu_gem_map_attach(struct dma_buf *dma_buf, */ r = __reservation_object_make_exclusive(bo->tbo.resv); if (r) - goto error_unreserve; + return ERR_PTR(r); }
/* pin buffer into GTT */ r = amdgpu_bo_pin(bo, AMDGPU_GEM_DOMAIN_GTT); if (r) - goto error_unreserve; + return ERR_PTR(r); + + sgt = drm_prime_pages_to_sg(bo->tbo.ttm->pages, bo->tbo.num_pages); + if (IS_ERR(sgt)) + return sgt; + + if (!dma_map_sg_attrs(attach->dev, sgt->sgl, sgt->nents, dir, + DMA_ATTR_SKIP_CPU_SYNC)) + goto error_free;
if (attach->dev->driver != adev->dev->driver) bo->prime_shared_count++;
-error_unreserve: - amdgpu_bo_unreserve(bo); + return sgt;
-error_detach: - if (r) - drm_gem_map_detach(dma_buf, attach); - return r; +error_free: + sg_free_table(sgt); + kfree(sgt); + return ERR_PTR(-ENOMEM); }
/** - * amdgpu_gem_map_detach - &dma_buf_ops.detach implementation - * @dma_buf: Shared DMA buffer + * amdgpu_gem_unmap_dma_buf - &dma_buf_ops.unmap_dma_buf implementation * @attach: DMA-buf attachment + * @sgt: sg_table to unmap + * @dir: DMA direction * * This is called when a shared DMA buffer no longer needs to be accessible by * another device. For now, simply unpins the buffer from GTT. */ -static void amdgpu_gem_map_detach(struct dma_buf *dma_buf, - struct dma_buf_attachment *attach) +static void amdgpu_gem_unmap_dma_buf(struct dma_buf_attachment *attach, + struct sg_table *sgt, + enum dma_data_direction dir) { + struct dma_buf *dma_buf = attach->dmabuf; struct drm_gem_object *obj = dma_buf->priv; struct amdgpu_bo *bo = gem_to_amdgpu_bo(obj); struct amdgpu_device *adev = amdgpu_ttm_adev(bo->tbo.bdev); - int ret = 0; - - ret = amdgpu_bo_reserve(bo, true); - if (unlikely(ret != 0)) - goto error;
amdgpu_bo_unpin(bo); + if (attach->dev->driver != adev->dev->driver && bo->prime_shared_count) bo->prime_shared_count--; - amdgpu_bo_unreserve(bo);
-error: - drm_gem_map_detach(dma_buf, attach); + if (sgt) { + dma_unmap_sg(attach->dev, sgt->sgl, sgt->nents, dir); + sg_free_table(sgt); + kfree(sgt); + } }
/** @@ -374,10 +361,9 @@ static int amdgpu_gem_begin_cpu_access(struct dma_buf *dma_buf, }
const struct dma_buf_ops amdgpu_dmabuf_ops = { - .attach = amdgpu_gem_map_attach, - .detach = amdgpu_gem_map_detach, - .map_dma_buf = drm_gem_map_dma_buf, - .unmap_dma_buf = drm_gem_unmap_dma_buf, + .dynamic_sgt_mapping = true, + .map_dma_buf = amdgpu_gem_map_dma_buf, + .unmap_dma_buf = amdgpu_gem_unmap_dma_buf, .release = drm_gem_dmabuf_release, .begin_cpu_access = amdgpu_gem_begin_cpu_access, .mmap = drm_gem_dmabuf_mmap,
Instead of relying on the DRM functions just implement our own import functions. This prepares support for taking care of unpinned DMA-buf.
v2: enable for all exporters, not just amdgpu, fix invalidation handling, lock reservation object while setting callback v3: change to new dma_buf attach interface v4: split out from unpinned DMA-buf work
Signed-off-by: Christian König christian.koenig@amd.com --- drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 1 - drivers/gpu/drm/amd/amdgpu/amdgpu_gem.h | 4 --- drivers/gpu/drm/amd/amdgpu/amdgpu_prime.c | 42 +++++++++++++++-------- drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 34 +++++++++++++++--- 4 files changed, 56 insertions(+), 25 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c index f1815223a1a1..95195c427e85 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c @@ -1254,7 +1254,6 @@ static struct drm_driver kms_driver = { .gem_prime_export = amdgpu_gem_prime_export, .gem_prime_import = amdgpu_gem_prime_import, .gem_prime_res_obj = amdgpu_gem_prime_res_obj, - .gem_prime_import_sg_table = amdgpu_gem_prime_import_sg_table, .gem_prime_vmap = amdgpu_gem_prime_vmap, .gem_prime_vunmap = amdgpu_gem_prime_vunmap, .gem_prime_mmap = amdgpu_gem_prime_mmap, diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.h index 0c50d14a9739..01811d8aa8a9 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.h +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.h @@ -39,10 +39,6 @@ int amdgpu_gem_object_open(struct drm_gem_object *obj, void amdgpu_gem_object_close(struct drm_gem_object *obj, struct drm_file *file_priv); unsigned long amdgpu_gem_timeout(uint64_t timeout_ns); -struct drm_gem_object * -amdgpu_gem_prime_import_sg_table(struct drm_device *dev, - struct dma_buf_attachment *attach, - struct sg_table *sg); struct dma_buf *amdgpu_gem_prime_export(struct drm_device *dev, struct drm_gem_object *gobj, int flags); diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_prime.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_prime.c index 8d748f9d0292..56e2a606b9a1 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_prime.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_prime.c @@ -122,31 +122,28 @@ int amdgpu_gem_prime_mmap(struct drm_gem_object *obj, struct vm_area_struct *vma }
/** - * amdgpu_gem_prime_import_sg_table - &drm_driver.gem_prime_import_sg_table - * implementation + * amdgpu_gem_prime_create_obj - create BO for DMA-buf import + * * @dev: DRM device - * @attach: DMA-buf attachment - * @sg: Scatter/gather table + * @dma_buf: DMA-buf * - * Imports shared DMA buffer memory exported by another device. + * Creates an empty SG BO for DMA-buf import. * * Returns: * A new GEM BO of the given DRM device, representing the memory * described by the given DMA-buf attachment and scatter/gather table. */ -struct drm_gem_object * -amdgpu_gem_prime_import_sg_table(struct drm_device *dev, - struct dma_buf_attachment *attach, - struct sg_table *sg) +static struct drm_gem_object * +amdgpu_gem_prime_create_obj(struct drm_device *dev, struct dma_buf *dma_buf) { - struct reservation_object *resv = attach->dmabuf->resv; + struct reservation_object *resv = dma_buf->resv; struct amdgpu_device *adev = dev->dev_private; struct amdgpu_bo *bo; struct amdgpu_bo_param bp; int ret;
memset(&bp, 0, sizeof(bp)); - bp.size = attach->dmabuf->size; + bp.size = dma_buf->size; bp.byte_align = PAGE_SIZE; bp.domain = AMDGPU_GEM_DOMAIN_CPU; bp.flags = 0; @@ -157,11 +154,9 @@ amdgpu_gem_prime_import_sg_table(struct drm_device *dev, if (ret) goto error;
- bo->tbo.sg = sg; - bo->tbo.ttm->sg = sg; bo->allowed_domains = AMDGPU_GEM_DOMAIN_GTT; bo->preferred_domains = AMDGPU_GEM_DOMAIN_GTT; - if (attach->dmabuf->ops != &amdgpu_dmabuf_ops) + if (dma_buf->ops != &amdgpu_dmabuf_ops) bo->prime_shared_count = 1;
ww_mutex_unlock(&resv->lock); @@ -417,6 +412,11 @@ struct dma_buf *amdgpu_gem_prime_export(struct drm_device *dev, struct drm_gem_object *amdgpu_gem_prime_import(struct drm_device *dev, struct dma_buf *dma_buf) { + struct dma_buf_attach_info attach_info = { + .dev = dev->dev, + .dmabuf = dma_buf, + }; + struct dma_buf_attachment *attach; struct drm_gem_object *obj;
if (dma_buf->ops == &amdgpu_dmabuf_ops) { @@ -431,5 +431,17 @@ struct drm_gem_object *amdgpu_gem_prime_import(struct drm_device *dev, } }
- return drm_gem_prime_import(dev, dma_buf); + obj = amdgpu_gem_prime_create_obj(dev, dma_buf); + if (IS_ERR(obj)) + return obj; + + attach = dma_buf_attach(&attach_info); + if (IS_ERR(attach)) { + drm_gem_object_put(obj); + return ERR_CAST(attach); + } + + get_dma_buf(dma_buf); + obj->import_attach = attach; + return obj; } diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c index c14198737dcd..afccca5b1f5f 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c @@ -44,6 +44,7 @@ #include <linux/debugfs.h> #include <linux/iommu.h> #include <linux/hmm.h> +#include <linux/dma-buf.h> #include "amdgpu.h" #include "amdgpu_object.h" #include "amdgpu_trace.h" @@ -706,6 +707,7 @@ static unsigned long amdgpu_ttm_io_mem_pfn(struct ttm_buffer_object *bo, */ struct amdgpu_ttm_tt { struct ttm_dma_tt ttm; + struct drm_gem_object *gobj; u64 offset; uint64_t userptr; struct task_struct *usertask; @@ -1179,6 +1181,7 @@ static struct ttm_tt *amdgpu_ttm_tt_create(struct ttm_buffer_object *bo, return NULL; } gtt->ttm.ttm.func = &amdgpu_backend_func; + gtt->gobj = &ttm_to_amdgpu_bo(bo)->gem_base;
/* allocate space for the uninitialized page entries */ if (ttm_sg_tt_init(>t->ttm, bo, page_flags)) { @@ -1199,7 +1202,6 @@ static int amdgpu_ttm_tt_populate(struct ttm_tt *ttm, { struct amdgpu_device *adev = amdgpu_ttm_adev(ttm->bdev); struct amdgpu_ttm_tt *gtt = (void *)ttm; - bool slave = !!(ttm->page_flags & TTM_PAGE_FLAG_SG);
/* user pages are bound by amdgpu_ttm_tt_pin_userptr() */ if (gtt && gtt->userptr) { @@ -1212,7 +1214,20 @@ static int amdgpu_ttm_tt_populate(struct ttm_tt *ttm, return 0; }
- if (slave && ttm->sg) { + if (ttm->page_flags & TTM_PAGE_FLAG_SG) { + if (!ttm->sg) { + struct dma_buf_attachment *attach; + struct sg_table *sgt; + + attach = gtt->gobj->import_attach; + sgt = dma_buf_map_attachment_locked(attach, + DMA_BIDIRECTIONAL); + if (IS_ERR(sgt)) + return PTR_ERR(sgt); + + ttm->sg = sgt; + } + drm_prime_sg_to_page_addr_arrays(ttm->sg, ttm->pages, gtt->ttm.dma_address, ttm->num_pages); @@ -1239,9 +1254,8 @@ static int amdgpu_ttm_tt_populate(struct ttm_tt *ttm, */ static void amdgpu_ttm_tt_unpopulate(struct ttm_tt *ttm) { - struct amdgpu_device *adev; struct amdgpu_ttm_tt *gtt = (void *)ttm; - bool slave = !!(ttm->page_flags & TTM_PAGE_FLAG_SG); + struct amdgpu_device *adev;
if (gtt && gtt->userptr) { amdgpu_ttm_tt_set_user_pages(ttm, NULL); @@ -1250,7 +1264,17 @@ static void amdgpu_ttm_tt_unpopulate(struct ttm_tt *ttm) return; }
- if (slave) + if (ttm->sg && gtt->gobj->import_attach) { + struct dma_buf_attachment *attach; + + attach = gtt->gobj->import_attach; + dma_buf_unmap_attachment_locked(attach, ttm->sg, + DMA_BIDIRECTIONAL); + ttm->sg = NULL; + return; + } + + if (ttm->page_flags & TTM_PAGE_FLAG_SG) return;
adev = amdgpu_ttm_adev(ttm->bdev);
Pin and unpin the DMA-buf exported BOs only on demand and invalidate all DMA-buf mappings when the underlying BO moves.
Signed-off-by: Christian König christian.koenig@amd.com --- drivers/gpu/drm/amd/amdgpu/amdgpu_object.c | 5 ++ drivers/gpu/drm/amd/amdgpu/amdgpu_prime.c | 73 ++++++++++++++++++++-- 2 files changed, 72 insertions(+), 6 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c index ec9e45004bff..fdb98eb562db 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c @@ -31,6 +31,7 @@ */ #include <linux/list.h> #include <linux/slab.h> +#include <linux/dma-buf.h> #include <drm/drmP.h> #include <drm/amdgpu_drm.h> #include <drm/drm_cache.h> @@ -1192,6 +1193,10 @@ void amdgpu_bo_move_notify(struct ttm_buffer_object *bo,
amdgpu_bo_kunmap(abo);
+ if (abo->gem_base.dma_buf && !abo->gem_base.import_attach && + bo->mem.mem_type != TTM_PL_SYSTEM) + dma_buf_invalidate_mappings(abo->gem_base.dma_buf); + /* remember the eviction */ if (evict) atomic64_inc(&adev->num_evictions); diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_prime.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_prime.c index 56e2a606b9a1..40cd89271b20 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_prime.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_prime.c @@ -209,6 +209,61 @@ __reservation_object_make_exclusive(struct reservation_object *obj) return -ENOMEM; }
+/** + * amdgpu_gem_pin_dma_buf - &dma_buf_ops.pin_dma_buf implementation + * + * @dma_buf: DMA-buf to pin in memory + * + * Pin the BO which is backing the DMA-buf so that it can't move any more. + */ +static int amdgpu_gem_pin_dma_buf(struct dma_buf *dma_buf) +{ + struct drm_gem_object *obj = dma_buf->priv; + struct amdgpu_bo *bo = gem_to_amdgpu_bo(obj); + + /* pin buffer into GTT */ + return amdgpu_bo_pin(bo, AMDGPU_GEM_DOMAIN_GTT); +} + +/** + * amdgpu_gem_unpin_dma_buf - &dma_buf_ops.unpin_dma_buf implementation + * + * @dma_buf: DMA-buf to unpin + * + * Unpin a previously pinned BO to make it movable again. + */ +static void amdgpu_gem_unpin_dma_buf(struct dma_buf *dma_buf) +{ + struct drm_gem_object *obj = dma_buf->priv; + struct amdgpu_bo *bo = gem_to_amdgpu_bo(obj); + + amdgpu_bo_unpin(bo); +} + +/** + * amdgpu_gem_dma_buf_attach - &dma_buf_ops.attach implementation + * + * @dma_buf: DMA-buf we attach to + * @attach: DMA-buf attachment + * + * Returns: + * Always zero for success. + */ +static int amdgpu_gem_dma_buf_attach(struct dma_buf *dma_buf, + struct dma_buf_attachment *attach) +{ + struct drm_gem_object *obj = dma_buf->priv; + struct amdgpu_bo *bo = gem_to_amdgpu_bo(obj); + + /* Make sure the buffer is pinned when userspace didn't set GTT as + * preferred domain. This avoid ping/pong situations with scan out BOs. + */ + if (!(bo->preferred_domains & AMDGPU_GEM_DOMAIN_GTT)) + attach->invalidate = NULL; + + return 0; +} + /** * amdgpu_gem_map_dma_buf - &dma_buf_ops.map_dma_buf implementation * @attach: DMA-buf attachment @@ -247,10 +302,15 @@ amdgpu_gem_map_dma_buf(struct dma_buf_attachment *attach, return ERR_PTR(r); }
- /* pin buffer into GTT */ - r = amdgpu_bo_pin(bo, AMDGPU_GEM_DOMAIN_GTT); - if (r) - return ERR_PTR(r); + if (attach->invalidate) { + /* move buffer into GTT */ + struct ttm_operation_ctx ctx = { false, false }; + + amdgpu_bo_placement_from_domain(bo, AMDGPU_GEM_DOMAIN_GTT); + r = ttm_bo_validate(&bo->tbo, &bo->placement, &ctx); + if (r) + return ERR_PTR(r); + }
sgt = drm_prime_pages_to_sg(bo->tbo.ttm->pages, bo->tbo.num_pages); if (IS_ERR(sgt)) @@ -289,8 +349,6 @@ static void amdgpu_gem_unmap_dma_buf(struct dma_buf_attachment *attach, struct amdgpu_bo *bo = gem_to_amdgpu_bo(obj); struct amdgpu_device *adev = amdgpu_ttm_adev(bo->tbo.bdev);
- amdgpu_bo_unpin(bo); - if (attach->dev->driver != adev->dev->driver && bo->prime_shared_count) bo->prime_shared_count--;
@@ -357,6 +415,9 @@ static int amdgpu_gem_begin_cpu_access(struct dma_buf *dma_buf,
const struct dma_buf_ops amdgpu_dmabuf_ops = { .dynamic_sgt_mapping = true, + .attach = amdgpu_gem_dma_buf_attach, + .pin = amdgpu_gem_pin_dma_buf, + .unpin = amdgpu_gem_unpin_dma_buf, .map_dma_buf = amdgpu_gem_map_dma_buf, .unmap_dma_buf = amdgpu_gem_unmap_dma_buf, .release = drm_gem_dmabuf_release,
Allow for invalidation of imported DMA-bufs.
v2: add dma_buf_pin/dma_buf_unpin support
Signed-off-by: Christian König christian.koenig@amd.com --- drivers/gpu/drm/amd/amdgpu/amdgpu_object.c | 6 ++++++ drivers/gpu/drm/amd/amdgpu/amdgpu_prime.c | 24 ++++++++++++++++++++++ 2 files changed, 30 insertions(+)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
index fdb98eb562db..c3a5a115bfc6 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
@@ -848,6 +848,9 @@ int amdgpu_bo_pin_restricted(struct amdgpu_bo *bo, u32 domain,
 		return 0;
 	}
+	if (bo->gem_base.import_attach)
+		dma_buf_pin(bo->gem_base.import_attach->dmabuf);
+
 	bo->flags |= AMDGPU_GEM_CREATE_VRAM_CONTIGUOUS;
 	/* force to pin into visible video ram */
 	if (!(bo->flags & AMDGPU_GEM_CREATE_NO_CPU_ACCESS))
@@ -931,6 +934,9 @@ int amdgpu_bo_unpin(struct amdgpu_bo *bo)
amdgpu_bo_subtract_pin_size(bo);
+	if (bo->gem_base.import_attach)
+		dma_buf_unpin(bo->gem_base.import_attach->dmabuf);
+
 	for (i = 0; i < bo->placement.num_placement; i++) {
 		bo->placements[i].lpfn = 0;
 		bo->placements[i].flags &= ~TTM_PL_FLAG_NO_EVICT;

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_prime.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_prime.c
index 40cd89271b20..30634396719b 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_prime.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_prime.c
@@ -459,6 +459,28 @@ struct dma_buf *amdgpu_gem_prime_export(struct drm_device *dev,
 	return buf;
 }
+/**
+ * amdgpu_gem_prime_invalidate_mappings - &dma_buf_attachment.invalidate implementation
+ *
+ * @attach: the DMA-buf attachment
+ *
+ * Invalidate the DMA-buf attachment, making sure that we re-create the
+ * mapping before the next use.
+ */
+static void
+amdgpu_gem_prime_invalidate_mappings(struct dma_buf_attachment *attach)
+{
+	struct ttm_operation_ctx ctx = { false, false };
+	struct drm_gem_object *obj = attach->importer_priv;
+	struct amdgpu_bo *bo = gem_to_amdgpu_bo(obj);
+	struct ttm_placement placement = {};
+	int r;
+
+	r = ttm_bo_validate(&bo->tbo, &placement, &ctx);
+	if (r)
+		DRM_ERROR("Failed to invalidate DMA-buf import (%d)\n", r);
+}
+
 /**
  * amdgpu_gem_prime_import - &drm_driver.gem_prime_import implementation
  * @dev: DRM device
@@ -476,6 +498,7 @@ struct drm_gem_object *amdgpu_gem_prime_import(struct drm_device *dev,
 	struct dma_buf_attach_info attach_info = {
 		.dev = dev->dev,
 		.dmabuf = dma_buf,
+		.invalidate = amdgpu_gem_prime_invalidate_mappings
 	};
 	struct dma_buf_attachment *attach;
 	struct drm_gem_object *obj;
@@ -496,6 +519,7 @@ struct drm_gem_object *amdgpu_gem_prime_import(struct drm_device *dev,
 	if (IS_ERR(obj))
 		return obj;
+	attach_info.importer_priv = obj;
 	attach = dma_buf_attach(&attach_info);
 	if (IS_ERR(attach)) {
 		drm_gem_object_put(obj);
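Stepping back from the amdgpu specifics, the importer-side contract of this series can be sketched in a few lines. Everything prefixed my_ below is hypothetical; only dma_buf_attach_info, dma_buf_attach(), importer_priv and the invalidate callback come from the patches above:

#include <linux/dma-buf.h>

/* Hypothetical importer bookkeeping. */
struct my_bo {
	struct dma_buf_attachment *attach;
	bool sgt_stale;
};

/* Called by the exporter with the reservation lock held whenever the
 * backing storage moves: mark our cached mapping stale so the next
 * use re-creates it via dma_buf_map_attachment(). */
static void my_invalidate(struct dma_buf_attachment *attach)
{
	struct my_bo *bo = attach->importer_priv;

	bo->sgt_stale = true;
}

static int my_import(struct device *dev, struct dma_buf *buf,
		     struct my_bo *bo)
{
	struct dma_buf_attach_info attach_info = {
		.dev = dev,
		.dmabuf = buf,
		.importer_priv = bo,
		.invalidate = my_invalidate,
	};

	bo->attach = dma_buf_attach(&attach_info);
	if (IS_ERR(bo->attach))
		return PTR_ERR(bo->attach);
	return 0;
}

This mirrors what amdgpu_gem_prime_import() does above, minus the TTM-specific re-validation.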
On Tue, Apr 16, 2019 at 08:38:29PM +0200, Christian König wrote:
Hi everybody,
The core idea in this patch set is that DMA-buf importers can now provide an optional invalidate callback. Using this callback and the reservation object, exporters can now avoid pinning DMA-buf memory for a long time while sharing it between devices.
I've already sent out an older version roughly a year ago, but didn't have time to further look into cleaning this up.
The last time, a major problem was that we would have had to fix up all drivers implementing DMA-buf at once.
Now I avoid this by allowing mappings to be cached in the DMA-buf attachment, so drivers can optionally move over to the new interface one by one.
This is also a prerequisite to my patchset enabling sharing of device memory with DMA-buf.
Ok, with the discussions and thinking I think this design is solid and should work out. There's a bunch of API and documentation polishing still to do, to make sure we have really clear semantics and as little room as possible for misunderstanding - refactoring a mess in dma-buf is a lot more tricky than just ttm, since there are a lot more users. -Daniel
On 18.04.19 at 11:13, Daniel Vetter wrote:
On Tue, Apr 16, 2019 at 08:38:29PM +0200, Christian König wrote:
Hi everybody,
The core idea in this patch set is that DMA-buf importers can now provide an optional invalidate callback. Using this callback and the reservation object, exporters can now avoid pinning DMA-buf memory for a long time while sharing it between devices.
I've already sent out an older version roughly a year ago, but didn't have time to further look into cleaning this up.
The last time, a major problem was that we would have had to fix up all drivers implementing DMA-buf at once.
Now I avoid this by allowing mappings to be cached in the DMA-buf attachment, so drivers can optionally move over to the new interface one by one.
This is also a prerequisite to my patchset enabling sharing of device memory with DMA-buf.
Ok, with the discussions and thinking I think this design is solid and should work out. There's a bunch of API and documentation polishing still to do, to make sure we have really clear semantics and as little room as possible for misunderstanding - refactoring a mess in dma-buf is a lot more tricky than just ttm, since there are a lot more users.
Yeah, I'm pretty aware of that after changing all the users of the map API to use a structure for the parameters.
Well at least it's not UAPI :)
Christian
-Daniel
On Tue, 16 Apr 2019, Christian König wrote:
To allow a smooth transition from pinning buffer objects to dynamic invalidation we first start to cache the sg_table for an attachment unless the driver explicitly says to not do so.
drivers/dma-buf/dma-buf.c | 24 ++++++++++++++++++++++++ include/linux/dma-buf.h | 11 +++++++++++ 2 files changed, 35 insertions(+)
diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c index 7c858020d14b..65161a82d4d5 100644 --- a/drivers/dma-buf/dma-buf.c +++ b/drivers/dma-buf/dma-buf.c @@ -573,6 +573,20 @@ struct dma_buf_attachment *dma_buf_attach(struct dma_buf *dmabuf, list_add(&attach->node, &dmabuf->attachments); mutex_unlock(&dmabuf->lock);
+	if (!dmabuf->ops->dynamic_sgt_mapping) {
+		struct sg_table *sgt;
+
+		sgt = dmabuf->ops->map_dma_buf(attach, DMA_BIDIRECTIONAL);
+		if (!sgt)
+			sgt = ERR_PTR(-ENOMEM);
+		if (IS_ERR(sgt)) {
+			dma_buf_detach(dmabuf, attach);
+			return ERR_CAST(sgt);
+		}
+		attach->sgt = sgt;
+	}
+
 	return attach;
err_attach: @@ -595,6 +609,10 @@ void dma_buf_detach(struct dma_buf *dmabuf, struct dma_buf_attachment *attach) if (WARN_ON(!dmabuf || !attach)) return;
+	if (attach->sgt)
+		dmabuf->ops->unmap_dma_buf(attach, attach->sgt,
+					   DMA_BIDIRECTIONAL);
+
 	mutex_lock(&dmabuf->lock); list_del(&attach->node); if (dmabuf->ops->detach)
@@ -630,6 +648,9 @@ struct sg_table *dma_buf_map_attachment(struct dma_buf_attachment *attach, if (WARN_ON(!attach || !attach->dmabuf)) return ERR_PTR(-EINVAL);
+	if (attach->sgt)
+		return attach->sgt;
I am concerned by this change making caching of the sg_table the default behavior, as it means the exporter's map_dma_buf/unmap_dma_buf calls are no longer made from dma_buf_map_attachment/dma_buf_unmap_attachment. This seems concerning to me as it appears to ignore the cache maintenance aspect of the map_dma_buf/unmap_dma_buf calls. For example, won't this potentially cause issues for clients of ION?
If we had the following:
- #1 dma_buf_attach coherent_device
- #2 dma_buf_attach non_coherent_device
- #3 dma_buf_map_attachment non_coherent_device
- #4 non_coherent_device writes to buffer
- #5 dma_buf_unmap_attachment non_coherent_device
- #6 dma_buf_map_attachment coherent_device
- #7 coherent_device reads buffer
- #8 dma_buf_unmap_attachment coherent_device
There wouldn't be any CMO at step #5 anymore (specifically no invalidate), so at step #7 the coherent_device could read a stale cache line.
Also, by default dma_buf_unmap_attachment no longer removes the mappings from the IOMMU, so dma_buf_unmap_attachment is no longer doing what I would expect, and clients lose the potential sandboxing benefits of removing the mappings. Shouldn't this caching behavior be something that clients opt into instead of being the default?
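To make the sequence concrete, here it is spelled out against the dma-buf API (using the classic dma_buf_attach(buf, dev) signature for brevity; the device pointers are placeholders and error handling is elided):

#include <linux/dma-buf.h>

static void stale_cache_example(struct dma_buf *buf,
				struct device *coherent_dev,
				struct device *non_coherent_dev)
{
	struct dma_buf_attachment *a_coh, *a_noncoh;
	struct sg_table *sgt;

	a_coh = dma_buf_attach(buf, coherent_dev);		/* #1 */
	a_noncoh = dma_buf_attach(buf, non_coherent_dev);	/* #2 */

	sgt = dma_buf_map_attachment(a_noncoh, DMA_FROM_DEVICE);  /* #3 */
	/* #4: non_coherent_device DMA-writes into the buffer */
	dma_buf_unmap_attachment(a_noncoh, sgt, DMA_FROM_DEVICE); /* #5 */

	/* With the sg_table cached at attach time, #5 never reaches the
	 * exporter, so the cache invalidate and IOMMU teardown it used
	 * to perform here are skipped. */

	sgt = dma_buf_map_attachment(a_coh, DMA_TO_DEVICE);	/* #6 */
	/* #7: coherent_device DMA-reads, possibly hitting stale lines */
	dma_buf_unmap_attachment(a_coh, sgt, DMA_TO_DEVICE);	/* #8 */
}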
Liam
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux Foundation Collaborative Project
Hi Christian,
On Sat, 27 Apr 2019 at 05:31, Liam Mark lmark@codeaurora.org wrote:
On Tue, 16 Apr 2019, Christian König wrote:
To allow a smooth transition from pinning buffer objects to dynamic invalidation we first start to cache the sg_table for an attachment unless the driver explicitly says to not do so.
drivers/dma-buf/dma-buf.c | 24 ++++++++++++++++++++++++ include/linux/dma-buf.h | 11 +++++++++++ 2 files changed, 35 insertions(+)
diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c index 7c858020d14b..65161a82d4d5 100644 --- a/drivers/dma-buf/dma-buf.c +++ b/drivers/dma-buf/dma-buf.c @@ -573,6 +573,20 @@ struct dma_buf_attachment *dma_buf_attach(struct dma_buf *dmabuf, list_add(&attach->node, &dmabuf->attachments);
mutex_unlock(&dmabuf->lock);
+	if (!dmabuf->ops->dynamic_sgt_mapping) {
+		struct sg_table *sgt;
+
+		sgt = dmabuf->ops->map_dma_buf(attach, DMA_BIDIRECTIONAL);
+		if (!sgt)
+			sgt = ERR_PTR(-ENOMEM);
+		if (IS_ERR(sgt)) {
+			dma_buf_detach(dmabuf, attach);
+			return ERR_CAST(sgt);
+		}
+		attach->sgt = sgt;
+	}
+
 	return attach;
err_attach: @@ -595,6 +609,10 @@ void dma_buf_detach(struct dma_buf *dmabuf, struct dma_buf_attachment *attach) if (WARN_ON(!dmabuf || !attach)) return;
+	if (attach->sgt)
+		dmabuf->ops->unmap_dma_buf(attach, attach->sgt,
+					   DMA_BIDIRECTIONAL);
+
mutex_lock(&dmabuf->lock); list_del(&attach->node); if (dmabuf->ops->detach)
@@ -630,6 +648,9 @@ struct sg_table *dma_buf_map_attachment(struct dma_buf_attachment *attach, if (WARN_ON(!attach || !attach->dmabuf)) return ERR_PTR(-EINVAL);
+	if (attach->sgt)
+		return attach->sgt;
I am concerned by this change making caching of the sg_table the default behavior, as it means the exporter's map_dma_buf/unmap_dma_buf calls are no longer made from dma_buf_map_attachment/dma_buf_unmap_attachment.
Probably this concern from Liam got lost between versions of your patches; could we please request a reply to these points here?
This seems concerning to me as it appears to ignore the cache maintenance aspect of the map_dma_buf/unmap_dma_buf calls. For example, won't this potentially cause issues for clients of ION?
If we had the following:
- #1 dma_buf_attach coherent_device
- #2 dma_buf_attach non_coherent_device
- #3 dma_buf_map_attachment non_coherent_device
- #4 non_coherent_device writes to buffer
- #5 dma_buf_unmap_attachment non_coherent_device
- #6 dma_buf_map_attachment coherent_device
- #7 coherent_device reads buffer
- #8 dma_buf_unmap_attachment coherent_device
There wouldn't be any CMO at step #5 anymore (specifically no invalidate), so at step #7 the coherent_device could read a stale cache line.
Also, by default dma_buf_unmap_attachment no longer removes the mappings from the IOMMU, so dma_buf_unmap_attachment is no longer doing what I would expect, and clients lose the potential sandboxing benefits of removing the mappings. Shouldn't this caching behavior be something that clients opt into instead of being the default?
Liam
Best, Sumit.
On 22.05.19 at 18:17, Sumit Semwal wrote:
Hi Christian,
On Sat, 27 Apr 2019 at 05:31, Liam Mark lmark@codeaurora.org wrote:
On Tue, 16 Apr 2019, Christian König wrote:
To allow a smooth transition from pinning buffer objects to dynamic invalidation we first start to cache the sg_table for an attachment unless the driver explicitly says to not do so.
drivers/dma-buf/dma-buf.c | 24 ++++++++++++++++++++++++ include/linux/dma-buf.h | 11 +++++++++++ 2 files changed, 35 insertions(+)
diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c index 7c858020d14b..65161a82d4d5 100644 --- a/drivers/dma-buf/dma-buf.c +++ b/drivers/dma-buf/dma-buf.c @@ -573,6 +573,20 @@ struct dma_buf_attachment *dma_buf_attach(struct dma_buf *dmabuf, list_add(&attach->node, &dmabuf->attachments);
mutex_unlock(&dmabuf->lock);
+	if (!dmabuf->ops->dynamic_sgt_mapping) {
+		struct sg_table *sgt;
+
+		sgt = dmabuf->ops->map_dma_buf(attach, DMA_BIDIRECTIONAL);
+		if (!sgt)
+			sgt = ERR_PTR(-ENOMEM);
+		if (IS_ERR(sgt)) {
+			dma_buf_detach(dmabuf, attach);
+			return ERR_CAST(sgt);
+		}
+		attach->sgt = sgt;
+	}
+
 	return attach;
err_attach:
@@ -595,6 +609,10 @@ void dma_buf_detach(struct dma_buf *dmabuf, struct dma_buf_attachment *attach) if (WARN_ON(!dmabuf || !attach)) return;
+	if (attach->sgt)
+		dmabuf->ops->unmap_dma_buf(attach, attach->sgt,
+					   DMA_BIDIRECTIONAL);
+
mutex_lock(&dmabuf->lock); list_del(&attach->node); if (dmabuf->ops->detach)
@@ -630,6 +648,9 @@ struct sg_table *dma_buf_map_attachment(struct dma_buf_attachment *attach, if (WARN_ON(!attach || !attach->dmabuf)) return ERR_PTR(-EINVAL);
+	if (attach->sgt)
+		return attach->sgt;
I am concerned by this change making caching of the sg_table the default behavior, as it means the exporter's map_dma_buf/unmap_dma_buf calls are no longer made from dma_buf_map_attachment/dma_buf_unmap_attachment.
Probably this concern from Liam got lost between versions of your patches; could we please request a reply to these points here?
Sorry, I indeed never got this mail, but this is actually not an issue because Daniel had similar concerns and we didn't make this the default in the final version.
This seems concerning to me as it appears to ignore the cache maintenance aspect of the map_dma_buf/unmap_dma_buf calls. For example, won't this potentially cause issues for clients of ION?
If we had the following:
- #1 dma_buf_attach coherent_device
- #2 dma_buf_attach non_coherent_device
- #3 dma_buf_map_attachment non_coherent_device
- #4 non_coherent_device writes to buffer
- #5 dma_buf_unmap_attachment non_coherent_device
- #6 dma_buf_map_attachment coherent_device
- #7 coherent_device reads buffer
- #8 dma_buf_unmap_attachment coherent_device
There wouldn't be any CMO at step #5 anymore (specifically no invalidate), so at step #7 the coherent_device could read a stale cache line.
Also, by default dma_buf_unmap_attachment no longer removes the mappings from the IOMMU, so dma_buf_unmap_attachment is no longer doing what I would expect, and clients lose the potential sandboxing benefits of removing the mappings. Shouldn't this caching behavior be something that clients opt into instead of being the default?
Well, it seems you are making incorrect assumptions about the cache maintenance of DMA-buf here.
At least for all DRM devices I'm aware of, mapping/unmapping an attachment does *NOT* have any cache maintenance implications.
E.g. the use case you describe above would certainly fail with amdgpu, radeon, nouveau and i915 because mapping a DMA-buf doesn't stop the exporter from reading/writing to that buffer (just the opposite actually).
All of them assume perfectly coherent access to the underlying memory. As far as I know, there are no documented cache maintenance requirements for DMA-buf.
The IOMMU concern, on the other hand, is certainly valid, and I perfectly agree that keeping the mapping time as short as possible is desirable.
Regards, Christian.
Liam
Best, Sumit.
On Wed, May 22, 2019 at 7:28 PM Christian König ckoenig.leichtzumerken@gmail.com wrote:
On 22.05.19 at 18:17, Sumit Semwal wrote:
Hi Christian,
On Sat, 27 Apr 2019 at 05:31, Liam Mark lmark@codeaurora.org wrote:
On Tue, 16 Apr 2019, Christian König wrote:
To allow a smooth transition from pinning buffer objects to dynamic invalidation we first start to cache the sg_table for an attachment unless the driver explicitly says to not do so.
drivers/dma-buf/dma-buf.c | 24 ++++++++++++++++++++++++ include/linux/dma-buf.h | 11 +++++++++++ 2 files changed, 35 insertions(+)
diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c index 7c858020d14b..65161a82d4d5 100644 --- a/drivers/dma-buf/dma-buf.c +++ b/drivers/dma-buf/dma-buf.c @@ -573,6 +573,20 @@ struct dma_buf_attachment *dma_buf_attach(struct dma_buf *dmabuf, list_add(&attach->node, &dmabuf->attachments);
mutex_unlock(&dmabuf->lock);
+	if (!dmabuf->ops->dynamic_sgt_mapping) {
+		struct sg_table *sgt;
+
+		sgt = dmabuf->ops->map_dma_buf(attach, DMA_BIDIRECTIONAL);
+		if (!sgt)
+			sgt = ERR_PTR(-ENOMEM);
+		if (IS_ERR(sgt)) {
+			dma_buf_detach(dmabuf, attach);
+			return ERR_CAST(sgt);
+		}
+		attach->sgt = sgt;
+	}
+
 	return attach;
err_attach:
@@ -595,6 +609,10 @@ void dma_buf_detach(struct dma_buf *dmabuf, struct dma_buf_attachment *attach) if (WARN_ON(!dmabuf || !attach)) return;
+	if (attach->sgt)
+		dmabuf->ops->unmap_dma_buf(attach, attach->sgt,
+					   DMA_BIDIRECTIONAL);
+
mutex_lock(&dmabuf->lock); list_del(&attach->node); if (dmabuf->ops->detach)
@@ -630,6 +648,9 @@ struct sg_table *dma_buf_map_attachment(struct dma_buf_attachment *attach, if (WARN_ON(!attach || !attach->dmabuf)) return ERR_PTR(-EINVAL);
+	if (attach->sgt)
+		return attach->sgt;
I am concerned by this change making caching of the sg_table the default behavior, as it means the exporter's map_dma_buf/unmap_dma_buf calls are no longer made from dma_buf_map_attachment/dma_buf_unmap_attachment.
Probably this concern from Liam got lost between versions of your patches; could we please request a reply to these points here?
Sorry, I indeed never got this mail, but this is actually not an issue because Daniel had similar concerns and we didn't make this the default in the final version.
This seems concerning to me as it appears to ignore the cache maintenance aspect of the map_dma_buf/unmap_dma_buf calls. For example, won't this potentially cause issues for clients of ION?
If we had the following:
- #1 dma_buf_attach coherent_device
- #2 dma_buf_attach non_coherent_device
- #3 dma_buf_map_attachment non_coherent_device
- #4 non_coherent_device writes to buffer
- #5 dma_buf_unmap_attachment non_coherent_device
- #6 dma_buf_map_attachment coherent_device
- #7 coherent_device reads buffer
- #8 dma_buf_unmap_attachment coherent_device
There wouldn't be any CMO at step #5 anymore (specifically no invalidate), so at step #7 the coherent_device could read a stale cache line.
Also, by default dma_buf_unmap_attachment no longer removes the mappings from the IOMMU, so dma_buf_unmap_attachment is no longer doing what I would expect, and clients lose the potential sandboxing benefits of removing the mappings. Shouldn't this caching behavior be something that clients opt into instead of being the default?
Well, it seems you are making incorrect assumptions about the cache maintenance of DMA-buf here.
At least for all DRM devices I'm aware of, mapping/unmapping an attachment does *NOT* have any cache maintenance implications.
E.g. the use case you describe above would certainly fail with amdgpu, radeon, nouveau and i915 because mapping a DMA-buf doesn't stop the exporter from reading/writing to that buffer (just the opposite actually).
All of them assume perfectly coherent access to the underlying memory. As far as I know, there are no documented cache maintenance requirements for DMA-buf.
I think it is documented. It's just that on x86 we ignore that, because the dma-api pretends there's never a need for cache flushing on x86 and that everything snoops the cpu caches. Which hasn't been true since AGP happened, over 20 years ago. The actual rules for x86 dma-buf are very much ad-hoc (and we occasionally reapply some duct-tape when cacheline noise shows up somewhere).
I've just filed this away as another instance of the dma-api not fitting gpus, and I think given recent discussions that won't improve anytime soon. So we're stuck with essentially undefined dma-buf behaviour. -Daniel
The IOMMU concern, on the other hand, is certainly valid, and I perfectly agree that keeping the mapping time as short as possible is desirable.
Regards, Christian.
Liam
Best, Sumit.
On 22.05.19 at 20:30, Daniel Vetter wrote:
[SNIP]
Well, it seems you are making incorrect assumptions about the cache maintenance of DMA-buf here.
At least for all DRM devices I'm aware of, mapping/unmapping an attachment does *NOT* have any cache maintenance implications.
E.g. the use case you describe above would certainly fail with amdgpu, radeon, nouveau and i915 because mapping a DMA-buf doesn't stop the exporter from reading/writing to that buffer (just the opposite actually).
All of them assume perfectly coherent access to the underlying memory. As far as I know, there are no documented cache maintenance requirements for DMA-buf.
I think it is documented. It's just that on x86 we ignore that, because the dma-api pretends there's never a need for cache flushing on x86 and that everything snoops the cpu caches. Which hasn't been true since AGP happened, over 20 years ago. The actual rules for x86 dma-buf are very much ad-hoc (and we occasionally reapply some duct-tape when cacheline noise shows up somewhere).
Well, I strongly disagree on this. Even on x86, at least AMD GPUs are not fully coherent.
For example, you have the texture cache and the HDP read/write cache. So if both amdgpu and i915 were to write to the same buffer at the same time, we would get corrupted data as well.
The key point is that it is NOT DMA-buf in its map/unmap calls that defines the coherency, but rather the reservation object and its attached dma_fence instances.
So, for example, as long as an exclusive reservation object fence is not yet signaled, I can't assume that all caches are flushed, and so I can't start my own operation/access on the data in question.
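Spelled out, that rule looks roughly like the following sketch with the reservation_object/dma_fence API of this era (illustrative only, not code from the series):

#include <linux/dma-fence.h>
#include <linux/reservation.h>

/* Wait for the exclusive fence before touching the shared buffer;
 * only once it has signaled may we assume prior writers are done and
 * their device caches flushed. Returns 0 or a negative error code. */
static long wait_for_exclusive(struct reservation_object *resv)
{
	struct dma_fence *excl;
	long ret = 0;

	excl = reservation_object_get_excl_rcu(resv);
	if (excl) {
		ret = dma_fence_wait(excl, true /* interruptible */);
		dma_fence_put(excl);
	}
	return ret;
}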
Regards, Christian.
I've just filed this away as another instance of the dma-api not fitting gpus, and I think given recent discussions that won't improve anytime soon. So we're stuck with essentially undefined dma-buf behaviour. -Daniel
On Thu, May 23, 2019 at 1:21 PM Koenig, Christian Christian.Koenig@amd.com wrote:
On 22.05.19 at 20:30, Daniel Vetter wrote:
[SNIP]
Well, it seems you are making incorrect assumptions about the cache maintenance of DMA-buf here.
At least for all DRM devices I'm aware of, mapping/unmapping an attachment does *NOT* have any cache maintenance implications.
E.g. the use case you describe above would certainly fail with amdgpu, radeon, nouveau and i915 because mapping a DMA-buf doesn't stop the exporter from reading/writing to that buffer (just the opposite actually).
All of them assume perfectly coherent access to the underlying memory. As far as I know, there are no documented cache maintenance requirements for DMA-buf.
I think it is documented. It's just that on x86 we ignore that, because the dma-api pretends there's never a need for cache flushing on x86 and that everything snoops the cpu caches. Which hasn't been true since AGP happened, over 20 years ago. The actual rules for x86 dma-buf are very much ad-hoc (and we occasionally reapply some duct-tape when cacheline noise shows up somewhere).
Well, I strongly disagree on this. Even on x86, at least AMD GPUs are not fully coherent.
For example, you have the texture cache and the HDP read/write cache. So if both amdgpu and i915 were to write to the same buffer at the same time, we would get corrupted data as well.
The key point is that it is NOT DMA-buf in its map/unmap calls that defines the coherency, but rather the reservation object and its attached dma_fence instances.
So, for example, as long as an exclusive reservation object fence is not yet signaled, I can't assume that all caches are flushed, and so I can't start my own operation/access on the data in question.
The dma-api doesn't flush device caches, ever. It might flush some iommu caches or some other bus cache somewhere in between. So it also won't ever make sure that multiple devices don't trample on one another. For that you need something else (like the reservation object, but I think that's not really followed much outside of drm).
The other bit is the coherent vs. non-coherent thing, which in dma-api land just talks about whether cpu/device access needs extra flushing or not. Now in practice that extra flushing is always only cpu side, i.e. whether cpu writes/reads go through the cpu cache, and whether device reads/writes snoop the cpu caches. That's (afaik at least, and in practice, not per the abstract spec) the _only_ thing the dma-api's cache maintenance does. For 0-copy that's all completely irrelevant, because as soon as you pick a mode where you need to do manual cache management you've screwed up; it's not really 0-copy anymore.
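For reference, the cpu-side flushing being discussed is the streaming dma-api's sync calls. A sketch of bracketing a cpu access window on a non-coherent sg mapping (dev and sgt are placeholders for whatever was handed to dma_map_sg()):

#include <linux/dma-mapping.h>
#include <linux/scatterlist.h>

static void cpu_access_window(struct device *dev, struct sg_table *sgt)
{
	/* cpu takes ownership: invalidate stale cpu cache lines */
	dma_sync_sg_for_cpu(dev, sgt->sgl, sgt->orig_nents,
			    DMA_BIDIRECTIONAL);

	/* ... cpu reads/writes the buffer ... */

	/* device takes ownership again: write back dirty cpu lines */
	dma_sync_sg_for_device(dev, sgt->sgl, sgt->orig_nents,
			       DMA_BIDIRECTIONAL);
}

Note the sync calls must get the same scatterlist and nents as the original dma_map_sg(), hence orig_nents.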
The other hilarious stuff is that on x86 we let userspace (at least with i915) do that cache management, so the kernel doesn't even have a clue. I think what we need in dma-buf (and dma-api people will scream about the "abstraction leak") is some notion of whether an importer should snoop or not (or whether that device always uses non-snooped or snooped transactions). But that would shred the illusion the dma-api tries to keep up that all that matters is whether a mapping is coherent from the cpu's pov or not, and that you can achieve coherence both with a cached cpu mapping + snooped transactions, or with wc on the cpu side and non-snooped transactions. Trying to add cache management (which some dma-buf exporters do indeed attempt) will be even worse.
Again, none of this is about preventing concurrent writes, or making sure device caches are flushed correctly around batches. -Daniel
On Thu, May 23, 2019 at 1:30 PM Daniel Vetter daniel@ffwll.ch wrote:
On Thu, May 23, 2019 at 1:21 PM Koenig, Christian Christian.Koenig@amd.com wrote:
On 22.05.19 at 20:30, Daniel Vetter wrote:
[SNIP]
Well, it seems you are making incorrect assumptions about the cache maintenance of DMA-buf here.
At least for all DRM devices I'm aware of, mapping/unmapping an attachment does *NOT* have any cache maintenance implications.
E.g. the use case you describe above would certainly fail with amdgpu, radeon, nouveau and i915 because mapping a DMA-buf doesn't stop the exporter from reading/writing to that buffer (just the opposite actually).
All of them assume perfectly coherent access to the underlying memory. As far as I know, there are no documented cache maintenance requirements for DMA-buf.
I think it is documented. It's just that on x86 we ignore that, because the dma-api pretends there's never a need for cache flushing on x86 and that everything snoops the cpu caches. Which hasn't been true since AGP happened, over 20 years ago. The actual rules for x86 dma-buf are very much ad-hoc (and we occasionally reapply some duct-tape when cacheline noise shows up somewhere).
Well, I strongly disagree on this. Even on x86, at least AMD GPUs are not fully coherent.
For example, you have the texture cache and the HDP read/write cache. So if both amdgpu and i915 were to write to the same buffer at the same time, we would get corrupted data as well.
The key point is that it is NOT DMA-buf in its map/unmap calls that defines the coherency, but rather the reservation object and its attached dma_fence instances.
So, for example, as long as an exclusive reservation object fence is not yet signaled, I can't assume that all caches are flushed, and so I can't start my own operation/access on the data in question.
The dma-api doesn't flush device caches, ever. It might flush some iommu caches or some other bus cache somewhere in between. So it also won't ever make sure that multiple devices don't trample on one another. For that you need something else (like the reservation object, but I think that's not really followed much outside of drm).
The other bit is the coherent vs. non-coherent thing, which in dma-api land just talks about whether cpu/device access needs extra flushing or not. Now in practice that extra flushing is always only cpu side, i.e. whether cpu writes/reads go through the cpu cache, and whether device reads/writes snoop the cpu caches. That's (afaik at least, and in practice, not per the abstract spec) the _only_ thing the dma-api's cache maintenance does. For 0-copy that's all completely irrelevant, because as soon as you pick a mode where you need to do manual cache management you've screwed up; it's not really 0-copy anymore.
The other hilarious stuff is that on x86 we let userspace (at least with i915) do that cache management, so the kernel doesn't even have a clue. I think what we need in dma-buf (and dma-api people will scream about the "abstraction leak") is some notion of whether an importer should snoop or not (or whether that device always uses non-snooped or snooped transactions). But that would shred the illusion the dma-api tries to keep up that all that matters is whether a mapping is coherent from the cpu's pov or not, and that you can achieve coherence both with a cached cpu mapping + snooped transactions, or with wc on the cpu side and non-snooped transactions. Trying to add cache management (which some dma-buf exporters do indeed attempt) will be even worse.
Again, none of this is about preventing concurrent writes, or making sure device caches are flushed correctly around batches.
Btw, I just grepped for reservation_object, and no one outside of drivers/gpu is using it. So for device access synchronization everyone else is relying on userspace ordering requests correctly on its own. Iirc v4l/media is pondering adding dma-fence support, but that's not going anywhere.
Also, for correctness reservations aren't needed; we allow explicit-syncing userspace to manage dma-fences/drm_syncobj on its own, and it is allowed to get this wrong. -Daniel