Am 28.03.2018 um 20:57 schrieb Logan Gunthorpe:
>
> On 28/03/18 12:28 PM, Christian König wrote:
>> I'm just using amdgpu as a blueprint because I'm the co-maintainer of it
>> and know it mostly inside out.
> Ah, I see.
>
>> The resource addresses are translated using dma_map_resource(). As far
>> as I know that should be sufficient to offload all the architecture
>> specific stuff to the DMA subsystem.
> It's not. The dma_map infrastructure currently has no concept of
> peer-to-peer mappings and is designed for system memory only. No
> architecture I'm aware of will translate PCI CPU addresses into PCI Bus
> addresses which is necessary for any transfer that doesn't go through
> the root complex (though on arches like x86 the CPU and Bus address
> happen to be the same). There's a lot of people that would like to see
> this change but it's likely going to be a long road before it does.
Well, isn't that exactly what dma_map_resource() is good for? As far as
I can see, it makes sure the IOMMU is aware of the access route and
translates a CPU address into a PCI Bus address.
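Just to illustrate what I mean, a minimal sketch of how I picture the
exporter handing out its BAR to the importing device; the helper name and
the bar_phys/size parameters are purely illustrative, not from the patch set:

#include <linux/dma-mapping.h>

/* Minimal sketch: map a BAR range of the exporter for the importing device. */
static dma_addr_t map_peer_bar(struct device *importer,
			       phys_addr_t bar_phys, size_t size)
{
	dma_addr_t addr;

	addr = dma_map_resource(importer, bar_phys, size,
				DMA_BIDIRECTIONAL, 0);
	if (dma_mapping_error(importer, addr))
		return 0;	/* treat 0 as "no mapping" in this sketch */

	/* addr is the bus address the importer can use for its DMA engine */
	return addr;
}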
> Furthermore, one of the reasons our patch-set avoids going through the
> root complex at all is that IOMMU drivers will need to be made aware
> that it is operating on P2P memory and do arch-specific things
> accordingly. There will also need to be flags that indicate whether a
> given IOMMU driver supports this. None of this work is done or easy.
I'm using that with the AMD IOMMU driver and at least there it works
perfectly fine.
>> Yeah, but not for ours. See, if you want to do real peer 2 peer you need
>> to take both the operation as well as the direction into account.
> Not sure what you are saying here... I'm pretty sure we are doing "real"
> peer 2 peer...
>
>> For example when you can do writes between A and B that doesn't mean
>> that writes between B and A work. And reads are generally less likely to
>> work than writes. etc...
> If both devices are behind a switch then the PCI spec guarantees that A
> can both read and write B and vice versa.
Sorry to say that, but I know a whole bunch of PCI devices which
horribly ignore that.
For example, all AMD APUs fall under that category...
> Only once you involve root
> complexes do you have this problem. Ie. you have unknown support which
> may be no support, or partial support (stores but not loads); or
> sometimes bad performance; or a combination of both... and you need some
> way to figure out all this mess and that is hard. Whoever tries to
> implement a white list will have to sort all this out.
Yes, exactly, and unfortunately it looks like I'm the poor guy who needs
to do this :)
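As a rough sketch, such a whitelist could be as simple as matching the
topmost PCI device below the root bus against a static ID table; the
entries below are placeholders, not a real list:

#include <linux/pci.h>

static const struct pci_device_id p2p_root_whitelist[] = {
	/* Placeholder entries, a real list needs the actual root complex IDs. */
	{ PCI_DEVICE(PCI_VENDOR_ID_AMD, 0x1234) },
	{ PCI_DEVICE(PCI_VENDOR_ID_INTEL, 0x5678) },
	{ }
};

static bool p2p_root_whitelisted(struct pci_dev *pdev)
{
	struct pci_dev *up = pdev;

	/* walk up to the topmost device below the root bus */
	while (pci_upstream_bridge(up))
		up = pci_upstream_bridge(up);

	return pci_match_id(p2p_root_whitelist, up) != NULL;
}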
Regards,
Christian.
>
> Logan
Am 28.03.2018 um 18:25 schrieb Logan Gunthorpe:
>
> On 28/03/18 10:02 AM, Christian König wrote:
>> Yeah, that looks very similar to what I picked up from the older
>> patches, going to read up on that after my vacation.
> Yeah, I was just reading through your patchset and there are a lot of
> similarities. Though, I'm not sure what you're trying to accomplish as I
> could not find a cover letter and it seems to only enable one driver.
Yeah, it was the last day before my Easter vacation and I wanted to get it
out the door.
> Is it meant to enable DMA transactions only between two AMD GPUs?
Not really; DMA-buf is a general framework for sharing buffers between
device drivers.
It is widely used in the GFX stack on laptops with Intel+AMD,
Intel+NVIDIA or AMD+AMD graphics devices.
In addition to that, ARM uses it quite extensively for their GFX stacks
because they have separate rendering and display devices.
I'm just using amdgpu as a blueprint because I'm the co-maintainer of it
and know it mostly inside out.
> I also don't see where you've taken into account the PCI bus address. On
> some architectures this is not the same as the CPU physical address.
The resource addresses are translated using dma_map_resource(). As far
as I know that should be sufficient to offload all the architecture
specific stuff to the DMA subsystem.
>
>> Just in general why are you interested in the "distance" of the devices?
> We've taken a general approach where some drivers may provide p2p memory
> (ie. an NVMe card or an RDMA NIC) and other drivers make use of it (ie.
> the NVMe-of driver). The orchestrator driver needs to find the most
> applicable provider device for a transaction in a situation that may
> have multiple providers and multiple clients. So the most applicable
> provider is the one that's closest ("distance"-wise) to all the clients
> for the P2P transaction.
That seems to make sense.
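To make sure I understand it, a rough sketch of such a distance check could
just count the hops to the closest common upstream bridge; purely
illustrative, not the actual p2pdma code:

#include <linux/pci.h>

/* Illustrative only: hops between two devices through their closest common
 * upstream bridge, or -1 if the path would have to cross the root complex. */
static int p2p_upstream_distance(struct pci_dev *provider,
				 struct pci_dev *client)
{
	struct pci_dev *up_p, *up_c;
	int dist_p = 0, dist_c;

	for (up_p = provider; up_p; up_p = pci_upstream_bridge(up_p), dist_p++) {
		dist_c = 0;
		for (up_c = client; up_c; up_c = pci_upstream_bridge(up_c), dist_c++)
			if (up_p == up_c)
				return dist_p + dist_c;
	}

	return -1;
}

The orchestrator would then just sum that up over all clients and pick the
provider with the smallest total.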
>
>> And BTW: at least for writes, it is actually more common that Peer 2 Peer
>> transactions between different root complexes work than the other way
>> around.
> Maybe on x86 with hardware made in the last few years. But on PowerPC,
> ARM64, and likely a lot more the chance of support is *much* less. Also,
> hardware that only supports P2P stores is hardly full support and is
> insufficient for our needs.
Yeah, but not for ours. See, if you want to do real peer 2 peer you need
to take both the operation as well as the direction into account.
For example, when you can do writes between A and B that doesn't mean
that writes between B and A work. And reads are generally less likely to
work than writes, etc...
Since the use case I'm targeting is GFX or GFX+V4L (or GFX+NIC in
the future) I really need to handle all such combinations as well.
>
>> So I'm a bit torn between using a blacklist or a whitelist. A whitelist
>> is certainly the more conservative approach, but that could get a bit long.
> I think a whitelist approach is correct. Given old hardware and other
> architectures, a black list is going to be too long and too difficult to
> comprehensively populate.
Yeah, it would certainly be better if we had something in the root
complex capabilities. But you're right that a whitelist sounds like the less
painful way.
Regards,
Christian.
>
> Logan
Am 28.03.2018 um 17:47 schrieb Logan Gunthorpe:
>
> On 28/03/18 09:07 AM, Christian König wrote:
>> Am 28.03.2018 um 14:38 schrieb Christoph Hellwig:
>>> On Sun, Mar 25, 2018 at 12:59:54PM +0200, Christian König wrote:
>>>> From: "wdavis@nvidia.com" <wdavis@nvidia.com>
>>>>
>>>> Add an interface to find the first device which is upstream of both
>>>> devices.
>>> Please work with Logan and base this on top of the outstanding peer
>>> to peer patchset.
>> Can you point me to that? The last code I could find about that was from
>> 2015.
> The latest posted series is here:
>
> https://lkml.org/lkml/2018/3/12/830
>
> However, we've made some significant changes to the area that's similar
> to what you are doing. You can find the latest un-posted version here:
>
> https://github.com/sbates130272/linux-p2pmem/tree/pci-p2p-v4-pre2
>
> Specifically this function would be of interest to you:
>
> https://github.com/sbates130272/linux-p2pmem/blob/0e9468ae2a5a5198513dd1299…
>
> However, the difference between what we are doing is that we are
> interested in the distance through the common upstream device and you
> appear to be finding the actual common device.
Yeah, that looks very similar to what I picked up from the older
patches, going to read up on that after my vacation.
Just in general, why are you interested in the "distance" of the devices?
And BTW: at least for writes, it is actually more common that Peer 2 Peer
transactions between different root complexes work than the other way
around.
So I'm a bit torn between using a blacklist or a whitelist. A whitelist
is certainly the more conservative approach, but that could get a bit long.
Thanks,
Christian.
>
> Thanks,
>
> Logan
Hi everybody,
Since I got positive feedback from Daniel, I continued working on this approach.
A few issues are still open:
1. Daniel suggested that I make the invalidate_mappings callback a parameter of dma_buf_attach().
This approach unfortunately won't work because when the attachment is created the importer is not necessarily ready to handle invalidation events.
E.g. in the amdgpu example we first need to set up the imported GEM/TTM objects and install them in the attachment.
My solution is to introduce a separate function to grab the locks and set the callback; this function could then also be used to pin the buffer later on if that turns out to be necessary after all (see the sketch after this list).
2. With my example setup this currently results in a ping/pong situation because the exporter prefers a VRAM placement while the importer prefers a GTT placement.
This results in quite a performance drop, but can be fixed by a simple Mesa patch which allows shared BOs to be placed in both VRAM and GTT.
Question is what should we do in the meantime? Accept the performance drop or only allow unpinned sharing with new Mesa?
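A minimal sketch of what the helper mentioned in point 1 could look like; the
name and exact locking are hypothetical and not part of the patch below:

#include <linux/dma-buf.h>
#include <linux/reservation.h>

/* Hypothetical helper: install the invalidation callback once the importer
 * is actually ready to handle invalidation events. */
int dma_buf_set_invalidate_callback(struct dma_buf_attachment *attach,
			void (*cb)(struct dma_buf_attachment *attach))
{
	int ret;

	ret = reservation_object_lock(attach->dmabuf->resv, NULL);
	if (ret)
		return ret;

	attach->invalidate = cb;
	reservation_object_unlock(attach->dmabuf->resv);

	return 0;
}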
Please review and comment,
Christian.
Each importer can now provide an invalidate_mappings callback.
This allows the exporter to provide the mappings without the need to pin
the backing store.
v2: don't try to invalidate mappings when the callback is NULL,
lock the reservation obj while using the attachments,
add helper to set the callback
v3: move flag for invalidation support into the DMA-buf,
use new attach_info structure to set the callback
Signed-off-by: Christian König <christian.koenig@amd.com>
---
drivers/dma-buf/dma-buf.c | 43 +++++++++++++++++++++++++++++++++++++++++++
include/linux/dma-buf.h | 28 ++++++++++++++++++++++++++++
2 files changed, 71 insertions(+)
diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c
index d2e8ca0d9427..ffaa2f9a9c2c 100644
--- a/drivers/dma-buf/dma-buf.c
+++ b/drivers/dma-buf/dma-buf.c
@@ -566,6 +566,7 @@ struct dma_buf_attachment *dma_buf_attach(const struct dma_buf_attach_info *info
attach->dev = info->dev;
attach->dmabuf = dmabuf;
attach->priv = info->priv;
+ attach->invalidate = info->invalidate;
mutex_lock(&dmabuf->lock);
@@ -574,7 +575,9 @@ struct dma_buf_attachment *dma_buf_attach(const struct dma_buf_attach_info *info
if (ret)
goto err_attach;
}
+ reservation_object_lock(dmabuf->resv, NULL);
list_add(&attach->node, &dmabuf->attachments);
+ reservation_object_unlock(dmabuf->resv);
mutex_unlock(&dmabuf->lock);
return attach;
@@ -600,7 +603,9 @@ void dma_buf_detach(struct dma_buf *dmabuf, struct dma_buf_attachment *attach)
return;
mutex_lock(&dmabuf->lock);
+ reservation_object_lock(dmabuf->resv, NULL);
list_del(&attach->node);
+ reservation_object_unlock(dmabuf->resv);
if (dmabuf->ops->detach)
dmabuf->ops->detach(dmabuf, attach);
@@ -634,10 +639,23 @@ struct sg_table *dma_buf_map_attachment(struct dma_buf_attachment *attach,
if (WARN_ON(!attach || !attach->dmabuf))
return ERR_PTR(-EINVAL);
+ /*
+ * Mapping a DMA-buf can trigger its invalidation; prevent sending this
+ * event to the caller by temporarily removing this attachment from the
+ * list.
+ */
+ if (attach->invalidate) {
+ reservation_object_assert_held(attach->dmabuf->resv);
+ list_del(&attach->node);
+ }
+
sg_table = attach->dmabuf->ops->map_dma_buf(attach, direction);
if (!sg_table)
sg_table = ERR_PTR(-ENOMEM);
+ if (attach->invalidate)
+ list_add(&attach->node, &attach->dmabuf->attachments);
+
return sg_table;
}
EXPORT_SYMBOL_GPL(dma_buf_map_attachment);
@@ -658,6 +676,9 @@ void dma_buf_unmap_attachment(struct dma_buf_attachment *attach,
{
might_sleep();
+ if (attach->invalidate)
+ reservation_object_assert_held(attach->dmabuf->resv);
+
if (WARN_ON(!attach || !attach->dmabuf || !sg_table))
return;
@@ -666,6 +687,26 @@ void dma_buf_unmap_attachment(struct dma_buf_attachment *attach,
}
EXPORT_SYMBOL_GPL(dma_buf_unmap_attachment);
+/**
+ * dma_buf_invalidate_mappings - invalidate all mappings of this dma_buf
+ *
+ * @dmabuf: [in] buffer which mappings should be invalidated
+ *
+ * Informs all attachments that they need to destroy and recreate all their
+ * mappings.
+ */
+void dma_buf_invalidate_mappings(struct dma_buf *dmabuf)
+{
+ struct dma_buf_attachment *attach;
+
+ reservation_object_assert_held(dmabuf->resv);
+
+ list_for_each_entry(attach, &dmabuf->attachments, node)
+ if (attach->invalidate)
+ attach->invalidate(attach);
+}
+EXPORT_SYMBOL_GPL(dma_buf_invalidate_mappings);
+
/**
* DOC: cpu access
*
@@ -1123,10 +1164,12 @@ static int dma_buf_debug_show(struct seq_file *s, void *unused)
seq_puts(s, "\tAttached Devices:\n");
attach_count = 0;
+ reservation_object_lock(buf_obj->resv, NULL);
list_for_each_entry(attach_obj, &buf_obj->attachments, node) {
seq_printf(s, "\t%s\n", dev_name(attach_obj->dev));
attach_count++;
}
+ reservation_object_unlock(buf_obj->resv);
seq_printf(s, "Total %d devices attached\n\n",
attach_count);
diff --git a/include/linux/dma-buf.h b/include/linux/dma-buf.h
index 2c27568d44af..15dd8598bff1 100644
--- a/include/linux/dma-buf.h
+++ b/include/linux/dma-buf.h
@@ -270,6 +270,8 @@ struct dma_buf_ops {
* @poll: for userspace poll support
* @cb_excl: for userspace poll support
* @cb_shared: for userspace poll support
+ * @invalidation_supported: True when the exporter supports unpinned operation
+ * using the reservation lock.
*
* This represents a shared buffer, created by calling dma_buf_export(). The
* userspace representation is a normal file descriptor, which can be created by
@@ -293,6 +295,7 @@ struct dma_buf {
struct list_head list_node;
void *priv;
struct reservation_object *resv;
+ bool invalidation_supported;
/* poll support */
wait_queue_head_t poll;
@@ -326,6 +329,28 @@ struct dma_buf_attachment {
struct device *dev;
struct list_head node;
void *priv;
+
+ /**
+ * @invalidate:
+ *
+ * Optional callback provided by the importer of the dma-buf.
+ *
+ * If provided, the exporter can avoid pinning the backing store while
+ * mappings exist.
+ *
+ * The function is called with the lock of the reservation object
+ * associated with the dma_buf held and the mapping function must be
+ * called with this lock held as well. This makes sure that no mapping
+ * is created concurrently with an ongoing invalidation.
+ *
+ * After the callback all existing mappings are still valid until all
+ * fences in the dma_buf's reservation object are signaled, but should be
+ * destroyed by the importer as soon as possible.
+ *
+ * New mappings can be created immediately, but can't be used before the
+ * exclusive fence in the dma_buf's reservation object is signaled.
+ */
+ void (*invalidate)(struct dma_buf_attachment *attach);
};
/**
@@ -367,6 +392,7 @@ struct dma_buf_export_info {
* @dmabuf: the exported dma_buf
* @dev: the device which wants to import the attachment
* @priv: private data of importer to this attachment
+ * @invalidate: callback to use for invalidating mappings
*
* This structure holds the information required to attach to a buffer. Used
* with dma_buf_attach() only.
@@ -375,6 +401,7 @@ struct dma_buf_attach_info {
struct dma_buf *dmabuf;
struct device *dev;
void *priv;
+ void (*invalidate)(struct dma_buf_attachment *attach);
};
/**
@@ -406,6 +433,7 @@ struct sg_table *dma_buf_map_attachment(struct dma_buf_attachment *,
enum dma_data_direction);
void dma_buf_unmap_attachment(struct dma_buf_attachment *, struct sg_table *,
enum dma_data_direction);
+void dma_buf_invalidate_mappings(struct dma_buf *dma_buf);
int dma_buf_begin_cpu_access(struct dma_buf *dma_buf,
enum dma_data_direction dir);
int dma_buf_end_cpu_access(struct dma_buf *dma_buf,
--
2.14.1
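For illustration, a minimal importer-side sketch of how this interface could
be used; the my_bo structure and function names are made up, only the
dma_buf_attach_info/invalidate parts come from the patch above:

#include <linux/dma-buf.h>
#include <linux/err.h>

struct my_bo {
	struct dma_buf_attachment *attach;
	bool sgt_stale;
};

/* Called with the reservation lock held; only mark the cached mapping stale
 * here, the real teardown waits for the fences in the reservation object. */
static void my_bo_invalidate(struct dma_buf_attachment *attach)
{
	struct my_bo *bo = attach->priv;

	bo->sgt_stale = true;
}

static int my_bo_attach(struct my_bo *bo, struct dma_buf *dmabuf,
			struct device *dev)
{
	struct dma_buf_attach_info info = {
		.dmabuf		= dmabuf,
		.dev		= dev,
		.priv		= bo,
		.invalidate	= my_bo_invalidate,
	};

	bo->attach = dma_buf_attach(&info);
	if (IS_ERR(bo->attach))
		return PTR_ERR(bo->attach);

	return 0;
}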
This set of patches adds an optional invalidate_mappings callback to each DMA-buf attachment which can be filled in by the importer.
This callback allows the exporter to provide the DMA-buf content without pinning it. The reservation object's lock acts as the synchronization point for buffer moves and for creating mappings.
This set includes an implementation for amdgpu which should be rather easily portable to other DRM drivers.
Please comment,
Christian.
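As a rough exporter-side illustration of that synchronization point (the
driver structure around it is assumed, only dma_buf_invalidate_mappings() and
the reservation lock usage come from the patch below): before moving the
backing store the exporter would do something like this:

#include <linux/dma-buf.h>
#include <linux/reservation.h>

/* Sketch: the exporter invalidates all importer mappings before a move. */
static int my_exporter_move(struct dma_buf *dmabuf)
{
	int ret;

	ret = reservation_object_lock(dmabuf->resv, NULL);
	if (ret)
		return ret;

	/* all importers drop their cached mappings... */
	dma_buf_invalidate_mappings(dmabuf);

	/* ...then the actual move is scheduled and its fence is added to
	 * dmabuf->resv before the lock is dropped again */

	reservation_object_unlock(dmabuf->resv);

	return 0;
}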
Each importer can now provide an invalidate_mappings callback.
This allows the exporter to provide the mappings without the need to pin
the backing store.
Signed-off-by: Christian König <christian.koenig@amd.com>
---
drivers/dma-buf/dma-buf.c | 25 +++++++++++++++++++++++++
include/linux/dma-buf.h | 36 ++++++++++++++++++++++++++++++++++++
2 files changed, 61 insertions(+)
diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c
index d78d5fc173dc..ed8d5844ae74 100644
--- a/drivers/dma-buf/dma-buf.c
+++ b/drivers/dma-buf/dma-buf.c
@@ -629,6 +629,9 @@ struct sg_table *dma_buf_map_attachment(struct dma_buf_attachment *attach,
might_sleep();
+ if (attach->invalidate_mappings)
+ reservation_object_assert_held(attach->dmabuf->resv);
+
if (WARN_ON(!attach || !attach->dmabuf))
return ERR_PTR(-EINVAL);
@@ -656,6 +659,9 @@ void dma_buf_unmap_attachment(struct dma_buf_attachment *attach,
{
might_sleep();
+ if (attach->invalidate_mappings)
+ reservation_object_assert_held(attach->dmabuf->resv);
+
if (WARN_ON(!attach || !attach->dmabuf || !sg_table))
return;
@@ -664,6 +670,25 @@ void dma_buf_unmap_attachment(struct dma_buf_attachment *attach,
}
EXPORT_SYMBOL_GPL(dma_buf_unmap_attachment);
+/**
+ * dma_buf_invalidate_mappings - invalidate all mappings of this dma_buf
+ *
+ * @dmabuf: [in] buffer which mappings should be invalidated
+ *
+ * Informs all attachments that they need to destroy and recreate all their
+ * mappings.
+ */
+void dma_buf_invalidate_mappings(struct dma_buf *dmabuf)
+{
+ struct dma_buf_attachment *attach;
+
+ reservation_object_assert_held(dmabuf->resv);
+
+ list_for_each_entry(attach, &dmabuf->attachments, node)
+ attach->invalidate_mappings(attach);
+}
+EXPORT_SYMBOL_GPL(dma_buf_invalidate_mappings);
+
/**
* DOC: cpu access
*
diff --git a/include/linux/dma-buf.h b/include/linux/dma-buf.h
index 085db2fee2d7..c1e2f7d93509 100644
--- a/include/linux/dma-buf.h
+++ b/include/linux/dma-buf.h
@@ -91,6 +91,18 @@ struct dma_buf_ops {
*/
void (*detach)(struct dma_buf *, struct dma_buf_attachment *);
+ /**
+ * @supports_mapping_invalidation:
+ *
+ * True for exporters which support unpinned DMA-buf operation using
+ * the reservation lock.
+ *
+ * When attachment->invalidate_mappings is set the @map_dma_buf and
+ * @unmap_dma_buf callbacks can be called with the reservation lock
+ * held.
+ */
+ bool supports_mapping_invalidation;
+
/**
* @map_dma_buf:
*
@@ -326,6 +338,29 @@ struct dma_buf_attachment {
struct device *dev;
struct list_head node;
void *priv;
+
+ /**
+ * @invalidate_mappings:
+ *
+ * Optional callback provided by the importer of the attachment which
+ * must be set before mappings are created.
+ *
+ * If provided, the exporter can avoid pinning the backing store while
+ * mappings exist.
+ *
+ * The function is called with the lock of the reservation object
+ * associated with the dma_buf held and the mapping function must be
+ * called with this lock held as well. This makes sure that no mapping
+ * is created concurrently with an ongoing invalidation.
+ *
+ * After the callback all existing mappings are still valid until all
+ * fences in the dma_buf's reservation object are signaled, but should be
+ * destroyed by the importer as soon as possible.
+ *
+ * New mappings can be created immediately, but can't be used before the
+ * exclusive fence in the dma_buf's reservation object is signaled.
+ */
+ void (*invalidate_mappings)(struct dma_buf_attachment *attach);
};
/**
@@ -391,6 +426,7 @@ struct sg_table *dma_buf_map_attachment(struct dma_buf_attachment *,
enum dma_data_direction);
void dma_buf_unmap_attachment(struct dma_buf_attachment *, struct sg_table *,
enum dma_data_direction);
+void dma_buf_invalidate_mappings(struct dma_buf *dma_buf);
int dma_buf_begin_cpu_access(struct dma_buf *dma_buf,
enum dma_data_direction dir);
int dma_buf_end_cpu_access(struct dma_buf *dma_buf,
--
2.14.1