Changelog
v3:
 * Rebased on top of Jason's iommufd_hwpt branch:
   https://github.com/jgunthorpe/linux/commits/iommufd_hwpt
   Particularly the following series:
   1) "Revise the hwpt lifetime model"
      https://lore.kernel.org/linux-iommu/0-v2-406f7ac07936+6a-iommufd_hwpt_jgg@nv...
   2) "Add iommufd physical device operations for replace and alloc hwpt"
      https://lore.kernel.org/linux-iommu/0-v1-7612f88c19f5+2f21-iommufd_alloc_jgg...
 * Dropped patches from this series accordingly. There were a couple of
   VFIO patches that will be submitted after the VFIO cdev series. Also,
   renamed the series to be "emulated".
 * Moved the dma_unmap sanity patch to the first in the series.
 * Moved the dma_unmap sanity check to cover both VFIO and IOMMUFD pathways.
 * Added Kevin's "Reviewed-by" to two of the patches.
 * Fixed a NULL pointer bug in vfio_iommufd_emulated_bind().
 * Moved the unmap() call to the common place in iommufd_access_set_ioas().
v2: https://lore.kernel.org/linux-iommu/cover.1675802050.git.nicolinc@nvidia.com...
 * Rebased on top of the vfio_device cdev v2 series.
 * Updated the kdoc and commit message of iommu_group_replace_domain().
 * Dropped the revert-to-core-domain part in iommu_group_replace_domain().
 * Dropped the !ops->dma_unmap check in vfio_iommufd_emulated_attach_ioas().
 * Added the missing rc value in vfio_iommufd_emulated_attach_ioas() from
   the iommufd_access_set_ioas() call.
 * Added a new patch in vfio_main to deny vfio_pin/unpin_pages() calls if
   vdev->ops->dma_unmap is not implemented.
 * Added a __iommufd_device_detach helper and let the replace routine do
   a partial detach().
 * Added a restriction on auto_domains to use the replace feature.
 * Added the patch "iommufd/device: Make hwpt_list list_add/del symmetric"
   from the has_group removal series.
v1: https://lore.kernel.org/linux-iommu/cover.1675320212.git.nicolinc@nvidia.com...
Hi all,
The existing IOMMU APIs provide a pair of functions: iommu_attach_group() for callers to attach a device from the default_domain (NULL if not supported) to a given iommu domain, and iommu_detach_group() for callers to detach a device from a given domain back to the default_domain. Internally, the detach_dev op is deprecated for newer drivers that have a default_domain. This means those drivers can likely switch an attached domain to another one without staging the device at a blocking or default domain, for use cases such as:
1) vPASID mode, when a guest wants to replace a single pasid (PASID=0) table with a larger table (PASID=N)
2) Nesting mode, when switching the attached device from an S2 domain to an S1 domain, or when switching between relevant S1 domains.
This series is rebased on top of Jason Gunthorpe's series that introduces the iommu_group_replace_domain() API and the IOMMUFD infrastructure for IOMMUFD "physical" devices. The IOMMUFD "emulated" devices will need some extra steps to replace the access->ioas object and its iopt pointer.
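As a rough illustration (not part of this series), replacing a domain without a transitional blocking/default domain would look something like the sketch below. It assumes the iommu_group_replace_domain() signature from the referenced iommufd_hwpt series; the surrounding function is hypothetical:

/*
 * Hypothetical caller-side sketch: switch a group from its current
 * (e.g. S2) domain straight to an S1 domain. No detach to a blocking
 * or default domain happens in between, so DMA is not interrupted.
 * iommu_group_replace_domain() comes from the series referenced above;
 * its exact signature there is assumed here.
 */
static int example_switch_domain(struct iommu_group *group,
				 struct iommu_domain *s1_domain)
{
	return iommu_group_replace_domain(group, s1_domain);
}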
You can also find this series on Github: https://github.com/nicolinc/iommufd/commits/iommu_group_replace_domain-v3
Thank you
Nicolin Chen
Nicolin Chen (5):
  vfio: Do not allow !ops->dma_unmap in vfio_pin/unpin_pages()
  iommufd: Create access in vfio_iommufd_emulated_bind()
  iommufd/selftest: Add IOMMU_TEST_OP_ACCESS_SET_IOAS coverage
  iommufd: Add replace support in iommufd_access_set_ioas()
  iommufd/selftest: Add coverage for access->ioas replacement
 drivers/iommu/iommufd/device.c                | 114 ++++++++++++++----
 drivers/iommu/iommufd/iommufd_private.h       |   2 +
 drivers/iommu/iommufd/iommufd_test.h          |   4 +
 drivers/iommu/iommufd/selftest.c              |  25 +++-
 drivers/vfio/iommufd.c                        |  23 ++--
 drivers/vfio/vfio_main.c                      |   4 +
 include/linux/iommufd.h                       |   3 +-
 tools/testing/selftests/iommu/iommufd.c       |  29 ++++-
 tools/testing/selftests/iommu/iommufd_utils.h |  22 +++-
 9 files changed, 185 insertions(+), 41 deletions(-)
A driver that doesn't implement ops->dma_unmap shouldn't be allowed to do vfio_pin/unpin_pages(), though it can use vfio_dma_rw() to access an iova range. Deny !ops->dma_unmap cases in vfio_pin/unpin_pages().
Suggested-by: Kevin Tian <kevin.tian@intel.com>
Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
---
 drivers/vfio/vfio_main.c | 4 ++++
 1 file changed, 4 insertions(+)
diff --git a/drivers/vfio/vfio_main.c b/drivers/vfio/vfio_main.c
index 43bd6b76e2b6..0381f6e7f4e6 100644
--- a/drivers/vfio/vfio_main.c
+++ b/drivers/vfio/vfio_main.c
@@ -1292,6 +1292,8 @@ int vfio_pin_pages(struct vfio_device *device, dma_addr_t iova,
 	/* group->container cannot change while a vfio device is open */
 	if (!pages || !npage || WARN_ON(!vfio_assert_device_open(device)))
 		return -EINVAL;
+	if (!device->ops->dma_unmap)
+		return -EINVAL;
 	if (vfio_device_has_container(device))
 		return vfio_device_container_pin_pages(device, iova, npage,
 						       prot, pages);
@@ -1329,6 +1331,8 @@ void vfio_unpin_pages(struct vfio_device *device, dma_addr_t iova, int npage)
 {
 	if (WARN_ON(!vfio_assert_device_open(device)))
 		return;
+	if (WARN_ON(!device->ops->dma_unmap))
+		return;
 
 	if (vfio_device_has_container(device)) {
 		vfio_device_container_unpin_pages(device, iova, npage);
> From: Nicolin Chen <nicolinc@nvidia.com>
> Sent: Saturday, February 25, 2023 9:52 AM
>
> A driver that doesn't implement ops->dma_unmap shouldn't be allowed to
> do vfio_pin/unpin_pages(), though it can use vfio_dma_rw() to access an
> iova range. Deny !ops->dma_unmap cases in vfio_pin/unpin_pages().
>
> Suggested-by: Kevin Tian <kevin.tian@intel.com>
> Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>

Reviewed-by: Kevin Tian <kevin.tian@intel.com>
> From: Nicolin Chen <nicolinc@nvidia.com>
> Sent: Saturday, February 25, 2023 9:52 AM
>
> A driver that doesn't implement ops->dma_unmap shouldn't be allowed to
> do vfio_pin/unpin_pages(), though it can use vfio_dma_rw() to access an
> iova range. Deny !ops->dma_unmap cases in vfio_pin/unpin_pages().
>
> Suggested-by: Kevin Tian <kevin.tian@intel.com>
> Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>

Reviewed-by: Yi Liu <yi.l.liu@intel.com>
To prepare for an access->ioas replacement, move the iommufd_access_create() call into vfio_iommufd_emulated_bind(), making it symmetric with the __vfio_iommufd_access_destroy() call in vfio_iommufd_emulated_unbind(). This means an access is created/destroyed by bind()/unbind(), and vfio_iommufd_emulated_attach_ioas() only updates the access->ioas pointer.

Since there is no longer an ioas_id input for iommufd_access_create(), add a new helper iommufd_access_set_ioas() to set access->ioas. A "replace" feature can then simply be added to this new helper later on.

Updating access->ioas in vfio_iommufd_emulated_attach_ioas(), however, introduces a potential race with pin_/unpin_pages() calls, which dereference access->ioas->iopt. So add an ioas_lock to protect it.
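For clarity, the driver-facing flow after this change boils down to the sketch below; it simply mirrors the vfio_iommufd_emulated_bind()/_attach_ioas() changes in the diff, with error handling trimmed, and "example_access_ops" / the example_* wrappers being hypothetical names standing in for the driver's iommufd_access_ops (vfio_user_ops in the diff):

/*
 * Sketch of the new split between bind and attach for an emulated
 * device: bind creates the access object, attach only points it at
 * an IOAS via the new helper.
 */
static int example_emulated_bind(struct vfio_device *vdev,
				 struct iommufd_ctx *ictx)
{
	/* bind: create the access object only, no IOAS is set yet */
	vdev->iommufd_access =
		iommufd_access_create(ictx, &example_access_ops, vdev);
	return PTR_ERR_OR_ZERO(vdev->iommufd_access);
}

static int example_emulated_attach(struct vfio_device *vdev, u32 ioas_id)
{
	/* attach: only update the already-created access to the IOAS */
	return iommufd_access_set_ioas(vdev->iommufd_access, ioas_id);
}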
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
---
 drivers/iommu/iommufd/device.c          | 103 ++++++++++++++++++------
 drivers/iommu/iommufd/iommufd_private.h |   1 +
 drivers/iommu/iommufd/selftest.c        |   5 +-
 drivers/vfio/iommufd.c                  |  23 +++---
 include/linux/iommufd.h                 |   3 +-
 5 files changed, 100 insertions(+), 35 deletions(-)
diff --git a/drivers/iommu/iommufd/device.c b/drivers/iommu/iommufd/device.c
index 6ba8633b699d..a955ebd4bd5d 100644
--- a/drivers/iommu/iommufd/device.c
+++ b/drivers/iommu/iommufd/device.c
@@ -618,9 +618,12 @@ void iommufd_access_destroy_object(struct iommufd_object *obj)
 	struct iommufd_access *access =
 		container_of(obj, struct iommufd_access, obj);
 
-	iopt_remove_access(&access->ioas->iopt, access);
+	if (access->ioas) {
+		iopt_remove_access(&access->ioas->iopt, access);
+		refcount_dec(&access->ioas->obj.users);
+	}
 	iommufd_ctx_put(access->ictx);
-	refcount_dec(&access->ioas->obj.users);
+	mutex_destroy(&access->ioas_lock);
 }
 
 /**
@@ -637,12 +640,10 @@ void iommufd_access_destroy_object(struct iommufd_object *obj)
  * The provided ops are required to use iommufd_access_pin_pages().
  */
 struct iommufd_access *
-iommufd_access_create(struct iommufd_ctx *ictx, u32 ioas_id,
+iommufd_access_create(struct iommufd_ctx *ictx,
 		      const struct iommufd_access_ops *ops, void *data)
 {
 	struct iommufd_access *access;
-	struct iommufd_object *obj;
-	int rc;
 
 	/*
 	 * There is no uAPI for the access object, but to keep things symmetric
@@ -655,33 +656,18 @@ iommufd_access_create(struct iommufd_ctx *ictx, u32 ioas_id,
 	access->data = data;
 	access->ops = ops;
 
-	obj = iommufd_get_object(ictx, ioas_id, IOMMUFD_OBJ_IOAS);
-	if (IS_ERR(obj)) {
-		rc = PTR_ERR(obj);
-		goto out_abort;
-	}
-	access->ioas = container_of(obj, struct iommufd_ioas, obj);
-	iommufd_ref_to_users(obj);
-
 	if (ops->needs_pin_pages)
 		access->iova_alignment = PAGE_SIZE;
 	else
 		access->iova_alignment = 1;
-	rc = iopt_add_access(&access->ioas->iopt, access);
-	if (rc)
-		goto out_put_ioas;
 
 	/* The calling driver is a user until iommufd_access_destroy() */
 	refcount_inc(&access->obj.users);
+	mutex_init(&access->ioas_lock);
 	access->ictx = ictx;
 	iommufd_ctx_get(ictx);
 	iommufd_object_finalize(ictx, &access->obj);
 	return access;
-out_put_ioas:
-	refcount_dec(&access->ioas->obj.users);
-out_abort:
-	iommufd_object_abort(ictx, &access->obj);
-	return ERR_PTR(rc);
 }
 EXPORT_SYMBOL_NS_GPL(iommufd_access_create, IOMMUFD);
 
@@ -700,6 +686,50 @@ void iommufd_access_destroy(struct iommufd_access *access)
 }
 EXPORT_SYMBOL_NS_GPL(iommufd_access_destroy, IOMMUFD);
 
+int iommufd_access_set_ioas(struct iommufd_access *access, u32 ioas_id)
+{
+	struct iommufd_ioas *new_ioas = NULL, *cur_ioas;
+	struct iommufd_ctx *ictx = access->ictx;
+	struct iommufd_object *obj;
+	int rc = 0;
+
+	if (ioas_id) {
+		obj = iommufd_get_object(ictx, ioas_id, IOMMUFD_OBJ_IOAS);
+		if (IS_ERR(obj))
+			return PTR_ERR(obj);
+		new_ioas = container_of(obj, struct iommufd_ioas, obj);
+	}
+
+	mutex_lock(&access->ioas_lock);
+	cur_ioas = access->ioas;
+	if (cur_ioas == new_ioas)
+		goto out_unlock;
+
+	if (new_ioas) {
+		rc = iopt_add_access(&new_ioas->iopt, access);
+		if (rc)
+			goto out_unlock;
+		iommufd_ref_to_users(obj);
+	}
+
+	if (cur_ioas) {
+		iopt_remove_access(&cur_ioas->iopt, access);
+		refcount_dec(&cur_ioas->obj.users);
+	}
+
+	access->ioas = new_ioas;
+	mutex_unlock(&access->ioas_lock);
+
+	return 0;
+
+out_unlock:
+	mutex_unlock(&access->ioas_lock);
+	if (new_ioas)
+		iommufd_put_object(obj);
+	return rc;
+}
+EXPORT_SYMBOL_NS_GPL(iommufd_access_set_ioas, IOMMUFD);
+
 /**
  * iommufd_access_notify_unmap - Notify users of an iopt to stop using it
  * @iopt: iopt to work on
@@ -750,8 +780,8 @@ void iommufd_access_notify_unmap(struct io_pagetable *iopt, unsigned long iova,
 void iommufd_access_unpin_pages(struct iommufd_access *access,
 				unsigned long iova, unsigned long length)
 {
-	struct io_pagetable *iopt = &access->ioas->iopt;
 	struct iopt_area_contig_iter iter;
+	struct io_pagetable *iopt;
 	unsigned long last_iova;
 	struct iopt_area *area;
 
@@ -759,6 +789,13 @@ void iommufd_access_unpin_pages(struct iommufd_access *access,
 	    WARN_ON(check_add_overflow(iova, length - 1, &last_iova)))
 		return;
 
+	mutex_lock(&access->ioas_lock);
+	if (!access->ioas) {
+		mutex_unlock(&access->ioas_lock);
+		return;
+	}
+	iopt = &access->ioas->iopt;
+
 	down_read(&iopt->iova_rwsem);
 	iopt_for_each_contig_area(&iter, area, iopt, iova, last_iova)
 		iopt_area_remove_access(
@@ -768,6 +805,7 @@ void iommufd_access_unpin_pages(struct iommufd_access *access,
 			min(last_iova, iopt_area_last_iova(area))));
 	up_read(&iopt->iova_rwsem);
 	WARN_ON(!iopt_area_contig_done(&iter));
+	mutex_unlock(&access->ioas_lock);
 }
 EXPORT_SYMBOL_NS_GPL(iommufd_access_unpin_pages, IOMMUFD);
 
@@ -813,8 +851,8 @@ int iommufd_access_pin_pages(struct iommufd_access *access, unsigned long iova,
 			     unsigned long length, struct page **out_pages,
 			     unsigned int flags)
 {
-	struct io_pagetable *iopt = &access->ioas->iopt;
 	struct iopt_area_contig_iter iter;
+	struct io_pagetable *iopt;
 	unsigned long last_iova;
 	struct iopt_area *area;
 	int rc;
@@ -829,6 +867,13 @@ int iommufd_access_pin_pages(struct iommufd_access *access, unsigned long iova,
 	if (check_add_overflow(iova, length - 1, &last_iova))
 		return -EOVERFLOW;
 
+	mutex_lock(&access->ioas_lock);
+	if (!access->ioas) {
+		mutex_unlock(&access->ioas_lock);
+		return -ENOENT;
+	}
+	iopt = &access->ioas->iopt;
+
 	down_read(&iopt->iova_rwsem);
 	iopt_for_each_contig_area(&iter, area, iopt, iova, last_iova) {
 		unsigned long last = min(last_iova, iopt_area_last_iova(area));
@@ -859,6 +904,7 @@ int iommufd_access_pin_pages(struct iommufd_access *access, unsigned long iova,
 	}
 
 	up_read(&iopt->iova_rwsem);
+	mutex_unlock(&access->ioas_lock);
 	return 0;
 
 err_remove:
@@ -873,6 +919,7 @@ int iommufd_access_pin_pages(struct iommufd_access *access, unsigned long iova,
 				iopt_area_last_iova(area))));
 	}
 	up_read(&iopt->iova_rwsem);
+	mutex_unlock(&access->ioas_lock);
 	return rc;
 }
 EXPORT_SYMBOL_NS_GPL(iommufd_access_pin_pages, IOMMUFD);
@@ -892,8 +939,8 @@ EXPORT_SYMBOL_NS_GPL(iommufd_access_pin_pages, IOMMUFD);
 int iommufd_access_rw(struct iommufd_access *access, unsigned long iova,
 		      void *data, size_t length, unsigned int flags)
 {
-	struct io_pagetable *iopt = &access->ioas->iopt;
 	struct iopt_area_contig_iter iter;
+	struct io_pagetable *iopt;
 	struct iopt_area *area;
 	unsigned long last_iova;
 	int rc;
@@ -903,6 +950,13 @@ int iommufd_access_rw(struct iommufd_access *access, unsigned long iova,
 	if (check_add_overflow(iova, length - 1, &last_iova))
 		return -EOVERFLOW;
 
+	mutex_lock(&access->ioas_lock);
+	if (!access->ioas) {
+		mutex_unlock(&access->ioas_lock);
+		return -ENOENT;
+	}
+	iopt = &access->ioas->iopt;
+
 	down_read(&iopt->iova_rwsem);
 	iopt_for_each_contig_area(&iter, area, iopt, iova, last_iova) {
 		unsigned long last = min(last_iova, iopt_area_last_iova(area));
@@ -929,6 +983,7 @@ int iommufd_access_rw(struct iommufd_access *access, unsigned long iova,
 		rc = -ENOENT;
 err_out:
 	up_read(&iopt->iova_rwsem);
+	mutex_unlock(&access->ioas_lock);
 	return rc;
 }
 EXPORT_SYMBOL_NS_GPL(iommufd_access_rw, IOMMUFD);
diff --git a/drivers/iommu/iommufd/iommufd_private.h b/drivers/iommu/iommufd/iommufd_private.h
index c9acc70d84f7..94e88377a7fc 100644
--- a/drivers/iommu/iommufd/iommufd_private.h
+++ b/drivers/iommu/iommufd/iommufd_private.h
@@ -311,6 +311,7 @@ struct iommufd_access {
 	struct iommufd_object obj;
 	struct iommufd_ctx *ictx;
 	struct iommufd_ioas *ioas;
+	struct mutex ioas_lock;
 	const struct iommufd_access_ops *ops;
 	void *data;
 	unsigned long iova_alignment;
diff --git a/drivers/iommu/iommufd/selftest.c b/drivers/iommu/iommufd/selftest.c
index 51ea09af5778..dacd1f67957d 100644
--- a/drivers/iommu/iommufd/selftest.c
+++ b/drivers/iommu/iommufd/selftest.c
@@ -754,7 +754,7 @@ static int iommufd_test_create_access(struct iommufd_ucmd *ucmd,
 	}
 
 	access = iommufd_access_create(
-		ucmd->ictx, ioas_id,
+		ucmd->ictx,
 		(flags & MOCK_FLAGS_ACCESS_CREATE_NEEDS_PIN_PAGES) ?
 			&selftest_access_ops_pin :
 			&selftest_access_ops,
@@ -763,6 +763,9 @@ static int iommufd_test_create_access(struct iommufd_ucmd *ucmd,
 		rc = PTR_ERR(access);
 		goto out_put_fdno;
 	}
+	rc = iommufd_access_set_ioas(access, ioas_id);
+	if (rc)
+		goto out_destroy;
 	cmd->create_access.out_access_fd = fdno;
 	rc = iommufd_ucmd_respond(ucmd, sizeof(*cmd));
 	if (rc)
diff --git a/drivers/vfio/iommufd.c b/drivers/vfio/iommufd.c
index db4efbd56042..1f87294c1931 100644
--- a/drivers/vfio/iommufd.c
+++ b/drivers/vfio/iommufd.c
@@ -138,10 +138,18 @@ static const struct iommufd_access_ops vfio_user_ops = {
 int vfio_iommufd_emulated_bind(struct vfio_device *vdev,
 			       struct iommufd_ctx *ictx, u32 *out_device_id)
 {
+	struct iommufd_access *user;
+
 	lockdep_assert_held(&vdev->dev_set->lock);
 
-	vdev->iommufd_ictx = ictx;
 	iommufd_ctx_get(ictx);
+	user = iommufd_access_create(ictx, &vfio_user_ops, vdev);
+	if (IS_ERR(user)) {
+		iommufd_ctx_put(ictx);
+		return PTR_ERR(user);
+	}
+	vdev->iommufd_access = user;
+	vdev->iommufd_ictx = ictx;
 	return 0;
 }
 EXPORT_SYMBOL_GPL(vfio_iommufd_emulated_bind);
@@ -161,15 +169,12 @@ EXPORT_SYMBOL_GPL(vfio_iommufd_emulated_unbind);
 
 int vfio_iommufd_emulated_attach_ioas(struct vfio_device *vdev, u32 *pt_id)
 {
-	struct iommufd_access *user;
-
 	lockdep_assert_held(&vdev->dev_set->lock);
 
-	user = iommufd_access_create(vdev->iommufd_ictx, *pt_id, &vfio_user_ops,
-				     vdev);
-	if (IS_ERR(user))
-		return PTR_ERR(user);
-	vdev->iommufd_access = user;
-	return 0;
+	if (!vdev->iommufd_ictx)
+		return -EINVAL;
+	if (!vdev->iommufd_access)
+		return -ENOENT;
+	return iommufd_access_set_ioas(vdev->iommufd_access, *pt_id);
 }
 EXPORT_SYMBOL_GPL(vfio_iommufd_emulated_attach_ioas);
diff --git a/include/linux/iommufd.h b/include/linux/iommufd.h
index 3044a432a83e..f20f63e0b066 100644
--- a/include/linux/iommufd.h
+++ b/include/linux/iommufd.h
@@ -41,9 +41,10 @@ enum {
 };
 
 struct iommufd_access *
-iommufd_access_create(struct iommufd_ctx *ictx, u32 ioas_id,
+iommufd_access_create(struct iommufd_ctx *ictx,
 		      const struct iommufd_access_ops *ops, void *data);
 void iommufd_access_destroy(struct iommufd_access *access);
+int iommufd_access_set_ioas(struct iommufd_access *access, u32 ioas_id);
 
void iommufd_ctx_get(struct iommufd_ctx *ictx);
Add a new IOMMU_TEST_OP_ACCESS_SET_IOAS to allow setting access->ioas individually, corresponding to the iommufd_access_set_ioas() helper.
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
---
 drivers/iommu/iommufd/iommufd_test.h          |  4 +++
 drivers/iommu/iommufd/selftest.c              | 26 +++++++++++++++----
 tools/testing/selftests/iommu/iommufd_utils.h | 22 ++++++++++++++--
 3 files changed, 45 insertions(+), 7 deletions(-)
diff --git a/drivers/iommu/iommufd/iommufd_test.h b/drivers/iommu/iommufd/iommufd_test.h
index d35620158c8b..5ff1fe6524e9 100644
--- a/drivers/iommu/iommufd/iommufd_test.h
+++ b/drivers/iommu/iommufd/iommufd_test.h
@@ -18,6 +18,7 @@ enum {
 	IOMMU_TEST_OP_ACCESS_RW,
 	IOMMU_TEST_OP_SET_TEMP_MEMORY_LIMIT,
 	IOMMU_TEST_OP_MOCK_DOMAIN_REPLACE,
+	IOMMU_TEST_OP_ACCESS_SET_IOAS,
 };
 
 enum {
@@ -92,6 +93,9 @@ struct iommu_test_cmd {
 		struct {
 			__u32 limit;
 		} memory_limit;
+		struct {
+			__u32 ioas_id;
+		} access_set_ioas;
 	};
 	__u32 last;
 };
diff --git a/drivers/iommu/iommufd/selftest.c b/drivers/iommu/iommufd/selftest.c
index dacd1f67957d..80d966b8f689 100644
--- a/drivers/iommu/iommufd/selftest.c
+++ b/drivers/iommu/iommufd/selftest.c
@@ -732,7 +732,7 @@ static struct selftest_access *iommufd_test_alloc_access(void)
 }
 
 static int iommufd_test_create_access(struct iommufd_ucmd *ucmd,
-				      unsigned int ioas_id, unsigned int flags)
+				      unsigned int flags)
 {
 	struct iommu_test_cmd *cmd = ucmd->cmd;
 	struct selftest_access *staccess;
@@ -763,9 +763,6 @@ static int iommufd_test_create_access(struct iommufd_ucmd *ucmd,
 		rc = PTR_ERR(access);
 		goto out_put_fdno;
 	}
-	rc = iommufd_access_set_ioas(access, ioas_id);
-	if (rc)
-		goto out_destroy;
 	cmd->create_access.out_access_fd = fdno;
 	rc = iommufd_ucmd_respond(ucmd, sizeof(*cmd));
 	if (rc)
@@ -784,6 +781,22 @@ static int iommufd_test_create_access(struct iommufd_ucmd *ucmd,
 	return rc;
 }
 
+static int iommufd_test_access_set_ioas(struct iommufd_ucmd *ucmd,
+					unsigned int access_id,
+					unsigned int ioas_id)
+{
+	struct selftest_access *staccess;
+	int rc;
+
+	staccess = iommufd_access_get(access_id);
+	if (IS_ERR(staccess))
+		return PTR_ERR(staccess);
+
+	rc = iommufd_access_set_ioas(staccess->access, ioas_id);
+	fput(staccess->file);
+	return rc;
+}
+
 /* Check that the pages in a page array match the pages in the user VA */
 static int iommufd_test_check_pages(void __user *uptr, struct page **pages,
 				    size_t npages)
@@ -997,8 +1010,11 @@ int iommufd_test(struct iommufd_ucmd *ucmd)
 			ucmd, u64_to_user_ptr(cmd->check_refs.uptr),
 			cmd->check_refs.length, cmd->check_refs.refs);
 	case IOMMU_TEST_OP_CREATE_ACCESS:
-		return iommufd_test_create_access(ucmd, cmd->id,
+		return iommufd_test_create_access(ucmd,
 						  cmd->create_access.flags);
+	case IOMMU_TEST_OP_ACCESS_SET_IOAS:
+		return iommufd_test_access_set_ioas(
+			ucmd, cmd->id, cmd->access_set_ioas.ioas_id);
	case IOMMU_TEST_OP_ACCESS_PAGES:
 		return iommufd_test_access_pages(
 			ucmd, cmd->id, cmd->access_pages.iova,
diff --git a/tools/testing/selftests/iommu/iommufd_utils.h b/tools/testing/selftests/iommu/iommufd_utils.h
index bc0ca8973e79..1333dc85fdab 100644
--- a/tools/testing/selftests/iommu/iommufd_utils.h
+++ b/tools/testing/selftests/iommu/iommufd_utils.h
@@ -117,13 +117,31 @@ static int _test_cmd_hwpt_alloc(int fd, __u32 device_id, __u32 pt_id,
 #define test_cmd_hwpt_alloc(device_id, pt_id, hwpt_id) \
 	ASSERT_EQ(0, _test_cmd_hwpt_alloc(self->fd, device_id, pt_id, hwpt_id))
 
+static int _test_cmd_access_set_ioas(int fd, __u32 access_id,
+				     unsigned int ioas_id)
+{
+	struct iommu_test_cmd cmd = {
+		.size = sizeof(cmd),
+		.op = IOMMU_TEST_OP_ACCESS_SET_IOAS,
+		.id = access_id,
+		.access_set_ioas = { .ioas_id = ioas_id },
+	};
+	int ret;
+
+	ret = ioctl(fd, IOMMU_TEST_CMD, &cmd);
+	if (ret)
+		return ret;
+	return 0;
+}
+#define test_cmd_access_set_ioas(access_id, ioas_id) \
+	ASSERT_EQ(0, _test_cmd_access_set_ioas(self->fd, access_id, ioas_id))
+
 static int _test_cmd_create_access(int fd, unsigned int ioas_id,
 				   __u32 *access_id, unsigned int flags)
 {
 	struct iommu_test_cmd cmd = {
 		.size = sizeof(cmd),
 		.op = IOMMU_TEST_OP_CREATE_ACCESS,
-		.id = ioas_id,
 		.create_access = { .flags = flags },
 	};
 	int ret;
@@ -132,7 +150,7 @@ static int _test_cmd_create_access(int fd, unsigned int ioas_id,
 	if (ret)
 		return ret;
 	*access_id = cmd.create_access.out_access_fd;
-	return 0;
+	return _test_cmd_access_set_ioas(fd, *access_id, ioas_id);
 }
 #define test_cmd_create_access(ioas_id, access_id, flags)                 \
 	ASSERT_EQ(0, _test_cmd_create_access(self->fd, ioas_id, access_id, \
Support an access->ioas replacement in iommufd_access_set_ioas(), which provisionally sets access->ioas to NULL so that any further incoming iommufd_access_pin_pages() call is blocked.

Then, call access->ops->unmap() to clean up the entire iopt. To allow an iommufd_access_unpin_pages() callback to happen via this unmap() call, add an ioas_unpin pointer so the unpin routine won't be affected by the "access->ioas = NULL" trick above.
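For reference, the access->ops->unmap() callback that this path relies on is expected to unpin whatever the consumer still has pinned in the given range; a minimal sketch is below. The consumer-side bookkeeping names (example_consumer, pin_list, etc.) are hypothetical; only the iommufd_access_unpin_pages() call is the real API:

/*
 * Hypothetical access->ops->unmap() implementation. During a replace it
 * is invoked as unmap(data, 0, ULONG_MAX), so the consumer unpins all of
 * its pinned ranges before the IOAS is switched underneath it.
 */
static void example_access_unmap(void *data, unsigned long iova,
				 unsigned long length)
{
	struct example_consumer *ec = data;	/* hypothetical consumer state */
	struct example_pin *pin;		/* hypothetical pinned-range record */

	list_for_each_entry(pin, &ec->pin_list, node)
		iommufd_access_unpin_pages(ec->access, pin->iova, pin->length);
}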
Suggested-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
---
 drivers/iommu/iommufd/device.c          | 15 +++++++++++++--
 drivers/iommu/iommufd/iommufd_private.h |  1 +
 2 files changed, 14 insertions(+), 2 deletions(-)
diff --git a/drivers/iommu/iommufd/device.c b/drivers/iommu/iommufd/device.c
index a955ebd4bd5d..c44d6e7a2ed7 100644
--- a/drivers/iommu/iommufd/device.c
+++ b/drivers/iommu/iommufd/device.c
@@ -712,11 +712,22 @@ int iommufd_access_set_ioas(struct iommufd_access *access, u32 ioas_id)
 		iommufd_ref_to_users(obj);
 	}
 
+	/*
+	 * Set ioas to NULL to block any further iommufd_access_pin_pages().
+	 * iommufd_access_unpin_pages() can continue using access->ioas_unpin.
+	 */
+	access->ioas = NULL;
+
+	mutex_unlock(&access->ioas_lock);
+	access->ops->unmap(access->data, 0, ULONG_MAX);
+	mutex_lock(&access->ioas_lock);
+
 	if (cur_ioas) {
 		iopt_remove_access(&cur_ioas->iopt, access);
 		refcount_dec(&cur_ioas->obj.users);
 	}
 
+	access->ioas_unpin = new_ioas;
 	access->ioas = new_ioas;
 	mutex_unlock(&access->ioas_lock);
 
@@ -790,11 +801,11 @@ void iommufd_access_unpin_pages(struct iommufd_access *access,
 		return;
 
 	mutex_lock(&access->ioas_lock);
-	if (!access->ioas) {
+	if (!access->ioas_unpin) {
 		mutex_unlock(&access->ioas_lock);
 		return;
 	}
-	iopt = &access->ioas->iopt;
+	iopt = &access->ioas_unpin->iopt;
 
 	down_read(&iopt->iova_rwsem);
 	iopt_for_each_contig_area(&iter, area, iopt, iova, last_iova)
diff --git a/drivers/iommu/iommufd/iommufd_private.h b/drivers/iommu/iommufd/iommufd_private.h
index 94e88377a7fc..44e77ea9c399 100644
--- a/drivers/iommu/iommufd/iommufd_private.h
+++ b/drivers/iommu/iommufd/iommufd_private.h
@@ -311,6 +311,7 @@ struct iommufd_access {
 	struct iommufd_object obj;
 	struct iommufd_ctx *ictx;
 	struct iommufd_ioas *ioas;
+	struct iommufd_ioas *ioas_unpin;
 	struct mutex ioas_lock;
 	const struct iommufd_access_ops *ops;
 	void *data;
> From: Nicolin Chen <nicolinc@nvidia.com>
> Sent: Saturday, February 25, 2023 9:52 AM
>
> +	/*
> +	 * Set ioas to NULL to block any further iommufd_access_pin_pages().
> +	 * iommufd_access_unpin_pages() can continue using access->ioas_unpin.
> +	 */
> +	access->ioas = NULL;
> +
> +	mutex_unlock(&access->ioas_lock);
> +	access->ops->unmap(access->data, 0, ULONG_MAX);
> +	mutex_lock(&access->ioas_lock);

This should check whether @unmap is valid.
On Thu, Mar 02, 2023 at 08:23:54AM +0000, Tian, Kevin wrote:
> > From: Nicolin Chen <nicolinc@nvidia.com>
> > Sent: Saturday, February 25, 2023 9:52 AM
> >
> > +	/*
> > +	 * Set ioas to NULL to block any further iommufd_access_pin_pages().
> > +	 * iommufd_access_unpin_pages() can continue using access->ioas_unpin.
> > +	 */
> > +	access->ioas = NULL;
> > +
> > +	mutex_unlock(&access->ioas_lock);
> > +	access->ops->unmap(access->data, 0, ULONG_MAX);
> > +	mutex_lock(&access->ioas_lock);
>
> This should check whether @unmap is valid.

Will change to:

+	if (access->ops->unmap) {
+		mutex_unlock(&access->ioas_lock);
+		access->ops->unmap(access->data, 0, ULONG_MAX);
+		mutex_lock(&access->ioas_lock);
+	}

Thanks!
Nic
Add replace coverage as part of the user_copy test case. It basically repeats the copy test after replacing the old ioas with a new one.
Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
---
 tools/testing/selftests/iommu/iommufd.c | 29 +++++++++++++++++++++++--
 1 file changed, 27 insertions(+), 2 deletions(-)
diff --git a/tools/testing/selftests/iommu/iommufd.c b/tools/testing/selftests/iommu/iommufd.c
index 65f1847de5a5..76f2ac73bb45 100644
--- a/tools/testing/selftests/iommu/iommufd.c
+++ b/tools/testing/selftests/iommu/iommufd.c
@@ -1252,7 +1252,13 @@ TEST_F(iommufd_mock_domain, user_copy)
 		.dst_iova = MOCK_APERTURE_START,
 		.length = BUFFER_SIZE,
 	};
-	unsigned int ioas_id;
+	struct iommu_ioas_unmap unmap_cmd = {
+		.size = sizeof(unmap_cmd),
+		.ioas_id = self->ioas_id,
+		.iova = MOCK_APERTURE_START,
+		.length = BUFFER_SIZE,
+	};
+	unsigned int new_ioas_id, ioas_id;
 
 	/* Pin the pages in an IOAS with no domains then copy to an IOAS with domains */
 	test_ioctl_ioas_alloc(&ioas_id);
@@ -1270,11 +1276,30 @@ TEST_F(iommufd_mock_domain, user_copy)
 	ASSERT_EQ(0, ioctl(self->fd, IOMMU_IOAS_COPY, &copy_cmd));
 	check_mock_iova(buffer, MOCK_APERTURE_START, BUFFER_SIZE);
 
+	/* Now replace the ioas with a new one */
+	test_ioctl_ioas_alloc(&new_ioas_id);
+	test_ioctl_ioas_map_id(new_ioas_id, buffer, BUFFER_SIZE,
+			       &copy_cmd.src_iova);
+	test_cmd_access_set_ioas(access_cmd.id, new_ioas_id);
+
+	/* Destroy the old ioas and cleanup copied mapping */
+	ASSERT_EQ(0, ioctl(self->fd, IOMMU_IOAS_UNMAP, &unmap_cmd));
+	test_ioctl_destroy(ioas_id);
+
+	/* Then run the same test again with the new ioas */
+	access_cmd.access_pages.iova = copy_cmd.src_iova;
+	ASSERT_EQ(0,
+		  ioctl(self->fd, _IOMMU_TEST_CMD(IOMMU_TEST_OP_ACCESS_PAGES),
+			&access_cmd));
+	copy_cmd.src_ioas_id = new_ioas_id;
+	ASSERT_EQ(0, ioctl(self->fd, IOMMU_IOAS_COPY, &copy_cmd));
+	check_mock_iova(buffer, MOCK_APERTURE_START, BUFFER_SIZE);
+
 	test_cmd_destroy_access_pages(
 		access_cmd.id, access_cmd.access_pages.out_access_pages_id);
 	test_cmd_destroy_access(access_cmd.id);
 
-	test_ioctl_destroy(ioas_id);
+	test_ioctl_destroy(new_ioas_id);
 }
 
TEST_F(iommufd_mock_domain, replace)
> From: Nicolin Chen <nicolinc@nvidia.com>
> Sent: Saturday, February 25, 2023 9:52 AM
>
> Add replace coverage as part of the user_copy test case. It basically
> repeats the copy test after replacing the old ioas with a new one.
>
> Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>

Reviewed-by: Kevin Tian <kevin.tian@intel.com>