On Wed, Jul 26, 2023 at 11:30:17AM -0300, Jason Gunthorpe wrote:
On Mon, Jul 24, 2023 at 12:47:05PM -0700, Nicolin Chen wrote:
-int iommufd_access_attach(struct iommufd_access *access, u32 ioas_id)
+static int iommufd_access_change_pt(struct iommufd_access *access, u32 ioas_id)
 {
- struct iommufd_ioas *cur_ioas = access->ioas;
struct iommufd_ioas *new_ioas;
- int rc = 0;
- int rc;
- mutex_lock(&access->ioas_lock);
- if (WARN_ON(access->ioas || access->ioas_unpin)) {
mutex_unlock(&access->ioas_lock);
return -EINVAL;
- }
- lockdep_assert_held(&access->ioas_lock);
new_ioas = iommufd_get_ioas(access->ictx, ioas_id);
- if (IS_ERR(new_ioas)) {
mutex_unlock(&access->ioas_lock);
- if (IS_ERR(new_ioas)) return PTR_ERR(new_ioas);
- }
- if (cur_ioas)
__iommufd_access_detach(access);
Dropping the mutex while this function runs is racy with the rest of this. We can mitigate it by blocking concurrent changes while detaching, which is the case when access->ioas_unpin is set.
Oh, you mean the unmap part dropping the mutex, right? I see.
rc = iopt_add_access(&new_ioas->iopt, access);
if (rc) {
iommufd_put_object(&new_ioas->obj);
mutex_unlock(&access->ioas_lock);
if (cur_ioas)
WARN_ON(iommufd_access_change_pt(access,
cur_ioas->obj.id));
We've already dropped our ref to cur_ioas, so this is also racy with destroy.
Would it be better to call iommufd_access_detach(), which holds the same mutex, in iommufd_access_destroy_object()? We could also unwrap the detach and delay the refcount_dec, as you did in your attaching patch.
This is what I came up with:
diff --git a/drivers/iommu/iommufd/device.c b/drivers/iommu/iommufd/device.c
index 57c0e81f5073b2..e55d6e902edb98 100644
--- a/drivers/iommu/iommufd/device.c
+++ b/drivers/iommu/iommufd/device.c
@@ -758,64 +758,101 @@ void iommufd_access_destroy(struct iommufd_access *access)
 }
 EXPORT_SYMBOL_NS_GPL(iommufd_access_destroy, IOMMUFD);

-void iommufd_access_detach(struct iommufd_access *access)
+static int iommufd_access_change_ioas(struct iommufd_access *access,
struct iommufd_ioas *new_ioas)
{
struct iommufd_ioas *cur_ioas = access->ioas;
- int rc;
- lockdep_assert_held(&access->ioas_lock);
- /* We are racing with a concurrent detach, bail */
- if (access->ioas_unpin)
return -EBUSY;
I think this should check access->ioas too? I mean:
+ /* We are racing with a concurrent detach, bail */
+ if (!access->ioas && access->ioas_unpin)
+ return -EBUSY;
Otherwise, a normal detach() would fail, since an attached access has both a valid ioas and a valid ioas_unpin.
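To make the three states concrete, here is a minimal userspace sketch comparing the two checks. It is not kernel code: struct toy_ioas, struct toy_access and both helper names are made up for illustration, and only the two pointer fields mirror access->ioas and access->ioas_unpin.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel structures, for illustration only. */
struct toy_ioas {
	int id;
};

struct toy_access {
	struct toy_ioas *ioas;       /* set to NULL to block further pinning */
	struct toy_ioas *ioas_unpin; /* stays valid until the detach completes */
};

/* Check as quoted above: bail whenever ioas_unpin is set. */
static int check_unpin_only(const struct toy_access *access)
{
	if (access->ioas_unpin)
		return -EBUSY;
	return 0;
}

/* Check suggested in this reply: bail only in the half-detached state. */
static int check_ioas_and_unpin(const struct toy_access *access)
{
	if (!access->ioas && access->ioas_unpin)
		return -EBUSY;
	return 0;
}
```

With an attached access (both pointers set), check_unpin_only() returns -EBUSY, so a normal detach would be rejected, while check_ioas_and_unpin() returns 0. Only the in-between state (ioas == NULL, ioas_unpin still set) is treated as a concurrent detach by the second check.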
- if (IS_ERR(new_ioas))
return PTR_ERR(new_ioas);
- if (cur_ioas == new_ioas)
return 0;
- mutex_lock(&access->ioas_lock);
- if (WARN_ON(!access->ioas))
goto out;
/*
 * Set ioas to NULL to block any further iommufd_access_pin_pages().
 * iommufd_access_unpin_pages() can continue using access->ioas_unpin.
 */
access->ioas = NULL;
- if (access->ops->unmap) {
- if (cur_ioas && access->ops->unmap) {
mutex_unlock(&access->ioas_lock);
access->ops->unmap(access->data, 0, ULONG_MAX);
mutex_lock(&access->ioas_lock);
}
- if (new_ioas) {
rc = iopt_add_access(&new_ioas->iopt, access);
if (rc) {
iommufd_put_object(&new_ioas->obj);
access->ioas = cur_ioas;
return rc;
}
iommufd_ref_to_users(&new_ioas->obj);
- }
- access->ioas = new_ioas;
- access->ioas_unpin = new_ioas;
iopt_remove_access(&cur_ioas->iopt, access);
There was a bug in my earlier version, which had the same flow: it called iopt_add_access() prior to iopt_remove_access(). Doing that would overwrite access->iopt_access_list_id, which would then get unset by the following iopt_remove_access().
Please refer to: https://lore.kernel.org/linux-iommu/ZJYYWz2wy%2F86FapK@Asurada-Nvidia/
If we want a cleaner detach-then-attach flow, we would need an atomic function in the io_pagetable.c file handling the id, yet I couldn't figure out a good name for the atomic function, since it's about an access shifting between two iopts, other than simply "iopt_replace_access".
So, I came up with this version calling iopt_remove_access() prior to iopt_add_access(), which requires adding back the old ioas upon a failure at iopt_add_access(new_ioas).
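The ordering problem can be sketched with a toy model in userspace C. Everything here is an assumption made for illustration: struct toy_iopt, struct toy_access, the fixed-size slot table, and the toy_* helpers are hypothetical. The sketch only assumes that adding an access stores the allocated id in the access, and that removing it erases whatever slot that stored id points at.

```c
#include <errno.h>
#include <stddef.h>

#define TOY_SLOTS 4

/* Hypothetical stand-in: the access remembers only ONE list id. */
struct toy_access {
	unsigned int list_id;
};

/* Hypothetical stand-in for a per-iopt access list. */
struct toy_iopt {
	struct toy_access *slots[TOY_SLOTS];
};

/* Take the first free slot and record its id in the access. */
static int toy_add_access(struct toy_iopt *iopt, struct toy_access *access)
{
	for (unsigned int i = 0; i < TOY_SLOTS; i++) {
		if (!iopt->slots[i]) {
			iopt->slots[i] = access;
			access->list_id = i; /* overwrites any previous id */
			return 0;
		}
	}
	return -ENOMEM;
}

/* Erase the slot named by the id currently stored in the access. */
static void toy_remove_access(struct toy_iopt *iopt, struct toy_access *access)
{
	iopt->slots[access->list_id] = NULL;
}

/* Remove-then-add, adding the access back to the old iopt on failure. */
static int toy_change_iopt(struct toy_iopt *cur, struct toy_iopt *next,
			   struct toy_access *access)
{
	int rc;

	toy_remove_access(cur, access);
	rc = toy_add_access(next, access);
	if (rc) {
		/* Roll back: the slot just freed in cur is still available. */
		toy_add_access(cur, access);
		return rc;
	}
	return 0;
}
```

In this model, calling toy_add_access() on the new iopt before toy_remove_access() on the old one replaces the stored id, so the removal erases the wrong slot in the old iopt and leaves the access's real entry behind. toy_change_iopt() avoids that by removing first, at the cost of needing the add-back path.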
I will try making some changes accordingly on top of this patch.

Thanks
Nicolin