On Wed, Jul 26, 2023 at 01:50:28PM -0700, Nicolin Chen wrote:
rc = iopt_add_access(&new_ioas->iopt, access);
if (rc) {
iommufd_put_object(&new_ioas->obj);
mutex_unlock(&access->ioas_lock);
if (cur_ioas)
WARN_ON(iommufd_access_change_pt(access,
cur_ioas->obj.id));
We've already dropped our ref to cur_ioas, so this is also racy with destroy.
Would it be better to call iommufd_access_detach(), which holds the same mutex, in iommufd_access_destroy_object()? We could also unwrap the detach and delay the refcount_dec, as you did in your attaching patch.
It is better just to integrate it with this algorithm so we don't have the refcounting issues, like I did
This is what I came up with:
diff --git a/drivers/iommu/iommufd/device.c b/drivers/iommu/iommufd/device.c
index 57c0e81f5073b2..e55d6e902edb98 100644
--- a/drivers/iommu/iommufd/device.c
+++ b/drivers/iommu/iommufd/device.c
@@ -758,64 +758,101 @@ void iommufd_access_destroy(struct iommufd_access *access)
 }
 EXPORT_SYMBOL_NS_GPL(iommufd_access_destroy, IOMMUFD);
 
-void iommufd_access_detach(struct iommufd_access *access)
+static int iommufd_access_change_ioas(struct iommufd_access *access,
struct iommufd_ioas *new_ioas)
{
struct iommufd_ioas *cur_ioas = access->ioas;
- int rc;
- lockdep_assert_held(&access->ioas_lock);
- /* We are racing with a concurrent detach, bail */
- if (access->ioas_unpin)
return -EBUSY;
I think this should check access->ioas too? I mean:
- /* We are racing with a concurrent detach, bail */
- if (!access->ioas && access->ioas_unpin)
return -EBUSY;
Oh, yes, that should basically be 'cur_ioas != access->ioas_unpin' - ie any difference means we are racing with the unmap call.
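A minimal userspace sketch of that check, not the kernel code: `struct ioas`, `struct access` and `access_check_idle()` are hypothetical stand-ins, and only the two pointers are modeled. It shows why comparing the pointers covers more than the `!access->ioas && access->ioas_unpin` form, since any mismatch is treated as a change in flight.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for an IOAS; only its identity matters here. */
struct ioas {
	int id;
};

/* Hypothetical stand-in for the access; just the two pointers being compared. */
struct access {
	struct ioas *ioas;       /* current attachment */
	struct ioas *ioas_unpin; /* IOAS the unmap path may still be using */
};

/* Return -EBUSY when the two pointers disagree, 0 when they match. */
static int access_check_idle(const struct access *access)
{
	const struct ioas *cur_ioas;

	assert(access != NULL);
	cur_ioas = access->ioas;

	/* Any difference means a change is still in progress, so bail. */
	if (cur_ioas != access->ioas_unpin)
		return -EBUSY;
	return 0;
}
```

Both "attached and settled" (the pointers are equal and non-NULL) and "detached and settled" (both NULL) pass; every other combination returns -EBUSY.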
- if (new_ioas) {
rc = iopt_add_access(&new_ioas->iopt, access);
if (rc) {
iommufd_put_object(&new_ioas->obj);
access->ioas = cur_ioas;
return rc;
}
iommufd_ref_to_users(&new_ioas->obj);
- }
- access->ioas = new_ioas;
- access->ioas_unpin = new_ioas;
iopt_remove_access(&cur_ioas->iopt, access);
There was a bug in my earlier version, which had the same flow of calling iopt_add_access() prior to iopt_remove_access(). Doing that overrides access->iopt_access_list_id, which then gets unset by the following iopt_remove_access().
Ah, I was wondering about that order but didn't check it.
Maybe we just need to pass the ID into iopt_remove_access and keep the right version on the stack.
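A rough userspace model of that idea, assuming a small fixed-size table in place of the real access list. Every name here (`model_iopt`, `model_add_access()`, `model_remove_access()`, `model_move()`) is hypothetical, and the three-argument remove is an assumed signature, not the existing kernel one. The old ID is saved on the stack before the add overwrites the field, and the remove is told which ID to drop.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define MODEL_MAX_ACCESS 4

struct model_access {
	unsigned int iopt_access_list_id; /* ID in the table we were last added to */
};

/* Hypothetical model of an access list: a tiny ID -> access table. */
struct model_iopt {
	struct model_access *slots[MODEL_MAX_ACCESS];
};

/* Take the first free ID and record it in the access. */
static int model_add_access(struct model_iopt *iopt, struct model_access *access)
{
	unsigned int id;

	for (id = 0; id < MODEL_MAX_ACCESS; id++) {
		if (!iopt->slots[id]) {
			iopt->slots[id] = access;
			access->iopt_access_list_id = id;
			return 0;
		}
	}
	return -ENOSPC;
}

/* Assumed variant: the caller passes the ID instead of reading it from the access. */
static void model_remove_access(struct model_iopt *iopt,
				struct model_access *access, unsigned int id)
{
	assert(id < MODEL_MAX_ACCESS && iopt->slots[id] == access);
	iopt->slots[id] = NULL;
}

/* Add to the new table first, then remove from the old one by the saved ID. */
static int model_move(struct model_access *access, struct model_iopt *cur,
		      struct model_iopt *new)
{
	unsigned int cur_id = access->iopt_access_list_id; /* saved before add */
	int rc;

	rc = model_add_access(new, access);
	if (rc)
		return rc; /* the old attachment was never touched */
	model_remove_access(cur, access, cur_id);
	return 0;
}
```

With this ordering a failed add needs no rollback, because the access stays in the old table until the new slot is secured.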
So, I came up with this version, which calls iopt_remove_access() prior to iopt_add_access(). That requires adding back the old ioas upon a failure at iopt_add_access(new_ioas).
That is also sort of reasonable if the refcounting is organized like this.
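For comparison, the remove-first ordering with an add-back on failure can be sketched as below. This is a simplified userspace model with no locking or refcounting: `model_ioas`, its one-slot `attached` field, the `fail_add` injection flag and all function names are hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

struct model_access;

/* Hypothetical one-slot model of an IOAS; fail_add injects an add failure. */
struct model_ioas {
	struct model_access *attached;
	int fail_add;
};

struct model_access {
	struct model_ioas *ioas;
};

static int model_add(struct model_ioas *ioas, struct model_access *access)
{
	if (ioas->fail_add || ioas->attached)
		return -ENOMEM;
	ioas->attached = access;
	return 0;
}

static void model_remove(struct model_ioas *ioas, struct model_access *access)
{
	assert(ioas->attached == access);
	ioas->attached = NULL;
}

/* Remove from the current IOAS first; add back to it if the new add fails. */
static int model_change_ioas(struct model_access *access,
			     struct model_ioas *new_ioas)
{
	struct model_ioas *cur_ioas = access->ioas;
	int rc;

	if (cur_ioas)
		model_remove(cur_ioas, access);
	access->ioas = NULL;

	if (new_ioas) {
		rc = model_add(new_ioas, access);
		if (rc) {
			if (cur_ioas) {
				/* Rollback: assumes the slot just freed can be
				 * retaken, so a failure here is a bug. */
				int rc2 = model_add(cur_ioas, access);

				assert(rc2 == 0);
				(void)rc2;
				access->ioas = cur_ioas;
			}
			return rc;
		}
	}
	access->ioas = new_ioas;
	return 0;
}
```

The cost of this ordering is the rollback path: the add-back is itself an operation that can fail in principle, which is why the quoted code at the top of this mail wraps its equivalent in WARN_ON().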
Jason