Re: [PATCH v5 01/13] iommufd/viommu: Add IOMMUFD_OBJ_VDEVICE and IOMMU_VDEVICE_ALLOC ioctl

29 Oct 2024

On Tue, Oct 29, 2024 at 03:48:01PM -0300, Jason Gunthorpe wrote:
...
On Tue, Oct 29, 2024 at 10:29:56AM -0700, Nicolin Chen wrote:
...
On Tue, Oct 29, 2024 at 12:58:24PM -0300, Jason Gunthorpe wrote:
...
On Fri, Oct 25, 2024 at 04:50:30PM -0700, Nicolin Chen wrote:
...

diff --git a/drivers/iommu/iommufd/device.c b/drivers/iommu/iommufd/device.c
index 5fd3dd420290..e50113305a9c 100644
--- a/drivers/iommu/iommufd/device.c
+++ b/drivers/iommu/iommufd/device.c
@@ -277,6 +277,17 @@ EXPORT_SYMBOL_NS_GPL(iommufd_ctx_has_group, IOMMUFD);
  */
 void iommufd_device_unbind(struct iommufd_device *idev)
 {

+	u32 vdev_id = 0;
+
+	/* idev->vdev object should be destroyed prior, yet just in case.. */
+	mutex_lock(&idev->igroup->lock);
+	if (idev->vdev)

Then should it have a WARN_ON here?
It'd be a user space mistake (forgetting to call the destroy ioctl
on the object), in which case I recall the kernel shouldn't WARN_ON?
But you can't get here because:
refcount_inc(&idev->obj.users);
And kernel doesn't destroy objects with elevated ref counts?
Hmm, this is not a ->destroy() but iommufd_device_unbind called
by VFIO. And we actually ran into this routine when QEMU didn't
destroy vdev. So, I added this chunk.
The iommufd_object_remove(vdev_id) here would destroy the vdev
where its destroy() does refcount_dec(&idev->obj.users). Then,
the following iommufd_object_destroy_user(.., &idev->obj) will
succeed.
With that said, let's just mandate userspace to destroy vdev.
...
...
...
...

+		vdev_id = idev->vdev->obj.id;
+	mutex_unlock(&idev->igroup->lock);
+	/* Relying on xa_lock against a race with iommufd_destroy() */
+	if (vdev_id)
+		iommufd_object_remove(idev->ictx, NULL, vdev_id, 0);



That doesn't seem right, iommufd_object_remove() should never be used
to destroy an object that userspace created with an IOCTL, in fact
that just isn't allowed.
It was for our auto destroy feature.
auto domains are "hidden" hwpts that are kernel managed. They are not
"userspace created".
"Userspace created" objects are ones that userspace is expected to call
destroy on.
OK. I misunderstood that.
...
If you destroy them behind the scenes in the kernel then the object ID
can be reallocated for something else and when userspace does DESTROY
on the ID it thought was still allocated it will malfunction.
So, only userspace can destroy objects that userspace created.
I see. That makes sense.
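
The ID-reuse hazard described above can be illustrated with a small userspace toy model. This is only a sketch: the table, the lowest-free-ID allocation policy, and every `toy_*` name are assumptions made for illustration and are not the iommufd implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: NOT the iommufd object table. All names are hypothetical. */
enum toy_type { TOY_NONE = 0, TOY_VDEVICE, TOY_HWPT };

#define TOY_MAX_IDS 8

struct toy_table {
	/* slot[0] is never handed out so that 0 can mean "no ID" */
	enum toy_type slot[TOY_MAX_IDS];
};

/* Hand out the lowest free ID (assumed policy); returns 0 if the table is full. */
static unsigned int toy_alloc(struct toy_table *t, enum toy_type type)
{
	for (unsigned int id = 1; id < TOY_MAX_IDS; id++) {
		if (t->slot[id] == TOY_NONE) {
			t->slot[id] = type;
			return id;
		}
	}
	return 0;
}

/*
 * Destroy whatever currently lives at @id and report what it was.
 * The caller cannot tell whether the ID still names the object it
 * originally allocated.
 */
static enum toy_type toy_destroy(struct toy_table *t, unsigned int id)
{
	enum toy_type old;

	assert(id < TOY_MAX_IDS);
	old = t->slot[id];
	t->slot[id] = TOY_NONE;
	return old;
}
```

In this model, if the "kernel" side calls `toy_destroy()` on a vdevice ID and another object is then allocated, the new object can land on the same ID, and a later destroy issued by "userspace" with its stale ID removes the wrong object.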
...
...
If user space forgot to destroy the object while trying to unplug
the device from the VM, this saves the day.
No, it should/does fail destroy of the VIOMMU object because the users
refcount is elevated.
The vIOMMU object's refcount is also decremented by unbind() calling
remove(). But anyway, we agreed that userspace should destroy it
explicitly.
...
...
...
Ugh, there is worse here, we can't hold a long term reference on a
kernel owned object:
	idev->vdev = vdev;
	refcount_inc(&idev->obj.users);
As it prevents the kernel from disconnecting it.
Hmm, mind elaborating? I think the iommufd_fops_release() would
xa_for_each the object list that destroys the vdev object first
then this idev (and viommu too)?
iommufd_device_unbind() can't fail, and if the object can't be
destroyed because it has an elevated long term refcount it WARN's:
	ret = iommufd_object_remove(ictx, obj, obj->id, REMOVE_WAIT_SHORTTERM);
	/*
	 * If there is a bug and we couldn't destroy the object then we did put
	 * back the caller's users refcount and will eventually try to free it
	 * again during close.
	 */
	WARN_ON(ret);
So you cannot take long term references on kernel owned objects. Only
userspace owned objects.
OK. I think I had got this part. Gao ran into this WARN_ON at v3,
so I added iommufd_object_remove(vdev_id) in unbind() prior to
this iommufd_object_destroy_user(idev->ictx, &idev->obj).
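
The refcount interaction discussed above can be sketched as a userspace toy model. It is a simplification under stated assumptions: plain integers stand in for refcount_t, there is no locking, and all `toy_*` names are hypothetical rather than iommufd functions.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Toy model only; names are hypothetical and not the iommufd API. */
struct toy_obj {
	int users;	/* 1 means only the owner's own reference is left */
	bool destroyed;
};

/* Destroy succeeds only when nobody else holds a reference. */
static int toy_obj_destroy(struct toy_obj *obj)
{
	if (obj->users != 1)
		return -EBUSY;
	obj->users = 0;
	obj->destroyed = true;
	return 0;
}

/* Creating a vdev pins the idev, mirroring refcount_inc(&idev->obj.users). */
static void toy_vdev_create(struct toy_obj *vdev, struct toy_obj *idev)
{
	assert(idev->users >= 1);
	vdev->users = 1;
	vdev->destroyed = false;
	idev->users++;
}

/* Destroying the vdev drops the reference it held on the idev. */
static int toy_vdev_destroy(struct toy_obj *vdev, struct toy_obj *idev)
{
	int ret = toy_obj_destroy(vdev);

	if (!ret)
		idev->users--;
	return ret;
}
```

With an idev that starts at `users == 1`, calling `toy_vdev_create()` and then `toy_obj_destroy(&idev)` returns `-EBUSY`, which corresponds to the case where the real unbind path, which cannot fail, would hit the WARN_ON. Calling `toy_vdev_destroy()` first lets the idev destroy succeed, which is the ordering userspace is expected to follow.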
...
...
OK. If user space forgot to destroy its vdev while unplugging the
device, it would not be allowed to hotplug another device (or the
same device) back to the same slot having the same RID, since the
RID on the vIOMMU would be occupied by the undestroyed vdev.
Yes, that seems correct and obvious to me. Until the vdev is
explicitly destroyed the ID is in-use.
Good userspace should destroy the iommufd vDEVICE object before
closing the VFIO file descriptor.
If it doesn't, then the vDEVICE object remains even though the VFIO it
was linked to is gone.
I see.
Thanks
Nicolin

    
