On Tue, Oct 29, 2024 at 12:58:24PM -0300, Jason Gunthorpe wrote:
On Fri, Oct 25, 2024 at 04:50:30PM -0700, Nicolin Chen wrote:
diff --git a/drivers/iommu/iommufd/device.c b/drivers/iommu/iommufd/device.c
index 5fd3dd420290..e50113305a9c 100644
--- a/drivers/iommu/iommufd/device.c
+++ b/drivers/iommu/iommufd/device.c
@@ -277,6 +277,17 @@ EXPORT_SYMBOL_NS_GPL(iommufd_ctx_has_group, IOMMUFD);
  */
 void iommufd_device_unbind(struct iommufd_device *idev)
 {
+	u32 vdev_id = 0;
+	/* idev->vdev object should be destroyed prior, yet just in case.. */
+	mutex_lock(&idev->igroup->lock);
+	if (idev->vdev)
Then should it have a WARN_ON here?
That would be a user space mistake (forgetting to call the destroy ioctl on the object), and I recall the kernel shouldn't WARN_ON in that case?
+		vdev_id = idev->vdev->obj.id;
+	mutex_unlock(&idev->igroup->lock);
+	/* Relying on xa_lock against a race with iommufd_destroy() */
+	if (vdev_id)
+		iommufd_object_remove(idev->ictx, NULL, vdev_id, 0);
That doesn't seem right, iommufd_object_remove() should never be used to destroy an object that userspace created with an IOCTL, in fact that just isn't allowed.
It was for our auto-destroy feature: if user space forgets to destroy the object while unplugging the device from the VM, this saves the day.
Ugh, there is worse here, we can't hold a long term reference on a kernel owned object:
	idev->vdev = vdev;
	refcount_inc(&idev->obj.users);
As it prevents the kernel from disconnecting it.
Hmm, mind elaborating? I think iommufd_fops_release() would xa_for_each the object list, destroying the vdev object first and then this idev (and the viommu too)?
I came up with this that seems like it will work. Maybe we will need to improve it later. Instead of using the idev, just keep the raw struct device. We can hold a refcount on the struct device without races. There is no need for the idev igroup lock since the xa_lock does everything we need.
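To make the refcounting idea concrete, here is a minimal userspace sketch. It is not the patch referred to above and not kernel code: every name in it (ref_device, ref_idev, ref_vdev and their helpers) is hypothetical, and a plain atomic counter stands in for the reference held on a struct device. It only illustrates that a vdev pinning the raw device, rather than taking a long-term user on the idev, leaves the idev free to be torn down.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct device: nothing but a refcount. */
struct ref_device {
	atomic_int refs;
};

/* Hypothetical stand-in for an idev: a wrapper with its own user count. */
struct ref_idev {
	struct ref_device *dev;
	int users; /* a non-zero count blocks teardown of the wrapper */
};

/* Hypothetical stand-in for a vdev that remembers only the raw device. */
struct ref_vdev {
	struct ref_device *dev;
};

/* Allocate a device holding one initial reference (owned by the idev). */
struct ref_device *ref_device_alloc(void)
{
	struct ref_device *dev = malloc(sizeof(*dev));

	if (dev)
		atomic_init(&dev->refs, 1);
	return dev;
}

/* Take one more reference on the device. */
struct ref_device *ref_device_get(struct ref_device *dev)
{
	assert(dev != NULL);
	atomic_fetch_add(&dev->refs, 1);
	return dev;
}

/* Drop one reference; returns 1 if it was the last and dev was freed. */
int ref_device_put(struct ref_device *dev)
{
	if (atomic_fetch_sub(&dev->refs, 1) == 1) {
		free(dev);
		return 1;
	}
	return 0;
}

/* Bind the vdev to the raw device; idev->users is deliberately untouched. */
void ref_vdev_init(struct ref_vdev *vdev, struct ref_idev *idev)
{
	vdev->dev = ref_device_get(idev->dev);
}

/* The wrapper can be destroyed only while no long-term user pins it. */
int ref_idev_destroy(struct ref_idev *idev)
{
	if (idev->users)
		return -1;
	ref_device_put(idev->dev);
	idev->dev = NULL;
	return 0;
}
```

In this model, ref_idev_destroy() succeeds even though a ref_vdev still exists, and the device itself stays alive until the vdev drops its own reference with ref_device_put().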
OK. If user space forgot to destroy its vdev while unplugging the device, it would not be allowed to hotplug another device (or the same device) back to the same slot having the same RID, since the RID on the vIOMMU would be occupied by the undestroyed vdev.
If we decide to do so, I think we should highlight this somewhere in the doc.
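The consequence described above can be modeled in a few lines of userspace C. This is only a sketch under stated assumptions: rid_viommu, rid_vdev and the helpers are hypothetical names, a small fixed array stands in for whatever structure the vIOMMU uses to track vdevs, and the -EEXIST return value is an assumption about how an occupied RID would be reported.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define RID_MODEL_SLOTS 8 /* arbitrary capacity for this model */

/* Hypothetical stand-in for a vdev: one virtual RID claimed on a vIOMMU. */
struct rid_vdev {
	uint64_t virt_id; /* virtual RID as seen by the guest */
	int in_use;
};

/* Hypothetical stand-in for a vIOMMU tracking its vdevs by virtual RID. */
struct rid_viommu {
	struct rid_vdev slots[RID_MODEL_SLOTS];
};

/* Return the entry holding virt_id, or NULL if that RID is free. */
struct rid_vdev *rid_vdev_find(struct rid_viommu *viommu, uint64_t virt_id)
{
	size_t i;

	assert(viommu != NULL);
	for (i = 0; i < RID_MODEL_SLOTS; i++) {
		if (viommu->slots[i].in_use &&
		    viommu->slots[i].virt_id == virt_id)
			return &viommu->slots[i];
	}
	return NULL;
}

/* Claim virt_id; fails with -EEXIST while an older vdev still holds it. */
int rid_vdev_alloc(struct rid_viommu *viommu, uint64_t virt_id)
{
	size_t i;

	if (rid_vdev_find(viommu, virt_id))
		return -EEXIST;
	for (i = 0; i < RID_MODEL_SLOTS; i++) {
		if (!viommu->slots[i].in_use) {
			viommu->slots[i].virt_id = virt_id;
			viommu->slots[i].in_use = 1;
			return 0;
		}
	}
	return -ENOSPC;
}

/* Release virt_id, as an explicit destroy from user space would. */
int rid_vdev_destroy(struct rid_viommu *viommu, uint64_t virt_id)
{
	struct rid_vdev *vdev = rid_vdev_find(viommu, virt_id);

	if (!vdev)
		return -ENOENT;
	vdev->in_use = 0;
	return 0;
}
```

With this model, a second rid_vdev_alloc() for the same virtual RID fails until rid_vdev_destroy() has been called, which mirrors the hotplug restriction that would need to be documented.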
Thanks
Nicolin