From: Jason Gunthorpe jgg@nvidia.com Sent: Wednesday, October 26, 2022 2:12 AM
+/**
- iommufd_device_bind - Bind a physical device to an iommu fd
- @ictx: iommufd file descriptor
- @dev: Pointer to a physical PCI device struct
- @id: Output ID number to return to userspace for this device
- A successful bind establishes an ownership over the device and returns
- struct iommufd_device pointer, otherwise returns error pointer.
- A driver using this API must set driver_managed_dma and must not touch
- the device until this routine succeeds and establishes ownership.
- Binding a PCI device places the entire RID under iommufd control.
Then what about non-PCI device? clearer to say that calling this interface just places the entire physical device under iommufd control
* FIXME: This is conceptually broken for iommufd since we want to
allow
* userspace to change the domains, eg switch from an identity IOAS
to a
* DMA IOAs. There is currently no way to create a MSI window that
IOAs -> IOAS
rc = iommu_get_msi_cookie(hwpt->domain, sw_msi_start);
if (rc && rc != -ENODEV)
return rc;
I know this is copied from VFIO but a comment is appreciated why -ENODEV is considered sane to move forward.
- /*
* Otherwise the platform has a MSI window that is not isolated. For
* historical compat with VFIO allow a module parameter to ignore
the
* insecurity.
*/
- if (!(flags &
IOMMUFD_ATTACH_FLAGS_ALLOW_UNSAFE_INTERRUPT))
return -EPERM;
Throw out a warning similar to vfio.
- rc = iopt_table_enforce_group_resv_regions(&hwpt->ioas->iopt,
idev->dev,
idev->group,
&sw_msi_start);
- if (rc)
goto out_iova;
goto out_unlock since the function internally already called __iopt_remove_reserved_iova() upon error.
- /*
* There is no differentiation when domains are allocated, so any
domain
* that is willing to attach to the device is interchangeable with any
* other.
*/
- mutex_lock(&ioas->mutex);
- list_for_each_entry(hwpt, &ioas->hwpt_list, hwpt_item) {
if (!hwpt->auto_domain ||
!refcount_inc_not_zero(&hwpt->obj.users))
users cannot be zero at this point.
a new hwpt is added to the list with users==1 in attach.
detach first removes the hwpt and then decrement users.
Both are conducted by holding ioas->mutex.
continue;
rc = iommufd_device_do_attach(idev, hwpt, flags);
refcount_dec(&hwpt->obj.users);
with above I also wonder whether refcount_inc/dec are just redundant here...
+int iommufd_device_attach(struct iommufd_device *idev, u32 *pt_id,
unsigned int flags)
+{
- struct iommufd_object *pt_obj;
- int rc;
- pt_obj = iommufd_get_object(idev->ictx, *pt_id,
IOMMUFD_OBJ_ANY);
- if (IS_ERR(pt_obj))
return PTR_ERR(pt_obj);
Is there a reason why get_object() is not required for idev?
concurrent attach/unbind could lead to use-after-free here given users for idev is added only in the end of the function.
- switch (pt_obj->type) {
- case IOMMUFD_OBJ_HW_PAGETABLE: {
struct iommufd_hw_pagetable *hwpt =
container_of(pt_obj, struct iommufd_hw_pagetable,
obj);
rc = iommufd_device_do_attach(idev, hwpt, flags);
if (rc)
goto out_put_pt_obj;
mutex_lock(&hwpt->ioas->mutex);
list_add_tail(&hwpt->hwpt_item, &hwpt->ioas->hwpt_list);
mutex_unlock(&hwpt->ioas->mutex);
break;
- }
- case IOMMUFD_OBJ_IOAS: {
struct iommufd_ioas *ioas =
container_of(pt_obj, struct iommufd_ioas, obj);
rc = iommufd_device_auto_get_domain(idev, ioas, flags);
if (rc)
goto out_put_pt_obj;
break;
- }
- default:
rc = -EINVAL;
goto out_put_pt_obj;
- }
- refcount_inc(&idev->obj.users);
- *pt_id = idev->hwpt->obj.id;
What is the value of returning hwpt id of auto domain to user?
IMHO this will add more complexity to the life cycle of auto domain.
w/o allowing it the life cycle is simple i.e. the first attach which doesn't find a matching domain creates a new auto domain then the last detach frees it.
now if allowing user to populate auto domain similar to user-created hwpt then detach also need get_object() to wait for concurrent ioctls similar to iommufd_destroy() does and more trick on destroying the object.
If no big gain probably hiding it from userspace is simpler?
+void iommufd_device_detach(struct iommufd_device *idev) +{
- struct iommufd_hw_pagetable *hwpt = idev->hwpt;
- mutex_lock(&hwpt->ioas->mutex);
- mutex_lock(&hwpt->devices_lock);
- list_del(&idev->devices_item);
- if (!iommufd_hw_pagetable_has_group(hwpt, idev->group)) {
if (list_empty(&hwpt->devices)) {
iopt_table_remove_domain(&hwpt->ioas->iopt,
hwpt->domain);
list_del(&hwpt->hwpt_item);
}
iopt_remove_reserved_iova(&hwpt->ioas->iopt, idev->dev);
this is always required. not just for last device in a group.