From: Jason Gunthorpe jgg@nvidia.com Sent: Wednesday, April 12, 2023 7:18 PM
On Wed, Apr 12, 2023 at 08:27:36AM +0000, Tian, Kevin wrote:
From: Jason Gunthorpe jgg@nvidia.com Sent: Tuesday, April 11, 2023 10:31 PM
On Thu, Mar 23, 2023 at 07:21:42AM +0000, Tian, Kevin wrote:
If there is no oversight, then we can put the lock directly in iommufd_hw_pagetable_attach/detach(), which also simplifies its callers in device.c a bit.
So, I did this, and syzkaller explains why this can't be done:
https://lore.kernel.org/r/0000000000006e66d605f83e09bc@google.com
We can't allow the hwpt to be discovered by a parallel iommufd_hw_pagetable_attach() until it is done being set up; otherwise, if we fail to set it up, we can't destroy the hwpt.
	if (immediate_attach) {
		rc = iommufd_hw_pagetable_attach(hwpt, idev);
		if (rc)
			goto out_abort;
	}

	rc = iopt_table_add_domain(&hwpt->ioas->iopt, hwpt->domain);
	if (rc)
		goto out_detach;
	list_add_tail(&hwpt->hwpt_item, &hwpt->ioas->hwpt_list);
	return hwpt;

out_detach:
	if (immediate_attach)
		iommufd_hw_pagetable_detach(idev);
out_abort:
	iommufd_object_abort_and_destroy(ictx, &hwpt->obj);
As some other idev could be pointing at it too now.
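The ordering concern above can be modelled in user space. The following is only a minimal sketch, not the iommufd code: every `toy_*` name and the `fail_step` failure-injection parameter are hypothetical. It shows an object that is linked into the discoverable list only after every fallible setup step has succeeded, so the error path can still free it safely.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins; these are not the iommufd structures. */
struct toy_hwpt {
	bool attached;          /* a device currently points at this object */
	bool domain_added;      /* second setup step has completed */
	struct toy_hwpt *next;  /* link in the published list */
};

struct toy_ioas {
	struct toy_hwpt *published; /* objects other paths may discover */
};

/*
 * fail_step injects a failure (assumption, for illustration only):
 * 0 = no failure, 1 = the attach step fails, 2 = the add-domain step fails.
 */
static struct toy_hwpt *toy_hwpt_alloc(struct toy_ioas *ioas,
				       bool immediate_attach, int fail_step)
{
	struct toy_hwpt *hwpt;

	assert(ioas != NULL);
	hwpt = calloc(1, sizeof(*hwpt));
	if (!hwpt)
		return NULL;

	if (immediate_attach) {
		if (fail_step == 1)
			goto out_abort;
		hwpt->attached = true;
	}

	if (fail_step == 2)
		goto out_detach;
	hwpt->domain_added = true;

	/* Publish last: nothing could find the object before this point. */
	hwpt->next = ioas->published;
	ioas->published = hwpt;
	return hwpt;

out_detach:
	/* Undo the attach before destroying, in reverse order of setup. */
	if (immediate_attach)
		hwpt->attached = false;
out_abort:
	free(hwpt);
	return NULL;
}
```

In this model a failed allocation leaves `ioas->published` untouched. The sketch is single-threaded, so it does not reproduce the race itself; it only illustrates the publish-last ordering under discussion.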
How could this happen before this object is finalized? IIRC you pointed this fact out to me in a previous discussion.
It is only unavailable through the xarray, but we've already added it to at least one internal list on the group. It is kind of sketchy to work like this; it should all be atomic.
Which internal list? The group has a list for attached devices, but the hwpt is stored in a single field, igroup->hwpt.
With that being set in iommufd_hw_pagetable_attach(), another device cannot race by attaching to a different ioas/hwpt (mismatching igroup->hwpt) or to the same hwpt while it is being created (not finalized and holding ioas->mutex).
So although it's added to the group, nothing will reference its state before it's finalized.
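The single-field argument can be sketched as a user-space model. This is an illustration under stated assumptions, not the kernel implementation: the `toy_*` names are hypothetical, a pthread mutex stands in for igroup->lock, and returning -EINVAL on a mismatch is an assumption of this sketch.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-ins; these are not the iommufd structures. */
struct toy_hwpt {
	int id;
};

struct toy_group {
	pthread_mutex_t lock;    /* models igroup->lock */
	struct toy_hwpt *hwpt;   /* single field, models igroup->hwpt */
	unsigned int attached;   /* number of devices attached via this group */
};

/* Attach one device of the group; a different hwpt is rejected. */
static int toy_group_attach(struct toy_group *g, struct toy_hwpt *hwpt)
{
	int rc = 0;

	pthread_mutex_lock(&g->lock);
	if (g->hwpt && g->hwpt != hwpt) {
		rc = -EINVAL; /* assumption: mismatch is reported this way */
		goto out_unlock;
	}
	g->hwpt = hwpt;
	g->attached++;
out_unlock:
	pthread_mutex_unlock(&g->lock);
	return rc;
}

/* Detach one device; the last detach clears the group's hwpt. */
static void toy_group_detach(struct toy_group *g)
{
	pthread_mutex_lock(&g->lock);
	assert(g->attached > 0);
	if (--g->attached == 0)
		g->hwpt = NULL;
	pthread_mutex_unlock(&g->lock);
}
```

Because the check and the assignment happen under the same lock, a second device in the group either sees the same hwpt or is refused. Whether that alone is sufficient for the failure path discussed earlier in the thread is exactly the open question here.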
BTW, removing this lock from this file also makes it easier to support SIOV devices, which don't have a group. We can have internal group-attach and pasid-attach wrappers within device.c and leave igroup->lock held in the group attach path.
Otherwise we'll have to create a locking wrapper used in this file to touch igroup->lock, specifically for an iommufd_device which has an igroup object.
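A rough user-space sketch of the wrapper idea follows. All names are hypothetical, and the pasid path is reduced to a single pointer purely for illustration. The caller goes through one entry point, and only the group path ever takes the group lock.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-ins; these are not the iommufd structures. */
struct toy_hwpt {
	int id;
};

struct toy_group {
	pthread_mutex_t lock;   /* models igroup->lock */
	struct toy_hwpt *hwpt;  /* models igroup->hwpt */
};

struct toy_device {
	struct toy_group *group;     /* NULL models a device without a group */
	struct toy_hwpt *pasid_hwpt; /* hypothetical per-pasid attachment */
};

/* Group path: the only place that touches the group lock. */
static int toy_attach_group(struct toy_device *dev, struct toy_hwpt *hwpt)
{
	int rc = 0;

	pthread_mutex_lock(&dev->group->lock);
	if (dev->group->hwpt && dev->group->hwpt != hwpt)
		rc = -1; /* mismatch with what the group already uses */
	else
		dev->group->hwpt = hwpt;
	pthread_mutex_unlock(&dev->group->lock);
	return rc;
}

/* Pasid path: no group object, so no group lock is involved. */
static int toy_attach_pasid(struct toy_device *dev, struct toy_hwpt *hwpt)
{
	dev->pasid_hwpt = hwpt;
	return 0;
}

/* Single entry point that page-table code would call. */
static int toy_device_attach(struct toy_device *dev, struct toy_hwpt *hwpt)
{
	assert(dev != NULL && hwpt != NULL);
	if (dev->group)
		return toy_attach_group(dev, hwpt);
	return toy_attach_pasid(dev, hwpt);
}
```

With this split, code outside the device layer never needs to know whether a device has a group, which is the simplification argued for above.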