On Fri, Mar 24, 2023 at 01:37:51AM +0000, Tian, Kevin wrote:
If vfio races attach/detach then lots of things are messed.
e.g. iommufd_device_detach() directly calls list_del(&idev->group_item) w/o checking whether the device has been attached.
Yeah, you obviously can't race attach/detach or detach/replace
And with that race UAF could occur if we narrow down the lock scope to iommufd_hw_pagetable_attach():
cpu0 cpu1
vfio_iommufd_attach() iommufd_device_attach() iommufd_device_auto_get_domain() mutex_lock(&ioas->mutex); iommufd_hw_pagetable_alloc() hwpt = iommufd_object_alloc() //hwpt.users=1 hwpt->domain = iommu_domain_alloc(idev->dev->bus); iommufd_hw_pagetable_attach() //hwpt.users=2 vfio_iommufd_detach() iommufd_device_detach() mutex_lock(&idev->igroup->lock); hwpt = iommufd_hw_pagetable_detach() mutex_unlock(&idev->igroup->lock); iommufd_hw_pagetable_put(hwpt) iommufd_object_destroy_user(hwpt) //hwpt.users=0 iommufd_hw_pagetable_destroy(hwpt) iommu_domain_free(hwpt->domain); iopt_table_add_domain(&hwpt->ioas->iopt, hwpt->domain); //UAF
You didn't balance the refcounts properly, the cpu1 put will get to hwpt.users=1
There is a refcount_inc in iommufd_hw_pagetable_attach(), the iommufd_hw_pagetable_alloc() retains its reference and so the domain is guarenteed valid
Jason