From: Jason Gunthorpe jgg@nvidia.com Sent: Thursday, January 4, 2024 10:37 PM
On Thu, Dec 14, 2023 at 07:26:39PM +0800, Yi Liu wrote:
Per the prior discussion[1], we agreed to move the error reporting into the driver specific part. On Intel side, we want to report two devTLB invalidation errors: ICE (invalid completion error) and ITE (invalidation timeout error). Such errors have an additional SID information to tell which device failed the devTLB invalidation. I've got the below structure.
IMHO all of this complexity is a consequence of the decision to hide the devtlb invalidation from the VM..
On the other hand I guess you want to do this because of the SIOV troubles where the vPCI function in the VM is entirely virtual and can't be trivially mapped to a real PCI function for ATC invalidation like ARM and AMD can do (but they also can't support SIOV because of this). :(
However it also makes it very confusing about how the VM would perceive an error - eg if it invalidates an SIOV device single PASID and that devtlb fails then the error should be connected back to the vPCI function for the SIOV's specific PASID and not back to the physical PCI function for the SIOV owner.
As the iommu driver itself has no idea about the vPCI functions this seems like it is going to get really confusing. The API I suggested in the other email is not entirely going to work as the vPCI function for SIOV cases will have to be identified by the (struct device, PASID) - while it would be easy enough for the iommu driver to provide the PASID, I'm not sure how the iommufd core will relate the PASID back to the iommu_device to understand SIOV without actually being aware of SIOV to some degree :\
we plan to add such awareness with a new binding helper:
https://lore.kernel.org/lkml/20231009085123.463179-4-yi.l.liu@intel.com/
and with metadata to track association between PASID's and iommufd vdev.
but in reality the relation could be identified in an easy way due to a SIOV restriction which we discussed before - shared PASID space of PF disallows assigning sibling vdev's to a same VM (otherwise no way to identify which sibling vdev triggering an iopf when a pasid is used on both vdev's). That restriction implies that within an iommufd context every iommufd_device object should contain a unique struct device pointer. So PASID can be instead ignored in the lookup then just always do iommufd_get_dev_id() using struct device.