On Fri, Jan 05, 2024 at 02:52:50AM +0000, Tian, Kevin wrote:
but in reality the relation could be identified easily due to a SIOV restriction which we discussed before: the shared PASID space of the PF disallows assigning sibling vdev's to the same VM (otherwise there is no way to identify which sibling vdev triggered an iopf when a pasid is used on both vdev's). That restriction implies that within an iommufd context every iommufd_device object should contain a unique struct device pointer. So the PASID can instead be ignored in the lookup, and we can just always do iommufd_get_dev_id() using the struct device.
A bit more background.
Previously we thought this restriction only applies to SIOV+vSVA, as a guest process may bind to both sibling vdev's, leading to the same pasid situation.
In concept, w/o vSVA it's still possible to assign sibling vdev's to the same VM, as each vdev is allocated a unique pasid to mark its vRID and so can be differentiated from the others in the fault/error path.
I thought the SIOV plan was that each "vdev", i.e. vPCI function, would get a slice of the pRID's PASID space statically selected at creation?
So SVA/etc doesn't matter, you reliably get a disjoint set of pRID & pPASID into each VM.
From that view you can't identify the iommufd dev_id without knowing both the pRID and pPASID which will disambiguate the different SIOV iommufd dev_id instances sharing a rid.
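
To make that concrete, here is a rough sketch of the lookup I have in mind, assuming each vdev owns a static, disjoint pPASID slice of its parent pRID. Everything here (struct vdev_slice, lookup_dev_id(), DEV_ID_NONE) is hypothetical and for illustration only; none of it is existing iommufd code:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical: one entry per SIOV vdev. Sibling vdev's share a pRID but
 * each owns a disjoint PASID slice chosen statically at creation. */
struct vdev_slice {
	uint16_t prid;        /* physical requester ID of the parent PF */
	uint32_t pasid_base;  /* first pPASID in this vdev's slice */
	uint32_t pasid_count; /* number of pPASIDs in the slice */
	uint32_t dev_id;      /* iommufd dev_id reported to userspace */
};

/* Hypothetical sentinel meaning "no vdev owns this (pRID, pPASID)". */
#define DEV_ID_NONE 0u

/* Resolve a fault/error source to a dev_id. The pRID alone is ambiguous
 * when siblings share it, so the pPASID picks the slice. */
uint32_t lookup_dev_id(const struct vdev_slice *tbl, size_t n,
		       uint16_t prid, uint32_t ppasid)
{
	assert(tbl != NULL || n == 0);

	for (size_t i = 0; i < n; i++) {
		const struct vdev_slice *s = &tbl[i];

		if (s->prid == prid &&
		    ppasid >= s->pasid_base &&
		    ppasid - s->pasid_base < s->pasid_count)
			return s->dev_id;
	}
	return DEV_ID_NONE;
}
```

With two siblings on the same pRID holding slices [0, 16) and [16, 32), the same pRID resolves to different dev_ids depending on the pPASID, which is why an error report lacking the PASID can't be attributed under this scheme.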
But when looking at this err code issue with Yi closely, we found there is another gap in the VT-d spec. Upon devtlb invalidation timeout the hw doesn't report the pasid in the error info register. This makes it impossible to identify the source vdev if a hwpt invalidation request involves sibling vdev's from the same PF.
Don't you know which command timed out?
With that I'm inclined to always impose this restriction for SIOV. One may argue that SIOV w/o vSVA w/o devtlb is conceptually immune, but I'm with you that, given SIOVr1 is a one-off, I prefer limiting its usability rather than complicating the kernel.
By this you mean give up on SIOV entirely and always assign the full pRID to an entire VM? I'm confused what restriction you mean if you can't rely on the PASID?
Jason