On 12/11/2023 8:05 PM, Jason Gunthorpe wrote:
On Mon, Dec 11, 2023 at 08:36:46PM +0800, Yi Liu wrote:
On 2023/12/11 10:29, Tian, Kevin wrote:
From: Jason Gunthorpe <jgg@nvidia.com>
Sent: Saturday, December 9, 2023 9:47 AM
What is in a Nested domain:
 Intel: A single IO page table referred to by a PASID entry
        Each vDomain-ID,PASID allocates a unique nesting domain
 AMD: A GCR3 table pointer
      Nesting domains are created for every unique GCR3 pointer.
      vDomain-ID can possibly refer to multiple Nesting domains :(
 ARM: A CD table pointer
      Nesting domains are created for every unique CD table top pointer.
this AMD/ARM difference is not very clear to me.
How could a vDomain-ID refer to multiple GCR3 pointers? Wouldn't it lead to a cache tag conflict when the same PASID entry in multiple GCR3 tables points to different I/O page tables?
Perhaps due to only one DomainID in the DTE table indexed by BDF? Actually, the vDomainID will not be used to tag cache, the host DomainId would be used instead. @Jason?
The DomainID comes from the DTE table which is indexed by the RID, and the DTE entry points to the GCR3 table. So the VM certainly can set up a DTE table with multiple entries having the same vDomainID but pointing to different GCR3's. So the VMM has to do *something* with this.
Most likely this is not a useful thing to do. However, what should the VMM do when it sees this? Blocking a random DTE or pushing the duplication down to real HW would be my options. I'd probably try to do the latter just on the basis of better emulation.
Jason
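As a rough illustration of the case described above, here is a toy model of how a VMM might notice a guest DTE table in which one vDomainID is paired with more than one GCR3 pointer. The table layout, the values, and the name find_shared_vdomains are all made up for this sketch; nothing here is taken from a real driver or VMM.

```python
from collections import defaultdict

# Hypothetical guest DTE table: vRID -> (vDomainID, guest GCR3 table pointer).
# All values are invented for illustration.
guest_dte = {
    0x10: ("A", 0x1000),
    0x18: ("A", 0x2000),  # same vDomainID as vRID 0x10, different GCR3
    0x20: ("B", 0x3000),
}

def find_shared_vdomains(dte):
    """Return {vDomainID: sorted GCR3 pointers} for IDs used with more than one GCR3."""
    gcr3_by_domain = defaultdict(set)
    for vdomain, gcr3 in dte.values():
        gcr3_by_domain[vdomain].add(gcr3)
    return {d: sorted(g) for d, g in gcr3_by_domain.items() if len(g) > 1}

conflicts = find_shared_vdomains(guest_dte)
print(conflicts)
```

What to do once such a duplicate is found (block one DTE, or mirror the duplication into the real DTE table) is the policy question raised above; the sketch only covers detection.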
For AMD, the hardware uses host DomainID (hDomainId) and PASID to tag the IOMMU TLB.
The VM can set up vDomainID independently from the device (RID) and hDomainID. The vDomainId->hDomainId mapping would be managed by the host IOMMU driver (since this is also needed by the HW when enabling the HW-vIOMMU support, a.k.a. virtual function).
Currently, the AMD IOMMU driver allocates a DomainId per IOMMU group. One issue with this arises with nested translation, where we could end up with multiple devices (RIDs) sharing the same PASID and the same hDomainID.
For example:
- Host view
  Device1 (RID 1) w/ hDomainId 1
  Device2 (RID 2) w/ hDomainId 1

- Guest view
  Pass-through Device1 (vRID 3) w/ vDomainID A + PASID 0
  Pass-through Device2 (vRID 4) w/ vDomainID B + PASID 0
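To make the conflict in this example concrete, here is a toy model of a TLB tagged by (hDomainId, PASID). Including the IOVA in the tag, the dictionary-based TLB, and the two page-walk functions are assumptions made purely for illustration; this is not how the hardware or the driver is implemented.

```python
# Toy IOTLB: the tag is (hDomainId, PASID, IOVA), the value is the translation.
def lookup_or_fill(tlb, hdomain, pasid, iova, walk):
    """Return the cached translation, walking the page table only on a miss."""
    tag = (hdomain, pasid, iova)
    if tag not in tlb:
        tlb[tag] = walk(iova)
    return tlb[tag]

# Hypothetical guest page tables behind PASID 0 of each device.
def dev1_walk(iova):
    return 0xA000 + iova

def dev2_walk(iova):
    return 0xB000 + iova

tlb = {}
first = lookup_or_fill(tlb, 1, 0, 0x10, dev1_walk)   # Device1, hDomainId 1
second = lookup_or_fill(tlb, 1, 0, 0x10, dev2_walk)  # Device2, also hDomainId 1

# Device2 hits the entry filled by Device1 because both tags are identical.
print(hex(first), hex(second), hex(dev2_walk(0x10)))
```

In this model the second lookup returns Device1's translation, since nothing in the tag distinguishes the two devices.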
We should be able to work around this by changing the way we assign hDomainId to be per-device for VFIO pass-through devices, even though they share the same v1 (stage-2) page table. This would look like:
- Host view
  Device1 (RID 1) w/ hDomainId 1
  Device2 (RID 2) w/ hDomainId 2

- Guest view
  Pass-through Device1 (vRID 3) w/ vDomainID A + PASID 0
  Pass-through Device2 (vRID 4) w/ vDomainID B + PASID 0
This should avoid the IOMMU TLB conflict. However, the invalidation would need to be done for both DomainId 1 and 2 when updating the v1 (stage-2) page table.
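The invalidation cost mentioned above can be sketched with a toy model, assuming a TLB tagged by (hDomainId, PASID, IOVA) and a per-page-table list of the hDomainIds that use it. The names stage2_users, invalidate_domain, and invalidate_stage2 are hypothetical and do not correspond to actual driver functions.

```python
# Toy IOTLB contents after switching to a per-device hDomainId.
tlb = {
    (1, 0, 0x10): 0xA010,  # Device1, hDomainId 1, PASID 0
    (2, 0, 0x10): 0xB010,  # Device2, hDomainId 2, PASID 0
    (3, 0, 0x10): 0xC010,  # unrelated domain with its own stage-2 table
}

# hDomainIds that share the same v1 (stage-2) page table.
stage2_users = [1, 2]

def invalidate_domain(tlb, hdomain):
    """Drop every cached entry tagged with the given hDomainId."""
    for tag in [t for t in tlb if t[0] == hdomain]:
        del tlb[tag]

def invalidate_stage2(tlb, hdomains):
    """A stage-2 update must flush every hDomainId that uses that table."""
    for hdomain in hdomains:
        invalidate_domain(tlb, hdomain)

invalidate_stage2(tlb, stage2_users)
print(tlb)
```

The two devices no longer share a tag, so there is no conflict, but one stage-2 change now fans out into one flush per hDomainId.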
Thanks,
Suravee