On 9/27/24 4:03 AM, Nicolin Chen wrote:
On Thu, Sep 26, 2024 at 04:47:02PM +0800, Yi Liu wrote:
On 2024/9/26 02:55, Nicolin Chen wrote:
On Wed, Sep 25, 2024 at 06:30:20PM +0800, Yi Liu wrote:
Hi Nic,
On 2024/8/28 00:59, Nicolin Chen wrote:
This series introduces a new VIOMMU infrastructure and related ioctls.
IOMMUFD has been using the HWPT infrastructure for all cases, including nested IO page table support. Yet, there are limitations for an HWPT-based structure to support some advanced HW-accelerated features, such as CMDQV on NVIDIA Grace, and HW-accelerated vIOMMU on AMD. Even for a multi-IOMMU environment, it is not straightforward for nested HWPTs to share the same parent HWPT (stage-2 IO pagetable) with the HWPT infrastructure alone.
Could you elaborate a bit on the last sentence in the above paragraph?
A stage-2 HWPT/domain on ARM holds a VMID. If we share the parent domain across IOMMU instances, we'd have to make sure that the VMID is available on all IOMMU instances. That brings a limitation and potential resource starvation, so it is not ideal.
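
To illustrate the VMID constraint described above, here is a minimal toy model, not kernel code. All names (`SmmuInstance`, `alloc_shared_vmid`, the pool size of 2) are hypothetical and chosen only for illustration. It assumes each IOMMU instance has its own small VMID pool and that a shared stage-2 domain needs one VMID that is free on every instance.

```python
class SmmuInstance:
    """Hypothetical model of one IOMMU instance with its own VMID pool."""

    def __init__(self, name, num_vmids):
        self.name = name
        self.free_vmids = set(range(num_vmids))

    def reserve(self, vmid):
        # Raises KeyError if the VMID is already taken on this instance.
        self.free_vmids.remove(vmid)

    def alloc_any(self):
        # Per-instance allocation: any free VMID will do.
        vmid = min(self.free_vmids)
        self.free_vmids.remove(vmid)
        return vmid


def alloc_shared_vmid(instances):
    """A shared S2 domain needs a VMID that is free on ALL instances."""
    common = set.intersection(*(inst.free_vmids for inst in instances))
    if not common:
        return None  # starved, even though every pool has free entries
    vmid = min(common)
    for inst in instances:
        inst.reserve(vmid)
    return vmid


smmu0 = SmmuInstance("smmu0", num_vmids=2)
smmu1 = SmmuInstance("smmu1", num_vmids=2)
smmu0.reserve(0)  # VMID 0 already used by another VM on smmu0
smmu1.reserve(1)  # VMID 1 already used by another VM on smmu1

# Shared parent domain: no VMID is free on both instances.
shared = alloc_shared_vmid([smmu0, smmu1])

# Per-instance allocation still succeeds on each instance.
per_instance = {inst.name: inst.alloc_any() for inst in (smmu0, smmu1)}
```

In this sketch each pool still has one free VMID, yet the shared allocation fails because the free sets do not intersect, while allocating independently per instance succeeds.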
got it.
Baolu told me that Intel may have the same issue: different domain IDs on different IOMMUs, with multiple IOMMU instances on one chip: https://lore.kernel.org/linux-iommu/cf4fe15c-8bcb-4132-a1fd-b2c8ddf2731b@lin... So I think we are in the same situation here.
Yes, it's called an IOMMU unit or DMAR. A typical Intel server can have multiple IOMMU units. But as Baolu mentioned in that thread, the Intel IOMMU driver maintains a separate domain ID space for each IOMMU unit, which means a given IOMMU domain has different DIDs when associated with different IOMMU units. So the Intel side is not suffering from this so far.
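
The per-unit domain ID spaces described above can be sketched with a small toy model. This is not the Intel IOMMU driver's code; `DmarUnit`, `ToyDomain`, and the pool size of 4 are hypothetical names and values used only to show how one domain can end up with a different DID on each unit.

```python
class DmarUnit:
    """Hypothetical model of one IOMMU unit with its own DID space."""

    def __init__(self, name, num_dids):
        self.name = name
        self.free_dids = set(range(num_dids))

    def alloc_did(self):
        did = min(self.free_dids)
        self.free_dids.remove(did)
        return did


class ToyDomain:
    """Hypothetical domain that records one DID per unit it attaches to."""

    def __init__(self):
        self.did_by_unit = {}

    def attach(self, unit):
        # Allocate from the unit's own space on first attach only.
        if unit.name not in self.did_by_unit:
            self.did_by_unit[unit.name] = unit.alloc_did()
        return self.did_by_unit[unit.name]


dmar0 = DmarUnit("dmar0", num_dids=4)
dmar1 = DmarUnit("dmar1", num_dids=4)

other = ToyDomain()
other.attach(dmar0)  # another domain already holds a DID on dmar0

dom = ToyDomain()
dom.attach(dmar0)
dom.attach(dmar1)
```

Because each unit hands out IDs from its own space, `dom` never needs the same ID to be free everywhere, which is why this model avoids the cross-instance constraint.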
An ARM SMMU has its own VMID pool as well. The suffering comes from associating VMIDs to one shared parent S2 domain.
Is a DID allocated per S1 nested domain or per parent S2? If it is per S2, I think the same suffering applies when we share the S2 across IOMMU instances?
It's per S1 nested domain in the current VT-d design. It's simple, but it does not allow sharing a DID within a VM. We will probably change this later.
Thanks, baolu