On Thu, Oct 23, 2025 at 2:21 PM Jason Gunthorpe jgg@nvidia.com wrote:
The existing IOMMU page table implementations duplicate all of the working algorithms for each format. By using the generic page table API a single C version of the IOMMU algorithms can be created and re-used for all of the different formats used in the drivers. The implementation will provide a single C version of the iommu domain operations: iova_to_phys, map, unmap, and read_and_clear_dirty.
Further, adding new algorithms and techniques becomes easy to do across the entire fleet of drivers and formats.
It is an enabler for cross-arch page_table_check for IOMMU. There is also a long-standing issue where PT pages are not freed on unmap, leading to substantial overhead on some configurations, especially where IOVA is cycled through for security purposes (as it was done in our environment). Having a single, solid fix for this issue that affects all arches is very much desirable.
The C functions are drop-in compatible with the existing iommu_domain_ops using the IOMMU_PT_DOMAIN_OPS() macro. Each per-format implementation compilation unit will produce exported symbols following the pattern pt_iommu_FMT_map_pages(), which the macro maps directly to the iommu_domain_ops members. This avoids the additional function pointer indirection that io-pgtable has.
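To illustrate the idea of a macro that wires per-format symbols straight into an ops table, here is a minimal userspace sketch. Everything prefixed with demo_ or DEMO_ is hypothetical, and the ops struct and function signatures are simplified stand-ins, not the kernel's iommu_domain_ops or the real IOMMU_PT_DOMAIN_OPS() expansion.

```c
#include <stddef.h>

/* Mock ops table: two members with simplified, made-up signatures. */
struct demo_domain_ops {
	int (*map_pages)(unsigned long iova, unsigned long paddr, size_t size);
	size_t (*unmap_pages)(unsigned long iova, size_t size);
};

/* Stand-ins for the per-format symbols named pt_iommu_FMT_...(). */
int demo_pt_amdv1_map_pages(unsigned long iova, unsigned long paddr,
			    size_t size)
{
	(void)iova;
	(void)paddr;
	(void)size;
	return 0; /* pretend the mapping succeeded */
}

size_t demo_pt_amdv1_unmap_pages(unsigned long iova, size_t size)
{
	(void)iova;
	return size; /* pretend everything requested was unmapped */
}

/*
 * Hypothetical analogue of IOMMU_PT_DOMAIN_OPS(): token pasting turns the
 * format name into the per-format symbol names, so the ops members point
 * at the format functions directly, with no second table in between.
 */
#define DEMO_PT_DOMAIN_OPS(fmt)                        \
	.map_pages = &demo_pt_##fmt##_map_pages,       \
	.unmap_pages = &demo_pt_##fmt##_unmap_pages

static const struct demo_domain_ops demo_amdv1_ops = {
	DEMO_PT_DOMAIN_OPS(amdv1),
};
```

A caller going through demo_amdv1_ops.map_pages() reaches the format function with a single indirect call, which is the property the paragraph above describes.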
The top level struct used by the drivers is pt_iommu_table_FMT. It contains the other structs to allow container_of() to move between the driver, iommu page table, generic page table, and generic format layers.
  struct pt_iommu_table_amdv1 {
       struct pt_iommu {
              struct iommu_domain domain;
       } iommu;
       struct pt_amdv1 {
              struct pt_common common;
       } amdpt;
  };
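As a rough illustration of how container_of() can move between the layers of such a nested struct, the following userspace sketch uses mock definitions. The field contents (type, top_level), the helper names, and the simplified container_of() are assumptions made for the example, not the kernel's definitions.

```c
#include <stddef.h>

/* Userspace stand-in for the kernel's container_of(); no type checking. */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Mock layers: the fields inside are placeholders. */
struct iommu_domain { int type; };
struct pt_common { int top_level; };
struct pt_iommu { struct iommu_domain domain; };
struct pt_amdv1 { struct pt_common common; };

struct pt_iommu_table_amdv1 {
	struct pt_iommu iommu;
	struct pt_amdv1 amdpt;
};

/* iommu_domain -> enclosing table, as a domain op would need. */
static struct pt_iommu_table_amdv1 *
domain_to_table(struct iommu_domain *domain)
{
	return container_of(domain, struct pt_iommu_table_amdv1,
			    iommu.domain);
}

/* iommu_domain -> generic page table layer of the same table. */
static struct pt_common *domain_to_common(struct iommu_domain *domain)
{
	return &domain_to_table(domain)->amdpt.common;
}

/* Generic page table layer -> iommu_domain of the same table. */
static struct iommu_domain *common_to_domain(struct pt_common *common)
{
	struct pt_iommu_table_amdv1 *table =
		container_of(common, struct pt_iommu_table_amdv1,
			     amdpt.common);

	return &table->iommu.domain;
}
```

Because every layer is embedded by value in one allocation, each conversion is just constant pointer arithmetic, with no back-pointers stored in any layer.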
The driver is expected to union the pt_iommu_table_FMT with its own existing domain struct:
  struct driver_domain {
       union {
               struct iommu_domain domain;
               struct pt_iommu_table_amdv1 amdv1;
       };
  };
  PT_IOMMU_CHECK_DOMAIN(struct driver_domain, amdv1, domain);
This creates an alias so that 'domain' does not have to be renamed in a lot of driver code.
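The aliasing only works if the driver's 'domain' member sits at the same offset as the iommu_domain embedded in the page table struct, which is presumably what the check macro enforces. A minimal sketch of such a compile-time check, using a hypothetical DEMO_CHECK_DOMAIN() and mock struct contents rather than the real PT_IOMMU_CHECK_DOMAIN():

```c
#include <assert.h>
#include <stddef.h>

/* Mock layers: the fields inside are placeholders. */
struct iommu_domain { int type; };
struct pt_common { int top_level; };

struct pt_iommu_table_amdv1 {
	struct pt_iommu { struct iommu_domain domain; } iommu;
	struct pt_amdv1 { struct pt_common common; } amdpt;
};

struct driver_domain {
	union {
		struct iommu_domain domain;
		struct pt_iommu_table_amdv1 amdv1;
	};
	int driver_private; /* placeholder for driver-specific state */
};

/*
 * Hypothetical check: fail the build unless the alias overlays the
 * iommu_domain embedded in the page table struct.
 */
#define DEMO_CHECK_DOMAIN(s, pt_member, domain_member)               \
	static_assert(offsetof(s, pt_member.iommu.domain) ==         \
			      offsetof(s, domain_member),            \
		      "domain alias must overlay the embedded iommu_domain")

DEMO_CHECK_DOMAIN(struct driver_domain, amdv1, domain);

/* Returns 1 when a write through the alias is visible via the table. */
static int alias_shares_storage(void)
{
	struct driver_domain d = { .driver_private = 0 };

	d.domain.type = 7;
	return d.amdv1.iommu.domain.type == 7;
}
```

With the check in place, existing driver code that touches 'domain' keeps working unchanged, while the page table code reaches the same storage through amdv1.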
This allows all the layers to access all the necessary functions to implement their different roles with no change to any of the existing iommu core code.
Implement the basic starting point: pt_iommu_init(), get_info() and deinit().
Tested-by: Alejandro Jimenez alejandro.j.jimenez@oracle.com
Reviewed-by: Kevin Tian kevin.tian@intel.com
Signed-off-by: Jason Gunthorpe jgg@nvidia.com
Reviewed-by: Pasha Tatashin pasha.tatashin@soleen.com