On 6/10/2025 5:28 AM, Jason Gunthorpe wrote:
The AMD IOMMU documentation seems pretty clear that the V2 table follows the normal CPU expectation of sign extension. This is shown in
Figure 25: AMD64 Long Mode 4-Kbyte Page Address Translation
Where bits Sign-Extend [63:57] == [56]. This is typical for x86 which would have three regions in the page table: lower, non-canonical, upper.
The manual describes that the V1 table does not sign extend in section 2.2.4 Sharing AMD64 Processor and IOMMU Page Tables GPA-to-SPA
Further, Vasant has checked this and indicates the HW has an addtional behavior that the manual does not yet describe. The AMDv2 table does not have the sign extended behavior when attached to PASID 0, which may explain why this has gone unnoticed.
The iommu domain geometry does not directly support sign extended page tables. The driver should report only one of the lower/upper spaces. Solve this by removing the top VA bit from the geometry to use only the lower space.
This will also make the iommu_domain work consistently on all PASID 0 and PASID != 1.
You meant PASID != 0 ?
Adjust dma_max_address() to remove the top VA bit. It now returns:
5 Level: Before 0x1ffffffffffffff After 0x0ffffffffffffff 4 Level: Before 0xffffffffffff After 0x7fffffffffff
Fixes: 11c439a19466 ("iommu/amd/pgtbl_v2: Fix domain max address") Link: https://lore.kernel.org/all/8858d4d6-d360-4ef0-935c-bfd13ea54f42@amd.com/ Signed-off-by: Jason Gunthorpe jgg@nvidia.com
Reviewed-by: Vasant Hegde vasant.hegde@amd.com
-Vasant
drivers/iommu/amd/iommu.c | 17 +++++++++++++++-- 1 file changed, 15 insertions(+), 2 deletions(-)
v2:
- Revise the commit message and comment with the new information from Vasant.
v1: https://patch.msgid.link/r/0-v1-6925ece6b623+296-amdv2_geo_jgg@nvidia.com
diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c index 3117d99cf83d0d..1baa9d3583f369 100644 --- a/drivers/iommu/amd/iommu.c +++ b/drivers/iommu/amd/iommu.c @@ -2526,8 +2526,21 @@ static inline u64 dma_max_address(enum protection_domain_mode pgtable) if (pgtable == PD_MODE_V1) return ~0ULL;
- /* V2 with 4/5 level page table */
- return ((1ULL << PM_LEVEL_SHIFT(amd_iommu_gpt_level)) - 1);
- /*
* V2 with 4/5 level page table. Note that "2.2.6.5 AMD64 4-Kbyte Page
* Translation" shows that the V2 table sign extends the top of the
* address space creating a reserved region in the middle of the
* translation, just like the CPU does. Further Vasant says the docs are
* incomplete and this only applies to non-zero PASIDs. If the AMDv2
* page table is assigned to the 0 PASID then there is no sign extension
* check.
*
* Since the IOMMU must have a fixed geometry, and the core code does
* not understand sign extended addressing, we have to chop off the high
* bit to get consistent behavior with attachments of the domain to any
* PASID.
*/
- return ((1ULL << (PM_LEVEL_SHIFT(amd_iommu_gpt_level) - 1)) - 1);
} static bool amd_iommu_hd_support(struct amd_iommu *iommu)
base-commit: eb328711b15b17987021dbb674f446b7b008dca5