On Wed, Jan 29, 2025 at 10:58:00AM -0400, Jason Gunthorpe wrote:
On Wed, Jan 29, 2025 at 02:44:12PM +0100, Eric Auger wrote:
On 1/11/25 4:32 AM, Nicolin Chen wrote:
For systems that require MSI pages to be mapped into the IOMMU translation the IOMMU driver provides an IOMMU_RESV_SW_MSI range, which is the default recommended IOVA window to place these mappings. However, there is nothing special about this address. And to support the RMR trick in VMM for nested
well at least it shall not overlap VMM's RAM. So it was not random either.
translation, the VMM needs to know what sw_msi window the kernel is using. As there is no particular reason to force VMM to adopt the kernel default, provide a simple IOMMU_OPTION_SW_MSI_START/SIZE ioctl that the VMM can use to directly specify the sw_msi window that it wants to use, which replaces and disables the default IOMMU_RESV_SW_MSI from the driver to avoid having to build an API to discover the default IOMMU_RESV_SW_MSI.
IIUC the MSI window will then be different when using legacy VFIO assignment and iommufd backend.
? They use the same, iommufd can have userspace override it. Then it will ignore the reserved region.
MSI reserved regions are exposed in /sys/kernel/iommu_groups/<n>/reserved_regions:
0x0000000008000000 0x00000000080fffff msi
Is that configurability reflected accordingly?
?
Nothing using iommufd should parse that sysfs file.
How do you make sure it does not collide with other resv regions? I don't see any check here.
Yes this does need to be checked, it does look missing. It still needs to create a reserved region in the ioas when attaching to keep the areas safe and it has to intersect with the incoming reserved regions from the driver.
Yea, I found iopt_reserve_iova() is actually missed entirely...
While fixing this, I see a way to turn the OPTIONs back to per-idev, if you still prefer them to be per-idev(?). Then, we can check a given input in the set_option() against the device's reserved region list from the driver, prior to the device attaching to any HWPT.
Otherwise, we just rely on iopt_enforce_device_reserve_region() during an attach, keeping the option global to simplify VMMs.
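For illustration only, the kind of check discussed above (validating a requested sw_msi window against a device's reserved regions) could be sketched roughly as below. This is not code from the series: struct resv_range, ranges_overlap() and sw_msi_window_is_valid() are hypothetical names, and the sketch assumes the caller has already dropped the driver's default IOMMU_RESV_SW_MSI entry from the list, since the option is meant to replace it.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for one entry of a device's reserved region list */
struct resv_range {
	uint64_t start;
	uint64_t last;	/* inclusive end address */
};

/* Two inclusive ranges overlap iff each one starts at or before the other's end */
static bool ranges_overlap(uint64_t a_start, uint64_t a_last,
			   uint64_t b_start, uint64_t b_last)
{
	return a_start <= b_last && b_start <= a_last;
}

/*
 * Return true if [start, start + size - 1] is a usable sw_msi window, i.e.
 * it is non-empty, does not wrap around, and hits none of the n reserved
 * ranges. Assumption: resv[] excludes the driver's default SW_MSI region.
 */
static bool sw_msi_window_is_valid(uint64_t start, uint64_t size,
				   const struct resv_range *resv, size_t n)
{
	uint64_t last;
	size_t i;

	/* Reject an empty window or one whose end wraps past UINT64_MAX */
	if (size == 0 || start + size - 1 < start)
		return false;
	last = start + size - 1;

	for (i = 0; i < n; i++) {
		if (ranges_overlap(start, last, resv[i].start, resv[i].last))
			return false;
	}
	return true;
}
```

With a per-idev option, something like this could run in set_option() and fail early; with a global option, the equivalent test would only happen at attach time.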
Thanks
Nicolin