From: Jason Gunthorpe <jgg@nvidia.com>
Sent: Monday, July 31, 2023 9:24 PM
On Mon, Jul 31, 2023 at 06:21:44AM +0000, Tian, Kevin wrote:
As it is, userspace will have to aggregate the list, sort it, merge adjacent or overlapping reserved ranges, then invert the list to get an allowed list. This is not entirely simple.
Did you already write an algorithm to do this in qemu someplace?
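For reference, the aggregate / sort / merge / invert steps described above can be sketched roughly as follows. This is only an illustration, not code from qemu or the kernel: the function names (merge_reserved, allowed_ranges), the inclusive (start, last) tuple layout, and the iova_last parameter are all assumptions made for the example, and it assumes every input range has start <= last.

```python
from typing import Iterable, List, Tuple

# Hypothetical representation: an inclusive (start, last) pair.
Range = Tuple[int, int]


def merge_reserved(reserved: Iterable[Range]) -> List[Range]:
    """Sort reserved ranges and merge those that overlap or are adjacent."""
    merged: List[Range] = []
    for start, last in sorted(reserved):
        if merged and start <= merged[-1][1] + 1:
            # Overlaps or directly follows the previous range: extend it.
            prev_start, prev_last = merged[-1]
            merged[-1] = (prev_start, max(prev_last, last))
        else:
            merged.append((start, last))
    return merged


def allowed_ranges(per_device_reserved: Iterable[Iterable[Range]],
                   iova_last: int = 2**64 - 1) -> List[Range]:
    """Invert the merged reserved list into allowed ranges in [0, iova_last]."""
    # Aggregate the per-device lists into a single list.
    combined = [r for dev in per_device_reserved for r in dev]
    allowed: List[Range] = []
    cursor = 0  # lowest IOVA not yet known to be reserved or allowed
    for start, last in merge_reserved(combined):
        if start > iova_last:
            break
        if start > cursor:
            allowed.append((cursor, start - 1))
        cursor = last + 1
    if cursor <= iova_last:
        allowed.append((cursor, iova_last))
    return allowed


# Two hypothetical devices with overlapping and adjacent reserved ranges.
dev_a = [(0xfee00000, 0xfeefffff)]
dev_b = [(0xfee00000, 0xfeefffff), (0xfef00000, 0xfeffffff), (0x0, 0xfff)]
result = allowed_ranges([dev_a, dev_b], iova_last=0xffffffff)
```

With these made-up inputs, the two devices' ranges collapse into two reserved blocks, and result holds the two gaps around them: (0x1000, 0xfedfffff) and (0xff000000, 0xffffffff).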
Aggregating it for S2 is optional for Qemu, given that IOMMU_IOAS_IOVA_RANGES is still being used. If the only purpose of using this new cmd is to report per-device reserved ranges to the guest, then aggregation is not required.
I don't think it is entirely optional. If qemu doesn't track this, then it will have failures when attaching the S2 to the device. It needs to make sure it punches the right holes in the guest memory map to be compatible with the VFIO HW.
I suppose in reality the reserved regions are fairly predictable and probably always match the existing qemu memory map so you can ignore this and still work.
Plus most qemu cases don't deal with hotplug so you can build up the identity ioas with all the devices and then use IOMMU_IOAS_IOVA_RANGES as you say and still work.
Arguably IOMMU_IOAS_IOVA_RANGES becomes redundant with this new cmd. But it's already there and, as you said, it's actually more convenient to use if the user doesn't care about per-device reserved ranges...
Yes and yes
So, I guess we should leave it like this?
Yes. Given this discussion (including what you explained for sw_msi), let's abandon this new cmd and leave it as it is today.