From: Jason Gunthorpe jgg@nvidia.com Sent: Tuesday, December 12, 2023 11:32 PM
On Tue, Dec 12, 2023 at 02:20:01AM +0000, Tian, Kevin wrote:
From: Liu, Yi L yi.l.liu@intel.com Sent: Monday, December 11, 2023 4:08 PM
On 2023/12/7 16:47, Tian, Kevin wrote:
From: Liu, Yi L yi.l.liu@intel.com Sent: Monday, November 27, 2023 2:39 PM
+static int vfio_pci_core_feature_pasid(struct vfio_device *device, u32 flags,
+				       struct vfio_device_feature_pasid __user *arg,
+				       size_t argsz)
+{
+	struct vfio_pci_core_device *vdev =
+		container_of(device, struct vfio_pci_core_device, vdev);
+	struct vfio_device_feature_pasid pasid = { 0 };
+	struct pci_dev *pdev = vdev->pdev;
+	u32 capabilities = 0;
+	int ret;
+	/* We do not support SET of the PASID capability */
This line alone is meaningless. Please explain the reason, e.g. due to no PASID capability per VF...
Sure. I think the major reason is that we don't allow userspace to change the PASID configuration. Is that right?
If only the PF is considered, it's still possible to develop a model that allows userspace to change it.
More importantly, the primary reason for setting the PASID width is the physical properties of the IOMMU HW.
IOMMU HW that supports virtualization should do so in a way that the PASID width can be globally set to some value the hypervisor is aware the HW can decode in all cases.
The VM should have no way to make the HW ignore (vs check for zero) upper bits of the PASID that would require the physical PASID bits to be reduced.
So we should never allow programming of this; the VMM just fakes it and ignores sets.
PASID width is read-only so certainly sets should be ignored
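The "VMM just fakes it and ignores sets" behaviour could look roughly like the sketch below. It is a minimal user-space model, not code from this series: `struct vpasid_cap` and the `vpasid_*` helpers are hypothetical names. Only the register offsets and bit masks follow the PCIe PASID extended capability layout.

```c
#include <stdint.h>

/* Offsets within the PASID extended capability (PCIe spec layout). */
#define PASID_CAP_OFF         0x04   /* PASID Capability register, read-only */
#define PASID_CTRL_OFF        0x06   /* PASID Control register, read-write */
#define PASID_CAP_EXEC        0x0002 /* Execute Permission Supported */
#define PASID_CAP_PRIV        0x0004 /* Privileged Mode Supported */
#define PASID_CAP_WIDTH_MASK  0x1f00 /* Max PASID Width, bits 12:8 */
#define PASID_CAP_WIDTH_SHIFT 8
#define PASID_CTRL_ENABLE     0x0001 /* PASID Enable */

/* Hypothetical per-device state a VMM might keep for the faked capability. */
struct vpasid_cap {
	uint16_t cap;  /* value reported to the guest, fixed at creation */
	uint16_t ctrl; /* guest-visible control bits, purely virtual */
};

/* Build the virtual capability from a width and feature bits chosen by the VMM. */
void vpasid_init(struct vpasid_cap *v, unsigned int width, uint16_t features)
{
	v->cap = (uint16_t)((width << PASID_CAP_WIDTH_SHIFT) & PASID_CAP_WIDTH_MASK);
	v->cap |= features & (PASID_CAP_EXEC | PASID_CAP_PRIV);
	v->ctrl = 0;
}

/* Guest config-space read of one 16-bit register. */
uint16_t vpasid_read16(const struct vpasid_cap *v, unsigned int off)
{
	if (off == PASID_CAP_OFF)
		return v->cap;
	if (off == PASID_CTRL_OFF)
		return v->ctrl;
	return 0;
}

/* Guest config-space write: nothing here ever reaches the physical device. */
void vpasid_write16(struct vpasid_cap *v, unsigned int off, uint16_t val)
{
	if (off == PASID_CAP_OFF)
		return; /* width and feature bits are read-only: ignore the set */

	if (off == PASID_CTRL_OFF) {
		/*
		 * The EXEC/PRIV enable bits sit at the same positions as the
		 * matching "supported" bits, so the capability value can be
		 * used to mask which control bits the guest may latch.
		 */
		uint16_t allowed = PASID_CTRL_ENABLE |
				   (v->cap & (PASID_CAP_EXEC | PASID_CAP_PRIV));
		v->ctrl = val & allowed;
	}
}
```

A guest write to the capability register leaves the reported width unchanged, and a write to the control register only updates the VMM's own copy.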
Similar argument for enable, IOMMU HW supporting virtualization should always be able to decode PASID and reject PASID TLPs if the VM hasn't configured the vIOMMU to decode them. The purpose of the disable bit is to accommodate IOMMU HW that cannot decode the PASID TLP at all and would become confused.
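As a toy illustration of the decode rule argued for above, the sketch below models an IOMMU that always parses the PASID prefix, rejects PASID TLPs until the vIOMMU has been configured for them, and checks out-of-range upper bits rather than ignoring them. The enum, the function name and its parameters are all hypothetical; this does not describe any specific IOMMU implementation.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical outcomes for an incoming DMA request. */
enum tlp_verdict {
	TLP_TRANSLATE_RID,   /* no PASID prefix: translate by requester ID only */
	TLP_TRANSLATE_PASID, /* PASID prefix accepted and used for translation */
	TLP_REJECT,          /* request is faulted */
};

/*
 * has_pasid:            the TLP carries a PASID prefix
 * pasid:                the PASID value from the prefix
 * viommu_pasid_enabled: the guest has configured its vIOMMU to decode PASIDs
 * hw_width:             PASID width the hypervisor set globally (at most 20)
 */
enum tlp_verdict iommu_check_tlp(bool has_pasid, uint32_t pasid,
				 bool viommu_pasid_enabled,
				 unsigned int hw_width)
{
	if (!has_pasid)
		return TLP_TRANSLATE_RID;

	/* The prefix is always decoded; it is rejected until the VM opts in. */
	if (!viommu_pasid_enabled)
		return TLP_REJECT;

	/* Upper bits beyond the supported width must be zero, not ignored. */
	if ((pasid >> hw_width) != 0)
		return TLP_REJECT;

	return TLP_TRANSLATE_PASID;
}
```

In this model the device-side enable bit plays no role in the decision, which is why a VMM can keep it purely virtual.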
Yes, this explains why disallowing userspace changes doesn't cause a problem in this series. My earlier point was just that allowing userspace to change it could be implemented for the PF (though unnecessary given your explanation) to mimic the hardware behavior.