On 2023/6/27 2:33, Jason Gunthorpe wrote:
On Sun, Jun 25, 2023 at 02:30:46PM +0800, Baolu Lu wrote:
Agreed. We should avoid workqueue in sva iopf framework. Perhaps we could go ahead with the code below? It will be registered to the device with iommu_register_device_fault_handler() in the IOMMU_DEV_FEAT_IOPF enabling path, and unregistered in the disable path, of course.
This maze needs to be undone as well.
It makes no sense that all the drivers are calling
iommu_register_device_fault_handler(dev, iommu_queue_iopf, dev);
The driver should RX a PRI fault and deliver it to some core code function, this looks like a good start:
static int io_pgfault_handler(struct iommu_fault *fault, void *cookie)
{
	ioasid_t pasid = fault->prm.pasid;
	struct device *dev = cookie;
	struct iommu_domain *domain;

	if (fault->type != IOMMU_FAULT_PAGE_REQ)
		return -EOPNOTSUPP;

	if (fault->prm.flags & IOMMU_FAULT_PAGE_REQUEST_PASID_VALID)
		domain = iommu_get_domain_for_dev_pasid(dev, pasid, 0);
	else
		domain = iommu_get_domain_for_dev(dev);

	if (!domain || !domain->iopf_handler)
		return -ENODEV;

	if (domain->type == IOMMU_DOMAIN_SVA)
		return iommu_queue_iopf(fault, cookie);

	return domain->iopf_handler(fault, dev, domain->fault_data);
}
Then we find the domain that owns the translation and invoke its domain->ops->iopf_handler()
Agreed. iommu_register_device_fault_handler() could then be called only by device drivers that want to handle DMA faults and I/O page faults themselves in some special way.
By default, the faults should be dispatched to domain->iopf_handler in a generic core code.
If the driver created a SVA domain then the op should point to some generic 'handle sva fault' function. There shouldn't be weird SVA stuff in the core code.
The weird SVA stuff is really just a generic per-device workqueue dispatcher, so if we think that is valuable then it should be integrated into the iommu_domain (domain->ops->use_iopf_workqueue = true for instance). Then it could route the fault through the workqueue and still invoke domain->ops->iopf_handler.
The word "SVA" should not appear in any of this.
Yes. We should make it generic. The domain->use_iopf_workqueue flag denotes that the page faults of a fault group should be collected and then handled and responded to in a workqueue. Otherwise, the page fault is dispatched to domain->iopf_handler directly.
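To make the intended routing concrete, here is a rough user-space sketch of the two paths. Only use_iopf_workqueue, iopf_handler and fault_data come from this thread; the simplified struct layouts, the mock_* names, the last_page field and the fixed-size queue are assumptions made up for illustration, and the "work item" runs synchronously instead of on a real workqueue.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, stripped-down stand-ins for the kernel structures. */
struct iommu_fault {
	int pasid;
	bool last_page;		/* assumed marker: last request of a fault group */
};

struct iommu_domain {
	bool use_iopf_workqueue;
	int (*iopf_handler)(struct iommu_fault *fault, void *data);
	void *fault_data;
};

#define MOCK_GROUP_MAX 8

/* Mock of the per-device queue that collects one fault group. */
struct mock_iopf_queue {
	struct iommu_fault *pending[MOCK_GROUP_MAX];
	size_t nr_pending;
};

/* Example handler: counts handled faults through the fault_data pointer. */
static int mock_count_handler(struct iommu_fault *fault, void *data)
{
	(void)fault;
	(*(int *)data)++;
	return 0;
}

/* Stand-in for the work item: handle every fault of the group, then reset. */
static int mock_iopf_work(struct mock_iopf_queue *q, struct iommu_domain *domain)
{
	int ret = 0;

	for (size_t i = 0; i < q->nr_pending; i++) {
		int r = domain->iopf_handler(q->pending[i], domain->fault_data);

		if (r && !ret)
			ret = r;	/* remember the first error */
	}
	q->nr_pending = 0;
	return ret;
}

/* Route one fault: directly to the handler, or through the group queue. */
static int mock_dispatch_iopf(struct iommu_domain *domain,
			      struct mock_iopf_queue *q,
			      struct iommu_fault *fault)
{
	if (!domain || !domain->iopf_handler)
		return -ENODEV;

	if (!domain->use_iopf_workqueue)
		return domain->iopf_handler(fault, domain->fault_data);

	if (q->nr_pending == MOCK_GROUP_MAX)
		return -EBUSY;

	q->pending[q->nr_pending++] = fault;
	if (!fault->last_page)
		return 0;	/* keep collecting the rest of the group */

	return mock_iopf_work(q, domain);
}
```

In both cases the fault ends up in domain->iopf_handler; the flag only decides whether the faults of a group are batched first.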
Not sure what iommu_register_device_fault_handler() has to do with all of this.. Setting up the dev_iommu stuff to allow for the workqueue should happen dynamically during domain attach, ideally in the core code before calling to the driver.
There are two pointers under struct dev_iommu for fault handling.
/**
 * struct dev_iommu - Collection of per-device IOMMU data
 *
 * @fault_param: IOMMU detected device fault reporting data
 * @iopf_param:  I/O Page Fault queue and data
[...]
struct dev_iommu {
	struct mutex lock;
	struct iommu_fault_param	*fault_param;
	struct iopf_device_param	*iopf_param;
My understanding is that @fault_param is a placeholder for generic things, while @iopf_param is workqueue specific. Perhaps we could make @fault_param static and initialize it during iommu device_probe, as IOMMU fault handling is generic to every device managed by an IOMMU.
@iopf_param could be allocated on demand (perhaps renamed to something more meaningful?). The allocation happens before a domain with the use_iopf_workqueue flag set attaches to a device. iopf_param stays alive until device_release.
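The on-demand lifetime could look roughly like the user-space sketch below. The mock_* function names and the contents of struct iopf_device_param are made up for illustration, and the locking that the real code would need (dev_iommu->lock) is left out.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical, stripped-down stand-ins for the kernel structures. */
struct iopf_device_param {
	int nr_queued;			/* placeholder content */
};

struct iommu_domain {
	bool use_iopf_workqueue;
};

struct dev_iommu {
	struct iopf_device_param *iopf_param;
};

/*
 * Called before a domain attaches to the device: allocate iopf_param only
 * when the domain asks for the workqueue and nothing was allocated earlier.
 */
static int mock_attach_prepare_iopf(struct dev_iommu *param,
				    const struct iommu_domain *domain)
{
	if (!domain->use_iopf_workqueue || param->iopf_param)
		return 0;

	param->iopf_param = calloc(1, sizeof(*param->iopf_param));
	return param->iopf_param ? 0 : -ENOMEM;
}

/* Called at device release: the only place where iopf_param is freed. */
static void mock_device_release(struct dev_iommu *param)
{
	free(param->iopf_param);
	param->iopf_param = NULL;
}
```

Detaching the domain deliberately does not free anything here, so a later attach of another workqueue-based domain reuses the same iopf_param.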
Also, I can understand there is a need to turn on PRI support really early, and it can make sense to have some IOMMU_DEV_FEAT_IOPF/SVA to ask to turn it on.. But that should really only be needed if the HW cannot turn it on dynamically during domain attach of a PRI enabled domain.
It needs cleaning up..
Yes. I can put this and other cleanup things that we've discussed in a preparation series and send it out for review after the next rc1 is released.
Jason
Best regards, baolu