From: Liu, Yi L yi.l.liu@intel.com Sent: Wednesday, December 27, 2023 10:13 PM
On 2023/12/27 17:33, Ethan Zhao wrote:
On 12/27/2023 5:06 PM, Duan, Zhenzhong wrote:
-----Original Message----- From: Liu, Yi L yi.l.liu@intel.com Sent: Tuesday, December 26, 2023 4:44 PM Subject: Re: [PATCH v7 7/9] iommu/vt-d: Allow qi_submit_sync() to
return
the QI faults
On 2023/12/26 14:15, Yi Liu wrote:
On 2023/12/26 12:13, Tian, Kevin wrote:
> From: Liu, Yi L yi.l.liu@intel.com > Sent: Tuesday, December 26, 2023 12:03 PM > > On 2023/12/22 12:23, Tian, Kevin wrote: >>> From: Liu, Yi L yi.l.liu@intel.com >>> Sent: Thursday, December 21, 2023 11:40 PM >>> >>> + fault &= DMA_FSTS_IQE | DMA_FSTS_ITE | DMA_FSTS_ICE; >>> + if (fault) { >>> + if (fsts) >>> + *fsts |= fault; >> do we expect the fault to be accumulated? otherwise it's clearer to >> just do direct assignment instead of asking for the caller to clear >> the variable before invocation. > not quite get. do you mean the fault should not be cleared in the
caller
> side? > I meant:
if (fsts) *fsts = fault;
unless there is a reason to *OR* the original value.
I guess no such a reason. :) let me modify it.
hmmm, replied too soon. The qi_check_fault() would be called multiple times in one invalidation circle as qi_submit_sync() needs to see if any fault happened before the hw writes back QI_DONE to the wait
descriptor.
There can be ICE which may eventually result in ITE. So caller of qi_check_fault() would continue to wait for QI_DONE. So qi_check_fault() returns 0 to let qi_submit_sync() go on though ICE detected. If we use '*fsts = fault;', then ICE would be missed since the input fsts pointer is the same in one qi_submit_sync() call.
Is it necessary to return fault to user if qi_check_fault() return -EAGAIN and a restart run succeeds?
no need if a restart succeeds. I would add a *fault = 0 per the restart.
Issue a device-TLB invalidation to no response device there is possibility
will be trapped there loop for ITE , never get return.
yes. This the implementation today, in future I think we may need a kind of timeout mechanism, so that it can return and report the error to user. In concept, in nested translation, the page table is owned by userspace, so it makes more sense to let userspace know it and take proper action.
it doesn't make sense to retry upon an invalidation request from userspace. if retry is required that is the policy of guest iommu driver. Also it's not good to introduce a uapi flag which won't be set by current driver.
this can be solved by a simple change in qi_check_fault():
if (qi->desc_status[wait_index] == QI_ABORT) - return -EAGAIN; + return fsts ? -ETIMEDOUT : -EAGAIN;
because if the caller wants to know the fault reason the implication is that the caller will decide how to cope with the fault. It is incorrect for qi_check_fault() to decide.