On Wed, 2018-11-07 at 12:57 +0000, Gustavo Pimentel wrote:
On 06/11/2018 16:00, Marc Zyngier wrote:
On 06/11/18 14:53, Lorenzo Pieralisi wrote:
On Sat, Oct 27, 2018 at 12:00:57AM +0000, Trent Piepho wrote:
This gives the following race scenario:
1. An MSI is received by, and the status bit for the MSI is set in, the
   DWC PCI-e controller.
2. dw_handle_msi_irq() calls a driver's registered interrupt handler for
   the MSI received.
3. At some point, the interrupt handler must decide, correctly, that
   there is no more work to do and return.
4. The hardware generates a new MSI. As the MSI's status bit is still
   set, this new MSI is ignored.
6. dw_handle_msi_irq() unsets the MSI status bit.
The MSI received at point 4 will never be acted upon. It occurred after the driver had finished checking the hardware status for interrupt conditions to act on. Since the MSI status was masked, it does not generate a new IRQ, neither when it was received nor when the MSI is unmasked.
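The scenario above can be replayed in a small user-space model. This is only a sketch: struct msi_model, device_send_msi(), driver_handler() and clear_after_handle() are hypothetical names invented for illustration, not kernel or DWC APIs, and the assumption that a message arriving while its status bit is set raises no new IRQ is the behaviour described in this thread, not something taken from documentation.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of one MSI status register; not the real DWC layout. */
struct msi_model {
	uint32_t status;       /* one latched status bit per MSI vector */
	unsigned irqs_raised;  /* IRQs the model controller forwarded to the CPU */
	unsigned work_pending; /* device events the driver has not serviced yet */
};

/* The device signals an event by sending an MSI for vector 'vec'. */
static void device_send_msi(struct msi_model *m, unsigned vec)
{
	assert(vec < 32);
	m->work_pending++;
	if (m->status & (1u << vec))
		return; /* assumed: bit already set, so no new IRQ is raised */
	m->status |= 1u << vec;
	m->irqs_raised++;
}

/* The driver's handler services everything it can see right now. */
static void driver_handler(struct msi_model *m)
{
	m->work_pending = 0;
}

/* Replays the steps above with the status bit cleared after the handler. */
struct msi_model clear_after_handle(void)
{
	struct msi_model m = { 0, 0, 0 };

	device_send_msi(&m, 0); /* step 1: MSI received, status bit set */
	driver_handler(&m);     /* steps 2-3: handler runs and returns */
	device_send_msi(&m, 0); /* step 4: new MSI, ignored (bit still set) */
	m.status &= ~1u;        /* step 6: status bit cleared */
	return m;
}
```

In this model the final state has one unserviced event, a clear status bit and only one IRQ ever raised, so nothing remains that would cause the handler to run again.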
This status register indicates whether or not there is an MSI interrupt on that controller [0..7] to be handled.
While the status for an MSI is set, no new interrupt will be triggered if another identical MSI is received, correct?
In theory, we should clear the interrupt flag only after the interrupt has actually been handled (which can take some time to process in the worst-case scenario).
But see above, there is a race if a new MSI arrives while still masked. I can see no possible way to solve this in software that does not involve unmasking the MSI before calling the handler. To leave the interrupt masked while calling the handler requires the hardware to queue an interrupt that arrives while masked. We have no docs, but the designware controller doesn't appear to do this in practice.
However, Trent's patch allows acknowledging the flag first and handling the interrupt later, giving the opportunity to catch a possible new interrupt, which will be handled by a new call of this function.
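The ack-first ordering can be sketched with a simplified user-space model. Everything here is illustrative: struct msi_ctrl, send_msi(), run_handler() and ack_before_handle() are hypothetical names, not kernel or DWC APIs, and the controller is assumed to drop any message that arrives while its status bit is still set.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical controller model; not the real DWC register layout. */
struct msi_ctrl {
	uint32_t status;       /* latched MSI status bits */
	unsigned irqs_raised;  /* IRQs forwarded to the CPU */
	unsigned work_pending; /* device events not yet serviced */
};

/* Device sends an MSI; assumed to be dropped if the bit is already set. */
static void send_msi(struct msi_ctrl *c, unsigned vec)
{
	assert(vec < 32);
	c->work_pending++;
	if (c->status & (1u << vec))
		return;
	c->status |= 1u << vec;
	c->irqs_raised++;
}

/* Driver handler: services all work visible at the time it runs. */
static void run_handler(struct msi_ctrl *c)
{
	c->work_pending = 0;
}

/* Acknowledge (clear) the status bit first, then call the handler. */
struct msi_ctrl ack_before_handle(void)
{
	struct msi_ctrl c = { 0, 0, 0 };

	send_msi(&c, 0);   /* first MSI latches the bit and raises an IRQ */
	c.status &= ~1u;   /* ack before calling the handler */
	run_handler(&c);   /* handler finishes its last status check */
	send_msi(&c, 0);   /* late MSI: bit is clear, so it latches again */

	/* The second IRQ leads to a new call of the same service path. */
	if (c.status & 1u) {
		c.status &= ~1u;
		run_handler(&c);
	}
	return c;
}
```

With this ordering the model ends with no unserviced work and two IRQs raised: the late MSI is seen because the bit had already been cleared when it arrived.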
What I'm interested in is the relationship this has with the mask/unmask callbacks, and whether masking the interrupt before acking it would help.
Although it is possible to mask/unmask the interrupts on the controller, from what I've seen in other dw drivers this is typically not done. Probably we don't get much benefit from using it.
Gustavo
Gustavo, can you help here?
In any case, moving the action of acknowledging the interrupt to its right spot in the kernel (dw_pci_bottom_ack) would be a good start.
Thanks,
M.