On Tue, 2018-11-06 at 14:53 +0000, Lorenzo Pieralisi wrote:
Not sure why (5) is not used in your lists, I assume because you want to highlight the race condition with the jump from 4 to 6 (or maybe you do not like number 5 :), just curious).
It's because I removed a step and I'm used to rst docs where it renumbers automatically.
The observant will notice that point 4 present the opportunity for the SoC's interrupt controller to lose the interrupt in the same manner as the bug in this driver. The driver for that interrupt controller will be written to properly deal with this. In some cases the hardware supports an EOI concept, where the 2nd IRQ is masked and internally queued in the hardware, to be re-raised at EOI in step 7. In other cases the IRQ will be unmasked and re-raised at step 4, but the kernel will see the handler is INPROGRESS and not re-invoke it, and instead set a PENDING flag, which causes the handler to re-run at step 7.
[1]
Fixes: 8c934095fa2f ("PCI: dwc: Clear MSI interrupt status after it is handled, not before")
I have two questions:
- Commit 8c934095fa2f has been in the kernel for a year and no regression was reported. It was assumed to fix a problem so before reverting it I want to make sure we are not breaking anything else.
There have been reports. Vignesh reported this some time ago. Also see https://lkml.org/lkml/2018/11/2/55, the author indicated to me that it was the caused by this problem after I pointed him to my patch. And I've reported it now too.
We only updated to a kernel with the regression a few months ago. I think imx6/7 systems don't update their kernels that quickly. Usually not running a mainstream distro with automatic updates. And there are probably a lot of people using the NXP vendor kernel, which has a ton of unmerged patches, and so they don't see this regression yet.
- Your reasoning seems correct but I would pick Marc's brain on this because I want to understand if what this patch does is what IRQ core expects it to do, especially in relation to the IRQ chaining you are mentioning.
I've traced this through the ARM arch code and used our custom PCI device to generate interrupts in all the race windows. It does work as I say, though I can't say this it how it's supposed to work.