On 1/31/2019 4:41 PM, Logan Gunthorpe wrote:
On 2019-01-31 3:46 p.m., Dave Jiang wrote:
I believe irqbalance writes to the file /proc/irq/N/smp_affinity. So maybe take a look at the code that starts from there and see if it would have any impact on your stuff.
Ok, well on my system I can write to smp_affinity all day and the MSI interrupts still work fine.
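For anyone repeating this test: /proc/irq/N/smp_affinity takes a hexadecimal CPU bitmask (bit 0 is CPU0, bit 1 is CPU1, and so on). Below is a minimal Python sketch of building and writing that mask. The helper names and the proc_root parameter are made up for illustration, and the sketch assumes CPUs 0-31 so that the mask fits in a single 32-bit group.

```python
def cpus_to_affinity_mask(cpus):
    """Return the hex bitmask string for a collection of CPU numbers.

    Assumption for this sketch: only CPUs 0-31 are supported, so the
    mask never needs the comma-separated multi-group form.
    """
    mask = 0
    for cpu in cpus:
        if not 0 <= cpu < 32:
            raise ValueError("this sketch only handles CPUs 0-31")
        mask |= 1 << cpu
    return format(mask, "x")


def write_affinity(irq, cpus, proc_root="/proc"):
    """Write the mask to <proc_root>/irq/<irq>/smp_affinity.

    Needs root on a real system. proc_root is a hypothetical knob that
    lets the function be pointed at a scratch directory for testing.
    """
    path = f"{proc_root}/irq/{irq}/smp_affinity"
    with open(path, "w") as f:
        f.write(cpus_to_affinity_mask(cpus))
    return path
```

For example, cpus_to_affinity_mask([0, 1, 2, 3]) gives "f", which pins the interrupt to the first four CPUs when written to the file.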
Maybe your code is ok then. If the stats show up in /proc/interrupts then you can see it moving to different cores.
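One way to check that is to take two snapshots of /proc/interrupts and diff the per-CPU counters for the IRQ in question. The following is only a rough Python sketch: the function names are invented for illustration, and it assumes the usual layout of a header row of CPU names followed by "label: count count ... description" rows.

```python
def parse_interrupts(text):
    """Parse /proc/interrupts content into {irq_label: {cpu_name: count}}."""
    lines = text.splitlines()
    cpus = lines[0].split()  # header row, e.g. ["CPU0", "CPU1"]
    counts = {}
    for line in lines[1:]:
        label, sep, rest = line.partition(":")
        if not sep:
            continue  # not a "label: ..." row
        per_cpu = []
        for field in rest.split()[:len(cpus)]:
            if not field.isdigit():
                break  # reached the chip name / description columns
            per_cpu.append(int(field))
        counts[label.strip()] = dict(zip(cpus, per_cpu))
    return counts


def irq_delta(before, after, irq):
    """Return {cpu_name: increase} for CPUs whose count changed for irq."""
    old = before.get(irq, {})
    new = after.get(irq, {})
    return {cpu: new[cpu] - old.get(cpu, 0)
            for cpu in new if new[cpu] != old.get(cpu, 0)}
```

Taking a snapshot, writing a new mask to smp_affinity, generating some traffic, and taking a second snapshot should then show the counts growing on the newly selected CPU only.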
The MSI code is a bit difficult to trace and audit with all the different chips and the parent chips, which I don't have a good understanding of. But I can definitely see that some chips could change the address, since a write to smp_affinity can eventually end up calling msi_domain_set_affinity(), which does seem to recompose the message and write it back to the chip.
So, I could relatively easily add a callback to msi_desc to catch this and resend the MSI address/data. However, I'm not sure how this is ever done atomically. It seems like there would be a race while the device updates its address, during which interrupts could still be triggered with the old address. This race window would be much longer for us, since we have to send this information over the NTB link. Though, I guess if the only change is that it encodes CPU information in the address, then that would not be an issue. However, I can't say that for certain without a comprehensive understanding of all the IRQ chips.
Any thoughts on this?
Yeah, I'm not sure what to do about it either, as I'm not super familiar with that area. Just making a note of what I encountered. And you are right, the updated info has to go over NTB before the other side can write to the updated place, so there's a lot of latency involved.
Logan