On Wed, Jul 31 2024 at 14:24, Bjorn Helgaas wrote:
On Wed, May 29, 2024 at 06:27:27PM -0700, Joseph Jang wrote:
Validate that there are no duplicate ITS-MSI hwirqs in /sys/kernel/irq/*/hwirq.
One example log shows two duplicated MSI entries in /proc/interrupts.
150: 0 ... ITS-MSI 3355443200 Edge pciehp
152: 0 ... ITS-MSI 3355443200 Edge pciehp
I don't know how ITS-MSI works, so I don't know whether it's an error that both entries mention 3355443200.
3355443200 == 0xc8000000, which looks like it could be an address or address/data pair or something, and it does make sense to me that if two devices write the same MSI address/data, it should result in the same IRQ.
That was an issue with truncation which got fixed some time ago:
https://lore.kernel.org/all/20240115135649.708536-1-vidyas@nvidia.com/
It seems like maybe this is a generic issue, i.e., if this is a problem, maybe it would affect *other* kinds of MSI too, not just ITS-MSI?
It's the same for ALL interrupts whether MSI or not.
The requirement is that for any interrupt chip all hardware interrupt numbers related to a particular chip must be unique.
Adding an ITS-MSI specific parser is just wrong. It's a generic problem and has absolutely nothing to do with ITS or MSI.
Aside from that, the proposed parser does not even work anymore on 6.11 because we switched ARM[64] over to per device domains during the merge window.
So if we want a selftest for the correctness of the hardware interrupt numbers, then it should grab the 'chip_name' and 'hwirq' pairs from the per interrupt sysfs entries and check, per 'chip_name', whether all hardware interrupt numbers for that chip are unique.
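
A rough sketch of such a check, in Python and purely as an illustration. It assumes that /sys/kernel/irq/<N>/chip_name and /sys/kernel/irq/<N>/hwirq are readable and that the 'chip_name' string is sufficient to tell chips apart. The helper names (read_attr, collect_pairs, find_duplicates) are made up for this example and are not taken from any existing selftest.

```python
#!/usr/bin/env python3
"""Sketch: report hardware interrupt numbers that repeat within one chip_name."""
import os
from collections import defaultdict

SYSFS_IRQ_DIR = "/sys/kernel/irq"  # per interrupt sysfs directory


def read_attr(path):
    """Return the stripped contents of a sysfs attribute, or None if unreadable."""
    try:
        with open(path) as f:
            return f.read().strip()
    except OSError:
        return None


def collect_pairs(root=SYSFS_IRQ_DIR):
    """Yield (irq, chip_name, hwirq) for each interrupt exposing both attributes."""
    for irq in sorted((n for n in os.listdir(root) if n.isdigit()), key=int):
        chip = read_attr(os.path.join(root, irq, "chip_name"))
        hwirq = read_attr(os.path.join(root, irq, "hwirq"))
        if chip and hwirq:  # skip interrupts without a chip or hwirq
            yield int(irq), chip, hwirq


def find_duplicates(pairs):
    """Map (chip_name, hwirq) to the Linux IRQ numbers sharing it, collisions only."""
    seen = defaultdict(list)
    for irq, chip, hwirq in pairs:
        seen[(chip, hwirq)].append(irq)
    return {key: irqs for key, irqs in seen.items() if len(irqs) > 1}


if __name__ == "__main__":
    if not os.path.isdir(SYSFS_IRQ_DIR):
        print("SKIP: %s not available" % SYSFS_IRQ_DIR)
    else:
        dups = find_duplicates(collect_pairs())
        for (chip, hwirq), irqs in sorted(dups.items()):
            print("duplicate hwirq %s on chip %s: irqs %s" % (hwirq, chip, irqs))
        print("FAIL" if dups else "PASS")
```

The key is the (chip_name, hwirq) pair rather than the hwirq alone, so the same hardware interrupt number on two different chips is not reported. Nothing in it is specific to ITS or MSI.
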
Thanks,
tglx