On Sun, 21 Feb 2021 22:18:48 +0100 Andrew Lunn andrew@lunn.ch wrote:
BTW do you have some experience where pca9538 or compatible cause errors when used for interrupts? Because I am thinking about trying to update the pca953x driver to support IRQs via the gpio_chip it registers, instead of a separate irq_chip.
I had a board which just died at boot with an interrupt storm. It was probably a PCA9554, at least, i have that datasheet in my collection.
But why did an interrupt storm kill it? The interrupt handler was called too many times?
First off, the hardware needs to designed correctly. All unused pins need a pull up/down since they default to inputs, and hence will trigger interrupts. Or you need to make unused pins outputs before you enable interrupts. And that probably goes against the design of the GPIO subsystem. I don't think you actually know when a pin is unused.
Omnia has proper pull ups/downs on all 8 pins on this device. 5 of these pins are used from SFP cage, 1 as interrupt from PHY and 2 are unused. Only the interrupt pin was causing problems because marvell PHY driver configured it as blink on activity LED.
I'm not sure i would want to touch this driver. Given how badly this device implements interrupts, any board which does successfully use it for interrupts might regress if you make code changes. And then you are going to have to try to figure out what you actually changed and why it regressed.
The problem in this driver is: - interrupt handler is called every time an input pin changes - not all input pins must be used as interrupt sources - if at least one input pin is used as an interrupt source, the interrupt handler is being called on every change of every input pin - but if the change occurs on a pin that is not used as an interrupt source, the interrupt handler returns IRQ_NONE - a simple scenario to achieve error: 1. use pin P0 as interrupt source and P1 as GPIO input; other pins as outputs 2. connect P1 to something that changes value 3. after 10000 changes of P1 (more if there was a change on P0 at the same time) the interrupt handler will return IRQ_NONE 10000 times and kernel will start ignoring interrupts from this driver because it was returning IRQ_NONE I think this needs to be fixed in this driver. Either this function should return IRQ_HANDLED in this case, or there should be a third option to return, something like IRQ_NONE_BUT_THATS_OK
Marek