Hi Thomas, Xiong
On Thu, Dec 14, 2023 at 09:54:06AM +0800, xiongxin wrote:
在 2023/12/13 22:59, Thomas Gleixner 写道:
On Wed, Dec 13 2023 at 10:29, xiongxin wrote:
在 2023/12/12 23:17, Thomas Gleixner 写道: Sorry, the previous reply may not have clarified the BUG process. I re-debugged and confirmed it yesterday. The current BUG execution sequence is described as follows:
It's the sequence how this works and it works correctly.
Just because it does not work on your machine it does not mean that this is incorrect and a BUG.
You are trying to fix a symptom and thereby violating guarantees of the core code.
That is, there is a time between the 1:handle_level_irq() and 3:irq_thread_fn() calls for the 2:disable_irq() call to acquire the lock and then implement the irq_state_set_disabled() operation. When finally call irq_thread_fn()->irq_finalize_oneshot(), it cannot enter the unmask_thread_irq() process.
Correct, because the interrupt has been DISABLED in the mean time.
In this case, the gpio irq_chip irq_mask()/irq_unmask() callback pairs are not called in pairs, so I think this is a BUG, but not necessarily fixed from the irq core code layer.
No. It is _NOT_ a BUG. unmask() is not allowed to be invoked when the interrupt is DISABLED. That's the last time I'm going to tell you that. Only enable_irq() can undo the effect of disable_irq(), period.
Next, when the gpio controller driver calls the suspend/resume process, it is as follows:
suspend process: dwapb_gpio_suspend() ctx->int_mask = dwapb_read(gpio, GPIO_INTMASK);
resume process: dwapb_gpio_resume() dwapb_write(gpio, GPIO_INTMASK, ctx->int_mask);
Did you actually look at the sequence I gave you?
Suspend: i2c_hid_core_suspend() disable_irq(); <- Marks it disabled and eventually masks it. gpio_irq_suspend() save_registers(); <- Saves masked interrupt Resume: gpio_irq_resume() restore_registers(); <- Restores masked interrupt i2c_hid_core_resume() enable_irq(); <- Unmasks interrupt and removes the disabled markerHave you verified that this order of invocations is what happens on your machine?
Thanks,
tglxAs described earlier, in the current situation, the irq_mask() callback of gpio irq_chip is called in mask_irq(), followed by the disable_irq() in i2c_hid_core_suspend(), unmask_irq() will not be executed.
Then call enable_irq() in i2c_hid_core_resume(). Since gpio irq_chip does not implement the irq_startup() callback, it ends up calling irq_enable().
The irq_enable() function is then implemented as follows:
irq_state_clr_disabled(desc); if (desc->irq_data.chip->irq_enable) { desc->irq_data.chip->irq_enable(&desc->irq_data); irq_state_clr_masked(desc); } else { unmask_irq(desc); }
Because gpio irq_chip implements irq_enable(), unmask_irq() is not executed, and gpio irq_chip's irq_unmask() callback is not called. Instead, irq_state_clr_masked() was called to clear the masked flag.
The irq masked behavior is actually controlled by the irq_mask()/irq_unmask() callback function pairs in gpio irq_chip. When the whole situation occurs, there is one more irq_mask() operation, or one less irq_unmask() operation. This ends the i2c hid resume and the gpio corresponding i2c hid interrupt is also masked.
Please help confirm whether the current situation is a BUG, or suggest other solutions to fix it.
I has just been Cc'ed to this thread from the very start of the thread so not sure whether I fully understand the problem. But a while ago an IRQ disable/undisable-mask/unmask-unpairing problem was reported for DW APB GPIO driver implementation, but in a another context though: https://lore.kernel.org/linux-gpio/1606728979-44259-1-git-send-email-luojiax... We didn't come to a final conclusion back then, so the patch got lost in the emails archive. Xiong, is it related to the problem in your case?
-Serge(y)