Linus,
This patch is being sent directly to you because there has been a regression in 5.18 that I identified and sent a fix up that has been reviewed/tested/acked for nearly a week but the current subsystem maintainer (Bartosz) hasn't picked it up to send to you.
It's a severe problem; anyone who hits it: 1) Power button doesn't work anymore 2) Can't resume their laptop from S3 or s2idle
Because the original patch was cc stable@, it landed in stable releases and has been breaking people left and right as distros track the stable channels. The patch is well tested. Would you please consider to pick this up directly to fix that regression?
Thanks,
Mario Limonciello (1): gpio: Request interrupts after IRQ is initialized
drivers/gpio/gpiolib.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
commit 5467801f1fcb ("gpio: Restrict usage of GPIO chip irq members before initialization") attempted to fix a race condition that lead to a NULL pointer, but in the process caused a regression for _AEI/_EVT declared GPIOs. This manifests in messages showing deferred probing while trying to allocate IRQs like so:
[ 0.688318] amd_gpio AMDI0030:00: Failed to translate GPIO pin 0x0000 to IRQ, err -517 [ 0.688337] amd_gpio AMDI0030:00: Failed to translate GPIO pin 0x002C to IRQ, err -517 [ 0.688348] amd_gpio AMDI0030:00: Failed to translate GPIO pin 0x003D to IRQ, err -517 [ 0.688359] amd_gpio AMDI0030:00: Failed to translate GPIO pin 0x003E to IRQ, err -517 [ 0.688369] amd_gpio AMDI0030:00: Failed to translate GPIO pin 0x003A to IRQ, err -517 [ 0.688379] amd_gpio AMDI0030:00: Failed to translate GPIO pin 0x003B to IRQ, err -517 [ 0.688389] amd_gpio AMDI0030:00: Failed to translate GPIO pin 0x0002 to IRQ, err -517 [ 0.688399] amd_gpio AMDI0030:00: Failed to translate GPIO pin 0x0011 to IRQ, err -517 [ 0.688410] amd_gpio AMDI0030:00: Failed to translate GPIO pin 0x0012 to IRQ, err -517 [ 0.688420] amd_gpio AMDI0030:00: Failed to translate GPIO pin 0x0007 to IRQ, err -517
The code for walking _AEI doesn't handle deferred probing and so this leads to non-functional GPIO interrupts.
Fix this issue by moving the call to `acpi_gpiochip_request_interrupts` to occur after gc->irc.initialized is set.
Cc: Shreeya Patel shreeya.patel@collabora.com Cc: stable@vger.kernel.org Fixes: 5467801f1fcb ("gpio: Restrict usage of GPIO chip irq members before initialization") Reported-by: Mario Limonciello mario.limonciello@amd.com Link: https://lore.kernel.org/linux-gpio/BL1PR12MB51577A77F000A008AA694675E2EF9@BL... Signed-off-by: Mario Limonciello mario.limonciello@amd.com Reviewed-by: Shreeya Patel shreeya.patel@collabora.com Tested-By: Samuel Čavoj samuel@cavoj.net Tested-By: lukeluk498@gmail.com Link: Link: https://gitlab.freedesktop.org/drm/amd/-/issues/1979 Link: https://bugzilla.kernel.org/show_bug.cgi?id=215850 Link: https://gitlab.freedesktop.org/drm/amd/-/issues/1976 Reviewed-by: Andy Shevchenko andy.shevchenko@gmail.com Acked-by: Linus Walleij linus.walleij@linaro.org --- v1->v2: * Pick up acked-by/reviewed-by/link/tested-by
drivers/gpio/gpiolib.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/gpio/gpiolib.c b/drivers/gpio/gpiolib.c index 085348e08986..b7694171655c 100644 --- a/drivers/gpio/gpiolib.c +++ b/drivers/gpio/gpiolib.c @@ -1601,8 +1601,6 @@ static int gpiochip_add_irqchip(struct gpio_chip *gc,
gpiochip_set_irq_hooks(gc);
- acpi_gpiochip_request_interrupts(gc); - /* * Using barrier() here to prevent compiler from reordering * gc->irq.initialized before initialization of above @@ -1612,6 +1610,8 @@ static int gpiochip_add_irqchip(struct gpio_chip *gc,
gc->irq.initialized = true;
+ acpi_gpiochip_request_interrupts(gc); + return 0; }
On 4/22/2022 08:14, Mario Limonciello wrote:
commit 5467801f1fcb ("gpio: Restrict usage of GPIO chip irq members before initialization") attempted to fix a race condition that lead to a NULL pointer, but in the process caused a regression for _AEI/_EVT declared GPIOs. This manifests in messages showing deferred probing while trying to allocate IRQs like so:
[ 0.688318] amd_gpio AMDI0030:00: Failed to translate GPIO pin 0x0000 to IRQ, err -517 [ 0.688337] amd_gpio AMDI0030:00: Failed to translate GPIO pin 0x002C to IRQ, err -517 [ 0.688348] amd_gpio AMDI0030:00: Failed to translate GPIO pin 0x003D to IRQ, err -517 [ 0.688359] amd_gpio AMDI0030:00: Failed to translate GPIO pin 0x003E to IRQ, err -517 [ 0.688369] amd_gpio AMDI0030:00: Failed to translate GPIO pin 0x003A to IRQ, err -517 [ 0.688379] amd_gpio AMDI0030:00: Failed to translate GPIO pin 0x003B to IRQ, err -517 [ 0.688389] amd_gpio AMDI0030:00: Failed to translate GPIO pin 0x0002 to IRQ, err -517 [ 0.688399] amd_gpio AMDI0030:00: Failed to translate GPIO pin 0x0011 to IRQ, err -517 [ 0.688410] amd_gpio AMDI0030:00: Failed to translate GPIO pin 0x0012 to IRQ, err -517 [ 0.688420] amd_gpio AMDI0030:00: Failed to translate GPIO pin 0x0007 to IRQ, err -517
The code for walking _AEI doesn't handle deferred probing and so this leads to non-functional GPIO interrupts.
Fix this issue by moving the call to `acpi_gpiochip_request_interrupts` to occur after gc->irc.initialized is set.
Cc: Shreeya Patel shreeya.patel@collabora.com Cc: stable@vger.kernel.org Fixes: 5467801f1fcb ("gpio: Restrict usage of GPIO chip irq members before initialization") Reported-by: Mario Limonciello mario.limonciello@amd.com Link: https://lore.kernel.org/linux-gpio/BL1PR12MB51577A77F000A008AA694675E2EF9@BL... Signed-off-by: Mario Limonciello mario.limonciello@amd.com Reviewed-by: Shreeya Patel shreeya.patel@collabora.com Tested-By: Samuel Čavoj samuel@cavoj.net Tested-By: lukeluk498@gmail.com Link: Link: https://gitlab.freedesktop.org/drm/amd/-/issues/1979 Link: https://bugzilla.kernel.org/show_bug.cgi?id=215850 Link: https://gitlab.freedesktop.org/drm/amd/-/issues/1976 Reviewed-by: Andy Shevchenko andy.shevchenko@gmail.com Acked-by: Linus Walleij linus.walleij@linaro.org
Takashi hit this as well recently and provided a few more tags.
BugLink: https://bugzilla.suse.com/show_bug.cgi?id=1198697 Reviewed-by: Takashi Iwai tiwai@suse.de Tested-by: Takashi Iwai tiwai@suse.de
v1->v2:
- Pick up acked-by/reviewed-by/link/tested-by
drivers/gpio/gpiolib.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/gpio/gpiolib.c b/drivers/gpio/gpiolib.c index 085348e08986..b7694171655c 100644 --- a/drivers/gpio/gpiolib.c +++ b/drivers/gpio/gpiolib.c @@ -1601,8 +1601,6 @@ static int gpiochip_add_irqchip(struct gpio_chip *gc, gpiochip_set_irq_hooks(gc);
- acpi_gpiochip_request_interrupts(gc);
- /*
- Using barrier() here to prevent compiler from reordering
- gc->irq.initialized before initialization of above
@@ -1612,6 +1610,8 @@ static int gpiochip_add_irqchip(struct gpio_chip *gc, gc->irq.initialized = true;
- acpi_gpiochip_request_interrupts(gc);
- return 0; }
On Fri, Apr 22, 2022 at 6:15 AM Mario Limonciello mario.limonciello@amd.com wrote:
This patch is being sent directly to you because there has been a regression in 5.18 that I identified and sent a fix up that has been reviewed/tested/acked for nearly a week but the current subsystem maintainer (Bartosz) hasn't picked it up to send to you.
Applied.
I'm not sure the "cc: stable" makes much sense since the bug was introduced in this release, but I assume you added it because the problem commit was also marked for stable.
The "Fixes:" tag should take care of it, but I left that cc:stable alone.
Linus
On Fri, Apr 22, 2022 at 3:15 PM Mario Limonciello mario.limonciello@amd.com wrote:
Linus,
This patch is being sent directly to you because there has been a regression in 5.18 that I identified and sent a fix up that has been reviewed/tested/acked for nearly a week but the current subsystem maintainer (Bartosz) hasn't picked it up to send to you.
Hi Mario!
I don't have any previous submission in my inbox. Are you sure to have used my current address (brgl@bgdev.pl)?
Bart
It's a severe problem; anyone who hits it:
- Power button doesn't work anymore
- Can't resume their laptop from S3 or s2idle
Because the original patch was cc stable@, it landed in stable releases and has been breaking people left and right as distros track the stable channels. The patch is well tested. Would you please consider to pick this up directly to fix that regression?
Thanks,
Mario Limonciello (1): gpio: Request interrupts after IRQ is initialized
drivers/gpio/gpiolib.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
-- 2.34.1
On Mon, Apr 25, 2022 at 8:41 PM Bartosz Golaszewski brgl@bgdev.pl wrote:
On Fri, Apr 22, 2022 at 3:15 PM Mario Limonciello mario.limonciello@amd.com wrote:
Linus,
This patch is being sent directly to you because there has been a regression in 5.18 that I identified and sent a fix up that has been reviewed/tested/acked for nearly a week but the current subsystem maintainer (Bartosz) hasn't picked it up to send to you.
Hi Mario!
I don't have any previous submission in my inbox. Are you sure to have used my current address (brgl@bgdev.pl)?
Nevermind, found it in spam. Sorry, this sometimes happens in gmail.
Anyway - it's only been 3 days and I've been travelling. Sometimes reviews take a couple days.
Bart
Bart
It's a severe problem; anyone who hits it:
- Power button doesn't work anymore
- Can't resume their laptop from S3 or s2idle
Because the original patch was cc stable@, it landed in stable releases and has been breaking people left and right as distros track the stable channels. The patch is well tested. Would you please consider to pick this up directly to fix that regression?
Thanks,
Mario Limonciello (1): gpio: Request interrupts after IRQ is initialized
drivers/gpio/gpiolib.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
-- 2.34.1
[Public]
-----Original Message----- From: Bartosz Golaszewski brgl@bgdev.pl Sent: Monday, April 25, 2022 13:43 To: Limonciello, Mario Mario.Limonciello@amd.com Cc: Linus Torvalds torvalds@linux-foundation.org; Natikar, Basavaraj Basavaraj.Natikar@amd.com; Gong, Richard Richard.Gong@amd.com; regressions@lists.linux.dev; Thorsten Leemhuis regressions@leemhuis.info; Greg KH gregkh@linuxfoundation.org; stable stable@vger.kernel.org; Linus Walleij linus.walleij@linaro.org; Shreeya Patel shreeya.patel@collabora.com; Andy Shevchenko andy.shevchenko@gmail.com; open list:GPIO SUBSYSTEM <linux- gpio@vger.kernel.org>; open list linux-kernel@vger.kernel.org Subject: Re: [PATCH v2 0/1] Fix regression in 5.18 for GPIO
On Mon, Apr 25, 2022 at 8:41 PM Bartosz Golaszewski brgl@bgdev.pl wrote:
On Fri, Apr 22, 2022 at 3:15 PM Mario Limonciello mario.limonciello@amd.com wrote:
Linus,
This patch is being sent directly to you because there has been a regression in 5.18 that I identified and sent a fix up that has been reviewed/tested/acked for nearly a week but the current subsystem maintainer (Bartosz) hasn't picked it up to send to you.
Hi Mario!
I don't have any previous submission in my inbox. Are you sure to have used my current address (brgl@bgdev.pl)?
Nevermind, found it in spam. Sorry, this sometimes happens in gmail.
OK glad you found it.
Anyway - it's only been 3 days and I've been travelling. Sometimes reviews take a couple days.
If it was just in the normal development release in an RC I'd agree there wasn't a lot of urgency, but stable picked it up and caused severe regressions. There wasn't an obvious willingness to revert the problematic commit in stable is why Thorsten was making noise about it and suggested me to send it directly to Linus.
Anyway - I'm glad it's sorted now.
Bart
Bart
It's a severe problem; anyone who hits it:
- Power button doesn't work anymore
- Can't resume their laptop from S3 or s2idle
Because the original patch was cc stable@, it landed in stable releases and has been breaking people left and right as distros track the stable channels. The patch is well tested. Would you please consider to pick this up directly to fix that regression?
Thanks,
Mario Limonciello (1): gpio: Request interrupts after IRQ is initialized
drivers/gpio/gpiolib.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
-- 2.34.1
linux-stable-mirror@lists.linaro.org