On 7/31/2019 8:13 AM, Stephen Boyd wrote:
Quoting Linus Walleij (2019-07-31 01:48:38)
On Tue, Jul 30, 2019 at 3:41 PM Mark Brown broonie@kernel.org wrote:
Today's -next fails to boot on QDF2400 systems:
Is this a devicetree or ACPI system? Which devicetree in that case? If it is ACPI I assume one had to look into DSDTs?
But I looked into this!
08:56:36.026587 [ 4.339966] pc : __memset+0x80/0x188 08:56:36.026867 [ 4.343524] lr : msm_gpio_init_valid_mask+0x134/0x1a4
Aha. I think this only worked out of chance before.
What the Qualcomm driver does is exploit that .init_valid_mask() gets called even if .need_valid_mask in gpio_chip is not set to true, and this is why the bug appears in msm_gpio_init_valid_mask(), I'm pretty much sure it is the bitmap_zero(chip->valid_mask, max_gpios); call towards the end of the function that gets turned into: 08:56:36.114798 [ 4.433713] __memset+0x80/0x188
And this causes the crash.
This is then because chip->valid_mask has not been allocated, and that is because .need_valid_mask is not set. This is set like so:
static bool msm_gpio_needs_valid_mask(struct msm_pinctrl *pctrl) { if (pctrl->soc->reserved_gpios) return true;
return device_property_read_u16_array(pctrl->dev, "gpios", NULL, 0) > 0;
}
Some of the code here is new. I guess the arm64 laptop stuff is making changes.
First comes from the driver, second is for ACPI I think. It looks like a bit dangerous way to do it for ACPI, what if an OF pin controller has some "gpios" property? Oh well.
Apparently this only happens on ACPI systems because I tested it myself on a DT system.
Another cause may be this from the call site inside gpiolib:
static int gpiochip_alloc_valid_mask(struct gpio_chip *gc) { if (IS_ENABLED(CONFIG_OF_GPIO)) gc->need_valid_mask = of_gpio_need_valid_mask(gc); if (!gc->need_valid_mask) return 0;
gc->valid_mask = gpiochip_allocate_mask(gc); if (!gc->valid_mask) return -ENOMEM; return 0;
}
So if OF and ACPI is activated at the same time (can that happen?)
Yes, OF and ACPI can be compiled into the same kernel. Also, ACPI does some interesting things when CONFIG_OF is enabled by looking for some ACPI magic that basically says "match the DT compatible string embedded in this ACPI thing". Quite scary!
then the OF code will bail out I suppose and this happens when the ACPI side of things try to use the mask it didn't allocate. The ACPI codepath in msm_gpio_init_valid_mask() seems to try to set the mask even if there is zero GPIOs as well... dereferencing the NULL pointer in chip->valid_mask.
Could it be that this is a ACPI system with zero protected GPIOs?
QDF2400 is an ACPI only system, but it has protected GPIOs
It doesn't seem like the code will survive that. It seems written under the assumption that
It's a bit of a mess.
Can some qcom ACPI people take linux-next for a ride and tell me what I should do?
Lee: do you know if linux-next boots fine on your ACPI machine?
Timur/Stephen: any ideas?
Timur's CAF account is no longer valid, I added his @kernel one.
I think the code changed in commit f626d6dfb709 ("gpio: of: Break out OF-only code"). Now it unconditionally overwrites the chip's need_valid_mask member when CONFIG_OF is enabled. How about only overwriting it to 'true' when it really needs it? That way, the gpio provider can have a say. I originally wrote this by having of_gpio_need_valid_mask() modify the chip directly, but I like this approach instead where we mark it as const in that function and then only set it to true if it's actually needed.
-----8<---- diff --git a/drivers/gpio/gpiolib-of.c b/drivers/gpio/gpiolib-of.c index b10d04dd9296..e39b4290b80c 100644 --- a/drivers/gpio/gpiolib-of.c +++ b/drivers/gpio/gpiolib-of.c @@ -87,7 +87,7 @@ static struct gpio_desc *of_xlate_and_get_gpiod_flags(struct gpio_chip *chip,
- @dev: the device for the GPIO provider
- @return: true if the valid mask needs to be set
*/ -bool of_gpio_need_valid_mask(struct gpio_chip *gc) +bool of_gpio_need_valid_mask(const struct gpio_chip *gc) { int size; struct device_node *np = gc->of_node; diff --git a/drivers/gpio/gpiolib-of.h b/drivers/gpio/gpiolib-of.h index 34954921d96e..454d1658ee2d 100644 --- a/drivers/gpio/gpiolib-of.h +++ b/drivers/gpio/gpiolib-of.h @@ -16,7 +16,7 @@ struct gpio_desc *of_get_named_gpiod_flags(struct device_node *np, int of_gpiochip_add(struct gpio_chip *gc); void of_gpiochip_remove(struct gpio_chip *gc); int of_gpio_get_count(struct device *dev, const char *con_id); -bool of_gpio_need_valid_mask(struct gpio_chip *gc); +bool of_gpio_need_valid_mask(const struct gpio_chip *gc); #else static inline struct gpio_desc *of_find_gpio(struct device *dev, const char *con_id, @@ -36,7 +36,7 @@ static inline int of_gpio_get_count(struct device *dev, const char *con_id) { return 0; } -static inline bool of_gpio_need_valid_mask(struct gpio_chip *gc) +static inline bool of_gpio_need_valid_mask(const struct gpio_chip *gc) { return false; } diff --git a/drivers/gpio/gpiolib.c b/drivers/gpio/gpiolib.c index d45c9a2285f0..e7153c81fdaa 100644 --- a/drivers/gpio/gpiolib.c +++ b/drivers/gpio/gpiolib.c @@ -362,8 +362,8 @@ static unsigned long *gpiochip_allocate_mask(struct gpio_chip *chip) static int gpiochip_alloc_valid_mask(struct gpio_chip *gc) {
- if (IS_ENABLED(CONFIG_OF_GPIO))
gc->need_valid_mask = of_gpio_need_valid_mask(gc);
- if (of_gpio_need_valid_mask(gc))
if (!gc->need_valid_mask) return 0;gc->need_valid_mask = true;
linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel