On Tue, Apr 11, 2023 at 10:57 AM Linus Walleij linus.walleij@linaro.org wrote:
On Mon, Apr 10, 2023 at 11:16 AM Naresh Kamboju naresh.kamboju@linaro.org wrote: (...)
Anders performed a bisection on this problem. The bisection points to this commit as the first bad commit: [24c94060fc9b4e0f19e6e018869db46db21d6bc7] gpiolib: ensure that fwnode is properly set
I don't think this is the real issue.
(...)
# 2. Module load error tests
# 2.1 gpio overflow
(...)
[ 88.900984] Freed in software_node_release+0xdc/0x108 age=34 cpu=1 pid=683
[ 88.907899] __kmem_cache_free+0x2a4/0x2e0
[ 88.912024] kfree+0xc0/0x1a0
[ 88.915015] software_node_release+0xdc/0x108
[ 88.919402] kobject_put+0xb0/0x220
[ 88.922919] software_node_notify_remove+0x98/0xe8
[ 88.927741] device_del+0x184/0x380
[ 88.931259] platform_device_del.part.0+0x24/0xa8
[ 88.935995] platform_device_unregister+0x30/0x50
I think the refcount is wrong on the fwnode.
The chip is allocated with devm_gpiochip_add_data() which will not call gpiochip_remove() until all references are removed by calling devm_gpio_chip_release().
Add a pr_info() to devm_gpio_chip_release() in drivers/gpio/gpiolib-devres.c and see if the callback is even called. I think this could be the problem: if that isn't cleaned up, there will be dangling references.
diff --git a/drivers/gpio/gpiolib-devres.c b/drivers/gpio/gpiolib-devres.c
index fe9ce6b19f15..30a0622210d7 100644
--- a/drivers/gpio/gpiolib-devres.c
+++ b/drivers/gpio/gpiolib-devres.c
@@ -394,6 +394,7 @@ static void devm_gpio_chip_release(void *data)
 {
 	struct gpio_chip *gc = data;

+	pr_info("GPIOCHIP %s WAS REMOVED BY DEVRES\n", gc->label);
 	gpiochip_remove(gc);
 }
If this isn't working we need to figure out what is holding a reference to the gpiochip.
I don't know how the references to the gpiochip fwnode are supposed to drop to zero though? I didn't work with mockup much ...
What I could think of is that maybe the mockup driver needs a .shutdown() callback to forcibly call gpiochip_remove(), and in that case it should be wrapped in a (currently non-existent) devm_gpiochip_remove(), since devres is used to register it.
Bartosz will know better though! I am pretty sure he has this working flawlessly so the tests must be doing something weird which is leaving references around.
Yours, Linus Walleij
Interestingly, I'm not seeing this with either the gpio-sim selftests or any of the libgpiod tests, which suggests it's the gpio-mockup module that's doing something wrong (or very right, in which case it uncovers some otherwise hidden bug). Anyway, I'll try to spend some time on it and figure it out, although I'd like to be done with gpio-mockup altogether already.
Bart