Hi,
On Mon, Sep 12, 2022 at 11:56:43AM +0100, Jason A. Donenfeld wrote:
AFAIK, I'm just using the normal ACPI one. Really nothing fancy. Thinkpad X1 Extreme Gen 4.
All ACPI drivers set up the get_property method in their power-supply devices.
Maybe get_property was being set and unset during some kind of initialization/deinitialization that was happening in response to some other event? Not sure, except that I managed to trigger it twice before patching my kernel so my laptop would keep working.
The function is not intended to be changed during the lifetime of the device and AFAIK no mainline driver does this.
On Mon, Sep 12, 2022 at 11:48 AM Jason A. Donenfeld <Jason@zx2c4.com> wrote:
On Mon, Sep 12, 2022 at 11:45 AM Jason A. Donenfeld <Jason@zx2c4.com> wrote:
My machine went through three changes I know about between the threshold of "not crashing" and "crashing":
- Upgraded to 5.19 and then 6.0-rc1.
- I used my laptop on batteries for a prolonged period of time for the first time in a while.
- I updated KDE, whose power management UI elements may or may not make frequent calls to this subsystem to update some visual representation.
- Updated my BIOS.
GASP! The plot thickens.
It appears that the BIOS update I applied has been removed from https://pcsupport.lenovo.com/fr/en/downloads/ds551052-bios-update-utility-bo... and now it only shows the 1.16 version. I updated from 1.16 to 1.18.
The missing release notes are still online if you futz with the URL: https://download.lenovo.com/pccbbs/mobiles/n40ur14w.txt https://download.lenovo.com/pccbbs/mobiles/n40ur15w.txt
One of the items for 1.17 says:
- (Fix) Fixed an issue where it took a long time to update the battery FW.
So maybe something was happening here...
I'm CC'ing Mark from Lenovo to see if he has any insight as to why this BIOS update was pulled.
Maybe the battery was appearing and disappearing rapidly.
If that's correct, then it'd indicate that this bandaid patch is *wrong* and what actually is needed is some kind of reference counting or RCU around that sysfs interface (and maybe others).
Device create/remove are the only times that are supposed to touch the get_property callback. So I suppose a race condition in that path would be a sensible root cause. Considering that systems usually register the device once and keep it until shutdown, that would also explain why this has not been noticed earlier.
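To make the reference counting idea concrete, here is a rough userspace sketch (not kernel code, and not a proposed patch). Everything in it is hypothetical: struct psy_sketch, the psy_sketch_* helpers and sketch_get_online() are made-up names and do not correspond to the real struct power_supply or its API. It only shows the general pattern: a reader takes a reference before calling the callback, and teardown clears the callback only after the last reference is dropped.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a power-supply device (not the kernel struct). */
struct psy_sketch {
    atomic_int refs;                 /* 0 means the device is gone */
    int (*get_property)(int *val);   /* only valid while refs > 0 */
};

/* Hypothetical callback: always reports "online". */
static int sketch_get_online(int *val)
{
    *val = 1;
    return 0;
}

/* Take a reference, but only if the device is still alive. */
static bool psy_sketch_get(struct psy_sketch *psy)
{
    int old = atomic_load(&psy->refs);

    while (old > 0) {
        /* On failure, 'old' is reloaded and the liveness check repeats. */
        if (atomic_compare_exchange_weak(&psy->refs, &old, old + 1))
            return true;
    }
    return false;
}

/* Drop a reference; the last holder performs the teardown. */
static void psy_sketch_put(struct psy_sketch *psy)
{
    int old = atomic_fetch_sub(&psy->refs, 1);

    assert(old > 0);
    if (old == 1)
        psy->get_property = NULL;    /* nobody can be inside the callback now */
}

/* Reader path: the callback is only invoked while a reference is held. */
static int psy_sketch_read(struct psy_sketch *psy, int *val)
{
    int ret;

    if (!psy_sketch_get(psy))
        return -1;                   /* device already removed */
    ret = psy->get_property(val);
    psy_sketch_put(psy);
    return ret;
}
```

In this sketch, registration would start the device with refs = 1 and removal would call psy_sketch_put() once to drop that initial reference. A reader that arrives after removal gets an error back instead of calling through a NULL pointer.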
The function you modified is only called by power_supply_is_system_supplied(), which is an in-kernel function to figure out if the system is running on battery.
Can you trigger this easily enough to figure out a few more details about the state of the problematic device?
-- Sebastian