On Fri, Aug 01, 2025 at 08:53:23PM -0700, Andrew Morton wrote:
On Fri, 1 Aug 2025 23:09:31 -0400 Waiman Long <llong@redhat.com> wrote:
There have been a few kmemleak locking fixes lately.
I believe this fix is independent from the previous ones:
https://lkml.kernel.org/r/20250731-kmemleak_lock-v1-1-728fd470198f@debian.or...
That's a similar bug in another part of the kmemleak code but fixed differently (which I actually prefer if feasible).
https://lkml.kernel.org/r/20250728190248.605750-1-longman@redhat.com
That's a soft lockup, unrelated to the printk deadlock.
I believe that __printk_safe_enter()/__printk_safe_exit() are for printk internal use only. The proper API to use should be printk_deferred_enter()/printk_deferred_exit() if we want to defer the printing. Since kmemleak_lock will have been acquired with IRQs disabled, it meets the condition under which the printk_deferred_*() APIs can be used.
Gotcha, thanks.
kmemleak.c:__lookup_object() has a lot of callers. I hope you're correct that all have local irqs disabled, but I'll ask Gu to verify that. Then could you please send along a new patch which uses printk_deferred_enter()?
__lookup_object() must be called with kmemleak_lock held (unless we have a bug in kmemleak).
Using printk_deferred_enter() is more convenient, though I think some of these places can defer the printing with something similar to the first patch above from Breno.
For __lookup_object(), we could move the warning outside the lock but this function would have to lock the respective object, return it and somehow inform the caller that it was an error and the object needs unlocking. Given that this is a very rare/never event (only happens if someone messes up the kmemleak_alloc/free calls), I'd say the printk_deferred_enter() works best.
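To make the ordering concrete, here is a rough userspace model of the printk_deferred_enter()/printk_deferred_exit() approach. Only those two names match the kernel API; everything else (model_lock(), model_unlock(), deferred_warn(), lookup_object_model(), the buffers) is a made-up stand-in so the sketch builds outside the kernel. It also simplifies the deferral: the model flushes pending output at unlock time, whereas the real kernel, as far as I understand, stores the message and leaves the console output to a later context.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Userspace stand-ins (assumptions), not the kernel implementation. */
static bool kmemleak_lock_held;       /* models kmemleak_lock taken with IRQs off */
static int deferred_depth;            /* models the printk deferral state */
static int console_writes_under_lock; /* counts the deadlock-prone case */
static char pending[256];             /* output held back while deferred */
static char console[256];             /* output that reached the console */

static void printk_deferred_enter(void) { deferred_depth++; }
static void printk_deferred_exit(void)  { deferred_depth--; }

/* Hypothetical warning helper: queue the message if deferral is active. */
static void deferred_warn(const char *msg)
{
	char *dst = console;

	if (deferred_depth > 0)
		dst = pending;
	else if (kmemleak_lock_held)
		console_writes_under_lock++;
	strncat(dst, msg, sizeof(pending) - strlen(dst) - 1);
}

static void model_lock(void) { kmemleak_lock_held = true; }

static void model_unlock(void)
{
	kmemleak_lock_held = false;
	/* Model simplification: pending output is emitted once the lock is dropped. */
	strncat(console, pending, sizeof(console) - strlen(console) - 1);
	pending[0] = '\0';
}

/* Hypothetical lookup: ptr == 0 stands for the rare "bad pointer" case. */
static int lookup_object_model(unsigned long ptr)
{
	assert(kmemleak_lock_held);   /* must be called with the lock held */
	if (ptr == 0) {
		/* Defer printing: the console path must not run under kmemleak_lock. */
		printk_deferred_enter();
		deferred_warn("kmemleak: bad lookup\n");
		printk_deferred_exit();
		return -1;
	}
	return 0;
}
```

The point of the model is only that nothing reaches the console while the lock is held; the caller keeps its existing locking and needs no new error path.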
We have delete_object_part() which calls kmemleak_warn() with the kmemleak_lock held. The warning can be moved outside similar to Breno's patch.
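A minimal sketch of that alternative, again as a userspace model rather than actual kmemleak code: the error is only recorded while the lock is held and the warning is emitted after the unlock. delete_part_model(), emit_warning() and the counters are hypothetical names for illustration, and the boolean lock_held stands in for taking and releasing kmemleak_lock.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Userspace stand-ins (assumptions), not kmemleak code. */
static bool lock_held;            /* models kmemleak_lock */
static int warnings_total;
static int warnings_under_lock;   /* should stay 0 with this pattern */

/* Hypothetical warning helper that tracks whether it ran under the lock. */
static void emit_warning(unsigned long ptr)
{
	if (lock_held)
		warnings_under_lock++;
	warnings_total++;
	fprintf(stderr, "kmemleak: partially freeing unknown object at 0x%lx\n", ptr);
}

/* Hypothetical stand-in for a delete path: ptr == 0 is the error case. */
static int delete_part_model(unsigned long ptr)
{
	bool warn = false;
	int ret = 0;

	assert(!lock_held);           /* this model takes the lock itself */

	lock_held = true;             /* lock with IRQs disabled in the kernel */
	if (ptr == 0) {
		warn = true;              /* only record the problem here */
		ret = -1;
	}
	lock_held = false;            /* unlock before any printing */

	if (warn)
		emit_warning(ptr);        /* safe: the lock is no longer held */
	return ret;
}
```

This works when the function owns the lock for its whole critical section; it is less attractive for __lookup_object(), where the lock is taken by the callers.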
We have a kmemleak_stop() -> kmemleak_warn() called in __link_object() with the kmemleak_lock held. We could use the printk deferring here as well. That's another rare corner case.
The patch should add a code comment on why printk deferring is used, in case others wonder in the future.
I'm surprised we haven't seen these until recently. Has printk always allocated memory?