Lukas Bulwahn lukas.bulwahn@gmail.com writes:
On Wed, 22 Jan 2020, Jouni Högander wrote:
Greg Kroah-Hartman gregkh@linuxfoundation.org writes:
Now queued up, I'll push out -rc2 versions with this fix.
greg k-h
We have also been informed about another regression these two commits are causing:
https://lore.kernel.org/lkml/ace19af4-7cae-babd-bac5-cd3505dcd874@I-love.SAK...
I suggest dropping these two patches from this queue and giving us a week to shake out the regressions caused by the change; once ready, we can submit the complete set of fixes to stable (probably in a week or two).
Ok, thanks for the information, I've now dropped them from all of the queues that had them in them.
greg k-h
I have now run more extensive Syzkaller testing on the following patches:
cb626bf566eb net-sysfs: Fix reference count leak
ddd9b5e3e765 net-sysfs: Call dev_hold always in rx_queue_add_kobject
e0b60903b434 net-sysfs: Call dev_hold always in netdev_queue_add_kobje
48a322b6f996 net-sysfs: fix netdev_queue_add_kobject() breakage
b8eb718348b8 net-sysfs: Fix reference count leak in rx|netdev_queue_add_kobject
These patches fix a couple of memory leaks, including this one found by Syzbot: https://syzkaller.appspot.com/bug?extid=ad8ca40ecd77896d51e2
I can reproduce these memory leaks in the following stable branches: 4.14, 4.19, and 5.4.
These are all now merged into the net/master tree and, based on my testing, they are ready to be taken into the stable branches as well.
- syzkaller list
Jouni et al., please drop Linus in further responses; Linus, it was wrong to add you to this thread in the first place (the reason is explained below).
Jouni, thanks for investigating.
It raises the following questions and comments:
- Does the memory leak NOT appear on 4.9 and earlier LTS branches (or did you not check that)? If it does not appear, can you bisect it with the reproducer to the commit between 4.14 and 4.9?
I tested, and these memory leaks are not reproducible in 4.9 and earlier.
- Do the reproducers you found with your syzkaller testing show the same behaviour (same bisection) as the reproducers from syzbot?
Yes, they are the same.
- I fear syzbot's automatic bisection is wrong, and Linus' commit 0e034f5c4bc4 ("iwlwifi: fix mis-merge that breaks the driver") is not to blame here; that commit did not cause the memory leak, but fixed some unrelated issue that simply confuses syzbot's automatic bisection.
Just FYI: Dmitry Vyukov's evaluation of syzbot's bisections shows that about 50% are wrong, e.g., because one reproducer triggers multiple bugs and it is difficult to tell them apart automatically when they have different root causes (despite the smart heuristics of syzkaller & syzbot). So, to identify the actual commit in which the memory leak first appeared, you need to bisect manually and use your own judgement on whether the reported stack trace fits the issue you are investigating. Alternatively, you can use syzbot's automatic bisection, but with a reduced kernel config that cannot be confused by other issues. You might also hit a "beginning of time" in your bisection, where KASAN was simply not supported; in that case the causing commit cannot be determined by bisecting with the reproducer and needs some code inspection and archaeology with git. Can you go ahead and try to identify the correct commit for this issue?
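To illustrate how a second, unrelated issue can send an automatic bisection to the wrong commit, here is a toy sketch in Python. Everything in it is an assumption made for illustration: the 16-commit history, the commit indices, and the helpers `bisect` and `leak_reported` are hypothetical, and this is not how syzbot implements bisection. It only models the failure mode described above, where an unrelated problem hides the leak in part of the history.

```python
def bisect(n_commits, is_bad):
    """Binary search for the first bad commit.

    Like git bisect, this assumes commit 0 is good, the last commit is bad,
    and that every commit after the first bad one is bad as well.
    """
    lo, hi = 0, n_commits - 1  # lo: known good, hi: known bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_bad(mid):
            hi = mid
        else:
            lo = mid
    return hi


# Hypothetical history (indices are made up for this sketch).
LEAK_INTRODUCED = 3      # commit that really introduces the memory leak
MASKED = range(6, 12)    # an unrelated breakage hides the leak in commits 6..11
                         # and is fixed by commit 12


def leak_reported(commit, reduced_config=False):
    """Does the reproducer report the leak at this commit?"""
    if commit < LEAK_INTRODUCED:
        return False
    if commit in MASKED and not reduced_config:
        return False     # the unrelated issue prevents the leak from showing
    return True


naive = bisect(16, leak_reported)
careful = bisect(16, lambda c: leak_reported(c, reduced_config=True))
print("naive bisection blames commit", naive)
print("bisection with reduced config blames commit", careful)
```

In this model the naive run blames the commit that fixed the unrelated breakage, because the masked range violates the "once bad, always bad" assumption, while the run with the reduced config lands on the commit that introduced the leak.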
These two commits (which are not in 4.9 and earlier) are introducing these leaks:
commit e331c9066901dfe40bea4647521b86e9fb9901bb
Author: YueHaibing yuehaibing@huawei.com
Date: Tue Mar 19 10:16:53 2019 +0800
net-sysfs: call dev_hold if kobject_init_and_add success
[ Upstream commit a3e23f719f5c4a38ffb3d30c8d7632a4ed8ccd9e ]
In netdev_queue_add_kobject and rx_queue_add_kobject, if sysfs_create_group failed, kobject_put will call netdev_queue_release to decrease dev refcont, however dev_hold has not be called. So we will see this while unregistering dev:
unregister_netdevice: waiting for bcsh0 to become free. Usage count = -1
Reported-by: Hulk Robot hulkci@huawei.com
Fixes: d0d668371679 ("net: don't decrement kobj reference count on init failure")
Signed-off-by: YueHaibing yuehaibing@huawei.com
Signed-off-by: David S. Miller davem@davemloft.net
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
commit d0d6683716791b2a2761a1bb025c613eb73da6c3
Author: stephen hemminger stephen@networkplumber.org
Date: Fri Aug 18 13:46:19 2017 -0700
net: don't decrement kobj reference count on init failure
If kobject_init_and_add failed, then the failure path would decrement the reference count of the queue kobject whose reference count was already zero.
Fixes: 114cf5802165 ("bql: Byte queue limits")
Signed-off-by: Stephen Hemminger sthemmin@microsoft.com
Signed-off-by: David S. Miller davem@davemloft.net
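The imbalance described in the first commit message above ("Usage count = -1") can be modelled in user space. The following is a minimal Python sketch, not kernel code: `ToyDev` and `add_queue` are hypothetical names, the integer `usage` stands in for the device reference count, and the `put()` on the error path stands in for the release callback that the final kobject_put runs. It compares taking the reference only on success with taking it always, which is the approach named in the "Call dev_hold always" patch titles.

```python
class ToyDev:
    """Toy stand-in for a net device with a plain usage counter."""

    def __init__(self):
        self.usage = 0

    def hold(self):
        self.usage += 1

    def put(self):
        self.usage -= 1


def add_queue(dev, hold_always, group_fails):
    """Model of adding a queue kobject for dev (hypothetical helper).

    hold_always=False: the reference is taken only on the success path.
    hold_always=True:  the reference is taken before anything can fail.
    """
    if hold_always:
        dev.hold()
    if group_fails:
        # Error path: dropping the last kobject reference runs the release
        # callback, which drops the device reference.
        dev.put()
        return False
    if not hold_always:
        dev.hold()
    return True


hold_on_success = ToyDev()
add_queue(hold_on_success, hold_always=False, group_fails=True)

hold_first = ToyDev()
add_queue(hold_first, hold_always=True, group_fails=True)

print("hold on success only: usage =", hold_on_success.usage)
print("hold always:          usage =", hold_first.usage)
```

With the reference taken only on success, the failing path leaves the counter at -1; taking it up front keeps the hold and the put paired on both paths.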
Lukas
BR,
Jouni Högander