On Mon, Jan 08, 2024 at 04:58:06PM +0000, Lee Jones wrote:
On Mon, 08 Jan 2024, Greg Kroah-Hartman wrote:
On Mon, Jan 08, 2024 at 02:52:24PM +0000, Lee Jones wrote:
On Wed, 09 Aug 2023, Greg Kroah-Hartman wrote:
From: Duoming Zhou <duoming@zju.edu.cn>
[ Upstream commit 1e7417c188d0a83fb385ba2dbe35fd2563f2b6f3 ]
The timer dev->stat_monitor can schedule the delayed work dev->wq and the delayed work dev->wq can also arm the dev->stat_monitor timer.
When the device is detached, the net_device is deallocated, but the net_device private data could still be dereferenced in the delayed work or the timer handler. As a result, use-after-free (UAF) bugs can occur.
One racy situation is shown below:
(Thread 1)                      | (Thread 2)
lan78xx_stat_monitor()          |
...                             | lan78xx_disconnect()
lan78xx_defer_kevent()          | ...
...                             | cancel_delayed_work_sync(&dev->wq);
schedule_delayed_work()         | ...
(wait some time)                | free_netdev(net); //free net_device
lan78xx_delayedwork()           |
//use net_device private data   |
dev-> //use                     |
Although we use cancel_delayed_work_sync() to cancel the delayed work in lan78xx_disconnect(), the work could still be scheduled again from the timer handler lan78xx_stat_monitor().
Another racy situation is shown below:
(Thread 1)                      | (Thread 2)
lan78xx_delayedwork             |
mod_timer()                     | lan78xx_disconnect()
                                | cancel_delayed_work_sync()
(wait some time)                | if (timer_pending(&dev->stat_monitor))
                                | del_timer_sync(&dev->stat_monitor);
lan78xx_stat_monitor()          | ...
lan78xx_defer_kevent()          | free_netdev(net); //free
//use net_device private data   |
dev-> //use                     |
Although we use del_timer_sync() to delete the timer, timer_pending() returns 0 once the timer has fired and its handler is running. As a result, del_timer_sync() is not executed and the timer could be re-armed.
To mitigate this bug, we use timer_shutdown_sync() to shut down the timer and then use cancel_delayed_work_sync() to cancel the delayed work. As a result, the net_device can be deallocated safely.
What's more, dev->flags is set to EVENT_DEV_DISCONNECT in lan78xx_disconnect(), but it could still be set to EVENT_STAT_UPDATE in lan78xx_stat_monitor(). So this patch moves the set_bit() call after timer_shutdown_sync().
Fixes: 77dfff5bb7e2 ("lan78xx: Fix race condition in disconnect handling")
Any idea why this stopped at linux-6.4.y? The aforementioned Fixes: commit also exists in linux-6.1.y and linux-5.15.y. I don't see any earlier backport attempts or failure reports that would otherwise explain this.
Did you try to build it:
No, I just noticed that it was missing.
drivers/net/usb/lan78xx.c: In function ‘lan78xx_disconnect’:
drivers/net/usb/lan78xx.c:4234:9: error: implicit declaration of function ‘timer_shutdown_sync’ [-Werror=implicit-function-declaration]
 4234 |         timer_shutdown_sync(&dev->stat_monitor);
      |         ^~~~~~~~~~~~~~~~~~~
cc1: all warnings being treated as errors
That's a good reason to not include it...
It's a perfect reason not to include it.
The issue is not that the patch is not present. It's more the lack of transparency in terms of searchable information on why it was not included.
I was under the impression that a report is usually sent out when a patch failed to apply for any reason?
For patches that are explicitly tagged for stable inclusion, yes, that will happen. That is not the case for this commit.
For patches that only have a "Fixes:" tag on it, those are gotten to on a "best effort" basis when we get a chance, as those were obviously not explicitly asked to be backported. And when they are backported, if they fail, they will fail silently as the author/maintainer was not explicitly asking them to be applied to a stable tree, so it would just be noise to complain about it.
So, it's lucky that this patch was backported at all to any stable tree :)
thanks,
greg k-h