Re: Clock related crashes in v5.4.y-queue

3 Jan 2020


On 1/2/20 4:40 PM, Sasha Levin wrote:
...
On Thu, Jan 02, 2020 at 01:28:37PM -0800, Guenter Roeck wrote:
...
On Thu, Jan 02, 2020 at 10:01:19PM +0100, Greg Kroah-Hartman wrote:
...
On Wed, Jan 01, 2020 at 06:44:08PM -0800, Guenter Roeck wrote:
...
Hi,
I see a number of crashes in the latest v5.4.y-queue; please see below
for details. The problem bisects to commit 54a311c5d3988d ("clk: Fix memory
leak in clk_unregister()").
The context suggests recovery from a failed driver probe, and it appears
that the memory is released twice. Interestingly, I don't see the problem
in mainline.
I would suggest dropping that patch from the stable queue.
That does not look right, as you point out, so I will go drop it now.
The logic of the clk structure lifetimes seems crazy, messing with krefs
and just "knowing" the lifecycle of the other structures seems like a
problem just waiting to happen...
I agree. While the patch itself seems to be ok per Stephen's feedback,
we have to assume that there will be more secondary failures in addition
to the one I have discovered. Given that clocks are not normally
unregistered, I don't think fixing the memory leak is important enough
to risk the stability of stable releases.
With all that in mind, I'd rather have this in mainline for a prolonged
period of time before considering it for stable release (if at all).
I would very much like to circle back and add both this patch and its
fix to the stable trees at some point in the future.
If the code is good enough for mainline it should be good enough for
stable as well. If it's broken - let's fix it now instead of deferring
this to when people try to upgrade their major kernel versions.
This is where we differ strongly, and where I think the Linux community will
have to make a decision sometime soon. If "good enough for mainline" is a
relevant criterion for inclusion of a patch into stable releases, we don't
need stable releases anymore (we are backporting all bugs into those anyway).
Just use mainline.
Really, stable releases should be limited to fixing severe bugs. This is not
a fix for a severe bug, and on top of that it has side effects. True, those
side effects are that it uncovers other bugs, but that just makes it worse.
If we assume that my marginal testing covers, optimistically, 1% of the kernel,
and it discovers one bug, we have the potential of many more bugs littered
throughout the kernel which are now exposed. I really don't want to export
that risk into stable releases.
Guenter
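
The failure mode described in this thread (memory "released twice" on an error path, with object lifetime governed by a reference count) can be sketched in user-space C. This is an illustrative toy only, not the clk framework's code and not the content of the commit under discussion: the names `toy_ref`, `toy_clk`, `toy_unregister_ok`, `toy_unregister_buggy` and `count_releases` are all hypothetical, and the release hook merely counts calls rather than calling free(), so the double release can be observed without undefined behavior.

```c
/* Hypothetical stand-in for a kernel kref: just a counter. */
struct toy_ref {
    int count;
};

/* Hypothetical refcounted object, loosely playing the role of a clock. */
struct toy_clk {
    struct toy_ref ref;
    int release_calls; /* how many times the memory would have been freed */
};

/* Release hook. Real code would free the object here; this toy only
 * counts invocations so that a second release is visible, not fatal. */
static void toy_clk_release(struct toy_clk *clk)
{
    clk->release_calls++;
}

/* Drop one reference; the last reference triggers the release hook. */
static void toy_ref_put(struct toy_clk *clk)
{
    if (--clk->ref.count == 0)
        toy_clk_release(clk);
}

/* Correct teardown: the reference count alone decides when to release. */
static void toy_unregister_ok(struct toy_clk *clk)
{
    toy_ref_put(clk);
}

/* Buggy teardown (assumed shape of the bug, for illustration): the
 * object is released through the refcount AND released directly by a
 * caller that "knows" the lifecycle, so the release hook runs twice. */
static void toy_unregister_buggy(struct toy_clk *clk)
{
    toy_ref_put(clk);
    toy_clk_release(clk);
}

/* Run one teardown path on a fresh object holding a single reference
 * and report how many times it was released. */
int count_releases(int buggy)
{
    struct toy_clk clk = { .ref = { .count = 1 }, .release_calls = 0 };

    if (buggy)
        toy_unregister_buggy(&clk);
    else
        toy_unregister_ok(&clk);
    return clk.release_calls;
}
```

Calling `count_releases(0)` exercises the correct path and `count_releases(1)` the buggy one. In real code the second release would be a double free, which typically shows up only when a rarely taken path such as a failed probe is actually run.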
