On Wed, Sep 20, 2023 at 12:13:25PM -0700, Kyle Zeng wrote:
On Wed, Sep 20, 2023 at 10:01:55AM -0700, Florian Fainelli wrote:
On 9/20/23 08:18, Guenter Roeck wrote:
On 9/20/23 01:11, Greg Kroah-Hartman wrote:
On Tue, Sep 19, 2023 at 09:57:25PM -0700, Guenter Roeck wrote:
On 9/17/23 12:07, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 5.10.195 release. There are 406 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Tue, 19 Sep 2023 19:10:04 +0000. Anything received after that time might be too late.
chromeos-5.10 locks up in configfs_lookup() after the merge of v5.10.195.
I am a bit puzzled because I see
c709c7ca020a configfs: fix a race in configfs_lookup()
in v5.10.195 but not in the list of commits below. I guess I must be missing something.
It was part of the big patchset, it was posted here: https://lore.kernel.org/r/20230917191101.511939651@linuxfoundation.org
Not hidden at all :)
and was submitted here: https://lore.kernel.org/r/ZPOZFHHA0abVmGx+@westworld
Either case, the code now looks as follows.
configfs_lookup() { ... spin_lock(&configfs_dirent_lock); ... err = configfs_attach_attr(sd, dentry); ... spin_unlock(&configfs_dirent_lock); ... }
and
configfs_attach_attr(...) { ... spin_lock(&configfs_dirent_lock); ... }
which unless it is way too late here and I really need to go to sleep just won't work.
Kyle, you did the backport, any comments?
After a good night sleep, the code still looks wrong to me. Reverting the offending patch in chromeos-5.10 solved the problem there. That makes me suspect that no one actually tests configfs.
Humm indeed, looking at our testing we don't have our USB devices being tested which would exercise configfs since we switch the USB device between different configurations (mass storage, serial, networking etc.). Let me see about adding that so we get some coverage. -- Florian
Sorry for the wrong patch. My intention was to backport c42dd069be8dfc9b2239a5c89e73bbd08ab35de0 to v5.10 to avoid a race condition triggered in my test. I tested the patch with my PoC program and made sure it won't trigger the crash. But I didn't notice that it could hang the kernel. I sincerely apologize for the mistake.
My new proposed patch backports both c42dd069be8dfc9b2239a5c89e73bbd08ab35de0 and d07f132a225c013e59aa77f514ad9211ecab82ee. I made sure it does not trigger the race condition anymore. Can anyone having access to more comprehensive tests please check whether it works?
Also, I'm not sure whether it is OK or how to backport two patches in one patch. Please advise on how to do it properly.
Please backport them both individually, do not merge them together.
I'll go revert the current change now and push out a release with it so that it fixes users of this kernel tree.
thanks,
greg k-h