Re: [stable/4.14.y PATCH 0/3] mmc: Fix a potential resource leak when shutting down request queue.

19 Jun 2019


      On Tue, May 14, 2019 at 11:19:34AM +0200, Greg Kroah-Hartman wrote:
...
On Mon, May 13, 2019 at 11:55:18AM -0600, Raul E Rangel wrote:
...
I think we should cherry-pick 41e3efd07d5a02c80f503e29d755aa1bbb4245de
https://lore.kernel.org/patchwork/patch/856512/ into 4.14. It fixes a
potential resource leak when shutting down the request queue.
Potential meaning "it does happen", or "it can happen if we do this", or
just "maybe it might happen, we really do not know?"
It does happen if the AMD SDHCI patches are cherry-picked into 4.14.
https://lkml.org/lkml/2019/5/1/398
It can be mitigated by changing the line in the patch with
`mmc_detect_change(host, 0)` to mmc_detect_change(host, 200)`, but
that's just a workaround to play with the timing so the race condition
doesn't happen.
...
...
Once this patch is applied, there is a potential for a null pointer dereference.
That's what the second patch fixes.
What is the git id of that upstream fix?
So there is no specific upstream fix. There was a large patch set that
migrated mmc to using blk-mq, so the bug just kind of went away.
https://lwn.net/Articles/739774/ or 0fbfd12518303e9b32ac9fd231439459eac848f9
...
...
The third patch is just an optimization to stop processing earlier.
That's not how stable kernels work :(
Oops, I guess we can ignore that patch. It just prevents mmc_init_request
from being called, but it doesn't matter since the 2nd patch actually
checks for NULL now.
...
...
See https://patchwork.kernel.org/patch/10925469/ for the initial motivation.
I don't understand the motivation from that link at all :(
...
This commit applies to v4.14.116. It is already included in 4.19. 4.19 doesn't
suffer from the null pointer dereference because later commits migrate the mmc
stack to blk-mq.
What are those later commits?
Commit 0fbfd12518303e9b32ac9fd231439459eac848f9 specifically deletes the
code for the 2nd patch. As I said above, the NULL pointer dereference
just kind of went away as part of the blk-mq migration, so there is no
upstream fix :(
...
...
I tested this patch set by randomly connecting/disconnecting the SD
card. I got over 189650 itarations without a problem.
And if you do not have these patches, on 4.14.y, how many iterations
cause a problem?  If you just apply the first patch, does that work?
If I apply the AMD SDHCI patches and nothing else, then I can cause a
resource leak within 10 iterations. If I apply just the first patch then
I can cause a NULL pointer error within 10 iterations. If I apply both 1
and 2, then everything works as expected and I can't cause a problem.
...
_EVERY_ time we take a patch that is not upstream, something usually is
broken and needs to be fixed.  We have a long long long history of this,
so if you want to have a patch that is not upstream applied to a stable
kernel release, you need a whole lot of justification and explanation
and begging.  And you need to be around to fix the fallout for when it
breaks :)
It also looks like 2361bfb055f948eac6583fa3c75a014da84fe554 includes a
fix for 41e3efd07d5a02c80f503e29d755aa1bbb4245de, so that would need to
be cherry picked in.
I guess I should have included a fixes: line in my second patch.
So to summarize:
- cherry-pick 41e3efd07d5a02c80f503e29d755aa1bbb4245de
- cherry-pick 2361bfb055f948eac6583fa3c75a014da84fe554
- apply 2nd patch but add to commit message:
  Fixes: 41e3efd07d5a ("mmc: block: Simplify cleaning up the queue")
- Ignore patch 3 since it's an optimization.
...
thanks,
greg k-h
Thanks again for your time, and sorry for the really late response!
Raul

2026

2025

2024

2023

2022

2021

2020

2019

2018

2017

Re: [stable/4.14.y PATCH 0/3] mmc: Fix a potential resource leak when shutting down request queue.