On 11/4/24 6:17 AM, Andrew Marshall wrote:
On Sun, Nov 3, 2024, at 23:25, Andrew Marshall wrote:
On Sun, Nov 3, 2024, at 21:38, Jens Axboe wrote:
On 11/3/24 5:06 PM, Jens Axboe wrote:
On 11/3/24 5:01 PM, Keith Busch wrote:
On Sun, Nov 03, 2024 at 04:53:27PM -0700, Jens Axboe wrote:
On 11/3/24 4:47 PM, Andrew Marshall wrote:
> I identified f4ce3b5d26ce149e77e6b8e8f2058aa80e5b034e as the likely
> problematic commit simply by browsing git log. As indicated above,
> reverting that atop 6.6.59 results in success. Since it is passing on
> 6.11.6, I suspect there is some missing backport to 6.6.x, or some
> other semantic merge conflict. Unfortunately I do not have a compact,
> minimal reproducer, but can provide my large one (it is testing a
> larger build process in a VM) if needed. There are some additional
> details in the above-linked downstream bug report, though. I hope that
> having identified the problematic commit is enough for someone with
> more context to go off of. Happy to provide more information if
> needed.
Don't worry about not having a reproducer, having the backport commit pinpointed will do just fine. I'll take a look at this.
I think stable is missing:
6b231248e97fc3 ("io_uring: consolidate overflow flushing")
I think you need to go back further than that, this one already unconditionally holds ->uring_lock around overflow flushing...
Took a look, it's this one:
commit 8d09a88ef9d3cb7d21d45c39b7b7c31298d23998
Author: Pavel Begunkov <asml.silence@gmail.com>
Date:   Wed Apr 10 02:26:54 2024 +0100

    io_uring: always lock __io_cqring_overflow_flush
Greg/stable, can you pick this one for 6.6-stable? It picks cleanly.
For 6.1, which is the other stable of that age that has the backport, the attached patch will do the trick.
With that, I believe it should be sorted. Hopefully that can make 6.6.60 and 6.1.116.
--
Jens Axboe

Attachments:
- 0001-io_uring-always-lock-__io_cqring_overflow_flush.patch
Cherry-picking 6b231248e97fc3 onto 6.6.59, I can confirm it passes my reproducer (run a few times). Your first quick patch also passed, for what it's worth. Thanks for the quick responses!
Correction: I cherry-picked and tested 8d09a88ef9d3cb7d21d45c39b7b7c31298d23998 (which was the change you identified), not 6b231248e97fc3. Apologies for any confusion.
Thanks for clarifying, so it's as expected. Hopefully -stable can pick this backport up soonish, so the next stable release will be sorted. Thanks for reporting the issue!