On 2/20/24 16:52, Sven van Ashbrook wrote:
Takashi, Vlastimil: thanks so much for the engagement! See below.
On 2/19/24 12:36, Takashi Iwai wrote:
Karthikeyan, Sven, and co: could you guys show the stack trace at the stall? This may shed more light on it.
Here are our notes of the indefinite stall we saw on v5.10 with iommu SoCs. We did not pursue debugging the stall at the time, in favour of a work-around with the gfp flags. Therefore we only have partial confidence in the notes below. Take them with a block of salt, but they may point in a useful direction.
1. try to do a "costly" allocation (order > PAGE_ALLOC_COSTLY_ORDER) with __GFP_RETRY_MAYFAIL set.
2. page alloc's __alloc_pages_slowpath [1] tries to get a page from the freelist. This fails because there is nothing free of that costly order.
3. page alloc tries to reclaim by calling __alloc_pages_direct_reclaim, which bails out [2] because a zone is ready to be compacted; it pretends to have made a single page of progress.
4. page alloc tries to compact, but this always bails out early [3] because __GFP_IO is not set (it's not passed by the snd allocator, and even if it were, we are suspending so the __GFP_IO flag would be cleared anyway).
5. page alloc believes reclaim progress was made (because of the pretense in item 3) and so it checks whether it should retry compaction. The compaction retry logic [4] thinks it should try again, because:
   a) reclaim is needed because of the early bail-out in item 4
   b) a zonelist is suitable for compaction
6. goto 2. indefinite stall.
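To make the loop in the notes above easier to follow, here is a toy model of it. This is only a sketch and not kernel code: the Request class and the functions direct_reclaim, compact, should_retry_compaction and alloc_slowpath are hypothetical names, the string results are invented for illustration, and the assumption that compaction simply succeeds when __GFP_IO is set is a simplification. The max_iterations cap exists only so the model terminates; hitting it stands in for the indefinite stall.

```python
from dataclasses import dataclass

# Value of PAGE_ALLOC_COSTLY_ORDER in mainline kernels.
PAGE_ALLOC_COSTLY_ORDER = 3


@dataclass
class Request:
    """Hypothetical stand-in for an allocation request."""
    order: int
    gfp_io: bool          # models __GFP_IO
    retry_mayfail: bool   # models __GFP_RETRY_MAYFAIL


def direct_reclaim() -> int:
    # Reclaim bails out because a zone is ready to be compacted,
    # but still reports a single page of progress.
    return 1


def compact(req: Request) -> str:
    # Compaction bails out early when __GFP_IO is not set.
    # Assumption for this toy model: with __GFP_IO it just succeeds.
    return "success" if req.gfp_io else "skipped"


def should_retry_compaction(progress: int, result: str) -> bool:
    # Retry logic as described in the notes: reclaim looked productive
    # and compaction was skipped, so trying again seems worthwhile.
    return progress > 0 and result == "skipped"


def alloc_slowpath(req: Request, max_iterations: int = 1000):
    """Return (outcome, iterations) for a costly __GFP_RETRY_MAYFAIL request."""
    if req.order <= PAGE_ALLOC_COSTLY_ORDER or not req.retry_mayfail:
        raise ValueError("model only covers costly __GFP_RETRY_MAYFAIL requests")
    for iteration in range(1, max_iterations + 1):
        # The freelist lookup fails here: nothing free at this order.
        progress = direct_reclaim()
        result = compact(req)
        if result == "success":
            return "allocated", iteration
        if not should_retry_compaction(progress, result):
            return "failed", iteration
    # Cap reached: in the real report nothing breaks the loop.
    return "stalled", max_iterations


if __name__ == "__main__":
    print(alloc_slowpath(Request(order=4, gfp_io=False, retry_mayfail=True), 50))
    print(alloc_slowpath(Request(order=4, gfp_io=True, retry_mayfail=True)))
```

In this model the request without __GFP_IO never reaches the "failed" exit: reclaim always claims progress and compaction is always skipped, so the retry check keeps returning true until the artificial cap is hit.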
Thanks a lot, seems this can indeed happen even in 6.8-rc5. We're mishandling the case where compaction is skipped due to lack of __GFP_IO, which is indeed cleared in suspend/resume. I'll create a fix. Please don't hesitate to report such issues the next time, even if not fully debugged :)
Also, Vlastimil suggested that tracepoints would be helpful if that's really in the page allocator, too.
We might be able to generate traces by bailing out of the indefinite stall using a timer, which should hopefully give us a device that's "alive enough" to read the traces.
Can you advise which tracepoints you'd like to see? Is trace-cmd [5] suitable to capture this?
[1] https://source.chromium.org/chromiumos/chromiumos/codesearch/+/main:src/thir...
[2] https://source.chromium.org/chromiumos/chromiumos/codesearch/+/main:src/thir...
[3] https://source.chromium.org/chromiumos/chromiumos/codesearch/+/main:src/thir...
[4] https://source.chromium.org/chromiumos/chromiumos/codesearch/+/main:src/thir...
[5] https://chromium.googlesource.com/chromiumos/docs/+/HEAD/kernel_development....