Sorry, I may not have sent the email correctly. I will resend it.
On Thu, 27 Oct 2022 20:26:04 +0000 Andrew Morton <akpm@linux-foundation.org> wrote:
On Wed, 26 Oct 2022 20:24:38 +0900 NARIBAYASHI Akira <a.naribayashi@fujitsu.com> wrote:
Depending on the memory configuration, isolate_freepages_block() may scan pages outside the target range and cause a panic.
The problem is that the pfn passed as an argument to fast_isolate_around() can be outside the target range. Therefore we should consider the case where pfn < start_pfn, and also the case where end_pfn < pfn.
This problem should have been addressed by commit 6e2b7044c199 ("mm, compaction: make fast_isolate_freepages() stay within zone"), but there was an oversight.
Case1: pfn < start_pfn
  <at memory compaction for node Y>
  |  node X's zone  | node Y's zone
  +-----------------+------------------------------...
   pageblock    ^   ^     ^
  +-----------+-----------+-----------+-----------+...
                ^   ^     ^
                ^   ^      end_pfn
                ^    start_pfn = cc->zone->zone_start_pfn
                 pfn
                <---------> scanned range by "Scan After"
Case2: end_pfn < pfn
  <at memory compaction for node X>
  |  node X's zone  | node Y's zone
  +-----------------+------------------------------...
   pageblock  ^     ^   ^
  +-----------+-----------+-----------+-----------+...
              ^     ^   ^
              ^     ^    pfn
              ^      end_pfn
               start_pfn
              <---------> scanned range by "Scan Before"
It seems that there is no good reason to skip nr_isolated pages just after the given pfn. So let's perform a simple scan from start to end instead of dividing the scan into "Before" and "After".
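To illustrate the idea, here is a minimal userspace sketch of a single clamped scan range. It is not the actual patch: struct zone_range, pb_start(), pb_end(), scan_range() and the pageblock size of 512 pages are all hypothetical stand-ins chosen for the example. The range is the pageblock around pfn, clamped to the zone, so it stays inside the zone in both Case1 and Case2.

```c
#include <stdbool.h>

/* Assumption for this sketch: 512 pages per pageblock (power of two). */
#define PAGEBLOCK_NR_PAGES 512UL

/* Hypothetical stand-in for the zone being compacted. */
struct zone_range {
	unsigned long start_pfn;	/* first pfn of the zone */
	unsigned long end_pfn;		/* one past the last pfn of the zone */
};

/* First pfn of the pageblock containing pfn. */
static unsigned long pb_start(unsigned long pfn)
{
	return pfn & ~(PAGEBLOCK_NR_PAGES - 1);
}

/* One past the last pfn of the pageblock containing pfn. */
static unsigned long pb_end(unsigned long pfn)
{
	return pb_start(pfn) + PAGEBLOCK_NR_PAGES;
}

/*
 * Compute one scan range [*start, *end): the pageblock around pfn,
 * clamped to the zone. Returns false if nothing of the pageblock
 * lies inside the zone, in which case *start and *end are untouched.
 */
static bool scan_range(const struct zone_range *z, unsigned long pfn,
		       unsigned long *start, unsigned long *end)
{
	unsigned long s = pb_start(pfn);
	unsigned long e = pb_end(pfn);

	if (s < z->start_pfn)
		s = z->start_pfn;	/* covers Case1: pfn < start_pfn */
	if (e > z->end_pfn)
		e = z->end_pfn;		/* covers Case2: end_pfn < pfn */
	if (s >= e)
		return false;

	*start = s;
	*end = e;
	return true;
}
```

With a zone boundary at pfn 768 (inside the pageblock 512..1024), a pfn of 600 passed for node Y's zone gives the range 768..1024, and a pfn of 900 passed for node X's zone gives 512..768. Neither range crosses the zone boundary, even though the pfn itself is outside the zone.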
Under what circumstances will this panic occur? I assume those circumstances are pretty rare, given that 6e2b7044c1992 was nearly two years ago.
Did you consider the desirability of backporting this fix into earlier kernels?
The panic can occur on systems where a single pageblock contains multiple zones.
The reason it is rare is that it only happens in special configurations. Depending on how many similar systems there are, it may be a good idea to fix this problem for older kernels as well.
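As a rough way to picture such a configuration: a pageblock is shared by two zones when the boundary between them is not aligned to the pageblock size. The check below is only a sketch; boundary_splits_pageblock() is a hypothetical helper and the 512-page pageblock size is an assumption, since the real value depends on the kernel configuration.

```c
#include <stdbool.h>

/* Assumed pageblock size in pages; the real value is configuration dependent. */
#define ASSUMED_PAGEBLOCK_PAGES 512UL

/*
 * Returns true when a zone boundary falls inside a pageblock instead of
 * on its edge, meaning the zones on both sides share that pageblock.
 */
static bool boundary_splits_pageblock(unsigned long boundary_pfn)
{
	return (boundary_pfn % ASSUMED_PAGEBLOCK_PAGES) != 0;
}
```

Under that assumption, a boundary at pfn 768 splits a pageblock, while a boundary at pfn 1024 does not.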