On 4 Mar 2025, at 6:49, Hugh Dickins wrote:
On Wed, 26 Feb 2025, Zi Yan wrote:
This is a preparation patch; neither of the added functions is used yet.
The added __split_unmapped_folio() is able to split a folio with its mapping removed in two ways: 1) uniform split (the existing way), and 2) buddy allocator like split.
The added __split_folio_to_order() can split a folio into any lower order. For uniform split, __split_unmapped_folio() calls it once to split the given folio to the new order. For buddy allocator like split, __split_unmapped_folio() calls it (folio_order - new_order) times, each time splitting the folio that contains the given page to one lower order.
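To make the two strategies concrete, here is a toy user-space model (purely illustrative: the helper names and the order-9/page-300 numbers are made up, and this is not the kernel implementation) of how many split calls each approach makes, and which piece the buddy-like split keeps working on:

#include <stdio.h>

/* uniform split: one call turns an order-`order` folio into
 * 2^(order - new_order) folios of order `new_order` */
static void uniform_split(int order, int new_order)
{
	printf("uniform: 1 call -> %d folios of order %d\n",
	       1 << (order - new_order), new_order);
}

/* buddy-like split: (order - new_order) calls; each call splits only
 * the folio that still contains the target page index `at` to one
 * lower order, leaving the other half alone */
static void buddy_split(int order, int new_order, unsigned long at)
{
	int calls = 0;

	for (int cur = order; cur > new_order; cur--) {
		unsigned long half = 1UL << (cur - 1);

		calls++;
		printf("call %d: split order-%d, page is in %s half\n",
		       calls, cur, at < half ? "low" : "high");
		if (at >= half)
			at -= half;	/* continue inside the high buddy */
	}
	printf("buddy: %d calls -> one folio of each order %d..%d plus two of order %d\n",
	       calls, new_order + 1, order - 1, new_order);
}

int main(void)
{
	uniform_split(9, 0);	/* e.g. a PMD-sized folio down to base pages */
	buddy_split(9, 0, 300);	/* split around page index 300 */
	return 0;
}

So splitting order-9 to order-0 around one page leaves one folio of each order from 8 down to 1 plus two order-0 folios, rather than 512 order-0 folios.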
Signed-off-by: Zi Yan <ziy@nvidia.com>
Sorry, I'm tired and don't really want to be writing this yet, but the migrate "hotfix" has tipped my hand, and I need to get this out to you before more days pass.
Thank you for taking the time to test my patches. I really appreciate it.
I'd been unable to complete even a single iteration of my "kernel builds on huge tmpfs while swapping to SSD" testing during this current 6.14-rc mm.git cycle (6.14-rc itself fine), until this last week, when some important fixes came in; so I'm no longer getting I/O errors from ext4-on-loop0-on-huge-tmpfs, nor "Huh VM_FAULT_OOM leaked" warnings: good.
This error should be related to the other patch I sent out, which uses xas_try_split() in shmem_large_entry_split(). It is great to have you confirm that it fixed some of the bugs.
But I still can't get beyond a few iterations, a few minutes: there's some corruption of user data, which usually manifests as a kernel build failing because fixdep couldn't find some truncated-on-the-left pathname.
This patch might fix it (at least partially): https://lore.kernel.org/linux-mm/56EBE3B6-99EA-470E-B2B3-92C9C13032DF@nvidia.... Andrew picked it up yesterday.
While it definitely bisected to your folio_split() series, it's quite possible that you're merely exposing an existing bug to wider use.
I've spent the last few days trying to track this down, but have still not succeeded: I'm still getting much the same corruption. I have been folding in various fixes as I found them, even though they have not solved the main problem at all. I'll return to trying to debug the corruption "tomorrow".
Thank you very much. This patchset has not received much review yet; your help is really appreciated.
I think (might be wrong, I'm in a rush) my mods are all to this "add two new (not yet used) functions for folio_split()" patch: please merge them in if you agree.
From source inspection, it looks like a folio_set_order() was missed.
Why is swapcache only checked when folio_test_anon? I can see that you've just copied that over from the old __split_huge_page(), but it seems wrong to me here and there; I guess it is a relic from before shmem could swap out a huge page.
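To spell out the cases as I understand them (a hypothetical user-space sketch, not kernel code; the helper name and enum are made up): a folio with a mapping sits in the file/shmem page cache keyed by ->index; a swapcache folio has no mapping at this point (hence the VM_BUG_ON(mapping) in the hunk below) and is keyed by swap_cache_index(->swap); a plain anon folio is in neither xarray.

#include <stdbool.h>
#include <stdio.h>

enum split_store { STORE_NONE, STORE_PAGECACHE, STORE_SWAPCACHE };

/* Where must an after-split folio be re-inserted? (toy model) */
static enum split_store where_to_store(bool has_mapping, bool in_swapcache)
{
	if (has_mapping)
		return STORE_PAGECACHE;	/* file or shmem: mapping->i_pages */
	if (in_swapcache)
		return STORE_SWAPCACHE;	/* anon or shmem swapped out */
	return STORE_NONE;		/* fully anonymous: no xarray entry */
}

int main(void)
{
	printf("file/shmem: %d, swapcache: %d, anon: %d\n",
	       where_to_store(true, false),
	       where_to_store(false, true),
	       where_to_store(false, false));
	return 0;
}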
Doing folio_next() inside the for(;;) is unsafe in those configs which have to look up zone etc.: I got an oops from the "new_folio" loop; I didn't hit an oops from the "release" loop, but fixed that too.
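To spell out the hazard with a stand-alone example (generic linked-list code, nothing to do with the folio API; all names invented): evaluating the successor of `next` in the loop increment touches an element one step beyond the one just processed, which may already be gone or out of range; fetching the successor of the *current* element at the top of the body avoids that.

#include <stdlib.h>
#include <stdio.h>

struct node { int v; struct node *next; };

static void release_all(struct node *head)
{
	struct node *n, *next;

	/* UNSAFE shape: the increment dereferences one element too far,
	 * even when `next` is already NULL or freed:
	 *
	 *   for (n = head, next = n->next; n; n = next, next = next->next)
	 */

	/* Safe shape, as in the fix below: compute the successor of the
	 * current element before it can go away. */
	for (n = head; n; n = next) {
		next = n->next;
		free(n);
	}
}

int main(void)
{
	struct node *head = NULL;

	for (int i = 0; i < 4; i++) {
		struct node *n = malloc(sizeof(*n));
		n->v = i;
		n->next = head;
		head = n;
	}
	release_all(head);
	puts("done");
	return 0;
}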
While correcting anon versus mapping versus swap_cache, I shortened the lines by avoiding origin_folio->mapping and &release->page.
All these fixes make sense to me. Thanks again for your effort.
Hi Andrew,
Do you mind folding Hugh's fixes into this patch? Let me know if you would prefer a v10. Thanks.
Signed-off-by: Hugh Dickins <hughd@google.com>
 mm/huge_memory.c | 39 ++++++++++++++++++++-------------------
 1 file changed, 20 insertions(+), 19 deletions(-)
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 0e45937c0d91..9ce3906672b9 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -3612,7 +3612,9 @@ static void __split_folio_to_order(struct folio *folio, int new_order)
 		folio_xchg_last_cpupid(new_folio, folio_last_cpupid(folio));
 	}
 
-	if (!new_order)
+	if (new_order)
+		folio_set_order(folio, new_order);
+	else
 		ClearPageCompound(&folio->page);
 }
@@ -3682,7 +3684,9 @@ static int __split_unmapped_folio(struct folio *folio, int new_order,
 	int ret = 0;
 	bool stop_split = false;
 
-	if (folio_test_anon(folio) && folio_test_swapcache(folio)) {
+	if (folio_test_swapcache(folio)) {
+		VM_BUG_ON(mapping);
+
 		/* a swapcache folio can only be uniformly split to order-0 */
 		if (!uniform_split || new_order != 0)
 			return -EINVAL;
@@ -3750,9 +3754,8 @@ static int __split_unmapped_folio(struct folio *folio, int new_order,
 		 * is new_order, since the folio will be worked on in next
 		 * iteration.
 		 */
-		for (release = folio, next = folio_next(folio);
-		     release != end_folio;
-		     release = next, next = folio_next(next)) {
+		for (release = folio; release != end_folio; release = next) {
+			next = folio_next(release);
 			/*
 			 * for buddy allocator like split, the folio containing
 			 * page will be split next and should not be released,
@@ -3784,32 +3787,31 @@ static int __split_unmapped_folio(struct folio *folio, int new_order,
 				lru_add_page_tail(origin_folio, &release->page,
 							lruvec, list);
 
-			/* Some pages can be beyond EOF: drop them from page cache */
+			/* Some pages can be beyond EOF: drop them from cache */
 			if (release->index >= end) {
-				if (shmem_mapping(origin_folio->mapping))
+				if (shmem_mapping(mapping))
 					nr_dropped += folio_nr_pages(release);
 				else if (folio_test_clear_dirty(release))
 					folio_account_cleaned(release,
-						inode_to_wb(origin_folio->mapping->host));
+						inode_to_wb(mapping->host));
 				__filemap_remove_folio(release, NULL);
 				folio_put(release);
-			} else if (!folio_test_anon(release)) {
-				__xa_store(&origin_folio->mapping->i_pages,
-						release->index, &release->page, 0);
+			} else if (mapping) {
+				__xa_store(&mapping->i_pages,
+						release->index, release, 0);
 			} else if (swap_cache) {
 				__xa_store(&swap_cache->i_pages,
 						swap_cache_index(release->swap),
-						&release->page, 0);
+						release, 0);
 			}
 		}
 	}
 	unlock_page_lruvec(lruvec);
 
-	if (folio_test_anon(origin_folio)) {
-		if (folio_test_swapcache(origin_folio))
-			xa_unlock(&swap_cache->i_pages);
-	} else
+	if (swap_cache)
+		xa_unlock(&swap_cache->i_pages);
+	if (mapping)
 		xa_unlock(&mapping->i_pages);
 
 	/* Caller disabled irqs, so they are still disabled here */
@@ -3828,9 +3830,8 @@ static int __split_unmapped_folio(struct folio *folio, int new_order,
 	 * For buddy allocator like split, the first after-split folio is left
 	 * for caller to unlock.
 	 */
-	for (new_folio = origin_folio, next = folio_next(origin_folio);
-	     new_folio != next_folio;
-	     new_folio = next, next = folio_next(next)) {
+	for (new_folio = origin_folio; new_folio != next_folio; new_folio = next) {
+		next = folio_next(new_folio);
 		if (new_folio == page_folio(lock_at))
 			continue;
-- 
2.43.0
Best Regards,
Yan, Zi