On 4 Mar 2025, at 6:49, Hugh Dickins wrote:
On Wed, 26 Feb 2025, Zi Yan wrote:
This is a preparation patch; neither of the added functions is used yet.
The added __split_unmapped_folio() is able to split a folio with its mapping removed in two ways: 1) uniform split (the existing way), and 2) buddy allocator like split.
The added __split_folio_to_order() can split a folio into any lower order. For uniform split, __split_unmapped_folio() calls it once to split the given folio to the new order. For buddy allocator like split, __split_unmapped_folio() calls it (folio_order - new_order) times, each time splitting the folio that contains the given page to one lower order.
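To make the two strategies concrete, here is a toy user-space model (purely illustrative: the helper names and the order-9/page-300 numbers are made up, and this is not the kernel implementation) of how many split calls each approach makes, and which piece the buddy-like split keeps working on:

#include <stdio.h>

/* uniform split: one call turns an order-`order` folio into
 * 2^(order - new_order) folios of order `new_order` */
static void uniform_split(int order, int new_order)
{
	printf("uniform: 1 call -> %d folios of order %d\n",
	       1 << (order - new_order), new_order);
}

/* buddy-like split: (order - new_order) calls; each call splits only
 * the folio that still contains the target page index `at` to one
 * lower order, leaving the other half alone */
static void buddy_split(int order, int new_order, unsigned long at)
{
	int calls = 0;

	for (int cur = order; cur > new_order; cur--) {
		unsigned long half = 1UL << (cur - 1);

		calls++;
		printf("call %d: split order-%d, page is in %s half\n",
		       calls, cur, at < half ? "low" : "high");
		if (at >= half)
			at -= half;	/* continue inside the high buddy */
	}
	printf("buddy: %d calls -> one folio of each order %d..%d plus two of order %d\n",
	       calls, new_order + 1, order - 1, new_order);
}

int main(void)
{
	uniform_split(9, 0);	/* e.g. a PMD-sized folio down to base pages */
	buddy_split(9, 0, 300);	/* split around page index 300 */
	return 0;
}

So splitting order-9 to order-0 around one page leaves one folio of each order from 8 down to 1 plus two order-0 folios, rather than 512 order-0 folios.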
Signed-off-by: Zi Yan <ziy@nvidia.com>
Sorry, I'm tired and don't really want to be writing this yet, but the migrate "hotfix" has tipped my hand, and I need to get this out to you before more days pass.
Thank you for taking the time to test my patches. I really appreciate it.
I'd been unable to complete even a single iteration of my "kernel builds on huge tmpfs while swapping to SSD" testing during this current 6.14-rc mm.git cycle (6.14-rc itself fine), until this last week, when some important fixes came in; so I'm no longer getting I/O errors from ext4-on-loop0-on-huge-tmpfs, nor "Huh VM_FAULT_OOM leaked" warnings: good.
This error should be related to the other patch I sent out, which uses xas_try_split() in shmem_large_entry_split(). It is great to have you confirm that it fixed some of the bugs.
But I still can't get beyond a few iterations, a few minutes: there's some corruption of user data, which usually manifests as a kernel build failing because fixdep couldn't find some truncated-on-the-left pathname.
This patch might fix it (at least partially): https://lore.kernel.org/linux-mm/56EBE3B6-99EA-470E-B2B3-92C9C13032DF@nvidia.... Andrew picked it up yesterday.
While it definitely bisected to your folio_split() series, it's quite possible that you're merely exposing an existing bug to wider use.
I've spent the last few days trying to track this down, but have still not succeeded: I'm still getting much the same corruption. I have been folding in various fixes as I found them, even though they have not solved the main problem at all. I'll return to trying to debug the corruption "tomorrow".
Thank you very much. This patchset has not received much review yet; your help is really appreciated.
I think (might be wrong, I'm in a rush) my mods are all to this "add two new (not yet used) functions for folio_split()" patch: please merge them in if you agree.
From source inspection, it looks like a folio_set_order() was missed.
Why is swapcache only checked when folio_test_anon? I can see that you've just copied that over from the old __split_huge_page(), but it seems wrong to me here and there; I guess it is a relic from before shmem could swap out a huge page.
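To spell out the cases as I understand them (a hypothetical user-space sketch, not kernel code; the helper name and enum are made up): a folio with a mapping sits in the file/shmem page cache keyed by ->index; a swapcache folio has no mapping at this point (hence the VM_BUG_ON(mapping) in the hunk below) and is keyed by swap_cache_index(->swap); a plain anon folio is in neither xarray.

#include <stdbool.h>
#include <stdio.h>

enum split_store { STORE_NONE, STORE_PAGECACHE, STORE_SWAPCACHE };

/* Where must an after-split folio be re-inserted? (toy model) */
static enum split_store where_to_store(bool has_mapping, bool in_swapcache)
{
	if (has_mapping)
		return STORE_PAGECACHE;	/* file or shmem: mapping->i_pages */
	if (in_swapcache)
		return STORE_SWAPCACHE;	/* anon or shmem swapped out */
	return STORE_NONE;		/* fully anonymous: no xarray entry */
}

int main(void)
{
	printf("file/shmem: %d, swapcache: %d, anon: %d\n",
	       where_to_store(true, false),
	       where_to_store(false, true),
	       where_to_store(false, false));
	return 0;
}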
Doing folio_next() inside the for(;;) is unsafe in those configs which have to look up zone etc.: I got an oops from the "new_folio" loop; I didn't hit an oops from the "release" loop, but fixed that too.
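To spell out the hazard with a stand-alone example (generic linked-list code, nothing to do with the folio API; all names invented): evaluating the successor of `next` in the loop increment touches an element one step beyond the one just processed, which may already be gone or out of range; fetching the successor of the *current* element at the top of the body avoids that.

#include <stdlib.h>
#include <stdio.h>

struct node { int v; struct node *next; };

static void release_all(struct node *head)
{
	struct node *n, *next;

	/* UNSAFE shape: the increment dereferences one element too far,
	 * even when `next` is already NULL or freed:
	 *
	 *   for (n = head, next = n->next; n; n = next, next = next->next)
	 */

	/* Safe shape, as in the fix below: compute the successor of the
	 * current element before it can go away. */
	for (n = head; n; n = next) {
		next = n->next;
		free(n);
	}
}

int main(void)
{
	struct node *head = NULL;

	for (int i = 0; i < 4; i++) {
		struct node *n = malloc(sizeof(*n));
		n->v = i;
		n->next = head;
		head = n;
	}
	release_all(head);
	puts("done");
	return 0;
}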
While correcting anon versus mapping versus swap_cache, I shortened the lines by avoiding origin_folio->mapping and &release->page.
All these fixes make sense to me. Thanks again for your effort.
Hi Andrew,
Do you mind folding Hugh's fixes into this patch? Let me know if you would prefer a v10. Thanks.
Signed-off-by: Hugh Dickins <hughd@google.com>
 mm/huge_memory.c | 39 ++++++++++++++++++++-------------------
 1 file changed, 20 insertions(+), 19 deletions(-)
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 0e45937c0d91..9ce3906672b9 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -3612,7 +3612,9 @@ static void __split_folio_to_order(struct folio *folio, int new_order)
 		folio_xchg_last_cpupid(new_folio, folio_last_cpupid(folio));
 	}
 
-	if (!new_order)
+	if (new_order)
+		folio_set_order(folio, new_order);
+	else
 		ClearPageCompound(&folio->page);
 }
@@ -3682,7 +3684,9 @@ static int __split_unmapped_folio(struct folio *folio, int new_order,
 	int ret = 0;
 	bool stop_split = false;
 
-	if (folio_test_anon(folio) && folio_test_swapcache(folio)) {
+	if (folio_test_swapcache(folio)) {
+		VM_BUG_ON(mapping);
+
 		/* a swapcache folio can only be uniformly split to order-0 */
 		if (!uniform_split || new_order != 0)
 			return -EINVAL;
@@ -3750,9 +3754,8 @@ static int __split_unmapped_folio(struct folio *folio, int new_order,
 		 * is new_order, since the folio will be worked on in next
 		 * iteration.
 		 */
-		for (release = folio, next = folio_next(folio);
-		     release != end_folio;
-		     release = next, next = folio_next(next)) {
+		for (release = folio; release != end_folio; release = next) {
+			next = folio_next(release);
 			/*
 			 * for buddy allocator like split, the folio containing
 			 * page will be split next and should not be released,
@@ -3784,32 +3787,31 @@ static int __split_unmapped_folio(struct folio *folio, int new_order,
 				lru_add_page_tail(origin_folio, &release->page,
 							lruvec, list);
 
-			/* Some pages can be beyond EOF: drop them from page cache */
+			/* Some pages can be beyond EOF: drop them from cache */
 			if (release->index >= end) {
-				if (shmem_mapping(origin_folio->mapping))
+				if (shmem_mapping(mapping))
 					nr_dropped += folio_nr_pages(release);
 				else if (folio_test_clear_dirty(release))
 					folio_account_cleaned(release,
-						inode_to_wb(origin_folio->mapping->host));
+						inode_to_wb(mapping->host));
 				__filemap_remove_folio(release, NULL);
 				folio_put(release);
-			} else if (!folio_test_anon(release)) {
-				__xa_store(&origin_folio->mapping->i_pages,
-						release->index, &release->page, 0);
+			} else if (mapping) {
+				__xa_store(&mapping->i_pages,
+						release->index, release, 0);
 			} else if (swap_cache) {
 				__xa_store(&swap_cache->i_pages,
 						swap_cache_index(release->swap),
-						&release->page, 0);
+						release, 0);
 			}
 		}
 	}
 	unlock_page_lruvec(lruvec);
 
-	if (folio_test_anon(origin_folio)) {
-		if (folio_test_swapcache(origin_folio))
-			xa_unlock(&swap_cache->i_pages);
-	} else
+	if (swap_cache)
+		xa_unlock(&swap_cache->i_pages);
+	if (mapping)
 		xa_unlock(&mapping->i_pages);
 
 	/* Caller disabled irqs, so they are still disabled here */
@@ -3828,9 +3830,8 @@ static int __split_unmapped_folio(struct folio *folio, int new_order,
 	 * For buddy allocator like split, the first after-split folio is left
 	 * for caller to unlock.
 	 */
-	for (new_folio = origin_folio, next = folio_next(origin_folio);
-	     new_folio != next_folio;
-	     new_folio = next, next = folio_next(next)) {
+	for (new_folio = origin_folio; new_folio != next_folio; new_folio = next) {
+		next = folio_next(new_folio);
 		if (new_folio == page_folio(lock_at))
 			continue;
-- 
2.43.0
Best Regards,
Yan, Zi