So I think we should try to keep truncation returning -EBUSY. For the shmem case, I think it's ok to return -EINVAL. I guess we can identify such folios by checking for folio_test_swapcache().
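In code terms, a minimal sketch of that distinction might look like the following (my own illustration, untested, not taken from any posted patch; where such a check could actually live is the open question below):

	/*
	 * Illustration only: for a non-anonymous folio, both cases have
	 * folio->mapping == NULL; only the shmem case is also in the
	 * swapcache.
	 */
	if (!folio_test_anon(folio) && !folio->mapping) {
		if (folio_test_swapcache(folio))
			return -EINVAL;	/* shmem in swapcache: unsupported */
		return -EBUSY;		/* we raced with truncation */
	}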
Hmm... I don't see how to do this nicely.

Looks like we can't do it in folio_split_supported(), since it only returns a bool, not an error code.

Or should we change folio_split_supported() to return an error code directly?
On upstream, I would do something like the following (untested):
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 2f2a521e5d683..33fc3590867e2 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -3524,6 +3524,9 @@ bool non_uniform_split_supported(struct folio *folio, unsigned int new_order,
 			  "Cannot split to order-1 folio");
 		if (new_order == 1)
 			return false;
+	} else if (folio_test_swapcache(folio)) {
+		/* TODO: support shmem folios that are in the swapcache. */
+		return false;
 	} else if (IS_ENABLED(CONFIG_READ_ONLY_THP_FOR_FS) &&
 		   !mapping_large_folio_support(folio->mapping)) {
 		/*
@@ -3556,6 +3559,9 @@ bool uniform_split_supported(struct folio *folio, unsigned int new_order,
 			  "Cannot split to order-1 folio");
 		if (new_order == 1)
 			return false;
+	} else if (folio_test_swapcache(folio)) {
+		/* TODO: support shmem folios that are in the swapcache. */
+		return false;
 	} else if (new_order) {
 		if (IS_ENABLED(CONFIG_READ_ONLY_THP_FOR_FS) &&
 		    !mapping_large_folio_support(folio->mapping)) {
@@ -3619,6 +3625,15 @@ static int __folio_split(struct folio *folio, unsigned int new_order,
 	if (folio != page_folio(split_at) || folio != page_folio(lock_at))
 		return -EINVAL;
 
+	/*
+	 * Folios that just got truncated cannot get split. Signal to the
+	 * caller that there was a race.
+	 *
+	 * TODO: support shmem folios that are in the swapcache.
+	 */
+	if (!is_anon && !folio->mapping && !folio_test_swapcache(folio))
+		return -EBUSY;
+
 	if (new_order >= folio_order(folio))
 		return -EINVAL;
 
@@ -3659,17 +3674,7 @@ static int __folio_split(struct folio *folio, unsigned int new_order,
 		gfp_t gfp;
 
 		mapping = folio->mapping;
-
-		/* Truncated ? */
-		/*
-		 * TODO: add support for large shmem folio in swap cache.
-		 * When shmem is in swap cache, mapping is NULL and
-		 * folio_test_swapcache() is true.
-		 */
-		if (!mapping) {
-			ret = -EBUSY;
-			goto out;
-		}
+		VM_WARN_ON_ONCE_FOLIO(!mapping, folio);
 
 		min_order = mapping_min_folio_order(folio->mapping);
 		if (new_order < min_order) {
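To illustrate the resulting contract for callers (a hypothetical sketch, not part of the diff): -EBUSY means the caller raced with truncation and can simply skip the folio, while any other error means the split itself is unsupported.

	int ret = split_huge_page(page);

	if (ret == -EBUSY) {
		/* Folio was truncated under us; nothing left to split. */
	} else if (ret) {
		/*
		 * Split not supported for this folio/order, e.g. -EINVAL
		 * for a shmem folio in the swapcache per the checks above.
		 */
	}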
So the idea is to rule out the truncated case earlier, leaving only the swapcache case to be handled later.
Thoughts?
Probably worth mentioning that this was identified by code inspection?
Agree.
Fixes: c010d47f107f ("mm: thp: split huge page to any lower order pages")
Signed-off-by: Wei Yang <richard.weiyang@gmail.com>
Cc: Zi Yan <ziy@nvidia.com>
Cc: <stable@vger.kernel.org>
Hmm, what would this patch look like when based on current upstream? We'd likely want to get that upstream asap.
This depends on whether we want it on top of [1].

Current upstream doesn't have [1], so we'd need to fix it in two places.

Andrew mentioned in [2] that he prefers a fixup version.
As we will want to backport this patch, we likely want it to apply to current master.

But Andrew can comment on what he prefers in this case of a stable fix.