From: Zi Yan <ziy@nvidia.com>

Hi all,
File folios support any order and multi-size THP has been upstreamed[1], so both file and anonymous folios can have order > 0. Currently, split_huge_page() only splits a huge page into order-0 pages, but splitting to orders higher than 0 might make better use of large folios, if done properly. In addition, Large Block Sizes support in XFS would benefit from it during truncate[2]. This patchset adds support for splitting a large folio into folios of any lower order. The patchset is on top of mm-everything-2024-02-24-02-40.
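To make the arithmetic of a single-order split concrete, here is a small userspace sketch (not kernel code). The helper names nr_folios_after_split() and pages_per_folio() are made up for illustration and do not appear in the patchset; the only fact used is that an order-N folio covers 2^N base pages.

```c
#include <assert.h>

/* Hypothetical helper: base pages covered by one folio of a given order. */
static unsigned long pages_per_folio(unsigned int order)
{
	return 1UL << order;
}

/*
 * Hypothetical helper: number of folios obtained when one folio of
 * order `order` is split uniformly into folios of order `new_order`.
 * new_order == order means "no split" and yields 1.
 */
static unsigned long nr_folios_after_split(unsigned int order,
					   unsigned int new_order)
{
	assert(new_order <= order);
	return 1UL << (order - new_order);
}
```

For example, splitting an order-9 folio to order-0 yields 512 folios of one page each, while splitting it to order-4 yields 32 folios of 16 pages each; the total page count stays 512 in both cases.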
In addition to this implementation of split_huge_page_to_list_to_order(), a possible optimization could be splitting a large folio into arbitrary smaller folios instead of folios of a single order. As both Hugh and Ryan pointed out [3,4], splitting to a single order might not be optimal: an order-9 folio might be better split into 1 order-8, 1 order-7, ..., 1 order-1, and 2 order-0 folios, depending on subsequent folio operations. This is left as future work.
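The non-uniform split described above can be sketched in userspace as follows. This is only an illustration of the resulting order sequence, not part of this patchset; nonuniform_split_plan() and plan_total_pages() are hypothetical names. The idea is to peel off one half at each level, so the pieces are order N-1, N-2, ..., 0, plus one remaining order-0 piece.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper: fill `orders` with the orders of the folios
 * produced by a non-uniform split of one order-`order` folio.
 * Returns the number of entries written (order + 1).
 */
static size_t nonuniform_split_plan(unsigned int order,
				    unsigned int *orders, size_t cap)
{
	size_t n = 0;

	assert(cap >= (size_t)order + 1);

	/* Peel off one half at each level: order-1, order-2, ..., 0. */
	for (unsigned int o = order; o > 0; o--)
		orders[n++] = o - 1;

	/* The piece left over at the lowest level is also order-0. */
	orders[n++] = 0;

	return n;
}

/* Hypothetical helper: total base pages covered by a plan. */
static unsigned long plan_total_pages(const unsigned int *orders, size_t n)
{
	unsigned long total = 0;

	for (size_t i = 0; i < n; i++)
		total += 1UL << orders[i];
	return total;
}
```

For an order-9 folio this produces 10 folios (orders 8, 7, ..., 1, 0, 0) that together still cover 512 base pages, compared with 512 folios for a uniform split to order-0.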
Changelog
===

Since v4[5]
---
1. Picked up Matthew's order-1 folio support in the page cache patch, so
   that the XFS Large Block Sizes patchset can avoid additional code churn
   in split_huge_page_to_list_to_order().
2. Dropped the truncate change patch and corresponding testing code.
3. Removed thp_nr_pages() use in __split_huge_page() (per David
   Hildenbrand).
4. Fixed __split_page_owner() (per David Hildenbrand).
5. Changed unmap_folio() to only add TTU_SPLIT_HUGE_PMD if the folio is
   pmd mappable (per Ryan Roberts).
6. Moved the swapcached folio split warning upfront and return -EINVAL
   (per Ryan Roberts).

Since v3
---
1. Excluded shmem folios and pagecache folios without FS support from
   splitting to any order (per Hugh Dickins).
2. Allowed splitting anonymous large folios to any lower order since
   multi-size THP is upstreamed.
3. Adapted selftests code to the new framework.

Since v2
---
1. Fixed an issue in __split_page_owner() introduced during my rebase.

Since v1
---
1. Changed split_page_memcg() and split_page_owner() parameter to use
   order.
2. Used folio_test_pmd_mappable() in place of the equivalent code.

Details
===

* Patch 1 changes unmap_folio() to only add TTU_SPLIT_HUGE_PMD if the
  folio is pmd mappable.
* Patch 2 adds support for order-1 page cache folios.
* Patch 3 changes split_page_memcg() to use order instead of nr_pages.
* Patch 4 changes split_page_owner() to use order instead of nr_pages.
* Patches 5 and 6 add a new_order parameter to split_page_memcg() and
  split_page_owner() to prepare for upcoming changes.
* Patch 7 adds split_huge_page_to_list_to_order() to split a huge page to
  any lower order. The original split_huge_page_to_list() calls
  split_huge_page_to_list_to_order() with new_order = 0.
* Patch 8 adds a test API to debugfs and test cases to the
  split_huge_page_test selftests.

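The relationship between the two entry points in Patch 7 can be sketched as below. This is a userspace mock for illustration, not the actual patch: struct page and struct list_head are stand-ins, the `order` field is a hypothetical way to record the folio order, and the body of split_huge_page_to_list_to_order() is a stub that only checks that new_order is lower than the current order.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Stand-ins for the kernel types; NOT the real definitions. */
struct list_head { struct list_head *next, *prev; };
struct page { unsigned int order; /* hypothetical field for this mock */ };

/*
 * Stub of the new entry point: the real function does the actual split.
 * Here we only model "split to a strictly lower order or fail".
 */
static int split_huge_page_to_list_to_order(struct page *page,
					    struct list_head *list,
					    unsigned int new_order)
{
	(void)list;
	assert(page != NULL);

	if (new_order >= page->order)
		return -EINVAL;

	page->order = new_order;
	return 0;
}

/* The original API keeps its behavior by always asking for order-0. */
static inline int split_huge_page_to_list(struct page *page,
					  struct list_head *list)
{
	return split_huge_page_to_list_to_order(page, list, 0);
}
```

With this shape, existing callers of split_huge_page_to_list() are unaffected, and only callers that want a non-zero target order need to use the new function.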
Comments and/or suggestions are welcome.

[1] https://lore.kernel.org/all/20231207161211.2374093-1-ryan.roberts@arm.com/
[2] https://lore.kernel.org/linux-mm/20240226094936.2677493-1-kernel@pankajragha...
[3] https://lore.kernel.org/linux-mm/9dd96da-efa2-5123-20d4-4992136ef3ad@google....
[4] https://lore.kernel.org/linux-mm/cbb1d6a0-66dd-47d0-8733-f836fe050374@arm.co...
[5] https://lore.kernel.org/linux-mm/20240213215520.1048625-1-zi.yan@sent.com/

Matthew Wilcox (Oracle) (1):
  mm: Support order-1 folios in the page cache

Zi Yan (7):
  mm/huge_memory: only split PMD mapping when necessary in unmap_folio()
  mm/memcg: use order instead of nr in split_page_memcg()
  mm/page_owner: use order instead of nr in split_page_owner()
  mm: memcg: make memcg huge page split support any order split.
  mm: page_owner: add support for splitting to any order in split
    page_owner.
  mm: thp: split huge page to any lower order pages
  mm: huge_memory: enable debugfs to split huge pages to any order.

 include/linux/huge_mm.h                 |  21 ++-
 include/linux/memcontrol.h              |   4 +-
 include/linux/page_owner.h              |  14 +-
 mm/filemap.c                            |   2 -
 mm/huge_memory.c                        | 173 +++++++++++++-----
 mm/internal.h                           |   3 +-
 mm/memcontrol.c                         |  10 +-
 mm/page_alloc.c                         |   8 +-
 mm/page_owner.c                         |   6 +-
 mm/readahead.c                          |   3 -
 .../selftests/mm/split_huge_page_test.c | 115 +++++++++++-
 11 files changed, 276 insertions(+), 83 deletions(-)