On Fri, 2 May 2025 23:50:19 +0200 Petr Vaněk arkamar@atlas.cz wrote:
On XEN PV, folio_pte_batch() can incorrectly batch beyond the end of a folio due to a corner case in pte_advance_pfn(). Specifically, when the PFN following the folio maps to an invalidated MFN,
expected_pte = pte_advance_pfn(expected_pte, nr);
produces a pte_none(). If the actual next PTE in memory is also pte_none(), the pte_same() succeeds,
if (!pte_same(pte, expected_pte)) break;
the loop is not broken, and batching continues into unrelated memory.
...
Looks OK for now I guess but it looks like we should pay some attention to what types we're using.
--- a/mm/internal.h +++ b/mm/internal.h @@ -248,11 +248,9 @@ static inline int folio_pte_batch(struct folio *folio, unsigned long addr, pte_t *start_ptep, pte_t pte, int max_nr, fpb_t flags, bool *any_writable, bool *any_young, bool *any_dirty) {
- unsigned long folio_end_pfn = folio_pfn(folio) + folio_nr_pages(folio);
- const pte_t *end_ptep = start_ptep + max_nr; pte_t expected_pte, *ptep; bool writable, young, dirty;
- int nr;
- int nr, cur_nr;
if (any_writable) *any_writable = false; @@ -265,11 +263,15 @@ static inline int folio_pte_batch(struct folio *folio, unsigned long addr, VM_WARN_ON_FOLIO(!folio_test_large(folio) || max_nr < 1, folio); VM_WARN_ON_FOLIO(page_folio(pfn_to_page(pte_pfn(pte))) != folio, folio);
- /* Limit max_nr to the actual remaining PFNs in the folio we could batch. */
- max_nr = min_t(unsigned long, max_nr,
folio_pfn(folio) + folio_nr_pages(folio) - pte_pfn(pte));
Methinks max_nr really wants to be unsigned long. That will permit the cleanup of quite a bit of truncation, extension, signedness conversion and general type chaos in folio_pte_batch()'s various callers.
And...
Why does folio_nr_pages() return a signed quantity? It's a count.
And why the heck is folio_pte_batch() inlined? It's larger then my first hard disk and it has five callsites!