a.k.a "Fix rare user data corruption when using THP" :)
On Mon, 14 Aug 2023, Matthew Wilcox (Oracle) wrote:
The special casing was originally added in pre-git history; reproducing the commit log here:
commit a318a92567d77 Author: Andrew Morton akpm@osdl.org Date: Sun Sep 21 01:42:22 2003 -0700
[PATCH] Speed up direct-io hugetlbpage handling This patch short-circuits all the direct-io page dirtying logic for higher-order pages. Without this, we pointlessly bounce BIOs up to keventd all the time.
In the last twenty years, compound pages have become used for more than just hugetlb. Rewrite these functions to operate on folios instead of pages and remove the special case for hugetlbfs; I don't think it's needed any more (and if it is, we can put it back in as a call to folio_test_hugetlb()).
This was found by inspection; as far as I can tell, this bug can lead to pages used as the destination of a direct I/O read not being marked as dirty. If those pages are then reclaimed by the MM without being dirtied for some other reason, they won't be written out. Then when they're faulted back in, they will not contain the data they should. It'll take a pretty unusual setup to produce this problem with several races all going the wrong way.
This problem predates the folio work; it could for example have been triggered by mmaping a THP in tmpfs and using that as the target of an O_DIRECT read.
Fixes: 800d8c63b2e98 ("shmem: add huge pages support")
No. It's a good catch, but bug looks specific to the folio work to me.
Almost all shmem pages are dirty from birth, even as soon as they are brought back from swap; so it is not necessary to re-mark them dirty.
The exceptions are pages allocated to holes when faulted: so you did get me worried as to whether khugepaged could collapse a pmd-ful of those into a THP without marking the result as dirty.
But no, in v6.5-rc6 the collapse_file() success path has if (is_shmem) folio_mark_dirty(folio); and in v5.10 the same appears as if (is_shmem) set_page_dirty(new_page);
(IIRC, that or marking pmd dirty was missed from early shmem THP support, but fairly soon corrected, and backported to stable then. I have a faint memory of versions which assembled pmd_dirty from collected pte_dirtys.)
And the !is_shmem case is for CONFIG_READ_ONLY_THP_FOR_FS: writing into those pages, by direct IO or whatever, is already prohibited.
It's dem dirty (or not dirty) folios dat's the trouble!
Hugh
Cc: stable@vger.kernel.org Signed-off-by: Matthew Wilcox (Oracle) willy@infradead.org
block/bio.c | 46 ++++++++++++++++++++++++---------------------- 1 file changed, 24 insertions(+), 22 deletions(-)