On 16.09.25 08:23, Hugh Dickins wrote:
On Fri, 12 Sep 2025, Roy, Patrick wrote:
From: Elliot Berman <quic_eberman@quicinc.com>
When guest_memfd removes memory from the host kernel's direct map, direct map entries must be restored before the memory is freed again. To do so, ->free_folio() needs to know whether a gmem folio had its direct map entries removed in the first place. While it is possible to track this on each individual folio (e.g. via page flags), direct map removal is an all-or-nothing property of the entire guest_memfd, so it is less error-prone to just check the flag stored in the gmem inode's private data. However, by the time ->free_folio() is called, folio->mapping might already be cleared. To still allow access to the address space from which the folio was just removed, pass it in as an additional argument to ->free_folio(), as the mapping is well known to all callers.
Link: https://lore.kernel.org/all/15f665b4-2d33-41ca-ac50-fafe24ade32f@redhat.com/
Suggested-by: David Hildenbrand <david@redhat.com>
Acked-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Elliot Berman <quic_eberman@quicinc.com>
[patrick: rewrite shortlog for new usecase]
Signed-off-by: Patrick Roy <roypat@amazon.co.uk>
 Documentation/filesystems/locking.rst |  2 +-
 fs/nfs/dir.c                          | 11 ++++++-----
 fs/orangefs/inode.c                   |  3 ++-
 include/linux/fs.h                    |  2 +-
 mm/filemap.c                          |  9 +++++----
 mm/secretmem.c                        |  3 ++-
 mm/vmscan.c                           |  4 ++--
 virt/kvm/guest_memfd.c                |  3 ++-
 8 files changed, 21 insertions(+), 16 deletions(-)
diff --git a/Documentation/filesystems/locking.rst b/Documentation/filesystems/locking.rst
index aa287ccdac2f..74c97287ec40 100644
--- a/Documentation/filesystems/locking.rst
+++ b/Documentation/filesystems/locking.rst
@@ -262,7 +262,7 @@ prototypes::
 	sector_t (*bmap)(struct address_space *, sector_t);
 	void (*invalidate_folio) (struct folio *, size_t start, size_t len);
 	bool (*release_folio)(struct folio *, gfp_t);
-	void (*free_folio)(struct folio *);
+	void (*free_folio)(struct address_space *, struct folio *);
 	int (*direct_IO)(struct kiocb *, struct iov_iter *iter);
 	int (*migrate_folio)(struct address_space *, struct folio *dst,
 			struct folio *src, enum migrate_mode);
Beware, that is against the intent of free_folio().
Since its 2.6.37 origin in 6072d13c4293 ("Call the filesystem back whenever a page is removed from the page cache"), freepage() or free_folio() has intentionally NOT taken a struct address_space *mapping, because that structure may already be freed by the time free_folio() is called, if the last folio holding it has now been freed.
Thanks for noticing that Hugh, very good point!
Maybe something has changed since then, or maybe it happens to be safe just in the context in which you want to use it; but it is against the principle of free_folio(). (Maybe an rcu_read_lock() could be added in __remove_mapping() to make it safe nowadays? maybe not welcome.)
Let me dig into the callers:
1) filemap_free_folio()
filemap_free_folio() looks up the callback through mapping->a_ops->free_folio. Nothing happens in between the lookup and the callback, so we should be good.
2) replace_page_cache_folio()
replace_page_cache_folio() similarly looks up the callback through mapping->a_ops->free_folio. We do some operations afterwards, but essentially store the new folio in the page cache and remove the old one.
The only caller is fuse_try_move_folio(), and IIUC both folios are locked, preventing concurrent truncation and the mapping going away.
3) __remove_mapping()
__remove_mapping() also looks up the callback through mapping->a_ops->free_folio.
Before calling free_folio(), we remove the folio from the page cache (__filemap_remove_folio()), drop the locks, and only then invoke free_folio().
We're only holding the folio lock at that point.
So yes, I agree: truncate_inode_pages_final() could be racing with __remove_mapping(). That's probably exactly what the docs describe regarding reclaim.
rcu_read_lock() should indeed work, or some other mechanism that keeps truncate_inode_pages_final() from succeeding in this racy situation.
Alternatively I guess we would have to use another callback.