On Tue, Dec 16, 2025 at 03:53:17AM +0000, Matthew Wilcox wrote:
On Tue, Dec 16, 2025 at 11:12:21AM +0800, Jinchao Wang wrote:
On Tue, Dec 16, 2025 at 02:42:06AM +0000, Matthew Wilcox wrote:
On Tue, Dec 16, 2025 at 09:37:51AM +0800, Jinchao Wang wrote:
On Mon, Dec 15, 2025 at 02:22:23PM +0000, Matthew Wilcox wrote:
On Mon, Dec 15, 2025 at 10:19:00PM +0800, Jinchao Wang wrote:
page_cache_ra_order() and page_cache_ra_unbounded() read mapping minimum folio constraints before taking the invalidate lock, allowing concurrent changes to violate page cache invariants.
Move the lookups under filemap_invalidate_lock_shared() to ensure readahead allocations respect the mapping constraints.
Why are the mapping folio size constraints being changed? They're supposed to be set at inode instantiation and then never changed.
They can change after instantiation for block devices. In the syzbot repro: blkdev_ioctl() -> blkdev_bszset() -> set_blocksize() -> mapping_set_folio_min_order()
Oh, this is just syzbot doing stupid things. We should probably make blkdev_bszset() fail if somebody else has an fd open.
Thanks, that makes sense. Tightening blkdev_bszset() would avoid the race entirely. This change is meant as a defensive fix to prevent BUGs.
Yes, but the point is that there's a lot of code which relies on the AS_FOLIO bits not changing in the middle. Syzbot found one of them, but there are others.
I've been thinking about this more, and I wanted to share another perspective if that's okay.
Rather than tracking down every place that might change AS_FOLIO bits (like blkdev_bszset() and potentially others), what if we make the page cache layer itself robust against such changes?
The invalidate_lock was introduced for exactly this kind of protection (commit 730633f0b7f9: "mm: Protect operations adding pages to page cache with invalidate_lock"). This way, the page cache doesn't need to rely on assumptions about what upper layers might do.
The readahead functions already hold filemap_invalidate_lock_shared(), so moving the constraint reads under the lock adds no overhead. It would protect against AS_FOLIO changes regardless of their source.
I think this separates concerns nicely: upper layers can change constraints through the invalidate_lock protocol, and page cache operations are automatically safe. But I'd really value your thoughts on this approach - you have much more experience with these tradeoffs than I do.
Thanks again for taking the time to discuss this.