On Wed, Feb 26, 2025 at 9:04 AM Matthew Wilcox <willy@infradead.org> wrote:
> On Wed, Feb 26, 2025 at 06:48:15AM -0500, Brian Geffon wrote:
> > When handling faults for anon shmem finish_fault() will attempt to install ptes for the entire folio. Unfortunately if it encounters a single non-pte_none entry in that range it will bail, even if the pte that triggered the fault is still pte_none. When this situation happens the fault will be retried endlessly never making forward progress.
> >
> > This patch fixes this behavior and if it detects that a pte in the range is not pte_none it will fall back to setting just the pte for the address that triggered the fault.
>
> Surely there's a similar problem in do_anonymous_page()?
>
> At any rate, what a horrid function finish_fault() has become. Special cases all over the place. What we should be doing is deciding the range of PTEs to insert, bounded by the folio, the VMA and any non-none entries. Maybe I'll get a chance to fix this up.

I agree. I wasn't thrilled that the fix looked like this, but I was trying to keep the change minimal to aid in backporting to stable kernels where this behavior is broken. That said, do you have a preference for a minimal way to fix this before finish_fault() gets a proper cleanup?
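
To make the two options concrete, here is a rough userspace model of both. It is only a sketch and not kernel code: a "pte" is a plain integer where 0 stands in for pte_none, and every name in it (model_pte_t, struct model_range, model_pick_fallback(), model_pick_trimmed()) is hypothetical and exists only for illustration. The first helper models the minimal fallback (whole folio or just the faulting pte); the second models the range-based idea, assuming the caller has already intersected the folio and the VMA into the window [lo, hi).

```c
#include <stdbool.h>
#include <stddef.h>

/* Userspace model only: 0 stands in for pte_none, anything else is populated. */
typedef unsigned long model_pte_t;

struct model_range {
	size_t start;	/* first index to map */
	size_t nr;	/* number of entries to map, 0 means nothing to do */
};

static bool model_range_is_none(const model_pte_t *ptes, size_t start, size_t nr)
{
	for (size_t i = 0; i < nr; i++)
		if (ptes[start + i] != 0)
			return false;
	return true;
}

/*
 * Minimal fallback: map the whole folio range if every entry is none,
 * otherwise map only the faulting entry. If the faulting entry itself is
 * already populated there is nothing left to map.
 */
static struct model_range model_pick_fallback(const model_pte_t *ptes,
					      size_t folio_start, size_t folio_nr,
					      size_t fault_idx)
{
	struct model_range r = { fault_idx, 0 };

	if (ptes[fault_idx] != 0)
		return r;
	if (model_range_is_none(ptes, folio_start, folio_nr)) {
		r.start = folio_start;
		r.nr = folio_nr;
	} else {
		r.nr = 1;
	}
	return r;
}

/*
 * Range-based alternative: grow outward from the faulting entry until the
 * window [lo, hi) ends or a populated entry is hit.
 * Assumes lo <= fault_idx < hi.
 */
static struct model_range model_pick_trimmed(const model_pte_t *ptes,
					     size_t lo, size_t hi,
					     size_t fault_idx)
{
	struct model_range r = { fault_idx, 0 };
	size_t first = fault_idx, last = fault_idx;

	if (ptes[fault_idx] != 0)
		return r;
	while (first > lo && ptes[first - 1] == 0)
		first--;
	while (last + 1 < hi && ptes[last + 1] == 0)
		last++;
	r.start = first;
	r.nr = last - first + 1;
	return r;
}
```

In this model, both helpers always make progress when the faulting entry is still none, which is the property the current code lacks. Whether mapping a trimmed sub-range is actually acceptable in finish_fault() is a separate question that this sketch does not try to answer.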