On 10/21/24 19:14, Lorenzo Stoakes wrote:
On Mon, Oct 21, 2024 at 07:00:53PM +0200, David Hildenbrand wrote:
Also the existing logic is that existing markers (HW poison, uffd-simulated HW poison, uffd wp marker) are retained and no error raised on MADV_DONTNEED, and no error on MADV_FREE either, so it'd be consistent with existing behaviour.
HW poison / uffd-simulated HW poison are expected to be zapped: it's just like a mapped page with HWPOISON. So that is correct.
Well, poison is _not_ zapped on MADV_DONTNEED but _is_ on MADV_FREE :) anyway, I
Huh?
madvise_dontneed_single_vma()->zap_page_range_single(details=NULL)->unmap_single_vma(details=NULL) ... zap_pte_range()
} else if (is_hwpoison_entry(entry) || is_poisoned_swp_entry(entry)) { if (!should_zap_cows(details)) continue; ...
Should just zap them.
What am I missing?
Yeah ok it's me who's missing something here, I hadn't noticed details == NULL so should_zap_cows() is true, my mistake!
Well, good to know it's consistent then. As I've explained I see why zapping actual hwpoison makes sense for MADV_DONTNEED/MADV_FREE. That it's done also for uffd poison is not completely clear, but maybe it was just easier to implement. But it doesn't mean we have to do the same for GUARD PTEs. Either behavior of zap/ignore/error could be valid, we just have to pick one and then live with it as it can't change :) Zapping guards on DONTNEED/FREE seems wrong to me, so it's between error (and potentially catching some misuse) and ignore (potentially more performant in case somebody wants to DOTNEED/FREE an area that contains scattered guards).
And the impossibility to meaningfully unwind on errors in the middle of the operation (unless we pre-scan for guards) isn't exactly nice, so maybe just ignore, i.e. the current approach?
In any case we explicitly add code here to prevent guard pages from going. I will correct everything where I wrongly say otherwise, doh!
mean the MADV flags are a confusing mess generally, as per Vlasta's comments which to begin with I strongly disagreed with then, discussing further, realsed that no this is just a bit insane and had driven _me_ insane.
UFFD-WP behavior is ... weird. Would not expect MADV_DONTNEED to zap uffd-wp entries.
Also semantically you are achieving what the calls expect you are freeing the ranges since the guard page regions are unbacked so are already freed... so yeah I don't think an error really makes sense here.
I you compare it to a VMA hole, it make sense to fail. If we treat it like PROT_NONE, it make sense to skip them.
We might also be limiting use cases by assuming they might _only_ be used for allocators and such.
I don't buy that as an argument, sorry :)
"Let's map the kernel writable into all user space because otherwise we might be limiting use cases"
That's a great idea! Patch series incoming, 1st April 2025... :>)
:) Just flip the bit on x86 and we're done!
;)
:P
-- Cheers,
David / dhildenb
Overall I think just always leaving in place except on remedy err sorry sorry unpoison and munmap and not returning an error if encountered elsewhere (other than, of course, GUP) is the right way forward and most in line with user expectation and practical usage.
Fine with me, make sure to document that is behaves like a PROT_NONE VMA, not like a memory hole, except when something would trigger a fault (GUP etc).
Ack will make sure to document.
-- Cheers,
David / dhildenb