On Fri, Oct 11, 2024 at 08:11:32PM +0200, Jann Horn wrote:
On Fri, Sep 27, 2024 at 2:51 PM Lorenzo Stoakes lorenzo.stoakes@oracle.com wrote:
Add a new PTE marker that results in any access causing the accessing process to segfault.
[...]
static inline int is_poisoned_swp_entry(swp_entry_t entry) +{
/*
* We treat guard pages as poisoned too as these have the same semantics
* as poisoned ranges, only with different fault handling.
*/
return is_pte_marker_entry(entry) &&
(pte_marker_get(entry) &
(PTE_MARKER_POISONED | PTE_MARKER_GUARD));
+}
This means MADV_FREE will also clear guard PTEs, right?
Yes, this is expected, it acts like unmap in effect (with a delayed effect), so we give it the same semantics. The same thing happens with hardware poisoning.
You can see in the tests what expectations we have with different operations, we assert there this specific behaviour:
/* Lazyfree range. */ ASSERT_EQ(madvise(ptr, 10 * page_size, MADV_FREE), 0);
/* This should simply clear the poison markers. */ for (i = 0; i < 10; i++) { ASSERT_TRUE(try_read_write_buf(&ptr[i * page_size])); }
The tests somewhat self-document expected behaviour.
diff --git a/mm/memory.c b/mm/memory.c index 5c6486e33e63..6c413c3d72fd 100644 --- a/mm/memory.c +++ b/mm/memory.c @@ -1457,7 +1457,7 @@ static inline bool should_zap_folio(struct zap_details *details, return !folio_test_anon(folio); }
-static inline bool zap_drop_file_uffd_wp(struct zap_details *details) +static inline bool zap_drop_markers(struct zap_details *details) { if (!details) return false; @@ -1478,7 +1478,7 @@ zap_install_uffd_wp_if_needed(struct vm_area_struct *vma, if (vma_is_anonymous(vma)) return;
if (zap_drop_file_uffd_wp(details))
if (zap_drop_markers(details)) return; for (;;) {
@@ -1673,7 +1673,15 @@ static unsigned long zap_pte_range(struct mmu_gather *tlb, * drop the marker if explicitly requested. */ if (!vma_is_anonymous(vma) &&
!zap_drop_file_uffd_wp(details))
!zap_drop_markers(details))
continue;
} else if (is_guard_swp_entry(entry)) {
/*
* Ordinary zapping should not remove guard PTE
* markers. Only do so if we should remove PTE markers
* in general.
*/
if (!zap_drop_markers(details)) continue;
Just a comment: It's nice that the feature is restricted to anonymous VMAs, otherwise we'd have to figure out here what to do about unmap_mapping_folio() (which sets ZAP_FLAG_DROP_MARKER together with details.single_folio)...
Yes this is not the only issue with file-backed mappings. Readahead being another, and plenty more.
We will probably look at how we might do this once this patch set lands, and tackle all of these fun things then...
} else if (is_hwpoison_entry(entry) || is_poisoned_swp_entry(entry)) {
@@ -4005,6 +4013,10 @@ static vm_fault_t handle_pte_marker(struct vm_fault *vmf) if (marker & PTE_MARKER_POISONED) return VM_FAULT_HWPOISON;
/* Hitting a guard page is always a fatal condition. */
if (marker & PTE_MARKER_GUARD)
return VM_FAULT_SIGSEGV;
if (pte_marker_entry_uffd_wp(entry)) return pte_marker_handle_uffd_wp(vmf);
-- 2.46.2