Currently there is no means of determining whether a given page in a mapping range is designated a guard region (as installed via madvise() using the MADV_GUARD_INSTALL flag).
This is generally not an issue, but in some instances users may wish to determine whether this is the case.
This series adds this ability via /proc/$pid/pagemap, updates the documentation and adds a self test to assert that this functions correctly.
Lorenzo Stoakes (2):
  fs/proc/task_mmu: add guard region bit to pagemap
  tools/selftests: add guard region test for /proc/$pid/pagemap

 Documentation/admin-guide/mm/pagemap.rst   |  3 +-
 fs/proc/task_mmu.c                         |  6 ++-
 tools/testing/selftests/mm/guard-regions.c | 47 ++++++++++++++++++++++
 tools/testing/selftests/mm/vm_util.h       |  1 +
 4 files changed, 55 insertions(+), 2 deletions(-)

--
2.48.1
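As an illustration of how userspace might consume the new bit, here is a minimal sketch, not part of the series. It assumes the bit 58 layout proposed here; the names `pagemap_offset()`, `pagemap_entry_is_guard()` and `addr_is_guard_region()` are hypothetical helpers invented for this example. On kernels without this series bit 58 is documented as zero, so a result of 0 cannot distinguish "not a guard region" from "not supported".

```c
#define _GNU_SOURCE
#include <assert.h>
#include <fcntl.h>
#include <stdint.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Bit 58 as proposed by this series; userspace must supply its own mask. */
#define PM_GUARD_REGION (1ULL << 58)

/* pagemap holds one 64-bit entry per virtual page, indexed by page number. */
static uint64_t pagemap_offset(uint64_t addr, uint64_t page_size)
{
	assert(page_size > 0);
	return (addr / page_size) * sizeof(uint64_t);
}

/* Returns 1 if the entry has the guard region bit set, 0 otherwise. */
static int pagemap_entry_is_guard(uint64_t entry)
{
	return (entry & PM_GUARD_REGION) != 0;
}

/*
 * Hypothetical helper: query one address in process 'pid'.
 * Returns 1 if it is a guard region, 0 if not, -1 on any error
 * (including lack of permission to open the pagemap file).
 */
static int addr_is_guard_region(pid_t pid, uint64_t addr)
{
	char path[64];
	uint64_t entry;
	long page_size = sysconf(_SC_PAGESIZE);
	int fd, ret = -1;

	if (page_size <= 0)
		return -1;

	snprintf(path, sizeof(path), "/proc/%ld/pagemap", (long)pid);
	fd = open(path, O_RDONLY);
	if (fd < 0)
		return -1;

	if (pread(fd, &entry, sizeof(entry),
		  (off_t)pagemap_offset(addr, (uint64_t)page_size)) ==
	    (ssize_t)sizeof(entry))
		ret = pagemap_entry_is_guard(entry);

	close(fd);
	return ret;
}
```

A monitoring tool could call `addr_is_guard_region(pid, addr)` for each page of interest; the range-based variant of the same idea simply repeats the read per page.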
Currently there is no means by which users can determine whether a given page in memory is in fact a guard region, that is having had the MADV_GUARD_INSTALL madvise() flag applied to it.
This is intentional, as to provide this information in VMA metadata would contradict the intent of the feature (providing a means to change fault behaviour at a page table level rather than a VMA level), and would require VMA metadata operations to scan page tables, which is unacceptable.
In many cases, users have no need to reflect and determine what regions have been designated guard regions, as it is the user who has established them in the first place.
But in some instances, such as monitoring software, or software that relies upon being able to ascertain the nature of mappings within a remote process for instance, it becomes useful to be able to determine which pages have the guard region marker applied.
This patch makes use of an unused pagemap bit (58) to provide this information.
This patch updates the documentation at the same time as making the change such that the implementation of the feature and the documentation of it are tied together.
Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
---
 Documentation/admin-guide/mm/pagemap.rst | 3 ++-
 fs/proc/task_mmu.c                       | 6 +++++-
 2 files changed, 7 insertions(+), 2 deletions(-)

diff --git a/Documentation/admin-guide/mm/pagemap.rst b/Documentation/admin-guide/mm/pagemap.rst
index caba0f52dd36..a297e824f990 100644
--- a/Documentation/admin-guide/mm/pagemap.rst
+++ b/Documentation/admin-guide/mm/pagemap.rst
@@ -21,7 +21,8 @@ There are four components to pagemap:
     * Bit  56    page exclusively mapped (since 4.2)
     * Bit  57    pte is uffd-wp write-protected (since 5.13) (see
       Documentation/admin-guide/mm/userfaultfd.rst)
-    * Bits 58-60 zero
+    * Bit  58    pte is a guard region (since 6.15) (see madvise (2) man page)
+    * Bits 59-60 zero
     * Bit  61    page is file-page or shared-anon (since 3.5)
     * Bit  62    page swapped
     * Bit  63    page present
diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
index f02cd362309a..c17615e21a5d 100644
--- a/fs/proc/task_mmu.c
+++ b/fs/proc/task_mmu.c
@@ -1632,6 +1632,7 @@ struct pagemapread {
 #define PM_SOFT_DIRTY		BIT_ULL(55)
 #define PM_MMAP_EXCLUSIVE	BIT_ULL(56)
 #define PM_UFFD_WP		BIT_ULL(57)
+#define PM_GUARD_REGION		BIT_ULL(58)
 #define PM_FILE			BIT_ULL(61)
 #define PM_SWAP			BIT_ULL(62)
 #define PM_PRESENT		BIT_ULL(63)
@@ -1732,6 +1733,8 @@ static pagemap_entry_t pte_to_pagemap_entry(struct pagemapread *pm,
 			page = pfn_swap_entry_to_page(entry);
 		if (pte_marker_entry_uffd_wp(entry))
 			flags |= PM_UFFD_WP;
+		if (is_guard_swp_entry(entry))
+			flags |= PM_GUARD_REGION;
 	}
 
 	if (page) {
@@ -1931,7 +1934,8 @@ static const struct mm_walk_ops pagemap_ops = {
  * Bit  55    pte is soft-dirty (see Documentation/admin-guide/mm/soft-dirty.rst)
  * Bit  56    page exclusively mapped
  * Bit  57    pte is uffd-wp write-protected
- * Bits 58-60 zero
+ * Bit  58    pte is a guard region
+ * Bits 59-60 zero
  * Bit  61    page is file-page or shared-anon
  * Bit  62    page swapped
  * Bit  63    page present
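The flag layout documented above (bits 55 to 63) can be decoded in userspace along the following lines. This is only a sketch under the bit assignments listed in the documentation hunk; the `pm_flags` table, the short flag names and `pm_describe()` are hypothetical and exist only for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define PM_BIT(n) (1ULL << (n))

/* Flag bits as listed in pagemap.rst, with bit 58 as added by this patch. */
static const struct {
	uint64_t mask;
	const char *name;	/* illustrative short names, not kernel ABI */
} pm_flags[] = {
	{ PM_BIT(55), "soft-dirty" },
	{ PM_BIT(56), "exclusive" },
	{ PM_BIT(57), "uffd-wp" },
	{ PM_BIT(58), "guard" },
	{ PM_BIT(61), "file" },
	{ PM_BIT(62), "swapped" },
	{ PM_BIT(63), "present" },
};

/*
 * Write a comma-separated list of the flags set in 'entry' into 'buf'.
 * Returns the number of characters written, excluding the terminator.
 * Stops early if the next flag name would not fit.
 */
static size_t pm_describe(uint64_t entry, char *buf, size_t len)
{
	size_t used = 0;

	assert(len > 0);
	buf[0] = '\0';

	for (size_t i = 0; i < sizeof(pm_flags) / sizeof(pm_flags[0]); i++) {
		int n;

		if (!(entry & pm_flags[i].mask))
			continue;

		n = snprintf(buf + used, len - used, "%s%s",
			     used ? "," : "", pm_flags[i].name);
		if (n < 0 || (size_t)n >= len - used)
			break;	/* output truncated, stop here */
		used += (size_t)n;
	}
	return used;
}
```

Keeping the bit-to-name mapping in a table means a later flag addition only needs one new row, mirroring how the kernel side adds a single `PM_*` define.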
On Fri, Feb 21, 2025 at 4:05 AM Lorenzo Stoakes <lorenzo.stoakes@oracle.com> wrote:
> Currently there is no means by which users can determine whether a given page in memory is in fact a guard region, that is having had the MADV_GUARD_INSTALL madvise() flag applied to it.
> This is intentional, as to provide this information in VMA metadata would contradict the intent of the feature (providing a means to change fault behaviour at a page table level rather than a VMA level), and would require VMA metadata operations to scan page tables, which is unacceptable.
> In many cases, users have no need to reflect and determine what regions have been designated guard regions, as it is the user who has established them in the first place.
> But in some instances, such as monitoring software, or software that relies upon being able to ascertain the nature of mappings within a remote process for instance, it becomes useful to be able to determine which pages have the guard region marker applied.
> This patch makes use of an unused pagemap bit (58) to provide this information.
> This patch updates the documentation at the same time as making the change such that the implementation of the feature and the documentation of it are tied together.
> Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
> ---
>  Documentation/admin-guide/mm/pagemap.rst | 3 ++-
>  fs/proc/task_mmu.c                       | 6 +++++-
>  2 files changed, 7 insertions(+), 2 deletions(-)
>
> diff --git a/Documentation/admin-guide/mm/pagemap.rst b/Documentation/admin-guide/mm/pagemap.rst
> index caba0f52dd36..a297e824f990 100644
> --- a/Documentation/admin-guide/mm/pagemap.rst
> +++ b/Documentation/admin-guide/mm/pagemap.rst
> @@ -21,7 +21,8 @@ There are four components to pagemap:
>      * Bit  56    page exclusively mapped (since 4.2)
>      * Bit  57    pte is uffd-wp write-protected (since 5.13) (see
>        Documentation/admin-guide/mm/userfaultfd.rst)
> -    * Bits 58-60 zero
> +    * Bit  58    pte is a guard region (since 6.15) (see madvise (2) man page)
Should this be 6.14?
Other than that:
Reviewed-by: Kalesh Singh <kaleshsingh@google.com>
Thanks,
Kalesh
On Fri, Feb 21, 2025 at 09:10:42AM -0800, Kalesh Singh wrote:
> On Fri, Feb 21, 2025 at 4:05 AM Lorenzo Stoakes <lorenzo.stoakes@oracle.com> wrote:
> > Currently there is no means by which users can determine whether a given page in memory is in fact a guard region, that is having had the MADV_GUARD_INSTALL madvise() flag applied to it.
> > This is intentional, as to provide this information in VMA metadata would contradict the intent of the feature (providing a means to change fault behaviour at a page table level rather than a VMA level), and would require VMA metadata operations to scan page tables, which is unacceptable.
> > In many cases, users have no need to reflect and determine what regions have been designated guard regions, as it is the user who has established them in the first place.
> > But in some instances, such as monitoring software, or software that relies upon being able to ascertain the nature of mappings within a remote process for instance, it becomes useful to be able to determine which pages have the guard region marker applied.
> > This patch makes use of an unused pagemap bit (58) to provide this information.
> > This patch updates the documentation at the same time as making the change such that the implementation of the feature and the documentation of it are tied together.
> > Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
> > ---
> >  Documentation/admin-guide/mm/pagemap.rst | 3 ++-
> >  fs/proc/task_mmu.c                       | 6 +++++-
> >  2 files changed, 7 insertions(+), 2 deletions(-)
> >
> > diff --git a/Documentation/admin-guide/mm/pagemap.rst b/Documentation/admin-guide/mm/pagemap.rst
> > index caba0f52dd36..a297e824f990 100644
> > --- a/Documentation/admin-guide/mm/pagemap.rst
> > +++ b/Documentation/admin-guide/mm/pagemap.rst
> > @@ -21,7 +21,8 @@ There are four components to pagemap:
> >      * Bit  56    page exclusively mapped (since 4.2)
> >      * Bit  57    pte is uffd-wp write-protected (since 5.13) (see
> >        Documentation/admin-guide/mm/userfaultfd.rst)
> > -    * Bits 58-60 zero
> > +    * Bit  58    pte is a guard region (since 6.15) (see madvise (2) man page)
>
> Should this be 6.14 ?

We're aiming for the 6.15 merge window so this is correct :>) I don't think this could be considered a hotfix haha!

> Other than that:
> Reviewed-by: Kalesh Singh <kaleshsingh@google.com>

Thanks! And thanks for review on the other patch also! Appreciated.

> Thanks,
> Kalesh
Add a test to the guard region self tests to assert that the /proc/$pid/pagemap information now made available to the user correctly identifies and reports guard regions.
As a part of this change, update vm_util.h to add the new bit (note there is no header file in the kernel where this is exposed, the user is expected to provide their own mask) and utilise the helper functions there for pagemap functionality.
Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
---
 tools/testing/selftests/mm/guard-regions.c | 47 ++++++++++++++++++++++
 tools/testing/selftests/mm/vm_util.h       |  1 +
 2 files changed, 48 insertions(+)

diff --git a/tools/testing/selftests/mm/guard-regions.c b/tools/testing/selftests/mm/guard-regions.c
index ea9b5815e828..0c7183e8b661 100644
--- a/tools/testing/selftests/mm/guard-regions.c
+++ b/tools/testing/selftests/mm/guard-regions.c
@@ -19,6 +19,7 @@
 #include <sys/syscall.h>
 #include <sys/uio.h>
 #include <unistd.h>
+#include "vm_util.h"
 
 /*
  * Ignore the checkpatch warning, as per the C99 standard, section 7.14.1.1:
@@ -2032,4 +2033,50 @@ TEST_F(guard_regions, anon_zeropage)
 	ASSERT_EQ(munmap(ptr, 10 * page_size), 0);
 }
 
+/*
+ * Assert that /proc/$pid/pagemap correctly identifies guard region ranges.
+ */
+TEST_F(guard_regions, pagemap)
+{
+	const unsigned long page_size = self->page_size;
+	int proc_fd;
+	char *ptr;
+	int i;
+
+	proc_fd = open("/proc/self/pagemap", O_RDONLY);
+	ASSERT_NE(proc_fd, -1);
+
+	ptr = mmap_(self, variant, NULL, 10 * page_size,
+		    PROT_READ | PROT_WRITE, 0, 0);
+	ASSERT_NE(ptr, MAP_FAILED);
+
+	/* Read from pagemap, and assert no guard regions are detected. */
+	for (i = 0; i < 10; i++) {
+		char *ptr_p = &ptr[i * page_size];
+		unsigned long entry = pagemap_get_entry(proc_fd, ptr_p);
+		unsigned long masked = entry & PM_GUARD_REGION_MASK;
+
+		ASSERT_EQ(masked, 0);
+	}
+
+	/* Install a guard region in every other page. */
+	for (i = 0; i < 10; i += 2) {
+		char *ptr_p = &ptr[i * page_size];
+
+		ASSERT_EQ(madvise(ptr_p, page_size, MADV_GUARD_INSTALL), 0);
+	}
+
+	/* Re-read from pagemap, and assert guard regions are detected. */
+	for (i = 0; i < 10; i++) {
+		char *ptr_p = &ptr[i * page_size];
+		unsigned long entry = pagemap_get_entry(proc_fd, ptr_p);
+		unsigned long masked = entry & PM_GUARD_REGION_MASK;
+
+		ASSERT_EQ(masked, i % 2 == 0 ? PM_GUARD_REGION_MASK : 0);
+	}
+
+	ASSERT_EQ(close(proc_fd), 0);
+	ASSERT_EQ(munmap(ptr, 10 * page_size), 0);
+}
+
 TEST_HARNESS_MAIN
diff --git a/tools/testing/selftests/mm/vm_util.h b/tools/testing/selftests/mm/vm_util.h
index b60ac68a9dc8..73a11443b7f6 100644
--- a/tools/testing/selftests/mm/vm_util.h
+++ b/tools/testing/selftests/mm/vm_util.h
@@ -10,6 +10,7 @@
 #define PM_SOFT_DIRTY         BIT_ULL(55)
 #define PM_MMAP_EXCLUSIVE     BIT_ULL(56)
 #define PM_UFFD_WP            BIT_ULL(57)
+#define PM_GUARD_REGION_MASK  BIT_ULL(58)
 #define PM_FILE               BIT_ULL(61)
 #define PM_SWAP               BIT_ULL(62)
 #define PM_PRESENT            BIT_ULL(63)
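Outside the kselftest harness, the per-page check the test performs could be sketched roughly as follows. This is an assumption-laden illustration rather than the selftest's own helper: `count_guard_pages()` is a hypothetical name, it takes any file descriptor laid out like pagemap (one 64-bit entry per page), and it relies on the bit 58 assignment from this series.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <unistd.h>

/* Bit 58 as proposed by this series; userspace must supply its own mask. */
#define PM_GUARD_REGION (1ULL << 58)

/*
 * Hypothetical helper: count pages in [start, start + npages * page_size)
 * whose pagemap entry has the guard region bit set. 'fd' is an open
 * pagemap-format file such as /proc/self/pagemap.
 * Returns the count, or -1 if any entry could not be read in full.
 */
static long count_guard_pages(int fd, uint64_t start, size_t npages,
			      uint64_t page_size)
{
	long count = 0;

	assert(page_size > 0 && start % page_size == 0);

	for (size_t i = 0; i < npages; i++) {
		uint64_t entry;
		off_t off = (off_t)((start / page_size + i) * sizeof(entry));

		if (pread(fd, &entry, sizeof(entry), off) !=
		    (ssize_t)sizeof(entry))
			return -1;
		if (entry & PM_GUARD_REGION)
			count++;
	}
	return count;
}
```

For the ten-page mapping used in the test, with a guard region installed in every other page, a caller would expect this helper to return 5 when given the mapping's start address and `npages = 10`.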
On Fri, Feb 21, 2025 at 12:05:23PM +0000, Lorenzo Stoakes wrote:
> Add a test to the guard region self tests to assert that the /proc/$pid/pagemap information now made available to the user correctly identifies and reports guard regions.
> As a part of this change, update vm_util.h to add the new bit (note there is no header file in the kernel where this is exposed, the user is expected to provide their own mask) and utilise the helper functions there for pagemap functionality.
> Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Andrew - Apologies,
I managed to not commit a change I quickly made before sending this out (I'm ill, seems it is having an impact...)
If the series is ok would you mind tacking on this fix-patch? It's simply to rename a clumsily named define here.
No functional changes...
Thanks!
----8<----
From 60be19e88b3bfe9a6ec459115f0027721c494b30 Mon Sep 17 00:00:00 2001
From: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Date: Fri, 21 Feb 2025 13:45:48 +0000
Subject: [PATCH] fixup define name
Fix badly named define so it's consistent with the others.
Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
---
 tools/testing/selftests/mm/guard-regions.c | 6 +++---
 tools/testing/selftests/mm/vm_util.h       | 2 +-
 2 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/tools/testing/selftests/mm/guard-regions.c b/tools/testing/selftests/mm/guard-regions.c
index 0c7183e8b661..280d1831bf73 100644
--- a/tools/testing/selftests/mm/guard-regions.c
+++ b/tools/testing/selftests/mm/guard-regions.c
@@ -2054,7 +2054,7 @@ TEST_F(guard_regions, pagemap)
 	for (i = 0; i < 10; i++) {
 		char *ptr_p = &ptr[i * page_size];
 		unsigned long entry = pagemap_get_entry(proc_fd, ptr_p);
-		unsigned long masked = entry & PM_GUARD_REGION_MASK;
+		unsigned long masked = entry & PM_GUARD_REGION;
 
 		ASSERT_EQ(masked, 0);
 	}
@@ -2070,9 +2070,9 @@ TEST_F(guard_regions, pagemap)
 	for (i = 0; i < 10; i++) {
 		char *ptr_p = &ptr[i * page_size];
 		unsigned long entry = pagemap_get_entry(proc_fd, ptr_p);
-		unsigned long masked = entry & PM_GUARD_REGION_MASK;
+		unsigned long masked = entry & PM_GUARD_REGION;
 
-		ASSERT_EQ(masked, i % 2 == 0 ? PM_GUARD_REGION_MASK : 0);
+		ASSERT_EQ(masked, i % 2 == 0 ? PM_GUARD_REGION : 0);
 	}
 
 	ASSERT_EQ(close(proc_fd), 0);
diff --git a/tools/testing/selftests/mm/vm_util.h b/tools/testing/selftests/mm/vm_util.h
index 73a11443b7f6..0e629586556b 100644
--- a/tools/testing/selftests/mm/vm_util.h
+++ b/tools/testing/selftests/mm/vm_util.h
@@ -10,7 +10,7 @@
 #define PM_SOFT_DIRTY         BIT_ULL(55)
 #define PM_MMAP_EXCLUSIVE     BIT_ULL(56)
 #define PM_UFFD_WP            BIT_ULL(57)
-#define PM_GUARD_REGION_MASK  BIT_ULL(58)
+#define PM_GUARD_REGION       BIT_ULL(58)
 #define PM_FILE               BIT_ULL(61)
 #define PM_SWAP               BIT_ULL(62)
 #define PM_PRESENT            BIT_ULL(63)
--
2.48.1
On Fri, Feb 21, 2025 at 5:51 AM Lorenzo Stoakes lorenzo.stoakes@oracle.com wrote:
> On Fri, Feb 21, 2025 at 12:05:23PM +0000, Lorenzo Stoakes wrote:
> > Add a test to the guard region self tests to assert that the /proc/$pid/pagemap information now made available to the user correctly identifies and reports guard regions.
> > As a part of this change, update vm_util.h to add the new bit (note there is no header file in the kernel where this is exposed, the user is expected to provide their own mask) and utilise the helper functions there for pagemap functionality.
> > Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>

Reviewed-by: Kalesh Singh <kaleshsingh@google.com>