On Wed, Feb 19, 2025 at 10:15:47AM +0100, David Hildenbrand wrote:
On 19.02.25 10:03, Lorenzo Stoakes wrote:
On Wed, Feb 19, 2025 at 12:25:51AM -0800, Kalesh Singh wrote:
On Thu, Feb 13, 2025 at 10:18 AM Lorenzo Stoakes <lorenzo.stoakes@oracle.com> wrote:
The guard regions feature was initially implemented to support anonymous mappings only, excluding shmem.
This was done so as to introduce the feature carefully and incrementally, and to be conservative when considering the various caveats and corner cases that are applicable to file-backed mappings but not to anonymous ones.
Now this feature has landed in 6.13, it is time to revisit this and to extend this functionality to file-backed and shmem mappings.
In order to make this maximally useful, and since one may map file-backed mappings read-only (for instance ELF images), we also remove the restriction on read-only mappings and permit the establishment of guard regions in any non-hugetlb, non-mlock()'d mapping.
Hi Lorenzo,
Thank you for your work on this.
You're welcome.
Have we thought about how guard regions are represented in /proc/*/[s]maps?
This is off-topic here but... Yes, extensively. No they do not appear there.
I thought you had attended LPC and my talk where I mentioned this purposefully as a drawback?
I went out of my way to advertise this limitation at the LPC talk, in the original series, etc., so it's a little disappointing that this is being brought up so late. Nobody else has raised objections to this issue, so I think in general it's not a limitation that matters in practice.
In the field, I've found that many applications read the ranges from /proc/self/[s]maps to determine what they can access (usually related to obfuscation techniques). If they don't know of the guard regions it would cause them to crash; I think that we'll need similar entries to PROT_NONE (---p) for these, and generally to maintain consistency between the actual behavior and what /proc/*/[s]maps reports.
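For readers unfamiliar with the pattern described above, here is a minimal sketch of how a program might derive "accessible" ranges from maps-style text. The names `MapsEntry`, `parse_maps_line`, `readable_ranges` and the `SAMPLE` data are hypothetical and only for illustration; the line format assumed is the usual `start-end perms offset dev inode [path]`.

```python
from typing import List, NamedTuple, Tuple


class MapsEntry(NamedTuple):
    start: int   # inclusive start address
    end: int     # exclusive end address
    perms: str   # e.g. "r-xp" or "---p"
    path: str    # empty string for anonymous mappings


def parse_maps_line(line: str) -> MapsEntry:
    # Assumed format: start-end perms offset dev inode [path]
    fields = line.split(None, 5)
    start_s, end_s = fields[0].split("-")
    path = fields[5].strip() if len(fields) > 5 else ""
    return MapsEntry(int(start_s, 16), int(end_s, 16), fields[1], path)


def readable_ranges(maps_text: str) -> List[Tuple[int, int]]:
    # Collect (start, end) for every entry whose permissions include read.
    ranges = []
    for line in maps_text.splitlines():
        if not line.strip():
            continue
        entry = parse_maps_line(line)
        if entry.perms[0] == "r":
            ranges.append((entry.start, entry.end))
    return ranges


# Made-up sample data: a file mapping, a PROT_NONE VMA, an anonymous mapping.
SAMPLE = """\
7f0000000000-7f0000004000 r--p 00000000 08:01 1234 /usr/lib/libexample.so
7f0000004000-7f0000005000 ---p 00000000 00:00 0
7f0000005000-7f0000009000 rw-p 00000000 00:00 0
"""
```

On Linux this could be fed with `readable_ranges(open("/proc/self/maps").read())`. The PROT_NONE entry is skipped because it is a separate VMA with `---p` permissions; a guard region placed inside the last mapping would not create such an entry, so this kind of scan would still report the whole range as readable.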
No, we cannot have these, sorry.
Firstly /proc/$pid/[s]maps describes VMAs. The entire purpose of this feature is to avoid having to accumulate VMAs for regions which are not intended to be accessible.
Secondly, there is no practical means for this to be accomplished in /proc/$pid/maps in _any_ way, as no metadata relating to a VMA indicates that it has guard regions.
This is intentional, because setting such metadata is simply not practical. Why? Because when you try to split the VMA, how do you know which bit gets the metadata and which doesn't? You can't without _reading page tables_.
/proc/$pid/smaps _does_ read page tables, but we can't start pretending VMAs exist when they don't. This would be completely inaccurate, would break assumptions for things like mremap (which requires a single VMA), and would be unworkable.
The best that _could_ be achieved is to have a marker in /proc/$pid/smaps saying 'hey this region has guard regions somewhere'.
And then simply expose it in /proc/$pid/pagemap, which is a better interface for this pte-level information inside of VMAs. We should still have a spare bit for that purpose in the pagemap entries.
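To make the pagemap suggestion concrete, here is a rough sketch of how a reader of /proc/$pid/pagemap might decode one 64-bit entry. The present, swapped, file/shared and soft-dirty bit positions follow the documented pagemap layout; the guard marker position is NOT fixed anywhere in this thread, so it is passed in as a hypothetical `guard_bit` parameter, and the function names are likewise made up for illustration.

```python
PAGEMAP_ENTRY_BYTES = 8        # one 64-bit entry per virtual page
PFN_MASK = (1 << 55) - 1       # bits 0-54: page frame number when present
BIT_SOFT_DIRTY = 55
BIT_FILE_OR_SHARED = 61
BIT_SWAPPED = 62
BIT_PRESENT = 63


def pagemap_offset(vaddr: int, page_size: int = 4096) -> int:
    # Byte offset in the pagemap file of the entry covering vaddr.
    return (vaddr // page_size) * PAGEMAP_ENTRY_BYTES


def decode_entry(entry: int, guard_bit: int) -> dict:
    # guard_bit is an assumption: the position of a spare bit that a
    # kernel might use to flag a guard region. It is not taken from
    # the thread or from any released kernel interface.
    def bit(n: int) -> bool:
        return bool((entry >> n) & 1)

    present = bit(BIT_PRESENT)
    return {
        "present": present,
        "swapped": bit(BIT_SWAPPED),
        "file_or_shared": bit(BIT_FILE_OR_SHARED),
        "soft_dirty": bit(BIT_SOFT_DIRTY),
        # The PFN field is only meaningful for present pages (and reads
        # as zero for callers without sufficient privileges).
        "pfn": (entry & PFN_MASK) if present else None,
        "guard": bit(guard_bit),
    }
```

A tool would seek to `pagemap_offset(addr)` in the pagemap file, read 8 bytes, convert them with `int.from_bytes(data, "little")` on a little-endian machine, and pass the result to `decode_entry`. Unlike the maps files, this works per page, which is the granularity at which guard regions exist.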
Ah yeah, thanks David, forgot about that!
This is also a possibility, if that'd solve your problems, Kalesh?
This bit will be fought over haha
-- Cheers,
David / dhildenb