On Thu, Nov 06, 2025 at 12:31:29PM +0100, Vlastimil Babka wrote:
On 11/6/25 11:46, Lorenzo Stoakes wrote:
This patch adds the ability to atomically set VMA flags with only the mmap read/VMA read lock held.
As this could be hugely problematic for VMA flags in general given that all other accesses are non-atomic and serialised by the mmap/VMA locks, we implement this with a strict allow-list - that is, only designated flags are allowed to do this.
We make VM_MAYBE_GUARD one of these flags, and then set it under the mmap read flag upon guard region installation.
The places where this flag is used currently and matter are:
VMA merge - performed under mmap/VMA write lock, therefore excluding racing writes.
/proc/$pid/smaps - can race the write, however this isn't meaningful as the flag write is performed at the point of the guard region being established, and thus an smaps reader can't reasonably expect to avoid races. Due to atomicity, a reader will observe either the flag being set or not. Therefore consistency will be maintained.
In all other cases the flag being set is irrelevant and atomicity guarantees other flags will be read correctly.
Could we maybe also spell out that we rely on the read mmap/VMA lock to exclude with writers that have write lock and then use non-atomic updates to update completely different flags than VM_MAYBE_GUARD? Those non-atomic updates could cause RMW races when only our side uses an atomic update, but the trick is that the read lock excludes with the write lock.
I thought this was implicit, I guess I can spell that out.
We additionally update madvise_guard_install() to ensure that anon_vma_prepare() is set for anonymous VMAs to maintain consistency with the assumption that any anonymous VMA with page tables will have an anon_vma set, and any with an anon_vma unset will not have page tables established.
Could we more obviously say that we did anon_vma_prepare() unconditionally before this patch to trigger the page table copying in fork, but it's not needed anymore because fork now checks also VM_MAYBE_GUARD that we're setting here. Maybe it would be even more obvious to move that vma_needs_copy() hunk from previous patch to this one, but doesn't matter that much.
I thought that was covered between the comment, the previous patch and this but I can spell it out also.
Also we could mention that this patch alone will prevent merging of VMAs in some situations, but that's addressed next. I don't think it's such a bisect hazard to need reordering or combining changes, just mention perhaps.
A little pedantic but sure :)
Signed-off-by: Lorenzo Stoakes lorenzo.stoakes@oracle.com
Otherwise LGTM.
Reviewed-by: Vlastimil Babka vbabka@suse.cz
Thanks!