On 8/5/23, Suren Baghdasaryan surenb@google.com wrote:
On Fri, Aug 4, 2023 at 6:06 PM Mateusz Guzik mjguzik@gmail.com wrote:
On 8/5/23, Linus Torvalds torvalds@linux-foundation.org wrote:
On Fri, 4 Aug 2023 at 16:25, Mateusz Guzik mjguzik@gmail.com wrote:
I know of these guys, I think they are excluded as is -- they go through access_remote_vm, starting with: if (mmap_read_lock_killable(mm)) return 0;
while dup_mmap already write locks the parent's mm.
Oh, you're only worried about vma_start_write()?
That's a non-issue. It doesn't take the lock normally, since it starts off with
if (__is_vma_write_locked(vma, &mm_lock_seq)) return;
which catches on the lock sequence number already being set.
So no extra locking there.
Well, technically there's extra locking because the code stupidly doesn't initialize new vma allocations to the right sequence number, but that was talked about here:
https://lore.kernel.org/all/CAHk-=wiCrWAoEesBuoGoqqufvesicbGp3cX0LyKgEvsFaZN...
and it's a separate issue.
I'm going to bet one beer this is the issue.
The patch I'm responding to only consists of adding the call to vma_start_write and claims the 5% slowdown from it, while fixing crashes if the forking process is multithreaded.
For the fix to work it has to lock something against the parent.
VMA_ITERATOR(old_vmi, oldmm, 0);
[..] for_each_vma(old_vmi, mpnt) { [..] vma_start_write(mpnt);
the added line locks an obj in the parent's vm space.
The problem you linked looks like pessimization for freshly allocated vmas, but that's what is being operated on here.
Sorry, now I'm having trouble understanding the problem you are describing. We are locking the parent's vma before copying it and the newly created vma is locked before it's added into the vma tree. What is the problem then?
Sorry for the late reply!
Looks there has been a bunch of weird talking past one another in this thread and I don't think trying to straighten it all out is worth any time.
I think at least the two of us agree that if a single-threaded process enters dup_mmap an down_writes the mmap semaphore, then no new thread can pop up in said process, thus no surprise page faults from that angle. 3rd parties are supposed to interfaces like access_remote_vm, which down_read said semaphore and are consequently also not a problem. The only worry here is that someone is messing with another process memory without the semaphore, but is very unlikely and patchable in the worst case -- but someone(tm) has to audit. With all these conditions satisfied one can elide vma_start_write for a perf win.
Finally, I think we agreed you are going to do the audit ;)
Cheers,