On Fri, Aug 4, 2023 at 5:49 PM Mateusz Guzik mjguzik@gmail.com wrote:
On 8/5/23, Suren Baghdasaryan surenb@google.com wrote:
On Fri, Aug 4, 2023 at 5:26 PM Suren Baghdasaryan surenb@google.com wrote:
On Fri, Aug 4, 2023 at 5:15 PM Linus Torvalds torvalds@linux-foundation.org wrote:
On Fri, 4 Aug 2023 at 16:25, Mateusz Guzik mjguzik@gmail.com wrote:
I know of these guys, I think they are excluded as is -- they go through access_remote_vm, starting with: if (mmap_read_lock_killable(mm)) return 0;
while dup_mmap already write locks the parent's mm.
Oh, you're only worried about vma_start_write()?
That's a non-issue. It doesn't take the lock normally, since it starts off with
if (__is_vma_write_locked(vma, &mm_lock_seq)) return;
which catches on the lock sequence number already being set.
That check will prevent re-locking but if vma is not already locked then the call will proceed with obtaining the lock and setting vma->vm_lock_seq to mm->mm_lock_seq.
The optimization Mateusz describes looks valid to me. If there is nobody else to fault a page and mm_users is stable (which I think it is because we are holding mmap_lock for write) then we can skip vma locking, I think.
mm_users is definitely *not* stable -- it can be bumped by get_task_mm, which is only synchronized with task lock.
Ugh, you are of course correct. Poor choice for saying no new users (threads) can appear from under us.
However, the other users (that I know of ) go through the mmap semaphore to mess with anything which means they will wait for dup_mmap to finish (or do their work first). I would be surprised if there were any cases which don't take the semaphore, given that it was a requirement prior to the vma patchset (unless you patched some to no longer need it?). I would guess worst case the semaphore can be added if missing.
No, the only mmap_lock read-lock that is affected is during the page fault, which is expected.
What is guaranteed is that if the forking process is single-threaded, there will be no threads added out of nowhere -- the only thread which could do it is busy creating one in dup_mmap. If multithreaded operation of the forking process was the only problem, that's it.
So no extra locking there.
Well, technically there's extra locking because the code stupidly doesn't initialize new vma allocations to the right sequence number, but that was talked about here:
https://lore.kernel.org/all/CAHk-=wiCrWAoEesBuoGoqqufvesicbGp3cX0LyKgEvsFaZN...
and it's a separate issue.
Linus
-- Mateusz Guzik <mjguzik gmail.com>