On Fri, 3 May 2024 at 16:39, Al Viro viro@zeniv.linux.org.uk wrote:
*IF* those files are on purely internal filesystem, that's probably OK; do that with something on something mountable (char device, sysfs file, etc.) and you have a problem with filesystem staying busy.
Yeah, I agree, it's a bit annoying in general. That said, it's easy to do: stash a file descriptor in a unix domain socket, and that's basically exactly what you have: a random reference to a 'struct file' that will stay around for as long as you just keep that socket around, long after the "real" file descriptor has been closed, and entirely separately from it.
And yes, that's exactly why unix domain socket transfers have caused so many problems over the years, with both refcount overflows and nasty garbage collection issues.
So randomly taking references to file descriptors certainly isn't new.
In fact, it's so common that I find the epoll pattern annoying, in that it does something special and *not* taking a ref - and it does that special thing to *other* ("innocent") file descriptors. Yes, dma-buf is a bit like those unix domain sockets in that it can keep random references alive for random times, but at least it does it just to its own file descriptors, not random other targets.
So the dmabuf thing is very much a "I'm a special file that describes a dma buffer", and shouldn't really affect anything outside of active dmabuf uses (which admittedly is a large portion of the GPU drivers, and has been expanding from there...). I
So the reason I'm annoyed at epoll in this case is that I think epoll triggered the bug in some entirely innocent subsystem. dma-buf is doing something differently odd, yes, but at least it's odd in a "I'm a specialized thing" sense, not in some "I screw over others" sense.
Linus