From: Christian Brauner
Sent: 06 May 2024 09:45
The fact is, it's not dma-buf that is violating any rules. It's epoll.
I agree that epoll() not taking a reference on the file is at least unexpected and contradicts the usual code patterns for the sake of performance and that it very likely is the case that most callers of f_op->poll() don't know this.
Note, I cleary wrote upthread that I'm ok to do it like you suggested but raised two concerns a) there's currently only one instance of prolonged @file lifetime in f_op->poll() afaict and b) that there's possibly going to be some performance impact on epoll().
So it's at least worth discussing what's more important because epoll() is very widely used and it's not that we haven't favored performance before.
But you've already said that you aren't concerned with performance on epoll() upthread. So afaict then there's really not a lot more to discuss other than take the patch and see whether we get any complaints.
Surely there isn't a problem with epoll holding a reference to the file structure - it isn't really any different to a dup().
'All' that needs to happen is that the 'magic' that makes epoll() remove files on the last fput happen when the close is done. I'm sure there are horrid locking issues it that code (separate from it calling ->poll() after ->release()) eg if you call close() concurrently with EPOLL_CTL_ADD.
I'm not at all sure it would have mattered if epoll kept the file open. But it can't do that because it is documented not to. As well as poll/select holding a reference to all their fd for the duration of the system call, a successful mmap() holds a reference until the pages are all unmapped - usually by process exit.
We (dayjob) have code that uses epoll() to monitor large numbers of UDP sockets. I was doing some tests (trying to) receive RTP (audio) data concurrently on 10000 sockets with typically one packet every 20ms. There are 10000 associated RCTP sockets that are usually idle. A more normal limit would be 1000 RTP sockets. All the data needs to go into a single (multithreaded) process. Just getting all the packets queued on the sockets was non-trivial. epoll is about the only way to actually read the data. (That needed multiple epoll fd so each thread could process all the events from one epoll fd then look for another unprocessed fd.)
David
- Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK Registration No: 1397386 (Wales)