On Tue, May 07, 2024 at 12:10:07PM +0200, Christian König wrote:
Am 06.05.24 um 21:04 schrieb T.J. Mercier:
On Mon, May 6, 2024 at 2:30 AM Charan Teja Kalla quic_charante@quicinc.com wrote:
Hi TJ,
Seems I have got answers from [1], where it is agreed upon epoll() is the source of issue.
Thanks a lot for the discussion.
[1] https://lore.kernel.org/lkml/0000000000002d631f0615918f1e@google.com/
Thanks Charan
Oh man, quite a set of threads on this over the weekend. Thanks for the link.
Yeah and it also has some interesting side conclusion: We should probably tell people to stop using DMA-buf with epoll.
The background is that the mutex approach epoll uses to make files disappear from the interest list on close results in the fact that each file can only be part of a single epoll at a time.
Now since DMA-buf is build around the idea that we share the buffer representation as file between processes it means that only one process at a time can use epoll with each DMA-buf.
So for example if a window manager uses epoll everything is fine. If a client is using epoll everything is fine as well. But if *both* use epoll at the same time it won't work.
This can lead to rather funny and hard to debug combinations of failures and I think we need to document this limitation and explicitly point it out.
Ok, I tested this with a small C program, and you're mixing things up. Here's what I got
- You cannot add a file twice to the same epoll file/fd. So that part is correct, and also my understanding from reading the kernel code.
- You can add the same file to two different epoll file instaces. Which means it's totally fine to use epoll on a dma_buf in different processes like both in the compositor and in clients.
- Substantially more entertaining, you can nest epoll instances, and e.g. add a 2nd epoll file as an event to the first one. That way you can add the same file to both epoll fds, and so end up with the same file essentially being added twice to the top-level epoll file. So even within one application there's no real issue when e.g. different userspace drivers all want to use epoll on the same fd, because you can just throw in another level of epoll and it's fine again and you won't get an EEXISTS on EPOLL_CTL_ADD.
But I also don't think we have this issue right now anywhere, since it's kinda a general epoll issue that happens with any duplicated file.
So I don't think there's any reasons to recommend against using epoll on dma-buf fd (or sync_file or drm_syncobj or any of the sharing primitives we have really).
Cheers, Sima