On Sat, 4 May 2024 at 08:32, Linus Torvalds <torvalds@linux-foundation.org> wrote:
> Lookie here, the fundamental issue is that epoll can call '->poll()' on a file descriptor that is being closed concurrently.
Thinking some more about this, and replying to myself...
Actually, I wonder if we could *really* fix this by simply moving the eventpoll_release() to where it really belongs.
If we did it in file_close_fd_locked(), it would actually make a *lot* more sense. Particularly since eventpoll actually uses this:
	struct epoll_filefd {
		struct file *file;
		int fd;
	} __packed;
ie it doesn't just use the 'struct file *', it uses the 'fd' itself (for ep_find()).
(Strictly speaking, it should also have a pointer to the 'struct files_struct' to make the 'int fd' be meaningful).
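To make the "(file, fd) pair as the key" point concrete, here is a minimal userspace sketch of that kind of comparison. It is not the kernel's code: `struct demo_file`, `struct demo_filefd` and `demo_cmp_ffd()` are hypothetical names made up for illustration, and the ordering (file pointer first, then fd) is an assumption about how such a lookup key would naturally be compared.

```c
#include <assert.h>
#include <stdint.h>

struct demo_file;	/* hypothetical opaque stand-in for 'struct file' */

/* Hypothetical userspace mirror of the (file, fd) lookup key. */
struct demo_filefd {
	struct demo_file *file;
	int fd;
};

/*
 * Order keys by file pointer first, then by descriptor number.
 * Returns <0, 0 or >0, like a qsort()-style comparator.
 */
static int demo_cmp_ffd(const struct demo_filefd *a,
			const struct demo_filefd *b)
{
	uintptr_t fa = (uintptr_t)a->file;
	uintptr_t fb = (uintptr_t)b->file;

	if (fa != fb)
		return fa > fb ? 1 : -1;
	/* Same file: the descriptor number breaks the tie. */
	return (a->fd > b->fd) - (a->fd < b->fd);
}
```

With a key like this, two descriptors that refer to the same open file (say after a dup()) compare as different entries, which is why the descriptor number and not just the file pointer matters to the lookup.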
IOW, eventpoll already considers the file _descriptor_ relevant, not just the file pointer, and that's destroyed at *close* time, not at 'fput()' time.
Yeah, yeah, the locking situation in file_close_fd_locked() is a bit inconvenient, but if we can solve that, it would solve the problem in a fundamentally different way: remove the ep item before the file->f_count has actually been decremented, so the whole "race with fput()" would just go away entirely.
I dunno. I think that would be the right thing to do, but I wouldn't be surprised if some disgusting eventpoll user then might depend on the current situation where the eventpoll thing stays around even after the close() if you have another copy of the file open.
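For reference, the "stays around even after the close()" behavior can be observed from userspace with something like the sketch below. It assumes a Linux kernel with the current semantics (the registration is dropped only when the last reference to the open file goes away); `events_after_close()` is a hypothetical helper name, and error handling is kept minimal.

```c
#include <assert.h>
#include <sys/epoll.h>
#include <unistd.h>

/*
 * Register the read end of a pipe with epoll, dup() it, close the
 * registered descriptor, then make the pipe readable.  Returns the
 * number of events epoll_wait() reports, or -1 on a setup failure.
 */
static int events_after_close(void)
{
	struct epoll_event ev = { .events = EPOLLIN };
	struct epoll_event out;
	int p[2];
	int epfd, dupfd, n;

	if (pipe(p) < 0)
		return -1;
	epfd = epoll_create1(0);
	dupfd = dup(p[0]);	/* second descriptor for the same open file */
	if (epfd < 0 || dupfd < 0)
		return -1;

	ev.data.fd = p[0];
	if (epoll_ctl(epfd, EPOLL_CTL_ADD, p[0], &ev) < 0)
		return -1;

	/* The descriptor goes away, but dupfd keeps the file alive. */
	close(p[0]);

	if (write(p[1], "x", 1) != 1)
		return -1;

	/* Wait up to 100ms for the (already closed) registration to fire. */
	n = epoll_wait(epfd, &out, 1, 100);

	close(dupfd);
	close(p[1]);
	close(epfd);
	return n;
}
```

On kernels with the current behavior this returns 1: the event is still delivered, tagged with a descriptor number the process has already closed. If the ep item were removed at close() time instead, the same call would presumably time out and return 0, which is exactly the kind of user-visible change to worry about.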
Linus