From: Linus Torvalds
Sent: 05 May 2024 18:56
epoll can call out to vfs_poll() with a file pointer that may race with the last 'fput()'. That would make f_count go down to zero, and while the ep->mtx locking means that the resulting file pointer tear-down will be blocked until the poll returns, it means that f_count is already dead, and any use of it won't actually get a reference to the file any more: it's dead regardless.
Make sure we have a valid ref on the file pointer before we call down to vfs_poll() from the epoll routines.
How much is the extra pair of atomics going to hurt performance? IIRC a lot of work was done to (try to) make epoll lockless.
Perhaps the 'hook' into epoll (usually) from sys_close should be done before any of the references are removed? (Which is different from Q6/A6 in man epoll - but that seems to be documenting a bug!) Then the ->poll() callback can't happen (assuming it is properly locked) after the ->release() one.
It seems better to add extra atomics to the close/final-fput path rather than to every ->poll() call epoll makes.
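To make the cost under discussion concrete: the "extra pair of atomics" is roughly an increment-if-not-zero before ->poll() and a decrement after it. Below is a minimal userspace sketch using C11 atomics. It assumes a hypothetical struct obj standing in for struct file, and obj_get_not_zero(), obj_put(), guarded_poll() and fake_poll() are illustrative names, not kernel functions.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct file; only the count matters here. */
struct obj {
	atomic_long count;
};

/* Take a reference only if the count has not already dropped to zero. */
static bool obj_get_not_zero(struct obj *o)
{
	long old = atomic_load(&o->count);

	while (old != 0) {
		/* On failure, 'old' is refreshed with the current value. */
		if (atomic_compare_exchange_weak(&o->count, &old, old + 1))
			return true;
	}
	return false;
}

/* Drop a reference; returns true if the caller dropped the last one. */
static bool obj_put(struct obj *o)
{
	long old = atomic_fetch_sub(&o->count, 1);

	assert(old > 0);
	return old == 1;
}

/* Placeholder poll method: always reports one ready event bit. */
static unsigned int fake_poll(struct obj *o)
{
	(void)o;
	return 1;
}

/*
 * Poll only while holding a reference. A dying object (count == 0)
 * reports no events. Real code would also have to run the teardown
 * if obj_put() here turned out to drop the last reference.
 */
static unsigned int guarded_poll(struct obj *o,
				 unsigned int (*poll)(struct obj *))
{
	unsigned int events;

	if (!obj_get_not_zero(o))
		return 0;
	events = poll(o);
	obj_put(o);
	return events;
}
```

Every guarded_poll() call pays for one compare-exchange and one fetch-sub on the shared counter, which is the per-poll overhead being questioned; moving the synchronisation into the close/final-fput path would pay it once per file instead.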
I can get extra references to a driver via dup(), open("/dev/fd/n") and mmap() - but epoll is definitely fd based. (Which may be why it has the fd number in its data.)
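The fd-versus-file distinction can be seen from userspace. This is a small sketch, assuming Linux and using a pipe as the polled file (the function name is made up for illustration). It registers the read end, dup()s it, closes the registered fd, then makes the pipe readable. Going by Q6/A6 in man epoll, the registration should survive until the last descriptor for the open file description is closed, so epoll_wait() is expected to still report an event.

```c
#include <string.h>
#include <sys/epoll.h>
#include <unistd.h>

/*
 * Returns what epoll_wait() reports for a pipe read end whose registered
 * fd has been closed while a dup() of it is still open, or -1 if any
 * setup step fails.
 */
static int events_after_close_with_dup(void)
{
	struct epoll_event ev, out;
	int pipefd[2];
	int epfd = -1, dupfd = -1, n = -1;

	if (pipe(pipefd) < 0)
		return -1;

	epfd = epoll_create1(0);
	if (epfd < 0)
		goto cleanup;

	memset(&ev, 0, sizeof(ev));
	ev.events = EPOLLIN;
	ev.data.fd = pipefd[0];
	if (epoll_ctl(epfd, EPOLL_CTL_ADD, pipefd[0], &ev) < 0)
		goto cleanup;

	/* Second descriptor for the same open file description. */
	dupfd = dup(pipefd[0]);
	if (dupfd < 0)
		goto cleanup;

	/* The registered fd number goes away; the file itself does not. */
	close(pipefd[0]);
	pipefd[0] = -1;

	/* Make the read end readable. */
	if (write(pipefd[1], "x", 1) != 1)
		goto cleanup;

	/* Timeout 0: just report what is ready right now. */
	n = epoll_wait(epfd, &out, 1, 0);

cleanup:
	if (dupfd >= 0)
		close(dupfd);
	if (epfd >= 0)
		close(epfd);
	if (pipefd[0] >= 0)
		close(pipefd[0]);
	close(pipefd[1]);
	return n;
}
```

If an event is reported, out.data.fd still carries the old (now closed) fd number, since that is simply the user data stored at EPOLL_CTL_ADD time.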
Is there another race between EPOLL_CTL_ADD and close() (from a different thread)?
David