On Fri, May 03, 2024 at 12:49:11PM -0600, Jens Axboe wrote:
On 5/3/24 12:26 PM, Kees Cook wrote:
Thanks for doing this analysis! I suspect at least a start of a fix would be this:
diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c index 8fe5aa67b167..15e8f74ee0f2 100644 --- a/drivers/dma-buf/dma-buf.c +++ b/drivers/dma-buf/dma-buf.c @@ -267,9 +267,8 @@ static __poll_t dma_buf_poll(struct file *file, poll_table *poll) if (events & EPOLLOUT) { /* Paired with fput in dma_buf_poll_cb */
get_file(dmabuf->file);
if (!dma_buf_poll_add_cb(resv, true, dcb))
if (!atomic_long_inc_not_zero(&dmabuf->file) &&
!dma_buf_poll_add_cb(resv, true, dcb)) /* No callback queued, wake up any other waiters */
Don't think this is sane at all. I'm assuming you meant:
atomic_long_inc_not_zero(&dmabuf->file->f_count);
Oops, yes, sorry. I was typed from memory instead of copy/paste.
but won't fly as you're not under RCU in the first place. And what protects it from being long gone before you attempt this anyway? This is sane way to attempt to fix it, it's completely opposite of what sane ref handling should look like.
Not sure what the best fix is here, seems like dma-buf should hold an actual reference to the file upfront rather than just stash a pointer and then later _hope_ that it can just grab a reference. That seems pretty horrible, and the real source of the issue.
AFAICT, epoll just doesn't hold any references at all. It depends, I think, on eventpoll_release() (really eventpoll_release_file()) synchronizing with epoll_wait() (but I don't see how this happens, and the race seems to be against ep_item_poll() ...?)
I'm really confused about how eventpoll manages the lifetime of polled fds.
Due to this issue I've proposed fixing get_file() to detect pathological states: https://lore.kernel.org/lkml/20240502222252.work.690-kees@kernel.org/
I don't think this would catch this case, as the memory could just be garbage at this point.
It catches it just fine! :) I tested it against the published PoC.
And for cases where further allocations have progressed far enough to corrupt the freed struct file and render the check pointless, nothing different has happened than what happens today. At least now we have a window to catch the situation across the time frame before it is both reallocated _and_ the contents at the f_count offset gets changed to non-zero.
linaro-mm-sig@lists.linaro.org