On Wed 05-06-19 11:25:51, Dave Chinner wrote:
On Mon, Jun 03, 2019 at 03:21:55PM +0200, Jan Kara wrote:
Hole puching currently evicts pages from page cache and then goes on to remove blocks from the inode. This happens under both i_mmap_sem and i_rwsem held exclusively which provides appropriate serialization with racing page faults. However there is currently nothing that prevents ordinary read(2) from racing with the hole punch and instantiating page cache page after hole punching has evicted page cache but before it has removed blocks from the inode. This page cache page will be mapping soon to be freed block and that can lead to returning stale data to userspace or even filesystem corruption.
Fix the problem by protecting reads as well as readahead requests with i_mmap_sem.
CC: stable@vger.kernel.org Reported-by: Amir Goldstein amir73il@gmail.com Signed-off-by: Jan Kara jack@suse.cz
fs/ext4/file.c | 35 +++++++++++++++++++++++++++++++---- 1 file changed, 31 insertions(+), 4 deletions(-)
diff --git a/fs/ext4/file.c b/fs/ext4/file.c index 2c5baa5e8291..a21fa9f8fb5d 100644 --- a/fs/ext4/file.c +++ b/fs/ext4/file.c @@ -34,6 +34,17 @@ #include "xattr.h" #include "acl.h" +static ssize_t ext4_file_buffered_read(struct kiocb *iocb, struct iov_iter *to) +{
- ssize_t ret;
- struct inode *inode = file_inode(iocb->ki_filp);
- down_read(&EXT4_I(inode)->i_mmap_sem);
- ret = generic_file_read_iter(iocb, to);
- up_read(&EXT4_I(inode)->i_mmap_sem);
- return ret;
Isn't i_mmap_sem taken in the page fault path? What makes it safe to take here both outside and inside the mmap_sem at the same time? I mean, the whole reason for i_mmap_sem existing is that the inode i_rwsem can't be taken both outside and inside the i_mmap_sem at the same time, so what makes the i_mmap_sem different?
Drat, you're right that read path may take page fault which will cause lock inversion with mmap_sem. Just my xfstests run apparently didn't trigger this as I didn't get any lockdep splat. Thanks for catching this!
Honza