LTP test case dio04 test failed on 32bit kernel running linux next 20191107 kernel. Linux version 5.4.0-rc6-next-20191107.
diotest4 1 TPASS : Negative Offset diotest4 2 TPASS : removed diotest4 3 TPASS : Odd count of read and write diotest4 4 TPASS : Read beyond the file size diotest4 5 TPASS : Invalid file descriptor diotest4 6 TPASS : Out of range file descriptor diotest4 7 TPASS : Closed file descriptor diotest4 8 TPASS : removed diotest4 9 TCONF : diotest4.c:345: Direct I/O on /dev/null is not supported diotest4 10 TPASS : read, write to a mmaped file diotest4 11 TPASS : read, write to an unmapped file diotest4 12 TPASS : read from file not open for reading diotest4 13 TPASS : write to file not open for writing diotest4 14 TPASS : read, write with non-aligned buffer diotest4 15 TFAIL : diotest4.c:476: read to read-only space. returns 0: Success diotest4 16 TFAIL : diotest4.c:180: read, write buffer in read-only space diotest4 17 TFAIL : diotest4.c:114: read allows nonexistant space. returns 0: Success diotest4 18 TFAIL : diotest4.c:129: write allows nonexistant space.returns -1: Invalid argument diotest4 19 TFAIL : diotest4.c:180: read, write in non-existant space diotest4 20 TPASS : read, write for file with O_SYNC diotest4 0 TINFO : 2/15 test blocks failed
Test results comparison link, https://qa-reports.linaro.org/lkft/linux-next-oe/tests/ltp-dio-tests/dio04
Test case source link, https://github.com/linux-test-project/ltp/blob/master/testcases/kernel/io/di...
Test case description:
* NAME * diotest4.c * * DESCRIPTION * The program generates error conditions and verifies the error * code generated with the expected error value. The program also * tests some of the boundary condtions. The size of test file created * is filesize_in_blocks * 4k. * Test blocks: * [1] Negative Offset * [2] Negative count - removed 08/01/2003 - robbiew * [3] Odd count of read and write * [4] Read beyond the file size * [5] Invalid file descriptor * [6] Out of range file descriptor * [7] Closed file descriptor * [8] Directory read, write - removed 10/7/2002 - plars * [9] Character device (/dev/null) read, write * [10] read, write to a mmaped file * [11] read, write to an unmaped file with munmap * [12] read from file not open for reading * [13] write to file not open for writing * [14] read, write with non-aligned buffer * [15] read, write buffer in read-only space * [16] read, write in non-existant space * [17] read, write for file with O_SYNC
metadata: git branch: master git repo: https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git git commit: c68c5373c504078cc0e0edc7d5c88b47ca308144 git describe: next-20191107 make_kernelversion: 5.4.0-rc6 kernel-config: http://snapshots.linaro.org/openembedded/lkft/lkft/sumo/intel-core2-32/lkft/... build-location: http://snapshots.linaro.org/openembedded/lkft/lkft/sumo/intel-core2-32/lkft/...
Best regards Naresh Kamboju
----- Original Message -----
LTP test case dio04 test failed on 32bit kernel running linux next 20191107 kernel. Linux version 5.4.0-rc6-next-20191107.
diotest4 1 TPASS : Negative Offset diotest4 2 TPASS : removed diotest4 3 TPASS : Odd count of read and write diotest4 4 TPASS : Read beyond the file size diotest4 5 TPASS : Invalid file descriptor diotest4 6 TPASS : Out of range file descriptor diotest4 7 TPASS : Closed file descriptor diotest4 8 TPASS : removed diotest4 9 TCONF : diotest4.c:345: Direct I/O on /dev/null is not supported diotest4 10 TPASS : read, write to a mmaped file diotest4 11 TPASS : read, write to an unmapped file diotest4 12 TPASS : read from file not open for reading diotest4 13 TPASS : write to file not open for writing diotest4 14 TPASS : read, write with non-aligned buffer diotest4 15 TFAIL : diotest4.c:476: read to read-only space. returns 0: Success diotest4 16 TFAIL : diotest4.c:180: read, write buffer in read-only space diotest4 17 TFAIL : diotest4.c:114: read allows nonexistant space. returns 0: Success diotest4 18 TFAIL : diotest4.c:129: write allows nonexistant space.returns -1: Invalid argument diotest4 19 TFAIL : diotest4.c:180: read, write in non-existant space diotest4 20 TPASS : read, write for file with O_SYNC diotest4 0 TINFO : 2/15 test blocks failed
Smaller reproducer for 32-bit system and ext4 is: openat(AT_FDCWD, "testdata-4.5918", O_RDWR|O_DIRECT) = 4 mmap2(NULL, 4096, PROT_READ, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb7f7b000 read(4, 0xb7f7b000, 4096) = 0 // expects -EFAULT
Problem appears to be conversion in ternary operator at iomap_dio_bio_actor() return. Test passes for me with following tweak:
diff --git a/fs/iomap/direct-io.c b/fs/iomap/direct-io.c index 2f88d64c2a4d..8615b1f78389 100644 --- a/fs/iomap/direct-io.c +++ b/fs/iomap/direct-io.c @@ -318,7 +318,7 @@ iomap_dio_bio_actor(struct inode *inode, loff_t pos, loff_t length, if (pad) iomap_dio_zero(dio, iomap, pos, fs_block_size - pad); } - return copied ? copied : ret; + return copied ? (loff_t) copied : ret; }
static loff_t
[add the other iomap maintainer to cc] [add the ext4 list to cc since they just added iomap directio support] [add the ext4 maintainer for the same reason]
On Thu, Nov 07, 2019 at 07:20:43PM -0500, Jan Stancek wrote:
----- Original Message -----
LTP test case dio04 test failed on 32bit kernel running linux next 20191107 kernel. Linux version 5.4.0-rc6-next-20191107.
diotest4 1 TPASS : Negative Offset diotest4 2 TPASS : removed diotest4 3 TPASS : Odd count of read and write diotest4 4 TPASS : Read beyond the file size diotest4 5 TPASS : Invalid file descriptor diotest4 6 TPASS : Out of range file descriptor diotest4 7 TPASS : Closed file descriptor diotest4 8 TPASS : removed diotest4 9 TCONF : diotest4.c:345: Direct I/O on /dev/null is not supported diotest4 10 TPASS : read, write to a mmaped file diotest4 11 TPASS : read, write to an unmapped file diotest4 12 TPASS : read from file not open for reading diotest4 13 TPASS : write to file not open for writing diotest4 14 TPASS : read, write with non-aligned buffer diotest4 15 TFAIL : diotest4.c:476: read to read-only space. returns 0: Success diotest4 16 TFAIL : diotest4.c:180: read, write buffer in read-only space diotest4 17 TFAIL : diotest4.c:114: read allows nonexistant space. returns 0: Success diotest4 18 TFAIL : diotest4.c:129: write allows nonexistant space.returns -1: Invalid argument diotest4 19 TFAIL : diotest4.c:180: read, write in non-existant space diotest4 20 TPASS : read, write for file with O_SYNC diotest4 0 TINFO : 2/15 test blocks failed
Smaller reproducer for 32-bit system and ext4 is: openat(AT_FDCWD, "testdata-4.5918", O_RDWR|O_DIRECT) = 4 mmap2(NULL, 4096, PROT_READ, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb7f7b000 read(4, 0xb7f7b000, 4096) = 0 // expects -EFAULT
Problem appears to be conversion in ternary operator at iomap_dio_bio_actor() return. Test passes for me with following tweak:
I can't do a whole lot with a code snippet that lacks a proper SOB header.
diff --git a/fs/iomap/direct-io.c b/fs/iomap/direct-io.c index 2f88d64c2a4d..8615b1f78389 100644 --- a/fs/iomap/direct-io.c +++ b/fs/iomap/direct-io.c @@ -318,7 +318,7 @@ iomap_dio_bio_actor(struct inode *inode, loff_t pos, loff_t length, if (pad) iomap_dio_zero(dio, iomap, pos, fs_block_size - pad); }
return copied ? copied : ret;
return copied ? (loff_t) copied : ret;
I'm a little confused on this proposed fix -- why does casting size_t (aka unsigned long) to loff_t (long long) on a 32-bit system change the test outcome? Does this same diotest4 failure happen with XFS? I ask because XFS has been using iomap for directio for ages.
AFAICT @copied is a simple byte counter, and the other variable @n that gets added to @copied is also a simple byte counter -- nobody should ever be trying to stuff a negative value in there?
(Or is this a bug with the ext4 code that calls iomap_dio_rw?)
--D
}
static loff_t
----- Original Message -----
I can't do a whole lot with a code snippet that lacks a proper SOB header.
I'll resend as a patch, maybe split it to 2 returns instead.
diff --git a/fs/iomap/direct-io.c b/fs/iomap/direct-io.c index 2f88d64c2a4d..8615b1f78389 100644 --- a/fs/iomap/direct-io.c +++ b/fs/iomap/direct-io.c @@ -318,7 +318,7 @@ iomap_dio_bio_actor(struct inode *inode, loff_t pos, loff_t length, if (pad) iomap_dio_zero(dio, iomap, pos, fs_block_size - pad); }
return copied ? copied : ret;
return copied ? (loff_t) copied : ret;
I'm a little confused on this proposed fix -- why does casting size_t (aka unsigned long) to loff_t (long long) on a 32-bit system change the test outcome?
Ternary operator has a return type and an attempt is made to convert each of operands to the type of the other. So, in this case "ret" appears to be converted to type of "copied" first. Both have size of 4 bytes on 32-bit x86:
size_t copied = 0; int ret = -14; long long actor_ret = copied ? copied : ret;
On x86_64: actor_ret == -14; On x86 : actor_ret == 4294967282
Does this same diotest4 failure happen with XFS? I ask because XFS has been using iomap for directio for ages.
Yes, it fails on XFS too.
On Mon, Nov 11, 2019 at 03:19:40AM -0500, Jan Stancek wrote:
loff_t length, if (pad) iomap_dio_zero(dio, iomap, pos, fs_block_size - pad); }
return copied ? copied : ret;
return copied ? (loff_t) copied : ret;
I'm a little confused on this proposed fix -- why does casting size_t (aka unsigned long) to loff_t (long long) on a 32-bit system change the test outcome?
Ternary operator has a return type and an attempt is made to convert each of operands to the type of the other. So, in this case "ret" appears to be converted to type of "copied" first. Both have size of 4 bytes on 32-bit x86:
Sounds like we should use a good old if here to avoid that whole problem spacE:
if (copied) return copied; return ret;
size_t copied = 0; int ret = -14; long long actor_ret = copied ? copied : ret;
On x86_64: actor_ret == -14; On x86 : actor_ret == 4294967282
Does this same diotest4 failure happen with XFS? I ask because XFS has been using iomap for directio for ages.
Yes, it fails on XFS too.
Is this a new test? If not why was this never reported? Sounds like we should add this test case to xfstests.
Naresh reported LTP diotest4 failing for 32bit x86 and arm -next kernels on ext4. Same problem exists in 5.4-rc7 on xfs.
The failure comes down to: openat(AT_FDCWD, "testdata-4.5918", O_RDWR|O_DIRECT) = 4 mmap2(NULL, 4096, PROT_READ, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb7f7b000 read(4, 0xb7f7b000, 4096) = 0 // expects -EFAULT
Problem is conversion at iomap_dio_bio_actor() return. Ternary operator has a return type and an attempt is made to convert each of operands to the type of the other. In this case "ret" (int) is converted to type of "copied" (unsigned long). Both have size of 4 bytes: size_t copied = 0; int ret = -14; long long actor_ret = copied ? copied : ret;
On x86_64: actor_ret == -14; On x86 : actor_ret == 4294967282
Replace ternary operator with 2 return statements to avoid this unwanted conversion.
Fixes: 4721a6010990 ("iomap: dio data corruption and spurious errors when pipes fill") Reported-by: Naresh Kamboju naresh.kamboju@linaro.org Signed-off-by: Jan Stancek jstancek@redhat.com --- fs/iomap/direct-io.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/fs/iomap/direct-io.c b/fs/iomap/direct-io.c index 1fc28c2da279..7c58f51d7da7 100644 --- a/fs/iomap/direct-io.c +++ b/fs/iomap/direct-io.c @@ -318,7 +318,9 @@ static void iomap_dio_bio_end_io(struct bio *bio) if (pad) iomap_dio_zero(dio, iomap, pos, fs_block_size - pad); } - return copied ? copied : ret; + if (copied) + return copied; + return ret; }
static loff_t
Looks good,
Reviewed-by: Christoph Hellwig hch@lst.de
On Mon, Nov 11, 2019 at 11:28:10AM +0100, Jan Stancek wrote:
Naresh reported LTP diotest4 failing for 32bit x86 and arm -next kernels on ext4. Same problem exists in 5.4-rc7 on xfs.
The failure comes down to: openat(AT_FDCWD, "testdata-4.5918", O_RDWR|O_DIRECT) = 4 mmap2(NULL, 4096, PROT_READ, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb7f7b000 read(4, 0xb7f7b000, 4096) = 0 // expects -EFAULT
Problem is conversion at iomap_dio_bio_actor() return. Ternary operator has a return type and an attempt is made to convert each of operands to the type of the other. In this case "ret" (int) is converted to type of "copied" (unsigned long). Both have size of 4 bytes: size_t copied = 0; int ret = -14; long long actor_ret = copied ? copied : ret;
On x86_64: actor_ret == -14; On x86 : actor_ret == 4294967282
Replace ternary operator with 2 return statements to avoid this unwanted conversion.
Fixes: 4721a6010990 ("iomap: dio data corruption and spurious errors when pipes fill") Reported-by: Naresh Kamboju naresh.kamboju@linaro.org Signed-off-by: Jan Stancek jstancek@redhat.com
Thansk for the full explanation & patch, will test...
Reviewed-by: Darrick J. Wong darrick.wong@oracle.com
--D
fs/iomap/direct-io.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/fs/iomap/direct-io.c b/fs/iomap/direct-io.c index 1fc28c2da279..7c58f51d7da7 100644 --- a/fs/iomap/direct-io.c +++ b/fs/iomap/direct-io.c @@ -318,7 +318,9 @@ static void iomap_dio_bio_end_io(struct bio *bio) if (pad) iomap_dio_zero(dio, iomap, pos, fs_block_size - pad); }
- return copied ? copied : ret;
- if (copied)
return copied;
- return ret;
} static loff_t -- 1.8.3.1
----- Original Message -----
Is this a new test?
No, it's not new.
If not why was this never reported? Sounds like we should add this test case to xfstests.
I'm guessing not that many users still run 32bit kernels. Naresh' setup is using ext4, so I assume he noticed only after recent changes in linux-next wrt. directio and ext4.
Regards, Jan
On Mon, 11 Nov 2019 at 16:09, Jan Stancek jstancek@redhat.com wrote:
----- Original Message -----
Is this a new test?
No, it's not new.
If not why was this never reported? Sounds like we should add this test case to xfstests.
I'm guessing not that many users still run 32bit kernels. Naresh' setup is using ext4, so I assume he noticed only after recent changes in linux-next wrt. directio and ext4.
That's true. Started noticing recently from Linux next-20191107 kernel on i386 and arm32.
Steps to reproduce: ./runltp -f dio -d /mounted-ext4-drive
- Naresh