Hi Minchan,
On Mon, Nov 07, 2022 at 08:11:35PM +0100, Petr Vorel wrote:
Hi all,
The following patch tries to work around an error on ppc64le, where the zram01.sh LTP test (there is also the kernel selftest tools/testing/selftests/zram/zram01.sh, but the LTP test got further updates) often reports mem_used_total 0 although zram is already filled.
Hi, Petr,
Is it happening on only ppc64le?
I haven't seen it on other archs (x86_64, aarch64).
Is it a new regression? What kernel version did you use?
Found on the openSUSE kernel, which uses stable kernel releases 6.0.x. The issue is probably much older: I first saw it some years ago (I'm not able to find the kernel version), but back then it was random. Now it's much more common.
I tested it on a bare metal machine with an older SLES kernel (based on 4.12.14, with thousands of patches) and it fails:
# PATH="/opt/ltp/testcases/bin:$PATH" LTP_SINGLE_FS_TYPE=vfat zram01.sh
...
zram01 5 TINFO: make vfat filesystem on /dev/zram0
zram01 5 TPASS: zram_makefs succeeded
zram01 6 TINFO: mount /dev/zram0
zram01 6 TPASS: mount of zram device(s) succeeded
zram01 7 TINFO: filling zram0 (it can take long time)
zram01 7 TPASS: zram0 was filled with '25568' KB
/opt/ltp/testcases/bin/zram01.sh: line 137: 100 * 1024 * 25568 / 0: division by 0 (error token is "0")
...
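The failing expression in the log divides the amount of written data by mem_used_total. A minimal sketch of a guarded version of that calculation follows; the function name compression_ratio_x100 and its interface are hypothetical and are not taken from zram01.sh.

```shell
#!/bin/sh
# Hypothetical helper, NOT the actual zram01.sh code.
# $1: amount of data written to the device, in KB
# $2: mem_used_total as read from mm_stat, in bytes
# Prints the compression ratio multiplied by 100 (integer arithmetic),
# or returns 1 without dividing when mem_used_total is 0.
compression_ratio_x100()
{
	written_kb=$1
	mem_used_total=$2

	if [ "$mem_used_total" -eq 0 ]; then
		echo "mem_used_total is 0, cannot compute compression ratio" >&2
		return 1
	fi

	echo $((100 * 1024 * written_kb / mem_used_total))
}
```

With the values from the log, `compression_ratio_x100 25568 0` would return 1 instead of aborting the shell with an arithmetic error.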
My patch does not help; the value obviously does not change.
zram01 5 TINFO: make vfat filesystem on /dev/zram1
zram01 5 TPASS: zram_makefs succeeded
zram01 6 TINFO: mount /dev/zram1
zram01 6 TPASS: mount of zram device(s) succeeded
zram01 7 TINFO: filling zram1 (it can take long time)
zram01 7 TPASS: zram1 was filled with '25568' KB
zram01 7 TINFO: /sys/block/zram1/mm_stat 15859712 0 0 26214400 196608 242 0
zram01 7 TINFO: /sys/block/zram1/mm_stat 15859712 0 0 26214400 196608 242 0
zram01 7 TINFO: /sys/block/zram1/mm_stat 15859712 0 0 26214400 196608 242 0
zram01 7 TINFO: /sys/block/zram1/mm_stat 15859712 0 0 26214400 196608 242 0
zram01 7 TINFO: /sys/block/zram1/mm_stat 15859712 0 0 26214400 196608 242 0
zram01 7 TINFO: /sys/block/zram1/mm_stat 15859712 0 0 26214400 196608 242 0
zram01 7 TINFO: /sys/block/zram1/mm_stat 15859712 0 0 26214400 196608 242 0
zram01 7 TINFO: /sys/block/zram1/mm_stat 15859712 0 0 26214400 196608 242 0
zram01 7 TINFO: /sys/block/zram1/mm_stat 15859712 0 0 26214400 196608 242 0
zram01 7 TINFO: /sys/block/zram1/mm_stat 15859712 0 0 26214400 196608 242 0
zram01 7 TINFO: /sys/block/zram1/mm_stat 15859712 0 0 26214400 196608 242 0
zram01 7 TINFO: /sys/block/zram1/mm_stat 15859712 0 0 26214400 196608 242 0
zram01 7 TINFO: /sys/block/zram1/mm_stat 15859712 0 0 26214400 196608 242 0
zram01 7 TINFO: /sys/block/zram1/mm_stat 15859712 0 0 26214400 196608 242 0
zram01 7 TINFO: /sys/block/zram1/mm_stat 15859712 0 0 26214400 196608 242 0
zram01 7 TINFO: /sys/block/zram1/mm_stat 15859712 0 0 26214400 196608 242 0
zram01 7 TINFO: /sys/block/zram1/mm_stat 15859712 0 0 26214400 196608 242 0
zram01 7 TINFO: /sys/block/zram1/mm_stat 15859712 0 0 26214400 196608 242 0
zram01 7 TINFO: /sys/block/zram1/mm_stat 15859712 0 0 26214400 196608 242 0
zram01 7 TINFO: /sys/block/zram1/mm_stat 15859712 0 0 26214400 196608 242 0
zram01 7 TINFO: /sys/block/zram1/mm_stat 15859712 0 0 26214400 196608 242 0
zram01 7 TBROK: "loop_read_mem_used_total /sys/block/zram1/mm_stat" timed out
This is just to demonstrate that the problem has been here for a long time and that it's not QEMU related (theoretically it could be related to openSUSE/SLES user space, but that's very unlikely).
I'll check whether page_same_filled() is being called on the mainline kernel.
Kind regards, Petr
The test runs in a VM (I can give the qemu command or whatever you need to know about it). I'll try to verify it on some bare metal ppc64le.
Actually, mem_used_total indicates how much *physical memory* is currently used to store the original data.
However, if the test data is a repeated pattern of unsigned long (https://github.com/torvalds/linux/blob/master/drivers/block/zram/zram_drv.c#...), zram doesn't allocate physical memory but just marks the unsigned long's value in the meta area for later decompression.
Not sure whether you hit this case.
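One way to check for this case from user space is to compare the same_pages counter with orig_data_size. The sketch below assumes the mm_stat column order from the kernel zram documentation (column 1 is orig_data_size in bytes, column 6 is same_pages); the function name all_pages_same_filled is hypothetical, and the page size has to be supplied by the caller, for example from getconf PAGESIZE.

```shell
#!/bin/sh
# Hypothetical helper, not part of zram01.sh.
# $1: one line of mm_stat output
# $2: page size in bytes (e.g. from "getconf PAGESIZE")
# Succeeds when all stored data is accounted for by same-filled pages,
# i.e. same_pages * page_size equals a non-zero orig_data_size.
all_pages_same_filled()
{
	mm_stat_line=$1
	page_size=$2

	orig_data_size=$(echo "$mm_stat_line" | awk '{ print $1 }')
	same_pages=$(echo "$mm_stat_line" | awk '{ print $6 }')

	[ "$orig_data_size" -gt 0 ] &&
		[ $((same_pages * page_size)) -eq "$orig_data_size" ]
}
```

On a live system this could be called as `all_pages_same_filled "$(cat /sys/block/zram1/mm_stat)" "$(getconf PAGESIZE)"`. For the mm_stat line reported in this thread, the check succeeds with a 64 KiB page size (242 * 65536 = 15859712), which is an assumed value for the test machine. That would be consistent with the same-filled explanation, although it does not prove it.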
Thanks for a hint, I'll try to debug it.
Kind regards, Petr
The patch repeatedly reads /sys/block/zram*/mm_stat for 1 sec, waiting for mem_used_total > 0. The question is whether this behaviour is expected and should be worked around, or whether it is a bug which should be fixed.
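The polling approach described above can be sketched roughly as follows. This is not the code from the patch: the function names, the retry count and the sleep interval are assumptions, and fractional sleep is not guaranteed by POSIX (it works with GNU coreutils and BusyBox). The third column of mm_stat is mem_used_total.

```shell
#!/bin/sh
# Hypothetical sketch of a polling loop, NOT the patch itself.

# Print mem_used_total (3rd column) from the given mm_stat file.
read_mem_used_total()
{
	awk '{ print $3 }' "$1"
}

# $1: path to mm_stat
# $2: number of attempts (assumed default: 10, i.e. about 1 sec in total)
# Prints mem_used_total and returns 0 once it is > 0,
# returns 1 when it stays 0 for all attempts.
wait_mem_used_total()
{
	file=$1
	tries=${2:-10}
	i=0

	while [ "$i" -lt "$tries" ]; do
		val=$(read_mem_used_total "$file")
		if [ "${val:-0}" -gt 0 ]; then
			echo "$val"
			return 0
		fi
		i=$((i + 1))
		sleep 0.1
	done

	return 1
}
```

A caller would compute the compression ratio only when `wait_mem_used_total /sys/block/zram0/mm_stat` succeeds, and report the timeout otherwise.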
REPRODUCE THE ISSUE
Quickest way to install only zram tests and their dependencies:
make autotools && ./configure && for i in testcases/lib/ testcases/kernel/device-drivers/zram/; do cd $i && make -j$(getconf _NPROCESSORS_ONLN) && make install && cd -; done

Run the test (only on vfat):
PATH="/opt/ltp/testcases/bin:$PATH" LTP_SINGLE_FS_TYPE=vfat zram01.sh
Petr Vorel (1):
  zram01.sh: Workaround division by 0 on vfat on ppc64le

 .../kernel/device-drivers/zram/zram01.sh | 27 +++++++++++++++++--
 1 file changed, 25 insertions(+), 2 deletions(-)

--
2.38.0