FYI,
This issue is specific to the 32-bit architectures i386 and arm on the linux-next tree. As per the test results history, this problem started happening from:
Bad: next-20200430
Good: next-20200429
Steps to reproduce:
dd if=/dev/disk/by-id/ata-SanDisk_SSD_PLUS_120GB_190504A00573 of=/dev/null bs=1M count=2048
or
mkfs -t ext4 /dev/disk/by-id/ata-SanDisk_SSD_PLUS_120GB_190804A00BE5
Problem:
[ 38.802375] dd invoked oom-killer: gfp_mask=0x100cc0(GFP_USER), order=0, oom_score_adj=0
i386 crash log: https://pastebin.com/Hb8U89vU
arm crash log: https://pastebin.com/BD9t3JTm
On Tue, 19 May 2020 at 14:15, Michal Hocko mhocko@kernel.org wrote:
On Tue 19-05-20 10:11:25, Arnd Bergmann wrote:
On Tue, May 19, 2020 at 9:52 AM Michal Hocko mhocko@kernel.org wrote:
On Mon 18-05-20 19:40:55, Naresh Kamboju wrote:
Thanks for looking into this problem.
On Sat, 2 May 2020 at 02:28, Andrew Morton akpm@linux-foundation.org wrote:
On Fri, 1 May 2020 18:08:28 +0530 Naresh Kamboju naresh.kamboju@linaro.org wrote:
mkfs -t ext4 invoked the oom-killer on an i386 kernel running on an x86_64 device. This started happening on the linux-next master branch with kernel tags next-20200430 and next-20200501. We did not bisect this problem.
[...]
Creating journal (131072 blocks): [ 31.251333] mkfs.ext4 invoked oom-killer: gfp_mask=0x101cc0(GFP_USER|__GFP_WRITE), order=0, oom_score_adj=0
[...]
[ 31.500943] DMA free:187396kB min:22528kB low:28160kB high:33792kB reserved_highatomic:0KB active_anon:0kB inactive_anon:0kB active_file:4736kB inactive_file:431688kB unevictable:0kB writepending:62020kB present:783360kB managed:668264kB mlocked:0kB kernel_stack:888kB pagetables:0kB bounce:0kB free_pcp:880kB local_pcp:216kB free_cma:163840kB
This is really unexpected. You are saying this is a regular i386, where the DMA zone should be the bottom 16MB, yet yours is 780MB, and the Normal zone, which should hold the rest of the low memory, is completely missing here. How did you get to that configuration? I have to say I haven't seen anything like that on i386.
I think that line comes from an ARM32 beaglebone-X15 machine showing the same symptom. The i386 line from the log file that Naresh linked to at https://lkft.validation.linaro.org/scheduler/job/1406110#L1223 is less unusual:
OK, that makes more sense! At least for the memory layout.
[ 34.931663] Node 0 active_anon:21464kB inactive_anon:8688kB active_file:16604kB inactive_file:849976kB unevictable:0kB isolated(anon):0kB isolated(file):0kB mapped:25284kB dirty:58952kB writeback:27772kB shmem:8944kB writeback_tmp:0kB unstable:0kB all_unreclaimable? yes
[ 34.955523] DMA free:3356kB min:68kB low:84kB high:100kB reserved_highatomic:0KB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:11964kB unevictable:0kB writepending:11980kB present:15964kB managed:15876kB mlocked:0kB kernel_stack:0kB pagetables:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB
[ 34.983385] lowmem_reserve[]: 0 825 1947 825
[ 34.987678] Normal free:3948kB min:7732kB low:8640kB high:9548kB reserved_highatomic:0KB active_anon:0kB inactive_anon:0kB active_file:1096kB inactive_file:786400kB unevictable:0kB writepending:65432kB present:884728kB managed:845576kB mlocked:0kB kernel_stack:1112kB pagetables:0kB bounce:0kB free_pcp:2908kB local_pcp:500kB free_cma:0kB
The lowmem is really low (way below the min watermark), so even the memory reserves for high-priority and atomic requests are depleted. There is still 786MB of inactive page cache to be reclaimed. It doesn't seem to be dirty or under writeback, but it still might be pinned by the filesystem. I would suggest watching the vmscan reclaim tracepoints to check why reclaim fails to reclaim anything.
--
Michal Hocko
SUSE Labs
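For anyone triaging similar reports, the watermark comparison described above can be sketched in a few lines of Python. This is only an illustration, not a tool used in this thread: the names parse_zones and describe are hypothetical, the sample is trimmed to a few fields from the i386 report quoted above, and the parser assumes the "Zone free:...kB min:...kB low:...kB high:...kB" layout seen in that report.

```python
import re

# Kernel log timestamps such as "[ 34.987678]" separate the report entries.
TIMESTAMP_RE = re.compile(r"\[\s*\d+\.\d+\]")
# A per-zone entry starts with the zone name followed by free/min/low/high.
ZONE_RE = re.compile(r"(\w+) free:(\d+)kB min:(\d+)kB low:(\d+)kB high:(\d+)kB")
# Generic "name:<number>kB" counters inside a zone entry.
FIELD_RE = re.compile(r"(\w+):(\d+)kB")

# Trimmed excerpt of the i386 report quoted above (only some fields kept).
SAMPLE = (
    "[ 34.955523] DMA free:3356kB min:68kB low:84kB high:100kB "
    "inactive_file:11964kB writepending:11980kB\n"
    "[ 34.987678] Normal free:3948kB min:7732kB low:8640kB high:9548kB "
    "inactive_file:786400kB writepending:65432kB\n"
)


def parse_zones(report):
    """Return one dict of kB counters per zone entry found in the report."""
    zones = []
    for segment in TIMESTAMP_RE.split(report):
        match = ZONE_RE.search(segment)
        if match is None:
            continue  # not a per-zone entry (e.g. lowmem_reserve or Node)
        fields = {name: int(value) for name, value in FIELD_RE.findall(segment)}
        fields["zone"] = match.group(1)
        zones.append(fields)
    return zones


def describe(zone):
    """Say where the zone's free memory sits relative to its watermarks."""
    free = zone["free"]
    if free < zone["min"]:
        level = "below the min watermark"
    elif free < zone["low"]:
        level = "below the low watermark"
    elif free < zone["high"]:
        level = "below the high watermark"
    else:
        level = "above the high watermark"
    inactive = zone.get("inactive_file", 0)
    return f"{zone['zone']}: free {free}kB is {level}; inactive_file {inactive}kB"


if __name__ == "__main__":
    for zone in parse_zones(SAMPLE):
        print(describe(zone))
    # DMA: free 3356kB is above the high watermark; inactive_file 11964kB
    # Normal: free 3948kB is below the min watermark; inactive_file 786400kB
```

The sketch splits on timestamps rather than newlines so that it also copes with logs whose line breaks were lost in transit.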