Re: [PATCH V2] mm/gup: Clear the LRU flag of a page before adding to LRU batch

4 Aug 2024

On Sun, Aug 4, 2024 at 4:03 AM Kairui Song ryncsn@gmail.com wrote:
...
On Sun, Aug 4, 2024 at 1:09 AM Yu Zhao yuzhao@google.com wrote:
...
On Sat, Aug 3, 2024 at 2:31 AM Ge Yang yangge1116@126.com wrote:
...
On 2024/8/3 4:18, Chris Li wrote:
...
On Thu, Aug 1, 2024 at 6:56 PM Ge Yang yangge1116@126.com wrote:
...
...
> I can't reproduce this problem, using tmpfs to compile linux.
> Seems you limit the memory size used to compile linux, which leads to
> OOM. May I ask why the memory size is limited to 481280kB? Do I also
> need to limit the memory size to 481280kB to test?
Yes, you need to limit the cgroup memory size to force the swap
action. I am using memory.max = 470M.
I believe other values, e.g. 800M, can trigger it as well. The reason
for limiting the memory is to cause the swap action.
The goal is to intentionally overwhelm the memory load and let the
swap system do its job. The 470M value is chosen to cause a lot of
swap action without being so tight that it causes OOM kills in normal
kernels.
In other words, high enough swap pressure, but not so high that it
busts into OOM kill. E.g. I verified that, with your patch reverted,
the mm-stable kernel can sustain this level of swap pressure (470M)
without OOM kill.
I borrowed the 470M magic value from Hugh and verified it works with
my test system. Hugh has a similar swap test setup which is more
complicated than mine. It is the inspiration for my swap stress test
setup.
FYI, I am using "make -j32" on a machine with 12 cores (24
hyperthreads). My typical swap usage is about 3-5G. I set my
swapfile size to about 20G.
I am using zram or ssd as the swap backend.  Hope that helps you
reproduce the problem.
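For reference, the 481280kB limit asked about above and the memory.max = 470M setting are the same value, since the cgroup v2 interface treats the M suffix as MiB and the OOM report prints kB as KiB. A minimal arithmetic check (to_kib is a made-up helper name, used only for illustration):

```shell
# Convert a size given in MiB to KiB (1 MiB = 1024 KiB).
# to_kib is a hypothetical helper, not part of any tool in this thread.
to_kib() {
    echo $(( $1 * 1024 ))
}

to_kib 470   # prints 481280
```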
Hi Chris,
I tried to set up the experiment according to your suggestions above.
Hi Ge,
Sorry to hear that you were not able to reproduce it.
...
High swap pressure can be triggered, but OOM can't be reproduced. The
specific steps are as follows:
root@ubuntu-server-2204:/home/yangge# cp workspace/linux/ /dev/shm/ -rf
I use a slightly different way to set up the tmpfs.
Here is a section of my script:
     if ! [ -d $tmpdir ]; then
             sudo mkdir -p $tmpdir
             sudo mount -t tmpfs -o size=100% nodev $tmpdir
     fi

     sudo mkdir -p $cgroup
     sudo sh -c "echo $mem > $cgroup/memory.max" || echo setup memory.max error
     sudo sh -c "echo 1 > $cgroup/memory.oom.group" || echo setup oom.group error
Per run:
    # $workdir is under $tmpdir
     sudo rm -rf $workdir
     mkdir -p $workdir
     cd $workdir
     echo "Extracting linux tree"
     XZ_OPT='-T0 -9 --memory=75%' tar xJf $linux_src || die "xz extract failed"
     sudo sh -c "echo $BASHPID > $cgroup/cgroup.procs"
     echo "Cleaning linux tree, setup defconfig"
     cd $workdir/linux
     make -j$NR_TASK clean
     make defconfig > /dev/null
     echo Kernel compile run $i
     /usr/bin/time -a -o $log make --silent -j$NR_TASK  || die "make failed"
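The excerpt above uses several names without showing their definitions. A minimal sketch of what they might look like follows; every path is hypothetical, and only the 470M limit and the -j32 job count come from this thread. The die helper is likewise an assumption about what the real script does.

```shell
# Assumed definitions for the names used in the script excerpt.
# All paths are hypothetical placeholders, not taken from the thread.
tmpdir=/tmp/swap-stress              # hypothetical tmpfs mount point
workdir=$tmpdir/work                 # "$workdir is under $tmpdir"
cgroup=/sys/fs/cgroup/swap-stress    # hypothetical cgroup v2 directory
mem=470M                             # memory.max value quoted in the thread
NR_TASK=32                           # from "make -j32"
linux_src=$HOME/linux.tar.xz         # hypothetical source tarball
log=$tmpdir/time.log                 # hypothetical /usr/bin/time log file

# Print a message to stderr and stop the script with a failure status.
die() {
    echo "$*" >&2
    exit 1
}
```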

...
Thanks.
...
...
root@ubuntu-server-2204:/home/yangge# sync
root@ubuntu-server-2204:/home/yangge# echo 3 > /proc/sys/vm/drop_caches
root@ubuntu-server-2204:/home/yangge# cd /sys/fs/cgroup/
root@ubuntu-server-2204:/sys/fs/cgroup/# mkdir kernel-build
root@ubuntu-server-2204:/sys/fs/cgroup/# cd kernel-build
root@ubuntu-server-2204:/sys/fs/cgroup/kernel-build# echo 470M > memory.max
root@ubuntu-server-2204:/sys/fs/cgroup/kernel-build# echo $$ > cgroup.procs
root@ubuntu-server-2204:/sys/fs/cgroup/kernel-build# cd /dev/shm/linux/
root@ubuntu-server-2204:/dev/shm/linux# make clean && make -j24
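One way to tell whether a run like the one above hit an OOM kill inside the cgroup, rather than just heavy swapping, is to read the oom_kill counter in the cgroup's memory.events file. A small sketch, assuming cgroup v2 (oom_kill_count is a made-up function name):

```shell
# Print the oom_kill counter from a cgroup v2 memory.events file.
# The file holds one "key value" pair per line, e.g. "oom_kill 2".
# Usage: oom_kill_count /sys/fs/cgroup/kernel-build/memory.events
oom_kill_count() {
    awk '$1 == "oom_kill" { print $2 }' "$1"
}
```

A value that is still 0 after the build means no process in the cgroup was OOM-killed.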
I am using make -j 32.
Your step should work.
Did you enable MGLRU in your .config file? Mine did. I attached my
config file here.
The above test didn't enable MGLRU.
When MGLRU is enabled, I can reproduce OOM very soon. The cause of
triggering OOM is being analyzed.
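Besides the .config option, MGLRU can also be switched at runtime, so it may be worth checking the running kernel. On kernels built with CONFIG_LRU_GEN, /sys/kernel/mm/lru_gen/enabled reports a hex bitmask such as 0x0007, where bit 0x0001 is the main switch. A sketch of that check (mglru_is_on is a made-up helper name):

```shell
# Return success if the given lru_gen "enabled" bitmask has the main
# switch (bit 0x0001) set. The argument is a hex string such as 0x0007.
mglru_is_on() {
    [ $(( $1 & 1 )) -ne 0 ]
}

# Typical use on a kernel built with CONFIG_LRU_GEN:
#   mglru_is_on "$(cat /sys/kernel/mm/lru_gen/enabled)" && echo "MGLRU on"
```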
Hi Ge,
Just in case, maybe you can try to revert your patch and run the test
again? I'm also seeing OOM with MGLRU in this test; the Active/Inactive
LRU is fine. But after reverting your patch, the OOM issue still
exists.
...
I think this is one of the potential side effects -- Hugh mentioned
isolate_lru_folios() earlier:
https://lore.kernel.org/linux-mm/503f0df7-91e8-07c1-c4a6-124cad9e65e7@google...
Try this:

diff --git a/mm/vmscan.c b/mm/vmscan.c
index cfa839284b92..778bf5b7ef97 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -4320,7 +4320,7 @@ static bool sort_folio(struct lruvec *lruvec, struct folio *folio, struct scan_c
        }

        /* ineligible */
-       if (zone > sc->reclaim_idx || skip_cma(folio, sc)) {
+       if (!folio_test_lru(folio) || zone > sc->reclaim_idx || skip_cma(folio, sc)) {
                gen = folio_inc_gen(lruvec, folio, false);
                list_move_tail(&folio->lru, &lrugen->folios[gen][type][zone]);
                return true;
Hi Yu, I tested your patch. On my system (96 cores and 256G RAM) the
OOM still exists; the test memcg is limited to 512M and the build uses
32 threads.
And I found the OOM seems unrelated to either your patch or Ge's
patch (they may have changed the chance of OOM slightly, though).
After the very quick OOM (it failed to untar the linux source code),
checking lru_gen_full:
memcg    47 /build-kernel-tmpfs
 node     0
        442       1691      29405           0
                     0          0r          0e          0p         57r        617e          0p
                     1          0r          0e          0p          0r          4e          0p
                     2          0r          0e          0p          0r          0e          0p
                     3          0r          0e          0p          0r          0e          0p
                                 0           0           0           0          0           0
        443       1683      57748         832
                     0          0           0           0           0          0           0
                     1          0           0           0           0          0           0
                     2          0           0           0           0          0           0
                     3          0           0           0           0          0           0
                                 0           0           0           0          0           0
        444       1670      30207         133
                     0          0           0           0           0          0           0
                     1          0           0           0           0          0           0
                     2          0           0           0           0          0           0
                     3          0           0           0           0          0           0
                                 0           0           0           0          0           0
        445       1662          0           0
                     0          0R         34T          0          57R        238T          0
                     1          0R          0T          0           0R          0T          0
                     2          0R          0T          0           0R          0T          0
                     3          0R          0T          0           0R         81T          0
                            13807L        324O        867Y       2538N         63F         18A
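To get a quick total out of a dump like this, the per-generation lines (generation number, age in ms, anon pages, file pages) can be summed. The sketch below assumes each record sits on a single line, so that only generation lines have exactly four fields; sum_lru_gen_pages is a made-up name, and the sample input repeats the four generation lines from the dump above.

```shell
# Sum anon and file page counts over all generations in lru_gen_full
# output read from stdin. Assumes one record per line, so a generation
# line is the only kind with exactly four numeric-leading fields:
#   gen  age_ms  nr_anon_pages  nr_file_pages
sum_lru_gen_pages() {
    awk 'NF == 4 && $1 ~ /^[0-9]+$/ { anon += $3; file += $4 }
         END { print anon + 0, file + 0 }'
}

sum_lru_gen_pages <<'EOF'
memcg    47 /build-kernel-tmpfs
 node     0
        442       1691      29405           0
        443       1683      57748         832
        444       1670      30207         133
        445       1662          0           0
EOF
# prints: 117360 965
```

Assuming 4 KiB pages, 117360 anon pages is roughly 458 MiB, which is close to the 512M memcg limit, while the file side is almost empty.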
If I repeat the test many times, it may succeed by chance, but the
untar process is very slow and generates about 7000 generations.
But if I change the untar cmdline to:
python -c "import sys; sys.stdout.buffer.write(open('$linux_src', mode='rb').read())" | tar zx
then the problem is gone; it can untar the file successfully and very fast.
This might be a different issue from the one reported by Chris, I'm not sure.
After more testing, I think these are two problems (note I changed the
memcg limit to 600m later so the compile test can run smoothly).
1. OOM during the untar process (can be worked around with the untar
cmdline I mentioned above).
2. OOM during the compile process (this should be the one Chris encountered).
Both 1 and 2 only exist for MGLRU.
1 can be worked around using the cmdline I mentioned above.
2 is caused by Ge's patch, and 1 is not.
I can confirm Yu's patch fixed 2 on my system, but 1 still seems to be
a problem. It's not related to this patch, so maybe it can be discussed
elsewhere.

    
