On Thu, 21 May 2020 at 00:39, Chris Down <chris@chrisdown.name> wrote:
Hi Naresh,
Naresh Kamboju writes:
As part of the investigation into this issue, LKFT teammate Anders Roxell ran git bisect and found the bad commit(s) that caused the problem.
With the following two patches reverted on next-20200519, we reran the reproduction steps and confirmed that the mkfs -t ext4 test case now passes (the "invoked oom-killer" message is gone).
Revert "mm, memcg: avoid stale protection values when cgroup is above protection"
This reverts commit 23a53e1c02006120f89383270d46cbd040a70bc6.
Revert "mm, memcg: decouple e{low,min} state mutations from protection checks"
This reverts commit 7b88906ab7399b58bb088c28befe50bcce076d82.
Thanks Anders and Naresh for tracking this down and reverting.
I'll take a look tomorrow. I don't see anything immediately obviously wrong in either of those commits from a (very) cursory glance, but they should only be taking effect if protections are set.
Since you have i386 hardware available, and I don't, could you please apply only "avoid stale protection" again and check if it only happens with that commit, or requires both? That would help narrow down the suspects.
Not both. The bad commit is "mm, memcg: decouple e{low,min} state mutations from protection checks".
Do you use any memcg protections in these tests?
I see three MEMCG config options enabled; please see the kernel config link below for more details.
CONFIG_MEMCG=y
CONFIG_MEMCG_SWAP=y
CONFIG_MEMCG_KMEM=y
Kernel config link: https://builds.tuxbuild.com/8lg6WQibcwtQRRtIa0bcFA/kernel.config
- Naresh
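
The CONFIG_MEMCG options above only show that memcg support is built in, not whether any protection is actually configured at run time. One way to check that on the test machine is sketched below. It is a rough sketch, not part of the original test setup: it assumes a cgroup v2 hierarchy mounted at /sys/fs/cgroup, and the function name find_protections is hypothetical. It walks the hierarchy and reports every cgroup whose memory.min or memory.low is set to something other than the default of 0.

```python
#!/usr/bin/env python3
"""Report cgroups with non-default memory.min / memory.low (cgroup v2 assumed)."""
import os
import sys

# cgroup v2 interface files that carry the min/low protection settings.
PROTECTION_FILES = ("memory.min", "memory.low")


def find_protections(root="/sys/fs/cgroup"):
    """Return (cgroup_dir, filename, value) for every protection that is not 0.

    `root` is an assumption: the usual cgroup v2 mount point.
    """
    found = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in PROTECTION_FILES:
            if name not in filenames:
                continue  # e.g. the root cgroup has no memory.min / memory.low
            try:
                with open(os.path.join(dirpath, name)) as f:
                    value = f.read().strip()
            except OSError:
                continue  # cgroup went away or is unreadable; skip it
            if value != "0":
                found.append((dirpath, name, value))
    return found


if __name__ == "__main__":
    root = sys.argv[1] if len(sys.argv) > 1 else "/sys/fs/cgroup"
    hits = find_protections(root)
    for path, name, value in hits:
        print(f"{path}/{name} = {value}")
    if not hits:
        print("no memory.min / memory.low protections set")
```

If this prints no protections, the regression is being triggered without any memory.min or memory.low configured, which is the case the question above is trying to distinguish. It does not look at cgroup v1 settings.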