The patch titled
Subject: mm/compaction: avoid VM_BUG_ON(PageSlab()) in page_mapcount()
has been removed from the -mm tree. Its filename was
mm-compaction-avoid-vm_bug_onpageslab-in-page_mapcount.patch
This patch was dropped because it is obsolete
------------------------------------------------------
From: Konstantin Khlebnikov <khlebnikov(a)yandex-team.ru>
Subject: mm/compaction: avoid VM_BUG_ON(PageSlab()) in page_mapcount()
isolate_migratepages_block() runs some checks outside lru_lock when
choosing pages for migration. After checking PageLRU() it checks for extra
page references by comparing page_count() and page_mapcount(). Between
these two checks the page could be removed from the LRU, freed, and taken
by slab. As a result, this race triggers VM_BUG_ON(PageSlab()) in
page_mapcount(). The race window is tiny. For certain workloads this
happens around once a year.
page:ffffea0105ca9380 count:1 mapcount:0 mapping:ffff88ff7712c180 index:0x0 compound_mapcount: 0
flags: 0x500000000008100(slab|head)
raw: 0500000000008100 dead000000000100 dead000000000200 ffff88ff7712c180
raw: 0000000000000000 0000000080200020 00000001ffffffff 0000000000000000
page dumped because: VM_BUG_ON_PAGE(PageSlab(page))
------------[ cut here ]------------
kernel BUG at ./include/linux/mm.h:628!
invalid opcode: 0000 [#1] SMP NOPTI
CPU: 77 PID: 504 Comm: kcompactd1 Tainted: G W 4.19.109-27 #1
Hardware name: Yandex T175-N41-Y3N/MY81-EX0-Y3N, BIOS R05 06/20/2019
RIP: 0010:isolate_migratepages_block+0x986/0x9b0
To fix this, open-code page_mapcount() in the racy check for the 0-order
case and recheck carefully under lru_lock, when the page cannot escape from
the LRU. Also add checks for extra references on file pages and swap cache
pages.
Link: http://lkml.kernel.org/r/158937872515.474360.5066096871639561424.stgit@buzz
Fixes: 119d6d59dcc0 ("mm, compaction: avoid isolating pinned pages")
Fixes: 1d148e218a0d ("mm: add VM_BUG_ON_PAGE() to page_mapcount()")
Signed-off-by: Konstantin Khlebnikov <khlebnikov(a)yandex-team.ru>
Cc: Minchan Kim <minchan(a)kernel.org>
Cc: Vlastimil Babka <vbabka(a)suse.cz>
Cc: David Rientjes <rientjes(a)google.com>
Cc: Mel Gorman <mgorman(a)techsingularity.net>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/compaction.c | 17 +++++++++++++----
1 file changed, 13 insertions(+), 4 deletions(-)
--- a/mm/compaction.c~mm-compaction-avoid-vm_bug_onpageslab-in-page_mapcount
+++ a/mm/compaction.c
@@ -935,12 +935,16 @@ isolate_migratepages_block(struct compac
}
/*
- * Migration will fail if an anonymous page is pinned in memory,
+ * Migration will fail if a page is pinned in memory,
* so avoid taking lru_lock and isolating it unnecessarily in an
- * admittedly racy check.
+ * admittedly racy check simplest case for 0-order pages.
+ *
+ * Open code page_mapcount() to avoid VM_BUG_ON(PageSlab(page)).
+ * Page could have extra reference from mapping or swap cache.
*/
- if (!page_mapping(page) &&
- page_count(page) > page_mapcount(page))
+ if (!PageCompound(page) &&
+ page_count(page) > atomic_read(&page->_mapcount) + 1 +
+ (!PageAnon(page) || PageSwapCache(page)))
goto isolate_fail;
/*
@@ -975,6 +979,11 @@ isolate_migratepages_block(struct compac
low_pfn += compound_nr(page) - 1;
goto isolate_fail;
}
+
+ /* Recheck page extra references under lock */
+ if (page_count(page) > page_mapcount(page) +
+ (!PageAnon(page) || PageSwapCache(page)))
+ goto isolate_fail;
}
lruvec = mem_cgroup_page_lruvec(page, pgdat);
_
Patches currently in -mm which might be from khlebnikov(a)yandex-team.ru are
kernel-watchdog-flush-all-printk-nmi-buffers-when-hardlockup-detected.patch
doc-cgroup-update-note-about-conditions-when-oom-killer-is-invoked.patch
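The reference arithmetic in the hunks above can be modelled in user space.
This is only a sketch: struct page_model and looks_pinned() are hypothetical
names, the fields stand in for page_count(), the raw _mapcount, PageAnon(),
PageSwapCache() and PageCompound(), and nothing here models the locking or
the race itself.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for struct page; not the kernel type. */
struct page_model {
	int refcount;      /* what page_count() would return */
	int mapcount_raw;  /* raw _mapcount: -1 means "not mapped" */
	bool anon;         /* PageAnon() */
	bool swapcache;    /* PageSwapCache() */
	bool compound;     /* PageCompound() */
};

/*
 * Mirrors the racy pre-check from the patch: a 0-order page is expected to
 * hold one reference per mapping, plus one from the page cache (file pages)
 * or from the swap cache. Anything above that is treated as a pin.
 */
static bool looks_pinned(const struct page_model *p)
{
	assert(p != NULL);

	if (p->compound)
		return false;	/* compound pages skip this pre-check */

	int mappings = p->mapcount_raw + 1;
	int cache_ref = (!p->anon || p->swapcache) ? 1 : 0;

	return p->refcount > mappings + cache_ref;
}
```

For an anonymous page mapped once and otherwise unreferenced (refcount 1,
raw _mapcount 0) the function returns false; one additional reference makes
it return true.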
On 6/1/20 8:30 PM, Sasha Levin wrote:
> On Mon, Jun 01, 2020 at 12:22:54PM -0700, Guenter Roeck wrote:
>> Hi Greg,
>>
>> On Mon, Jun 01, 2020 at 06:58:35PM +0200, Greg Kroah-Hartman wrote:
>>> On Tue, May 26, 2020 at 09:58:28PM -0700, Guenter Roeck wrote:
>>> > Upstream commit 106d45f350c7 ("scsi: zfcp: fix request object use-after-free in send path causing wrong traces")
>>> > upstream: v5.3-rc1
>>> > Fixes: d27a7cb91960 ("zfcp: trace on request for open and close of WKA port")
>>> > in linux-4.4.y: b5752b0db014
>>> > upstream: v4.9-rc1
>>> > Affected branches:
>>> > linux-4.4.y
>>> > linux-4.9.y
>>> > linux-4.14.y
>>> > linux-4.19.y (already applied)
>>>
>>> This patch does not apply on those older branches, do you have a working
>>> backport?
>>
>> I am a bit at a loss. Right now my script still tells me:
>>
>> Upstream commit 106d45f350c7 ("scsi: zfcp: fix request object use-after-free in send path causing wrong traces")
>> upstream: v5.3-rc1
>> Fixes: d27a7cb91960 ("zfcp: trace on request for open and close of WKA port")
>> in linux-4.4.y: b5752b0db014
>> upstream: v4.9-rc1
>> Affected branches:
>> linux-4.4.y
>> linux-4.9.y
>> linux-4.14.y
>> linux-4.19.y (already applied)
>>
>> It only does that if the patch cherry-picks cleanly; otherwise it would
>> report conflicts. I checked and made sure that the patch was indeed applied
>> to my test branches for linux-{4.4,4.9,4.14}.y. I re-applied it, just to be
>> sure, with no problems. I also extracted it with git format-patch and
>> applied it with "git am", without issue.
>
> Same here, so I've queued it up.
>
>> What do you use to apply patches ?
>
> *snicker*
>
> https://i.pinimg.com/originals/ab/8f/f8/ab8ff8a51f1c2a9014cd9cc71c6def0a.png
>
>> Anyway, my script also tells me:
>>
>> Upstream commit a33a5d2d16cb ("genirq/generic_pending: Do not lose pending affinity update")
>> upstream: v4.18-rc1
>> Fixes: 98229aa36caa ("x86/irq: Plug vector cleanup race")
>> in linux-4.4.y: 996c591227d9
>> upstream: v4.5-rc2
>> Affected branches:
>> linux-4.4.y (queued)
>> linux-4.9.y (queued)
>> linux-4.14.y
>>
>> and, indeed, it looks like a33a5d2d16cb is missing in v4.14.y-queue.
>
> I think that Greg's script didn't like a33a5d2d16cb pointing to the
> wrong "fixes:" commit - 996c591227d9 rather than 98229aa36caa.
>
Interesting. Makes me wonder how my script found the correct reference.
But then why did his script pick it up for 4.4.y and 4.9.y ?
Guenter
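Neither script is shown in this thread. As a purely hypothetical sketch of
the first step any such tool needs, the helper below pulls the abbreviated
commit id out of a "Fixes:" trailer so it can be compared against the ids
present in a stable branch. The name parse_fixes_sha() and its interface are
assumptions made for illustration.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/*
 * Copy the abbreviated commit id from a 'Fixes: <sha> ("subject")' line
 * into out (always NUL-terminated). Returns the number of hex digits
 * copied, or 0 if the line does not start with "Fixes:" or has no id.
 * Hypothetical helper, not taken from either script in the thread.
 */
static size_t parse_fixes_sha(const char *line, char *out, size_t outsz)
{
	static const char tag[] = "Fixes:";
	size_t n = 0;

	assert(line != NULL && out != NULL && outsz > 0);
	out[0] = '\0';

	if (strncmp(line, tag, sizeof(tag) - 1) != 0)
		return 0;

	/* Skip the tag and any blanks that follow it. */
	line += sizeof(tag) - 1;
	while (*line == ' ' || *line == '\t')
		line++;

	/* Copy leading hex digits, leaving room for the terminator. */
	while (isxdigit((unsigned char)line[n]) && n + 1 < outsz) {
		out[n] = line[n];
		n++;
	}
	out[n] = '\0';
	return n;
}
```

The extracted id still has to be mapped to the right commit: as the thread
notes, an upstream id (98229aa36caa) and its stable backport (996c591227d9)
are different hashes for the same change.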
Hello,
We ran automated tests on a recent commit from this kernel tree:
Kernel repo: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git
Commit: 279a1f873417 - Linux 5.6.16-rc1
The results of these automated tests are provided below.
Overall result: PASSED
Merge: OK
Compile: OK
Tests: OK
All kernel binaries, config files, and logs are available for download here:
https://cki-artifacts.s3.us-east-2.amazonaws.com/index.html?prefix=dataware…
Please reply to this email if you have any questions about the tests that we
ran or if you have any suggestions on how to make future tests more effective.
CKI: Continuous Kernel Integration
______________________________________________________________________________
Compile testing
---------------
We compiled the kernel for 4 architectures:
aarch64:
make options: -j30 INSTALL_MOD_STRIP=1 targz-pkg
ppc64le:
make options: -j30 INSTALL_MOD_STRIP=1 targz-pkg
s390x:
make options: -j30 INSTALL_MOD_STRIP=1 targz-pkg
x86_64:
make options: -j30 INSTALL_MOD_STRIP=1 targz-pkg
Hardware testing
----------------
We booted each kernel and ran the following tests:
aarch64:
Host 1:
✅ Boot test
✅ Podman system integration test - as root
✅ Podman system integration test - as user
✅ LTP
✅ Loopdev Sanity
✅ Memory function: memfd_create
✅ AMTU (Abstract Machine Test Utility)
✅ Networking bridge: sanity
✅ Ethernet drivers sanity
✅ Networking socket: fuzz
✅ Networking: igmp conformance test
✅ Networking route: pmtu
✅ Networking route_func - local
✅ Networking route_func - forward
✅ Networking TCP: keepalive test
✅ Networking UDP: socket
✅ Networking tunnel: geneve basic test
✅ Networking tunnel: gre basic
✅ L2TP basic test
✅ Networking tunnel: vxlan basic
✅ Networking ipsec: basic netns - transport
✅ Networking ipsec: basic netns - tunnel
✅ Libkcapi AF_ALG test
✅ pciutils: update pci ids test
✅ ALSA PCM loopback test
✅ ALSA Control (mixer) Userspace Element test
✅ storage: SCSI VPD
🚧 ✅ CIFS Connectathon
🚧 ✅ POSIX pjd-fstest suites
🚧 ✅ jvm - DaCapo Benchmark Suite
🚧 ✅ jvm - jcstress tests
🚧 ✅ Memory function: kaslr
🚧 ✅ Networking firewall: basic netfilter test
🚧 ✅ audit: audit testsuite test
🚧 ✅ trace: ftrace/tracer
🚧 ✅ kdump - kexec_boot
Host 2:
✅ Boot test
✅ xfstests - ext4
✅ xfstests - xfs
✅ selinux-policy: serge-testsuite
✅ storage: software RAID testing
🚧 ✅ IPMI driver test
🚧 ✅ IPMItool loop stress test
🚧 ✅ Storage blktests
ppc64le:
Host 1:
✅ Boot test
✅ Podman system integration test - as root
✅ Podman system integration test - as user
✅ LTP
✅ Loopdev Sanity
✅ Memory function: memfd_create
✅ AMTU (Abstract Machine Test Utility)
✅ Networking bridge: sanity
✅ Ethernet drivers sanity
✅ Networking socket: fuzz
✅ Networking route: pmtu
✅ Networking route_func - local
✅ Networking route_func - forward
✅ Networking TCP: keepalive test
✅ Networking UDP: socket
✅ Networking tunnel: geneve basic test
✅ Networking tunnel: gre basic
✅ L2TP basic test
✅ Networking tunnel: vxlan basic
✅ Networking ipsec: basic netns - tunnel
✅ Libkcapi AF_ALG test
✅ pciutils: update pci ids test
✅ ALSA PCM loopback test
✅ ALSA Control (mixer) Userspace Element test
🚧 ✅ CIFS Connectathon
🚧 ✅ POSIX pjd-fstest suites
🚧 ✅ jvm - DaCapo Benchmark Suite
🚧 ✅ jvm - jcstress tests
🚧 ✅ Memory function: kaslr
🚧 ✅ Networking firewall: basic netfilter test
🚧 ✅ audit: audit testsuite test
🚧 ✅ trace: ftrace/tracer
Host 2:
✅ Boot test
✅ xfstests - ext4
✅ xfstests - xfs
✅ selinux-policy: serge-testsuite
✅ storage: software RAID testing
🚧 ✅ IPMI driver test
🚧 ✅ IPMItool loop stress test
🚧 ✅ Storage blktests
Host 3:
⚡ Internal infrastructure issues prevented one or more tests (marked
with ⚡⚡⚡) from running on this architecture.
This is not the fault of the kernel that was tested.
⚡⚡⚡ Boot test
🚧 ⚡⚡⚡ kdump - sysrq-c
Host 4:
⚡ Internal infrastructure issues prevented one or more tests (marked
with ⚡⚡⚡) from running on this architecture.
This is not the fault of the kernel that was tested.
⚡⚡⚡ Boot test
🚧 ⚡⚡⚡ kdump - sysrq-c
Host 5:
✅ Boot test
🚧 ✅ kdump - sysrq-c
s390x:
Host 1:
✅ Boot test
✅ Podman system integration test - as root
✅ Podman system integration test - as user
✅ LTP
✅ Loopdev Sanity
✅ Memory function: memfd_create
✅ Networking bridge: sanity
✅ Ethernet drivers sanity
✅ Networking route: pmtu
✅ Networking route_func - local
✅ Networking route_func - forward
✅ Networking TCP: keepalive test
✅ Networking UDP: socket
✅ Networking tunnel: geneve basic test
✅ Networking tunnel: gre basic
✅ L2TP basic test
✅ Networking tunnel: vxlan basic
✅ Networking ipsec: basic netns - transport
✅ Networking ipsec: basic netns - tunnel
✅ Libkcapi AF_ALG test
🚧 ✅ CIFS Connectathon
🚧 ✅ POSIX pjd-fstest suites
🚧 ✅ jvm - DaCapo Benchmark Suite
🚧 ✅ jvm - jcstress tests
🚧 ✅ Memory function: kaslr
🚧 ✅ Networking firewall: basic netfilter test
🚧 ❌ audit: audit testsuite test
🚧 ✅ trace: ftrace/tracer
🚧 ✅ kdump - kexec_boot
Host 2:
✅ Boot test
🚧 ✅ kdump - sysrq-c
Host 3:
✅ Boot test
✅ stress: stress-ng
🚧 ✅ Storage blktests
x86_64:
Host 1:
✅ Boot test
✅ xfstests - ext4
✅ xfstests - xfs
✅ selinux-policy: serge-testsuite
✅ storage: software RAID testing
✅ stress: stress-ng
🚧 ❌ CPU: Frequency Driver Test
🚧 ✅ CPU: Idle Test
🚧 ✅ IOMMU boot test
🚧 ✅ IPMI driver test
🚧 ✅ IPMItool loop stress test
🚧 ✅ Storage blktests
Host 2:
✅ Boot test
✅ Podman system integration test - as root
✅ Podman system integration test - as user
✅ LTP
✅ Loopdev Sanity
✅ Memory function: memfd_create
✅ AMTU (Abstract Machine Test Utility)
✅ Networking bridge: sanity
✅ Ethernet drivers sanity
✅ Networking socket: fuzz
✅ Networking: igmp conformance test
✅ Networking route: pmtu
✅ Networking route_func - local
✅ Networking route_func - forward
✅ Networking TCP: keepalive test
✅ Networking UDP: socket
✅ Networking tunnel: geneve basic test
✅ Networking tunnel: gre basic
✅ L2TP basic test
✅ Networking tunnel: vxlan basic
✅ Networking ipsec: basic netns - transport
✅ Networking ipsec: basic netns - tunnel
✅ Libkcapi AF_ALG test
✅ pciutils: sanity smoke test
✅ pciutils: update pci ids test
✅ ALSA PCM loopback test
✅ ALSA Control (mixer) Userspace Element test
✅ storage: SCSI VPD
🚧 ✅ CIFS Connectathon
🚧 ✅ POSIX pjd-fstest suites
🚧 ✅ jvm - DaCapo Benchmark Suite
🚧 ✅ jvm - jcstress tests
🚧 ✅ Memory function: kaslr
🚧 ✅ Networking firewall: basic netfilter test
🚧 ✅ audit: audit testsuite test
🚧 ✅ trace: ftrace/tracer
🚧 ✅ kdump - kexec_boot
Host 3:
✅ Boot test
🚧 ✅ kdump - sysrq-c
Test sources: https://github.com/CKI-project/tests-beaker
💚 Pull requests are welcome for new tests or improvements to existing tests!
Aborted tests
-------------
Tests that didn't complete running successfully are marked with ⚡⚡⚡.
If this was caused by an infrastructure issue, we try to mark that
explicitly in the report.
Waived tests
------------
If the test run included waived tests, they are marked with 🚧. Such tests are
executed but their results are not taken into account. Tests are waived when
their results are not reliable enough, e.g. when they're just introduced or are
being fixed.
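The waiving rule explains why the overall result above is PASSED even though
some waived tests show a failure. A minimal sketch of that aggregation
follows; the type and function names are hypothetical, and treating aborted
tests as non-failures is an assumption that merely matches this report, not
a description of the actual CKI implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum test_status { TEST_PASS, TEST_FAIL, TEST_ABORTED };

/* Hypothetical record for one line of the report. */
struct test_result {
	const char *name;
	enum test_status status;
	bool waived;	/* true for tests carrying the waived marker */
};

/*
 * Returns true when no non-waived test failed. Waived tests are executed
 * but ignored; aborted tests are assumed not to count as failures.
 */
static bool overall_passed(const struct test_result *results, size_t n)
{
	assert(results != NULL || n == 0);

	for (size_t i = 0; i < n; i++) {
		if (results[i].waived)
			continue;	/* result is not taken into account */
		if (results[i].status == TEST_FAIL)
			return false;
	}
	return true;
}
```

With this rule a failing waived test such as "CPU: Frequency Driver Test"
leaves the overall result unchanged, while a failing non-waived test flips
it.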
Testing timeout
---------------
We aim to provide a report within a reasonable timeframe. Tests that haven't
finished running yet are marked with ⏱.
******************************************
* WARNING: Boot tests are now deprecated *
******************************************
As kernelci.org is expanding its functional testing capabilities, the concept
of boot testing is now deprecated. Boot results are scheduled to be dropped on
*5th June 2020*. The full schedule for boot tests deprecation is available on
this GitHub issue: https://github.com/kernelci/kernelci-backend/issues/238
The new equivalent is the *baseline* test suite which also runs sanity checks
using dmesg and bootrr: https://github.com/kernelci/bootrr
See the *baseline results for this kernel revision* on this page:
https://kernelci.org/test/job/stable-rc/branch/linux-5.6.y/kernel/v5.6.15-1…
-------------------------------------------------------------------------------
stable-rc/linux-5.6.y boot: 143 boots: 3 failed, 130 passed with 5 offline, 5 untried/unknown (v5.6.15-178-gc72fcbc7d224)
Full Boot Summary: https://kernelci.org/boot/all/job/stable-rc/branch/linux-5.6.y/kernel/v5.6.…
Full Build Summary: https://kernelci.org/build/stable-rc/branch/linux-5.6.y/kernel/v5.6.15-178-…
Tree: stable-rc
Branch: linux-5.6.y
Git Describe: v5.6.15-178-gc72fcbc7d224
Git Commit: c72fcbc7d224903b8241afc1202a414575c1e557
Git URL: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git
Tested: 96 unique boards, 24 SoC families, 16 builds out of 160
Boot Regressions Detected:
arm:
versatile_defconfig:
gcc-8:
versatile-pb:
lab-collabora: new failure (last pass: v5.6.15-178-g1c16267b1e40)
arm64:
defconfig:
gcc-8:
meson-gxm-q200:
lab-baylibre: new failure (last pass: v5.6.15-178-g1c16267b1e40)
sun50i-a64-bananapi-m64:
lab-clabbe: new failure (last pass: v5.6.15-178-g1c16267b1e40)
Boot Failures Detected:
arm64:
defconfig:
gcc-8:
meson-gxm-q200: 1 failed lab
sun50i-a64-bananapi-m64: 1 failed lab
arm:
multi_v7_defconfig:
gcc-8:
bcm2836-rpi-2-b: 1 failed lab
Offline Platforms:
arm:
multi_v7_defconfig:
gcc-8:
exynos5800-peach-pi: 1 offline lab
qcom-apq8064-cm-qs600: 1 offline lab
stih410-b2120: 1 offline lab
exynos_defconfig:
gcc-8:
exynos5800-peach-pi: 1 offline lab
qcom_defconfig:
gcc-8:
qcom-apq8064-cm-qs600: 1 offline lab
---
For more info write to <info(a)kernelci.org>
See the *baseline results for this kernel revision* on this page:
https://kernelci.org/test/job/stable-rc/branch/linux-4.14.y/kernel/v4.14.18…
-------------------------------------------------------------------------------
stable-rc/linux-4.14.y boot: 55 boots: 3 failed, 52 passed (v4.14.182-78-g9093a4315f91)
Full Boot Summary: https://kernelci.org/boot/all/job/stable-rc/branch/linux-4.14.y/kernel/v4.1…
Full Build Summary: https://kernelci.org/build/stable-rc/branch/linux-4.14.y/kernel/v4.14.182-7…
Tree: stable-rc
Branch: linux-4.14.y
Git Describe: v4.14.182-78-g9093a4315f91
Git Commit: 9093a4315f917688b56194625b7ad0e407705072
Git URL: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git
Tested: 51 unique boards, 13 SoC families, 13 builds out of 146
Boot Regressions Detected:
arm:
sama5_defconfig:
gcc-8:
at91-sama5d4_xplained:
lab-baylibre: failing for 102 days (last pass: v4.14.170-141-g00a0113414f7 - first fail: v4.14.171-29-g9cfe30e85240)
Boot Failures Detected:
arm64:
defconfig:
gcc-8:
meson-gxbb-p200: 1 failed lab
meson-gxm-q200: 1 failed lab
arm:
sama5_defconfig:
gcc-8:
at91-sama5d4_xplained: 1 failed lab
---
For more info write to <info(a)kernelci.org>