While running the LTP mm thp01 test case on an i386 kernel running on an x86_64 device, the following kernel warning was noticed multiple times.
This issue is not new; we have noticed it on the stable-rc 5.4, stable-rc 5.5 and stable-rc 5.6 branches. FYI, CONFIG_HAVE_MOVE_PMD=y is set and total memory is 2.2G as per free output.
steps to reproduce:
--------------------
boot i386 kernel on x86_64 device,
cd /opt/ltp
./runltp -f mm
thp01.c:98: PASS: system didn't crash.
thp01.c:98: PASS: system didn't crash.
thp01.c:98: PASS: system didn't crash.

[ 207.317499] ------------[ cut here ]------------
[ 207.322153] WARNING: CPU: 0 PID: 18963 at mm/mremap.c:211 move_page_tables+0x5b0/0x5d0
[ 207.330061] Modules linked in: x86_pkg_temp_thermal
[ 207.334940] CPU: 0 PID: 18963 Comm: true Tainted: G W 5.6.2-rc1+ #1
[ 207.342498] Hardware name: Supermicro SYS-5019S-ML/X11SSH-F, BIOS 2.2 05/23/2018
[ 207.349881] EIP: move_page_tables+0x5b0/0x5d0
[ 207.354233] Code: 00 00 c0 ff 2b 45 08 39 c3 0f 46 c3 89 45 d4 01 f8 89 45 cc e9 7e fb ff ff 8d 45 d8 83 4d e8 01 e8 65 b0 01 00 e9 b2 fa ff ff <0f> 0b 80 7d be 00 0f 84 7e fd ff ff 31 db e9 74 fe ff ff 31 db e9
[ 207.372969] EAX: 7ce5f067 EBX: 00400000 ECX: e2cc8000 EDX: 00000000
[ 207.379225] ESI: e2cc8bfc EDI: bfc00000 EBP: f3273e18 ESP: f3273dc0
[ 207.385484] DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068 EFLAGS: 00010202
[ 207.392261] CR0: 80050033 CR2: b7d02f50 CR3: 22cc8000 CR4: 003406d0
[ 207.398517] DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
[ 207.404774] DR6: fffe0ff0 DR7: 00000400
[ 207.408605] Call Trace:
[ 207.411053] setup_arg_pages+0x22b/0x310
[ 207.414977] ? security_bprm_committed_creds+0x22/0x30
[ 207.420107] load_elf_binary+0x2fb/0x10a0
[ 207.424110] ? selinux_inode_permission+0xfb/0x1d0
[ 207.428894] ? bm_status_write+0x61/0xa0
[ 207.432811] ? security_inode_permission+0x2c/0x50
[ 207.437597] ? writenote+0xb0/0xb0
[ 207.440992] search_binary_handler+0x77/0x190
[ 207.445356] __do_execve_file+0x429/0x760
[ 207.449375] sys_execve+0x21/0x30
[ 207.452693] do_fast_syscall_32+0x7a/0x280
[ 207.456784] entry_SYSENTER_32+0xa5/0xf8
[ 207.460702] EIP: 0xb7ee7c5d
[ 207.463491] Code: Bad RIP value.
[ 207.466716] EAX: ffffffda EBX: bfff9ed0 ECX: 08069420 EDX: bfffa134
[ 207.472971] ESI: 080599d4 EDI: bfff9ed9 EBP: bfff9f78 ESP: bfff9ea8
[ 207.479230] DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 007b EFLAGS: 00000296
[ 207.486008] ---[ end trace d49b75932d5396d5 ]---
full test log,
https://lkft.validation.linaro.org/scheduler/job/1328795#L14498
https://lkft.validation.linaro.org/scheduler/job/1331455#L8923
https://lkft.validation.linaro.org/scheduler/job/1331632#L17251
Kernel config: https://builds.tuxbuild.com/RJ9BGpsgfPfj3Sfje8oLSw/kernel.config
Test case source: https://github.com/linux-test-project/ltp/blob/master/testcases/kernel/mem/t...
On Thu, Apr 02, 2020 at 04:49:02PM +0530, Naresh Kamboju wrote:
> While running the LTP mm thp01 test case on an i386 kernel running on an x86_64 device, the following kernel warning was noticed multiple times.
> This issue is not new; we have noticed it on the stable-rc 5.4, stable-rc 5.5 and stable-rc 5.6 branches. FYI, CONFIG_HAVE_MOVE_PMD=y is set and total memory is 2.2G as per free output.
> steps to reproduce:
> boot i386 kernel on x86_64 device,
> cd /opt/ltp
> ./runltp -f mm
> thp01.c:98: PASS: system didn't crash.
> thp01.c:98: PASS: system didn't crash.
> thp01.c:98: PASS: system didn't crash.
> [ 207.317499] ------------[ cut here ]------------
> [ 207.322153] WARNING: CPU: 0 PID: 18963 at mm/mremap.c:211 move_page_tables+0x5b0/0x5d0
> Kernel config: https://builds.tuxbuild.com/RJ9BGpsgfPfj3Sfje8oLSw/kernel.config
Interesting. I suspect it's related to 2-level page tables in this configuration. But I cannot immediately see how.
Could you test if enabling HIGHMEM64G fixes the issue?
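For context, the two page-table layouts being contrasted differ in how much address space one PMD-level entry covers. The sketch below is illustrative only. It assumes the usual 32-bit x86 values: a shift of 22 with 2-level tables (where the PMD is folded into the PGD) and a shift of 21 with PAE, which CONFIG_HIGHMEM64G selects. The names SHIFT_2LEVEL, SHIFT_PAE, pmd_span() and pmd_align_down() are hypothetical and are not kernel identifiers.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed shifts for 32-bit x86; the names are illustrative, not kernel macros. */
enum {
	SHIFT_2LEVEL = 22,	/* no PAE: PMD folded into PGD, 4 MiB per entry */
	SHIFT_PAE    = 21	/* PAE (3-level tables): 2 MiB per PMD entry */
};

/* Number of bytes of address space covered by one PMD-level entry. */
uint32_t pmd_span(unsigned int shift)
{
	assert(shift < 32);
	return UINT32_C(1) << shift;
}

/* Round an address down to the start of its PMD-level entry. */
uint32_t pmd_align_down(uint32_t addr, unsigned int shift)
{
	return addr & ~(pmd_span(shift) - 1);
}
```

Under these assumptions, pmd_span(SHIFT_2LEVEL) is 0x400000 (4 MiB) and pmd_span(SHIFT_PAE) is 0x200000 (2 MiB).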
Below is patch that prints some additional info:
diff --git a/mm/mremap.c b/mm/mremap.c
index d28f08a36b96..065d5ec3614a 100644
--- a/mm/mremap.c
+++ b/mm/mremap.c
@@ -208,8 +208,14 @@ static bool move_normal_pmd(struct vm_area_struct *vma, unsigned long old_addr,
 	 * The destination pmd shouldn't be established, free_pgtables()
 	 * should have release it.
 	 */
-	if (WARN_ON(!pmd_none(*new_pmd)))
+	if (WARN_ON(!pmd_none(*new_pmd))) {
+		dump_vma(vma);
+		printk("old_addr: %#lx, new_addr: %#lx, old_end: %#lx\n",
+				old_addr, new_addr, old_end);
+		printk("old_pmd: %#lx", pmd_val(*old_pmd));
+		printk("new_pmd: %#lx", pmd_val(*new_pmd));
 		return false;
+	}

 	/*
 	 * We don't have to worry about the ordering of src and dst
On Thu, 2 Apr 2020 at 19:08, Kirill A. Shutemov <kirill@shutemov.name> wrote:
> On Thu, Apr 02, 2020 at 04:49:02PM +0530, Naresh Kamboju wrote:
> > While running the LTP mm thp01 test case on an i386 kernel running on an x86_64 device, the following kernel warning was noticed multiple times.
> Interesting. I suspect it's related to 2-level page tables in this configuration. But I cannot immediately see how.
> Could you test if enabling HIGHMEM64G fixes the issue?
CONFIG_HIGHMEM64G=y
> Below is patch that prints some additional info:
Applied your patch and reproduced the problem. The boot log shows this warning:

[ 0.261879] ************************************************************
[ 0.261880] ** WARNING! WARNING! WARNING! WARNING! WARNING! WARNING! **
[ 0.261881] ** **
[ 0.261881] ** You are using 32-bit PTI on a 64-bit PCID-capable CPU. **
[ 0.261882] ** Your performance will increase dramatically if you **
[ 0.261882] ** switch to a 64-bit kernel! **
[ 0.261883] ** **
[ 0.261884] ** WARNING! WARNING! WARNING! WARNING! WARNING! WARNING! **
[ 0.261884] ************************************************************
...

Reproducing steps:
------------------------
cd /opt/ltp
./runltp -f mm
[ 734.485672] ------------[ cut here ]------------
[ 734.490306] WARNING: CPU: 3 PID: 32321 at mm/mremap.c:212 move_page_tables+0x7c3/0x830
[ 734.498212] Modules linked in: x86_pkg_temp_thermal
[ 734.503084] CPU: 3 PID: 32321 Comm: true Tainted: G W 5.6.2-rc1+ #14
[ 734.510729] Hardware name: Supermicro SYS-5019S-ML/X11SSH-F, BIOS 2.2 05/23/2018
[ 734.518110] EIP: move_page_tables+0x7c3/0x830
[ 734.522463] Code: 0c eb a7 8d 45 d8 83 4d e8 01 e8 c8 e6 01 00 e9 be f8 ff ff 8d 45 d8 31 d2 e8 59 e8 01 00 e9 a5 fc ff ff 31 db e9 81 fc ff ff <0f> 0b 8b 45 b8 e8 43 e0 fe ff ff 75 b0 ff 75 08 ff 75 cc 68 e4 50
[ 734.541200] EAX: eb5f1fe8 EBX: 00400000 ECX: 2b5f1001 EDX: 2b5f1000
[ 734.547456] ESI: ea5d6010 EDI: eb5f1ff0 EBP: da381e14 ESP: da381d84
[ 734.553714] DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068 EFLAGS: 00010202
[ 734.560492] CR0: 80050033 CR2: b7d11f50 CR3: 2a5d6000 CR4: 003406f0
[ 734.566748] DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
[ 734.573006] DR6: fffe0ff0 DR7: 00000400
[ 734.576835] Call Trace:
[ 734.579282] setup_arg_pages+0x22c/0x350
[ 734.583207] ? strlcpy+0x33/0x50
[ 734.586459] load_elf_binary+0x352/0x1010
[ 734.590468] ? selinux_inode_permission+0xe5/0x1f0
[ 734.595254] search_binary_handler+0x77/0x1a0
[ 734.599614] __do_execve_file+0x5aa/0x710
[ 734.603615] sys_execve+0x21/0x30
[ 734.606926] do_fast_syscall_32+0x75/0x260
[ 734.611019] entry_SYSENTER_32+0xa5/0xf8
[ 734.614942] EIP: 0xb7ef6c11
[ 734.617735] Code: Bad RIP value.
[ 734.620956] EAX: ffffffda EBX: bfb8dcb0 ECX: 08069420 EDX: bfb8df14
[ 734.627238] ESI: 080599d4 EDI: bfb8dcb9 EBP: bfb8dd58 ESP: bfb8dc88
[ 734.633499] DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 007b EFLAGS: 00000296
[ 734.640276] ---[ end trace e625d4d55b8380f3 ]---
[ 734.644934] vma dbc4c180 start bf801000 end c0000000
[ 734.644934] next 00000000 prev 00000000 mm d0d75d40
[ 734.644934] prot 25 anon_vma d0dfddc8 vm_ops 00000000
[ 734.644934] pgoff bfa01 file 00000000 private_data 00000000
[ 734.644934] flags: 0x118173(read|write|mayread|maywrite|mayexec|growsdown|seqread|randread|account)
[ 734.674346] old_addr: 0xbfc00000, new_addr: 0xbfa00000, old_end: 0xc0000000
[ 734.681322] old_pmd: 0x77923067
[ 734.681322] new_pmd: 0x7796d067
[ 734.684510] ------------[ cut here ]------------
[ 734.692295] WARNING: CPU: 2 PID: 32321 at mm/mremap.c:212 move_page_tables+0x7c3/0x830
[ 734.700241] Modules linked in: x86_pkg_temp_thermal
[ 734.705128] CPU: 2 PID: 32321 Comm: true Tainted: G W 5.6.2-rc1+ #14
[ 734.712770] Hardware name: Supermicro SYS-5019S-ML/X11SSH-F, BIOS 2.2 05/23/2018
[ 734.720156] EIP: move_page_tables+0x7c3/0x830
[ 734.724506] Code: 0c eb a7 8d 45 d8 83 4d e8 01 e8 c8 e6 01 00 e9 be f8 ff ff 8d 45 d8 31 d2 e8 59 e8 01 00 e9 a5 fc ff ff 31 db e9 81 fc ff ff <0f> 0b 8b 45 b8 e8 43 e0 fe ff ff 75 b0 ff 75 08 ff 75 cc 68 e4 50
[ 734.743256] EAX: eb5f1ff0 EBX: 00200000 ECX: 2b5f1001 EDX: 2b5f1000
[ 734.749517] ESI: ea5d6010 EDI: eb5f1ff8 EBP: da381e14 ESP: da381d84
[ 734.755776] DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068 EFLAGS: 00010202
[ 734.762553] CR0: 80050033 CR2: b7f1f4e0 CR3: 2a5d6000 CR4: 003406f0
[ 734.768808] DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
[ 734.775067] DR6: fffe0ff0 DR7: 00000400
[ 734.778898] Call Trace:
[ 734.781344] setup_arg_pages+0x22c/0x350
[ 734.785269] ? strlcpy+0x33/0x50
[ 734.788499] load_elf_binary+0x352/0x1010
[ 734.792503] ? selinux_inode_permission+0xe5/0x1f0
[ 734.797287] search_binary_handler+0x77/0x1a0
[ 734.801639] __do_execve_file+0x5aa/0x710
[ 734.805642] sys_execve+0x21/0x30
[ 734.808953] do_fast_syscall_32+0x75/0x260
[ 734.813045] entry_SYSENTER_32+0xa5/0xf8
[ 734.816959] EIP: 0xb7ef6c11
[ 734.819752] Code: Bad RIP value.
[ 734.822976] EAX: ffffffda EBX: bfb8dcb0 ECX: 08069420 EDX: bfb8df14
[ 734.829261] ESI: 080599d4 EDI: bfb8dcb9 EBP: bfb8dd58 ESP: bfb8dc88
[ 734.835525] DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 007b EFLAGS: 00000296
[ 734.842302] ---[ end trace e625d4d55b8380f4 ]---
[ 734.846940] vma dbc4c180 start bf801000 end c0000000
[ 734.846940] next 00000000 prev 00000000 mm d0d75d40
[ 734.846940] prot 25 anon_vma d0dfddc8 vm_ops 00000000
[ 734.846940] pgoff bfa01 file 00000000 private_data 00000000
[ 734.846940] flags: 0x118173(read|write|mayread|maywrite|mayexec|growsdown|seqread|randread|account)
[ 734.876355] old_addr: 0xbfe00000, new_addr: 0xbfc00000, old_end: 0xc0000000
[ 734.883316] old_pmd: 0x77a08067
Full test log, https://lkft.validation.linaro.org/scheduler/job/1334357#L478
- Naresh
On Fri, Apr 03, 2020 at 12:56:57AM +0530, Naresh Kamboju wrote:
> [ 734.876355] old_addr: 0xbfe00000, new_addr: 0xbfc00000, old_end: 0xc0000000
The ranges are overlapping. We don't expect it. mremap(2) never does this.
shift_arg_pages() only moves range downwards. It should be safe.
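The overlap can be checked from the first set of values printed by the debug patch (old_addr 0xbfc00000, new_addr 0xbfa00000, old_end 0xc0000000). This is a minimal sketch, assuming the destination has the same length as the source and that both are half-open ranges. struct range, dest_range() and ranges_overlap() are hypothetical helpers, not kernel functions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Half-open address range [start, end). */
struct range {
	uint32_t start;
	uint32_t end;
};

/* True if the two half-open ranges share at least one address. */
bool ranges_overlap(struct range a, struct range b)
{
	return a.start < b.end && b.start < a.end;
}

/*
 * Destination of moving [old_addr, old_end) to new_addr.
 * Assumption: the destination has the same length as the source.
 */
struct range dest_range(uint32_t old_addr, uint32_t old_end, uint32_t new_addr)
{
	struct range r;

	assert(old_addr <= old_end);
	r.start = new_addr;
	r.end = new_addr + (old_end - old_addr);
	return r;
}
```

For the logged values, the source is [0xbfc00000, 0xc0000000) and the destination works out to [0xbfa00000, 0xbfe00000), so the two share [0xbfc00000, 0xbfe00000).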
Could you try this:
diff --git a/mm/mremap.c b/mm/mremap.c
index af363063ea23..ddd5337de395 100644
--- a/mm/mremap.c
+++ b/mm/mremap.c
@@ -205,10 +205,14 @@ static bool move_normal_pmd(struct vm_area_struct *vma, unsigned long old_addr,
 		return false;

 	/*
-	 * The destination pmd shouldn't be established, free_pgtables()
-	 * should have release it.
+	 * mremap(2) clears the new place, so the new_pmd is expected to be
+	 * clear.
+	 *
+	 * But move_page_tables() is also called from shift_arg_pages() that
+	 * allows for overlapping address ranges. The shift_arg_pages() case
+	 * is also safe as we only move page tables downwards.
 	 */
-	if (WARN_ON(!pmd_none(*new_pmd)))
+	if (WARN_ON(!pmd_none(*new_pmd) && old_addr > new_addr))
 		return false;

 	/*
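To see how the proposed condition behaves for a downward move, the expression inside WARN_ON() can be evaluated with the logged addresses. This is only a sketch: proposed_warn_condition() is a hypothetical stand-in, and its new_pmd_populated parameter replaces !pmd_none(*new_pmd).

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Mirrors the expression passed to WARN_ON() in the patch above.
 * new_pmd_populated stands in for !pmd_none(*new_pmd); this helper is
 * illustrative and is not part of the kernel.
 */
bool proposed_warn_condition(bool new_pmd_populated,
			     uint32_t old_addr, uint32_t new_addr)
{
	return new_pmd_populated && old_addr > new_addr;
}
```

For old_addr 0xbfe00000 and new_addr 0xbfc00000 with a populated new_pmd, the expression is true, so this form of the check would still warn when the range moves downwards.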
On Fri, 3 Apr 2020 at 19:02, Kirill A. Shutemov <kirill@shutemov.name> wrote:
> On Fri, Apr 03, 2020 at 12:56:57AM +0530, Naresh Kamboju wrote:
> > [ 734.876355] old_addr: 0xbfe00000, new_addr: 0xbfc00000, old_end: 0xc0000000
> The ranges are overlapping. We don't expect it. mremap(2) never does this.
> shift_arg_pages() only moves range downwards. It should be safe.
> Could you try this:
Applied the patch and tested and still getting kernel warning. CONFIG_HIGHMEM64G=y is still enabled.
[ 790.041040] ------------[ cut here ]------------
[ 790.045664] WARNING: CPU: 3 PID: 3195 at mm/mremap.c:212 move_page_tables+0x7a7/0x840
[ 790.053486] Modules linked in: x86_pkg_temp_thermal
[ 790.058358] CPU: 3 PID: 3195 Comm: true Tainted: G W 5.6.2-rc1+ #15
[ 790.065915] Hardware name: Supermicro SYS-5019S-ML/X11SSH-F, BIOS 2.0b 07/27/2017
[ 790.073386] EIP: move_page_tables+0x7a7/0x840
[ 790.077737] Code: 9f 84 c0 0f 84 b7 fc ff ff 89 c3 e9 ba fe ff ff 8b 40 54 8b 40 10 8b 40 1c 8b 80 20 02 00 00 8b 40 0c 8b 50 08 83 c2 0c eb a7 <0f> 0b e9 55 fd ff ff 8d 45 d8 83 4d e8 01 e8 c6 e6 01 00 e9 ac f8
[ 790.096475] EAX: bfe00000 EBX: 00200000 ECX: 07606001 EDX: 07606000
[ 790.102732] ESI: c64c0010 EDI: c7606ff8 EBP: c845de14 ESP: c845dd7c
[ 790.108989] DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068 EFLAGS: 00010206
[ 790.115764] CR0: 80050033 CR2: b7e13b50 CR3: 064c0000 CR4: 003406f0
[ 790.122024] DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
[ 790.128281] DR6: fffe0ff0 DR7: 00000400
[ 790.132111] Call Trace:
[ 790.134558] setup_arg_pages+0x22c/0x350
[ 790.138514] ? strlcpy+0x33/0x50
[ 790.141776] load_elf_binary+0x352/0x1010
[ 790.145788] ? selinux_inode_permission+0xe5/0x1f0
[ 790.150573] search_binary_handler+0x77/0x1a0
[ 790.154931] __do_execve_file+0x5aa/0x710
[ 790.158935] sys_execve+0x21/0x30
[ 790.162246] do_fast_syscall_32+0x75/0x260
[ 790.166336] entry_SYSENTER_32+0xa5/0xf8
[ 790.170254] EIP: 0xb7f12c11
[ 790.173045] Code: Bad RIP value.
[ 790.176266] EAX: ffffffda EBX: bfc687d0 ECX: 08069420 EDX: bfc68a34
[ 790.182548] ESI: 080599d4 EDI: bfc687d9 EBP: bfc68878 ESP: bfc687a8
[ 790.188808] DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 007b EFLAGS: 00000296
[ 790.195585] ---[ end trace e8f9014a5c1de460 ]---
full test log, https://lkft.validation.linaro.org/scheduler/job/1339582#L9858
- Naresh
On Sat, Apr 04, 2020 at 08:10:42PM +0530, Naresh Kamboju wrote:
> On Fri, 3 Apr 2020 at 19:02, Kirill A. Shutemov <kirill@shutemov.name> wrote:
> > On Fri, Apr 03, 2020 at 12:56:57AM +0530, Naresh Kamboju wrote:
> > > [ 734.876355] old_addr: 0xbfe00000, new_addr: 0xbfc00000, old_end: 0xc0000000
> > The ranges are overlapping. We don't expect it. mremap(2) never does this.
> > shift_arg_pages() only moves range downwards. It should be safe.
> > Could you try this:
> Applied the patch and tested and still getting kernel warning. CONFIG_HIGHMEM64G=y is still enabled.
> [ 790.041040] ------------[ cut here ]------------
> [ 790.045664] WARNING: CPU: 3 PID: 3195 at mm/mremap.c:212 move_page_tables+0x7a7/0x840
Are you sure the patch is applied? The line number in the warning is supposed to change.
On Sat, 4 Apr 2020 at 21:36, Kirill A. Shutemov <kirill@shutemov.name> wrote:
> On Sat, Apr 04, 2020 at 08:10:42PM +0530, Naresh Kamboju wrote:
> > On Fri, 3 Apr 2020 at 19:02, Kirill A. Shutemov <kirill@shutemov.name> wrote:
> > > On Fri, Apr 03, 2020 at 12:56:57AM +0530, Naresh Kamboju wrote:
> > > > [ 734.876355] old_addr: 0xbfe00000, new_addr: 0xbfc00000, old_end: 0xc0000000
> > > The ranges are overlapping. We don't expect it. mremap(2) never does this.
> > > shift_arg_pages() only moves range downwards. It should be safe.
> > > Could you try this:
> > Applied the patch and tested and still getting kernel warning. CONFIG_HIGHMEM64G=y is still enabled.
> > [ 790.041040] ------------[ cut here ]------------
> > [ 790.045664] WARNING: CPU: 3 PID: 3195 at mm/mremap.c:212 move_page_tables+0x7a7/0x840
> Are you sure the patch is applied? The line number in the warning is supposed to change.
Yes. The patch was applied and tested. The line number changed because linux/mmdebug.h is now included; the earlier debugging patch needed it for dump_vma(vma).
diff --git a/mm/mremap.c b/mm/mremap.c
index af363063ea23..cf02d4244e83 100644
--- a/mm/mremap.c
+++ b/mm/mremap.c
@@ -24,6 +24,7 @@
 #include <linux/uaccess.h>
 #include <linux/mm-arch-hooks.h>
 #include <linux/userfaultfd_k.h>
+#include <linux/mmdebug.h>

 #include <asm/cacheflush.h>
 #include <asm/tlbflush.h>
@@ -208,7 +209,7 @@ static bool move_normal_pmd(struct vm_area_struct *vma, unsigned long old_addr,
 	 * The destination pmd shouldn't be established, free_pgtables()
 	 * should have release it.
 	 */
-	if (WARN_ON(!pmd_none(*new_pmd)))
+	if (WARN_ON(!pmd_none(*new_pmd) && old_addr > new_addr))
 		return false;

 	/*
- Naresh