stable/linux-4.4.y boot: 124 boots: 3 failed, 117 passed with 2 offline, 1 untried/unknown, 1 conflict (v4.4.151)
Full Boot Summary: https://kernelci.org/boot/all/job/stable/branch/linux-4.4.y/kernel/v4.4.151/
Full Build Summary: https://kernelci.org/build/stable/branch/linux-4.4.y/kernel/v4.4.151/
Tree: stable
Branch: linux-4.4.y
Git Describe: v4.4.151
Git Commit: 78f654f6cce3442937b8c7eb4b640357871363c1
Git URL: http://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git
Tested: 43 unique boards, 20 SoC families, 22 builds out of 191
Boot Regressions Detected:
arm:
multi_v7_defconfig:
    stih410-b2120: lab-baylibre-seattle: failing since 15 days (last pass: v4.4.145 - first fail: v4.4.146)
    tegra124-jetson-tk1: lab-baylibre: new failure (last pass: v4.4.148)
x86:
defconfig+kvm_guest: x86-x5-z8350: lab-mhart: failing since 6 days (last pass: v4.4.147 - first fail: v4.4.148)
x86_64_defconfig: x86-x5-z8350: lab-mhart: failing since 6 days (last pass: v4.4.147 - first fail: v4.4.148)
Boot Failures Detected:
arm:
multi_v7_defconfig stih410-b2120: 1 failed lab
x86:
defconfig+kvm_guest x86-x5-z8350: 1 failed lab
x86_64_defconfig x86-x5-z8350: 1 failed lab
Offline Platforms:
arm:
exynos_defconfig: exynos5800-peach-pi: 1 offline lab
davinci_all_defconfig: dm365evm,legacy: 1 offline lab
Conflicting Boot Failure Detected: (These likely are not failures as other labs are reporting PASS. Needs review.)
arm:
multi_v7_defconfig: tegra124-jetson-tk1:
    lab-baylibre: FAIL
    lab-mhart: PASS
    lab-baylibre-seattle: PASS
    lab-collabora: PASS
---
For more info write to info@kernelci.org
On 22/08/18 14:51, kernelci.org bot wrote:
> Boot Regressions Detected:
> [...]
> x86:
> defconfig+kvm_guest: x86-x5-z8350: lab-mhart: failing since 6 days (last pass: v4.4.147 - first fail: v4.4.148)
> x86_64_defconfig: x86-x5-z8350: lab-mhart: failing since 6 days (last pass: v4.4.147 - first fail: v4.4.148)
We've had a couple of automated boot bisections pointing at the same commit on an x86 Atom x5-Z8350 platform (AAEON UP-CHT01), please see details below. The issue seems to have started with v4.4.148.
Here are the boot details and the log showing the issue on v4.4.151:
https://kernelci.org/boot/id/5b7d39ea59b514c03796ba9c/
https://storage.kernelci.org/stable/linux-4.4.y/v4.4.151/x86/x86_64_defconfi...
[ 0.073668] swapper/0: Corrupted page table at address 5b95ef78
[ 0.080286] PGD 272a067 PUD 272d067 PMD 5b8000000e3
[ 0.085847] Bad pagetable: 0009 [#1] SMP
Below is the result of the last kernelci.org automated bisection. I haven't done any further investigation, this is merely sharing the results so please take it with a pinch of salt!
Hope this helps.
Best wishes, Guillaume
--------------------------------------8<--------------------------------------
Bisection result for stable/linux-4.4.y (v4.4.151) on x86-x5-z8350
  Good:       8404ae6c8c9f Linux 4.4.147
  Bad:        78f654f6cce3 Linux 4.4.151
  Found:      02ff2769edbc x86/mm/pat: Make set_memory_np() L1TF safe
Checks:
  revert:     PASS
  verify:     PASS
Parameters:
  Tree:       stable
  URL:        http://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git
  Branch:     linux-4.4.y
  Target:     x86-x5-z8350
  Lab:        lab-mhart
  Config:     x86_64_defconfig
  Plan:       boot
Breaking commit found:
-------------------------------------------------------------------------------
commit 02ff2769edbce2261e981effbc3c4b98fae4faf0
Author: Andi Kleen <ak@linux.intel.com>
Date:   Tue Aug 7 15:09:39 2018 -0700

    x86/mm/pat: Make set_memory_np() L1TF safe

    commit 958f79b9ee55dfaf00c8106ed1c22a2919e0028b upstream

    set_memory_np() is used to mark kernel mappings not present, but it has
    it's own open coded mechanism which does not have the L1TF protection of
    inverting the address bits.

    Replace the open coded PTE manipulation with the L1TF protecting low level
    PTE routines.

    Passes the CPA self test.

    Signed-off-by: Andi Kleen <ak@linux.intel.com>
    Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
    [ dwmw2: Pull in pud_mkhuge() from commit a00cc7d9dd, and pfn_pud() ]
    Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
    [groeck: port to 4.4]
    Signed-off-by: Guenter Roeck <linux@roeck-us.net>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

diff --git a/arch/x86/include/asm/pgtable.h b/arch/x86/include/asm/pgtable.h
index b5e157c065ae..4de6c282c02a 100644
--- a/arch/x86/include/asm/pgtable.h
+++ b/arch/x86/include/asm/pgtable.h
@@ -378,12 +378,39 @@ static inline pmd_t pfn_pmd(unsigned long page_nr, pgprot_t pgprot)
 	return __pmd(pfn | massage_pgprot(pgprot));
 }
 
+static inline pud_t pfn_pud(unsigned long page_nr, pgprot_t pgprot)
+{
+	phys_addr_t pfn = page_nr << PAGE_SHIFT;
+	pfn ^= protnone_mask(pgprot_val(pgprot));
+	pfn &= PHYSICAL_PUD_PAGE_MASK;
+	return __pud(pfn | massage_pgprot(pgprot));
+}
+
 static inline pmd_t pmd_mknotpresent(pmd_t pmd)
 {
 	return pfn_pmd(pmd_pfn(pmd),
 		       __pgprot(pmd_flags(pmd) & ~(_PAGE_PRESENT|_PAGE_PROTNONE)));
 }
 
+static inline pud_t pud_set_flags(pud_t pud, pudval_t set)
+{
+	pudval_t v = native_pud_val(pud);
+
+	return __pud(v | set);
+}
+
+static inline pud_t pud_clear_flags(pud_t pud, pudval_t clear)
+{
+	pudval_t v = native_pud_val(pud);
+
+	return __pud(v & ~clear);
+}
+
+static inline pud_t pud_mkhuge(pud_t pud)
+{
+	return pud_set_flags(pud, _PAGE_PSE);
+}
+
 static inline u64 flip_protnone_guard(u64 oldval, u64 val, u64 mask);
 
 static inline pte_t pte_modify(pte_t pte, pgprot_t newprot)
diff --git a/arch/x86/mm/pageattr.c b/arch/x86/mm/pageattr.c
index 79377e2a7bcd..27610c2d1821 100644
--- a/arch/x86/mm/pageattr.c
+++ b/arch/x86/mm/pageattr.c
@@ -1006,8 +1006,8 @@ static int populate_pmd(struct cpa_data *cpa,
 
 		pmd = pmd_offset(pud, start);
 
-		set_pmd(pmd, __pmd(cpa->pfn | _PAGE_PSE |
-				   massage_pgprot(pmd_pgprot)));
+		set_pmd(pmd, pmd_mkhuge(pfn_pmd(cpa->pfn,
+					canon_pgprot(pmd_pgprot))));
 
 		start	  += PMD_SIZE;
 		cpa->pfn  += PMD_SIZE;
@@ -1079,8 +1079,8 @@ static int populate_pud(struct cpa_data *cpa, unsigned long start, pgd_t *pgd,
 	 * Map everything starting from the Gb boundary, possibly with 1G pages
 	 */
 	while (end - start >= PUD_SIZE) {
-		set_pud(pud, __pud(cpa->pfn | _PAGE_PSE |
-				   massage_pgprot(pud_pgprot)));
+		set_pud(pud, pud_mkhuge(pfn_pud(cpa->pfn,
+				   canon_pgprot(pud_pgprot))));
 
 		start	  += PUD_SIZE;
 		cpa->pfn  += PUD_SIZE;
-------------------------------------------------------------------------------
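For readers unfamiliar with the "inverting the address bits" protection that the commit message refers to, the following user-space sketch models what the new pfn_pud() helper in the diff appears to do: shift the page frame number into an address, XOR it with an all-ones mask when the entry is not present, then mask it to the address field. Everything prefixed with `sketch_` or `SKETCH_` is hypothetical and exists only for this illustration. The constants, the rule used to decide "not present" (non-zero protection bits with the present bit clear), and the omission of massage_pgprot() are assumptions, not the kernel's implementation.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical constants for this sketch; they follow the usual x86-64
 * entry layout but are not copied from the kernel headers. */
#define SKETCH_PAGE_SHIFT   12
#define SKETCH_PAGE_PRESENT 0x001ULL
/* Bits 12..51 of an entry hold the physical address. */
#define SKETCH_ADDR_MASK    0x000ffffffffff000ULL

/* All-ones when the protection bits describe a not-present mapping
 * (assumed rule: non-zero value with the present bit clear). */
static uint64_t sketch_protnone_mask(uint64_t prot)
{
	return (prot && !(prot & SKETCH_PAGE_PRESENT)) ? ~0ULL : 0;
}

/* Build an entry from a page frame number. The caller passes a frame
 * number, not a physical address: the shift happens in here. */
static uint64_t sketch_make_entry(uint64_t pfn, uint64_t prot)
{
	uint64_t addr = pfn << SKETCH_PAGE_SHIFT;

	addr ^= sketch_protnone_mask(prot); /* invert if not present */
	addr &= SKETCH_ADDR_MASK;           /* keep only address bits */
	return addr | prot;
}
```

With a present protection value such as 0xe3, `sketch_make_entry(0x5b800, 0xe3)` yields 0x5b8000e3, so the address bits are stored unchanged. With the present bit cleared (0xe2), the stored address bits are the complement of the real ones, and XORing the entry with all-ones again recovers frame 0x5b800. Note that the helper expects a frame number; passing a value that is already a physical address would shift it a second time.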
Git bisection log:
-------------------------------------------------------------------------------
git bisect start
# good: [8404ae6c8c9ff23a06cf38112e83002e1088bfe1] Linux 4.4.147
git bisect good 8404ae6c8c9ff23a06cf38112e83002e1088bfe1
# bad: [78f654f6cce3442937b8c7eb4b640357871363c1] Linux 4.4.151
git bisect bad 78f654f6cce3442937b8c7eb4b640357871363c1
# bad: [6b06f36f07e2c91ad0126f17d0fc8f933c827da8] x86/mm/kmmio: Make the tracer robust against L1TF
git bisect bad 6b06f36f07e2c91ad0126f17d0fc8f933c827da8
# good: [90a231c63cc28d896ab353b027011a949e9884d3] x86/speculation/l1tf: Increase 32bit PAE __PHYSICAL_PAGE_SHIFT
git bisect good 90a231c63cc28d896ab353b027011a949e9884d3
# good: [9ac0dc7d949db7afd4116d55fa4fcf6a66d820f0] mm: fix cache mode tracking in vm_insert_mixed()
git bisect good 9ac0dc7d949db7afd4116d55fa4fcf6a66d820f0
# good: [dc48c1a2f45b628d3128ad4bb31d1bcd342c059d] x86/cpufeatures: Add detection of L1D cache flush support.
git bisect good dc48c1a2f45b628d3128ad4bb31d1bcd342c059d
# good: [0aae5fe8413dfcd949d0df1c7d6b835efecd5b3b] x86/speculation/l1tf: Invert all not present mappings
git bisect good 0aae5fe8413dfcd949d0df1c7d6b835efecd5b3b
# bad: [02ff2769edbce2261e981effbc3c4b98fae4faf0] x86/mm/pat: Make set_memory_np() L1TF safe
git bisect bad 02ff2769edbce2261e981effbc3c4b98fae4faf0
# good: [9feecdb6cb73feaa55b0135aee8777eaac848c78] x86/speculation/l1tf: Make pmd/pud_mknotpresent() invert
git bisect good 9feecdb6cb73feaa55b0135aee8777eaac848c78
# first bad commit: [02ff2769edbce2261e981effbc3c4b98fae4faf0] x86/mm/pat: Make set_memory_np() L1TF safe
-------------------------------------------------------------------------------
For more info write to info@kernelci.org
Kernel-build-reports mailing list
Kernel-build-reports@lists.linaro.org
https://lists.linaro.org/mailman/listinfo/kernel-build-reports
Hi Guillaume,
On Fri, Aug 24, 2018 at 08:32:07AM +0100, Guillaume Tucker wrote:
> We've had a couple of automated boot bisections pointing at the same commit on an x86 Atom x5-Z8350 platform (AAEON UP-CHT01), please see details below. The issue seems to have started with v4.4.148.
Can you test if a backport of upstream commit d367cef0a7f0 ("x86/mm/pat: Fix boot crash when 1GB pages are not supported by the CPU") fixes the problem ? It doesn't apply cleanly, but the conflict is easy to resolve.
Thanks, Guenter
On Fri, Aug 24, 2018 at 09:30:22AM -0700, Guenter Roeck wrote:
> Hi Guillaume,
> On Fri, Aug 24, 2018 at 08:32:07AM +0100, Guillaume Tucker wrote:
> > We've had a couple of automated boot bisections pointing at the same commit on an x86 Atom x5-Z8350 platform (AAEON UP-CHT01), please see details below. The issue seems to have started with v4.4.148.
> Can you test if a backport of upstream commit d367cef0a7f0 ("x86/mm/pat: Fix boot crash when 1GB pages are not supported by the CPU") fixes the problem ? It doesn't apply cleanly, but the conflict is easy to resolve.
We also may have to reapply commit 87e2bd898d3a ("x86/mm/pat: Ensure cpa->pfn only contains page frame numbers") and fix whatever problems it had.
Guenter
On 24/08/18 18:41, Guenter Roeck wrote:
> On Fri, Aug 24, 2018 at 09:30:22AM -0700, Guenter Roeck wrote:
> > Hi Guillaume,
> > On Fri, Aug 24, 2018 at 08:32:07AM +0100, Guillaume Tucker wrote:
> > > We've had a couple of automated boot bisections pointing at the same commit on an x86 Atom x5-Z8350 platform (AAEON UP-CHT01), please see details below. The issue seems to have started with v4.4.148.
> > Can you test if a backport of upstream commit d367cef0a7f0 ("x86/mm/pat: Fix boot crash when 1GB pages are not supported by the CPU") fixes the problem ? It doesn't apply cleanly, but the conflict is easy to resolve.
> We also may have to reapply commit 87e2bd898d3a ("x86/mm/pat: Ensure cpa->pfn only contains page frame numbers") and fix whatever problems it had.
I've applied these 2 patches on top of v4.4.152:
https://gitlab.collabora.com/gtucker/linux/commits/linux-4.4.152-bisect-atom...
Then I've run the tests using these 2 commits, but they both still failed to boot on that platform:
http://lava.streamtester.net/scheduler/job/145549
http://lava.streamtester.net/scheduler/job/145548
For the record, here's the same test with v4.4.147 that works:
http://lava.streamtester.net/scheduler/job/145547
Let me know if you have other fixes to apply, I can take another look at some point next week.
Guillaume
On Fri, Aug 24, 2018 at 09:17:55PM +0100, Guillaume Tucker wrote:
> On 24/08/18 18:41, Guenter Roeck wrote:
> > On Fri, Aug 24, 2018 at 09:30:22AM -0700, Guenter Roeck wrote:
> > > Hi Guillaume,
> > > On Fri, Aug 24, 2018 at 08:32:07AM +0100, Guillaume Tucker wrote:
> > > > We've had a couple of automated boot bisections pointing at the same commit on an x86 Atom x5-Z8350 platform (AAEON UP-CHT01), please see details below. The issue seems to have started with v4.4.148.
> > > Can you test if a backport of upstream commit d367cef0a7f0 ("x86/mm/pat: Fix boot crash when 1GB pages are not supported by the CPU") fixes the problem ? It doesn't apply cleanly, but the conflict is easy to resolve.
> > We also may have to reapply commit 87e2bd898d3a ("x86/mm/pat: Ensure cpa->pfn only contains page frame numbers") and fix whatever problems it had.
> I've applied these 2 patches on top of v4.4.152:
> https://gitlab.collabora.com/gtucker/linux/commits/linux-4.4.152-bisect-atom...
> Then I've run the tests using these 2 commits, but they both still failed to boot on that platform:
> http://lava.streamtester.net/scheduler/job/145549
> http://lava.streamtester.net/scheduler/job/145548
> For the record, here's the same test with v4.4.147 that works:
> http://lava.streamtester.net/scheduler/job/145547
> Let me know if you have other fixes to apply, I can take another look at some point next week.
Roland Dreier is working on a more comprehensive series. I copied you on the thread. Looks like I missed an entire sequence of EFI related patches in the backport.
Does anyone know if qemu supports EFI ?
Guenter
> Roland Dreier is working on a more comprehensive series. I copied you on the thread. Looks like I missed an entire sequence of EFI related patches in the backport.
I don't remember any EFI specific patches for L1TF. Which ones do you mean?
> Does anyone know if qemu supports EFI ?
Yes it does, but the exact calls may still depend on the hardware.
-Andi
On Fri, Aug 24, 2018 at 01:50:47PM -0700, Andi Kleen wrote:
> > Roland Dreier is working on a more comprehensive series. I copied you on the thread. Looks like I missed an entire sequence of EFI related patches in the backport.
> I don't remember any EFI specific patches for L1TF. Which ones do you mean?
Not specifically for L1TF, but general EFI patches which may be needed as prerequisite. I'll copy you on the thread discussing which patches may be needed.
> > Does anyone know if qemu supports EFI ?
> Yes it does, but the exact calls may still depend on the hardware.
Apparently so; I managed to boot Linux using qemu in efi mode, but I am unable to reproduce the problem.
Guenter
On Fri, Aug 24, 2018 at 08:32:07AM +0100, Guillaume Tucker wrote:
> We've had a couple of automated boot bisections pointing at the same commit on an x86 Atom x5-Z8350 platform (AAEON UP-CHT01), please see details below. The issue seems to have started with v4.4.148.
I'll take a look.
Do they all fail with the same message?
[ 0.065860] ACPI: 8 ACPI AML tables successfully acquired and loaded
[ 0.073668] swapper/0: Corrupted page table at address 5b95ef78
[ 0.080286] PGD 272a067 PUD 272d067 PMD 5b8000000e3
[ 0.085847] Bad pagetable: 0009 [#1] SMP
?
-Andi
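As an aid to reading the crash message quoted above, the sketch below decodes the two numbers it contains: the error code after "Bad pagetable:" and the PMD value. The page-fault error code bits are architectural (bit 0 reports that the entry was present, bit 3 reports a reserved bit set in a paging-structure entry). The helper names and masks prefixed with `sketch_` or `SKETCH_` are hypothetical and exist only for this illustration. The final comparison assumes the faulting address is identity mapped, which the thread does not state.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Architectural x86 page-fault error code bits. */
#define SKETCH_PF_PROT 0x1u /* the faulting entry was marked present */
#define SKETCH_PF_RSVD 0x8u /* a reserved bit was set in a paging entry */

/* Bits 12..51 of an x86-64 paging entry hold the physical address. */
#define SKETCH_ADDR_MASK 0x000ffffffffff000ULL
/* A 2 MiB mapping is aligned on its low 21 bits. */
#define SKETCH_PMD_PAGE_MASK (~((1ULL << 21) - 1))

/* True when the error code reports a reserved-bit violation
 * on an entry that was marked present. */
static bool sketch_is_reserved_bit_fault(unsigned int error_code)
{
	return (error_code & SKETCH_PF_PROT) && (error_code & SKETCH_PF_RSVD);
}

/* Address field of a paging entry. */
static uint64_t sketch_entry_addr(uint64_t entry)
{
	return entry & SKETCH_ADDR_MASK;
}

/* Low 12 flag bits of a paging entry. */
static unsigned int sketch_entry_flags(uint64_t entry)
{
	return (unsigned int)(entry & 0xfffULL);
}

/* Base of the 2 MiB region that contains an address. */
static uint64_t sketch_pmd_base(uint64_t addr)
{
	return addr & SKETCH_PMD_PAGE_MASK;
}
```

Applied to the log, `sketch_is_reserved_bit_fault(0x0009)` is true, `sketch_entry_flags(0x5b8000000e3)` is 0xe3 (present, writable, accessed, dirty, large page) and `sketch_entry_addr(0x5b8000000e3)` is 0x5b800000000. The 2 MiB base of the faulting address 5b95ef78 is 0x5b800000, and shifting that left by 12 bits gives exactly 0x5b800000000. That would be consistent with a physical address having been shifted as if it were a page frame number, which is the kind of problem the title of commit 87e2bd898d3a alludes to, but this is only an observation from the numbers and not a diagnosis confirmed in the thread.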