Linux 3.18 no longer boots under Xen.
This has been true for over half a year. The Xen project CI has been
sending automatic mails including bisection reports (see below).
I emailed Xen kernel folks and got no takers for fixing this.
Unless this is fixed soon, or at least someone shows some inclination
to investigate this regression, I intend to drop all testing of this
"stable" branch. It has rotted and no-one is fixing it.
> >> > *** Found and reproduced problem changeset ***
> >> >
> >> > Bug is in tree: linux git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git
> >> > Bug introduced: 6b1ae527b1fdee86e81da0cb26ced75731c6c0fa
> >> > Bug not present: ba6984fc0162f24a510ebc34e881b546b69c553b
> >> > Last fail repro: http://logs.test-lab.xenproject.org/osstest/logs/136574/
> >
> > It seems that there is something wrong with the IGB driver.
Additionally, Jan Beulich writes:
> Which in turn reminds me of a patch of mine that was backported
> (and spotted by an earlier bisection), and that I've suggested
> (twice already iirc) was either backported in error, or without some
> further necessary changes. Iirc the stable tree maintainer for that
> branch was Cc-ed back then, and if so I'd conclude he doesn't care.
Thanks,
Ian.
[Please CC me, as I am not subscribed to the list.]
Attempting to build the tools/power/x86/turbostat/ binary fails:
[linux-4.4.180]$ make -C tools/power/x86/turbostat/
make: Entering directory `/linux-stable/linux-4.4.180/tools/power/x86/turbostat'
gcc -Wall -I../../../include -DMSRHEADER='"../../../../arch/x86/include/asm/msr-index.h"' turbostat.c -o /linux-stable/linux-4.4.180/tools/power/x86/turbostat/turbostat
In file included from turbostat.c:23:0:
../../../../arch/x86/include/asm/msr-index.h:4:24: fatal error: linux/bits.h: No such file or directory
#include <linux/bits.h>
^
compilation terminated.
make: *** [turbostat] Error 1
make: Leaving directory `/linux-stable/linux-4.4.180/tools/power/x86/turbostat'
[linux-4.4.180]$
A bisection showed:
683f9fba8c27817b6c2f7320a4095ca353022651 is the first bad commit
commit 683f9fba8c27817b6c2f7320a4095ca353022651
Author: Thomas Gleixner <tglx(a)linutronix.de>
Date: Thu Feb 21 12:36:50 2019 +0100
x86/msr-index: Cleanup bit defines
commit d8eabc37310a92df40d07c5a8afc53cebf996716 upstream.
Greg pointed out that speculation related bit defines are using (1 << N)
format instead of BIT(N). Aside of that (1 << N) is wrong as it should use
1UL at least.
Clean it up.
[ Josh Poimboeuf: Fix tools build ]
Reported-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Signed-off-by: Thomas Gleixner <tglx(a)linutronix.de>
Reviewed-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Reviewed-by: Borislav Petkov <bp(a)suse.de>
Reviewed-by: Frederic Weisbecker <frederic(a)kernel.org>
Reviewed-by: Jon Masters <jcm(a)redhat.com>
Tested-by: Jon Masters <jcm(a)redhat.com>
[bwh: Backported to 4.4:
- Drop change to x86_energy_perf_policy, which doesn't use msr-index.h here
- Drop changes to flush MSRs which we haven't defined]
Signed-off-by: Ben Hutchings <ben(a)decadent.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
:040000 040000 0ce430a14e73eef1007bf1558693e75e95ffe39a 3ab5675ed0798fc61e7d67ade87ac58dbbf33756 M arch
:040000 040000 d45f1a90570a44d8924711e56280cde7041328de c603a03d7801225fb15869d1386224f793f1ba1d M tools
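For context, the bisected commit switches the MSR bit defines from (1 << N) to BIT(N), which is why msr-index.h now needs linux/bits.h. The sketch below illustrates the type difference the commit message mentions. MY_BIT, msr_flag_set and the two size helpers are hypothetical names for illustration only; the kernel's own BIT() comes from kernel headers that this standalone example deliberately does not include.

```c
#include <stdint.h>

/* Hypothetical stand-in for the kernel's BIT(); not the kernel header. */
#define MY_BIT(n) (1UL << (n))

/* Test a flag in a 64-bit MSR value using the unsigned-long based mask. */
static inline int msr_flag_set(uint64_t msr_val, unsigned int n)
{
	return (msr_val & MY_BIT(n)) != 0;
}

/* The constant (1 << n) has type int; MY_BIT(n) has type unsigned long. */
static inline unsigned long int_mask_size(void)
{
	return sizeof(1 << 0);
}

static inline unsigned long ulong_mask_size(void)
{
	return sizeof(MY_BIT(0));
}
```

With the int form, shifting into the sign bit is not well defined, whereas the unsigned form is safe for every bit position that fits in an unsigned long.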
Fix by modifying the turbostat/Makefile CFLAGS
and one #include line of the turbostat.c file.
Signed-off-by: Alan Bartlett <ajb(a)elrepo.org>
Tested-by: Akemi Yagi <toracat(a)elrepo.org>
Reviewed-by: Philip J Perry <phil(a)elrepo.org>
---
diff -Npru a/tools/power/x86/turbostat/Makefile b/tools/power/x86/turbostat/Makefile
--- a/tools/power/x86/turbostat/Makefile 2019-05-16 13:45:18.000000000 -0400
+++ b/tools/power/x86/turbostat/Makefile 2019-05-21 10:19:21.580477034 -0400
@@ -8,8 +8,7 @@ ifeq ("$(origin O)", "command line")
endif
turbostat : turbostat.c
-CFLAGS += -Wall -I../../../include
-CFLAGS += -DMSRHEADER='"../../../../arch/x86/include/asm/msr-index.h"'
+CFLAGS += -Wall
%: %.c
@mkdir -p $(BUILD_OUTPUT)
diff -Npru a/tools/power/x86/turbostat/turbostat.c b/tools/power/x86/turbostat/turbostat.c
--- a/tools/power/x86/turbostat/turbostat.c 2019-05-16 13:45:18.000000000 -0400
+++ b/tools/power/x86/turbostat/turbostat.c 2019-05-21 10:29:58.007236178 -0400
@@ -20,7 +20,7 @@
*/
#define _GNU_SOURCE
-#include MSRHEADER
+#include <asm/msr-index.h>
#include <stdarg.h>
#include <stdio.h>
#include <err.h>
Sasha Levin <sashal(a)kernel.org> writes:
> Hi,
>
> [This is an automated email]
>
> This commit has been processed because it contains a "Fixes:" tag,
> fixing commit: 6588c1e3ff014 signals: SI_USER: Masquerade si_pid when crossing pid ns boundary.
>
> The bot has tested the following trees: v5.1.4, v5.0.18, v4.19.45, v4.14.121, v4.9.178, v4.4.180, v3.18.140.
>
> v5.1.4: Build OK!
> v5.0.18: Build OK!
> v4.19.45: Failed to apply! Possible dependencies:
> 4cd2e0e70af68 ("signal: Introduce copy_siginfo_from_user and use it's return value")
> ae7795bc6187a ("signal: Distinguish between kernel_siginfo and siginfo")
> efc463adbccf7 ("signal: Simplify tracehook_report_syscall_exit")
>
> v4.14.121: Failed to apply! Possible dependencies:
> 212a36a17efe4 ("signal: Unify and correct copy_siginfo_from_user32")
> 3eb0f5193b497 ("signal: Ensure every siginfo we send has all bits initialized")
> 3f7c86b2382ea ("arm64: Update fault_info table with new exception types")
> 526c3ddb6aa27 ("signal/arm64: Document conflicts with SI_USER and SIGFPE,SIGTRAP,SIGBUS")
> 532826f3712b6 ("arm64: Mirror arm for unimplemented compat syscalls")
> 6b4f3d01052a4 ("usb, signal, security: only pass the cred, not the secid, to kill_pid_info_as_cred and security_task_kill")
> 92ff0674f5d80 ("arm64: mm: Rework unhandled user pagefaults to call arm64_force_sig_info")
> ae7795bc6187a ("signal: Distinguish between kernel_siginfo and siginfo")
> af40ff687bc9d ("arm64: signal: Ensure si_code is valid for all fault signals")
> b713da69e4c91 ("signal: unify compat_siginfo_t")
> ea64d5acc8f03 ("signal: Unify and correct copy_siginfo_to_user32")
> efc463adbccf7 ("signal: Simplify tracehook_report_syscall_exit")
>
> v4.9.178: Failed to apply! Possible dependencies:
> 359566faefa85 ("kernel_wait4()/kernel_waitid(): delay copying status to userland")
> 4c48abe91be03 ("waitid(): switch copyout of siginfo to unsafe_put_user()")
> 4e2648db9c5f7 ("ARM: remove indirection of asm/mach-types.h")
> 4f4ddad395b04 ("nios2: put setup.h in uapi")
> 53d3eaa315082 ("posix_cpu_timers: Move the add_device_randomness() call to a proper place")
> 67d7ddded322d ("waitid(2): leave copyout of siginfo to syscall itself")
> 6bc51cbaa9d75 ("signal: Remove non-uapi <asm/siginfo.h>")
> 7e95a225901a5 ("move compat wait4 and waitid next to native variants")
> 80dce5e374930 ("signal/ia64: Document a conflict with SI_USER with SIGFPE")
> 8f95c90ceb541 ("sched/wait, RCU: Introduce rcuwait machinery")
> 96a8fae0fe094 ("ARM: convert to generated system call tables")
> ae7795bc6187a ("signal: Distinguish between kernel_siginfo and siginfo")
> b9253a43370e8 ("signal: Move copy_siginfo_to_user to <linux/signal.h>")
> cc731525f26af ("signal: Remove kernel interal si_code magic")
> cc9f72e474a4d ("signal/sparc: Document a conflict with SI_USER with SIGFPE")
> ce72a16fa705f ("wait4(2)/waitid(2): separate copying rusage to userland")
> d08477aa975e9 ("fcntl: Don't use ambiguous SIG_POLL si_codes")
> e2bd64d92a10f ("signal/alpha: Document a conflict with SI_USER for SIGTRAP")
> ea1b75cf91380 ("signal/mips: Document a conflict with SI_USER with SIGFPE")
> ea64d5acc8f03 ("signal: Unify and correct copy_siginfo_to_user32")
>
> v4.4.180: Failed to apply! Possible dependencies:
> 2b5e869ecfcb3 ("MIPS: ELF: Interpret the NAN2008 file header flag")
> 4f4acc9472e54 ("parisc: Fix SIGSYS signals in compat case")
> 5050e91fa650e ("MIPS: Support sending SIG_SYS to 32bit userspace from 64bit kernel")
> 5fa393c857195 ("MIPS: Break down cacheops.h definitions")
> 6846351052e68 ("x86/signal: Add SA_{X32,IA32}_ABI sa_flags")
> 694977006a7ba ("MIPS: Use enums to make asm/pgtable-bits.h readable")
> 745f355878462 ("MIPS: mm: Unify pte_page definition")
> 780602d740fc0 ("MIPS: mm: Standardise on _PAGE_NO_READ, drop _PAGE_READ")
> 7939469da29a8 ("MIPS64: signal: Fix o32 sigaction syscall")
> 7b2cb64f91f25 ("MIPS: mm: Fix MIPS32 36b physical addressing (alchemy, netlogic)")
> 80dce5e374930 ("signal/ia64: Document a conflict with SI_USER with SIGFPE")
> 97f2645f358b4 ("tree-wide: replace config_enabled() with IS_ENABLED()")
> a4455082dc6f0 ("x86/signals: Add missing signal_compat code for x86 features")
> a60ae81e5e591 ("MIPS: CM: Fix mips_cm_max_vp_width for UP kernels")
> ae7795bc6187a ("signal: Distinguish between kernel_siginfo and siginfo")
> b1b4fad5cc678 ("MIPS: seccomp: Support compat with both O32 and N32")
> b27873702b060 ("mips, thp: remove infrastructure for handling splitting PMDs")
> b2edcfc814017 ("MIPS: Loongson: Add Loongson-3A R2 basic support")
> cc731525f26af ("signal: Remove kernel interal si_code magic")
> cc9f72e474a4d ("signal/sparc: Document a conflict with SI_USER with SIGFPE")
> e2bd64d92a10f ("signal/alpha: Document a conflict with SI_USER for SIGTRAP")
> ea1b75cf91380 ("signal/mips: Document a conflict with SI_USER with SIGFPE")
> ea64d5acc8f03 ("signal: Unify and correct copy_siginfo_to_user32")
>
> v3.18.140: Failed to apply! Possible dependencies:
> 1a3d59579b9f4 ("MIPS: Tidy up FPU context switching")
> 304acb717e5b6 ("MIPS: Set `si_code' for SIGFPE signals sent from emulation too")
> 4227a2d4efc9c ("MIPS: Support for hybrid FPRs")
> 443c44032a54f ("MIPS: Always clear FCSR cause bits after emulation")
> 4a7c2371823a4 ("MIPS: Reindent R6 RI exception emulation")
> 53f037b08b5be ("ia64: Sync struct siginfo with general version")
> 5a1aca4469fdc ("MIPS: Fix FCSR Cause bit handling for correct SIGFPE issue")
> 5f9f41c474bef ("MIPS: kernel: Prepare the JR instruction for emulation on MIPS R6")
> 7c151d3d5d7a0 ("MIPS: Make use of the ERETNC instruction on MIPS R6")
> 80dce5e374930 ("signal/ia64: Document a conflict with SI_USER with SIGFPE")
> 9cc719ab3f4f6 ("MIPS: MSA: bugfix - disable MSA correctly for new threads/processes.")
> ae7795bc6187a ("signal: Distinguish between kernel_siginfo and siginfo")
> b0a668fb2038d ("MIPS: kernel: mips-r2-to-r6-emul: Add R2 emulator for MIPS R6")
> cc5e9097c9aad ("arm64: add SIGSYS siginfo for compat task")
> cc731525f26af ("signal: Remove kernel interal si_code magic")
> e2bd64d92a10f ("signal/alpha: Document a conflict with SI_USER for SIGTRAP")
> ea1b75cf91380 ("signal/mips: Document a conflict with SI_USER with SIGFPE")
> ea64d5acc8f03 ("signal: Unify and correct copy_siginfo_to_user32")
> ed2d72c1eb364 ("MIPS: Respect the FCSR exception mask for `si_code'")
> f51246efee2b6 ("MIPS: Get rid of finish_arch_switch().")
> fad0bfdb893ac ("MIPS: mips-r2-to-r6-emul.h: Inline empty `mipsr2_decoder'")
>
>
> How should we proceed with this patch?
I have not had any reports of anyone having problems, and this
only triggers when signals traverse a pid or a user namespace
boundary.
So while this is indeed a fix I think the usual best effort backport
will be fine.
If backporting further is desired it looks like the only real dependency
is the addition of the function siginfo_layout. So it should not be as
difficult as the automated scripts suggest.
Eric
On Wed, May 29, 2019 at 3:15 PM Sasha Levin <sashal(a)kernel.org> wrote:
> [This is an automated email]
>
> This commit has been processed because it contains a "Fixes:" tag,
> fixing commit: 287980e49ffc0 remove lots of IS_ERR_VALUE abuses.
>
> The bot has tested the following trees: v5.1.4, v5.0.18, v4.19.45, v4.14.121, v4.9.178.
>
> v5.1.4: Build OK!
> v5.0.18: Build OK!
> v4.19.45: Build OK!
> v4.14.121: Build OK!
> v4.9.178: Failed to apply! Possible dependencies:
[...]
> ddb4a1442def2 ("exec: Rename bprm->cred_prepared to called_set_creds")
[...]
> How should we proceed with this patch?
I think the dependency is on ddb4a1442def2, but the simplest way is
probably to adjust the fix manually; it's basically the same in 4.9.
From: Adrian Hunter <adrian.hunter(a)intel.com>
Fix intel-pt documentation to reflect the change of itrace defaults for
perf script.
Signed-off-by: Adrian Hunter <adrian.hunter(a)intel.com>
Cc: Jiri Olsa <jolsa(a)redhat.com>
Cc: stable(a)vger.kernel.org
Fixes: 4eb068157121 ("perf script: Make itrace script default to all calls")
Link: http://lkml.kernel.org/r/20190520113728.14389-4-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme(a)redhat.com>
---
tools/perf/Documentation/intel-pt.txt | 10 +++++-----
1 file changed, 5 insertions(+), 5 deletions(-)
diff --git a/tools/perf/Documentation/intel-pt.txt b/tools/perf/Documentation/intel-pt.txt
index 115eaacc455f..60d99e5e7921 100644
--- a/tools/perf/Documentation/intel-pt.txt
+++ b/tools/perf/Documentation/intel-pt.txt
@@ -88,16 +88,16 @@ smaller.
To represent software control flow, "branches" samples are produced. By default
a branch sample is synthesized for every single branch. To get an idea what
-data is available you can use the 'perf script' tool with no parameters, which
-will list all the samples.
+data is available you can use the 'perf script' tool with all itrace sampling
+options, which will list all the samples.
perf record -e intel_pt//u ls
- perf script
+ perf script --itrace=ibxwpe
An interesting field that is not printed by default is 'flags' which can be
displayed as follows:
- perf script -Fcomm,tid,pid,time,cpu,event,trace,ip,sym,dso,addr,symoff,flags
+ perf script --itrace=ibxwpe -F+flags
The flags are "bcrosyiABEx" which stand for branch, call, return, conditional,
system, asynchronous, interrupt, transaction abort, trace begin, trace end, and
@@ -713,7 +713,7 @@ Having no option is the same as
which, in turn, is the same as
- --itrace=ibxwpe
+ --itrace=cepwx
The letters are:
--
2.20.1
Hi,
Does this patch require that all #MC exception errors set MCG_STATUS_RIPV = 1?
I ask because on offline CPUs the patch doesn't seem to work for #MC exception
errors that set MCG_STATUS_RIPV = 0 (such as "Recoverable-not-continuable SRAR
Type" errors). Is this patch's "return;" in the wrong place?
Thanks
Tony W Wang-oc
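For readers unfamiliar with the flag asked about above, RIPV is bit 0 of the IA32_MCG_STATUS register. The following is a minimal sketch of testing it, not the patch under discussion; mce_can_restart is a hypothetical helper name, and the bit values are written out locally rather than taken from a kernel header.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit positions in IA32_MCG_STATUS, defined locally for this sketch. */
#define MCG_STATUS_RIPV (1ULL << 0) /* restart IP valid */
#define MCG_STATUS_EIPV (1ULL << 1) /* error IP valid */
#define MCG_STATUS_MCIP (1ULL << 2) /* machine check in progress */

/* Hypothetical helper: true if the interrupted context can be resumed. */
static inline bool mce_can_restart(uint64_t mcg_status)
{
	return (mcg_status & MCG_STATUS_RIPV) != 0;
}
```

A handler that returns early only when this check is true would not cover the RIPV = 0 case raised in the question.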
While hunting an issue in swiotlb-xen I stumbled over a wrong test
and found some areas for improvement.
Juergen Gross (3):
xen/swiotlb: fix condition for calling xen_destroy_contiguous_region()
xen/swiotlb: simplify range_straddles_page_boundary()
xen/swiotlb: remember having called xen_create_contiguous_region()
drivers/xen/swiotlb-xen.c | 36 ++++++++++++------------------------
include/linux/page-flags.h | 3 +++
2 files changed, 15 insertions(+), 24 deletions(-)
--
2.16.4
The patch titled
Subject: mm: hugetlb: soft-offline: fix wrong return value of soft offline
has been removed from the -mm tree. Its filename was
mm-hugetlb-soft-offline-fix-wrong-return-value-of-soft-offline.patch
This patch was dropped because an updated version will be merged
------------------------------------------------------
From: Naoya Horiguchi <n-horiguchi(a)ah.jp.nec.com>
Subject: mm: hugetlb: soft-offline: fix wrong return value of soft offline
Soft offline events for hugetlb pages return -EBUSY when page migration
succeeded and dissolve_free_huge_page() failed, which can happen when
there're surplus hugepages. We should judge pass/fail of soft offline by
checking whether the raw error page was finally contained or not (i.e.
the result of set_hwpoison_free_buddy_page()), so this behavior is wrong.
This problem was introduced by the following change of commit 6bc9b56433b76
("mm: fix race on soft-offlining"):
if (ret > 0)
ret = -EIO;
} else {
- if (PageHuge(page))
- dissolve_free_huge_page(page);
+ /*
+ * We set PG_hwpoison only when the migration source hugepage
+ * was successfully dissolved, because otherwise hwpoisoned
+ * hugepage remains on free hugepage list, then userspace will
+ * find it as SIGBUS by allocation failure. That's not expected
+ * in soft-offlining.
+ */
+ ret = dissolve_free_huge_page(page);
+ if (!ret) {
+ if (set_hwpoison_free_buddy_page(page))
+ num_poisoned_pages_inc();
+ }
}
return ret;
}
so a simple fix is to restore the PageHuge precheck, but my code reading
shows that we already have a PageHuge check in dissolve_free_huge_page()
with hugetlb_lock, which is a better place to check it. Currently
dissolve_free_huge_page() returns -EBUSY for !PageHuge, but that's simply
wrong because that case should be considered a success (meaning that
"the given hugetlb was already dissolved").
This change affects other callers of dissolve_free_huge_page(), which are
also cleaned up by this patch.
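
The return convention described above can be modeled in user space roughly as follows. This is a simplified sketch under stated assumptions, not the kernel code: struct mock_page and both mock_* functions are hypothetical, locking and the hstate bookkeeping are omitted, and the in_buddy field merely stands in for the result of set_hwpoison_free_buddy_page().

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical, heavily simplified stand-in for struct page. */
struct mock_page {
	bool huge;      /* models PageHuge(page) */
	int refcount;   /* models page_count(page) */
	bool in_buddy;  /* models set_hwpoison_free_buddy_page() succeeding */
};

/* Models the patched return convention of dissolve_free_huge_page(). */
static int mock_dissolve_free_huge_page(struct mock_page *page)
{
	if (!page->huge)
		return 0;       /* already dissolved: success, not -EBUSY */
	if (page->refcount != 0)
		return -EBUSY;  /* hugepage still in use */
	page->huge = false;     /* dissolved into base pages */
	return 0;
}

/* Models the tail of soft offline: success only if the page was contained. */
static int mock_soft_offline_tail(struct mock_page *page, int *num_poisoned)
{
	int ret = mock_dissolve_free_huge_page(page);

	if (!ret) {
		if (page->in_buddy)
			(*num_poisoned)++;
		else
			ret = -EBUSY;
	}
	return ret;
}
```

The point of the model is that callers no longer need their own PageHuge precheck: a page that is not a hugepage simply falls through to the containment step.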
Link: http://lkml.kernel.org/r/1558937200-18544-1-git-send-email-n-horiguchi@ah.j…
Fixes: 6bc9b56433b76 ("mm: fix race on soft-offlining")
Signed-off-by: Naoya Horiguchi <n-horiguchi(a)ah.jp.nec.com>
Reported-by: Chen, Jerry T <jerry.t.chen(a)intel.com>
Cc: Michal Hocko <mhocko(a)kernel.org>
Cc: Mike Kravetz <mike.kravetz(a)oracle.com>
Cc: "Zhuo, Qiuxu" <qiuxu.zhuo(a)intel.com>
Cc: <stable(a)vger.kernel.org> [4.19+]
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/hugetlb.c | 15 +++++++++------
mm/memory-failure.c | 7 +++----
2 files changed, 12 insertions(+), 10 deletions(-)
--- a/mm/hugetlb.c~mm-hugetlb-soft-offline-fix-wrong-return-value-of-soft-offline
+++ a/mm/hugetlb.c
@@ -1519,7 +1519,12 @@ int dissolve_free_huge_page(struct page
int rc = -EBUSY;
spin_lock(&hugetlb_lock);
- if (PageHuge(page) && !page_count(page)) {
+ if (!PageHuge(page)) {
+ rc = 0;
+ goto out;
+ }
+
+ if (!page_count(page)) {
struct page *head = compound_head(page);
struct hstate *h = page_hstate(head);
int nid = page_to_nid(head);
@@ -1564,11 +1569,9 @@ int dissolve_free_huge_pages(unsigned lo
for (pfn = start_pfn; pfn < end_pfn; pfn += 1 << minimum_order) {
page = pfn_to_page(pfn);
- if (PageHuge(page) && !page_count(page)) {
- rc = dissolve_free_huge_page(page);
- if (rc)
- break;
- }
+ rc = dissolve_free_huge_page(page);
+ if (rc)
+ break;
}
return rc;
--- a/mm/memory-failure.c~mm-hugetlb-soft-offline-fix-wrong-return-value-of-soft-offline
+++ a/mm/memory-failure.c
@@ -1733,6 +1733,8 @@ static int soft_offline_huge_page(struct
if (!ret) {
if (set_hwpoison_free_buddy_page(page))
num_poisoned_pages_inc();
+ else
+ ret = -EBUSY;
}
}
return ret;
@@ -1857,11 +1859,8 @@ static int soft_offline_in_use_page(stru
static int soft_offline_free_page(struct page *page)
{
- int rc = 0;
- struct page *head = compound_head(page);
+ int rc = dissolve_free_huge_page(page);
- if (PageHuge(head))
- rc = dissolve_free_huge_page(page);
if (!rc) {
if (set_hwpoison_free_buddy_page(page))
num_poisoned_pages_inc();
_
Patches currently in -mm which might be from n-horiguchi(a)ah.jp.nec.com are