The patch titled
Subject: mm: memcontrol: flush percpu vmstats before releasing memcg
has been removed from the -mm tree. Its filename was
mm-memcontrol-flush-percpu-vmstats-before-releasing-memcg.patch
This patch was dropped because it was merged into mainline or a subsystem tree
------------------------------------------------------
From: Roman Gushchin <guro(a)fb.com>
Subject: mm: memcontrol: flush percpu vmstats before releasing memcg
Percpu caching of local vmstats with the conditional propagation by the
cgroup tree leads to an accumulation of errors on non-leaf levels.
Let's imagine two nested memory cgroups A and A/B. Say, a process
belonging to A/B allocates 100 pagecache pages on CPU 0. The percpu
cache will spill 3 times, so that 32*3=96 pages will be accounted to the
A/B and A atomic vmstat counters, and 4 pages will remain in the percpu
cache.
Imagine A/B is near memory.max, so that every following allocation
triggers a direct reclaim on the local CPU. Say, each such attempt will
free 16 pages on a new CPU. That means every percpu cache will have -16
pages, except the first one, which will have 4 - 16 = -12. A/B and A
atomic counters will not be touched at all.
Now a user removes A/B. All percpu caches are freed and corresponding
vmstat numbers are forgotten. A has 96 pages more than expected.
As memory cgroups are created and destroyed, errors do accumulate. Even
differences of 1-2 pages can accumulate into large numbers.
To fix this issue let's accumulate and propagate percpu vmstat values
before releasing the memory cgroup. At this point these numbers are
stable and cannot be changed.
Since on cpu hotplug we do flush percpu vmstats anyway, we can iterate
only over online cpus.
Link: http://lkml.kernel.org/r/20190819202338.363363-2-guro@fb.com
Fixes: 42a300353577 ("mm: memcontrol: fix recursive statistics correctness & scalabilty")
Signed-off-by: Roman Gushchin <guro(a)fb.com>
Acked-by: Michal Hocko <mhocko(a)suse.com>
Cc: Johannes Weiner <hannes(a)cmpxchg.org>
Cc: Vladimir Davydov <vdavydov.dev(a)gmail.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/memcontrol.c | 40 ++++++++++++++++++++++++++++++++++++++++
1 file changed, 40 insertions(+)
--- a/mm/memcontrol.c~mm-memcontrol-flush-percpu-vmstats-before-releasing-memcg
+++ a/mm/memcontrol.c
@@ -3260,6 +3260,41 @@ static u64 mem_cgroup_read_u64(struct cg
 	}
 }
 
+static void memcg_flush_percpu_vmstats(struct mem_cgroup *memcg)
+{
+	unsigned long stat[MEMCG_NR_STAT];
+	struct mem_cgroup *mi;
+	int node, cpu, i;
+
+	for (i = 0; i < MEMCG_NR_STAT; i++)
+		stat[i] = 0;
+
+	for_each_online_cpu(cpu)
+		for (i = 0; i < MEMCG_NR_STAT; i++)
+			stat[i] += raw_cpu_read(memcg->vmstats_percpu->stat[i]);
+
+	for (mi = memcg; mi; mi = parent_mem_cgroup(mi))
+		for (i = 0; i < MEMCG_NR_STAT; i++)
+			atomic_long_add(stat[i], &mi->vmstats[i]);
+
+	for_each_node(node) {
+		struct mem_cgroup_per_node *pn = memcg->nodeinfo[node];
+		struct mem_cgroup_per_node *pi;
+
+		for (i = 0; i < NR_VM_NODE_STAT_ITEMS; i++)
+			stat[i] = 0;
+
+		for_each_online_cpu(cpu)
+			for (i = 0; i < NR_VM_NODE_STAT_ITEMS; i++)
+				stat[i] += raw_cpu_read(
+					pn->lruvec_stat_cpu->count[i]);
+
+		for (pi = pn; pi; pi = parent_nodeinfo(pi, node))
+			for (i = 0; i < NR_VM_NODE_STAT_ITEMS; i++)
+				atomic_long_add(stat[i], &pi->lruvec_stat[i]);
+	}
+}
+
 #ifdef CONFIG_MEMCG_KMEM
 static int memcg_online_kmem(struct mem_cgroup *memcg)
 {
@@ -4682,6 +4717,11 @@ static void __mem_cgroup_free(struct mem
 {
 	int node;
 
+	/*
+	 * Flush percpu vmstats to guarantee the value correctness
+	 * on parent's and all ancestor levels.
+	 */
+	memcg_flush_percpu_vmstats(memcg);
 	for_each_node(node)
 		free_mem_cgroup_per_node_info(memcg, node);
 	free_percpu(memcg->vmstats_percpu);
_
Patches currently in -mm which might be from guro(a)fb.com are
mm-memcontrol-flush-percpu-slab-vmstats-on-kmem-offlining.patch
partially-revert-mm-memcontrolc-keep-local-vm-counters-in-sync-with-the-hierarchical-ones.patch
mm-memcontrol-switch-to-rcu-protection-in-drain_all_stock.patch
The patch titled
Subject: mm, page_alloc: move_freepages should not examine struct page of reserved memory
has been removed from the -mm tree. Its filename was
mm-page_alloc-move_freepages-should-not-examine-struct-page-of-reserved-memory.patch
This patch was dropped because it was merged into mainline or a subsystem tree
------------------------------------------------------
From: David Rientjes <rientjes(a)google.com>
Subject: mm, page_alloc: move_freepages should not examine struct page of reserved memory
After commit 907ec5fca3dc ("mm: zero remaining unavailable struct pages"),
struct page of reserved memory is zeroed. This causes page->flags to be 0
and fixes issues related to reading /proc/kpageflags, for example, of
reserved memory.
The VM_BUG_ON() in move_freepages_block(), however, assumes that
page_zone() is meaningful even for reserved memory. That assumption is no
longer true after the aforementioned commit.
There's no reason why move_freepages_block() should be testing the
legitimacy of page_zone() for reserved memory; its scope is limited only
to pages on the zone's freelist.
Note that pfn_valid() can be true for reserved memory: there is a backing
struct page. The check for page_to_nid(page) is also buggy but reserved
memory normally only appears on node 0 so the zeroing doesn't affect this.
Move the debug checks to after verifying PageBuddy is true. This isolates
the scope of the checks to only be for buddy pages which are on the zone's
freelist which move_freepages_block() is operating on. In this case, an
incorrect node or zone is a bug worthy of being warned about (and the
examination of struct page is acceptable because this memory is not
reserved).
Why does move_freepages_block() get called on reserved memory? It's
simply math after finding a valid free page from the per-zone free area to
use as fallback. We find the beginning and end of the pageblock of the
valid page and that can bring us into memory that was reserved per the
e820. pfn_valid() is still true (it's backed by a struct page), but since
it's zero'd we shouldn't make any inferences here about comparing its node
or zone. The current node check just happens to succeed most of the time
by luck because reserved memory typically appears on node 0.
The fix here is to validate that we actually have buddy pages before
testing if there's any type of zone or node strangeness going on.
We noticed it almost immediately after bringing 907ec5fca3dc in on
CONFIG_DEBUG_VM builds. It depends on finding specific free pages in
the per-zone free area where the math in move_freepages() will bring
the start or end pfn into reserved memory and wanting to claim that
entire pageblock as a new migratetype. So the path will be rare,
require CONFIG_DEBUG_VM, and require fallback to a different
migratetype.
Some struct pages were already zeroed from reserve pages before
907ec5fca3dc so it theoretically could trigger before this commit. I
think it's rare enough under a config option that most people don't run
that others may not have noticed. I wouldn't argue against a stable
tag and the backport should be easy enough, but probably wouldn't
single out a commit that this is fixing.
Mel said:
: The overhead of the debugging check is higher with this patch although
: it'll only affect debug builds and the path is not particularly hot.
: If this was a concern, I think it would be reasonable to simply remove
: the debugging check as the zone boundaries are checked in
: move_freepages_block and we never expect a zone/node to be smaller than
: a pageblock and stuck in the middle of another zone.
Link: http://lkml.kernel.org/r/alpine.DEB.2.21.1908122036560.10779@chino.kir.corp…
Signed-off-by: David Rientjes <rientjes(a)google.com>
Acked-by: Mel Gorman <mgorman(a)techsingularity.net>
Cc: Naoya Horiguchi <n-horiguchi(a)ah.jp.nec.com>
Cc: Masayoshi Mizuma <m.mizuma(a)jp.fujitsu.com>
Cc: Oscar Salvador <osalvador(a)suse.de>
Cc: Pavel Tatashin <pavel.tatashin(a)microsoft.com>
Cc: Vlastimil Babka <vbabka(a)suse.cz>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/page_alloc.c | 19 ++++---------------
1 file changed, 4 insertions(+), 15 deletions(-)
--- a/mm/page_alloc.c~mm-page_alloc-move_freepages-should-not-examine-struct-page-of-reserved-memory
+++ a/mm/page_alloc.c
@@ -2238,27 +2238,12 @@ static int move_freepages(struct zone *z
 	unsigned int order;
 	int pages_moved = 0;
 
-#ifndef CONFIG_HOLES_IN_ZONE
-	/*
-	 * page_zone is not safe to call in this context when
-	 * CONFIG_HOLES_IN_ZONE is set. This bug check is probably redundant
-	 * anyway as we check zone boundaries in move_freepages_block().
-	 * Remove at a later date when no bug reports exist related to
-	 * grouping pages by mobility
-	 */
-	VM_BUG_ON(pfn_valid(page_to_pfn(start_page)) &&
-	          pfn_valid(page_to_pfn(end_page)) &&
-	          page_zone(start_page) != page_zone(end_page));
-#endif
 	for (page = start_page; page <= end_page;) {
 		if (!pfn_valid_within(page_to_pfn(page))) {
 			page++;
 			continue;
 		}
 
-		/* Make sure we are not inadvertently changing nodes */
-		VM_BUG_ON_PAGE(page_to_nid(page) != zone_to_nid(zone), page);
-
 		if (!PageBuddy(page)) {
 			/*
 			 * We assume that pages that could be isolated for
@@ -2273,6 +2258,10 @@ static int move_freepages(struct zone *z
 			continue;
 		}
 
+		/* Make sure we are not inadvertently changing nodes */
+		VM_BUG_ON_PAGE(page_to_nid(page) != zone_to_nid(zone), page);
+		VM_BUG_ON_PAGE(page_zone(page) != zone, page);
+
 		order = page_order(page);
 		move_to_free_area(page, &zone->free_area[order], migratetype);
 		page += 1 << order;
_
Patches currently in -mm which might be from rientjes(a)google.com are
The patch titled
Subject: mm/z3fold.c: fix race between migration and destruction
has been removed from the -mm tree. Its filename was
mm-z3foldc-fix-race-between-migration-and-destruction.patch
This patch was dropped because it was merged into mainline or a subsystem tree
------------------------------------------------------
From: Henry Burns <henryburns(a)google.com>
Subject: mm/z3fold.c: fix race between migration and destruction
In z3fold_destroy_pool() we call destroy_workqueue(&pool->compact_wq).
However, we have no guarantee that migration isn't happening in the
background at that time.
Migration directly calls queue_work_on(pool->compact_wq); if destruction
wins that race, we are using a destroyed workqueue.
Link: http://lkml.kernel.org/r/20190809213828.202833-1-henryburns@google.com
Signed-off-by: Henry Burns <henryburns(a)google.com>
Cc: Vitaly Wool <vitalywool(a)gmail.com>
Cc: Shakeel Butt <shakeelb(a)google.com>
Cc: Jonathan Adams <jwadams(a)google.com>
Cc: Henry Burns <henrywolfeburns(a)gmail.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/z3fold.c | 89 ++++++++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 89 insertions(+)
--- a/mm/z3fold.c~mm-z3foldc-fix-race-between-migration-and-destruction
+++ a/mm/z3fold.c
@@ -41,6 +41,7 @@
#include <linux/workqueue.h>
#include <linux/slab.h>
#include <linux/spinlock.h>
+#include <linux/wait.h>
#include <linux/zpool.h>
#include <linux/magic.h>
@@ -145,6 +146,8 @@ struct z3fold_header {
* @release_wq: workqueue for safe page release
* @work: work_struct for safe page release
* @inode: inode for z3fold pseudo filesystem
+ * @destroying: bool to stop migration once we start destruction
+ * @isolated: int to count the number of pages currently in isolation
*
* This structure is allocated at pool creation time and maintains metadata
* pertaining to a particular z3fold pool.
@@ -163,8 +166,11 @@ struct z3fold_pool {
const struct zpool_ops *zpool_ops;
struct workqueue_struct *compact_wq;
struct workqueue_struct *release_wq;
+ struct wait_queue_head isolate_wait;
struct work_struct work;
struct inode *inode;
+ bool destroying;
+ int isolated;
};
/*
@@ -769,6 +775,7 @@ static struct z3fold_pool *z3fold_create
goto out_c;
spin_lock_init(&pool->lock);
spin_lock_init(&pool->stale_lock);
+ init_waitqueue_head(&pool->isolate_wait);
pool->unbuddied = __alloc_percpu(sizeof(struct list_head)*NCHUNKS, 2);
if (!pool->unbuddied)
goto out_pool;
@@ -808,6 +815,15 @@ out:
return NULL;
}
+static bool pool_isolated_are_drained(struct z3fold_pool *pool)
+{
+ bool ret;
+
+ spin_lock(&pool->lock);
+ ret = pool->isolated == 0;
+ spin_unlock(&pool->lock);
+ return ret;
+}
/**
* z3fold_destroy_pool() - destroys an existing z3fold pool
* @pool: the z3fold pool to be destroyed
@@ -817,6 +833,22 @@ out:
static void z3fold_destroy_pool(struct z3fold_pool *pool)
{
kmem_cache_destroy(pool->c_handle);
+ /*
+ * We set pool->destroying under lock to ensure that
+ * z3fold_page_isolate() sees any changes to destroying. This way we
+ * avoid the need for any memory barriers.
+ */
+
+ spin_lock(&pool->lock);
+ pool->destroying = true;
+ spin_unlock(&pool->lock);
+
+ /*
+ * We need to ensure that no pages are being migrated while we destroy
+ * these workqueues, as migration can queue work on either of the
+ * workqueues.
+ */
+ wait_event(pool->isolate_wait, !pool_isolated_are_drained(pool));
/*
* We need to destroy pool->compact_wq before pool->release_wq,
@@ -1307,6 +1339,28 @@ static u64 z3fold_get_pool_size(struct z
return atomic64_read(&pool->pages_nr);
}
+/*
+ * z3fold_dec_isolated() expects to be called while pool->lock is held.
+ */
+static void z3fold_dec_isolated(struct z3fold_pool *pool)
+{
+ assert_spin_locked(&pool->lock);
+ VM_BUG_ON(pool->isolated <= 0);
+ pool->isolated--;
+
+ /*
+ * If we have no more isolated pages, we have to see if
+ * z3fold_destroy_pool() is waiting for a signal.
+ */
+ if (pool->isolated == 0 && waitqueue_active(&pool->isolate_wait))
+ wake_up_all(&pool->isolate_wait);
+}
+
+static void z3fold_inc_isolated(struct z3fold_pool *pool)
+{
+ pool->isolated++;
+}
+
static bool z3fold_page_isolate(struct page *page, isolate_mode_t mode)
{
struct z3fold_header *zhdr;
@@ -1333,6 +1387,33 @@ static bool z3fold_page_isolate(struct p
spin_lock(&pool->lock);
if (!list_empty(&page->lru))
list_del(&page->lru);
+ /*
+ * We need to check for destruction while holding pool->lock, as
+ * otherwise destruction could see 0 isolated pages, and
+ * proceed.
+ */
+ if (unlikely(pool->destroying)) {
+ spin_unlock(&pool->lock);
+ /*
+ * If this page isn't stale, somebody else holds a
+ * reference to it. Let's drop our refcount so that they
+ * can call the release logic.
+ */
+ if (unlikely(kref_put(&zhdr->refcount,
+ release_z3fold_page_locked))) {
+ /*
+ * If we get here we have kref problems, so we
+ * should freak out.
+ */
+ WARN(1, "Z3fold is experiencing kref problems\n");
+ return false;
+ }
+ z3fold_page_unlock(zhdr);
+ return false;
+ }
+
+
+ z3fold_inc_isolated(pool);
spin_unlock(&pool->lock);
z3fold_page_unlock(zhdr);
return true;
@@ -1401,6 +1482,10 @@ static int z3fold_page_migrate(struct ad
queue_work_on(new_zhdr->cpu, pool->compact_wq, &new_zhdr->work);
+ spin_lock(&pool->lock);
+ z3fold_dec_isolated(pool);
+ spin_unlock(&pool->lock);
+
page_mapcount_reset(page);
put_page(page);
return 0;
@@ -1420,10 +1505,14 @@ static void z3fold_page_putback(struct p
INIT_LIST_HEAD(&page->lru);
if (kref_put(&zhdr->refcount, release_z3fold_page_locked)) {
atomic64_dec(&pool->pages_nr);
+ spin_lock(&pool->lock);
+ z3fold_dec_isolated(pool);
+ spin_unlock(&pool->lock);
return;
}
spin_lock(&pool->lock);
list_add(&page->lru, &pool->lru);
+ z3fold_dec_isolated(pool);
spin_unlock(&pool->lock);
z3fold_page_unlock(zhdr);
}
_
Patches currently in -mm which might be from henryburns(a)google.com are
This is the start of the stable review cycle for the 4.14.137 release.
There are 53 patches in this series, all will be posted as a response
to this one. If anyone has any issues with these being applied, please
let me know.
Responses should be made by Wed 07 Aug 2019 12:47:58 PM UTC.
Anything received after that time might be too late.
The whole patch series can be found in one patch at:
https://www.kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.14.137-r…
or in the git tree and branch at:
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-4.14.y
and the diffstat can be found below.
thanks,
greg k-h
-------------
Pseudo-Shortlog of commits:
Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Linux 4.14.137-rc1
Andy Lutomirski <luto(a)kernel.org>
x86/vdso: Prevent segfaults due to hoisted vclock reads
Linus Torvalds <torvalds(a)linux-foundation.org>
gcc-9: properly declare the {pv,hv}clock_page storage
Josh Poimboeuf <jpoimboe(a)redhat.com>
objtool: Support GCC 9 cold subfunction naming scheme
Jean Delvare <jdelvare(a)suse.de>
eeprom: at24: make spd world-readable again
John Fleck <john.fleck(a)intel.com>
IB/hfi1: Check for error on call to alloc_rsm_map_table
Yishai Hadas <yishaih(a)mellanox.com>
IB/mlx5: Fix RSS Toeplitz setup to be aligned with the HW specification
Yishai Hadas <yishaih(a)mellanox.com>
IB/mlx5: Move MRs to a kernel PD when freeing them to the MR cache
Yishai Hadas <yishaih(a)mellanox.com>
IB/mlx5: Use direct mkey destroy command upon UMR unreg failure
Yishai Hadas <yishaih(a)mellanox.com>
IB/mlx5: Fix unreg_umr to ignore the mkey state
Juergen Gross <jgross(a)suse.com>
xen/swiotlb: fix condition for calling xen_destroy_contiguous_region()
Munehisa Kamata <kamatam(a)amazon.com>
nbd: replace kill_bdev() with __invalidate_device() again
Will Deacon <will(a)kernel.org>
drivers/perf: arm_pmu: Fix failure path in PM notifier
Helge Deller <deller(a)gmx.de>
parisc: Fix build of compressed kernel even with debug enabled
Stefan Haberland <sth(a)linux.ibm.com>
s390/dasd: fix endless loop after read unit address configuration
Ondrej Mosnacek <omosnace(a)redhat.com>
selinux: fix memory leak in policydb_init()
Gustavo A. R. Silva <gustavo(a)embeddedor.com>
IB/hfi1: Fix Spectre v1 vulnerability
Michael Wu <michael.wu(a)vatics.com>
gpiolib: fix incorrect IRQ requesting of an active-low lineevent
Douglas Anderson <dianders(a)chromium.org>
mmc: dw_mmc: Fix occasional hang after tuning on eMMC
Filipe Manana <fdmanana(a)suse.com>
Btrfs: fix race leading to fs corruption after transaction abort
Filipe Manana <fdmanana(a)suse.com>
Btrfs: fix incremental send failure after deduplication
Masahiro Yamada <yamada.masahiro(a)socionext.com>
kbuild: initialize CLANG_FLAGS correctly in the top Makefile
Yongxin Liu <yongxin.liu(a)windriver.com>
drm/nouveau: fix memory leak in nouveau_conn_reset()
Zhenzhong Duan <zhenzhong.duan(a)oracle.com>
x86, boot: Remove multiple copy of static function sanitize_boot_params()
Josh Poimboeuf <jpoimboe(a)redhat.com>
x86/paravirt: Fix callee-saved function ELF sizes
Josh Poimboeuf <jpoimboe(a)redhat.com>
x86/kvm: Don't call kvm_spurious_fault() from .fixup
Zhenzhong Duan <zhenzhong.duan(a)oracle.com>
xen/pv: Fix a boot up hang revealed by int3 self test
Kees Cook <keescook(a)chromium.org>
ipc/mqueue.c: only perform resource calculation if user valid
Dan Carpenter <dan.carpenter(a)oracle.com>
drivers/rapidio/devices/rio_mport_cdev.c: NUL terminate some strings
Mikko Rapeli <mikko.rapeli(a)iki.fi>
uapi linux/coda_psdev.h: move upc_req definition from uapi to kernel side headers
Sam Protsenko <semen.protsenko(a)linaro.org>
coda: fix build using bare-metal toolchain
Zhouyang Jia <jiazhouyang09(a)gmail.com>
coda: add error handling for fget
Doug Berger <opendmb(a)gmail.com>
mm/cma.c: fail if fixed declaration can't be honored
Arnd Bergmann <arnd(a)arndb.de>
x86: math-emu: Hide clang warnings for 16-bit overflow
Qian Cai <cai(a)lca.pw>
x86/apic: Silence -Wtype-limits compiler warnings
Benjamin Poirier <bpoirier(a)suse.com>
be2net: Signal that the device cannot transmit during reconfiguration
Arnd Bergmann <arnd(a)arndb.de>
ACPI: fix false-positive -Wuninitialized warning
Arnd Bergmann <arnd(a)arndb.de>
x86: kvm: avoid constant-conversion warning
Benjamin Block <bblock(a)linux.ibm.com>
scsi: zfcp: fix GCC compiler warning emitted with -Wmaybe-uninitialized
Arnd Bergmann <arnd(a)arndb.de>
ACPI: blacklist: fix clang warning for unused DMI table
Jeff Layton <jlayton(a)kernel.org>
ceph: return -ERANGE if virtual xattr value didn't fit in buffer
Andrea Parri <andrea.parri(a)amarulasolutions.com>
ceph: fix improper use of smp_mb__before_atomic()
Ronnie Sahlberg <lsahlber(a)redhat.com>
cifs: Fix a race condition with cifs_echo_request
David Sterba <dsterba(a)suse.com>
btrfs: fix minimum number of chunk errors for DUP
Russell King <rmk+kernel(a)armlinux.org.uk>
fs/adfs: super: fix use-after-free bug
JC Kuo <jckuo(a)nvidia.com>
clk: tegra210: fix PLLU and PLLU_OUT1
Geert Uytterhoeven <geert+renesas(a)glider.be>
dmaengine: rcar-dmac: Reject zero-length slave DMA requests
Petr Cvek <petrcvekcz(a)gmail.com>
MIPS: lantiq: Fix bitfield masking
Prarit Bhargava <prarit(a)redhat.com>
kernel/module.c: Only return -EEXIST for modules that have finished loading
Cheng Jian <cj.chengjian(a)huawei.com>
ftrace: Enable trampoline when rec count returns back to one
Douglas Anderson <dianders(a)chromium.org>
ARM: dts: rockchip: Mark that the rk3288 timer might stop in suspend
Douglas Anderson <dianders(a)chromium.org>
ARM: dts: rockchip: Make rk3288-veyron-mickey's emmc work again
Douglas Anderson <dianders(a)chromium.org>
ARM: dts: rockchip: Make rk3288-veyron-minnie run at hs200
Russell King <rmk+kernel(a)armlinux.org.uk>
ARM: riscpc: fix DMA
-------------
Diffstat:
Makefile | 7 +--
arch/arm/boot/dts/rk3288-veyron-mickey.dts | 4 --
arch/arm/boot/dts/rk3288-veyron-minnie.dts | 4 --
arch/arm/boot/dts/rk3288.dtsi | 1 +
arch/arm/mach-rpc/dma.c | 5 +-
arch/mips/lantiq/irq.c | 5 +-
arch/parisc/boot/compressed/vmlinux.lds.S | 4 +-
arch/x86/boot/compressed/misc.c | 1 +
arch/x86/boot/compressed/misc.h | 1 -
arch/x86/entry/entry_64.S | 1 -
arch/x86/entry/vdso/vclock_gettime.c | 19 +++++--
arch/x86/include/asm/apic.h | 2 +-
arch/x86/include/asm/kvm_host.h | 34 +++++++------
arch/x86/include/asm/paravirt.h | 1 +
arch/x86/include/asm/traps.h | 2 +-
arch/x86/kernel/apic/apic.c | 2 +-
arch/x86/kernel/kvm.c | 1 +
arch/x86/kvm/mmu.c | 6 +--
arch/x86/math-emu/fpu_emu.h | 2 +-
arch/x86/math-emu/reg_constant.c | 2 +-
arch/x86/xen/enlighten_pv.c | 2 +-
arch/x86/xen/xen-asm_64.S | 1 -
drivers/acpi/blacklist.c | 4 ++
drivers/block/nbd.c | 2 +-
drivers/clk/tegra/clk-tegra210.c | 8 +--
drivers/dma/sh/rcar-dmac.c | 2 +-
drivers/gpio/gpiolib.c | 6 ++-
drivers/gpu/drm/nouveau/nouveau_connector.c | 2 +-
drivers/infiniband/hw/hfi1/chip.c | 11 ++++-
drivers/infiniband/hw/hfi1/verbs.c | 2 +
drivers/infiniband/hw/mlx5/mlx5_ib.h | 1 +
drivers/infiniband/hw/mlx5/mr.c | 17 ++++---
drivers/infiniband/hw/mlx5/qp.c | 13 +++--
drivers/misc/eeprom/at24.c | 2 +-
drivers/mmc/host/dw_mmc.c | 3 +-
drivers/net/ethernet/emulex/benet/be_main.c | 6 ++-
drivers/perf/arm_pmu.c | 2 +-
drivers/rapidio/devices/rio_mport_cdev.c | 2 +
drivers/s390/block/dasd_alias.c | 22 ++++++---
drivers/s390/scsi/zfcp_erp.c | 7 +++
drivers/xen/swiotlb-xen.c | 4 +-
fs/adfs/super.c | 5 +-
fs/btrfs/send.c | 77 ++++++-----------------------
fs/btrfs/transaction.c | 10 ++++
fs/btrfs/volumes.c | 3 +-
fs/ceph/super.h | 7 ++-
fs/ceph/xattr.c | 14 +++---
fs/cifs/connect.c | 8 +--
fs/coda/psdev.c | 5 +-
include/linux/acpi.h | 5 +-
include/linux/coda.h | 3 +-
include/linux/coda_psdev.h | 11 +++++
include/uapi/linux/coda_psdev.h | 13 -----
ipc/mqueue.c | 19 +++----
kernel/module.c | 6 +--
kernel/trace/ftrace.c | 28 ++++++-----
mm/cma.c | 13 +++++
security/selinux/ss/policydb.c | 6 ++-
tools/objtool/elf.c | 2 +-
59 files changed, 254 insertions(+), 204 deletions(-)
Hello,
We ran automated tests on a patchset that was proposed for merging into this
kernel tree. The patches were applied to:
Kernel repo: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git
Commit: f7d5b3dc4792 - Linux 5.2.10
The results of these automated tests are provided below.
Overall result: FAILED (see details below)
Merge: OK
Compile: OK
Tests: FAILED
All kernel binaries, config files, and logs are available for download here:
https://artifacts.cki-project.org/pipelines/125539
One or more kernel tests failed:
ppc64le:
❌ xfstests: xfs
We hope that these logs can help you find the problem quickly. For the full
detail on our testing procedures, please scroll to the bottom of this message.
Please reply to this email if you have any questions about the tests that we
ran or if you have any suggestions on how to make future tests more effective.
  ,-.   ,-.
 ( C ) ( K )  Continuous
  `-',-.`-'   Kernel
    ( I )     Integration
     `-'
______________________________________________________________________________
Merge testing
-------------
We cloned this repository and checked out the following commit:
Repo: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git
Commit: f7d5b3dc4792 - Linux 5.2.10
We grabbed the ab8cf3ce2f85 commit of the stable queue repository.
We then merged the patchset with `git am`:
asoc-simple_card_utils.h-care-null-dai-at-asoc_simpl.patch
asoc-simple-card-fix-an-use-after-free-in-simple_dai.patch
asoc-simple-card-fix-an-use-after-free-in-simple_for.patch
asoc-audio-graph-card-fix-use-after-free-in-graph_da.patch
asoc-audio-graph-card-fix-an-use-after-free-in-graph.patch
asoc-audio-graph-card-add-missing-const-at-graph_get.patch
regulator-axp20x-fix-dcdca-and-dcdcd-for-axp806.patch
regulator-axp20x-fix-dcdc5-and-dcdc6-for-axp803.patch
asoc-samsung-odroid-fix-an-use-after-free-issue-for-.patch
asoc-samsung-odroid-fix-a-double-free-issue-for-cpu_.patch
asoc-intel-bytcht_es8316-add-quirk-for-irbis-nb41-ne.patch
hid-logitech-hidpp-add-usb-pid-for-a-few-more-suppor.patch
hid-add-044f-b320-thrustmaster-inc.-2-in-1-dt.patch
mips-kernel-only-use-i8253-clocksource-with-periodic.patch
mips-fix-cacheinfo.patch
libbpf-sanitize-var-to-conservative-1-byte-int.patch
netfilter-ebtables-fix-a-memory-leak-bug-in-compat.patch
asoc-dapm-fix-handling-of-custom_stop_condition-on-d.patch
asoc-sof-use-__u32-instead-of-uint32_t-in-uapi-heade.patch
spi-pxa2xx-balance-runtime-pm-enable-disable-on-erro.patch
bpf-sockmap-sock_map_delete-needs-to-use-xchg.patch
bpf-sockmap-synchronize_rcu-before-free-ing-map.patch
bpf-sockmap-only-create-entry-if-ulp-is-not-already-.patch
selftests-bpf-fix-sendmsg6_prog-on-s390.patch
asoc-dapm-fix-a-memory-leak-bug.patch
bonding-force-slave-speed-check-after-link-state-rec.patch
net-mvpp2-don-t-check-for-3-consecutive-idle-frames-.patch
selftests-forwarding-gre_multipath-enable-ipv4-forwa.patch
selftests-forwarding-gre_multipath-fix-flower-filter.patch
selftests-bpf-add-another-gso_segs-access.patch
libbpf-fix-using-uninitialized-ioctl-results.patch
can-dev-call-netif_carrier_off-in-register_candev.patch
can-mcp251x-add-error-check-when-wq-alloc-failed.patch
can-gw-fix-error-path-of-cgw_module_init.patch
asoc-fail-card-instantiation-if-dai-format-setup-fai.patch
staging-fbtft-fix-gpio-handling.patch
libbpf-silence-gcc8-warning-about-string-truncation.patch
st21nfca_connectivity_event_received-null-check-the-.patch
st_nci_hci_connectivity_event_received-null-check-th.patch
nl-mac-80211-fix-interface-combinations-on-crypto-co.patch
asoc-ti-davinci-mcasp-fix-clk-pdir-handling-for-i2s-.patch
asoc-rockchip-fix-mono-capture.patch
asoc-ti-davinci-mcasp-correct-slot_width-posed-const.patch
net-usb-qmi_wwan-add-the-broadmobi-bm818-card.patch
qed-rdma-fix-the-hw_ver-returned-in-device-attribute.patch
isdn-misdn-hfcsusb-fix-possible-null-pointer-derefer.patch
habanalabs-fix-f-w-download-in-be-architecture.patch
mac80211_hwsim-fix-possible-null-pointer-dereference.patch
net-stmmac-manage-errors-returned-by-of_get_mac_addr.patch
netfilter-ipset-actually-allow-destination-mac-addre.patch
netfilter-ipset-copy-the-right-mac-address-in-bitmap.patch
netfilter-ipset-fix-rename-concurrency-with-listing.patch
rxrpc-fix-potential-deadlock.patch
rxrpc-fix-the-lack-of-notification-when-sendmsg-fail.patch
nvmem-use-the-same-permissions-for-eeprom-as-for-nvm.patch
iwlwifi-mvm-avoid-races-in-rate-init-and-rate-perfor.patch
iwlwifi-dbg_ini-move-iwl_dbg_tlv_load_bin-out-of-deb.patch
iwlwifi-dbg_ini-move-iwl_dbg_tlv_free-outside-of-deb.patch
iwlwifi-fix-locking-in-delayed-gtk-setting.patch
iwlwifi-mvm-send-lq-command-always-async.patch
enetc-fix-build-error-without-phylib.patch
isdn-hfcsusb-fix-misdn-driver-crash-caused-by-transf.patch
net-phy-phy_led_triggers-fix-a-possible-null-pointer.patch
perf-bench-numa-fix-cpu0-binding.patch
spi-pxa2xx-add-support-for-intel-tiger-lake.patch
can-sja1000-force-the-string-buffer-null-terminated.patch
can-peak_usb-force-the-string-buffer-null-terminated.patch
asoc-amd-acp3x-use-dma_ops-of-parent-device-for-acp3.patch
net-ethernet-qlogic-qed-force-the-string-buffer-null.patch
enetc-select-phylib-while-config_fsl_enetc_vf-is-set.patch
nfsv4-fix-a-credential-refcount-leak-in-nfs41_check_.patch
nfsv4-when-recovering-state-fails-with-eagain-retry-.patch
nfsv4.1-fix-open-stateid-recovery.patch
nfsv4.1-only-reap-expired-delegations.patch
nfsv4-fix-a-potential-sleep-while-atomic-in-nfs4_do_.patch
nfs-fix-regression-whereby-fscache-errors-are-appear.patch
hid-quirks-set-the-increment_usage_on_duplicate-quir.patch
hid-input-fix-a4tech-horizontal-wheel-custom-usage.patch
drm-rockchip-suspend-dp-late.patch
smb3-fix-potential-memory-leak-when-processing-compo.patch
smb3-kernel-oops-mounting-a-encryptdata-share-with-c.patch
sched-deadline-fix-double-accounting-of-rq-running-b.patch
sched-psi-reduce-psimon-fifo-priority.patch
sched-psi-do-not-require-setsched-permission-from-th.patch
s390-protvirt-avoid-memory-sharing-for-diag-308-set-.patch
s390-mm-fix-dump_pagetables-top-level-page-table-wal.patch
s390-put-_stext-and-_etext-into-.text-section.patch
ata-rb532_cf-fix-unused-variable-warning-in-rb532_pa.patch
net-cxgb3_main-fix-a-resource-leak-in-a-error-path-i.patch
net-stmmac-fix-issues-when-number-of-queues-4.patch
net-stmmac-tc-do-not-return-a-fragment-entry.patch
drm-amdgpu-pin-the-csb-buffer-on-hw-init-for-gfx-v8.patch
net-hisilicon-make-hip04_tx_reclaim-non-reentrant.patch
net-hisilicon-fix-hip04-xmit-never-return-tx_busy.patch
net-hisilicon-fix-dma_map_single-failed-on-arm64.patch
nfsv4-ensure-state-recovery-handles-etimedout-correc.patch
libata-have-ata_scsi_rw_xlat-fail-invalid-passthroug.patch
libata-add-sg-safety-checks-in-sff-pio-transfers.patch
x86-lib-cpu-address-missing-prototypes-warning.patch
drm-vmwgfx-fix-memory-leak-when-too-many-retries-hav.patch
block-aoe-fix-kernel-crash-due-to-atomic-sleep-when-.patch
block-bfq-handle-null-return-value-by-bfq_init_rq.patch
perf-ftrace-fix-failure-to-set-cpumask-when-only-one.patch
perf-cpumap-fix-writing-to-illegal-memory-in-handlin.patch
perf-pmu-events-fix-missing-cpu_clk_unhalted.core-ev.patch
dt-bindings-riscv-fix-the-schema-compatible-string-f.patch
kvm-arm64-don-t-write-junk-to-sysregs-on-reset.patch
kvm-arm-don-t-write-junk-to-cp15-registers-on-reset.patch
selftests-kvm-adding-config-fragments.patch
iwlwifi-mvm-disable-tx-amsdu-on-older-nics.patch
hid-wacom-correct-misreported-ekr-ring-values.patch
hid-wacom-correct-distance-scale-for-2nd-gen-intuos-devices.patch
revert-kvm-x86-mmu-zap-only-the-relevant-pages-when-removing-a-memslot.patch
revert-dm-bufio-fix-deadlock-with-loop-device.patch
clk-socfpga-stratix10-fix-rate-caclulationg-for-cnt_clks.patch
ceph-clear-page-dirty-before-invalidate-page.patch
ceph-don-t-try-fill-file_lock-on-unsuccessful-getfilelock-reply.patch
libceph-fix-pg-split-vs-osd-re-connect-race.patch
drm-amdgpu-gfx9-update-pg_flags-after-determining-if-gfx-off-is-possible.patch
drm-nouveau-don-t-retry-infinitely-when-receiving-no-data-on-i2c-over-aux.patch
scsi-ufs-fix-null-pointer-dereference-in-ufshcd_config_vreg_hpm.patch
gpiolib-never-report-open-drain-source-lines-as-input-to-user-space.patch
drivers-hv-vmbus-fix-virt_to_hvpfn-for-x86_pae.patch
userfaultfd_release-always-remove-uffd-flags-and-clear-vm_userfaultfd_ctx.patch
x86-retpoline-don-t-clobber-rflags-during-call_nospec-on-i386.patch
x86-apic-handle-missing-global-clockevent-gracefully.patch
x86-cpu-amd-clear-rdrand-cpuid-bit-on-amd-family-15h-16h.patch
x86-boot-save-fields-explicitly-zero-out-everything-else.patch
x86-boot-fix-boot-regression-caused-by-bootparam-sanitizing.patch
ib-hfi1-unsafe-psn-checking-for-tid-rdma-read-resp-packet.patch
ib-hfi1-add-additional-checks-when-handling-tid-rdma-read-resp-packet.patch
ib-hfi1-add-additional-checks-when-handling-tid-rdma-write-data-packet.patch
ib-hfi1-drop-stale-tid-rdma-packets-that-cause-tiderr.patch
psi-get-poll_work-to-run-when-calling-poll-syscall-next-time.patch
dm-kcopyd-always-complete-failed-jobs.patch
dm-dust-use-dust-block-size-for-badblocklist-index.patch
dm-btree-fix-order-of-block-initialization-in-btree_split_beneath.patch
dm-integrity-fix-a-crash-due-to-bug_on-in-__journal_read_write.patch
dm-raid-add-missing-cleanup-in-raid_ctr.patch
dm-space-map-metadata-fix-missing-store-of-apply_bops-return-value.patch
dm-table-fix-invalid-memory-accesses-with-too-high-sector-number.patch
dm-zoned-improve-error-handling-in-reclaim.patch
dm-zoned-improve-error-handling-in-i-o-map-code.patch
dm-zoned-properly-handle-backing-device-failure.patch
genirq-properly-pair-kobject_del-with-kobject_add.patch
mm-z3fold.c-fix-race-between-migration-and-destruction.patch
mm-page_alloc-move_freepages-should-not-examine-struct-page-of-reserved-memory.patch
mm-memcontrol-flush-percpu-vmstats-before-releasing-memcg.patch
mm-memcontrol-flush-percpu-vmevents-before-releasing-memcg.patch
mm-page_owner-handle-thp-splits-correctly.patch
Compile testing
---------------
We compiled the kernel for 3 architectures:
  aarch64:
    make options: -j30 INSTALL_MOD_STRIP=1 targz-pkg
  ppc64le:
    make options: -j30 INSTALL_MOD_STRIP=1 targz-pkg
  x86_64:
    make options: -j30 INSTALL_MOD_STRIP=1 targz-pkg
Hardware testing
----------------
We booted each kernel and ran the following tests:
  aarch64:
    Host 1:
      ✅ Boot test [0]
      ✅ xfstests: xfs [1]
      ✅ selinux-policy: serge-testsuite [2]
      ✅ lvm thinp sanity [3]
      ✅ storage: software RAID testing [4]
      🚧 ✅ Storage blktests [5]
    Host 2:
      ✅ Boot test [0]
      ✅ Podman system integration test (as root) [6]
      ✅ Podman system integration test (as user) [6]
      ✅ LTP lite [7]
      ✅ Loopdev Sanity [8]
      ✅ jvm test suite [9]
      ✅ AMTU (Abstract Machine Test Utility) [10]
      ✅ LTP: openposix test suite [11]
      ✅ Ethernet drivers sanity [12]
      ✅ Networking socket: fuzz [13]
      ✅ audit: audit testsuite test [14]
      ✅ httpd: mod_ssl smoke sanity [15]
      ✅ iotop: sanity [16]
      ✅ tuned: tune-processes-through-perf [17]
      ✅ Usex - version 1.9-29 [18]
      ✅ storage: SCSI VPD [19]
      ✅ stress: stress-ng [20]
  ppc64le:
    Host 1:
      ✅ Boot test [0]
      ✅ Podman system integration test (as root) [6]
      ✅ Podman system integration test (as user) [6]
      ✅ LTP lite [7]
      ✅ Loopdev Sanity [8]
      ✅ jvm test suite [9]
      ✅ AMTU (Abstract Machine Test Utility) [10]
      ✅ LTP: openposix test suite [11]
      ✅ Ethernet drivers sanity [12]
      ✅ Networking socket: fuzz [13]
      ✅ audit: audit testsuite test [14]
      ✅ httpd: mod_ssl smoke sanity [15]
      ✅ iotop: sanity [16]
      ✅ tuned: tune-processes-through-perf [17]
      ✅ Usex - version 1.9-29 [18]
    Host 2:
      ⚡ Internal infrastructure issues prevented one or more tests (marked
      with ⚡⚡⚡) from running on this architecture.
      This is not the fault of the kernel that was tested.
      ✅ Boot test [0]
      ❌ xfstests: xfs [1]
      ⚡⚡⚡ selinux-policy: serge-testsuite [2]
      ⚡⚡⚡ lvm thinp sanity [3]
      ⚡⚡⚡ storage: software RAID testing [4]
      🚧 ⚡⚡⚡ Storage blktests [5]
  x86_64:
    Host 1:
      ✅ Boot test [0]
      ✅ xfstests: xfs [1]
      ✅ selinux-policy: serge-testsuite [2]
      ✅ lvm thinp sanity [3]
      ✅ storage: software RAID testing [4]
      🚧 ✅ Storage blktests [5]
    Host 2:
      ✅ Boot test [0]
      ✅ Podman system integration test (as root) [6]
      ✅ Podman system integration test (as user) [6]
      ✅ LTP lite [7]
      ✅ Loopdev Sanity [8]
      ✅ jvm test suite [9]
      ✅ AMTU (Abstract Machine Test Utility) [10]
      ✅ LTP: openposix test suite [11]
      ✅ Ethernet drivers sanity [12]
      ✅ Networking socket: fuzz [13]
      ✅ audit: audit testsuite test [14]
      ✅ httpd: mod_ssl smoke sanity [15]
      ✅ iotop: sanity [16]
      ✅ tuned: tune-processes-through-perf [17]
      ✅ pciutils: sanity smoke test [21]
      ✅ Usex - version 1.9-29 [18]
      ✅ storage: SCSI VPD [19]
      ✅ stress: stress-ng [20]
Test source:
💚 Pull requests are welcome for new tests or improvements to existing tests!
[0]: https://github.com/CKI-project/tests-beaker/archive/master.zip#distribution…
[1]: https://github.com/CKI-project/tests-beaker/archive/master.zip#/filesystems…
[2]: https://github.com/CKI-project/tests-beaker/archive/master.zip#/packages/se…
[3]: https://github.com/CKI-project/tests-beaker/archive/master.zip#storage/lvm/…
[4]: https://github.com/CKI-project/tests-beaker/archive/master.zip#storage/swra…
[5]: https://github.com/CKI-project/tests-beaker/archive/master.zip#storage/blk
[6]: https://github.com/CKI-project/tests-beaker/archive/master.zip#/container/p…
[7]: https://github.com/CKI-project/tests-beaker/archive/master.zip#distribution…
[8]: https://github.com/CKI-project/tests-beaker/archive/master.zip#filesystems/…
[9]: https://github.com/CKI-project/tests-beaker/archive/master.zip#/jvm
[10]: https://github.com/CKI-project/tests-beaker/archive/master.zip#misc/amtu
[11]: https://github.com/CKI-project/tests-beaker/archive/master.zip#distribution…
[12]: https://github.com/CKI-project/tests-beaker/archive/master.zip#/networking/…
[13]: https://github.com/CKI-project/tests-beaker/archive/master.zip#/networking/…
[14]: https://github.com/CKI-project/tests-beaker/archive/master.zip#packages/aud…
[15]: https://github.com/CKI-project/tests-beaker/archive/master.zip#packages/htt…
[16]: https://github.com/CKI-project/tests-beaker/archive/master.zip#packages/iot…
[17]: https://github.com/CKI-project/tests-beaker/archive/master.zip#packages/tun…
[18]: https://github.com/CKI-project/tests-beaker/archive/master.zip#standards/us…
[19]: https://github.com/CKI-project/tests-beaker/archive/master.zip#storage/scsi…
[20]: https://github.com/CKI-project/tests-beaker/archive/master.zip#stress/stres…
[21]: https://github.com/CKI-project/tests-beaker/archive/master.zip#pciutils/san…
Waived tests
------------
If the test run included waived tests, they are marked with 🚧. Such tests are
executed but their results are not taken into account. Tests are waived when
their results are not reliable enough, e.g. when they're just introduced or are
being fixed.
Sasha,
you merged my last set of XFS fixes. I asked for one patch to not be
merged yet as one issue was not yet properly fixed. After some further
review I have identified commits which do fix the kernel crash reported
on kz#204223 [0] with generic/388. This patch set applies on top of the
last one I sent you.
These commits do quite a bit of code refactoring, and the actual fix
lies hidden in the last commit by Darrick. Due to the amount of changes,
trying to extract the fix is riskier than just carrying the code
refactoring. If we're OK with the code refactor for stable, it's my
recommendation we keep the changes to match upstream more closely and
benefit from other fixes. The code refactoring was merged in v4.20 and
Darrick's fix is the only fix upstream since the code was merged.
If others disagree with this approach please speak up.
I've run a full set of fstests against the following sections 12 times and
have found no regressions against the baseline:
xfs
xfs_logdev
xfs_nocrc_512
xfs_nocrc
xfs_realtimedev
xfs_reflink_1024
xfs_reflink_dev
Review from others is appreciated.
[0] https://bugzilla.kernel.org/show_bug.cgi?id=204223
Allison Henderson (4):
  xfs: Move fs/xfs/xfs_attr.h to fs/xfs/libxfs/xfs_attr.h
  xfs: Add helper function xfs_attr_try_sf_addname
  xfs: Add attibute set and helper functions
  xfs: Add attibute remove and helper functions

Brian Foster (1):
  xfs: don't trip over uninitialized buffer on extent read of corrupted
    inode

Darrick J. Wong (1):
  xfs: always rejoin held resources during defer roll

 fs/xfs/libxfs/xfs_attr.c       | 231 ++++++++++++++++++---------------
 fs/xfs/{ => libxfs}/xfs_attr.h |   2 +
 fs/xfs/libxfs/xfs_bmap.c       |  54 +++++---
 fs/xfs/libxfs/xfs_bmap.h       |   1 +
 fs/xfs/libxfs/xfs_defer.c      |  14 +-
 fs/xfs/xfs_dquot.c             |  17 +--
 6 files changed, 183 insertions(+), 136 deletions(-)
 rename fs/xfs/{ => libxfs}/xfs_attr.h (98%)
--
2.18.0
The following commit has been merged into the x86/urgent branch of tip:
Commit-ID: 558682b5291937a70748d36fd9ba757fb25b99ae
Gitweb: https://git.kernel.org/tip/558682b5291937a70748d36fd9ba757fb25b99ae
Author: Bandan Das <bsd(a)redhat.com>
AuthorDate: Mon, 26 Aug 2019 06:15:13 -04:00
Committer: Thomas Gleixner <tglx(a)linutronix.de>
CommitterDate: Mon, 26 Aug 2019 20:00:57 +02:00
x86/apic: Include the LDR when clearing out APIC registers
Although APIC initialization will typically clear out the LDR before
setting it, the APIC cleanup code should reset the LDR.
This was discovered with a 32-bit KVM guest jumping into a kdump
kernel. The stale bits in the LDR triggered a bug in the KVM APIC
implementation which caused the destination mapping for VCPUs to be
corrupted.
Note that this isn't intended to paper over the KVM APIC bug. The kernel
has to clear the LDR when resetting the APIC registers except when X2APIC
is enabled.
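
The masking step described above can be modelled in user space. This is only
a sketch under stated assumptions: APIC_LDR_MASK is copied from the kernel's
definition (logical APIC ID in bits 31:24 of the LDR), while
ldr_after_clear() is a hypothetical helper invented for illustration, not a
kernel function, and the register value is passed in instead of being read
with apic_read().

```c
#include <stdint.h>

/* Same value as the kernel's APIC_LDR_MASK: in xAPIC mode the logical
 * APIC ID occupies bits 31:24 of the Logical Destination Register. */
#define APIC_LDR_MASK (0xFFu << 24)

/*
 * Hypothetical helper (illustration only): returns the LDR value that
 * would be left behind after the cleanup. With x2APIC enabled the
 * register is not touched, mirroring the !x2apic_enabled() check in
 * the patch.
 */
static inline uint32_t ldr_after_clear(uint32_t ldr, int x2apic_enabled)
{
	if (x2apic_enabled)
		return ldr;

	/* Drop only the logical ID bits, keep the rest of the register. */
	return ldr & ~APIC_LDR_MASK;
}
```

For example, a stale LDR of 0x03000000 comes back as 0 when x2APIC is off,
and is returned unchanged when x2APIC is on.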
This lacks a Fixes tag because the failure to clear the LDR goes way back
into pre-git history.
[ tglx: Made x2apic_enabled a function call as required ]
Signed-off-by: Bandan Das <bsd(a)redhat.com>
Signed-off-by: Thomas Gleixner <tglx(a)linutronix.de>
Cc: stable(a)vger.kernel.org
Link: https://lkml.kernel.org/r/20190826101513.5080-3-bsd@redhat.com
---
arch/x86/kernel/apic/apic.c | 4 ++++
1 file changed, 4 insertions(+)
diff --git a/arch/x86/kernel/apic/apic.c b/arch/x86/kernel/apic/apic.c
index aa5495d..dba2828 100644
--- a/arch/x86/kernel/apic/apic.c
+++ b/arch/x86/kernel/apic/apic.c
@@ -1179,6 +1179,10 @@ void clear_local_APIC(void)
 	apic_write(APIC_LVT0, v | APIC_LVT_MASKED);
 	v = apic_read(APIC_LVT1);
 	apic_write(APIC_LVT1, v | APIC_LVT_MASKED);
+	if (!x2apic_enabled()) {
+		v = apic_read(APIC_LDR) & ~APIC_LDR_MASK;
+		apic_write(APIC_LDR, v);
+	}
 	if (maxlvt >= 4) {
 		v = apic_read(APIC_LVTPC);
 		apic_write(APIC_LVTPC, v | APIC_LVT_MASKED);
The following commit has been merged into the x86/urgent branch of tip:
Commit-ID: bae3a8d3308ee69a7dbdf145911b18dfda8ade0d
Gitweb: https://git.kernel.org/tip/bae3a8d3308ee69a7dbdf145911b18dfda8ade0d
Author: Bandan Das <bsd(a)redhat.com>
AuthorDate: Mon, 26 Aug 2019 06:15:12 -04:00
Committer: Thomas Gleixner <tglx(a)linutronix.de>
CommitterDate: Mon, 26 Aug 2019 20:00:56 +02:00
x86/apic: Do not initialize LDR and DFR for bigsmp
Legacy apic init uses bigsmp for smp systems with 8 or more CPUs. The
bigsmp APIC implementation uses physical destination mode, but it
nevertheless initializes LDR and DFR. The LDR even ends up incorrectly with
multiple bits set.
This does not cause a functional problem because LDR and DFR are ignored
when physical destination mode is active, but it triggered a problem on a
32-bit KVM guest which jumps into a kdump kernel.
The multiple bits set unearthed a bug in the KVM APIC implementation. The
code which creates the logical destination map for VCPUs ignores the
disabled state of the APIC and ends up overwriting an existing valid entry
and as a result, APIC calibration hangs in the guest during kdump
initialization.
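
How the LDR ends up with multiple bits set can be seen in a small user-space
model of the removed calculate_ldr(). This is a sketch, not kernel code: the
macros are modelled on the kernel's apicdef.h definitions (with an added
cast), the register and APIC ID are passed in as plain integers, and the
names model_calculate_ldr() and logical_id_bits() are made up for
illustration. It assumes the flat logical model, where each CPU's logical ID
is expected to carry exactly one bit.

```c
#include <stdint.h>

/* Modelled on arch/x86/include/asm/apicdef.h. */
#define APIC_LDR_MASK		(0xFFu << 24)
#define SET_APIC_LOGICAL_ID(x)	((uint32_t)(x) << 24)
#define GET_APIC_LOGICAL_ID(x)	(((x) >> 24) & 0xFFu)

/*
 * Model of the removed calculate_ldr(): the BIOS APIC ID is written
 * into the logical ID field as a plain number, not as a one-hot mask.
 */
static inline uint32_t model_calculate_ldr(uint32_t old_ldr, uint32_t bios_apicid)
{
	uint32_t val = old_ldr & ~APIC_LDR_MASK;

	val |= SET_APIC_LOGICAL_ID(bios_apicid);
	return val;
}

/* Count how many bits are set in the logical ID field of an LDR value. */
static inline int logical_id_bits(uint32_t ldr)
{
	uint32_t id = GET_APIC_LOGICAL_ID(ldr);
	int n = 0;

	while (id) {
		n += id & 1u;
		id >>= 1;
	}
	return n;
}
```

With an APIC ID of 1 the logical ID has a single bit set, but an APIC ID of 3
yields 0x03000000, which has two bits set in the logical ID field.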
Remove the bogus LDR/DFR initialization.
This is not intended to work around the KVM APIC bug. The LDR/DFR
initialization is wrong on its own.
The issue goes back into the pre git history. The fixes tag is the commit
in the bitkeeper import which introduced bigsmp support in 2003.
git://git.kernel.org/pub/scm/linux/kernel/git/tglx/history.git
Fixes: db7b9e9f26b8 ("[PATCH] Clustered APIC setup for >8 CPU systems")
Suggested-by: Thomas Gleixner <tglx(a)linutronix.de>
Signed-off-by: Bandan Das <bsd(a)redhat.com>
Signed-off-by: Thomas Gleixner <tglx(a)linutronix.de>
Cc: stable(a)vger.kernel.org
Link: https://lkml.kernel.org/r/20190826101513.5080-2-bsd@redhat.com
---
arch/x86/kernel/apic/bigsmp_32.c | 24 ++----------------------
1 file changed, 2 insertions(+), 22 deletions(-)
diff --git a/arch/x86/kernel/apic/bigsmp_32.c b/arch/x86/kernel/apic/bigsmp_32.c
index afee386..caedd8d 100644
--- a/arch/x86/kernel/apic/bigsmp_32.c
+++ b/arch/x86/kernel/apic/bigsmp_32.c
@@ -38,32 +38,12 @@ static int bigsmp_early_logical_apicid(int cpu)
 	return early_per_cpu(x86_cpu_to_apicid, cpu);
 }

-static inline unsigned long calculate_ldr(int cpu)
-{
-	unsigned long val, id;
-
-	val = apic_read(APIC_LDR) & ~APIC_LDR_MASK;
-	id = per_cpu(x86_bios_cpu_apicid, cpu);
-	val |= SET_APIC_LOGICAL_ID(id);
-
-	return val;
-}
-
 /*
- * Set up the logical destination ID.
- *
- * Intel recommends to set DFR, LDR and TPR before enabling
- * an APIC. See e.g. "AP-388 82489DX User's Manual" (Intel
- * document number 292116). So here it goes...
+ * bigsmp enables physical destination mode
+ * and doesn't use LDR and DFR
  */
 static void bigsmp_init_apic_ldr(void)
 {
-	unsigned long val;
-	int cpu = smp_processor_id();
-
-	apic_write(APIC_DFR, APIC_DFR_FLAT);
-	val = calculate_ldr(cpu);
-	apic_write(APIC_LDR, val);
 }

 static void bigsmp_setup_apic_routing(void)