This is a note to let you know that I've just added the patch titled
xfrm: Fix a race in the xdst pcpu cache.
to the 4.14-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
xfrm-fix-a-race-in-the-xdst-pcpu-cache.patch
and it can be found in the queue-4.14 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From 76a4201191814a0061cb5c861fafb9ecaa764846 Mon Sep 17 00:00:00 2001
From: Steffen Klassert <steffen.klassert(a)secunet.com>
Date: Wed, 10 Jan 2018 12:14:28 +0100
Subject: xfrm: Fix a race in the xdst pcpu cache.
From: Steffen Klassert <steffen.klassert(a)secunet.com>
commit 76a4201191814a0061cb5c861fafb9ecaa764846 upstream.
We need to run xfrm_resolve_and_create_bundle() with
bottom halves off. Otherwise we may reuse an already
released dst_enty when the xfrm lookup functions are
called from process context.
Fixes: c30d78c14a813db39a647b6a348b428 ("xfrm: add xdst pcpu cache")
Reported-by: Darius Ski <darius.ski(a)gmail.com>
Signed-off-by: Steffen Klassert <steffen.klassert(a)secunet.com>
Acked-by: David Miller <davem(a)davemloft.net>,
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
net/xfrm/xfrm_policy.c | 8 +++++++-
1 file changed, 7 insertions(+), 1 deletion(-)
--- a/net/xfrm/xfrm_policy.c
+++ b/net/xfrm/xfrm_policy.c
@@ -2056,8 +2056,11 @@ xfrm_bundle_lookup(struct net *net, cons
if (num_xfrms <= 0)
goto make_dummy_bundle;
+ local_bh_disable();
xdst = xfrm_resolve_and_create_bundle(pols, num_pols, fl, family,
- xflo->dst_orig);
+ xflo->dst_orig);
+ local_bh_enable();
+
if (IS_ERR(xdst)) {
err = PTR_ERR(xdst);
if (err != -EAGAIN)
@@ -2144,9 +2147,12 @@ struct dst_entry *xfrm_lookup(struct net
goto no_transform;
}
+ local_bh_disable();
xdst = xfrm_resolve_and_create_bundle(
pols, num_pols, fl,
family, dst_orig);
+ local_bh_enable();
+
if (IS_ERR(xdst)) {
xfrm_pols_put(pols, num_pols);
err = PTR_ERR(xdst);
Patches currently in stable-queue which might be from steffen.klassert(a)secunet.com are
queue-4.14/xfrm-fix-a-race-in-the-xdst-pcpu-cache.patch
This is a note to let you know that I've just added the patch titled
scsi: libiscsi: fix shifting of DID_REQUEUE host byte
to the 4.9-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
scsi-libiscsi-fix-shifting-of-did_requeue-host-byte.patch
and it can be found in the queue-4.9 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From eef9ffdf9cd39b2986367bc8395e2772bc1284ba Mon Sep 17 00:00:00 2001
From: Johannes Thumshirn <jthumshirn(a)suse.de>
Date: Mon, 9 Oct 2017 13:33:19 +0200
Subject: scsi: libiscsi: fix shifting of DID_REQUEUE host byte
From: Johannes Thumshirn <jthumshirn(a)suse.de>
commit eef9ffdf9cd39b2986367bc8395e2772bc1284ba upstream.
The SCSI host byte should be shifted left by 16 in order to have
scsi_decide_disposition() do the right thing (.i.e. requeue the
command).
Signed-off-by: Johannes Thumshirn <jthumshirn(a)suse.de>
Fixes: 661134ad3765 ("[SCSI] libiscsi, bnx2i: make bound ep check common")
Cc: Lee Duncan <lduncan(a)suse.com>
Cc: Hannes Reinecke <hare(a)suse.de>
Cc: Bart Van Assche <Bart.VanAssche(a)sandisk.com>
Cc: Chris Leech <cleech(a)redhat.com>
Acked-by: Lee Duncan <lduncan(a)suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen(a)oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/scsi/libiscsi.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/scsi/libiscsi.c
+++ b/drivers/scsi/libiscsi.c
@@ -1727,7 +1727,7 @@ int iscsi_queuecommand(struct Scsi_Host
if (test_bit(ISCSI_SUSPEND_BIT, &conn->suspend_tx)) {
reason = FAILURE_SESSION_IN_RECOVERY;
- sc->result = DID_REQUEUE;
+ sc->result = DID_REQUEUE << 16;
goto fault;
}
Patches currently in stable-queue which might be from jthumshirn(a)suse.de are
queue-4.9/scsi-libiscsi-fix-shifting-of-did_requeue-host-byte.patch
This is a note to let you know that I've just added the patch titled
reiserfs: don't preallocate blocks for extended attributes
to the 4.9-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
reiserfs-don-t-preallocate-blocks-for-extended-attributes.patch
and it can be found in the queue-4.9 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From 54930dfeb46e978b447af0fb8ab4e181c1bf9d7a Mon Sep 17 00:00:00 2001
From: Jeff Mahoney <jeffm(a)suse.com>
Date: Thu, 22 Jun 2017 16:35:04 -0400
Subject: reiserfs: don't preallocate blocks for extended attributes
From: Jeff Mahoney <jeffm(a)suse.com>
commit 54930dfeb46e978b447af0fb8ab4e181c1bf9d7a upstream.
Most extended attributes will fit in a single block. More importantly,
we drop the reference to the inode while holding the transaction open
so the preallocated blocks aren't released. As a result, the inode
may be evicted before it's removed from the transaction's prealloc list
which can cause memory corruption.
Signed-off-by: Jeff Mahoney <jeffm(a)suse.com>
Signed-off-by: Jan Kara <jack(a)suse.cz>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
fs/reiserfs/bitmap.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
--- a/fs/reiserfs/bitmap.c
+++ b/fs/reiserfs/bitmap.c
@@ -1136,7 +1136,7 @@ static int determine_prealloc_size(reise
hint->prealloc_size = 0;
if (!hint->formatted_node && hint->preallocate) {
- if (S_ISREG(hint->inode->i_mode)
+ if (S_ISREG(hint->inode->i_mode) && !IS_PRIVATE(hint->inode)
&& hint->inode->i_size >=
REISERFS_SB(hint->th->t_super)->s_alloc_options.
preallocmin * hint->inode->i_sb->s_blocksize)
Patches currently in stable-queue which might be from jeffm(a)suse.com are
queue-4.9/reiserfs-don-t-preallocate-blocks-for-extended-attributes.patch
queue-4.9/reiserfs-fix-race-in-prealloc-discard.patch
This is a note to let you know that I've just added the patch titled
reiserfs: fix race in prealloc discard
to the 4.9-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
reiserfs-fix-race-in-prealloc-discard.patch
and it can be found in the queue-4.9 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From 08db141b5313ac2f64b844fb5725b8d81744b417 Mon Sep 17 00:00:00 2001
From: Jeff Mahoney <jeffm(a)suse.com>
Date: Thu, 22 Jun 2017 16:47:34 -0400
Subject: reiserfs: fix race in prealloc discard
From: Jeff Mahoney <jeffm(a)suse.com>
commit 08db141b5313ac2f64b844fb5725b8d81744b417 upstream.
The main loop in __discard_prealloc is protected by the reiserfs write lock
which is dropped across schedules like the BKL it replaced. The problem is
that it checks the value, calls a routine that schedules, and then adjusts
the state. As a result, two threads that are calling
reiserfs_prealloc_discard at the same time can race when one calls
reiserfs_free_prealloc_block, the lock is dropped, and the other calls
reiserfs_free_prealloc_block with the same block number. In the right
circumstances, it can cause the prealloc count to go negative.
Signed-off-by: Jeff Mahoney <jeffm(a)suse.com>
Signed-off-by: Jan Kara <jack(a)suse.cz>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
fs/reiserfs/bitmap.c | 12 ++++++++++--
1 file changed, 10 insertions(+), 2 deletions(-)
--- a/fs/reiserfs/bitmap.c
+++ b/fs/reiserfs/bitmap.c
@@ -513,9 +513,17 @@ static void __discard_prealloc(struct re
"inode has negative prealloc blocks count.");
#endif
while (ei->i_prealloc_count > 0) {
- reiserfs_free_prealloc_block(th, inode, ei->i_prealloc_block);
- ei->i_prealloc_block++;
+ b_blocknr_t block_to_free;
+
+ /*
+ * reiserfs_free_prealloc_block can drop the write lock,
+ * which could allow another caller to free the same block.
+ * We can protect against it by modifying the prealloc
+ * state before calling it.
+ */
+ block_to_free = ei->i_prealloc_block++;
ei->i_prealloc_count--;
+ reiserfs_free_prealloc_block(th, inode, block_to_free);
dirty = 1;
}
if (dirty)
Patches currently in stable-queue which might be from jeffm(a)suse.com are
queue-4.9/reiserfs-don-t-preallocate-blocks-for-extended-attributes.patch
queue-4.9/reiserfs-fix-race-in-prealloc-discard.patch
This is a note to let you know that I've just added the patch titled
netfilter: nfnetlink_cthelper: Add missing permission checks
to the 4.9-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
netfilter-nfnetlink_cthelper-add-missing-permission-checks.patch
and it can be found in the queue-4.9 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From 4b380c42f7d00a395feede754f0bc2292eebe6e5 Mon Sep 17 00:00:00 2001
From: Kevin Cernekee <cernekee(a)chromium.org>
Date: Sun, 3 Dec 2017 12:12:45 -0800
Subject: netfilter: nfnetlink_cthelper: Add missing permission checks
From: Kevin Cernekee <cernekee(a)chromium.org>
commit 4b380c42f7d00a395feede754f0bc2292eebe6e5 upstream.
The capability check in nfnetlink_rcv() verifies that the caller
has CAP_NET_ADMIN in the namespace that "owns" the netlink socket.
However, nfnl_cthelper_list is shared by all net namespaces on the
system. An unprivileged user can create user and net namespaces
in which he holds CAP_NET_ADMIN to bypass the netlink_net_capable()
check:
$ nfct helper list
nfct v1.4.4: netlink error: Operation not permitted
$ vpnns -- nfct helper list
{
.name = ftp,
.queuenum = 0,
.l3protonum = 2,
.l4protonum = 6,
.priv_data_len = 24,
.status = enabled,
};
Add capable() checks in nfnetlink_cthelper, as this is cleaner than
trying to generalize the solution.
Signed-off-by: Kevin Cernekee <cernekee(a)chromium.org>
Signed-off-by: Pablo Neira Ayuso <pablo(a)netfilter.org>
Acked-by: Michal Kubecek <mkubecek(a)suse.cz>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
net/netfilter/nfnetlink_cthelper.c | 10 ++++++++++
1 file changed, 10 insertions(+)
--- a/net/netfilter/nfnetlink_cthelper.c
+++ b/net/netfilter/nfnetlink_cthelper.c
@@ -17,6 +17,7 @@
#include <linux/types.h>
#include <linux/list.h>
#include <linux/errno.h>
+#include <linux/capability.h>
#include <net/netlink.h>
#include <net/sock.h>
@@ -392,6 +393,9 @@ static int nfnl_cthelper_new(struct net
struct nfnl_cthelper *nlcth;
int ret = 0;
+ if (!capable(CAP_NET_ADMIN))
+ return -EPERM;
+
if (!tb[NFCTH_NAME] || !tb[NFCTH_TUPLE])
return -EINVAL;
@@ -595,6 +599,9 @@ static int nfnl_cthelper_get(struct net
struct nfnl_cthelper *nlcth;
bool tuple_set = false;
+ if (!capable(CAP_NET_ADMIN))
+ return -EPERM;
+
if (nlh->nlmsg_flags & NLM_F_DUMP) {
struct netlink_dump_control c = {
.dump = nfnl_cthelper_dump_table,
@@ -661,6 +668,9 @@ static int nfnl_cthelper_del(struct net
struct nfnl_cthelper *nlcth, *n;
int j = 0, ret;
+ if (!capable(CAP_NET_ADMIN))
+ return -EPERM;
+
if (tb[NFCTH_NAME])
helper_name = nla_data(tb[NFCTH_NAME]);
Patches currently in stable-queue which might be from cernekee(a)chromium.org are
queue-4.9/netfilter-xt_osf-add-missing-permission-checks.patch
queue-4.9/netfilter-nfnetlink_cthelper-add-missing-permission-checks.patch
This is a note to let you know that I've just added the patch titled
netfilter: xt_osf: Add missing permission checks
to the 4.9-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
netfilter-xt_osf-add-missing-permission-checks.patch
and it can be found in the queue-4.9 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From 916a27901de01446bcf57ecca4783f6cff493309 Mon Sep 17 00:00:00 2001
From: Kevin Cernekee <cernekee(a)chromium.org>
Date: Tue, 5 Dec 2017 15:42:41 -0800
Subject: netfilter: xt_osf: Add missing permission checks
From: Kevin Cernekee <cernekee(a)chromium.org>
commit 916a27901de01446bcf57ecca4783f6cff493309 upstream.
The capability check in nfnetlink_rcv() verifies that the caller
has CAP_NET_ADMIN in the namespace that "owns" the netlink socket.
However, xt_osf_fingers is shared by all net namespaces on the
system. An unprivileged user can create user and net namespaces
in which he holds CAP_NET_ADMIN to bypass the netlink_net_capable()
check:
vpnns -- nfnl_osf -f /tmp/pf.os
vpnns -- nfnl_osf -f /tmp/pf.os -d
These non-root operations successfully modify the systemwide OS
fingerprint list. Add new capable() checks so that they can't.
Signed-off-by: Kevin Cernekee <cernekee(a)chromium.org>
Signed-off-by: Pablo Neira Ayuso <pablo(a)netfilter.org>
Acked-by: Michal Kubecek <mkubecek(a)suse.cz>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
net/netfilter/xt_osf.c | 7 +++++++
1 file changed, 7 insertions(+)
--- a/net/netfilter/xt_osf.c
+++ b/net/netfilter/xt_osf.c
@@ -19,6 +19,7 @@
#include <linux/module.h>
#include <linux/kernel.h>
+#include <linux/capability.h>
#include <linux/if.h>
#include <linux/inetdevice.h>
#include <linux/ip.h>
@@ -69,6 +70,9 @@ static int xt_osf_add_callback(struct ne
struct xt_osf_finger *kf = NULL, *sf;
int err = 0;
+ if (!capable(CAP_NET_ADMIN))
+ return -EPERM;
+
if (!osf_attrs[OSF_ATTR_FINGER])
return -EINVAL;
@@ -113,6 +117,9 @@ static int xt_osf_remove_callback(struct
struct xt_osf_finger *sf;
int err = -ENOENT;
+ if (!capable(CAP_NET_ADMIN))
+ return -EPERM;
+
if (!osf_attrs[OSF_ATTR_FINGER])
return -EINVAL;
Patches currently in stable-queue which might be from cernekee(a)chromium.org are
queue-4.9/netfilter-xt_osf-add-missing-permission-checks.patch
queue-4.9/netfilter-nfnetlink_cthelper-add-missing-permission-checks.patch
This is a note to let you know that I've just added the patch titled
mm, page_alloc: fix potential false positive in __zone_watermark_ok
to the 4.9-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
mm-page_alloc-fix-potential-false-positive-in-__zone_watermark_ok.patch
and it can be found in the queue-4.9 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From b050e3769c6b4013bb937e879fc43bf1847ee819 Mon Sep 17 00:00:00 2001
From: Vlastimil Babka <vbabka(a)suse.cz>
Date: Wed, 15 Nov 2017 17:38:30 -0800
Subject: mm, page_alloc: fix potential false positive in __zone_watermark_ok
From: Vlastimil Babka <vbabka(a)suse.cz>
commit b050e3769c6b4013bb937e879fc43bf1847ee819 upstream.
Since commit 97a16fc82a7c ("mm, page_alloc: only enforce watermarks for
order-0 allocations"), __zone_watermark_ok() check for high-order
allocations will shortcut per-migratetype free list checks for
ALLOC_HARDER allocations, and return true as long as there's free page
of any migratetype. The intention is that ALLOC_HARDER can allocate
from MIGRATE_HIGHATOMIC free lists, while normal allocations can't.
However, as a side effect, the watermark check will then also return
true when there are pages only on the MIGRATE_ISOLATE list, or (prior to
CMA conversion to ZONE_MOVABLE) on the MIGRATE_CMA list. Since the
allocation cannot actually obtain isolated pages, and might not be able
to obtain CMA pages, this can result in a false positive.
The condition should be rare and perhaps the outcome is not a fatal one.
Still, it's better if the watermark check is correct. There also
shouldn't be a performance tradeoff here.
Link: http://lkml.kernel.org/r/20171102125001.23708-1-vbabka@suse.cz
Fixes: 97a16fc82a7c ("mm, page_alloc: only enforce watermarks for order-0 allocations")
Signed-off-by: Vlastimil Babka <vbabka(a)suse.cz>
Acked-by: Mel Gorman <mgorman(a)techsingularity.net>
Cc: Joonsoo Kim <iamjoonsoo.kim(a)lge.com>
Cc: Rik van Riel <riel(a)redhat.com>
Cc: David Rientjes <rientjes(a)google.com>
Cc: Johannes Weiner <hannes(a)cmpxchg.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds(a)linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
mm/page_alloc.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -2821,9 +2821,6 @@ bool __zone_watermark_ok(struct zone *z,
if (!area->nr_free)
continue;
- if (alloc_harder)
- return true;
-
for (mt = 0; mt < MIGRATE_PCPTYPES; mt++) {
if (!list_empty(&area->free_list[mt]))
return true;
@@ -2835,6 +2832,9 @@ bool __zone_watermark_ok(struct zone *z,
return true;
}
#endif
+ if (alloc_harder &&
+ !list_empty(&area->free_list[MIGRATE_HIGHATOMIC]))
+ return true;
}
return false;
}
Patches currently in stable-queue which might be from vbabka(a)suse.cz are
queue-4.9/mm-mmap.c-do-not-blow-on-prot_none-map_fixed-holes-in-the-stack.patch
queue-4.9/cma-fix-calculation-of-aligned-offset.patch
queue-4.9/mm-page_alloc-fix-potential-false-positive-in-__zone_watermark_ok.patch
This is a note to let you know that I've just added the patch titled
mm/mmap.c: do not blow on PROT_NONE MAP_FIXED holes in the stack
to the 4.9-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
mm-mmap.c-do-not-blow-on-prot_none-map_fixed-holes-in-the-stack.patch
and it can be found in the queue-4.9 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From 561b5e0709e4a248c67d024d4d94b6e31e3edf2f Mon Sep 17 00:00:00 2001
From: Michal Hocko <mhocko(a)suse.com>
Date: Mon, 10 Jul 2017 15:49:51 -0700
Subject: mm/mmap.c: do not blow on PROT_NONE MAP_FIXED holes in the stack
From: Michal Hocko <mhocko(a)suse.com>
commit 561b5e0709e4a248c67d024d4d94b6e31e3edf2f upstream.
Commit 1be7107fbe18 ("mm: larger stack guard gap, between vmas") has
introduced a regression in some rust and Java environments which are
trying to implement their own stack guard page. They are punching a new
MAP_FIXED mapping inside the existing stack Vma.
This will confuse expand_{downwards,upwards} into thinking that the
stack expansion would in fact get us too close to an existing non-stack
vma which is a correct behavior wrt safety. It is a real regression on
the other hand.
Let's work around the problem by considering PROT_NONE mapping as a part
of the stack. This is a gros hack but overflowing to such a mapping
would trap anyway an we only can hope that usespace knows what it is
doing and handle it propely.
Fixes: 1be7107fbe18 ("mm: larger stack guard gap, between vmas")
Link: http://lkml.kernel.org/r/20170705182849.GA18027@dhcp22.suse.cz
Signed-off-by: Michal Hocko <mhocko(a)suse.com>
Debugged-by: Vlastimil Babka <vbabka(a)suse.cz>
Cc: Ben Hutchings <ben(a)decadent.org.uk>
Cc: Willy Tarreau <w(a)1wt.eu>
Cc: Oleg Nesterov <oleg(a)redhat.com>
Cc: Rik van Riel <riel(a)redhat.com>
Cc: Hugh Dickins <hughd(a)google.com>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds(a)linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
mm/mmap.c | 6 ++++--
1 file changed, 4 insertions(+), 2 deletions(-)
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -2240,7 +2240,8 @@ int expand_upwards(struct vm_area_struct
gap_addr = TASK_SIZE;
next = vma->vm_next;
- if (next && next->vm_start < gap_addr) {
+ if (next && next->vm_start < gap_addr &&
+ (next->vm_flags & (VM_WRITE|VM_READ|VM_EXEC))) {
if (!(next->vm_flags & VM_GROWSUP))
return -ENOMEM;
/* Check that both stack segments have the same anon_vma? */
@@ -2324,7 +2325,8 @@ int expand_downwards(struct vm_area_stru
if (gap_addr > address)
return -ENOMEM;
prev = vma->vm_prev;
- if (prev && prev->vm_end > gap_addr) {
+ if (prev && prev->vm_end > gap_addr &&
+ (prev->vm_flags & (VM_WRITE|VM_READ|VM_EXEC))) {
if (!(prev->vm_flags & VM_GROWSDOWN))
return -ENOMEM;
/* Check that both stack segments have the same anon_vma? */
Patches currently in stable-queue which might be from mhocko(a)suse.com are
queue-4.9/hwpoison-memcg-forcibly-uncharge-lru-pages.patch
queue-4.9/mm-mmap.c-do-not-blow-on-prot_none-map_fixed-holes-in-the-stack.patch
This is a note to let you know that I've just added the patch titled
hwpoison, memcg: forcibly uncharge LRU pages
to the 4.9-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
hwpoison-memcg-forcibly-uncharge-lru-pages.patch
and it can be found in the queue-4.9 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From 18365225f0440d09708ad9daade2ec11275c3df9 Mon Sep 17 00:00:00 2001
From: Michal Hocko <mhocko(a)suse.com>
Date: Fri, 12 May 2017 15:46:26 -0700
Subject: hwpoison, memcg: forcibly uncharge LRU pages
From: Michal Hocko <mhocko(a)suse.com>
commit 18365225f0440d09708ad9daade2ec11275c3df9 upstream.
Laurent Dufour has noticed that hwpoinsoned pages are kept charged. In
his particular case he has hit a bad_page("page still charged to
cgroup") when onlining a hwpoison page. While this looks like something
that shouldn't happen in the first place because onlining hwpages and
returning them to the page allocator makes only little sense it shows a
real problem.
hwpoison pages do not get freed usually so we do not uncharge them (at
least not since commit 0a31bc97c80c ("mm: memcontrol: rewrite uncharge
API")). Each charge pins memcg (since e8ea14cc6ead ("mm: memcontrol:
take a css reference for each charged page")) as well and so the
mem_cgroup and the associated state will never go away. Fix this leak
by forcibly uncharging a LRU hwpoisoned page in delete_from_lru_cache().
We also have to tweak uncharge_list because it cannot rely on zero ref
count for these pages.
[akpm(a)linux-foundation.org: coding-style fixes]
Fixes: 0a31bc97c80c ("mm: memcontrol: rewrite uncharge API")
Link: http://lkml.kernel.org/r/20170502185507.GB19165@dhcp22.suse.cz
Signed-off-by: Michal Hocko <mhocko(a)suse.com>
Reported-by: Laurent Dufour <ldufour(a)linux.vnet.ibm.com>
Tested-by: Laurent Dufour <ldufour(a)linux.vnet.ibm.com>
Reviewed-by: Balbir Singh <bsingharora(a)gmail.com>
Reviewed-by: Naoya Horiguchi <n-horiguchi(a)ah.jp.nec.com>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds(a)linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
mm/memcontrol.c | 2 +-
mm/memory-failure.c | 7 +++++++
2 files changed, 8 insertions(+), 1 deletion(-)
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -5531,7 +5531,7 @@ static void uncharge_list(struct list_he
next = page->lru.next;
VM_BUG_ON_PAGE(PageLRU(page), page);
- VM_BUG_ON_PAGE(page_count(page), page);
+ VM_BUG_ON_PAGE(!PageHWPoison(page) && page_count(page), page);
if (!page->mem_cgroup)
continue;
--- a/mm/memory-failure.c
+++ b/mm/memory-failure.c
@@ -535,6 +535,13 @@ static int delete_from_lru_cache(struct
*/
ClearPageActive(p);
ClearPageUnevictable(p);
+
+ /*
+ * Poisoned page might never drop its ref count to 0 so we have
+ * to uncharge it manually from its memcg.
+ */
+ mem_cgroup_uncharge(p);
+
/*
* drop the page count elevated by isolate_lru_page()
*/
Patches currently in stable-queue which might be from mhocko(a)suse.com are
queue-4.9/hwpoison-memcg-forcibly-uncharge-lru-pages.patch
queue-4.9/mm-mmap.c-do-not-blow-on-prot_none-map_fixed-holes-in-the-stack.patch