This is a note to let you know that I've just added the patch titled
af_netlink: ensure that NLMSG_DONE never fails in dumps
to the 4.4-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
af_netlink-ensure-that-nlmsg_done-never-fails-in-dumps.patch
and it can be found in the queue-4.4 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Tue Nov 21 15:28:16 CET 2017
From: "Jason A. Donenfeld" <Jason(a)zx2c4.com>
Date: Thu, 9 Nov 2017 13:04:44 +0900
Subject: af_netlink: ensure that NLMSG_DONE never fails in dumps
From: "Jason A. Donenfeld" <Jason(a)zx2c4.com>
[ Upstream commit 0642840b8bb008528dbdf929cec9f65ac4231ad0 ]
The way people generally use netlink_dump is that they fill in the skb
as much as possible, breaking when nla_put returns an error. Then, they
get called again and start filling out the next skb, and again, and so
forth. The mechanism at work here is the ability for the iterative
dumping function to detect when the skb is filled up and not fill it
past the brim, waiting for a fresh skb for the rest of the data.
However, if the attributes are small and nicely packed, it is possible
that a dump callback function successfully fills in attributes until the
skb is of size 4080 (libmnl's default page-sized receive buffer size).
The dump function completes, satisfied, and then, if it happens to be
that this is actually the last skb, and no further ones are to be sent,
then netlink_dump will add on the NLMSG_DONE part:
nlh = nlmsg_put_answer(skb, cb, NLMSG_DONE, sizeof(len), NLM_F_MULTI);
It is very important that netlink_dump does this, of course. However, in
this example, that call to nlmsg_put_answer will fail, because the
previous filling by the dump function did not leave it enough room. And
how could it possibly have done so? All of the nla_put variety of
functions simply check to see if the skb has enough tailroom,
independent of the context it is in.
In order to keep the important assumptions of all netlink dump users, it
is therefore important to give them an skb that has this end part of the
tail already reserved, so that the call to nlmsg_put_answer does not
fail. Otherwise, library authors are forced to find some bizarre sized
receive buffer that has a large modulo relative to the common sizes of
messages received, which is ugly and buggy.
This patch thus saves the NLMSG_DONE for an additional message, for the
case that things are dangerously close to the brim. This requires
keeping track of the errno from ->dump() across calls.
Signed-off-by: Jason A. Donenfeld <Jason(a)zx2c4.com>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
net/netlink/af_netlink.c | 17 +++++++++++------
net/netlink/af_netlink.h | 1 +
2 files changed, 12 insertions(+), 6 deletions(-)
--- a/net/netlink/af_netlink.c
+++ b/net/netlink/af_netlink.c
@@ -2077,7 +2077,7 @@ static int netlink_dump(struct sock *sk)
struct sk_buff *skb = NULL;
struct nlmsghdr *nlh;
struct module *module;
- int len, err = -ENOBUFS;
+ int err = -ENOBUFS;
int alloc_min_size;
int alloc_size;
@@ -2125,9 +2125,11 @@ static int netlink_dump(struct sock *sk)
skb_reserve(skb, skb_tailroom(skb) - alloc_size);
netlink_skb_set_owner_r(skb, sk);
- len = cb->dump(skb, cb);
+ if (nlk->dump_done_errno > 0)
+ nlk->dump_done_errno = cb->dump(skb, cb);
- if (len > 0) {
+ if (nlk->dump_done_errno > 0 ||
+ skb_tailroom(skb) < nlmsg_total_size(sizeof(nlk->dump_done_errno))) {
mutex_unlock(nlk->cb_mutex);
if (sk_filter(sk, skb))
@@ -2137,13 +2139,15 @@ static int netlink_dump(struct sock *sk)
return 0;
}
- nlh = nlmsg_put_answer(skb, cb, NLMSG_DONE, sizeof(len), NLM_F_MULTI);
- if (!nlh)
+ nlh = nlmsg_put_answer(skb, cb, NLMSG_DONE,
+ sizeof(nlk->dump_done_errno), NLM_F_MULTI);
+ if (WARN_ON(!nlh))
goto errout_skb;
nl_dump_check_consistent(cb, nlh);
- memcpy(nlmsg_data(nlh), &len, sizeof(len));
+ memcpy(nlmsg_data(nlh), &nlk->dump_done_errno,
+ sizeof(nlk->dump_done_errno));
if (sk_filter(sk, skb))
kfree_skb(skb);
@@ -2208,6 +2212,7 @@ int __netlink_dump_start(struct sock *ss
cb->skb = skb;
nlk->cb_running = true;
+ nlk->dump_done_errno = INT_MAX;
mutex_unlock(nlk->cb_mutex);
--- a/net/netlink/af_netlink.h
+++ b/net/netlink/af_netlink.h
@@ -38,6 +38,7 @@ struct netlink_sock {
wait_queue_head_t wait;
bool bound;
bool cb_running;
+ int dump_done_errno;
struct netlink_callback cb;
struct mutex *cb_mutex;
struct mutex cb_def_mutex;
Patches currently in stable-queue which might be from Jason(a)zx2c4.com are
queue-4.4/af_netlink-ensure-that-nlmsg_done-never-fails-in-dumps.patch
This is a note to let you know that I've just added the patch titled
ipv6/dccp: do not inherit ipv6_mc_list from parent
to the 3.18-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
ipv6-dccp-do-not-inherit-ipv6_mc_list-from-parent.patch
and it can be found in the queue-3.18 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From 83eaddab4378db256d00d295bda6ca997cd13a52 Mon Sep 17 00:00:00 2001
From: WANG Cong <xiyou.wangcong(a)gmail.com>
Date: Tue, 9 May 2017 16:59:54 -0700
Subject: ipv6/dccp: do not inherit ipv6_mc_list from parent
From: WANG Cong <xiyou.wangcong(a)gmail.com>
commit 83eaddab4378db256d00d295bda6ca997cd13a52 upstream.
Like commit 657831ffc38e ("dccp/tcp: do not inherit mc_list from parent")
we should clear ipv6_mc_list etc. for IPv6 sockets too.
Cc: Eric Dumazet <edumazet(a)google.com>
Signed-off-by: Cong Wang <xiyou.wangcong(a)gmail.com>
Acked-by: Eric Dumazet <edumazet(a)google.com>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
Signed-off-by: Connor O'Brien <connoro(a)google.com>
[AmitP: cherry-picked this backported commit from android-3.18]
Signed-off-by: Amit Pundir <amit.pundir(a)linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
net/dccp/ipv6.c | 7 +++++++
net/ipv6/tcp_ipv6.c | 2 ++
2 files changed, 9 insertions(+)
--- a/net/dccp/ipv6.c
+++ b/net/dccp/ipv6.c
@@ -487,6 +487,9 @@ static struct sock *dccp_v6_request_recv
newsk->sk_backlog_rcv = dccp_v4_do_rcv;
newnp->pktoptions = NULL;
newnp->opt = NULL;
+ newnp->ipv6_mc_list = NULL;
+ newnp->ipv6_ac_list = NULL;
+ newnp->ipv6_fl_list = NULL;
newnp->mcast_oif = inet6_iif(skb);
newnp->mcast_hops = ipv6_hdr(skb)->hop_limit;
@@ -562,6 +565,10 @@ static struct sock *dccp_v6_request_recv
/* Clone RX bits */
newnp->rxopt.all = np->rxopt.all;
+ newnp->ipv6_mc_list = NULL;
+ newnp->ipv6_ac_list = NULL;
+ newnp->ipv6_fl_list = NULL;
+
/* Clone pktoptions received with SYN */
newnp->pktoptions = NULL;
if (ireq->pktopts != NULL) {
--- a/net/ipv6/tcp_ipv6.c
+++ b/net/ipv6/tcp_ipv6.c
@@ -1113,6 +1113,7 @@ static struct sock *tcp_v6_syn_recv_sock
newtp->af_specific = &tcp_sock_ipv6_mapped_specific;
#endif
+ newnp->ipv6_mc_list = NULL;
newnp->ipv6_ac_list = NULL;
newnp->ipv6_fl_list = NULL;
newnp->pktoptions = NULL;
@@ -1184,6 +1185,7 @@ static struct sock *tcp_v6_syn_recv_sock
First: no IPv4 options.
*/
newinet->inet_opt = NULL;
+ newnp->ipv6_mc_list = NULL;
newnp->ipv6_ac_list = NULL;
newnp->ipv6_fl_list = NULL;
Patches currently in stable-queue which might be from xiyou.wangcong(a)gmail.com are
queue-3.18/ipv6-dccp-do-not-inherit-ipv6_mc_list-from-parent.patch
From: Daniel Jurgens <danielj(a)mellanox.com>
For now the only LSM security enforcement mechanism available is
specific to InfiniBand. Bypass enforcement for non-IB link types.
This fixes a regression where modify_qp fails for iWARP because
querying the PKEY returns -EINVAL.
Cc: Paul Moore <paul(a)paul-moore.com>
Cc: Don Dutile <ddutile(a)redhat.com>
Cc: stable(a)vger.kernel.org
Reported-by: Potnuri Bharat Teja <bharat(a)chelsio.com>
Fixes: d291f1a65232("IB/core: Enforce PKey security on QPs")
Fixes: 47a2b338fe63("IB/core: Enforce security on management datagrams")
Signed-off-by: Daniel Jurgens <danielj(a)mellanox.com>
Reviewed-by: Parav Pandit <parav(a)mellanox.com>
Tested-by: Potnuri Bharat Teja <bharat(a)chelsio.com>
Signed-off-by: Leon Romanovsky <leon(a)kernel.org>
---
drivers/infiniband/core/security.c | 9 +++++++++
1 file changed, 9 insertions(+)
diff --git a/drivers/infiniband/core/security.c b/drivers/infiniband/core/security.c
index 23278ed5be45..314bf1137c7b 100644
--- a/drivers/infiniband/core/security.c
+++ b/drivers/infiniband/core/security.c
@@ -417,8 +417,17 @@ void ib_close_shared_qp_security(struct ib_qp_security *sec)
int ib_create_qp_security(struct ib_qp *qp, struct ib_device *dev)
{
+ u8 i = rdma_start_port(dev);
+ bool is_ib = false;
int ret;
+ while (i <= rdma_end_port(dev) && !is_ib)
+ is_ib = rdma_protocol_ib(dev, i++);
+
+ /* If this isn't an IB device don't create the security context */
+ if (!is_ib)
+ return 0;
+
qp->qp_sec = kzalloc(sizeof(*qp->qp_sec), GFP_KERNEL);
if (!qp->qp_sec)
return -ENOMEM;
--
2.15.0
From: Huacai Chen <chenhc(a)lemote.com>
The rps_resp buffer in ata_device is a DMA target, but it isn't
explicitly cacheline aligned. Due to this, adjacent fields can be
overwritten with stale data from memory on non-coherent architectures.
As a result, the kernel is sometimes unable to communicate with an
SATA device behind a SAS expander.
Fix this by ensuring that the rps_resp buffer is cacheline aligned.
This issue is similar to that fixed by Commit 84bda12af31f93 ("libata:
align ap->sector_buf") and Commit 4ee34ea3a12396f35b26 ("libata: Align
ata_device's id on a cacheline").
Cc: stable(a)vger.kernel.org
Signed-off-by: Huacai Chen <chenhc(a)lemote.com>
Signed-off-by: Christoph Hellwig <hch(a)lst.de>
---
include/scsi/libsas.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/include/scsi/libsas.h b/include/scsi/libsas.h
index 0f9cbf96c093..6df6fe0c2198 100644
--- a/include/scsi/libsas.h
+++ b/include/scsi/libsas.h
@@ -159,11 +159,11 @@ struct expander_device {
struct sata_device {
unsigned int class;
- struct smp_resp rps_resp; /* report_phy_sata_resp */
u8 port_no; /* port number, if this is a PM (Port) */
struct ata_port *ap;
struct ata_host ata_host;
+ struct smp_resp rps_resp ____cacheline_aligned; /* report_phy_sata_resp */
u8 fis[ATA_RESP_FIS_SIZE];
};
--
2.14.2
From: Huacai Chen <chenhc(a)lemote.com>
In non-coherent DMA mode, kernel uses cache flushing operations to maintain
I/O coherency, so scsi's block queue should be aligned to the value
returned by dma_get_cache_alignment(). Otherwise, If a DMA buffer and a
kernel structure share a same cache line, and if the kernel structure has
dirty data, cache_invalidate (no writeback) will cause data corruption.
Cc: stable(a)vger.kernel.org
Signed-off-by: Huacai Chen <chenhc(a)lemote.com>
[hch: rebased and updated the comment and changelog]
Signed-off-by: Christoph Hellwig <hch(a)lst.de>
---
drivers/scsi/scsi_lib.c | 10 ++++++----
1 file changed, 6 insertions(+), 4 deletions(-)
diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index 1cbc497e00bd..00742c50cd44 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -2148,11 +2148,13 @@ void __scsi_init_queue(struct Scsi_Host *shost, struct request_queue *q)
q->limits.cluster = 0;
/*
- * set a reasonable default alignment on word boundaries: the
- * host and device may alter it using
- * blk_queue_update_dma_alignment() later.
+ * Set a reasonable default alignment: The larger of 32-byte (dword),
+ * which is a common minimum for HBAs, and the minimum DMA alignment,
+ * which is set by the platform.
+ *
+ * Devices that require a bigger alignment can increase it later.
*/
- blk_queue_dma_alignment(q, 0x03);
+ blk_queue_dma_alignment(q, max(4, dma_get_cache_alignment()) - 1);
}
EXPORT_SYMBOL_GPL(__scsi_init_queue);
--
2.14.2
Provide the dummy version of dma_get_cache_alignment that always returns 1
even if CONFIG_HAS_DMA is not set, so that drivers and subsystems can
use it without ifdefs.
Cc: stable(a)vger.kernel.org
Signed-off-by: Christoph Hellwig <hch(a)lst.de>
---
include/linux/dma-mapping.h | 2 --
1 file changed, 2 deletions(-)
diff --git a/include/linux/dma-mapping.h b/include/linux/dma-mapping.h
index e8f8e8fb244d..81ed9b2d84dc 100644
--- a/include/linux/dma-mapping.h
+++ b/include/linux/dma-mapping.h
@@ -704,7 +704,6 @@ static inline void *dma_zalloc_coherent(struct device *dev, size_t size,
return ret;
}
-#ifdef CONFIG_HAS_DMA
static inline int dma_get_cache_alignment(void)
{
#ifdef ARCH_DMA_MINALIGN
@@ -712,7 +711,6 @@ static inline int dma_get_cache_alignment(void)
#endif
return 1;
}
-#endif
/* flags for the coherent memory api */
#define DMA_MEMORY_EXCLUSIVE 0x01
--
2.14.2
At Linaro we’ve been putting effort into regularly running kernel tests over
arm, arm64 and x86_64 targets. On those targets we’re running mainline, -next,
4.4, and 4.9 kernels and yes we are adding to this list as the hardware
capacity grows.
For test buckets we’re using just LTP, kselftest and libhugetlbfs and
like kernels we will add to this list.
With the 4.14 cycle being a little ‘different’ in so much as the goal to
have it be an LTS kernel I think it’s important to take a look at some
4.14 test results.
Grab a beverage, this is a bit of a long post. But quick summery 4.14 as
released looks just as good as 4.13, for the test buckets I named above.
I’ve enclosed our short form report. We break down the boards/arch combos for
each bucket pass/skip or potentially fails. Pretty straight forward. Skips
generally happen for a few reasons
1) crappy test cases
2) test isn’t appropriate (x86 specific tests so don’t run elsewhere)
With this, we have a decent baseline for 4.14 and other kernels going
forward.
Summary
------------------------------------------------------------------------
kernel: 4.14.0
git repo: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
git branch: master
git commit: bebc6082da0a9f5d47a1ea2edc099bf671058bd4
git describe: v4.14
Test details: https://qa-reports.linaro.org/lkft/linux-mainline-oe/build/v4.14
No regressions (compared to build v4.14-rc8)
Boards, architectures and test suites:
-------------------------------------
hi6220-hikey - arm64
* boot - pass: 20
* kselftest - skip: 16, pass: 38
* libhugetlbfs - skip: 1, pass: 90
* ltp-cap_bounds-tests - pass: 2
* ltp-containers-tests - pass: 76
* ltp-fcntl-locktests-tests - pass: 2
* ltp-filecaps-tests - pass: 2
* ltp-fs-tests - pass: 60
* ltp-fs_bind-tests - pass: 2
* ltp-fs_perms_simple-tests - pass: 19
* ltp-fsx-tests - pass: 2
* ltp-hugetlb-tests - skip: 1, pass: 21
* ltp-io-tests - pass: 3
* ltp-ipc-tests - pass: 9
* ltp-math-tests - pass: 11
* ltp-nptl-tests - pass: 2
* ltp-pty-tests - pass: 4
* ltp-sched-tests - pass: 14
* ltp-securebits-tests - pass: 4
* ltp-syscalls-tests - skip: 122, pass: 983
* ltp-timers-tests - pass: 12
juno-r2 - arm64
* boot - pass: 20
* kselftest - skip: 15, pass: 38
* libhugetlbfs - skip: 1, pass: 90
* ltp-cap_bounds-tests - pass: 2
* ltp-containers-tests - pass: 76
* ltp-fcntl-locktests-tests - pass: 2
* ltp-filecaps-tests - pass: 2
* ltp-fs-tests - pass: 60
* ltp-fs_bind-tests - pass: 2
* ltp-fs_perms_simple-tests - pass: 19
* ltp-fsx-tests - pass: 2
* ltp-hugetlb-tests - pass: 22
* ltp-io-tests - pass: 3
* ltp-ipc-tests - pass: 9
* ltp-math-tests - pass: 11
* ltp-nptl-tests - pass: 2
* ltp-pty-tests - pass: 4
* ltp-sched-tests - pass: 10
* ltp-securebits-tests - pass: 4
* ltp-syscalls-tests - skip: 156, pass: 943
* ltp-timers-tests - pass: 12
x15 - arm
* boot - pass: 20
* kselftest - skip: 17, pass: 36
* libhugetlbfs - skip: 1, pass: 87
* ltp-cap_bounds-tests - pass: 2
* ltp-containers-tests - pass: 64
* ltp-fcntl-locktests-tests - pass: 2
* ltp-filecaps-tests - pass: 2
* ltp-fs-tests - pass: 60
* ltp-fs_bind-tests - pass: 2
* ltp-fs_perms_simple-tests - pass: 19
* ltp-fsx-tests - pass: 2
* ltp-hugetlb-tests - skip: 2, pass: 20
* ltp-io-tests - pass: 3
* ltp-ipc-tests - pass: 9
* ltp-math-tests - pass: 11
* ltp-nptl-tests - pass: 2
* ltp-pty-tests - pass: 4
* ltp-sched-tests - skip: 1, pass: 13
* ltp-securebits-tests - pass: 4
* ltp-syscalls-tests - skip: 66, pass: 1040
* ltp-timers-tests - pass: 12
dell-poweredge-r200 - x86_64
* boot - pass: 19
* kselftest - skip: 11, pass: 54
* libhugetlbfs - skip: 1, pass: 76
* ltp-cap_bounds-tests - pass: 1
* ltp-containers-tests - pass: 64
* ltp-fcntl-locktests-tests - pass: 2
* ltp-filecaps-tests - pass: 2
* ltp-fs-tests - skip: 1, pass: 61
* ltp-fs_bind-tests - pass: 1
* ltp-fs_perms_simple-tests - pass: 19
* ltp-fsx-tests - pass: 2
* ltp-hugetlb-tests - pass: 22
* ltp-io-tests - pass: 3
* ltp-ipc-tests - pass: 8
* ltp-math-tests - pass: 11
* ltp-nptl-tests - pass: 2
* ltp-pty-tests - pass: 4
* ltp-sched-tests - pass: 9
* ltp-securebits-tests - pass: 3
* ltp-syscalls-tests - skip: 163, pass: 962
Lots of green.
Let’s now talk about coverage, the pandora’s box of validation. It’s never
perfect. There’s a bazillion different build combos. Even tools can
make a difference. We’ve seen a case where the dhcp client from open embedded
didn’t trigger a network regression in one of the LTS RCs but Debian’s dhclient
did.
Of no surprise between what we and others have, it’s not perfect coverage,
and there are only so many build, boot and run cycles to execute the test
buckets with various combinations so we need to stay sensible as far as
kernel configs go.
Does this kind of system actually FIND anything and is it useful for
watching for 4.14 regressions as fixes are introduced?
I would assert the answer is yes. We do have data for a couple of kernel
cycles but it’s also somewhat dirty as we have been in the process of
detecting and tossing out dodgy test cases.
Take 4.14-RC7, there was one failure that is no longer there.
ltp-syscalls-tests : perf_event_open02 (arm64)
As things are getting merged post 4.14 there are some failures
cropping up. Here’s an example:
https://qa-reports.linaro.org/lkft/linux-mainline-oe/tests/ltp-fs-tests/pro…
Note the Build column, the kernels are identified by their git describe.
Don’t be alarmed if you see n/a in some columns, the queues are catching up
so data will be filling in.
So why didn’t we report these? As mentioned we’ve been tossing out dodgy
test cases to get to a clean baseline. We don’t need or want noise.
For LTS, I want the system when it detects a failure to enable a quick
bisect involving the affected test bucket. Given the nature of kernel
bugs tho, there is that class of bug which only happens occasionally.
This brings up a conundrum when you have a system like this. A failure
turns up, it’s not consistently failing and a path forward isn’t
necessarily obvious. Remember for an LTS RC, there’s a defined window
to comment.
I’ve been flamed for reporting a LTS RC test failure which didn't include
a fix, just a ‘this fails, and we’re looking at it.’ I’ve been flamed
for not reporting a failure that had been detected but not raised to the
list since it was still being debugged after the RC comment window had
closed.
My 1990s vintage asbestos underwear thankfully is functional.
There is probably a case to be made either way. It boils down to
either:
Red Pill) Be fully open reporting early and often
Blue Pill) Be closed and only pass up failures that include a patch to fix a bug.
Red Pill does expose drama yet it also creates an opportunity for others to
get involved.
Blue Pill protects the community from noise and the creation of frustration
that the system has cried wolf for perhaps a stupid test case.
Likewise from a maintainer or dev perspective, there’s a sea of data.
Time is precious, and who wants to waste it on some snipe hunt?
I’m personally in the Red Pill camp. I like being open.
Be it 0day, LKFT or whatever I think the responsibility is on us
running these projects to be open and give full guidance. Yes there
will be noise. Noise can suggest dodgy test cases or bugs that are
hard to trigger. Either way they warrant a look. Take Arnd Bergman’s
work to get rid of kernel warnings. Same concept in my opinion.
Dodgy test cases can easily be put onto skip lists. As we’ve been
running for a number of months now, data and ol fashioned code
review has been our guide to banish dodgy test cases to skip lists.
Going forward new test cases will pop up. Some of them will be dodgy.
There’s lots of room for collaboration in improving test cases.
In summary I think for mainline, LTS kernels etc, we have a good
warning system to detect regressions as patches flow in. It will evolve
and improve as is the nature of our open community. From kernelci,
LKFT, 0day, etc, that’s a good set of automated systems to ferret out
problems introduced by patches.
Tom
This is a note to let you know that I've just added the patch titled
vlan: fix a use-after-free in vlan_device_event()
to the 4.9-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
vlan-fix-a-use-after-free-in-vlan_device_event.patch
and it can be found in the queue-4.9 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Tue Nov 21 13:08:13 CET 2017
From: Cong Wang <xiyou.wangcong(a)gmail.com>
Date: Thu, 9 Nov 2017 16:43:13 -0800
Subject: vlan: fix a use-after-free in vlan_device_event()
From: Cong Wang <xiyou.wangcong(a)gmail.com>
[ Upstream commit 052d41c01b3a2e3371d66de569717353af489d63 ]
After refcnt reaches zero, vlan_vid_del() could free
dev->vlan_info via RCU:
RCU_INIT_POINTER(dev->vlan_info, NULL);
call_rcu(&vlan_info->rcu, vlan_info_rcu_free);
However, the pointer 'grp' still points to that memory
since it is set before vlan_vid_del():
vlan_info = rtnl_dereference(dev->vlan_info);
if (!vlan_info)
goto out;
grp = &vlan_info->grp;
Depends on when that RCU callback is scheduled, we could
trigger a use-after-free in vlan_group_for_each_dev()
right following this vlan_vid_del().
Fix it by moving vlan_vid_del() before setting grp. This
is also symmetric to the vlan_vid_add() we call in
vlan_device_event().
Reported-by: Fengguang Wu <fengguang.wu(a)intel.com>
Fixes: efc73f4bbc23 ("net: Fix memory leak - vlan_info struct")
Cc: Alexander Duyck <alexander.duyck(a)gmail.com>
Cc: Linus Torvalds <torvalds(a)linux-foundation.org>
Cc: Girish Moodalbail <girish.moodalbail(a)oracle.com>
Signed-off-by: Cong Wang <xiyou.wangcong(a)gmail.com>
Reviewed-by: Girish Moodalbail <girish.moodalbail(a)oracle.com>
Tested-by: Fengguang Wu <fengguang.wu(a)intel.com>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
net/8021q/vlan.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
--- a/net/8021q/vlan.c
+++ b/net/8021q/vlan.c
@@ -376,6 +376,9 @@ static int vlan_device_event(struct noti
dev->name);
vlan_vid_add(dev, htons(ETH_P_8021Q), 0);
}
+ if (event == NETDEV_DOWN &&
+ (dev->features & NETIF_F_HW_VLAN_CTAG_FILTER))
+ vlan_vid_del(dev, htons(ETH_P_8021Q), 0);
vlan_info = rtnl_dereference(dev->vlan_info);
if (!vlan_info)
@@ -423,9 +426,6 @@ static int vlan_device_event(struct noti
struct net_device *tmp;
LIST_HEAD(close_list);
- if (dev->features & NETIF_F_HW_VLAN_CTAG_FILTER)
- vlan_vid_del(dev, htons(ETH_P_8021Q), 0);
-
/* Put all VLANs for this dev in the down state too. */
vlan_group_for_each_dev(grp, i, vlandev) {
flgs = vlandev->flags;
Patches currently in stable-queue which might be from xiyou.wangcong(a)gmail.com are
queue-4.9/vlan-fix-a-use-after-free-in-vlan_device_event.patch
This is a note to let you know that I've just added the patch titled
tcp_nv: fix division by zero in tcpnv_acked()
to the 4.9-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
tcp_nv-fix-division-by-zero-in-tcpnv_acked.patch
and it can be found in the queue-4.9 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Tue Nov 21 13:08:13 CET 2017
From: Konstantin Khlebnikov <khlebnikov(a)yandex-team.ru>
Date: Wed, 1 Nov 2017 16:32:15 +0300
Subject: tcp_nv: fix division by zero in tcpnv_acked()
From: Konstantin Khlebnikov <khlebnikov(a)yandex-team.ru>
[ Upstream commit 4eebff27ca4182bbf5f039dd60d79e2d7c0a707e ]
Average RTT could become zero. This happened in real life at least twice.
This patch treats zero as 1us.
Signed-off-by: Konstantin Khlebnikov <khlebnikov(a)yandex-team.ru>
Acked-by: Lawrence Brakmo <Brakmo(a)fb.com>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
net/ipv4/tcp_nv.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
--- a/net/ipv4/tcp_nv.c
+++ b/net/ipv4/tcp_nv.c
@@ -263,7 +263,7 @@ static void tcpnv_acked(struct sock *sk,
/* rate in 100's bits per second */
rate64 = ((u64)sample->in_flight) * 8000000;
- rate = (u32)div64_u64(rate64, (u64)(avg_rtt * 100));
+ rate = (u32)div64_u64(rate64, (u64)(avg_rtt ?: 1) * 100);
/* Remember the maximum rate seen during this RTT
* Note: It may be more than one RTT. This function should be
Patches currently in stable-queue which might be from khlebnikov(a)yandex-team.ru are
queue-4.9/tcp_nv-fix-division-by-zero-in-tcpnv_acked.patch
This is a note to let you know that I've just added the patch titled
tcp: do not mangle skb->cb[] in tcp_make_synack()
to the 4.9-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
tcp-do-not-mangle-skb-cb-in-tcp_make_synack.patch
and it can be found in the queue-4.9 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Tue Nov 21 13:08:13 CET 2017
From: Eric Dumazet <edumazet(a)google.com>
Date: Thu, 2 Nov 2017 12:30:25 -0700
Subject: tcp: do not mangle skb->cb[] in tcp_make_synack()
From: Eric Dumazet <edumazet(a)google.com>
[ Upstream commit 3b11775033dc87c3d161996c54507b15ba26414a ]
Christoph Paasch sent a patch to address the following issue :
tcp_make_synack() is leaving some TCP private info in skb->cb[],
then send the packet by other means than tcp_transmit_skb()
tcp_transmit_skb() makes sure to clear skb->cb[] to not confuse
IPv4/IPV6 stacks, but we have no such cleanup for SYNACK.
tcp_make_synack() should not use tcp_init_nondata_skb() :
tcp_init_nondata_skb() really should be limited to skbs put in write/rtx
queues (the ones that are only sent via tcp_transmit_skb())
This patch fixes the issue and should even save few cpu cycles ;)
Fixes: 971f10eca186 ("tcp: better TCP_SKB_CB layout to reduce cache line misses")
Signed-off-by: Eric Dumazet <edumazet(a)google.com>
Reported-by: Christoph Paasch <cpaasch(a)apple.com>
Reviewed-by: Christoph Paasch <cpaasch(a)apple.com>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
net/ipv4/tcp_output.c | 9 ++-------
1 file changed, 2 insertions(+), 7 deletions(-)
--- a/net/ipv4/tcp_output.c
+++ b/net/ipv4/tcp_output.c
@@ -3110,13 +3110,8 @@ struct sk_buff *tcp_make_synack(const st
tcp_ecn_make_synack(req, th);
th->source = htons(ireq->ir_num);
th->dest = ireq->ir_rmt_port;
- /* Setting of flags are superfluous here for callers (and ECE is
- * not even correctly set)
- */
- tcp_init_nondata_skb(skb, tcp_rsk(req)->snt_isn,
- TCPHDR_SYN | TCPHDR_ACK);
-
- th->seq = htonl(TCP_SKB_CB(skb)->seq);
+ skb->ip_summed = CHECKSUM_PARTIAL;
+ th->seq = htonl(tcp_rsk(req)->snt_isn);
/* XXX data is queued and acked as is. No buffer/window check */
th->ack_seq = htonl(tcp_rsk(req)->rcv_nxt);
Patches currently in stable-queue which might be from edumazet(a)google.com are
queue-4.9/tcp-do-not-mangle-skb-cb-in-tcp_make_synack.patch
This is a note to let you know that I've just added the patch titled
sctp: do not peel off an assoc from one netns to another one
to the 4.9-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
sctp-do-not-peel-off-an-assoc-from-one-netns-to-another-one.patch
and it can be found in the queue-4.9 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Tue Nov 21 13:08:13 CET 2017
From: Xin Long <lucien.xin(a)gmail.com>
Date: Tue, 17 Oct 2017 23:26:10 +0800
Subject: sctp: do not peel off an assoc from one netns to another one
From: Xin Long <lucien.xin(a)gmail.com>
[ Upstream commit df80cd9b28b9ebaa284a41df611dbf3a2d05ca74 ]
Now when peeling off an association to the sock in another netns, all
transports in this assoc are not to be rehashed and keep use the old
key in hashtable.
As a transport uses sk->net as the hash key to insert into hashtable,
it would miss removing these transports from hashtable due to the new
netns when closing the sock and all transports are being freeed, then
later an use-after-free issue could be caused when looking up an asoc
and dereferencing those transports.
This is a very old issue since very beginning, ChunYu found it with
syzkaller fuzz testing with this series:
socket$inet6_sctp()
bind$inet6()
sendto$inet6()
unshare(0x40000000)
getsockopt$inet_sctp6_SCTP_GET_ASSOC_ID_LIST()
getsockopt$inet_sctp6_SCTP_SOCKOPT_PEELOFF()
This patch is to block this call when peeling one assoc off from one
netns to another one, so that the netns of all transport would not
go out-sync with the key in hashtable.
Note that this patch didn't fix it by rehashing transports, as it's
difficult to handle the situation when the tuple is already in use
in the new netns. Besides, no one would like to peel off one assoc
to another netns, considering ipaddrs, ifaces, etc. are usually
different.
Reported-by: ChunYu Wang <chunwang(a)redhat.com>
Signed-off-by: Xin Long <lucien.xin(a)gmail.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner(a)gmail.com>
Acked-by: Neil Horman <nhorman(a)tuxdriver.com>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
net/sctp/socket.c | 4 ++++
1 file changed, 4 insertions(+)
--- a/net/sctp/socket.c
+++ b/net/sctp/socket.c
@@ -4764,6 +4764,10 @@ int sctp_do_peeloff(struct sock *sk, sct
struct socket *sock;
int err = 0;
+ /* Do not peel off from one netns to another one. */
+ if (!net_eq(current->nsproxy->net_ns, sock_net(sk)))
+ return -EINVAL;
+
if (!asoc)
return -EINVAL;
Patches currently in stable-queue which might be from lucien.xin(a)gmail.com are
queue-4.9/sctp-do-not-peel-off-an-assoc-from-one-netns-to-another-one.patch
This is a note to let you know that I've just added the patch titled
qmi_wwan: Add missing skb_reset_mac_header-call
to the 4.9-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
qmi_wwan-add-missing-skb_reset_mac_header-call.patch
and it can be found in the queue-4.9 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Tue Nov 21 13:08:13 CET 2017
From: Kristian Evensen <kristian.evensen(a)gmail.com>
Date: Tue, 7 Nov 2017 13:47:56 +0100
Subject: qmi_wwan: Add missing skb_reset_mac_header-call
From: Kristian Evensen <kristian.evensen(a)gmail.com>
[ Upstream commit 0de0add10e587effa880c741c9413c874f16be91 ]
When we receive a packet on a QMI device in raw IP mode, we should call
skb_reset_mac_header() to ensure that skb->mac_header contains a valid
offset in the packet. While it shouldn't really matter, the packets have
no MAC header and the interface is configured as-such, it seems certain
parts of the network stack expects a "good" value in skb->mac_header.
Without the skb_reset_mac_header() call added in this patch, for example
shaping traffic (using tc) triggers the following oops on the first
received packet:
[ 303.642957] skbuff: skb_under_panic: text:8f137918 len:177 put:67 head:8e4b0f00 data:8e4b0eff tail:0x8e4b0fb0 end:0x8e4b1520 dev:wwan0
[ 303.655045] Kernel bug detected[#1]:
[ 303.658622] CPU: 1 PID: 1002 Comm: logd Not tainted 4.9.58 #0
[ 303.664339] task: 8fdf05e0 task.stack: 8f15c000
[ 303.668844] $ 0 : 00000000 00000001 0000007a 00000000
[ 303.674062] $ 4 : 8149a2fc 8149a2fc 8149ce20 00000000
[ 303.679284] $ 8 : 00000030 3878303a 31623465 20303235
[ 303.684510] $12 : ded731e3 2626a277 00000000 03bd0000
[ 303.689747] $16 : 8ef62b40 00000043 8f137918 804db5fc
[ 303.694978] $20 : 00000001 00000004 8fc13800 00000003
[ 303.700215] $24 : 00000001 8024ab10
[ 303.705442] $28 : 8f15c000 8fc19cf0 00000043 802cc920
[ 303.710664] Hi : 00000000
[ 303.713533] Lo : 74e58000
[ 303.716436] epc : 802cc920 skb_panic+0x58/0x5c
[ 303.721046] ra : 802cc920 skb_panic+0x58/0x5c
[ 303.725639] Status: 11007c03 KERNEL EXL IE
[ 303.729823] Cause : 50800024 (ExcCode 09)
[ 303.733817] PrId : 0001992f (MIPS 1004Kc)
[ 303.737892] Modules linked in: rt2800pci rt2800mmio rt2800lib qcserial ppp_async option usb_wwan rt2x00pci rt2x00mmio rt2x00lib rndis_host qmi_wwan ppp_generic nf_nat_pptp nf_conntrack_pptp nf_conntrack_ipv6 mt76x2i
Process logd (pid: 1002, threadinfo=8f15c000, task=8fdf05e0, tls=77b3eee4)
[ 303.962509] Stack : 00000000 80408990 8f137918 000000b1 00000043 8e4b0f00 8e4b0eff 8e4b0fb0
[ 303.970871] 8e4b1520 8fec1800 00000043 802cd2a4 6e000045 00000043 00000000 8ef62000
[ 303.979219] 8eef5d00 8ef62b40 8fea7300 8f137918 00000000 00000000 0002bb01 793e5664
[ 303.987568] 8ef08884 00000001 8fea7300 00000002 8fc19e80 8eef5d00 00000006 00000003
[ 303.995934] 00000000 8030ba90 00000003 77ab3fd0 8149dc80 8004d1bc 8f15c000 8f383700
[ 304.004324] ...
[ 304.006767] Call Trace:
[ 304.009241] [<802cc920>] skb_panic+0x58/0x5c
[ 304.013504] [<802cd2a4>] skb_push+0x78/0x90
[ 304.017783] [<8f137918>] 0x8f137918
[ 304.021269] Code: 00602825 0c02a3b4 24842888 <000c000d> 8c870060 8c8200a0 0007382b 00070336 8c88005c
[ 304.031034]
[ 304.032805] ---[ end trace b778c482b3f0bda9 ]---
[ 304.041384] Kernel panic - not syncing: Fatal exception in interrupt
[ 304.051975] Rebooting in 3 seconds..
While the oops is for a 4.9-kernel, I was able to trigger the same oops with
net-next as of yesterday.
Fixes: 32f7adf633b9 ("net: qmi_wwan: support "raw IP" mode")
Signed-off-by: Kristian Evensen <kristian.evensen(a)gmail.com>
Acked-by: Bjørn Mork <bjorn(a)mork.no>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/net/usb/qmi_wwan.c | 1 +
1 file changed, 1 insertion(+)
--- a/drivers/net/usb/qmi_wwan.c
+++ b/drivers/net/usb/qmi_wwan.c
@@ -205,6 +205,7 @@ static int qmi_wwan_rx_fixup(struct usbn
return 1;
}
if (rawip) {
+ skb_reset_mac_header(skb);
skb->dev = dev->net; /* normally set by eth_type_trans */
skb->protocol = proto;
return 1;
Patches currently in stable-queue which might be from kristian.evensen(a)gmail.com are
queue-4.9/qmi_wwan-add-missing-skb_reset_mac_header-call.patch
This is a note to let you know that I've just added the patch titled
netfilter/ipvs: clear ipvs_property flag when SKB net namespace changed
to the 4.9-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
netfilter-ipvs-clear-ipvs_property-flag-when-skb-net-namespace-changed.patch
and it can be found in the queue-4.9 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Tue Nov 21 13:08:13 CET 2017
From: Ye Yin <hustcat(a)gmail.com>
Date: Thu, 26 Oct 2017 16:57:05 +0800
Subject: netfilter/ipvs: clear ipvs_property flag when SKB net namespace changed
From: Ye Yin <hustcat(a)gmail.com>
[ Upstream commit 2b5ec1a5f9738ee7bf8f5ec0526e75e00362c48f ]
When run ipvs in two different network namespace at the same host, and one
ipvs transport network traffic to the other network namespace ipvs.
'ipvs_property' flag will make the second ipvs take no effect. So we should
clear 'ipvs_property' when SKB network namespace changed.
Fixes: 621e84d6f373 ("dev: introduce skb_scrub_packet()")
Signed-off-by: Ye Yin <hustcat(a)gmail.com>
Signed-off-by: Wei Zhou <chouryzhou(a)gmail.com>
Signed-off-by: Julian Anastasov <ja(a)ssi.bg>
Signed-off-by: Simon Horman <horms(a)verge.net.au>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
include/linux/skbuff.h | 7 +++++++
net/core/skbuff.c | 1 +
2 files changed, 8 insertions(+)
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -3584,6 +3584,13 @@ static inline void nf_reset_trace(struct
#endif
}
+static inline void ipvs_reset(struct sk_buff *skb)
+{
+#if IS_ENABLED(CONFIG_IP_VS)
+ skb->ipvs_property = 0;
+#endif
+}
+
/* Note: This doesn't put any conntrack and bridge info in dst. */
static inline void __nf_copy(struct sk_buff *dst, const struct sk_buff *src,
bool copy)
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -4375,6 +4375,7 @@ void skb_scrub_packet(struct sk_buff *sk
if (!xnet)
return;
+ ipvs_reset(skb);
skb_orphan(skb);
skb->mark = 0;
}
Patches currently in stable-queue which might be from hustcat(a)gmail.com are
queue-4.9/netfilter-ipvs-clear-ipvs_property-flag-when-skb-net-namespace-changed.patch
This is a note to let you know that I've just added the patch titled
net: vrf: correct FRA_L3MDEV encode type
to the 4.9-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
net-vrf-correct-fra_l3mdev-encode-type.patch
and it can be found in the queue-4.9 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Tue Nov 21 13:08:13 CET 2017
From: Jeff Barnhill <0xeffeff(a)gmail.com>
Date: Wed, 1 Nov 2017 14:58:09 +0000
Subject: net: vrf: correct FRA_L3MDEV encode type
From: Jeff Barnhill <0xeffeff(a)gmail.com>
[ Upstream commit 18129a24983906eaf2a2d448ce4b83e27091ebe2 ]
FRA_L3MDEV is defined as U8, but is being added as a U32 attribute. On
big endian architecture, this results in the l3mdev entry not being
added to the FIB rules.
Fixes: 1aa6c4f6b8cd8 ("net: vrf: Add l3mdev rules on first device create")
Signed-off-by: Jeff Barnhill <0xeffeff(a)gmail.com>
Acked-by: David Ahern <dsahern(a)gmail.com>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/net/vrf.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/net/vrf.c
+++ b/drivers/net/vrf.c
@@ -1129,7 +1129,7 @@ static int vrf_fib_rule(const struct net
frh->family = family;
frh->action = FR_ACT_TO_TBL;
- if (nla_put_u32(skb, FRA_L3MDEV, 1))
+ if (nla_put_u8(skb, FRA_L3MDEV, 1))
goto nla_put_failure;
if (nla_put_u32(skb, FRA_PRIORITY, FIB_RULE_PREF))
Patches currently in stable-queue which might be from 0xeffeff(a)gmail.com are
queue-4.9/net-vrf-correct-fra_l3mdev-encode-type.patch
This is a note to let you know that I've just added the patch titled
net: usb: asix: fill null-ptr-deref in asix_suspend
to the 4.9-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
net-usb-asix-fill-null-ptr-deref-in-asix_suspend.patch
and it can be found in the queue-4.9 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Tue Nov 21 13:08:13 CET 2017
From: Andrey Konovalov <andreyknvl(a)google.com>
Date: Mon, 6 Nov 2017 13:26:46 +0100
Subject: net: usb: asix: fill null-ptr-deref in asix_suspend
From: Andrey Konovalov <andreyknvl(a)google.com>
[ Upstream commit 8f5624629105589bcc23d0e51cc01bd8103d09a5 ]
When asix_suspend() is called dev->driver_priv might not have been
assigned a value, so we need to check that it's not NULL.
Similar issue is present in asix_resume(), this patch fixes it as well.
Found by syzkaller.
kasan: CONFIG_KASAN_INLINE enabled
kasan: GPF could be caused by NULL-ptr deref or user memory access
general protection fault: 0000 [#1] PREEMPT SMP KASAN
Modules linked in:
CPU: 0 PID: 24 Comm: kworker/0:1 Not tainted 4.14.0-rc4-43422-geccacdd69a8c #400
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
Workqueue: usb_hub_wq hub_event
task: ffff88006bb36300 task.stack: ffff88006bba8000
RIP: 0010:asix_suspend+0x76/0xc0 drivers/net/usb/asix_devices.c:629
RSP: 0018:ffff88006bbae718 EFLAGS: 00010202
RAX: dffffc0000000000 RBX: ffff880061ba3b80 RCX: 1ffff1000c34d644
RDX: 0000000000000001 RSI: 0000000000000402 RDI: 0000000000000008
RBP: ffff88006bbae738 R08: 1ffff1000d775cad R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000000 R12: ffff8800630a8b40
R13: 0000000000000000 R14: 0000000000000402 R15: ffff880061ba3b80
FS: 0000000000000000(0000) GS:ffff88006c600000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007ff33cf89000 CR3: 0000000061c0a000 CR4: 00000000000006f0
Call Trace:
usb_suspend_interface drivers/usb/core/driver.c:1209
usb_suspend_both+0x27f/0x7e0 drivers/usb/core/driver.c:1314
usb_runtime_suspend+0x41/0x120 drivers/usb/core/driver.c:1852
__rpm_callback+0x339/0xb60 drivers/base/power/runtime.c:334
rpm_callback+0x106/0x220 drivers/base/power/runtime.c:461
rpm_suspend+0x465/0x1980 drivers/base/power/runtime.c:596
__pm_runtime_suspend+0x11e/0x230 drivers/base/power/runtime.c:1009
pm_runtime_put_sync_autosuspend ./include/linux/pm_runtime.h:251
usb_new_device+0xa37/0x1020 drivers/usb/core/hub.c:2487
hub_port_connect drivers/usb/core/hub.c:4903
hub_port_connect_change drivers/usb/core/hub.c:5009
port_event drivers/usb/core/hub.c:5115
hub_event+0x194d/0x3740 drivers/usb/core/hub.c:5195
process_one_work+0xc7f/0x1db0 kernel/workqueue.c:2119
worker_thread+0x221/0x1850 kernel/workqueue.c:2253
kthread+0x3a1/0x470 kernel/kthread.c:231
ret_from_fork+0x2a/0x40 arch/x86/entry/entry_64.S:431
Code: 8d 7c 24 20 48 89 fa 48 c1 ea 03 80 3c 02 00 75 5b 48 b8 00 00
00 00 00 fc ff df 4d 8b 6c 24 20 49 8d 7d 08 48 89 fa 48 c1 ea 03 <80>
3c 02 00 75 34 4d 8b 6d 08 4d 85 ed 74 0b e8 26 2b 51 fd 4c
RIP: asix_suspend+0x76/0xc0 RSP: ffff88006bbae718
---[ end trace dfc4f5649284342c ]---
Signed-off-by: Andrey Konovalov <andreyknvl(a)google.com>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/net/usb/asix_devices.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
--- a/drivers/net/usb/asix_devices.c
+++ b/drivers/net/usb/asix_devices.c
@@ -624,7 +624,7 @@ static int asix_suspend(struct usb_inter
struct usbnet *dev = usb_get_intfdata(intf);
struct asix_common_private *priv = dev->driver_priv;
- if (priv->suspend)
+ if (priv && priv->suspend)
priv->suspend(dev);
return usbnet_suspend(intf, message);
@@ -676,7 +676,7 @@ static int asix_resume(struct usb_interf
struct usbnet *dev = usb_get_intfdata(intf);
struct asix_common_private *priv = dev->driver_priv;
- if (priv->resume)
+ if (priv && priv->resume)
priv->resume(dev);
return usbnet_resume(intf);
Patches currently in stable-queue which might be from andreyknvl(a)google.com are
queue-4.9/net-qmi_wwan-fix-divide-by-0-on-bad-descriptors.patch
queue-4.9/net-usb-asix-fill-null-ptr-deref-in-asix_suspend.patch
This is a note to let you know that I've just added the patch titled
net/sctp: Always set scope_id in sctp_inet6_skb_msgname
to the 4.9-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
net-sctp-always-set-scope_id-in-sctp_inet6_skb_msgname.patch
and it can be found in the queue-4.9 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Tue Nov 21 13:08:13 CET 2017
From: "Eric W. Biederman" <ebiederm(a)xmission.com>
Date: Wed, 15 Nov 2017 22:17:48 -0600
Subject: net/sctp: Always set scope_id in sctp_inet6_skb_msgname
From: "Eric W. Biederman" <ebiederm(a)xmission.com>
[ Upstream commit 7c8a61d9ee1df0fb4747879fa67a99614eb62fec ]
Alexandar Potapenko while testing the kernel with KMSAN and syzkaller
discovered that in some configurations sctp would leak 4 bytes of
kernel stack.
Working with his reproducer I discovered that those 4 bytes that
are leaked is the scope id of an ipv6 address returned by recvmsg.
With a little code inspection and a shrewd guess I discovered that
sctp_inet6_skb_msgname only initializes the scope_id field for link
local ipv6 addresses to the interface index the link local address
pertains to instead of initializing the scope_id field for all ipv6
addresses.
That is almost reasonable as scope_id's are meaniningful only for link
local addresses. Set the scope_id in all other cases to 0 which is
not a valid interface index to make it clear there is nothing useful
in the scope_id field.
There should be no danger of breaking userspace as the stack leak
guaranteed that previously meaningless random data was being returned.
Fixes: 372f525b495c ("SCTP: Resync with LKSCTP tree.")
History-tree: https://git.kernel.org/pub/scm/linux/kernel/git/tglx/history.git
Reported-by: Alexander Potapenko <glider(a)google.com>
Tested-by: Alexander Potapenko <glider(a)google.com>
Signed-off-by: "Eric W. Biederman" <ebiederm(a)xmission.com>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
net/sctp/ipv6.c | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)
--- a/net/sctp/ipv6.c
+++ b/net/sctp/ipv6.c
@@ -806,9 +806,10 @@ static void sctp_inet6_skb_msgname(struc
addr->v6.sin6_flowinfo = 0;
addr->v6.sin6_port = sh->source;
addr->v6.sin6_addr = ipv6_hdr(skb)->saddr;
- if (ipv6_addr_type(&addr->v6.sin6_addr) & IPV6_ADDR_LINKLOCAL) {
+ if (ipv6_addr_type(&addr->v6.sin6_addr) & IPV6_ADDR_LINKLOCAL)
addr->v6.sin6_scope_id = sctp_v6_skb_iif(skb);
- }
+ else
+ addr->v6.sin6_scope_id = 0;
}
*addr_len = sctp_v6_addr_to_user(sctp_sk(skb->sk), addr);
Patches currently in stable-queue which might be from ebiederm(a)xmission.com are
queue-4.9/net-sctp-always-set-scope_id-in-sctp_inet6_skb_msgname.patch
This is a note to let you know that I've just added the patch titled
net: qmi_wwan: fix divide by 0 on bad descriptors
to the 4.9-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
net-qmi_wwan-fix-divide-by-0-on-bad-descriptors.patch
and it can be found in the queue-4.9 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Tue Nov 21 13:08:13 CET 2017
From: Bjørn Mork <bjorn(a)mork.no>
Date: Mon, 6 Nov 2017 15:32:18 +0100
Subject: net: qmi_wwan: fix divide by 0 on bad descriptors
From: Bjørn Mork <bjorn(a)mork.no>
[ Upstream commit 7fd078337201cf7468f53c3d9ef81ff78cb6df3b ]
A CDC Ethernet functional descriptor with wMaxSegmentSize = 0 will
cause a divide error in usbnet_probe:
divide error: 0000 [#1] PREEMPT SMP KASAN
Modules linked in:
CPU: 0 PID: 24 Comm: kworker/0:1 Not tainted 4.14.0-rc8-44453-g1fdc1a82c34f #56
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
Workqueue: usb_hub_wq hub_event
task: ffff88006bef5c00 task.stack: ffff88006bf60000
RIP: 0010:usbnet_update_max_qlen+0x24d/0x390 drivers/net/usb/usbnet.c:355
RSP: 0018:ffff88006bf67508 EFLAGS: 00010246
RAX: 00000000000163c8 RBX: ffff8800621fce40 RCX: ffff8800621fcf34
RDX: 0000000000000000 RSI: ffffffff837ecb7a RDI: ffff8800621fcf34
RBP: ffff88006bf67520 R08: ffff88006bef5c00 R09: ffffed000c43f881
R10: ffffed000c43f880 R11: ffff8800621fc406 R12: 0000000000000003
R13: ffffffff85c71de0 R14: 0000000000000000 R15: 0000000000000000
FS: 0000000000000000(0000) GS:ffff88006ca00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007ffe9c0d6dac CR3: 00000000614f4000 CR4: 00000000000006f0
Call Trace:
usbnet_probe+0x18b5/0x2790 drivers/net/usb/usbnet.c:1783
qmi_wwan_probe+0x133/0x220 drivers/net/usb/qmi_wwan.c:1338
usb_probe_interface+0x324/0x940 drivers/usb/core/driver.c:361
really_probe drivers/base/dd.c:413
driver_probe_device+0x522/0x740 drivers/base/dd.c:557
Fix by simply ignoring the bogus descriptor, as it is optional
for QMI devices anyway.
Fixes: 423ce8caab7e ("net: usb: qmi_wwan: New driver for Huawei QMI based WWAN devices")
Reported-by: Andrey Konovalov <andreyknvl(a)google.com>
Signed-off-by: Bjørn Mork <bjorn(a)mork.no>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/net/usb/qmi_wwan.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/net/usb/qmi_wwan.c
+++ b/drivers/net/usb/qmi_wwan.c
@@ -386,7 +386,7 @@ static int qmi_wwan_bind(struct usbnet *
}
/* errors aren't fatal - we can live with the dynamic address */
- if (cdc_ether) {
+ if (cdc_ether && cdc_ether->wMaxSegmentSize) {
dev->hard_mtu = le16_to_cpu(cdc_ether->wMaxSegmentSize);
usbnet_get_ethernet_addr(dev, cdc_ether->iMACAddress);
}
Patches currently in stable-queue which might be from bjorn(a)mork.no are
queue-4.9/net-cdc_ether-fix-divide-by-0-on-bad-descriptors.patch
queue-4.9/qmi_wwan-add-missing-skb_reset_mac_header-call.patch
queue-4.9/net-qmi_wwan-fix-divide-by-0-on-bad-descriptors.patch
This is a note to let you know that I've just added the patch titled
net: cdc_ether: fix divide by 0 on bad descriptors
to the 4.9-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
net-cdc_ether-fix-divide-by-0-on-bad-descriptors.patch
and it can be found in the queue-4.9 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Tue Nov 21 13:08:13 CET 2017
From: Bjørn Mork <bjorn(a)mork.no>
Date: Mon, 6 Nov 2017 15:37:22 +0100
Subject: net: cdc_ether: fix divide by 0 on bad descriptors
From: Bjørn Mork <bjorn(a)mork.no>
[ Upstream commit 2cb80187ba065d7decad7c6614e35e07aec8a974 ]
Setting dev->hard_mtu to 0 will cause a divide error in
usbnet_probe. Protect against devices with bogus CDC Ethernet
functional descriptors by ignoring a zero wMaxSegmentSize.
Signed-off-by: Bjørn Mork <bjorn(a)mork.no>
Acked-by: Oliver Neukum <oneukum(a)suse.com>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/net/usb/cdc_ether.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/net/usb/cdc_ether.c
+++ b/drivers/net/usb/cdc_ether.c
@@ -221,7 +221,7 @@ skip:
goto bad_desc;
}
- if (header.usb_cdc_ether_desc) {
+ if (header.usb_cdc_ether_desc && info->ether->wMaxSegmentSize) {
dev->hard_mtu = le16_to_cpu(info->ether->wMaxSegmentSize);
/* because of Zaurus, we may be ignoring the host
* side link address we were given.
Patches currently in stable-queue which might be from bjorn(a)mork.no are
queue-4.9/net-cdc_ether-fix-divide-by-0-on-bad-descriptors.patch
queue-4.9/qmi_wwan-add-missing-skb_reset_mac_header-call.patch
queue-4.9/net-qmi_wwan-fix-divide-by-0-on-bad-descriptors.patch
This is a note to let you know that I've just added the patch titled
fealnx: Fix building error on MIPS
to the 4.9-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
fealnx-fix-building-error-on-mips.patch
and it can be found in the queue-4.9 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Tue Nov 21 13:08:13 CET 2017
From: Huacai Chen <chenhc(a)lemote.com>
Date: Thu, 16 Nov 2017 11:07:15 +0800
Subject: fealnx: Fix building error on MIPS
From: Huacai Chen <chenhc(a)lemote.com>
[ Upstream commit cc54c1d32e6a4bb3f116721abf900513173e4d02 ]
This patch try to fix the building error on MIPS. The reason is MIPS
has already defined the LONG macro, which conflicts with the LONG enum
in drivers/net/ethernet/fealnx.c.
Signed-off-by: Huacai Chen <chenhc(a)lemote.com>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/net/ethernet/fealnx.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
--- a/drivers/net/ethernet/fealnx.c
+++ b/drivers/net/ethernet/fealnx.c
@@ -257,8 +257,8 @@ enum rx_desc_status_bits {
RXFSD = 0x00000800, /* first descriptor */
RXLSD = 0x00000400, /* last descriptor */
ErrorSummary = 0x80, /* error summary */
- RUNT = 0x40, /* runt packet received */
- LONG = 0x20, /* long packet received */
+ RUNTPKT = 0x40, /* runt packet received */
+ LONGPKT = 0x20, /* long packet received */
FAE = 0x10, /* frame align error */
CRC = 0x08, /* crc error */
RXER = 0x04, /* receive error */
@@ -1633,7 +1633,7 @@ static int netdev_rx(struct net_device *
dev->name, rx_status);
dev->stats.rx_errors++; /* end of a packet. */
- if (rx_status & (LONG | RUNT))
+ if (rx_status & (LONGPKT | RUNTPKT))
dev->stats.rx_length_errors++;
if (rx_status & RXER)
dev->stats.rx_frame_errors++;
Patches currently in stable-queue which might be from chenhc(a)lemote.com are
queue-4.9/fealnx-fix-building-error-on-mips.patch
This is a note to let you know that I've just added the patch titled
bonding: discard lowest hash bit for 802.3ad layer3+4
to the 4.9-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
bonding-discard-lowest-hash-bit-for-802.3ad-layer3-4.patch
and it can be found in the queue-4.9 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Tue Nov 21 13:08:13 CET 2017
From: Hangbin Liu <liuhangbin(a)gmail.com>
Date: Mon, 6 Nov 2017 09:01:57 +0800
Subject: bonding: discard lowest hash bit for 802.3ad layer3+4
From: Hangbin Liu <liuhangbin(a)gmail.com>
[ Upstream commit b5f862180d7011d9575d0499fa37f0f25b423b12 ]
After commit 07f4c90062f8 ("tcp/dccp: try to not exhaust ip_local_port_range
in connect()"), we will try to use even ports for connect(). Then if an
application (seen clearly with iperf) opens multiple streams to the same
destination IP and port, each stream will be given an even source port.
So the bonding driver's simple xmit_hash_policy based on layer3+4 addressing
will always hash all these streams to the same interface. And the total
throughput will limited to a single slave.
Change the tcp code will impact the whole tcp behavior, only for bonding
usage. Paolo Abeni suggested fix this by changing the bonding code only,
which should be more reasonable, and less impact.
Fix this by discarding the lowest hash bit because it contains little entropy.
After the fix we can re-balance between slaves.
Signed-off-by: Paolo Abeni <pabeni(a)redhat.com>
Signed-off-by: Hangbin Liu <liuhangbin(a)gmail.com>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/net/bonding/bond_main.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/net/bonding/bond_main.c
+++ b/drivers/net/bonding/bond_main.c
@@ -3217,7 +3217,7 @@ u32 bond_xmit_hash(struct bonding *bond,
hash ^= (hash >> 16);
hash ^= (hash >> 8);
- return hash;
+ return hash >> 1;
}
/*-------------------------- Device entry points ----------------------------*/
Patches currently in stable-queue which might be from liuhangbin(a)gmail.com are
queue-4.9/bonding-discard-lowest-hash-bit-for-802.3ad-layer3-4.patch
This is a note to let you know that I've just added the patch titled
af_netlink: ensure that NLMSG_DONE never fails in dumps
to the 4.9-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
af_netlink-ensure-that-nlmsg_done-never-fails-in-dumps.patch
and it can be found in the queue-4.9 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Tue Nov 21 13:08:13 CET 2017
From: "Jason A. Donenfeld" <Jason(a)zx2c4.com>
Date: Thu, 9 Nov 2017 13:04:44 +0900
Subject: af_netlink: ensure that NLMSG_DONE never fails in dumps
From: "Jason A. Donenfeld" <Jason(a)zx2c4.com>
[ Upstream commit 0642840b8bb008528dbdf929cec9f65ac4231ad0 ]
The way people generally use netlink_dump is that they fill in the skb
as much as possible, breaking when nla_put returns an error. Then, they
get called again and start filling out the next skb, and again, and so
forth. The mechanism at work here is the ability for the iterative
dumping function to detect when the skb is filled up and not fill it
past the brim, waiting for a fresh skb for the rest of the data.
However, if the attributes are small and nicely packed, it is possible
that a dump callback function successfully fills in attributes until the
skb is of size 4080 (libmnl's default page-sized receive buffer size).
The dump function completes, satisfied, and then, if it happens to be
that this is actually the last skb, and no further ones are to be sent,
then netlink_dump will add on the NLMSG_DONE part:
nlh = nlmsg_put_answer(skb, cb, NLMSG_DONE, sizeof(len), NLM_F_MULTI);
It is very important that netlink_dump does this, of course. However, in
this example, that call to nlmsg_put_answer will fail, because the
previous filling by the dump function did not leave it enough room. And
how could it possibly have done so? All of the nla_put variety of
functions simply check to see if the skb has enough tailroom,
independent of the context it is in.
In order to keep the important assumptions of all netlink dump users, it
is therefore important to give them an skb that has this end part of the
tail already reserved, so that the call to nlmsg_put_answer does not
fail. Otherwise, library authors are forced to find some bizarre sized
receive buffer that has a large modulo relative to the common sizes of
messages received, which is ugly and buggy.
This patch thus saves the NLMSG_DONE for an additional message, for the
case that things are dangerously close to the brim. This requires
keeping track of the errno from ->dump() across calls.
Signed-off-by: Jason A. Donenfeld <Jason(a)zx2c4.com>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
net/netlink/af_netlink.c | 17 +++++++++++------
net/netlink/af_netlink.h | 1 +
2 files changed, 12 insertions(+), 6 deletions(-)
--- a/net/netlink/af_netlink.c
+++ b/net/netlink/af_netlink.c
@@ -2077,7 +2077,7 @@ static int netlink_dump(struct sock *sk)
struct sk_buff *skb = NULL;
struct nlmsghdr *nlh;
struct module *module;
- int len, err = -ENOBUFS;
+ int err = -ENOBUFS;
int alloc_min_size;
int alloc_size;
@@ -2124,9 +2124,11 @@ static int netlink_dump(struct sock *sk)
skb_reserve(skb, skb_tailroom(skb) - alloc_size);
netlink_skb_set_owner_r(skb, sk);
- len = cb->dump(skb, cb);
+ if (nlk->dump_done_errno > 0)
+ nlk->dump_done_errno = cb->dump(skb, cb);
- if (len > 0) {
+ if (nlk->dump_done_errno > 0 ||
+ skb_tailroom(skb) < nlmsg_total_size(sizeof(nlk->dump_done_errno))) {
mutex_unlock(nlk->cb_mutex);
if (sk_filter(sk, skb))
@@ -2136,13 +2138,15 @@ static int netlink_dump(struct sock *sk)
return 0;
}
- nlh = nlmsg_put_answer(skb, cb, NLMSG_DONE, sizeof(len), NLM_F_MULTI);
- if (!nlh)
+ nlh = nlmsg_put_answer(skb, cb, NLMSG_DONE,
+ sizeof(nlk->dump_done_errno), NLM_F_MULTI);
+ if (WARN_ON(!nlh))
goto errout_skb;
nl_dump_check_consistent(cb, nlh);
- memcpy(nlmsg_data(nlh), &len, sizeof(len));
+ memcpy(nlmsg_data(nlh), &nlk->dump_done_errno,
+ sizeof(nlk->dump_done_errno));
if (sk_filter(sk, skb))
kfree_skb(skb);
@@ -2214,6 +2218,7 @@ int __netlink_dump_start(struct sock *ss
}
nlk->cb_running = true;
+ nlk->dump_done_errno = INT_MAX;
mutex_unlock(nlk->cb_mutex);
--- a/net/netlink/af_netlink.h
+++ b/net/netlink/af_netlink.h
@@ -24,6 +24,7 @@ struct netlink_sock {
wait_queue_head_t wait;
bool bound;
bool cb_running;
+ int dump_done_errno;
struct netlink_callback cb;
struct mutex *cb_mutex;
struct mutex cb_def_mutex;
Patches currently in stable-queue which might be from Jason(a)zx2c4.com are
queue-4.9/af_netlink-ensure-that-nlmsg_done-never-fails-in-dumps.patch
This is a note to let you know that I've just added the patch titled
vxlan: fix the issue that neigh proxy blocks all icmpv6 packets
to the 4.14-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
vxlan-fix-the-issue-that-neigh-proxy-blocks-all-icmpv6-packets.patch
and it can be found in the queue-4.14 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Tue Nov 21 13:07:02 CET 2017
From: Xin Long <lucien.xin(a)gmail.com>
Date: Sat, 11 Nov 2017 19:58:50 +0800
Subject: vxlan: fix the issue that neigh proxy blocks all icmpv6 packets
From: Xin Long <lucien.xin(a)gmail.com>
[ Upstream commit 8bff3685a4bbf175a96bc6a528f13455d8d38244 ]
Commit f1fb08f6337c ("vxlan: fix ND proxy when skb doesn't have transport
header offset") removed icmp6_code and icmp6_type check before calling
neigh_reduce when doing neigh proxy.
It means all icmpv6 packets would be blocked by this, not only ns packet.
In Jianlin's env, even ping6 couldn't work through it.
This patch is to bring the icmp6_code and icmp6_type check back and also
removed the same check from neigh_reduce().
Fixes: f1fb08f6337c ("vxlan: fix ND proxy when skb doesn't have transport header offset")
Reported-by: Jianlin Shi <jishi(a)redhat.com>
Signed-off-by: Xin Long <lucien.xin(a)gmail.com>
Reviewed-by: Vincent Bernat <vincent(a)bernat.im>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/net/vxlan.c | 31 +++++++++++++------------------
1 file changed, 13 insertions(+), 18 deletions(-)
--- a/drivers/net/vxlan.c
+++ b/drivers/net/vxlan.c
@@ -1623,26 +1623,19 @@ static struct sk_buff *vxlan_na_create(s
static int neigh_reduce(struct net_device *dev, struct sk_buff *skb, __be32 vni)
{
struct vxlan_dev *vxlan = netdev_priv(dev);
- struct nd_msg *msg;
- const struct ipv6hdr *iphdr;
const struct in6_addr *daddr;
- struct neighbour *n;
+ const struct ipv6hdr *iphdr;
struct inet6_dev *in6_dev;
+ struct neighbour *n;
+ struct nd_msg *msg;
in6_dev = __in6_dev_get(dev);
if (!in6_dev)
goto out;
- if (!pskb_may_pull(skb, sizeof(struct ipv6hdr) + sizeof(struct nd_msg)))
- goto out;
-
iphdr = ipv6_hdr(skb);
daddr = &iphdr->daddr;
-
msg = (struct nd_msg *)(iphdr + 1);
- if (msg->icmph.icmp6_code != 0 ||
- msg->icmph.icmp6_type != NDISC_NEIGHBOUR_SOLICITATION)
- goto out;
if (ipv6_addr_loopback(daddr) ||
ipv6_addr_is_multicast(&msg->target))
@@ -2240,11 +2233,11 @@ tx_error:
static netdev_tx_t vxlan_xmit(struct sk_buff *skb, struct net_device *dev)
{
struct vxlan_dev *vxlan = netdev_priv(dev);
+ struct vxlan_rdst *rdst, *fdst = NULL;
const struct ip_tunnel_info *info;
- struct ethhdr *eth;
bool did_rsc = false;
- struct vxlan_rdst *rdst, *fdst = NULL;
struct vxlan_fdb *f;
+ struct ethhdr *eth;
__be32 vni = 0;
info = skb_tunnel_info(skb);
@@ -2269,12 +2262,14 @@ static netdev_tx_t vxlan_xmit(struct sk_
if (ntohs(eth->h_proto) == ETH_P_ARP)
return arp_reduce(dev, skb, vni);
#if IS_ENABLED(CONFIG_IPV6)
- else if (ntohs(eth->h_proto) == ETH_P_IPV6) {
- struct ipv6hdr *hdr, _hdr;
- if ((hdr = skb_header_pointer(skb,
- skb_network_offset(skb),
- sizeof(_hdr), &_hdr)) &&
- hdr->nexthdr == IPPROTO_ICMPV6)
+ else if (ntohs(eth->h_proto) == ETH_P_IPV6 &&
+ pskb_may_pull(skb, sizeof(struct ipv6hdr) +
+ sizeof(struct nd_msg)) &&
+ ipv6_hdr(skb)->nexthdr == IPPROTO_ICMPV6) {
+ struct nd_msg *m = (struct nd_msg *)(ipv6_hdr(skb) + 1);
+
+ if (m->icmph.icmp6_code == 0 &&
+ m->icmph.icmp6_type == NDISC_NEIGHBOUR_SOLICITATION)
return neigh_reduce(dev, skb, vni);
}
#endif
Patches currently in stable-queue which might be from lucien.xin(a)gmail.com are
queue-4.14/vxlan-fix-the-issue-that-neigh-proxy-blocks-all-icmpv6-packets.patch