This is a note to let you know that I've just added the patch titled
netlink: reset extack earlier in netlink_rcv_skb
to the 4.14-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
netlink-reset-extack-earlier-in-netlink_rcv_skb.patch
and it can be found in the queue-4.14 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Sun Jan 28 17:35:08 CET 2018
From: Xin Long <lucien.xin(a)gmail.com>
Date: Thu, 18 Jan 2018 14:48:03 +0800
Subject: netlink: reset extack earlier in netlink_rcv_skb
From: Xin Long <lucien.xin(a)gmail.com>
[ Upstream commit cd443f1e91ca600a092e780e8250cd6a2954b763 ]
Move up the extack reset/initialization in netlink_rcv_skb, so that
those 'goto ack' will not skip it. Otherwise, later on netlink_ack
may use the uninitialized extack and cause kernel crash.
Fixes: cbbdf8433a5f ("netlink: extack needs to be reset each time through loop")
Reported-by: syzbot+03bee3680a37466775e7(a)syzkaller.appspotmail.com
Signed-off-by: Xin Long <lucien.xin(a)gmail.com>
Acked-by: David Ahern <dsahern(a)gmail.com>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
net/netlink/af_netlink.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
--- a/net/netlink/af_netlink.c
+++ b/net/netlink/af_netlink.c
@@ -2400,6 +2400,7 @@ int netlink_rcv_skb(struct sk_buff *skb,
while (skb->len >= nlmsg_total_size(0)) {
int msglen;
+ memset(&extack, 0, sizeof(extack));
nlh = nlmsg_hdr(skb);
err = 0;
@@ -2414,7 +2415,6 @@ int netlink_rcv_skb(struct sk_buff *skb,
if (nlh->nlmsg_type < NLMSG_MIN_TYPE)
goto ack;
- memset(&extack, 0, sizeof(extack));
err = cb(skb, nlh, &extack);
if (err == -EINTR)
goto skip;
Patches currently in stable-queue which might be from lucien.xin(a)gmail.com are
queue-4.14/sctp-do-not-allow-the-v4-socket-to-bind-a-v4mapped-v6-address.patch
queue-4.14/sctp-reinit-stream-if-stream-outcnt-has-been-change-by-sinit-in-sendmsg.patch
queue-4.14/pppoe-take-needed_headroom-of-lower-device-into-account-on-xmit.patch
queue-4.14/sctp-return-error-if-the-asoc-has-been-peeled-off-in-sctp_wait_for_sndbuf.patch
queue-4.14/netlink-reset-extack-earlier-in-netlink_rcv_skb.patch
This is a note to let you know that I've just added the patch titled
net/tls: Only attach to sockets in ESTABLISHED state
to the 4.14-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
net-tls-only-attach-to-sockets-in-established-state.patch
and it can be found in the queue-4.14 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Sun Jan 28 17:35:08 CET 2018
From: Ilya Lesokhin <ilyal(a)mellanox.com>
Date: Tue, 16 Jan 2018 15:31:52 +0200
Subject: net/tls: Only attach to sockets in ESTABLISHED state
From: Ilya Lesokhin <ilyal(a)mellanox.com>
[ Upstream commit d91c3e17f75f218022140dee18cf515292184a8f ]
Calling accept on a TCP socket with a TLS ulp attached results
in two sockets that share the same ulp context.
The ulp context is freed while a socket is destroyed, so
after one of the sockets is released, the second second will
trigger a use after free when it tries to access the ulp context
attached to it.
We restrict the TLS ulp to sockets in ESTABLISHED state
to prevent the scenario above.
Fixes: 3c4d7559159b ("tls: kernel TLS support")
Reported-by: syzbot+904e7cd6c5c741609228(a)syzkaller.appspotmail.com
Signed-off-by: Ilya Lesokhin <ilyal(a)mellanox.com>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
net/tls/tls_main.c | 9 +++++++++
1 file changed, 9 insertions(+)
--- a/net/tls/tls_main.c
+++ b/net/tls/tls_main.c
@@ -444,6 +444,15 @@ static int tls_init(struct sock *sk)
struct tls_context *ctx;
int rc = 0;
+ /* The TLS ulp is currently supported only for TCP sockets
+ * in ESTABLISHED state.
+ * Supporting sockets in LISTEN state will require us
+ * to modify the accept implementation to clone rather then
+ * share the ulp context.
+ */
+ if (sk->sk_state != TCP_ESTABLISHED)
+ return -ENOTSUPP;
+
/* allocate tls context */
ctx = kzalloc(sizeof(*ctx), GFP_KERNEL);
if (!ctx) {
Patches currently in stable-queue which might be from ilyal(a)mellanox.com are
queue-4.14/net-tls-only-attach-to-sockets-in-established-state.patch
This is a note to let you know that I've just added the patch titled
net: vrf: Add support for sends to local broadcast address
to the 4.14-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
net-vrf-add-support-for-sends-to-local-broadcast-address.patch
and it can be found in the queue-4.14 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Sun Jan 28 17:35:08 CET 2018
From: David Ahern <dsahern(a)gmail.com>
Date: Wed, 24 Jan 2018 19:37:37 -0800
Subject: net: vrf: Add support for sends to local broadcast address
From: David Ahern <dsahern(a)gmail.com>
[ Upstream commit 1e19c4d689dc1e95bafd23ef68fbc0c6b9e05180 ]
Sukumar reported that sends to the local broadcast address
(255.255.255.255) are broken. Check for the address in vrf driver
and do not redirect to the VRF device - similar to multicast
packets.
With this change sockets can use SO_BINDTODEVICE to specify an
egress interface and receive responses. Note: the egress interface
can not be a VRF device but needs to be the enslaved device.
https://bugzilla.kernel.org/show_bug.cgi?id=198521
Reported-by: Sukumar Gopalakrishnan <sukumarg1973(a)gmail.com>
Signed-off-by: David Ahern <dsahern(a)gmail.com>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/net/vrf.c | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)
--- a/drivers/net/vrf.c
+++ b/drivers/net/vrf.c
@@ -674,8 +674,9 @@ static struct sk_buff *vrf_ip_out(struct
struct sock *sk,
struct sk_buff *skb)
{
- /* don't divert multicast */
- if (ipv4_is_multicast(ip_hdr(skb)->daddr))
+ /* don't divert multicast or local broadcast */
+ if (ipv4_is_multicast(ip_hdr(skb)->daddr) ||
+ ipv4_is_lbcast(ip_hdr(skb)->daddr))
return skb;
if (qdisc_tx_is_default(vrf_dev))
Patches currently in stable-queue which might be from dsahern(a)gmail.com are
queue-4.14/net-ipv4-make-ip-route-get-match-iif-lo-rules-again.patch
queue-4.14/net-vrf-add-support-for-sends-to-local-broadcast-address.patch
queue-4.14/netlink-extack-needs-to-be-reset-each-time-through-loop.patch
queue-4.14/netlink-reset-extack-earlier-in-netlink_rcv_skb.patch
This is a note to let you know that I've just added the patch titled
net: tcp: close sock if net namespace is exiting
to the 4.14-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
net-tcp-close-sock-if-net-namespace-is-exiting.patch
and it can be found in the queue-4.14 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Sun Jan 28 17:35:08 CET 2018
From: Dan Streetman <ddstreet(a)ieee.org>
Date: Thu, 18 Jan 2018 16:14:26 -0500
Subject: net: tcp: close sock if net namespace is exiting
From: Dan Streetman <ddstreet(a)ieee.org>
[ Upstream commit 4ee806d51176ba7b8ff1efd81f271d7252e03a1d ]
When a tcp socket is closed, if it detects that its net namespace is
exiting, close immediately and do not wait for FIN sequence.
For normal sockets, a reference is taken to their net namespace, so it will
never exit while the socket is open. However, kernel sockets do not take a
reference to their net namespace, so it may begin exiting while the kernel
socket is still open. In this case if the kernel socket is a tcp socket,
it will stay open trying to complete its close sequence. The sock's dst(s)
hold a reference to their interface, which are all transferred to the
namespace's loopback interface when the real interfaces are taken down.
When the namespace tries to take down its loopback interface, it hangs
waiting for all references to the loopback interface to release, which
results in messages like:
unregister_netdevice: waiting for lo to become free. Usage count = 1
These messages continue until the socket finally times out and closes.
Since the net namespace cleanup holds the net_mutex while calling its
registered pernet callbacks, any new net namespace initialization is
blocked until the current net namespace finishes exiting.
After this change, the tcp socket notices the exiting net namespace, and
closes immediately, releasing its dst(s) and their reference to the
loopback interface, which lets the net namespace continue exiting.
Link: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1711407
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=97811
Signed-off-by: Dan Streetman <ddstreet(a)canonical.com>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
include/net/net_namespace.h | 10 ++++++++++
net/ipv4/tcp.c | 3 +++
net/ipv4/tcp_timer.c | 15 +++++++++++++++
3 files changed, 28 insertions(+)
--- a/include/net/net_namespace.h
+++ b/include/net/net_namespace.h
@@ -223,6 +223,11 @@ int net_eq(const struct net *net1, const
return net1 == net2;
}
+static inline int check_net(const struct net *net)
+{
+ return atomic_read(&net->count) != 0;
+}
+
void net_drop_ns(void *);
#else
@@ -246,6 +251,11 @@ int net_eq(const struct net *net1, const
{
return 1;
}
+
+static inline int check_net(const struct net *net)
+{
+ return 1;
+}
#define net_drop_ns NULL
#endif
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -2273,6 +2273,9 @@ adjudge_to_death:
tcp_send_active_reset(sk, GFP_ATOMIC);
__NET_INC_STATS(sock_net(sk),
LINUX_MIB_TCPABORTONMEMORY);
+ } else if (!check_net(sock_net(sk))) {
+ /* Not possible to send reset; just close */
+ tcp_set_state(sk, TCP_CLOSE);
}
}
--- a/net/ipv4/tcp_timer.c
+++ b/net/ipv4/tcp_timer.c
@@ -50,11 +50,19 @@ static void tcp_write_err(struct sock *s
* to prevent DoS attacks. It is called when a retransmission timeout
* or zero probe timeout occurs on orphaned socket.
*
+ * Also close if our net namespace is exiting; in that case there is no
+ * hope of ever communicating again since all netns interfaces are already
+ * down (or about to be down), and we need to release our dst references,
+ * which have been moved to the netns loopback interface, so the namespace
+ * can finish exiting. This condition is only possible if we are a kernel
+ * socket, as those do not hold references to the namespace.
+ *
* Criteria is still not confirmed experimentally and may change.
* We kill the socket, if:
* 1. If number of orphaned sockets exceeds an administratively configured
* limit.
* 2. If we have strong memory pressure.
+ * 3. If our net namespace is exiting.
*/
static int tcp_out_of_resources(struct sock *sk, bool do_reset)
{
@@ -83,6 +91,13 @@ static int tcp_out_of_resources(struct s
__NET_INC_STATS(sock_net(sk), LINUX_MIB_TCPABORTONMEMORY);
return 1;
}
+
+ if (!check_net(sock_net(sk))) {
+ /* Not possible to send reset; just close */
+ tcp_done(sk);
+ return 1;
+ }
+
return 0;
}
Patches currently in stable-queue which might be from ddstreet(a)ieee.org are
queue-4.14/net-tcp-close-sock-if-net-namespace-is-exiting.patch
This is a note to let you know that I've just added the patch titled
net/tls: Fix inverted error codes to avoid endless loop
to the 4.14-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
net-tls-fix-inverted-error-codes-to-avoid-endless-loop.patch
and it can be found in the queue-4.14 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Sun Jan 28 17:35:08 CET 2018
From: "r.hering(a)avm.de" <r.hering(a)avm.de>
Date: Fri, 12 Jan 2018 15:42:06 +0100
Subject: net/tls: Fix inverted error codes to avoid endless loop
From: "r.hering(a)avm.de" <r.hering(a)avm.de>
[ Upstream commit 30be8f8dba1bd2aff73e8447d59228471233a3d4 ]
sendfile() calls can hang endless with using Kernel TLS if a socket error occurs.
Socket error codes must be inverted by Kernel TLS before returning because
they are stored with positive sign. If returned non-inverted they are
interpreted as number of bytes sent, causing endless looping of the
splice mechanic behind sendfile().
Signed-off-by: Robert Hering <r.hering(a)avm.de>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
include/net/tls.h | 2 +-
net/tls/tls_sw.c | 4 ++--
2 files changed, 3 insertions(+), 3 deletions(-)
--- a/include/net/tls.h
+++ b/include/net/tls.h
@@ -168,7 +168,7 @@ static inline bool tls_is_pending_open_r
static inline void tls_err_abort(struct sock *sk)
{
- sk->sk_err = -EBADMSG;
+ sk->sk_err = EBADMSG;
sk->sk_error_report(sk);
}
--- a/net/tls/tls_sw.c
+++ b/net/tls/tls_sw.c
@@ -407,7 +407,7 @@ int tls_sw_sendmsg(struct sock *sk, stru
while (msg_data_left(msg)) {
if (sk->sk_err) {
- ret = sk->sk_err;
+ ret = -sk->sk_err;
goto send_end;
}
@@ -560,7 +560,7 @@ int tls_sw_sendpage(struct sock *sk, str
size_t copy, required_size;
if (sk->sk_err) {
- ret = sk->sk_err;
+ ret = -sk->sk_err;
goto sendpage_end;
}
Patches currently in stable-queue which might be from r.hering(a)avm.de are
queue-4.14/net-tls-fix-inverted-error-codes-to-avoid-endless-loop.patch
This is a note to let you know that I've just added the patch titled
net: qdisc_pkt_len_init() should be more robust
to the 4.14-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
net-qdisc_pkt_len_init-should-be-more-robust.patch
and it can be found in the queue-4.14 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Sun Jan 28 17:35:08 CET 2018
From: Eric Dumazet <edumazet(a)google.com>
Date: Thu, 18 Jan 2018 19:59:19 -0800
Subject: net: qdisc_pkt_len_init() should be more robust
From: Eric Dumazet <edumazet(a)google.com>
[ Upstream commit 7c68d1a6b4db9012790af7ac0f0fdc0d2083422a ]
Without proper validation of DODGY packets, we might very well
feed qdisc_pkt_len_init() with invalid GSO packets.
tcp_hdrlen() might access out-of-bound data, so let's use
skb_header_pointer() and proper checks.
Whole story is described in commit d0c081b49137 ("flow_dissector:
properly cap thoff field")
We have the goal of validating DODGY packets earlier in the stack,
so we might very well revert this fix in the future.
Signed-off-by: Eric Dumazet <edumazet(a)google.com>
Cc: Willem de Bruijn <willemb(a)google.com>
Cc: Jason Wang <jasowang(a)redhat.com>
Reported-by: syzbot+9da69ebac7dddd804552(a)syzkaller.appspotmail.com
Acked-by: Jason Wang <jasowang(a)redhat.com>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
net/core/dev.c | 19 +++++++++++++++----
1 file changed, 15 insertions(+), 4 deletions(-)
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -3128,10 +3128,21 @@ static void qdisc_pkt_len_init(struct sk
hdr_len = skb_transport_header(skb) - skb_mac_header(skb);
/* + transport layer */
- if (likely(shinfo->gso_type & (SKB_GSO_TCPV4 | SKB_GSO_TCPV6)))
- hdr_len += tcp_hdrlen(skb);
- else
- hdr_len += sizeof(struct udphdr);
+ if (likely(shinfo->gso_type & (SKB_GSO_TCPV4 | SKB_GSO_TCPV6))) {
+ const struct tcphdr *th;
+ struct tcphdr _tcphdr;
+
+ th = skb_header_pointer(skb, skb_transport_offset(skb),
+ sizeof(_tcphdr), &_tcphdr);
+ if (likely(th))
+ hdr_len += __tcp_hdrlen(th);
+ } else {
+ struct udphdr _udphdr;
+
+ if (skb_header_pointer(skb, skb_transport_offset(skb),
+ sizeof(_udphdr), &_udphdr))
+ hdr_len += sizeof(struct udphdr);
+ }
if (shinfo->gso_type & SKB_GSO_DODGY)
gso_segs = DIV_ROUND_UP(skb->len - hdr_len,
Patches currently in stable-queue which might be from edumazet(a)google.com are
queue-4.14/ipv6-fix-udpv6-sendmsg-crash-caused-by-too-small-mtu.patch
queue-4.14/flow_dissector-properly-cap-thoff-field.patch
queue-4.14/ipv6-ip6_make_skb-needs-to-clear-cork.base.dst.patch
queue-4.14/dccp-don-t-restart-ccid2_hc_tx_rto_expire-if-sk-in-closed-state.patch
queue-4.14/net-qdisc_pkt_len_init-should-be-more-robust.patch
This is a note to let you know that I've just added the patch titled
net/mlx5: Fix get vector affinity helper function
to the 4.14-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
net-mlx5-fix-get-vector-affinity-helper-function.patch
and it can be found in the queue-4.14 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Sun Jan 28 17:35:08 CET 2018
From: Saeed Mahameed <saeedm(a)mellanox.com>
Date: Thu, 4 Jan 2018 04:35:51 +0200
Subject: net/mlx5: Fix get vector affinity helper function
From: Saeed Mahameed <saeedm(a)mellanox.com>
[ Upstream commit 05e0cc84e00c54fb152d1f4b86bc211823a83d0c ]
mlx5_get_vector_affinity used to call pci_irq_get_affinity and after
reverting the patch that sets the device affinity via PCI_IRQ_AFFINITY
API, calling pci_irq_get_affinity becomes useless and it breaks RDMA
mlx5 users. To fix this, this patch provides an alternative way to
retrieve IRQ vector affinity using legacy IRQ API, following
smp_affinity read procfs implementation.
Fixes: 231243c82793 ("Revert mlx5: move affinity hints assignments to generic code")
Fixes: a435393acafb ("mlx5: move affinity hints assignments to generic code")
Cc: Sagi Grimberg <sagi(a)grimberg.me>
Signed-off-by: Saeed Mahameed <saeedm(a)mellanox.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
include/linux/mlx5/driver.h | 19 ++++++++++++++++++-
1 file changed, 18 insertions(+), 1 deletion(-)
--- a/include/linux/mlx5/driver.h
+++ b/include/linux/mlx5/driver.h
@@ -36,6 +36,7 @@
#include <linux/kernel.h>
#include <linux/completion.h>
#include <linux/pci.h>
+#include <linux/irq.h>
#include <linux/spinlock_types.h>
#include <linux/semaphore.h>
#include <linux/slab.h>
@@ -1194,7 +1195,23 @@ enum {
static inline const struct cpumask *
mlx5_get_vector_affinity(struct mlx5_core_dev *dev, int vector)
{
- return pci_irq_get_affinity(dev->pdev, MLX5_EQ_VEC_COMP_BASE + vector);
+ const struct cpumask *mask;
+ struct irq_desc *desc;
+ unsigned int irq;
+ int eqn;
+ int err;
+
+ err = mlx5_vector2eqn(dev, vector, &eqn, &irq);
+ if (err)
+ return NULL;
+
+ desc = irq_to_desc(irq);
+#ifdef CONFIG_GENERIC_IRQ_EFFECTIVE_AFF_MASK
+ mask = irq_data_get_effective_affinity_mask(&desc->irq_data);
+#else
+ mask = desc->irq_common_data.affinity;
+#endif
+ return mask;
}
#endif /* MLX5_DRIVER_H */
Patches currently in stable-queue which might be from saeedm(a)mellanox.com are
queue-4.14/net-mlx5-fix-get-vector-affinity-helper-function.patch
queue-4.14/net-ib-mlx5-don-t-disable-local-loopback-multicast-traffic-when-needed.patch
queue-4.14/net-mlx5e-fix-fixpoint-divide-exception-in-mlx5e_am_stats_compare.patch
This is a note to let you know that I've just added the patch titled
net/mlx5e: Fix fixpoint divide exception in mlx5e_am_stats_compare
to the 4.14-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
net-mlx5e-fix-fixpoint-divide-exception-in-mlx5e_am_stats_compare.patch
and it can be found in the queue-4.14 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Sun Jan 28 17:35:08 CET 2018
From: Talat Batheesh <talatb(a)mellanox.com>
Date: Sun, 21 Jan 2018 05:30:42 +0200
Subject: net/mlx5e: Fix fixpoint divide exception in mlx5e_am_stats_compare
From: Talat Batheesh <talatb(a)mellanox.com>
[ Upstream commit e58edaa4863583b54409444f11b4f80dff0af1cd ]
Helmut reported a bug about division by zero while
running traffic and doing physical cable pull test.
When the cable unplugged the ppms become zero, so when
dividing the current ppms by the previous ppms in the
next dim iteration there is division by zero.
This patch prevent this division for both ppms and epms.
Fixes: c3164d2fc48f ("net/mlx5e: Added BW check for DIM decision mechanism")
Reported-by: Helmut Grauer <helmut.grauer(a)de.ibm.com>
Signed-off-by: Talat Batheesh <talatb(a)mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm(a)mellanox.com>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/net/ethernet/mellanox/mlx5/core/en_rx_am.c | 6 ++++++
1 file changed, 6 insertions(+)
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_rx_am.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_rx_am.c
@@ -197,9 +197,15 @@ static int mlx5e_am_stats_compare(struct
return (curr->bpms > prev->bpms) ? MLX5E_AM_STATS_BETTER :
MLX5E_AM_STATS_WORSE;
+ if (!prev->ppms)
+ return curr->ppms ? MLX5E_AM_STATS_BETTER :
+ MLX5E_AM_STATS_SAME;
+
if (IS_SIGNIFICANT_DIFF(curr->ppms, prev->ppms))
return (curr->ppms > prev->ppms) ? MLX5E_AM_STATS_BETTER :
MLX5E_AM_STATS_WORSE;
+ if (!prev->epms)
+ return MLX5E_AM_STATS_SAME;
if (IS_SIGNIFICANT_DIFF(curr->epms, prev->epms))
return (curr->epms < prev->epms) ? MLX5E_AM_STATS_BETTER :
Patches currently in stable-queue which might be from talatb(a)mellanox.com are
queue-4.14/net-mlx5e-fix-fixpoint-divide-exception-in-mlx5e_am_stats_compare.patch
This is a note to let you know that I've just added the patch titled
net: ipv4: Make "ip route get" match iif lo rules again.
to the 4.14-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
net-ipv4-make-ip-route-get-match-iif-lo-rules-again.patch
and it can be found in the queue-4.14 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Sun Jan 28 17:35:08 CET 2018
From: Lorenzo Colitti <lorenzo(a)google.com>
Date: Thu, 11 Jan 2018 18:36:26 +0900
Subject: net: ipv4: Make "ip route get" match iif lo rules again.
From: Lorenzo Colitti <lorenzo(a)google.com>
[ Upstream commit 6503a30440962f1e1ccb8868816b4e18201218d4 ]
Commit 3765d35ed8b9 ("net: ipv4: Convert inet_rtm_getroute to rcu
versions of route lookup") broke "ip route get" in the presence
of rules that specify iif lo.
Host-originated traffic always has iif lo, because
ip_route_output_key_hash and ip6_route_output_flags set the flow
iif to LOOPBACK_IFINDEX. Thus, putting "iif lo" in an ip rule is a
convenient way to select only originated traffic and not forwarded
traffic.
inet_rtm_getroute used to match these rules correctly because
even though it sets the flow iif to 0, it called
ip_route_output_key which overwrites iif with LOOPBACK_IFINDEX.
But now that it calls ip_route_output_key_hash_rcu, the ifindex
will remain 0 and not match the iif lo in the rule. As a result,
"ip route get" will return ENETUNREACH.
Fixes: 3765d35ed8b9 ("net: ipv4: Convert inet_rtm_getroute to rcu versions of route lookup")
Tested: https://android.googlesource.com/kernel/tests/+/master/net/test/multinetwor… passes again
Signed-off-by: Lorenzo Colitti <lorenzo(a)google.com>
Acked-by: David Ahern <dsahern(a)gmail.com>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
net/ipv4/route.c | 1 +
1 file changed, 1 insertion(+)
--- a/net/ipv4/route.c
+++ b/net/ipv4/route.c
@@ -2762,6 +2762,7 @@ static int inet_rtm_getroute(struct sk_b
if (err == 0 && rt->dst.error)
err = -rt->dst.error;
} else {
+ fl4.flowi4_iif = LOOPBACK_IFINDEX;
rt = ip_route_output_key_hash_rcu(net, &fl4, &res, skb);
err = 0;
if (IS_ERR(rt))
Patches currently in stable-queue which might be from lorenzo(a)google.com are
queue-4.14/net-ipv4-make-ip-route-get-match-iif-lo-rules-again.patch
This is a note to let you know that I've just added the patch titled
{net,ib}/mlx5: Don't disable local loopback multicast traffic when needed
to the 4.14-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
net-ib-mlx5-don-t-disable-local-loopback-multicast-traffic-when-needed.patch
and it can be found in the queue-4.14 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Sun Jan 28 17:35:08 CET 2018
From: Eran Ben Elisha <eranbe(a)mellanox.com>
Date: Tue, 9 Jan 2018 11:41:10 +0200
Subject: {net,ib}/mlx5: Don't disable local loopback multicast traffic when needed
From: Eran Ben Elisha <eranbe(a)mellanox.com>
[ Upstream commit 8978cc921fc7fad3f4d6f91f1da01352aeeeff25 ]
There are systems platform information management interfaces (such as
HOST2BMC) for which we cannot disable local loopback multicast traffic.
Separate disable_local_lb_mc and disable_local_lb_uc capability bits so
driver will not disable multicast loopback traffic if not supported.
(It is expected that Firmware will not set disable_local_lb_mc if
HOST2BMC is running for example.)
Function mlx5_nic_vport_update_local_lb will do best effort to
disable/enable UC/MC loopback traffic and return success only in case it
succeeded to changed all allowed by Firmware.
Adapt mlx5_ib and mlx5e to support the new cap bits.
Fixes: 2c43c5a036be ("net/mlx5e: Enable local loopback in loopback selftest")
Fixes: c85023e153e3 ("IB/mlx5: Add raw ethernet local loopback support")
Fixes: bded747bb432 ("net/mlx5: Add raw ethernet local loopback firmware command")
Signed-off-by: Eran Ben Elisha <eranbe(a)mellanox.com>
Cc: kernel-team(a)fb.com
Signed-off-by: Saeed Mahameed <saeedm(a)mellanox.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/infiniband/hw/mlx5/main.c | 9 ++++--
drivers/net/ethernet/mellanox/mlx5/core/en_selftest.c | 27 ++++++++++++------
drivers/net/ethernet/mellanox/mlx5/core/main.c | 3 --
drivers/net/ethernet/mellanox/mlx5/core/vport.c | 22 ++++++++++----
include/linux/mlx5/mlx5_ifc.h | 5 ++-
5 files changed, 44 insertions(+), 22 deletions(-)
--- a/drivers/infiniband/hw/mlx5/main.c
+++ b/drivers/infiniband/hw/mlx5/main.c
@@ -1276,7 +1276,8 @@ static int mlx5_ib_alloc_transport_domai
return err;
if ((MLX5_CAP_GEN(dev->mdev, port_type) != MLX5_CAP_PORT_TYPE_ETH) ||
- !MLX5_CAP_GEN(dev->mdev, disable_local_lb))
+ (!MLX5_CAP_GEN(dev->mdev, disable_local_lb_uc) &&
+ !MLX5_CAP_GEN(dev->mdev, disable_local_lb_mc)))
return err;
mutex_lock(&dev->lb_mutex);
@@ -1294,7 +1295,8 @@ static void mlx5_ib_dealloc_transport_do
mlx5_core_dealloc_transport_domain(dev->mdev, tdn);
if ((MLX5_CAP_GEN(dev->mdev, port_type) != MLX5_CAP_PORT_TYPE_ETH) ||
- !MLX5_CAP_GEN(dev->mdev, disable_local_lb))
+ (!MLX5_CAP_GEN(dev->mdev, disable_local_lb_uc) &&
+ !MLX5_CAP_GEN(dev->mdev, disable_local_lb_mc)))
return;
mutex_lock(&dev->lb_mutex);
@@ -4161,7 +4163,8 @@ static void *mlx5_ib_add(struct mlx5_cor
}
if ((MLX5_CAP_GEN(mdev, port_type) == MLX5_CAP_PORT_TYPE_ETH) &&
- MLX5_CAP_GEN(mdev, disable_local_lb))
+ (MLX5_CAP_GEN(mdev, disable_local_lb_uc) ||
+ MLX5_CAP_GEN(mdev, disable_local_lb_mc)))
mutex_init(&dev->lb_mutex);
dev->ib_active = true;
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_selftest.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_selftest.c
@@ -238,15 +238,19 @@ static int mlx5e_test_loopback_setup(str
int err = 0;
/* Temporarily enable local_lb */
- if (MLX5_CAP_GEN(priv->mdev, disable_local_lb)) {
- mlx5_nic_vport_query_local_lb(priv->mdev, &lbtp->local_lb);
- if (!lbtp->local_lb)
- mlx5_nic_vport_update_local_lb(priv->mdev, true);
+ err = mlx5_nic_vport_query_local_lb(priv->mdev, &lbtp->local_lb);
+ if (err)
+ return err;
+
+ if (!lbtp->local_lb) {
+ err = mlx5_nic_vport_update_local_lb(priv->mdev, true);
+ if (err)
+ return err;
}
err = mlx5e_refresh_tirs(priv, true);
if (err)
- return err;
+ goto out;
lbtp->loopback_ok = false;
init_completion(&lbtp->comp);
@@ -256,16 +260,21 @@ static int mlx5e_test_loopback_setup(str
lbtp->pt.dev = priv->netdev;
lbtp->pt.af_packet_priv = lbtp;
dev_add_pack(&lbtp->pt);
+
+ return 0;
+
+out:
+ if (!lbtp->local_lb)
+ mlx5_nic_vport_update_local_lb(priv->mdev, false);
+
return err;
}
static void mlx5e_test_loopback_cleanup(struct mlx5e_priv *priv,
struct mlx5e_lbt_priv *lbtp)
{
- if (MLX5_CAP_GEN(priv->mdev, disable_local_lb)) {
- if (!lbtp->local_lb)
- mlx5_nic_vport_update_local_lb(priv->mdev, false);
- }
+ if (!lbtp->local_lb)
+ mlx5_nic_vport_update_local_lb(priv->mdev, false);
dev_remove_pack(&lbtp->pt);
mlx5e_refresh_tirs(priv, false);
--- a/drivers/net/ethernet/mellanox/mlx5/core/main.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/main.c
@@ -577,8 +577,7 @@ static int mlx5_core_set_hca_defaults(st
int ret = 0;
/* Disable local_lb by default */
- if ((MLX5_CAP_GEN(dev, port_type) == MLX5_CAP_PORT_TYPE_ETH) &&
- MLX5_CAP_GEN(dev, disable_local_lb))
+ if (MLX5_CAP_GEN(dev, port_type) == MLX5_CAP_PORT_TYPE_ETH)
ret = mlx5_nic_vport_update_local_lb(dev, false);
return ret;
--- a/drivers/net/ethernet/mellanox/mlx5/core/vport.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/vport.c
@@ -908,23 +908,33 @@ int mlx5_nic_vport_update_local_lb(struc
void *in;
int err;
- mlx5_core_dbg(mdev, "%s local_lb\n", enable ? "enable" : "disable");
+ if (!MLX5_CAP_GEN(mdev, disable_local_lb_mc) &&
+ !MLX5_CAP_GEN(mdev, disable_local_lb_uc))
+ return 0;
+
in = kvzalloc(inlen, GFP_KERNEL);
if (!in)
return -ENOMEM;
MLX5_SET(modify_nic_vport_context_in, in,
- field_select.disable_mc_local_lb, 1);
- MLX5_SET(modify_nic_vport_context_in, in,
nic_vport_context.disable_mc_local_lb, !enable);
-
- MLX5_SET(modify_nic_vport_context_in, in,
- field_select.disable_uc_local_lb, 1);
MLX5_SET(modify_nic_vport_context_in, in,
nic_vport_context.disable_uc_local_lb, !enable);
+ if (MLX5_CAP_GEN(mdev, disable_local_lb_mc))
+ MLX5_SET(modify_nic_vport_context_in, in,
+ field_select.disable_mc_local_lb, 1);
+
+ if (MLX5_CAP_GEN(mdev, disable_local_lb_uc))
+ MLX5_SET(modify_nic_vport_context_in, in,
+ field_select.disable_uc_local_lb, 1);
+
err = mlx5_modify_nic_vport_context(mdev, in, inlen);
+ if (!err)
+ mlx5_core_dbg(mdev, "%s local_lb\n",
+ enable ? "enable" : "disable");
+
kvfree(in);
return err;
}
--- a/include/linux/mlx5/mlx5_ifc.h
+++ b/include/linux/mlx5/mlx5_ifc.h
@@ -1023,8 +1023,9 @@ struct mlx5_ifc_cmd_hca_cap_bits {
u8 log_max_wq_sz[0x5];
u8 nic_vport_change_event[0x1];
- u8 disable_local_lb[0x1];
- u8 reserved_at_3e2[0x9];
+ u8 disable_local_lb_uc[0x1];
+ u8 disable_local_lb_mc[0x1];
+ u8 reserved_at_3e3[0x8];
u8 log_max_vlan_list[0x5];
u8 reserved_at_3f0[0x3];
u8 log_max_current_mc_list[0x5];
Patches currently in stable-queue which might be from eranbe(a)mellanox.com are
queue-4.14/net-ib-mlx5-don-t-disable-local-loopback-multicast-traffic-when-needed.patch