This is a note to let you know that I've just added the patch titled
netlink: ensure to loop over all netns in genlmsg_multicast_allns()
to the 3.18-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
netlink-ensure-to-loop-over-all-netns-in-genlmsg_multicast_allns.patch
and it can be found in the queue-3.18 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Thu Mar 8 10:42:32 PST 2018
From: Nicolas Dichtel <nicolas.dichtel(a)6wind.com>
Date: Tue, 6 Feb 2018 14:48:32 +0100
Subject: netlink: ensure to loop over all netns in genlmsg_multicast_allns()
From: Nicolas Dichtel <nicolas.dichtel(a)6wind.com>
[ Upstream commit cb9f7a9a5c96a773bbc9c70660dc600cfff82f82 ]
Nowadays, nlmsg_multicast() returns only 0 or -ESRCH but this was not the
case when commit 134e63756d5f was pushed.
However, there was no reason to stop the loop if a netns does not have
listeners.
Returns -ESRCH only if there was no listeners in all netns.
To avoid having the same problem in the future, I didn't take the
assumption that nlmsg_multicast() returns only 0 or -ESRCH.
Fixes: 134e63756d5f ("genetlink: make netns aware")
CC: Johannes Berg <johannes.berg(a)intel.com>
Signed-off-by: Nicolas Dichtel <nicolas.dichtel(a)6wind.com>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
net/netlink/genetlink.c | 12 ++++++++++--
1 file changed, 10 insertions(+), 2 deletions(-)
--- a/net/netlink/genetlink.c
+++ b/net/netlink/genetlink.c
@@ -1058,6 +1058,7 @@ static int genlmsg_mcast(struct sk_buff
{
struct sk_buff *tmp;
struct net *net, *prev = NULL;
+ bool delivered = false;
int err;
for_each_net_rcu(net) {
@@ -1069,14 +1070,21 @@ static int genlmsg_mcast(struct sk_buff
}
err = nlmsg_multicast(prev->genl_sock, tmp,
portid, group, flags);
- if (err)
+ if (!err)
+ delivered = true;
+ else if (err != -ESRCH)
goto error;
}
prev = net;
}
- return nlmsg_multicast(prev->genl_sock, skb, portid, group, flags);
+ err = nlmsg_multicast(prev->genl_sock, skb, portid, group, flags);
+ if (!err)
+ delivered = true;
+ else if (err != -ESRCH)
+ goto error;
+ return delivered ? 0 : -ESRCH;
error:
kfree_skb(skb);
return err;
Patches currently in stable-queue which might be from nicolas.dichtel(a)6wind.com are
queue-3.18/netlink-ensure-to-loop-over-all-netns-in-genlmsg_multicast_allns.patch
This is a note to let you know that I've just added the patch titled
net: ipv4: don't allow setting net.ipv4.route.min_pmtu below 68
to the 3.18-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
net-ipv4-don-t-allow-setting-net.ipv4.route.min_pmtu-below-68.patch
and it can be found in the queue-3.18 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Thu Mar 8 10:42:32 PST 2018
From: Sabrina Dubroca <sd(a)queasysnail.net>
Date: Mon, 26 Feb 2018 16:13:43 +0100
Subject: net: ipv4: don't allow setting net.ipv4.route.min_pmtu below 68
From: Sabrina Dubroca <sd(a)queasysnail.net>
[ Upstream commit c7272c2f1229125f74f22dcdd59de9bbd804f1c8 ]
According to RFC 1191 sections 3 and 4, ICMP frag-needed messages
indicating an MTU below 68 should be rejected:
A host MUST never reduce its estimate of the Path MTU below 68
octets.
and (talking about ICMP frag-needed's Next-Hop MTU field):
This field will never contain a value less than 68, since every
router "must be able to forward a datagram of 68 octets without
fragmentation".
Furthermore, by letting net.ipv4.route.min_pmtu be set to negative
values, we can end up with a very large PMTU when (-1) is cast into u32.
Let's also make ip_rt_min_pmtu a u32, since it's only ever compared to
unsigned ints.
Reported-by: Jianlin Shi <jishi(a)redhat.com>
Signed-off-by: Sabrina Dubroca <sd(a)queasysnail.net>
Reviewed-by: Stefano Brivio <sbrivio(a)redhat.com>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
net/ipv4/route.c | 7 +++++--
1 file changed, 5 insertions(+), 2 deletions(-)
--- a/net/ipv4/route.c
+++ b/net/ipv4/route.c
@@ -122,9 +122,11 @@ static int ip_rt_redirect_silence __read
static int ip_rt_error_cost __read_mostly = HZ;
static int ip_rt_error_burst __read_mostly = 5 * HZ;
static int ip_rt_mtu_expires __read_mostly = 10 * 60 * HZ;
-static int ip_rt_min_pmtu __read_mostly = 512 + 20 + 20;
+static u32 ip_rt_min_pmtu __read_mostly = 512 + 20 + 20;
static int ip_rt_min_advmss __read_mostly = 256;
+static int ip_min_valid_pmtu __read_mostly = IPV4_MIN_MTU;
+
/*
* Interface to generic destination cache.
*/
@@ -2629,7 +2631,8 @@ static struct ctl_table ipv4_route_table
.data = &ip_rt_min_pmtu,
.maxlen = sizeof(int),
.mode = 0644,
- .proc_handler = proc_dointvec,
+ .proc_handler = proc_dointvec_minmax,
+ .extra1 = &ip_min_valid_pmtu,
},
{
.procname = "min_adv_mss",
Patches currently in stable-queue which might be from sd(a)queasysnail.net are
queue-3.18/net-ipv4-don-t-allow-setting-net.ipv4.route.min_pmtu-below-68.patch
This is a note to let you know that I've just added the patch titled
net: fix race on decreasing number of TX queues
to the 3.18-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
net-fix-race-on-decreasing-number-of-tx-queues.patch
and it can be found in the queue-3.18 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Thu Mar 8 10:42:32 PST 2018
From: Jakub Kicinski <jakub.kicinski(a)netronome.com>
Date: Mon, 12 Feb 2018 21:35:31 -0800
Subject: net: fix race on decreasing number of TX queues
From: Jakub Kicinski <jakub.kicinski(a)netronome.com>
[ Upstream commit ac5b70198adc25c73fba28de4f78adcee8f6be0b ]
netif_set_real_num_tx_queues() can be called when netdev is up.
That usually happens when user requests change of number of
channels/rings with ethtool -L. The procedure for changing
the number of queues involves resetting the qdiscs and setting
dev->num_tx_queues to the new value. When the new value is
lower than the old one, extra care has to be taken to ensure
ordering of accesses to the number of queues vs qdisc reset.
Currently the queues are reset before new dev->num_tx_queues
is assigned, leaving a window of time where packets can be
enqueued onto the queues going down, leading to a likely
crash in the drivers, since most drivers don't check if TX
skbs are assigned to an active queue.
Fixes: e6484930d7c7 ("net: allocate tx queues in register_netdevice")
Signed-off-by: Jakub Kicinski <jakub.kicinski(a)netronome.com>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
net/core/dev.c | 11 +++++++++--
1 file changed, 9 insertions(+), 2 deletions(-)
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -2070,8 +2070,11 @@ EXPORT_SYMBOL(netif_set_xps_queue);
*/
int netif_set_real_num_tx_queues(struct net_device *dev, unsigned int txq)
{
+ bool disabling;
int rc;
+ disabling = txq < dev->real_num_tx_queues;
+
if (txq < 1 || txq > dev->num_tx_queues)
return -EINVAL;
@@ -2087,15 +2090,19 @@ int netif_set_real_num_tx_queues(struct
if (dev->num_tc)
netif_setup_tc(dev, txq);
- if (txq < dev->real_num_tx_queues) {
+ dev->real_num_tx_queues = txq;
+
+ if (disabling) {
+ synchronize_net();
qdisc_reset_all_tx_gt(dev, txq);
#ifdef CONFIG_XPS
netif_reset_xps_queues_gt(dev, txq);
#endif
}
+ } else {
+ dev->real_num_tx_queues = txq;
}
- dev->real_num_tx_queues = txq;
return 0;
}
EXPORT_SYMBOL(netif_set_real_num_tx_queues);
Patches currently in stable-queue which might be from jakub.kicinski(a)netronome.com are
queue-3.18/net-fix-race-on-decreasing-number-of-tx-queues.patch
This is a note to let you know that I've just added the patch titled
ipv6 sit: work around bogus gcc-8 -Wrestrict warning
to the 3.18-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
ipv6-sit-work-around-bogus-gcc-8-wrestrict-warning.patch
and it can be found in the queue-3.18 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Thu Mar 8 10:42:32 PST 2018
From: Arnd Bergmann <arnd(a)arndb.de>
Date: Thu, 22 Feb 2018 16:55:34 +0100
Subject: ipv6 sit: work around bogus gcc-8 -Wrestrict warning
From: Arnd Bergmann <arnd(a)arndb.de>
[ Upstream commit ca79bec237f5809a7c3c59bd41cd0880aa889966 ]
gcc-8 has a new warning that detects overlapping input and output arguments
in memcpy(). It triggers for sit_init_net() calling ipip6_tunnel_clone_6rd(),
which is actually correct:
net/ipv6/sit.c: In function 'sit_init_net':
net/ipv6/sit.c:192:3: error: 'memcpy' source argument is the same as destination [-Werror=restrict]
The problem here is that the logic detecting the memcpy() arguments finds them
to be the same, but the conditional that tests for the input and output of
ipip6_tunnel_clone_6rd() to be identical is not a compile-time constant.
We know that netdev_priv(t->dev) is the same as t for a tunnel device,
and comparing "dev" directly here lets the compiler figure out as well
that 'dev == sitn->fb_tunnel_dev' when called from sit_init_net(), so
it no longer warns.
This code is old, so Cc stable to make sure that we don't get the warning
for older kernels built with new gcc.
Cc: Martin Sebor <msebor(a)gmail.com>
Link: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83456
Signed-off-by: Arnd Bergmann <arnd(a)arndb.de>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
net/ipv6/sit.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
--- a/net/ipv6/sit.c
+++ b/net/ipv6/sit.c
@@ -176,7 +176,7 @@ static void ipip6_tunnel_clone_6rd(struc
#ifdef CONFIG_IPV6_SIT_6RD
struct ip_tunnel *t = netdev_priv(dev);
- if (t->dev == sitn->fb_tunnel_dev) {
+ if (dev == sitn->fb_tunnel_dev) {
ipv6_addr_set(&t->ip6rd.prefix, htonl(0x20020000), 0, 0, 0);
t->ip6rd.relay_prefix = 0;
t->ip6rd.prefixlen = 16;
Patches currently in stable-queue which might be from arnd(a)arndb.de are
queue-3.18/ipv6-sit-work-around-bogus-gcc-8-wrestrict-warning.patch
This is a note to let you know that I've just added the patch titled
hdlc_ppp: carrier detect ok, don't turn off negotiation
to the 3.18-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
hdlc_ppp-carrier-detect-ok-don-t-turn-off-negotiation.patch
and it can be found in the queue-3.18 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Thu Mar 8 10:42:32 PST 2018
From: Denis Du <dudenis2000(a)yahoo.ca>
Date: Sat, 24 Feb 2018 16:51:42 -0500
Subject: hdlc_ppp: carrier detect ok, don't turn off negotiation
From: Denis Du <dudenis2000(a)yahoo.ca>
[ Upstream commit b6c3bad1ba83af1062a7ff6986d9edc4f3d7fc8e ]
Sometimes when physical lines have a just good noise to make the protocol
handshaking fail, but the carrier detect still good. Then after remove of
the noise, nobody will trigger this protocol to be start again to cause
the link to never come back. The fix is when the carrier is still on, not
terminate the protocol handshaking.
Signed-off-by: Denis Du <dudenis2000(a)yahoo.ca>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/net/wan/hdlc_ppp.c | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)
--- a/drivers/net/wan/hdlc_ppp.c
+++ b/drivers/net/wan/hdlc_ppp.c
@@ -574,7 +574,10 @@ static void ppp_timer(unsigned long arg)
ppp_cp_event(proto->dev, proto->pid, TO_GOOD, 0, 0,
0, NULL);
proto->restart_counter--;
- } else
+ } else if (netif_carrier_ok(proto->dev))
+ ppp_cp_event(proto->dev, proto->pid, TO_GOOD, 0, 0,
+ 0, NULL);
+ else
ppp_cp_event(proto->dev, proto->pid, TO_BAD, 0, 0,
0, NULL);
break;
Patches currently in stable-queue which might be from dudenis2000(a)yahoo.ca are
queue-3.18/hdlc_ppp-carrier-detect-ok-don-t-turn-off-negotiation.patch
This is a note to let you know that I've just added the patch titled
fib_semantics: Don't match route with mismatching tclassid
to the 3.18-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
fib_semantics-don-t-match-route-with-mismatching-tclassid.patch
and it can be found in the queue-3.18 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Thu Mar 8 10:42:32 PST 2018
From: Stefano Brivio <sbrivio(a)redhat.com>
Date: Thu, 15 Feb 2018 09:46:03 +0100
Subject: fib_semantics: Don't match route with mismatching tclassid
From: Stefano Brivio <sbrivio(a)redhat.com>
[ Upstream commit a8c6db1dfd1b1d18359241372bb204054f2c3174 ]
In fib_nh_match(), if output interface or gateway are passed in
the FIB configuration, we don't have to check next hops of
multipath routes to conclude whether we have a match or not.
However, we might still have routes with different realms
matching the same output interface and gateway configuration,
and this needs to cause the match to fail. Otherwise the first
route inserted in the FIB will match, regardless of the realms:
# ip route add 1.1.1.1 dev eth0 table 1234 realms 1/2
# ip route append 1.1.1.1 dev eth0 table 1234 realms 3/4
# ip route list table 1234
1.1.1.1 dev eth0 scope link realms 1/2
1.1.1.1 dev eth0 scope link realms 3/4
# ip route del 1.1.1.1 dev ens3 table 1234 realms 3/4
# ip route list table 1234
1.1.1.1 dev ens3 scope link realms 3/4
whereas route with realms 3/4 should have been deleted instead.
Explicitly check for fc_flow passed in the FIB configuration
(this comes from RTA_FLOW extracted by rtm_to_fib_config()) and
fail matching if it differs from nh_tclassid.
The handling of RTA_FLOW for multipath routes later in
fib_nh_match() is still needed, as we can have multiple RTA_FLOW
attributes that need to be matched against the tclassid of each
next hop.
v2: Check that fc_flow is set before discarding the match, so
that the user can still select the first matching rule by
not specifying any realm, as suggested by David Ahern.
Reported-by: Jianlin Shi <jishi(a)redhat.com>
Signed-off-by: Stefano Brivio <sbrivio(a)redhat.com>
Acked-by: David Ahern <dsahern(a)gmail.com>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
net/ipv4/fib_semantics.c | 5 +++++
1 file changed, 5 insertions(+)
--- a/net/ipv4/fib_semantics.c
+++ b/net/ipv4/fib_semantics.c
@@ -514,6 +514,11 @@ int fib_nh_match(struct fib_config *cfg,
return 1;
if (cfg->fc_oif || cfg->fc_gw) {
+#ifdef CONFIG_IP_ROUTE_CLASSID
+ if (cfg->fc_flow &&
+ cfg->fc_flow != fi->fib_nh->nh_tclassid)
+ return 1;
+#endif
if ((!cfg->fc_oif || cfg->fc_oif == fi->fib_nh->nh_oif) &&
(!cfg->fc_gw || cfg->fc_gw == fi->fib_nh->nh_gw))
return 0;
Patches currently in stable-queue which might be from sbrivio(a)redhat.com are
queue-3.18/net-ipv4-don-t-allow-setting-net.ipv4.route.min_pmtu-below-68.patch
queue-3.18/fib_semantics-don-t-match-route-with-mismatching-tclassid.patch
This is a note to let you know that I've just added the patch titled
bridge: check brport attr show in brport_show
to the 3.18-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
bridge-check-brport-attr-show-in-brport_show.patch
and it can be found in the queue-3.18 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Thu Mar 8 10:42:32 PST 2018
From: Xin Long <lucien.xin(a)gmail.com>
Date: Mon, 12 Feb 2018 17:15:40 +0800
Subject: bridge: check brport attr show in brport_show
From: Xin Long <lucien.xin(a)gmail.com>
[ Upstream commit 1b12580af1d0677c3c3a19e35bfe5d59b03f737f ]
Now br_sysfs_if file flush doesn't have attr show. To read it will
cause kernel panic after users chmod u+r this file.
Xiong found this issue when running the commands:
ip link add br0 type bridge
ip link add type veth
ip link set veth0 master br0
chmod u+r /sys/devices/virtual/net/veth0/brport/flush
timeout 3 cat /sys/devices/virtual/net/veth0/brport/flush
kernel crashed with NULL a pointer dereference call trace.
This patch is to fix it by return -EINVAL when brport_attr->show
is null, just the same as the check for brport_attr->store in
brport_store().
Fixes: 9cf637473c85 ("bridge: add sysfs hook to flush forwarding table")
Reported-by: Xiong Zhou <xzhou(a)redhat.com>
Signed-off-by: Xin Long <lucien.xin(a)gmail.com>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
net/bridge/br_sysfs_if.c | 3 +++
1 file changed, 3 insertions(+)
--- a/net/bridge/br_sysfs_if.c
+++ b/net/bridge/br_sysfs_if.c
@@ -225,6 +225,9 @@ static ssize_t brport_show(struct kobjec
struct brport_attribute *brport_attr = to_brport_attr(attr);
struct net_bridge_port *p = to_brport(kobj);
+ if (!brport_attr->show)
+ return -EINVAL;
+
return brport_attr->show(p, buf);
}
Patches currently in stable-queue which might be from lucien.xin(a)gmail.com are
queue-3.18/bridge-check-brport-attr-show-in-brport_show.patch
This is a note to let you know that I've just added the patch titled
udplite: fix partial checksum initialization
to the 4.4-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
udplite-fix-partial-checksum-initialization.patch
and it can be found in the queue-4.4 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Thu Mar 8 10:22:29 PST 2018
From: Alexey Kodanev <alexey.kodanev(a)oracle.com>
Date: Thu, 15 Feb 2018 20:18:43 +0300
Subject: udplite: fix partial checksum initialization
From: Alexey Kodanev <alexey.kodanev(a)oracle.com>
[ Upstream commit 15f35d49c93f4fa9875235e7bf3e3783d2dd7a1b ]
Since UDP-Lite is always using checksum, the following path is
triggered when calculating pseudo header for it:
udp4_csum_init() or udp6_csum_init()
skb_checksum_init_zero_check()
__skb_checksum_validate_complete()
The problem can appear if skb->len is less than CHECKSUM_BREAK. In
this particular case __skb_checksum_validate_complete() also invokes
__skb_checksum_complete(skb). If UDP-Lite is using partial checksum
that covers only part of a packet, the function will return bad
checksum and the packet will be dropped.
It can be fixed if we skip skb_checksum_init_zero_check() and only
set the required pseudo header checksum for UDP-Lite with partial
checksum before udp4_csum_init()/udp6_csum_init() functions return.
Fixes: ed70fcfcee95 ("net: Call skb_checksum_init in IPv4")
Fixes: e4f45b7f40bd ("net: Call skb_checksum_init in IPv6")
Signed-off-by: Alexey Kodanev <alexey.kodanev(a)oracle.com>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
include/net/udplite.h | 1 +
net/ipv4/udp.c | 5 +++++
net/ipv6/ip6_checksum.c | 5 +++++
3 files changed, 11 insertions(+)
--- a/include/net/udplite.h
+++ b/include/net/udplite.h
@@ -62,6 +62,7 @@ static inline int udplite_checksum_init(
UDP_SKB_CB(skb)->cscov = cscov;
if (skb->ip_summed == CHECKSUM_COMPLETE)
skb->ip_summed = CHECKSUM_NONE;
+ skb->csum_valid = 0;
}
return 0;
--- a/net/ipv4/udp.c
+++ b/net/ipv4/udp.c
@@ -1744,6 +1744,11 @@ static inline int udp4_csum_init(struct
err = udplite_checksum_init(skb, uh);
if (err)
return err;
+
+ if (UDP_SKB_CB(skb)->partial_cov) {
+ skb->csum = inet_compute_pseudo(skb, proto);
+ return 0;
+ }
}
return skb_checksum_init_zero_check(skb, proto, uh->check,
--- a/net/ipv6/ip6_checksum.c
+++ b/net/ipv6/ip6_checksum.c
@@ -73,6 +73,11 @@ int udp6_csum_init(struct sk_buff *skb,
err = udplite_checksum_init(skb, uh);
if (err)
return err;
+
+ if (UDP_SKB_CB(skb)->partial_cov) {
+ skb->csum = ip6_compute_pseudo(skb, proto);
+ return 0;
+ }
}
/* To support RFC 6936 (allow zero checksum in UDP/IPV6 for tunnels)
Patches currently in stable-queue which might be from alexey.kodanev(a)oracle.com are
queue-4.4/sctp-fix-dst-refcnt-leak-in-sctp_v6_get_dst.patch
queue-4.4/udplite-fix-partial-checksum-initialization.patch
queue-4.4/sctp-verify-size-of-a-new-chunk-in-_sctp_make_chunk.patch
This is a note to let you know that I've just added the patch titled
sctp: verify size of a new chunk in _sctp_make_chunk()
to the 4.4-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
sctp-verify-size-of-a-new-chunk-in-_sctp_make_chunk.patch
and it can be found in the queue-4.4 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Thu Mar 8 10:22:29 PST 2018
From: Alexey Kodanev <alexey.kodanev(a)oracle.com>
Date: Fri, 9 Feb 2018 17:35:23 +0300
Subject: sctp: verify size of a new chunk in _sctp_make_chunk()
From: Alexey Kodanev <alexey.kodanev(a)oracle.com>
[ Upstream commit 07f2c7ab6f8d0a7e7c5764c4e6cc9c52951b9d9c ]
When SCTP makes INIT or INIT_ACK packet the total chunk length
can exceed SCTP_MAX_CHUNK_LEN which leads to kernel panic when
transmitting these packets, e.g. the crash on sending INIT_ACK:
[ 597.804948] skbuff: skb_over_panic: text:00000000ffae06e4 len:120168
put:120156 head:000000007aa47635 data:00000000d991c2de
tail:0x1d640 end:0xfec0 dev:<NULL>
...
[ 597.976970] ------------[ cut here ]------------
[ 598.033408] kernel BUG at net/core/skbuff.c:104!
[ 600.314841] Call Trace:
[ 600.345829] <IRQ>
[ 600.371639] ? sctp_packet_transmit+0x2095/0x26d0 [sctp]
[ 600.436934] skb_put+0x16c/0x200
[ 600.477295] sctp_packet_transmit+0x2095/0x26d0 [sctp]
[ 600.540630] ? sctp_packet_config+0x890/0x890 [sctp]
[ 600.601781] ? __sctp_packet_append_chunk+0x3b4/0xd00 [sctp]
[ 600.671356] ? sctp_cmp_addr_exact+0x3f/0x90 [sctp]
[ 600.731482] sctp_outq_flush+0x663/0x30d0 [sctp]
[ 600.788565] ? sctp_make_init+0xbf0/0xbf0 [sctp]
[ 600.845555] ? sctp_check_transmitted+0x18f0/0x18f0 [sctp]
[ 600.912945] ? sctp_outq_tail+0x631/0x9d0 [sctp]
[ 600.969936] sctp_cmd_interpreter.isra.22+0x3be1/0x5cb0 [sctp]
[ 601.041593] ? sctp_sf_do_5_1B_init+0x85f/0xc30 [sctp]
[ 601.104837] ? sctp_generate_t1_cookie_event+0x20/0x20 [sctp]
[ 601.175436] ? sctp_eat_data+0x1710/0x1710 [sctp]
[ 601.233575] sctp_do_sm+0x182/0x560 [sctp]
[ 601.284328] ? sctp_has_association+0x70/0x70 [sctp]
[ 601.345586] ? sctp_rcv+0xef4/0x32f0 [sctp]
[ 601.397478] ? sctp6_rcv+0xa/0x20 [sctp]
...
Here the chunk size for INIT_ACK packet becomes too big, mostly
because of the state cookie (INIT packet has large size with
many address parameters), plus additional server parameters.
Later this chunk causes the panic in skb_put_data():
skb_packet_transmit()
sctp_packet_pack()
skb_put_data(nskb, chunk->skb->data, chunk->skb->len);
'nskb' (head skb) was previously allocated with packet->size
from u16 'chunk->chunk_hdr->length'.
As suggested by Marcelo we should check the chunk's length in
_sctp_make_chunk() before trying to allocate skb for it and
discard a chunk if its size bigger than SCTP_MAX_CHUNK_LEN.
Signed-off-by: Alexey Kodanev <alexey.kodanev(a)oracle.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leinter(a)gmail.com>
Acked-by: Neil Horman <nhorman(a)tuxdriver.com>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
net/sctp/sm_make_chunk.c | 8 ++++++--
1 file changed, 6 insertions(+), 2 deletions(-)
--- a/net/sctp/sm_make_chunk.c
+++ b/net/sctp/sm_make_chunk.c
@@ -1367,10 +1367,14 @@ static struct sctp_chunk *_sctp_make_chu
sctp_chunkhdr_t *chunk_hdr;
struct sk_buff *skb;
struct sock *sk;
+ int chunklen;
+
+ chunklen = sizeof(*chunk_hdr) + paylen;
+ if (chunklen > SCTP_MAX_CHUNK_LEN)
+ goto nodata;
/* No need to allocate LL here, as this is only a chunk. */
- skb = alloc_skb(WORD_ROUND(sizeof(sctp_chunkhdr_t) + paylen),
- GFP_ATOMIC);
+ skb = alloc_skb(chunklen, GFP_ATOMIC);
if (!skb)
goto nodata;
Patches currently in stable-queue which might be from alexey.kodanev(a)oracle.com are
queue-4.4/sctp-fix-dst-refcnt-leak-in-sctp_v6_get_dst.patch
queue-4.4/udplite-fix-partial-checksum-initialization.patch
queue-4.4/sctp-verify-size-of-a-new-chunk-in-_sctp_make_chunk.patch
This is a note to let you know that I've just added the patch titled
sctp: fix dst refcnt leak in sctp_v6_get_dst()
to the 4.4-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
sctp-fix-dst-refcnt-leak-in-sctp_v6_get_dst.patch
and it can be found in the queue-4.4 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Thu Mar 8 10:22:29 PST 2018
From: Alexey Kodanev <alexey.kodanev(a)oracle.com>
Date: Mon, 5 Feb 2018 15:10:35 +0300
Subject: sctp: fix dst refcnt leak in sctp_v6_get_dst()
From: Alexey Kodanev <alexey.kodanev(a)oracle.com>
[ Upstream commit 957d761cf91cdbb175ad7d8f5472336a4d54dbf2 ]
When going through the bind address list in sctp_v6_get_dst() and
the previously found address is better ('matchlen > bmatchlen'),
the code continues to the next iteration without releasing currently
held destination.
Fix it by releasing 'bdst' before continue to the next iteration, and
instead of introducing one more '!IS_ERR(bdst)' check for dst_release(),
move the already existed one right after ip6_dst_lookup_flow(), i.e. we
shouldn't proceed further if we get an error for the route lookup.
Fixes: dbc2b5e9a09e ("sctp: fix src address selection if using secondary addresses for ipv6")
Signed-off-by: Alexey Kodanev <alexey.kodanev(a)oracle.com>
Acked-by: Neil Horman <nhorman(a)tuxdriver.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner(a)gmail.com>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
net/sctp/ipv6.c | 10 +++++++---
1 file changed, 7 insertions(+), 3 deletions(-)
--- a/net/sctp/ipv6.c
+++ b/net/sctp/ipv6.c
@@ -323,8 +323,10 @@ static void sctp_v6_get_dst(struct sctp_
final_p = fl6_update_dst(fl6, rcu_dereference(np->opt), &final);
bdst = ip6_dst_lookup_flow(sk, fl6, final_p);
- if (!IS_ERR(bdst) &&
- ipv6_chk_addr(dev_net(bdst->dev),
+ if (IS_ERR(bdst))
+ continue;
+
+ if (ipv6_chk_addr(dev_net(bdst->dev),
&laddr->a.v6.sin6_addr, bdst->dev, 1)) {
if (!IS_ERR_OR_NULL(dst))
dst_release(dst);
@@ -333,8 +335,10 @@ static void sctp_v6_get_dst(struct sctp_
}
bmatchlen = sctp_v6_addr_match_len(daddr, &laddr->a);
- if (matchlen > bmatchlen)
+ if (matchlen > bmatchlen) {
+ dst_release(bdst);
continue;
+ }
if (!IS_ERR_OR_NULL(dst))
dst_release(dst);
Patches currently in stable-queue which might be from alexey.kodanev(a)oracle.com are
queue-4.4/sctp-fix-dst-refcnt-leak-in-sctp_v6_get_dst.patch
queue-4.4/udplite-fix-partial-checksum-initialization.patch
queue-4.4/sctp-verify-size-of-a-new-chunk-in-_sctp_make_chunk.patch
This is a note to let you know that I've just added the patch titled
sctp: fix dst refcnt leak in sctp_v4_get_dst
to the 4.4-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
sctp-fix-dst-refcnt-leak-in-sctp_v4_get_dst.patch
and it can be found in the queue-4.4 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Thu Mar 8 10:22:29 PST 2018
From: Tommi Rantala <tommi.t.rantala(a)nokia.com>
Date: Mon, 5 Feb 2018 21:48:14 +0200
Subject: sctp: fix dst refcnt leak in sctp_v4_get_dst
From: Tommi Rantala <tommi.t.rantala(a)nokia.com>
[ Upstream commit 4a31a6b19f9ddf498c81f5c9b089742b7472a6f8 ]
Fix dst reference count leak in sctp_v4_get_dst() introduced in commit
410f03831 ("sctp: add routing output fallback"):
When walking the address_list, successive ip_route_output_key() calls
may return the same rt->dst with the reference incremented on each call.
The code would not decrement the dst refcount when the dst pointer was
identical from the previous iteration, causing the dst refcnt leak.
Testcase:
ip netns add TEST
ip netns exec TEST ip link set lo up
ip link add dummy0 type dummy
ip link add dummy1 type dummy
ip link add dummy2 type dummy
ip link set dev dummy0 netns TEST
ip link set dev dummy1 netns TEST
ip link set dev dummy2 netns TEST
ip netns exec TEST ip addr add 192.168.1.1/24 dev dummy0
ip netns exec TEST ip link set dummy0 up
ip netns exec TEST ip addr add 192.168.1.2/24 dev dummy1
ip netns exec TEST ip link set dummy1 up
ip netns exec TEST ip addr add 192.168.1.3/24 dev dummy2
ip netns exec TEST ip link set dummy2 up
ip netns exec TEST sctp_test -H 192.168.1.2 -P 20002 -h 192.168.1.1 -p 20000 -s -B 192.168.1.3
ip netns del TEST
In 4.4 and 4.9 kernels this results to:
[ 354.179591] unregister_netdevice: waiting for lo to become free. Usage count = 1
[ 364.419674] unregister_netdevice: waiting for lo to become free. Usage count = 1
[ 374.663664] unregister_netdevice: waiting for lo to become free. Usage count = 1
[ 384.903717] unregister_netdevice: waiting for lo to become free. Usage count = 1
[ 395.143724] unregister_netdevice: waiting for lo to become free. Usage count = 1
[ 405.383645] unregister_netdevice: waiting for lo to become free. Usage count = 1
...
Fixes: 410f03831 ("sctp: add routing output fallback")
Fixes: 0ca50d12f ("sctp: fix src address selection if using secondary addresses")
Signed-off-by: Tommi Rantala <tommi.t.rantala(a)nokia.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner(a)gmail.com>
Acked-by: Neil Horman <nhorman(a)tuxdriver.com>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
net/sctp/protocol.c | 10 ++++------
1 file changed, 4 insertions(+), 6 deletions(-)
--- a/net/sctp/protocol.c
+++ b/net/sctp/protocol.c
@@ -508,22 +508,20 @@ static void sctp_v4_get_dst(struct sctp_
if (IS_ERR(rt))
continue;
- if (!dst)
- dst = &rt->dst;
-
/* Ensure the src address belongs to the output
* interface.
*/
odev = __ip_dev_find(sock_net(sk), laddr->a.v4.sin_addr.s_addr,
false);
if (!odev || odev->ifindex != fl4->flowi4_oif) {
- if (&rt->dst != dst)
+ if (!dst)
+ dst = &rt->dst;
+ else
dst_release(&rt->dst);
continue;
}
- if (dst != &rt->dst)
- dst_release(dst);
+ dst_release(dst);
dst = &rt->dst;
break;
}
Patches currently in stable-queue which might be from tommi.t.rantala(a)nokia.com are
queue-4.4/sctp-fix-dst-refcnt-leak-in-sctp_v4_get_dst.patch
This is a note to let you know that I've just added the patch titled
s390/qeth: fix SETIP command handling
to the 4.4-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
s390-qeth-fix-setip-command-handling.patch
and it can be found in the queue-4.4 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Thu Mar 8 10:22:29 PST 2018
From: Julian Wiedmann <jwi(a)linux.vnet.ibm.com>
Date: Fri, 9 Feb 2018 11:03:50 +0100
Subject: s390/qeth: fix SETIP command handling
From: Julian Wiedmann <jwi(a)linux.vnet.ibm.com>
[ Upstream commit 1c5b2216fbb973a9410e0b06389740b5c1289171 ]
send_control_data() applies some special handling to SETIP v4 IPA
commands. But current code parses *all* command types for the SETIP
command code. Limit the command code check to IPA commands.
Fixes: 5b54e16f1a54 ("qeth: do not spin for SETIP ip assist command")
Signed-off-by: Julian Wiedmann <jwi(a)linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/s390/net/qeth_core.h | 5 +++++
drivers/s390/net/qeth_core_main.c | 14 ++++++++------
2 files changed, 13 insertions(+), 6 deletions(-)
--- a/drivers/s390/net/qeth_core.h
+++ b/drivers/s390/net/qeth_core.h
@@ -591,6 +591,11 @@ struct qeth_cmd_buffer {
void (*callback) (struct qeth_channel *, struct qeth_cmd_buffer *);
};
+static inline struct qeth_ipa_cmd *__ipa_cmd(struct qeth_cmd_buffer *iob)
+{
+ return (struct qeth_ipa_cmd *)(iob->data + IPA_PDU_HEADER_SIZE);
+}
+
/**
* definition of a qeth channel, used for read and write
*/
--- a/drivers/s390/net/qeth_core_main.c
+++ b/drivers/s390/net/qeth_core_main.c
@@ -2054,7 +2054,7 @@ int qeth_send_control_data(struct qeth_c
unsigned long flags;
struct qeth_reply *reply = NULL;
unsigned long timeout, event_timeout;
- struct qeth_ipa_cmd *cmd;
+ struct qeth_ipa_cmd *cmd = NULL;
QETH_CARD_TEXT(card, 2, "sendctl");
@@ -2081,10 +2081,13 @@ int qeth_send_control_data(struct qeth_c
while (atomic_cmpxchg(&card->write.irq_pending, 0, 1)) ;
qeth_prepare_control_data(card, len, iob);
- if (IS_IPA(iob->data))
+ if (IS_IPA(iob->data)) {
+ cmd = __ipa_cmd(iob);
event_timeout = QETH_IPA_TIMEOUT;
- else
+ } else {
event_timeout = QETH_TIMEOUT;
+ }
+
timeout = jiffies + event_timeout;
QETH_CARD_TEXT(card, 6, "noirqpnd");
@@ -2109,9 +2112,8 @@ int qeth_send_control_data(struct qeth_c
/* we have only one long running ipassist, since we can ensure
process context of this command we can sleep */
- cmd = (struct qeth_ipa_cmd *)(iob->data+IPA_PDU_HEADER_SIZE);
- if ((cmd->hdr.command == IPA_CMD_SETIP) &&
- (cmd->hdr.prot_version == QETH_PROT_IPV4)) {
+ if (cmd && cmd->hdr.command == IPA_CMD_SETIP &&
+ cmd->hdr.prot_version == QETH_PROT_IPV4) {
if (!wait_event_timeout(reply->wait_q,
atomic_read(&reply->received), event_timeout))
goto time_err;
Patches currently in stable-queue which might be from jwi(a)linux.vnet.ibm.com are
queue-4.4/s390-qeth-fix-setip-command-handling.patch
queue-4.4/s390-qeth-fix-ipa-command-submission-race.patch
This is a note to let you know that I've just added the patch titled
s390/qeth: fix IPA command submission race
to the 4.4-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
s390-qeth-fix-ipa-command-submission-race.patch
and it can be found in the queue-4.4 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Thu Mar 8 10:22:29 PST 2018
From: Julian Wiedmann <jwi(a)linux.vnet.ibm.com>
Date: Tue, 27 Feb 2018 18:58:17 +0100
Subject: s390/qeth: fix IPA command submission race
From: Julian Wiedmann <jwi(a)linux.vnet.ibm.com>
[ Upstream commit d22ffb5a712f9211ffd104c38fc17cbfb1b5e2b0 ]
If multiple IPA commands are build & sent out concurrently,
fill_ipacmd_header() may assign a seqno value to a command that's
different from what send_control_data() later assigns to this command's
reply.
This is due to other commands passing through send_control_data(),
and incrementing card->seqno.ipa along the way.
So one IPA command has no reply that's waiting for its seqno, while some
other IPA command has multiple reply objects waiting for it.
Only one of those waiting replies wins, and the other(s) times out and
triggers a recovery via send_ipa_cmd().
Fix this by making sure that the same seqno value is assigned to
a command and its reply object.
Do so immediately before submitting the command & while holding the
irq_pending "lock", to produce nicely ascending seqnos.
As a side effect, *all* IPA commands now use a reply object that's
waiting for its actual seqno. Previously, early IPA commands that were
submitted while the card was still DOWN used the "catch-all" IDX seqno.
Signed-off-by: Julian Wiedmann <jwi(a)linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/s390/net/qeth_core_main.c | 19 ++++++++++---------
1 file changed, 10 insertions(+), 9 deletions(-)
--- a/drivers/s390/net/qeth_core_main.c
+++ b/drivers/s390/net/qeth_core_main.c
@@ -2068,25 +2068,26 @@ int qeth_send_control_data(struct qeth_c
}
reply->callback = reply_cb;
reply->param = reply_param;
- if (card->state == CARD_STATE_DOWN)
- reply->seqno = QETH_IDX_COMMAND_SEQNO;
- else
- reply->seqno = card->seqno.ipa++;
+
init_waitqueue_head(&reply->wait_q);
- spin_lock_irqsave(&card->lock, flags);
- list_add_tail(&reply->list, &card->cmd_waiter_list);
- spin_unlock_irqrestore(&card->lock, flags);
QETH_DBF_HEX(CTRL, 2, iob->data, QETH_DBF_CTRL_LEN);
while (atomic_cmpxchg(&card->write.irq_pending, 0, 1)) ;
- qeth_prepare_control_data(card, len, iob);
if (IS_IPA(iob->data)) {
cmd = __ipa_cmd(iob);
+ cmd->hdr.seqno = card->seqno.ipa++;
+ reply->seqno = cmd->hdr.seqno;
event_timeout = QETH_IPA_TIMEOUT;
} else {
+ reply->seqno = QETH_IDX_COMMAND_SEQNO;
event_timeout = QETH_TIMEOUT;
}
+ qeth_prepare_control_data(card, len, iob);
+
+ spin_lock_irqsave(&card->lock, flags);
+ list_add_tail(&reply->list, &card->cmd_waiter_list);
+ spin_unlock_irqrestore(&card->lock, flags);
timeout = jiffies + event_timeout;
@@ -2879,7 +2880,7 @@ static void qeth_fill_ipacmd_header(stru
memset(cmd, 0, sizeof(struct qeth_ipa_cmd));
cmd->hdr.command = command;
cmd->hdr.initiator = IPA_CMD_INITIATOR_HOST;
- cmd->hdr.seqno = card->seqno.ipa;
+ /* cmd->hdr.seqno is set by qeth_send_control_data() */
cmd->hdr.adapter_type = qeth_get_ipa_adp_type(card->info.link_type);
cmd->hdr.rel_adapter_no = (__u8) card->info.portno;
if (card->options.layer2)
Patches currently in stable-queue which might be from jwi(a)linux.vnet.ibm.com are
queue-4.4/s390-qeth-fix-setip-command-handling.patch
queue-4.4/s390-qeth-fix-ipa-command-submission-race.patch
This is a note to let you know that I've just added the patch titled
ppp: prevent unregistered channels from connecting to PPP units
to the 4.4-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
ppp-prevent-unregistered-channels-from-connecting-to-ppp-units.patch
and it can be found in the queue-4.4 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Thu Mar 8 10:22:29 PST 2018
From: Guillaume Nault <g.nault(a)alphalink.fr>
Date: Fri, 2 Mar 2018 18:41:16 +0100
Subject: ppp: prevent unregistered channels from connecting to PPP units
From: Guillaume Nault <g.nault(a)alphalink.fr>
[ Upstream commit 77f840e3e5f09c6d7d727e85e6e08276dd813d11 ]
PPP units don't hold any reference on the channels connected to it.
It is the channel's responsibility to ensure that it disconnects from
its unit before being destroyed.
In practice, this is ensured by ppp_unregister_channel() disconnecting
the channel from the unit before dropping a reference on the channel.
However, it is possible for an unregistered channel to connect to a PPP
unit: register a channel with ppp_register_net_channel(), attach a
/dev/ppp file to it with ioctl(PPPIOCATTCHAN), unregister the channel
with ppp_unregister_channel() and finally connect the /dev/ppp file to
a PPP unit with ioctl(PPPIOCCONNECT).
Once in this situation, the channel is only held by the /dev/ppp file,
which can be released at anytime and free the channel without letting
the parent PPP unit know. Then the ppp structure ends up with dangling
pointers in its ->channels list.
Prevent this scenario by forbidding unregistered channels from
connecting to PPP units. This maintains the code logic by keeping
ppp_unregister_channel() responsible from disconnecting the channel if
necessary and avoids modification on the reference counting mechanism.
This issue seems to predate git history (successfully reproduced on
Linux 2.6.26 and earlier PPP commits are unrelated).
Signed-off-by: Guillaume Nault <g.nault(a)alphalink.fr>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/net/ppp/ppp_generic.c | 9 +++++++++
1 file changed, 9 insertions(+)
--- a/drivers/net/ppp/ppp_generic.c
+++ b/drivers/net/ppp/ppp_generic.c
@@ -2952,6 +2952,15 @@ ppp_connect_channel(struct channel *pch,
goto outl;
ppp_lock(ppp);
+ spin_lock_bh(&pch->downl);
+ if (!pch->chan) {
+ /* Don't connect unregistered channels */
+ spin_unlock_bh(&pch->downl);
+ ppp_unlock(ppp);
+ ret = -ENOTCONN;
+ goto outl;
+ }
+ spin_unlock_bh(&pch->downl);
if (pch->file.hdrlen > ppp->file.hdrlen)
ppp->file.hdrlen = pch->file.hdrlen;
hdrlen = pch->file.hdrlen + 2; /* for protocol bytes */
Patches currently in stable-queue which might be from g.nault(a)alphalink.fr are
queue-4.4/ppp-prevent-unregistered-channels-from-connecting-to-ppp-units.patch
This is a note to let you know that I've just added the patch titled
net: ipv4: don't allow setting net.ipv4.route.min_pmtu below 68
to the 4.4-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
net-ipv4-don-t-allow-setting-net.ipv4.route.min_pmtu-below-68.patch
and it can be found in the queue-4.4 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Thu Mar 8 10:22:29 PST 2018
From: Sabrina Dubroca <sd(a)queasysnail.net>
Date: Mon, 26 Feb 2018 16:13:43 +0100
Subject: net: ipv4: don't allow setting net.ipv4.route.min_pmtu below 68
From: Sabrina Dubroca <sd(a)queasysnail.net>
[ Upstream commit c7272c2f1229125f74f22dcdd59de9bbd804f1c8 ]
According to RFC 1191 sections 3 and 4, ICMP frag-needed messages
indicating an MTU below 68 should be rejected:
A host MUST never reduce its estimate of the Path MTU below 68
octets.
and (talking about ICMP frag-needed's Next-Hop MTU field):
This field will never contain a value less than 68, since every
router "must be able to forward a datagram of 68 octets without
fragmentation".
Furthermore, by letting net.ipv4.route.min_pmtu be set to negative
values, we can end up with a very large PMTU when (-1) is cast into u32.
Let's also make ip_rt_min_pmtu a u32, since it's only ever compared to
unsigned ints.
Reported-by: Jianlin Shi <jishi(a)redhat.com>
Signed-off-by: Sabrina Dubroca <sd(a)queasysnail.net>
Reviewed-by: Stefano Brivio <sbrivio(a)redhat.com>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
net/ipv4/route.c | 8 ++++++--
1 file changed, 6 insertions(+), 2 deletions(-)
--- a/net/ipv4/route.c
+++ b/net/ipv4/route.c
@@ -126,10 +126,13 @@ static int ip_rt_redirect_silence __read
static int ip_rt_error_cost __read_mostly = HZ;
static int ip_rt_error_burst __read_mostly = 5 * HZ;
static int ip_rt_mtu_expires __read_mostly = 10 * 60 * HZ;
-static int ip_rt_min_pmtu __read_mostly = 512 + 20 + 20;
+static u32 ip_rt_min_pmtu __read_mostly = 512 + 20 + 20;
static int ip_rt_min_advmss __read_mostly = 256;
static int ip_rt_gc_timeout __read_mostly = RT_GC_TIMEOUT;
+
+static int ip_min_valid_pmtu __read_mostly = IPV4_MIN_MTU;
+
/*
* Interface to generic destination cache.
*/
@@ -2765,7 +2768,8 @@ static struct ctl_table ipv4_route_table
.data = &ip_rt_min_pmtu,
.maxlen = sizeof(int),
.mode = 0644,
- .proc_handler = proc_dointvec,
+ .proc_handler = proc_dointvec_minmax,
+ .extra1 = &ip_min_valid_pmtu,
},
{
.procname = "min_adv_mss",
Patches currently in stable-queue which might be from sd(a)queasysnail.net are
queue-4.4/net-ipv4-don-t-allow-setting-net.ipv4.route.min_pmtu-below-68.patch
This is a note to let you know that I've just added the patch titled
netlink: ensure to loop over all netns in genlmsg_multicast_allns()
to the 4.4-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
netlink-ensure-to-loop-over-all-netns-in-genlmsg_multicast_allns.patch
and it can be found in the queue-4.4 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Thu Mar 8 10:22:29 PST 2018
From: Nicolas Dichtel <nicolas.dichtel(a)6wind.com>
Date: Tue, 6 Feb 2018 14:48:32 +0100
Subject: netlink: ensure to loop over all netns in genlmsg_multicast_allns()
From: Nicolas Dichtel <nicolas.dichtel(a)6wind.com>
[ Upstream commit cb9f7a9a5c96a773bbc9c70660dc600cfff82f82 ]
Nowadays, nlmsg_multicast() returns only 0 or -ESRCH but this was not the
case when commit 134e63756d5f was pushed.
However, there was no reason to stop the loop if a netns does not have
listeners.
Returns -ESRCH only if there was no listeners in all netns.
To avoid having the same problem in the future, I didn't take the
assumption that nlmsg_multicast() returns only 0 or -ESRCH.
Fixes: 134e63756d5f ("genetlink: make netns aware")
CC: Johannes Berg <johannes.berg(a)intel.com>
Signed-off-by: Nicolas Dichtel <nicolas.dichtel(a)6wind.com>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
net/netlink/genetlink.c | 12 ++++++++++--
1 file changed, 10 insertions(+), 2 deletions(-)
--- a/net/netlink/genetlink.c
+++ b/net/netlink/genetlink.c
@@ -1118,6 +1118,7 @@ static int genlmsg_mcast(struct sk_buff
{
struct sk_buff *tmp;
struct net *net, *prev = NULL;
+ bool delivered = false;
int err;
for_each_net_rcu(net) {
@@ -1129,14 +1130,21 @@ static int genlmsg_mcast(struct sk_buff
}
err = nlmsg_multicast(prev->genl_sock, tmp,
portid, group, flags);
- if (err)
+ if (!err)
+ delivered = true;
+ else if (err != -ESRCH)
goto error;
}
prev = net;
}
- return nlmsg_multicast(prev->genl_sock, skb, portid, group, flags);
+ err = nlmsg_multicast(prev->genl_sock, skb, portid, group, flags);
+ if (!err)
+ delivered = true;
+ else if (err != -ESRCH)
+ goto error;
+ return delivered ? 0 : -ESRCH;
error:
kfree_skb(skb);
return err;
Patches currently in stable-queue which might be from nicolas.dichtel(a)6wind.com are
queue-4.4/netlink-ensure-to-loop-over-all-netns-in-genlmsg_multicast_allns.patch
This is a note to let you know that I've just added the patch titled
net: fix race on decreasing number of TX queues
to the 4.4-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
net-fix-race-on-decreasing-number-of-tx-queues.patch
and it can be found in the queue-4.4 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Thu Mar 8 10:22:29 PST 2018
From: Jakub Kicinski <jakub.kicinski(a)netronome.com>
Date: Mon, 12 Feb 2018 21:35:31 -0800
Subject: net: fix race on decreasing number of TX queues
From: Jakub Kicinski <jakub.kicinski(a)netronome.com>
[ Upstream commit ac5b70198adc25c73fba28de4f78adcee8f6be0b ]
netif_set_real_num_tx_queues() can be called when netdev is up.
That usually happens when user requests change of number of
channels/rings with ethtool -L. The procedure for changing
the number of queues involves resetting the qdiscs and setting
dev->num_tx_queues to the new value. When the new value is
lower than the old one, extra care has to be taken to ensure
ordering of accesses to the number of queues vs qdisc reset.
Currently the queues are reset before new dev->num_tx_queues
is assigned, leaving a window of time where packets can be
enqueued onto the queues going down, leading to a likely
crash in the drivers, since most drivers don't check if TX
skbs are assigned to an active queue.
Fixes: e6484930d7c7 ("net: allocate tx queues in register_netdevice")
Signed-off-by: Jakub Kicinski <jakub.kicinski(a)netronome.com>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
net/core/dev.c | 11 +++++++++--
1 file changed, 9 insertions(+), 2 deletions(-)
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -2183,8 +2183,11 @@ EXPORT_SYMBOL(netif_set_xps_queue);
*/
int netif_set_real_num_tx_queues(struct net_device *dev, unsigned int txq)
{
+ bool disabling;
int rc;
+ disabling = txq < dev->real_num_tx_queues;
+
if (txq < 1 || txq > dev->num_tx_queues)
return -EINVAL;
@@ -2200,15 +2203,19 @@ int netif_set_real_num_tx_queues(struct
if (dev->num_tc)
netif_setup_tc(dev, txq);
- if (txq < dev->real_num_tx_queues) {
+ dev->real_num_tx_queues = txq;
+
+ if (disabling) {
+ synchronize_net();
qdisc_reset_all_tx_gt(dev, txq);
#ifdef CONFIG_XPS
netif_reset_xps_queues_gt(dev, txq);
#endif
}
+ } else {
+ dev->real_num_tx_queues = txq;
}
- dev->real_num_tx_queues = txq;
return 0;
}
EXPORT_SYMBOL(netif_set_real_num_tx_queues);
Patches currently in stable-queue which might be from jakub.kicinski(a)netronome.com are
queue-4.4/net-fix-race-on-decreasing-number-of-tx-queues.patch
This is a note to let you know that I've just added the patch titled
ipv6 sit: work around bogus gcc-8 -Wrestrict warning
to the 4.4-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
ipv6-sit-work-around-bogus-gcc-8-wrestrict-warning.patch
and it can be found in the queue-4.4 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Thu Mar 8 10:22:29 PST 2018
From: Arnd Bergmann <arnd(a)arndb.de>
Date: Thu, 22 Feb 2018 16:55:34 +0100
Subject: ipv6 sit: work around bogus gcc-8 -Wrestrict warning
From: Arnd Bergmann <arnd(a)arndb.de>
[ Upstream commit ca79bec237f5809a7c3c59bd41cd0880aa889966 ]
gcc-8 has a new warning that detects overlapping input and output arguments
in memcpy(). It triggers for sit_init_net() calling ipip6_tunnel_clone_6rd(),
which is actually correct:
net/ipv6/sit.c: In function 'sit_init_net':
net/ipv6/sit.c:192:3: error: 'memcpy' source argument is the same as destination [-Werror=restrict]
The problem here is that the logic detecting the memcpy() arguments finds them
to be the same, but the conditional that tests for the input and output of
ipip6_tunnel_clone_6rd() to be identical is not a compile-time constant.
We know that netdev_priv(t->dev) is the same as t for a tunnel device,
and comparing "dev" directly here lets the compiler figure out as well
that 'dev == sitn->fb_tunnel_dev' when called from sit_init_net(), so
it no longer warns.
This code is old, so Cc stable to make sure that we don't get the warning
for older kernels built with new gcc.
Cc: Martin Sebor <msebor(a)gmail.com>
Link: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83456
Signed-off-by: Arnd Bergmann <arnd(a)arndb.de>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
net/ipv6/sit.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
--- a/net/ipv6/sit.c
+++ b/net/ipv6/sit.c
@@ -176,7 +176,7 @@ static void ipip6_tunnel_clone_6rd(struc
#ifdef CONFIG_IPV6_SIT_6RD
struct ip_tunnel *t = netdev_priv(dev);
- if (t->dev == sitn->fb_tunnel_dev) {
+ if (dev == sitn->fb_tunnel_dev) {
ipv6_addr_set(&t->ip6rd.prefix, htonl(0x20020000), 0, 0, 0);
t->ip6rd.relay_prefix = 0;
t->ip6rd.prefixlen = 16;
Patches currently in stable-queue which might be from arnd(a)arndb.de are
queue-4.4/ipv6-sit-work-around-bogus-gcc-8-wrestrict-warning.patch
This is a note to let you know that I've just added the patch titled
hdlc_ppp: carrier detect ok, don't turn off negotiation
to the 4.4-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
hdlc_ppp-carrier-detect-ok-don-t-turn-off-negotiation.patch
and it can be found in the queue-4.4 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Thu Mar 8 10:22:29 PST 2018
From: Denis Du <dudenis2000(a)yahoo.ca>
Date: Sat, 24 Feb 2018 16:51:42 -0500
Subject: hdlc_ppp: carrier detect ok, don't turn off negotiation
From: Denis Du <dudenis2000(a)yahoo.ca>
[ Upstream commit b6c3bad1ba83af1062a7ff6986d9edc4f3d7fc8e ]
Sometimes when physical lines have a just good noise to make the protocol
handshaking fail, but the carrier detect still good. Then after remove of
the noise, nobody will trigger this protocol to be start again to cause
the link to never come back. The fix is when the carrier is still on, not
terminate the protocol handshaking.
Signed-off-by: Denis Du <dudenis2000(a)yahoo.ca>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/net/wan/hdlc_ppp.c | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)
--- a/drivers/net/wan/hdlc_ppp.c
+++ b/drivers/net/wan/hdlc_ppp.c
@@ -574,7 +574,10 @@ static void ppp_timer(unsigned long arg)
ppp_cp_event(proto->dev, proto->pid, TO_GOOD, 0, 0,
0, NULL);
proto->restart_counter--;
- } else
+ } else if (netif_carrier_ok(proto->dev))
+ ppp_cp_event(proto->dev, proto->pid, TO_GOOD, 0, 0,
+ 0, NULL);
+ else
ppp_cp_event(proto->dev, proto->pid, TO_BAD, 0, 0,
0, NULL);
break;
Patches currently in stable-queue which might be from dudenis2000(a)yahoo.ca are
queue-4.4/hdlc_ppp-carrier-detect-ok-don-t-turn-off-negotiation.patch
This is a note to let you know that I've just added the patch titled
fib_semantics: Don't match route with mismatching tclassid
to the 4.4-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
fib_semantics-don-t-match-route-with-mismatching-tclassid.patch
and it can be found in the queue-4.4 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Thu Mar 8 10:22:29 PST 2018
From: Stefano Brivio <sbrivio(a)redhat.com>
Date: Thu, 15 Feb 2018 09:46:03 +0100
Subject: fib_semantics: Don't match route with mismatching tclassid
From: Stefano Brivio <sbrivio(a)redhat.com>
[ Upstream commit a8c6db1dfd1b1d18359241372bb204054f2c3174 ]
In fib_nh_match(), if output interface or gateway are passed in
the FIB configuration, we don't have to check next hops of
multipath routes to conclude whether we have a match or not.
However, we might still have routes with different realms
matching the same output interface and gateway configuration,
and this needs to cause the match to fail. Otherwise the first
route inserted in the FIB will match, regardless of the realms:
# ip route add 1.1.1.1 dev eth0 table 1234 realms 1/2
# ip route append 1.1.1.1 dev eth0 table 1234 realms 3/4
# ip route list table 1234
1.1.1.1 dev eth0 scope link realms 1/2
1.1.1.1 dev eth0 scope link realms 3/4
# ip route del 1.1.1.1 dev ens3 table 1234 realms 3/4
# ip route list table 1234
1.1.1.1 dev ens3 scope link realms 3/4
whereas route with realms 3/4 should have been deleted instead.
Explicitly check for fc_flow passed in the FIB configuration
(this comes from RTA_FLOW extracted by rtm_to_fib_config()) and
fail matching if it differs from nh_tclassid.
The handling of RTA_FLOW for multipath routes later in
fib_nh_match() is still needed, as we can have multiple RTA_FLOW
attributes that need to be matched against the tclassid of each
next hop.
v2: Check that fc_flow is set before discarding the match, so
that the user can still select the first matching rule by
not specifying any realm, as suggested by David Ahern.
Reported-by: Jianlin Shi <jishi(a)redhat.com>
Signed-off-by: Stefano Brivio <sbrivio(a)redhat.com>
Acked-by: David Ahern <dsahern(a)gmail.com>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
net/ipv4/fib_semantics.c | 5 +++++
1 file changed, 5 insertions(+)
--- a/net/ipv4/fib_semantics.c
+++ b/net/ipv4/fib_semantics.c
@@ -640,6 +640,11 @@ int fib_nh_match(struct fib_config *cfg,
fi->fib_nh, cfg))
return 1;
}
+#ifdef CONFIG_IP_ROUTE_CLASSID
+ if (cfg->fc_flow &&
+ cfg->fc_flow != fi->fib_nh->nh_tclassid)
+ return 1;
+#endif
if ((!cfg->fc_oif || cfg->fc_oif == fi->fib_nh->nh_oif) &&
(!cfg->fc_gw || cfg->fc_gw == fi->fib_nh->nh_gw))
return 0;
Patches currently in stable-queue which might be from sbrivio(a)redhat.com are
queue-4.4/net-ipv4-don-t-allow-setting-net.ipv4.route.min_pmtu-below-68.patch
queue-4.4/fib_semantics-don-t-match-route-with-mismatching-tclassid.patch
This is a note to let you know that I've just added the patch titled
bridge: check brport attr show in brport_show
to the 4.4-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
bridge-check-brport-attr-show-in-brport_show.patch
and it can be found in the queue-4.4 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Thu Mar 8 10:22:29 PST 2018
From: Xin Long <lucien.xin(a)gmail.com>
Date: Mon, 12 Feb 2018 17:15:40 +0800
Subject: bridge: check brport attr show in brport_show
From: Xin Long <lucien.xin(a)gmail.com>
[ Upstream commit 1b12580af1d0677c3c3a19e35bfe5d59b03f737f ]
Now br_sysfs_if file flush doesn't have attr show. To read it will
cause kernel panic after users chmod u+r this file.
Xiong found this issue when running the commands:
ip link add br0 type bridge
ip link add type veth
ip link set veth0 master br0
chmod u+r /sys/devices/virtual/net/veth0/brport/flush
timeout 3 cat /sys/devices/virtual/net/veth0/brport/flush
kernel crashed with NULL a pointer dereference call trace.
This patch is to fix it by return -EINVAL when brport_attr->show
is null, just the same as the check for brport_attr->store in
brport_store().
Fixes: 9cf637473c85 ("bridge: add sysfs hook to flush forwarding table")
Reported-by: Xiong Zhou <xzhou(a)redhat.com>
Signed-off-by: Xin Long <lucien.xin(a)gmail.com>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
net/bridge/br_sysfs_if.c | 3 +++
1 file changed, 3 insertions(+)
--- a/net/bridge/br_sysfs_if.c
+++ b/net/bridge/br_sysfs_if.c
@@ -229,6 +229,9 @@ static ssize_t brport_show(struct kobjec
struct brport_attribute *brport_attr = to_brport_attr(attr);
struct net_bridge_port *p = to_brport(kobj);
+ if (!brport_attr->show)
+ return -EINVAL;
+
return brport_attr->show(p, buf);
}
Patches currently in stable-queue which might be from lucien.xin(a)gmail.com are
queue-4.4/bridge-check-brport-attr-show-in-brport_show.patch
This is a note to let you know that I've just added the patch titled
x86/mm: Give each mm TLB flush generation a unique ID
to the 4.9-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
x86-mm-give-each-mm-tlb-flush-generation-a-unique-id.patch
and it can be found in the queue-4.9 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From f39681ed0f48498b80455095376f11535feea332 Mon Sep 17 00:00:00 2001
From: Andy Lutomirski <luto(a)kernel.org>
Date: Thu, 29 Jun 2017 08:53:15 -0700
Subject: x86/mm: Give each mm TLB flush generation a unique ID
From: Andy Lutomirski <luto(a)kernel.org>
commit f39681ed0f48498b80455095376f11535feea332 upstream.
This adds two new variables to mmu_context_t: ctx_id and tlb_gen.
ctx_id uniquely identifies the mm_struct and will never be reused.
For a given mm_struct (and hence ctx_id), tlb_gen is a monotonic
count of the number of times that a TLB flush has been requested.
The pair (ctx_id, tlb_gen) can be used as an identifier for TLB
flush actions and will be used in subsequent patches to reliably
determine whether all needed TLB flushes have occurred on a given
CPU.
This patch is split out for ease of review. By itself, it has no
real effect other than creating and updating the new variables.
Signed-off-by: Andy Lutomirski <luto(a)kernel.org>
Reviewed-by: Nadav Amit <nadav.amit(a)gmail.com>
Reviewed-by: Thomas Gleixner <tglx(a)linutronix.de>
Cc: Andrew Morton <akpm(a)linux-foundation.org>
Cc: Arjan van de Ven <arjan(a)linux.intel.com>
Cc: Borislav Petkov <bp(a)alien8.de>
Cc: Dave Hansen <dave.hansen(a)intel.com>
Cc: Linus Torvalds <torvalds(a)linux-foundation.org>
Cc: Mel Gorman <mgorman(a)suse.de>
Cc: Peter Zijlstra <peterz(a)infradead.org>
Cc: Rik van Riel <riel(a)redhat.com>
Cc: linux-mm(a)kvack.org
Link: http://lkml.kernel.org/r/413a91c24dab3ed0caa5f4e4d017d87b0857f920.149875120…
Signed-off-by: Ingo Molnar <mingo(a)kernel.org>
Signed-off-by: Tim Chen <tim.c.chen(a)linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
arch/x86/include/asm/mmu.h | 15 +++++++++++++--
arch/x86/include/asm/mmu_context.h | 5 +++++
arch/x86/mm/tlb.c | 2 ++
3 files changed, 20 insertions(+), 2 deletions(-)
--- a/arch/x86/include/asm/mmu.h
+++ b/arch/x86/include/asm/mmu.h
@@ -3,12 +3,18 @@
#include <linux/spinlock.h>
#include <linux/mutex.h>
+#include <linux/atomic.h>
/*
- * The x86 doesn't have a mmu context, but
- * we put the segment information here.
+ * x86 has arch-specific MMU state beyond what lives in mm_struct.
*/
typedef struct {
+ /*
+ * ctx_id uniquely identifies this mm_struct. A ctx_id will never
+ * be reused, and zero is not a valid ctx_id.
+ */
+ u64 ctx_id;
+
#ifdef CONFIG_MODIFY_LDT_SYSCALL
struct ldt_struct *ldt;
#endif
@@ -33,6 +39,11 @@ typedef struct {
#endif
} mm_context_t;
+#define INIT_MM_CONTEXT(mm) \
+ .context = { \
+ .ctx_id = 1, \
+ }
+
void leave_mm(int cpu);
#endif /* _ASM_X86_MMU_H */
--- a/arch/x86/include/asm/mmu_context.h
+++ b/arch/x86/include/asm/mmu_context.h
@@ -12,6 +12,9 @@
#include <asm/tlbflush.h>
#include <asm/paravirt.h>
#include <asm/mpx.h>
+
+extern atomic64_t last_mm_ctx_id;
+
#ifndef CONFIG_PARAVIRT
static inline void paravirt_activate_mm(struct mm_struct *prev,
struct mm_struct *next)
@@ -106,6 +109,8 @@ static inline void enter_lazy_tlb(struct
static inline int init_new_context(struct task_struct *tsk,
struct mm_struct *mm)
{
+ mm->context.ctx_id = atomic64_inc_return(&last_mm_ctx_id);
+
#ifdef CONFIG_X86_INTEL_MEMORY_PROTECTION_KEYS
if (cpu_feature_enabled(X86_FEATURE_OSPKE)) {
/* pkey 0 is the default and always allocated */
--- a/arch/x86/mm/tlb.c
+++ b/arch/x86/mm/tlb.c
@@ -29,6 +29,8 @@
* Implement flush IPI by CALL_FUNCTION_VECTOR, Alex Shi
*/
+atomic64_t last_mm_ctx_id = ATOMIC64_INIT(1);
+
struct flush_tlb_info {
struct mm_struct *flush_mm;
unsigned long flush_start;
Patches currently in stable-queue which might be from luto(a)kernel.org are
queue-4.9/nospec-allow-index-argument-to-have-const-qualified-type.patch
queue-4.9/x86-speculation-use-indirect-branch-prediction-barrier-in-context-switch.patch
queue-4.9/x86-mm-give-each-mm-tlb-flush-generation-a-unique-id.patch
This is a note to let you know that I've just added the patch titled
udplite: fix partial checksum initialization
to the 4.9-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
udplite-fix-partial-checksum-initialization.patch
and it can be found in the queue-4.9 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Thu Mar 8 06:55:02 PST 2018
From: Alexey Kodanev <alexey.kodanev(a)oracle.com>
Date: Thu, 15 Feb 2018 20:18:43 +0300
Subject: udplite: fix partial checksum initialization
From: Alexey Kodanev <alexey.kodanev(a)oracle.com>
[ Upstream commit 15f35d49c93f4fa9875235e7bf3e3783d2dd7a1b ]
Since UDP-Lite is always using checksum, the following path is
triggered when calculating pseudo header for it:
udp4_csum_init() or udp6_csum_init()
skb_checksum_init_zero_check()
__skb_checksum_validate_complete()
The problem can appear if skb->len is less than CHECKSUM_BREAK. In
this particular case __skb_checksum_validate_complete() also invokes
__skb_checksum_complete(skb). If UDP-Lite is using partial checksum
that covers only part of a packet, the function will return bad
checksum and the packet will be dropped.
It can be fixed if we skip skb_checksum_init_zero_check() and only
set the required pseudo header checksum for UDP-Lite with partial
checksum before udp4_csum_init()/udp6_csum_init() functions return.
Fixes: ed70fcfcee95 ("net: Call skb_checksum_init in IPv4")
Fixes: e4f45b7f40bd ("net: Call skb_checksum_init in IPv6")
Signed-off-by: Alexey Kodanev <alexey.kodanev(a)oracle.com>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
include/net/udplite.h | 1 +
net/ipv4/udp.c | 5 +++++
net/ipv6/ip6_checksum.c | 5 +++++
3 files changed, 11 insertions(+)
--- a/include/net/udplite.h
+++ b/include/net/udplite.h
@@ -62,6 +62,7 @@ static inline int udplite_checksum_init(
UDP_SKB_CB(skb)->cscov = cscov;
if (skb->ip_summed == CHECKSUM_COMPLETE)
skb->ip_summed = CHECKSUM_NONE;
+ skb->csum_valid = 0;
}
return 0;
--- a/net/ipv4/udp.c
+++ b/net/ipv4/udp.c
@@ -1713,6 +1713,11 @@ static inline int udp4_csum_init(struct
err = udplite_checksum_init(skb, uh);
if (err)
return err;
+
+ if (UDP_SKB_CB(skb)->partial_cov) {
+ skb->csum = inet_compute_pseudo(skb, proto);
+ return 0;
+ }
}
/* Note, we are only interested in != 0 or == 0, thus the
--- a/net/ipv6/ip6_checksum.c
+++ b/net/ipv6/ip6_checksum.c
@@ -72,6 +72,11 @@ int udp6_csum_init(struct sk_buff *skb,
err = udplite_checksum_init(skb, uh);
if (err)
return err;
+
+ if (UDP_SKB_CB(skb)->partial_cov) {
+ skb->csum = ip6_compute_pseudo(skb, proto);
+ return 0;
+ }
}
/* To support RFC 6936 (allow zero checksum in UDP/IPV6 for tunnels)
Patches currently in stable-queue which might be from alexey.kodanev(a)oracle.com are
queue-4.9/sctp-fix-dst-refcnt-leak-in-sctp_v6_get_dst.patch
queue-4.9/udplite-fix-partial-checksum-initialization.patch
queue-4.9/sctp-verify-size-of-a-new-chunk-in-_sctp_make_chunk.patch
This is a note to let you know that I've just added the patch titled
tcp_bbr: better deal with suboptimal GSO
to the 4.9-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
tcp_bbr-better-deal-with-suboptimal-gso.patch
and it can be found in the queue-4.9 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Thu Mar 8 06:55:02 PST 2018
From: Eric Dumazet <edumazet(a)google.com>
Date: Wed, 21 Feb 2018 06:43:03 -0800
Subject: tcp_bbr: better deal with suboptimal GSO
From: Eric Dumazet <edumazet(a)google.com>
[ Upstream commit 350c9f484bde93ef229682eedd98cd5f74350f7f ]
BBR uses tcp_tso_autosize() in an attempt to probe what would be the
burst sizes and to adjust cwnd in bbr_target_cwnd() with following
gold formula :
/* Allow enough full-sized skbs in flight to utilize end systems. */
cwnd += 3 * bbr->tso_segs_goal;
But GSO can be lacking or be constrained to very small
units (ip link set dev ... gso_max_segs 2)
What we really want is to have enough packets in flight so that both
GSO and GRO are efficient.
So in the case GSO is off or downgraded, we still want to have the same
number of packets in flight as if GSO/TSO was fully operational, so
that GRO can hopefully be working efficiently.
To fix this issue, we make tcp_tso_autosize() unaware of
sk->sk_gso_max_segs
Only tcp_tso_segs() has to enforce the gso_max_segs limit.
Tested:
ethtool -K eth0 tso off gso off
tc qd replace dev eth0 root pfifo_fast
Before patch:
for f in {1..5}; do ./super_netperf 1 -H lpaa24 -- -K bbr; done
691 (ss -temoi shows cwnd is stuck around 6 )
667
651
631
517
After patch :
# for f in {1..5}; do ./super_netperf 1 -H lpaa24 -- -K bbr; done
1733 (ss -temoi shows cwnd is around 386 )
1778
1746
1781
1718
Fixes: 0f8782ea1497 ("tcp_bbr: add BBR congestion control")
Signed-off-by: Eric Dumazet <edumazet(a)google.com>
Reported-by: Oleksandr Natalenko <oleksandr(a)natalenko.name>
Acked-by: Neal Cardwell <ncardwell(a)google.com>
Acked-by: Soheil Hassas Yeganeh <soheil(a)google.com>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
net/ipv4/tcp_output.c | 8 +++++---
1 file changed, 5 insertions(+), 3 deletions(-)
--- a/net/ipv4/tcp_output.c
+++ b/net/ipv4/tcp_output.c
@@ -1580,7 +1580,7 @@ u32 tcp_tso_autosize(const struct sock *
*/
segs = max_t(u32, bytes / mss_now, min_tso_segs);
- return min_t(u32, segs, sk->sk_gso_max_segs);
+ return segs;
}
EXPORT_SYMBOL(tcp_tso_autosize);
@@ -1592,8 +1592,10 @@ static u32 tcp_tso_segs(struct sock *sk,
const struct tcp_congestion_ops *ca_ops = inet_csk(sk)->icsk_ca_ops;
u32 tso_segs = ca_ops->tso_segs_goal ? ca_ops->tso_segs_goal(sk) : 0;
- return tso_segs ? :
- tcp_tso_autosize(sk, mss_now, sysctl_tcp_min_tso_segs);
+ if (!tso_segs)
+ tso_segs = tcp_tso_autosize(sk, mss_now,
+ sysctl_tcp_min_tso_segs);
+ return min_t(u32, tso_segs, sk->sk_gso_max_segs);
}
/* Returns the portion of skb which can be sent right away */
Patches currently in stable-queue which might be from edumazet(a)google.com are
queue-4.9/tcp_bbr-better-deal-with-suboptimal-gso.patch
This is a note to let you know that I've just added the patch titled
sctp: verify size of a new chunk in _sctp_make_chunk()
to the 4.9-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
sctp-verify-size-of-a-new-chunk-in-_sctp_make_chunk.patch
and it can be found in the queue-4.9 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Thu Mar 8 06:55:02 PST 2018
From: Alexey Kodanev <alexey.kodanev(a)oracle.com>
Date: Fri, 9 Feb 2018 17:35:23 +0300
Subject: sctp: verify size of a new chunk in _sctp_make_chunk()
From: Alexey Kodanev <alexey.kodanev(a)oracle.com>
[ Upstream commit 07f2c7ab6f8d0a7e7c5764c4e6cc9c52951b9d9c ]
When SCTP makes INIT or INIT_ACK packet the total chunk length
can exceed SCTP_MAX_CHUNK_LEN which leads to kernel panic when
transmitting these packets, e.g. the crash on sending INIT_ACK:
[ 597.804948] skbuff: skb_over_panic: text:00000000ffae06e4 len:120168
put:120156 head:000000007aa47635 data:00000000d991c2de
tail:0x1d640 end:0xfec0 dev:<NULL>
...
[ 597.976970] ------------[ cut here ]------------
[ 598.033408] kernel BUG at net/core/skbuff.c:104!
[ 600.314841] Call Trace:
[ 600.345829] <IRQ>
[ 600.371639] ? sctp_packet_transmit+0x2095/0x26d0 [sctp]
[ 600.436934] skb_put+0x16c/0x200
[ 600.477295] sctp_packet_transmit+0x2095/0x26d0 [sctp]
[ 600.540630] ? sctp_packet_config+0x890/0x890 [sctp]
[ 600.601781] ? __sctp_packet_append_chunk+0x3b4/0xd00 [sctp]
[ 600.671356] ? sctp_cmp_addr_exact+0x3f/0x90 [sctp]
[ 600.731482] sctp_outq_flush+0x663/0x30d0 [sctp]
[ 600.788565] ? sctp_make_init+0xbf0/0xbf0 [sctp]
[ 600.845555] ? sctp_check_transmitted+0x18f0/0x18f0 [sctp]
[ 600.912945] ? sctp_outq_tail+0x631/0x9d0 [sctp]
[ 600.969936] sctp_cmd_interpreter.isra.22+0x3be1/0x5cb0 [sctp]
[ 601.041593] ? sctp_sf_do_5_1B_init+0x85f/0xc30 [sctp]
[ 601.104837] ? sctp_generate_t1_cookie_event+0x20/0x20 [sctp]
[ 601.175436] ? sctp_eat_data+0x1710/0x1710 [sctp]
[ 601.233575] sctp_do_sm+0x182/0x560 [sctp]
[ 601.284328] ? sctp_has_association+0x70/0x70 [sctp]
[ 601.345586] ? sctp_rcv+0xef4/0x32f0 [sctp]
[ 601.397478] ? sctp6_rcv+0xa/0x20 [sctp]
...
Here the chunk size for INIT_ACK packet becomes too big, mostly
because of the state cookie (INIT packet has large size with
many address parameters), plus additional server parameters.
Later this chunk causes the panic in skb_put_data():
skb_packet_transmit()
sctp_packet_pack()
skb_put_data(nskb, chunk->skb->data, chunk->skb->len);
'nskb' (head skb) was previously allocated with packet->size
from u16 'chunk->chunk_hdr->length'.
As suggested by Marcelo we should check the chunk's length in
_sctp_make_chunk() before trying to allocate skb for it and
discard a chunk if its size bigger than SCTP_MAX_CHUNK_LEN.
Signed-off-by: Alexey Kodanev <alexey.kodanev(a)oracle.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leinter(a)gmail.com>
Acked-by: Neil Horman <nhorman(a)tuxdriver.com>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
net/sctp/sm_make_chunk.c | 7 ++++++-
1 file changed, 6 insertions(+), 1 deletion(-)
--- a/net/sctp/sm_make_chunk.c
+++ b/net/sctp/sm_make_chunk.c
@@ -1373,9 +1373,14 @@ static struct sctp_chunk *_sctp_make_chu
sctp_chunkhdr_t *chunk_hdr;
struct sk_buff *skb;
struct sock *sk;
+ int chunklen;
+
+ chunklen = SCTP_PAD4(sizeof(*chunk_hdr) + paylen);
+ if (chunklen > SCTP_MAX_CHUNK_LEN)
+ goto nodata;
/* No need to allocate LL here, as this is only a chunk. */
- skb = alloc_skb(SCTP_PAD4(sizeof(sctp_chunkhdr_t) + paylen), gfp);
+ skb = alloc_skb(chunklen, gfp);
if (!skb)
goto nodata;
Patches currently in stable-queue which might be from alexey.kodanev(a)oracle.com are
queue-4.9/sctp-fix-dst-refcnt-leak-in-sctp_v6_get_dst.patch
queue-4.9/udplite-fix-partial-checksum-initialization.patch
queue-4.9/sctp-verify-size-of-a-new-chunk-in-_sctp_make_chunk.patch
This is a note to let you know that I've just added the patch titled
tcp: Honor the eor bit in tcp_mtu_probe
to the 4.9-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
tcp-honor-the-eor-bit-in-tcp_mtu_probe.patch
and it can be found in the queue-4.9 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Thu Mar 8 06:55:02 PST 2018
From: Ilya Lesokhin <ilyal(a)mellanox.com>
Date: Mon, 12 Feb 2018 12:57:04 +0200
Subject: tcp: Honor the eor bit in tcp_mtu_probe
From: Ilya Lesokhin <ilyal(a)mellanox.com>
[ Upstream commit 808cf9e38cd7923036a99f459ccc8cf2955e47af ]
Avoid SKB coalescing if eor bit is set in one of the relevant
SKBs.
Fixes: c134ecb87817 ("tcp: Make use of MSG_EOR in tcp_sendmsg")
Signed-off-by: Ilya Lesokhin <ilyal(a)mellanox.com>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
net/ipv4/tcp_output.c | 25 +++++++++++++++++++++++++
1 file changed, 25 insertions(+)
--- a/net/ipv4/tcp_output.c
+++ b/net/ipv4/tcp_output.c
@@ -1907,6 +1907,24 @@ static inline void tcp_mtu_check_reprobe
}
}
+static bool tcp_can_coalesce_send_queue_head(struct sock *sk, int len)
+{
+ struct sk_buff *skb, *next;
+
+ skb = tcp_send_head(sk);
+ tcp_for_write_queue_from_safe(skb, next, sk) {
+ if (len <= skb->len)
+ break;
+
+ if (unlikely(TCP_SKB_CB(skb)->eor))
+ return false;
+
+ len -= skb->len;
+ }
+
+ return true;
+}
+
/* Create a new MTU probe if we are ready.
* MTU probe is regularly attempting to increase the path MTU by
* deliberately sending larger packets. This discovers routing
@@ -1979,6 +1997,9 @@ static int tcp_mtu_probe(struct sock *sk
return 0;
}
+ if (!tcp_can_coalesce_send_queue_head(sk, probe_size))
+ return -1;
+
/* We're allowed to probe. Build it now. */
nskb = sk_stream_alloc_skb(sk, probe_size, GFP_ATOMIC, false);
if (!nskb)
@@ -2014,6 +2035,10 @@ static int tcp_mtu_probe(struct sock *sk
/* We've eaten all the data from this skb.
* Throw it away. */
TCP_SKB_CB(nskb)->tcp_flags |= TCP_SKB_CB(skb)->tcp_flags;
+ /* If this is the last SKB we copy and eor is set
+ * we need to propagate it to the new skb.
+ */
+ TCP_SKB_CB(nskb)->eor = TCP_SKB_CB(skb)->eor;
tcp_unlink_write_queue(skb, sk);
sk_wmem_free_skb(sk, skb);
} else {
Patches currently in stable-queue which might be from ilyal(a)mellanox.com are
queue-4.9/tcp-honor-the-eor-bit-in-tcp_mtu_probe.patch
This is a note to let you know that I've just added the patch titled
sctp: fix dst refcnt leak in sctp_v6_get_dst()
to the 4.9-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
sctp-fix-dst-refcnt-leak-in-sctp_v6_get_dst.patch
and it can be found in the queue-4.9 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Thu Mar 8 06:55:02 PST 2018
From: Alexey Kodanev <alexey.kodanev(a)oracle.com>
Date: Mon, 5 Feb 2018 15:10:35 +0300
Subject: sctp: fix dst refcnt leak in sctp_v6_get_dst()
From: Alexey Kodanev <alexey.kodanev(a)oracle.com>
[ Upstream commit 957d761cf91cdbb175ad7d8f5472336a4d54dbf2 ]
When going through the bind address list in sctp_v6_get_dst() and
the previously found address is better ('matchlen > bmatchlen'),
the code continues to the next iteration without releasing currently
held destination.
Fix it by releasing 'bdst' before continue to the next iteration, and
instead of introducing one more '!IS_ERR(bdst)' check for dst_release(),
move the already existed one right after ip6_dst_lookup_flow(), i.e. we
shouldn't proceed further if we get an error for the route lookup.
Fixes: dbc2b5e9a09e ("sctp: fix src address selection if using secondary addresses for ipv6")
Signed-off-by: Alexey Kodanev <alexey.kodanev(a)oracle.com>
Acked-by: Neil Horman <nhorman(a)tuxdriver.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner(a)gmail.com>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
net/sctp/ipv6.c | 10 +++++++---
1 file changed, 7 insertions(+), 3 deletions(-)
--- a/net/sctp/ipv6.c
+++ b/net/sctp/ipv6.c
@@ -324,8 +324,10 @@ static void sctp_v6_get_dst(struct sctp_
final_p = fl6_update_dst(fl6, rcu_dereference(np->opt), &final);
bdst = ip6_dst_lookup_flow(sk, fl6, final_p);
- if (!IS_ERR(bdst) &&
- ipv6_chk_addr(dev_net(bdst->dev),
+ if (IS_ERR(bdst))
+ continue;
+
+ if (ipv6_chk_addr(dev_net(bdst->dev),
&laddr->a.v6.sin6_addr, bdst->dev, 1)) {
if (!IS_ERR_OR_NULL(dst))
dst_release(dst);
@@ -334,8 +336,10 @@ static void sctp_v6_get_dst(struct sctp_
}
bmatchlen = sctp_v6_addr_match_len(daddr, &laddr->a);
- if (matchlen > bmatchlen)
+ if (matchlen > bmatchlen) {
+ dst_release(bdst);
continue;
+ }
if (!IS_ERR_OR_NULL(dst))
dst_release(dst);
Patches currently in stable-queue which might be from alexey.kodanev(a)oracle.com are
queue-4.9/sctp-fix-dst-refcnt-leak-in-sctp_v6_get_dst.patch
queue-4.9/udplite-fix-partial-checksum-initialization.patch
queue-4.9/sctp-verify-size-of-a-new-chunk-in-_sctp_make_chunk.patch
This is a note to let you know that I've just added the patch titled
sctp: fix dst refcnt leak in sctp_v4_get_dst
to the 4.9-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
sctp-fix-dst-refcnt-leak-in-sctp_v4_get_dst.patch
and it can be found in the queue-4.9 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Thu Mar 8 06:55:02 PST 2018
From: Tommi Rantala <tommi.t.rantala(a)nokia.com>
Date: Mon, 5 Feb 2018 21:48:14 +0200
Subject: sctp: fix dst refcnt leak in sctp_v4_get_dst
From: Tommi Rantala <tommi.t.rantala(a)nokia.com>
[ Upstream commit 4a31a6b19f9ddf498c81f5c9b089742b7472a6f8 ]
Fix dst reference count leak in sctp_v4_get_dst() introduced in commit
410f03831 ("sctp: add routing output fallback"):
When walking the address_list, successive ip_route_output_key() calls
may return the same rt->dst with the reference incremented on each call.
The code would not decrement the dst refcount when the dst pointer was
identical from the previous iteration, causing the dst refcnt leak.
Testcase:
ip netns add TEST
ip netns exec TEST ip link set lo up
ip link add dummy0 type dummy
ip link add dummy1 type dummy
ip link add dummy2 type dummy
ip link set dev dummy0 netns TEST
ip link set dev dummy1 netns TEST
ip link set dev dummy2 netns TEST
ip netns exec TEST ip addr add 192.168.1.1/24 dev dummy0
ip netns exec TEST ip link set dummy0 up
ip netns exec TEST ip addr add 192.168.1.2/24 dev dummy1
ip netns exec TEST ip link set dummy1 up
ip netns exec TEST ip addr add 192.168.1.3/24 dev dummy2
ip netns exec TEST ip link set dummy2 up
ip netns exec TEST sctp_test -H 192.168.1.2 -P 20002 -h 192.168.1.1 -p 20000 -s -B 192.168.1.3
ip netns del TEST
In 4.4 and 4.9 kernels this results to:
[ 354.179591] unregister_netdevice: waiting for lo to become free. Usage count = 1
[ 364.419674] unregister_netdevice: waiting for lo to become free. Usage count = 1
[ 374.663664] unregister_netdevice: waiting for lo to become free. Usage count = 1
[ 384.903717] unregister_netdevice: waiting for lo to become free. Usage count = 1
[ 395.143724] unregister_netdevice: waiting for lo to become free. Usage count = 1
[ 405.383645] unregister_netdevice: waiting for lo to become free. Usage count = 1
...
Fixes: 410f03831 ("sctp: add routing output fallback")
Fixes: 0ca50d12f ("sctp: fix src address selection if using secondary addresses")
Signed-off-by: Tommi Rantala <tommi.t.rantala(a)nokia.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner(a)gmail.com>
Acked-by: Neil Horman <nhorman(a)tuxdriver.com>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
net/sctp/protocol.c | 10 ++++------
1 file changed, 4 insertions(+), 6 deletions(-)
--- a/net/sctp/protocol.c
+++ b/net/sctp/protocol.c
@@ -510,22 +510,20 @@ static void sctp_v4_get_dst(struct sctp_
if (IS_ERR(rt))
continue;
- if (!dst)
- dst = &rt->dst;
-
/* Ensure the src address belongs to the output
* interface.
*/
odev = __ip_dev_find(sock_net(sk), laddr->a.v4.sin_addr.s_addr,
false);
if (!odev || odev->ifindex != fl4->flowi4_oif) {
- if (&rt->dst != dst)
+ if (!dst)
+ dst = &rt->dst;
+ else
dst_release(&rt->dst);
continue;
}
- if (dst != &rt->dst)
- dst_release(dst);
+ dst_release(dst);
dst = &rt->dst;
break;
}
Patches currently in stable-queue which might be from tommi.t.rantala(a)nokia.com are
queue-4.9/sctp-fix-dst-refcnt-leak-in-sctp_v4_get_dst.patch
This is a note to let you know that I've just added the patch titled
s390/qeth: fix underestimated count of buffer elements
to the 4.9-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
s390-qeth-fix-underestimated-count-of-buffer-elements.patch
and it can be found in the queue-4.9 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Thu Mar 8 06:55:02 PST 2018
From: Ursula Braun <ubraun(a)linux.vnet.ibm.com>
Date: Fri, 9 Feb 2018 11:03:49 +0100
Subject: s390/qeth: fix underestimated count of buffer elements
From: Ursula Braun <ubraun(a)linux.vnet.ibm.com>
[ Upstream commit 89271c65edd599207dd982007900506283c90ae3 ]
For a memory range/skb where the last byte falls onto a page boundary
(ie. 'end' is of the form xxx...xxx001), the PFN_UP() part of the
calculation currently doesn't round up to the next PFN due to an
off-by-one error.
Thus qeth believes that the skb occupies one page less than it
actually does, and may select a IO buffer that doesn't have enough spare
buffer elements to fit all of the skb's data.
HW detects this as a malformed buffer descriptor, and raises an
exception which then triggers device recovery.
Fixes: 2863c61334aa ("qeth: refactor calculation of SBALE count")
Signed-off-by: Ursula Braun <ubraun(a)linux.vnet.ibm.com>
Signed-off-by: Julian Wiedmann <jwi(a)linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/s390/net/qeth_core.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/s390/net/qeth_core.h
+++ b/drivers/s390/net/qeth_core.h
@@ -849,7 +849,7 @@ struct qeth_trap_id {
*/
static inline int qeth_get_elements_for_range(addr_t start, addr_t end)
{
- return PFN_UP(end - 1) - PFN_DOWN(start);
+ return PFN_UP(end) - PFN_DOWN(start);
}
static inline int qeth_get_micros(void)
Patches currently in stable-queue which might be from ubraun(a)linux.vnet.ibm.com are
queue-4.9/s390-qeth-fix-underestimated-count-of-buffer-elements.patch
This is a note to let you know that I've just added the patch titled
s390/qeth: fix SETIP command handling
to the 4.9-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
s390-qeth-fix-setip-command-handling.patch
and it can be found in the queue-4.9 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Thu Mar 8 06:55:02 PST 2018
From: Julian Wiedmann <jwi(a)linux.vnet.ibm.com>
Date: Fri, 9 Feb 2018 11:03:50 +0100
Subject: s390/qeth: fix SETIP command handling
From: Julian Wiedmann <jwi(a)linux.vnet.ibm.com>
[ Upstream commit 1c5b2216fbb973a9410e0b06389740b5c1289171 ]
send_control_data() applies some special handling to SETIP v4 IPA
commands. But current code parses *all* command types for the SETIP
command code. Limit the command code check to IPA commands.
Fixes: 5b54e16f1a54 ("qeth: do not spin for SETIP ip assist command")
Signed-off-by: Julian Wiedmann <jwi(a)linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/s390/net/qeth_core.h | 5 +++++
drivers/s390/net/qeth_core_main.c | 14 ++++++++------
2 files changed, 13 insertions(+), 6 deletions(-)
--- a/drivers/s390/net/qeth_core.h
+++ b/drivers/s390/net/qeth_core.h
@@ -592,6 +592,11 @@ struct qeth_cmd_buffer {
void (*callback) (struct qeth_channel *, struct qeth_cmd_buffer *);
};
+static inline struct qeth_ipa_cmd *__ipa_cmd(struct qeth_cmd_buffer *iob)
+{
+ return (struct qeth_ipa_cmd *)(iob->data + IPA_PDU_HEADER_SIZE);
+}
+
/**
* definition of a qeth channel, used for read and write
*/
--- a/drivers/s390/net/qeth_core_main.c
+++ b/drivers/s390/net/qeth_core_main.c
@@ -2050,7 +2050,7 @@ int qeth_send_control_data(struct qeth_c
unsigned long flags;
struct qeth_reply *reply = NULL;
unsigned long timeout, event_timeout;
- struct qeth_ipa_cmd *cmd;
+ struct qeth_ipa_cmd *cmd = NULL;
QETH_CARD_TEXT(card, 2, "sendctl");
@@ -2077,10 +2077,13 @@ int qeth_send_control_data(struct qeth_c
while (atomic_cmpxchg(&card->write.irq_pending, 0, 1)) ;
qeth_prepare_control_data(card, len, iob);
- if (IS_IPA(iob->data))
+ if (IS_IPA(iob->data)) {
+ cmd = __ipa_cmd(iob);
event_timeout = QETH_IPA_TIMEOUT;
- else
+ } else {
event_timeout = QETH_TIMEOUT;
+ }
+
timeout = jiffies + event_timeout;
QETH_CARD_TEXT(card, 6, "noirqpnd");
@@ -2105,9 +2108,8 @@ int qeth_send_control_data(struct qeth_c
/* we have only one long running ipassist, since we can ensure
process context of this command we can sleep */
- cmd = (struct qeth_ipa_cmd *)(iob->data+IPA_PDU_HEADER_SIZE);
- if ((cmd->hdr.command == IPA_CMD_SETIP) &&
- (cmd->hdr.prot_version == QETH_PROT_IPV4)) {
+ if (cmd && cmd->hdr.command == IPA_CMD_SETIP &&
+ cmd->hdr.prot_version == QETH_PROT_IPV4) {
if (!wait_event_timeout(reply->wait_q,
atomic_read(&reply->received), event_timeout))
goto time_err;
Patches currently in stable-queue which might be from jwi(a)linux.vnet.ibm.com are
queue-4.9/s390-qeth-fix-setip-command-handling.patch
queue-4.9/s390-qeth-fix-ip-address-lookup-for-l3-devices.patch
queue-4.9/s390-qeth-fix-ipa-command-submission-race.patch
queue-4.9/s390-qeth-fix-overestimated-count-of-buffer-elements.patch
queue-4.9/s390-qeth-fix-double-free-on-ip-add-remove-race.patch
queue-4.9/s390-qeth-fix-ip-removal-on-offline-cards.patch
queue-4.9/s390-qeth-fix-underestimated-count-of-buffer-elements.patch
This is a note to let you know that I've just added the patch titled
s390/qeth: fix overestimated count of buffer elements
to the 4.9-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
s390-qeth-fix-overestimated-count-of-buffer-elements.patch
and it can be found in the queue-4.9 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Thu Mar 8 06:55:02 PST 2018
From: Julian Wiedmann <jwi(a)linux.vnet.ibm.com>
Date: Tue, 27 Feb 2018 18:58:12 +0100
Subject: s390/qeth: fix overestimated count of buffer elements
From: Julian Wiedmann <jwi(a)linux.vnet.ibm.com>
[ Upstream commit 12472af89632beb1ed8dea29d4efe208ca05b06a ]
qeth_get_elements_for_range() doesn't know how to handle a 0-length
range (ie. start == end), and returns 1 when it should return 0.
Such ranges occur on TSO skbs, where the L2/L3/L4 headers (and thus all
of the skb's linear data) are skipped when mapping the skb into regular
buffer elements.
This overestimation may cause several performance-related issues:
1. sub-optimal IO buffer selection, where the next buffer gets selected
even though the skb would actually still fit into the current buffer.
2. forced linearization, if the element count for a non-linear skb
exceeds QETH_MAX_BUFFER_ELEMENTS.
Rather than modifying qeth_get_elements_for_range() and adding overhead
to every caller, fix up those callers that are in risk of passing a
0-length range.
Fixes: 2863c61334aa ("qeth: refactor calculation of SBALE count")
Signed-off-by: Julian Wiedmann <jwi(a)linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/s390/net/qeth_core_main.c | 10 ++++++----
drivers/s390/net/qeth_l3_main.c | 11 ++++++-----
2 files changed, 12 insertions(+), 9 deletions(-)
--- a/drivers/s390/net/qeth_core_main.c
+++ b/drivers/s390/net/qeth_core_main.c
@@ -3854,10 +3854,12 @@ EXPORT_SYMBOL_GPL(qeth_get_elements_for_
int qeth_get_elements_no(struct qeth_card *card,
struct sk_buff *skb, int extra_elems, int data_offset)
{
- int elements = qeth_get_elements_for_range(
- (addr_t)skb->data + data_offset,
- (addr_t)skb->data + skb_headlen(skb)) +
- qeth_get_elements_for_frags(skb);
+ addr_t end = (addr_t)skb->data + skb_headlen(skb);
+ int elements = qeth_get_elements_for_frags(skb);
+ addr_t start = (addr_t)skb->data + data_offset;
+
+ if (start != end)
+ elements += qeth_get_elements_for_range(start, end);
if ((elements + extra_elems) > QETH_MAX_BUFFER_ELEMENTS(card)) {
QETH_DBF_MESSAGE(2, "Invalid size of IP packet "
--- a/drivers/s390/net/qeth_l3_main.c
+++ b/drivers/s390/net/qeth_l3_main.c
@@ -2784,11 +2784,12 @@ static void qeth_tso_fill_header(struct
static int qeth_l3_get_elements_no_tso(struct qeth_card *card,
struct sk_buff *skb, int extra_elems)
{
- addr_t tcpdptr = (addr_t)tcp_hdr(skb) + tcp_hdrlen(skb);
- int elements = qeth_get_elements_for_range(
- tcpdptr,
- (addr_t)skb->data + skb_headlen(skb)) +
- qeth_get_elements_for_frags(skb);
+ addr_t start = (addr_t)tcp_hdr(skb) + tcp_hdrlen(skb);
+ addr_t end = (addr_t)skb->data + skb_headlen(skb);
+ int elements = qeth_get_elements_for_frags(skb);
+
+ if (start != end)
+ elements += qeth_get_elements_for_range(start, end);
if ((elements + extra_elems) > QETH_MAX_BUFFER_ELEMENTS(card)) {
QETH_DBF_MESSAGE(2,
Patches currently in stable-queue which might be from jwi(a)linux.vnet.ibm.com are
queue-4.9/s390-qeth-fix-setip-command-handling.patch
queue-4.9/s390-qeth-fix-ip-address-lookup-for-l3-devices.patch
queue-4.9/s390-qeth-fix-ipa-command-submission-race.patch
queue-4.9/s390-qeth-fix-overestimated-count-of-buffer-elements.patch
queue-4.9/s390-qeth-fix-double-free-on-ip-add-remove-race.patch
queue-4.9/s390-qeth-fix-ip-removal-on-offline-cards.patch
queue-4.9/s390-qeth-fix-underestimated-count-of-buffer-elements.patch
This is a note to let you know that I've just added the patch titled
s390/qeth: fix IPA command submission race
to the 4.9-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
s390-qeth-fix-ipa-command-submission-race.patch
and it can be found in the queue-4.9 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Thu Mar 8 06:55:02 PST 2018
From: Julian Wiedmann <jwi(a)linux.vnet.ibm.com>
Date: Tue, 27 Feb 2018 18:58:17 +0100
Subject: s390/qeth: fix IPA command submission race
From: Julian Wiedmann <jwi(a)linux.vnet.ibm.com>
[ Upstream commit d22ffb5a712f9211ffd104c38fc17cbfb1b5e2b0 ]
If multiple IPA commands are build & sent out concurrently,
fill_ipacmd_header() may assign a seqno value to a command that's
different from what send_control_data() later assigns to this command's
reply.
This is due to other commands passing through send_control_data(),
and incrementing card->seqno.ipa along the way.
So one IPA command has no reply that's waiting for its seqno, while some
other IPA command has multiple reply objects waiting for it.
Only one of those waiting replies wins, and the other(s) times out and
triggers a recovery via send_ipa_cmd().
Fix this by making sure that the same seqno value is assigned to
a command and its reply object.
Do so immediately before submitting the command & while holding the
irq_pending "lock", to produce nicely ascending seqnos.
As a side effect, *all* IPA commands now use a reply object that's
waiting for its actual seqno. Previously, early IPA commands that were
submitted while the card was still DOWN used the "catch-all" IDX seqno.
Signed-off-by: Julian Wiedmann <jwi(a)linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/s390/net/qeth_core_main.c | 19 ++++++++++---------
1 file changed, 10 insertions(+), 9 deletions(-)
--- a/drivers/s390/net/qeth_core_main.c
+++ b/drivers/s390/net/qeth_core_main.c
@@ -2064,25 +2064,26 @@ int qeth_send_control_data(struct qeth_c
}
reply->callback = reply_cb;
reply->param = reply_param;
- if (card->state == CARD_STATE_DOWN)
- reply->seqno = QETH_IDX_COMMAND_SEQNO;
- else
- reply->seqno = card->seqno.ipa++;
+
init_waitqueue_head(&reply->wait_q);
- spin_lock_irqsave(&card->lock, flags);
- list_add_tail(&reply->list, &card->cmd_waiter_list);
- spin_unlock_irqrestore(&card->lock, flags);
QETH_DBF_HEX(CTRL, 2, iob->data, QETH_DBF_CTRL_LEN);
while (atomic_cmpxchg(&card->write.irq_pending, 0, 1)) ;
- qeth_prepare_control_data(card, len, iob);
if (IS_IPA(iob->data)) {
cmd = __ipa_cmd(iob);
+ cmd->hdr.seqno = card->seqno.ipa++;
+ reply->seqno = cmd->hdr.seqno;
event_timeout = QETH_IPA_TIMEOUT;
} else {
+ reply->seqno = QETH_IDX_COMMAND_SEQNO;
event_timeout = QETH_TIMEOUT;
}
+ qeth_prepare_control_data(card, len, iob);
+
+ spin_lock_irqsave(&card->lock, flags);
+ list_add_tail(&reply->list, &card->cmd_waiter_list);
+ spin_unlock_irqrestore(&card->lock, flags);
timeout = jiffies + event_timeout;
@@ -2873,7 +2874,7 @@ static void qeth_fill_ipacmd_header(stru
memset(cmd, 0, sizeof(struct qeth_ipa_cmd));
cmd->hdr.command = command;
cmd->hdr.initiator = IPA_CMD_INITIATOR_HOST;
- cmd->hdr.seqno = card->seqno.ipa;
+ /* cmd->hdr.seqno is set by qeth_send_control_data() */
cmd->hdr.adapter_type = qeth_get_ipa_adp_type(card->info.link_type);
cmd->hdr.rel_adapter_no = (__u8) card->info.portno;
if (card->options.layer2)
Patches currently in stable-queue which might be from jwi(a)linux.vnet.ibm.com are
queue-4.9/s390-qeth-fix-setip-command-handling.patch
queue-4.9/s390-qeth-fix-ip-address-lookup-for-l3-devices.patch
queue-4.9/s390-qeth-fix-ipa-command-submission-race.patch
queue-4.9/s390-qeth-fix-overestimated-count-of-buffer-elements.patch
queue-4.9/s390-qeth-fix-double-free-on-ip-add-remove-race.patch
queue-4.9/s390-qeth-fix-ip-removal-on-offline-cards.patch
queue-4.9/s390-qeth-fix-underestimated-count-of-buffer-elements.patch
This is a note to let you know that I've just added the patch titled
s390/qeth: fix IP removal on offline cards
to the 4.9-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
s390-qeth-fix-ip-removal-on-offline-cards.patch
and it can be found in the queue-4.9 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Thu Mar 8 06:55:02 PST 2018
From: Julian Wiedmann <jwi(a)linux.vnet.ibm.com>
Date: Tue, 27 Feb 2018 18:58:13 +0100
Subject: s390/qeth: fix IP removal on offline cards
From: Julian Wiedmann <jwi(a)linux.vnet.ibm.com>
[ Upstream commit 98d823ab1fbdcb13abc25b420f9bb71bade42056 ]
If the HW is not reachable, then none of the IPs in qeth's internal
table has been registered with the HW yet. So when deleting such an IP,
there's no need to stage it for deregistration - just drop it from
the table.
This fixes the "add-delete-add" scenario on an offline card, where the
the second "add" merely increments the IP's use count. But as the IP is
still set to DISP_ADDR_DELETE from the previous "delete" step,
l3_recover_ip() won't register it with the HW when the card goes online.
Fixes: 5f78e29ceebf ("qeth: optimize IP handling in rx_mode callback")
Signed-off-by: Julian Wiedmann <jwi(a)linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/s390/net/qeth_l3_main.c | 14 +++-----------
1 file changed, 3 insertions(+), 11 deletions(-)
--- a/drivers/s390/net/qeth_l3_main.c
+++ b/drivers/s390/net/qeth_l3_main.c
@@ -259,12 +259,8 @@ int qeth_l3_delete_ip(struct qeth_card *
if (addr->in_progress)
return -EINPROGRESS;
- if (!qeth_card_hw_is_reachable(card)) {
- addr->disp_flag = QETH_DISP_ADDR_DELETE;
- return 0;
- }
-
- rc = qeth_l3_deregister_addr_entry(card, addr);
+ if (qeth_card_hw_is_reachable(card))
+ rc = qeth_l3_deregister_addr_entry(card, addr);
hash_del(&addr->hnode);
kfree(addr);
@@ -406,11 +402,7 @@ static void qeth_l3_recover_ip(struct qe
spin_lock_bh(&card->ip_lock);
hash_for_each_safe(card->ip_htable, i, tmp, addr, hnode) {
- if (addr->disp_flag == QETH_DISP_ADDR_DELETE) {
- qeth_l3_deregister_addr_entry(card, addr);
- hash_del(&addr->hnode);
- kfree(addr);
- } else if (addr->disp_flag == QETH_DISP_ADDR_ADD) {
+ if (addr->disp_flag == QETH_DISP_ADDR_ADD) {
if (addr->proto == QETH_PROT_IPV4) {
addr->in_progress = 1;
spin_unlock_bh(&card->ip_lock);
Patches currently in stable-queue which might be from jwi(a)linux.vnet.ibm.com are
queue-4.9/s390-qeth-fix-setip-command-handling.patch
queue-4.9/s390-qeth-fix-ip-address-lookup-for-l3-devices.patch
queue-4.9/s390-qeth-fix-ipa-command-submission-race.patch
queue-4.9/s390-qeth-fix-overestimated-count-of-buffer-elements.patch
queue-4.9/s390-qeth-fix-double-free-on-ip-add-remove-race.patch
queue-4.9/s390-qeth-fix-ip-removal-on-offline-cards.patch
queue-4.9/s390-qeth-fix-underestimated-count-of-buffer-elements.patch
This is a note to let you know that I've just added the patch titled
s390/qeth: fix IP address lookup for L3 devices
to the 4.9-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
s390-qeth-fix-ip-address-lookup-for-l3-devices.patch
and it can be found in the queue-4.9 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Thu Mar 8 06:55:02 PST 2018
From: Julian Wiedmann <jwi(a)linux.vnet.ibm.com>
Date: Tue, 27 Feb 2018 18:58:16 +0100
Subject: s390/qeth: fix IP address lookup for L3 devices
From: Julian Wiedmann <jwi(a)linux.vnet.ibm.com>
[ Upstream commit c5c48c58b259bb8f0482398370ee539d7a12df3e ]
Current code ("qeth_l3_ip_from_hash()") matches a queried address object
against objects in the IP table by IP address, Mask/Prefix Length and
MAC address ("qeth_l3_ipaddrs_is_equal()"). But what callers actually
require is either
a) "is this IP address registered" (ie. match by IP address only),
before adding a new address.
b) or "is this address object registered" (ie. match all relevant
attributes), before deleting an address.
Right now
1. the ADD path is too strict in its lookup, and eg. doesn't detect
conflicts between an existing NORMAL address and a new VIPA address
(because the NORMAL address will have mask != 0, while VIPA has
a mask == 0),
2. the DELETE path is not strict enough, and eg. allows del_rxip() to
delete a VIPA address as long as the IP address matches.
Fix all this by adding helpers (_addr_match_ip() and _addr_match_all())
that do the appropriate checking.
Note that the ADD path for NORMAL addresses is special, as qeth keeps
track of how many times such an address is in use (and there is no
immediate way of returning errors to the caller). So when a requested
NORMAL address _fully_ matches an existing one, it's not considered a
conflict and we merely increment the refcount.
Fixes: 5f78e29ceebf ("qeth: optimize IP handling in rx_mode callback")
Signed-off-by: Julian Wiedmann <jwi(a)linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/s390/net/qeth_l3.h | 34 ++++++++++++++
drivers/s390/net/qeth_l3_main.c | 91 ++++++++++++++++++----------------------
2 files changed, 74 insertions(+), 51 deletions(-)
--- a/drivers/s390/net/qeth_l3.h
+++ b/drivers/s390/net/qeth_l3.h
@@ -39,8 +39,40 @@ struct qeth_ipaddr {
unsigned int pfxlen;
} a6;
} u;
-
};
+
+static inline bool qeth_l3_addr_match_ip(struct qeth_ipaddr *a1,
+ struct qeth_ipaddr *a2)
+{
+ if (a1->proto != a2->proto)
+ return false;
+ if (a1->proto == QETH_PROT_IPV6)
+ return ipv6_addr_equal(&a1->u.a6.addr, &a2->u.a6.addr);
+ return a1->u.a4.addr == a2->u.a4.addr;
+}
+
+static inline bool qeth_l3_addr_match_all(struct qeth_ipaddr *a1,
+ struct qeth_ipaddr *a2)
+{
+ /* Assumes that the pair was obtained via qeth_l3_addr_find_by_ip(),
+ * so 'proto' and 'addr' match for sure.
+ *
+ * For ucast:
+ * - 'mac' is always 0.
+ * - 'mask'/'pfxlen' for RXIP/VIPA is always 0. For NORMAL, matching
+ * values are required to avoid mixups in takeover eligibility.
+ *
+ * For mcast,
+ * - 'mac' is mapped from the IP, and thus always matches.
+ * - 'mask'/'pfxlen' is always 0.
+ */
+ if (a1->type != a2->type)
+ return false;
+ if (a1->proto == QETH_PROT_IPV6)
+ return a1->u.a6.pfxlen == a2->u.a6.pfxlen;
+ return a1->u.a4.mask == a2->u.a4.mask;
+}
+
static inline u64 qeth_l3_ipaddr_hash(struct qeth_ipaddr *addr)
{
u64 ret = 0;
--- a/drivers/s390/net/qeth_l3_main.c
+++ b/drivers/s390/net/qeth_l3_main.c
@@ -154,6 +154,24 @@ int qeth_l3_string_to_ipaddr(const char
return -EINVAL;
}
+static struct qeth_ipaddr *qeth_l3_find_addr_by_ip(struct qeth_card *card,
+ struct qeth_ipaddr *query)
+{
+ u64 key = qeth_l3_ipaddr_hash(query);
+ struct qeth_ipaddr *addr;
+
+ if (query->is_multicast) {
+ hash_for_each_possible(card->ip_mc_htable, addr, hnode, key)
+ if (qeth_l3_addr_match_ip(addr, query))
+ return addr;
+ } else {
+ hash_for_each_possible(card->ip_htable, addr, hnode, key)
+ if (qeth_l3_addr_match_ip(addr, query))
+ return addr;
+ }
+ return NULL;
+}
+
static void qeth_l3_convert_addr_to_bits(u8 *addr, u8 *bits, int len)
{
int i, j;
@@ -207,34 +225,6 @@ static bool qeth_l3_is_addr_covered_by_i
return rc;
}
-inline int
-qeth_l3_ipaddrs_is_equal(struct qeth_ipaddr *addr1, struct qeth_ipaddr *addr2)
-{
- return addr1->proto == addr2->proto &&
- !memcmp(&addr1->u, &addr2->u, sizeof(addr1->u)) &&
- !memcmp(&addr1->mac, &addr2->mac, sizeof(addr1->mac));
-}
-
-static struct qeth_ipaddr *
-qeth_l3_ip_from_hash(struct qeth_card *card, struct qeth_ipaddr *tmp_addr)
-{
- struct qeth_ipaddr *addr;
-
- if (tmp_addr->is_multicast) {
- hash_for_each_possible(card->ip_mc_htable, addr,
- hnode, qeth_l3_ipaddr_hash(tmp_addr))
- if (qeth_l3_ipaddrs_is_equal(tmp_addr, addr))
- return addr;
- } else {
- hash_for_each_possible(card->ip_htable, addr,
- hnode, qeth_l3_ipaddr_hash(tmp_addr))
- if (qeth_l3_ipaddrs_is_equal(tmp_addr, addr))
- return addr;
- }
-
- return NULL;
-}
-
int qeth_l3_delete_ip(struct qeth_card *card, struct qeth_ipaddr *tmp_addr)
{
int rc = 0;
@@ -249,8 +239,8 @@ int qeth_l3_delete_ip(struct qeth_card *
QETH_CARD_HEX(card, 4, ((char *)&tmp_addr->u.a6.addr) + 8, 8);
}
- addr = qeth_l3_ip_from_hash(card, tmp_addr);
- if (!addr)
+ addr = qeth_l3_find_addr_by_ip(card, tmp_addr);
+ if (!addr || !qeth_l3_addr_match_all(addr, tmp_addr))
return -ENOENT;
addr->ref_counter--;
@@ -272,6 +262,7 @@ int qeth_l3_add_ip(struct qeth_card *car
{
int rc = 0;
struct qeth_ipaddr *addr;
+ char buf[40];
QETH_CARD_TEXT(card, 4, "addip");
@@ -282,8 +273,20 @@ int qeth_l3_add_ip(struct qeth_card *car
QETH_CARD_HEX(card, 4, ((char *)&tmp_addr->u.a6.addr) + 8, 8);
}
- addr = qeth_l3_ip_from_hash(card, tmp_addr);
- if (!addr) {
+ addr = qeth_l3_find_addr_by_ip(card, tmp_addr);
+ if (addr) {
+ if (tmp_addr->type != QETH_IP_TYPE_NORMAL)
+ return -EADDRINUSE;
+ if (qeth_l3_addr_match_all(addr, tmp_addr)) {
+ addr->ref_counter++;
+ return 0;
+ }
+ qeth_l3_ipaddr_to_string(tmp_addr->proto, (u8 *)&tmp_addr->u,
+ buf);
+ dev_warn(&card->gdev->dev,
+ "Registering IP address %s failed\n", buf);
+ return -EADDRINUSE;
+ } else {
addr = qeth_l3_get_addr_buffer(tmp_addr->proto);
if (!addr)
return -ENOMEM;
@@ -331,11 +334,7 @@ int qeth_l3_add_ip(struct qeth_card *car
hash_del(&addr->hnode);
kfree(addr);
}
- } else {
- if (addr->type == QETH_IP_TYPE_NORMAL)
- addr->ref_counter++;
}
-
return rc;
}
@@ -719,12 +718,7 @@ int qeth_l3_add_vipa(struct qeth_card *c
return -ENOMEM;
spin_lock_bh(&card->ip_lock);
-
- if (qeth_l3_ip_from_hash(card, ipaddr))
- rc = -EEXIST;
- else
- qeth_l3_add_ip(card, ipaddr);
-
+ rc = qeth_l3_add_ip(card, ipaddr);
spin_unlock_bh(&card->ip_lock);
kfree(ipaddr);
@@ -787,12 +781,7 @@ int qeth_l3_add_rxip(struct qeth_card *c
return -ENOMEM;
spin_lock_bh(&card->ip_lock);
-
- if (qeth_l3_ip_from_hash(card, ipaddr))
- rc = -EEXIST;
- else
- qeth_l3_add_ip(card, ipaddr);
-
+ rc = qeth_l3_add_ip(card, ipaddr);
spin_unlock_bh(&card->ip_lock);
kfree(ipaddr);
@@ -1437,8 +1426,9 @@ qeth_l3_add_mc_to_hash(struct qeth_card
memcpy(tmp->mac, buf, sizeof(tmp->mac));
tmp->is_multicast = 1;
- ipm = qeth_l3_ip_from_hash(card, tmp);
+ ipm = qeth_l3_find_addr_by_ip(card, tmp);
if (ipm) {
+ /* for mcast, by-IP match means full match */
ipm->disp_flag = QETH_DISP_ADDR_DO_NOTHING;
} else {
ipm = qeth_l3_get_addr_buffer(QETH_PROT_IPV4);
@@ -1521,8 +1511,9 @@ qeth_l3_add_mc6_to_hash(struct qeth_card
sizeof(struct in6_addr));
tmp->is_multicast = 1;
- ipm = qeth_l3_ip_from_hash(card, tmp);
+ ipm = qeth_l3_find_addr_by_ip(card, tmp);
if (ipm) {
+ /* for mcast, by-IP match means full match */
ipm->disp_flag = QETH_DISP_ADDR_DO_NOTHING;
continue;
}
Patches currently in stable-queue which might be from jwi(a)linux.vnet.ibm.com are
queue-4.9/s390-qeth-fix-setip-command-handling.patch
queue-4.9/s390-qeth-fix-ip-address-lookup-for-l3-devices.patch
queue-4.9/s390-qeth-fix-ipa-command-submission-race.patch
queue-4.9/s390-qeth-fix-overestimated-count-of-buffer-elements.patch
queue-4.9/s390-qeth-fix-double-free-on-ip-add-remove-race.patch
queue-4.9/s390-qeth-fix-ip-removal-on-offline-cards.patch
queue-4.9/s390-qeth-fix-underestimated-count-of-buffer-elements.patch
This is a note to let you know that I've just added the patch titled
s390/qeth: fix double-free on IP add/remove race
to the 4.9-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
s390-qeth-fix-double-free-on-ip-add-remove-race.patch
and it can be found in the queue-4.9 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Thu Mar 8 06:55:02 PST 2018
From: Julian Wiedmann <jwi(a)linux.vnet.ibm.com>
Date: Tue, 27 Feb 2018 18:58:14 +0100
Subject: s390/qeth: fix double-free on IP add/remove race
From: Julian Wiedmann <jwi(a)linux.vnet.ibm.com>
[ Upstream commit 14d066c3531a87f727968cacd85bd95c75f59843 ]
Registering an IPv4 address with the HW takes quite a while, so we
temporarily drop the ip_htable lock. Any concurrent add/remove of the
same IP adjusts the IP's use count, and (on remove) is then blocked by
addr->in_progress.
After the register call has completed, we check the use count for
concurrently attempted add/remove calls - and possibly straight-away
deregister the IP again. This happens via l3_delete_ip(), which
1) looks up the queried IP in the htable (getting a reference to the
*same* queried object),
2) deregisters the IP from the HW, and
3) frees the IP object.
The caller in l3_add_ip() then does a second free on the same object.
For this case, skip all the extra checks and lookups in l3_delete_ip()
and just deregister & free the IP object ourselves.
Fixes: 5f78e29ceebf ("qeth: optimize IP handling in rx_mode callback")
Signed-off-by: Julian Wiedmann <jwi(a)linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/s390/net/qeth_l3_main.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
--- a/drivers/s390/net/qeth_l3_main.c
+++ b/drivers/s390/net/qeth_l3_main.c
@@ -323,7 +323,8 @@ int qeth_l3_add_ip(struct qeth_card *car
(rc == IPA_RC_LAN_OFFLINE)) {
addr->disp_flag = QETH_DISP_ADDR_DO_NOTHING;
if (addr->ref_counter < 1) {
- qeth_l3_delete_ip(card, addr);
+ qeth_l3_deregister_addr_entry(card, addr);
+ hash_del(&addr->hnode);
kfree(addr);
}
} else {
Patches currently in stable-queue which might be from jwi(a)linux.vnet.ibm.com are
queue-4.9/s390-qeth-fix-setip-command-handling.patch
queue-4.9/s390-qeth-fix-ip-address-lookup-for-l3-devices.patch
queue-4.9/s390-qeth-fix-ipa-command-submission-race.patch
queue-4.9/s390-qeth-fix-overestimated-count-of-buffer-elements.patch
queue-4.9/s390-qeth-fix-double-free-on-ip-add-remove-race.patch
queue-4.9/s390-qeth-fix-ip-removal-on-offline-cards.patch
queue-4.9/s390-qeth-fix-underestimated-count-of-buffer-elements.patch
This is a note to let you know that I've just added the patch titled
rxrpc: Fix send in rxrpc_send_data_packet()
to the 4.9-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
rxrpc-fix-send-in-rxrpc_send_data_packet.patch
and it can be found in the queue-4.9 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Thu Mar 8 06:55:02 PST 2018
From: David Howells <dhowells(a)redhat.com>
Date: Thu, 22 Feb 2018 14:38:14 +0000
Subject: rxrpc: Fix send in rxrpc_send_data_packet()
From: David Howells <dhowells(a)redhat.com>
[ Upstream commit 93c62c45ed5fad1b87e3a45835b251cd68de9c46 ]
All the kernel_sendmsg() calls in rxrpc_send_data_packet() need to send
both parts of the iov[] buffer, but one of them does not. Fix it so that
it does.
Without this, short IPv6 rxrpc DATA packets may be seen that have the rxrpc
header included, but no payload.
Fixes: 5a924b8951f8 ("rxrpc: Don't store the rxrpc header in the Tx queue sk_buffs")
Reported-by: Marc Dionne <marc.dionne(a)auristor.com>
Signed-off-by: David Howells <dhowells(a)redhat.com>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
net/rxrpc/output.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
--- a/net/rxrpc/output.c
+++ b/net/rxrpc/output.c
@@ -391,7 +391,7 @@ send_fragmentable:
(char *)&opt, sizeof(opt));
if (ret == 0) {
ret = kernel_sendmsg(conn->params.local->socket, &msg,
- iov, 1, iov[0].iov_len);
+ iov, 2, len);
opt = IPV6_PMTUDISC_DO;
kernel_setsockopt(conn->params.local->socket,
Patches currently in stable-queue which might be from dhowells(a)redhat.com are
queue-4.9/rxrpc-fix-send-in-rxrpc_send_data_packet.patch
This is a note to let you know that I've just added the patch titled
ppp: prevent unregistered channels from connecting to PPP units
to the 4.9-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
ppp-prevent-unregistered-channels-from-connecting-to-ppp-units.patch
and it can be found in the queue-4.9 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Thu Mar 8 06:55:02 PST 2018
From: Guillaume Nault <g.nault(a)alphalink.fr>
Date: Fri, 2 Mar 2018 18:41:16 +0100
Subject: ppp: prevent unregistered channels from connecting to PPP units
From: Guillaume Nault <g.nault(a)alphalink.fr>
[ Upstream commit 77f840e3e5f09c6d7d727e85e6e08276dd813d11 ]
PPP units don't hold any reference on the channels connected to it.
It is the channel's responsibility to ensure that it disconnects from
its unit before being destroyed.
In practice, this is ensured by ppp_unregister_channel() disconnecting
the channel from the unit before dropping a reference on the channel.
However, it is possible for an unregistered channel to connect to a PPP
unit: register a channel with ppp_register_net_channel(), attach a
/dev/ppp file to it with ioctl(PPPIOCATTCHAN), unregister the channel
with ppp_unregister_channel() and finally connect the /dev/ppp file to
a PPP unit with ioctl(PPPIOCCONNECT).
Once in this situation, the channel is only held by the /dev/ppp file,
which can be released at anytime and free the channel without letting
the parent PPP unit know. Then the ppp structure ends up with dangling
pointers in its ->channels list.
Prevent this scenario by forbidding unregistered channels from
connecting to PPP units. This maintains the code logic by keeping
ppp_unregister_channel() responsible from disconnecting the channel if
necessary and avoids modification on the reference counting mechanism.
This issue seems to predate git history (successfully reproduced on
Linux 2.6.26 and earlier PPP commits are unrelated).
Signed-off-by: Guillaume Nault <g.nault(a)alphalink.fr>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/net/ppp/ppp_generic.c | 9 +++++++++
1 file changed, 9 insertions(+)
--- a/drivers/net/ppp/ppp_generic.c
+++ b/drivers/net/ppp/ppp_generic.c
@@ -3157,6 +3157,15 @@ ppp_connect_channel(struct channel *pch,
goto outl;
ppp_lock(ppp);
+ spin_lock_bh(&pch->downl);
+ if (!pch->chan) {
+ /* Don't connect unregistered channels */
+ spin_unlock_bh(&pch->downl);
+ ppp_unlock(ppp);
+ ret = -ENOTCONN;
+ goto outl;
+ }
+ spin_unlock_bh(&pch->downl);
if (pch->file.hdrlen > ppp->file.hdrlen)
ppp->file.hdrlen = pch->file.hdrlen;
hdrlen = pch->file.hdrlen + 2; /* for protocol bytes */
Patches currently in stable-queue which might be from g.nault(a)alphalink.fr are
queue-4.9/ppp-prevent-unregistered-channels-from-connecting-to-ppp-units.patch
This is a note to let you know that I've just added the patch titled
net: phy: fix phy_start to consider PHY_IGNORE_INTERRUPT
to the 4.9-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
net-phy-fix-phy_start-to-consider-phy_ignore_interrupt.patch
and it can be found in the queue-4.9 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Thu Mar 8 06:55:02 PST 2018
From: Heiner Kallweit <hkallweit1(a)gmail.com>
Date: Thu, 8 Feb 2018 21:01:48 +0100
Subject: net: phy: fix phy_start to consider PHY_IGNORE_INTERRUPT
From: Heiner Kallweit <hkallweit1(a)gmail.com>
[ Upstream commit 08f5138512180a479ce6b9d23b825c9f4cd3be77 ]
This condition wasn't adjusted when PHY_IGNORE_INTERRUPT (-2) was added
long ago. In case of PHY_IGNORE_INTERRUPT the MAC interrupt indicates
also PHY state changes and we should do what the symbol says.
Fixes: 84a527a41f38 ("net: phylib: fix interrupts re-enablement in phy_start")
Signed-off-by: Heiner Kallweit <hkallweit1(a)gmail.com>
Reviewed-by: Florian Fainelli <f.fainelli(a)gmail.com>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/net/phy/phy.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/net/phy/phy.c
+++ b/drivers/net/phy/phy.c
@@ -925,7 +925,7 @@ void phy_start(struct phy_device *phydev
break;
case PHY_HALTED:
/* make sure interrupts are re-enabled for the PHY */
- if (phydev->irq != PHY_POLL) {
+ if (phy_interrupt_is_valid(phydev)) {
err = phy_enable_interrupts(phydev);
if (err < 0)
break;
Patches currently in stable-queue which might be from hkallweit1(a)gmail.com are
queue-4.9/net-phy-fix-phy_start-to-consider-phy_ignore_interrupt.patch
This is a note to let you know that I've just added the patch titled
netlink: ensure to loop over all netns in genlmsg_multicast_allns()
to the 4.9-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
netlink-ensure-to-loop-over-all-netns-in-genlmsg_multicast_allns.patch
and it can be found in the queue-4.9 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Thu Mar 8 06:55:02 PST 2018
From: Nicolas Dichtel <nicolas.dichtel(a)6wind.com>
Date: Tue, 6 Feb 2018 14:48:32 +0100
Subject: netlink: ensure to loop over all netns in genlmsg_multicast_allns()
From: Nicolas Dichtel <nicolas.dichtel(a)6wind.com>
[ Upstream commit cb9f7a9a5c96a773bbc9c70660dc600cfff82f82 ]
Nowadays, nlmsg_multicast() returns only 0 or -ESRCH but this was not the
case when commit 134e63756d5f was pushed.
However, there was no reason to stop the loop if a netns does not have
listeners.
Returns -ESRCH only if there was no listeners in all netns.
To avoid having the same problem in the future, I didn't take the
assumption that nlmsg_multicast() returns only 0 or -ESRCH.
Fixes: 134e63756d5f ("genetlink: make netns aware")
CC: Johannes Berg <johannes.berg(a)intel.com>
Signed-off-by: Nicolas Dichtel <nicolas.dichtel(a)6wind.com>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
net/netlink/genetlink.c | 12 ++++++++++--
1 file changed, 10 insertions(+), 2 deletions(-)
--- a/net/netlink/genetlink.c
+++ b/net/netlink/genetlink.c
@@ -1103,6 +1103,7 @@ static int genlmsg_mcast(struct sk_buff
{
struct sk_buff *tmp;
struct net *net, *prev = NULL;
+ bool delivered = false;
int err;
for_each_net_rcu(net) {
@@ -1114,14 +1115,21 @@ static int genlmsg_mcast(struct sk_buff
}
err = nlmsg_multicast(prev->genl_sock, tmp,
portid, group, flags);
- if (err)
+ if (!err)
+ delivered = true;
+ else if (err != -ESRCH)
goto error;
}
prev = net;
}
- return nlmsg_multicast(prev->genl_sock, skb, portid, group, flags);
+ err = nlmsg_multicast(prev->genl_sock, skb, portid, group, flags);
+ if (!err)
+ delivered = true;
+ else if (err != -ESRCH)
+ goto error;
+ return delivered ? 0 : -ESRCH;
error:
kfree_skb(skb);
return err;
Patches currently in stable-queue which might be from nicolas.dichtel(a)6wind.com are
queue-4.9/netlink-ensure-to-loop-over-all-netns-in-genlmsg_multicast_allns.patch
This is a note to let you know that I've just added the patch titled
net: ipv4: don't allow setting net.ipv4.route.min_pmtu below 68
to the 4.9-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
net-ipv4-don-t-allow-setting-net.ipv4.route.min_pmtu-below-68.patch
and it can be found in the queue-4.9 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Thu Mar 8 06:55:02 PST 2018
From: Sabrina Dubroca <sd(a)queasysnail.net>
Date: Mon, 26 Feb 2018 16:13:43 +0100
Subject: net: ipv4: don't allow setting net.ipv4.route.min_pmtu below 68
From: Sabrina Dubroca <sd(a)queasysnail.net>
[ Upstream commit c7272c2f1229125f74f22dcdd59de9bbd804f1c8 ]
According to RFC 1191 sections 3 and 4, ICMP frag-needed messages
indicating an MTU below 68 should be rejected:
A host MUST never reduce its estimate of the Path MTU below 68
octets.
and (talking about ICMP frag-needed's Next-Hop MTU field):
This field will never contain a value less than 68, since every
router "must be able to forward a datagram of 68 octets without
fragmentation".
Furthermore, by letting net.ipv4.route.min_pmtu be set to negative
values, we can end up with a very large PMTU when (-1) is cast into u32.
Let's also make ip_rt_min_pmtu a u32, since it's only ever compared to
unsigned ints.
Reported-by: Jianlin Shi <jishi(a)redhat.com>
Signed-off-by: Sabrina Dubroca <sd(a)queasysnail.net>
Reviewed-by: Stefano Brivio <sbrivio(a)redhat.com>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
net/ipv4/route.c | 8 ++++++--
1 file changed, 6 insertions(+), 2 deletions(-)
--- a/net/ipv4/route.c
+++ b/net/ipv4/route.c
@@ -126,10 +126,13 @@ static int ip_rt_redirect_silence __read
static int ip_rt_error_cost __read_mostly = HZ;
static int ip_rt_error_burst __read_mostly = 5 * HZ;
static int ip_rt_mtu_expires __read_mostly = 10 * 60 * HZ;
-static int ip_rt_min_pmtu __read_mostly = 512 + 20 + 20;
+static u32 ip_rt_min_pmtu __read_mostly = 512 + 20 + 20;
static int ip_rt_min_advmss __read_mostly = 256;
static int ip_rt_gc_timeout __read_mostly = RT_GC_TIMEOUT;
+
+static int ip_min_valid_pmtu __read_mostly = IPV4_MIN_MTU;
+
/*
* Interface to generic destination cache.
*/
@@ -2772,7 +2775,8 @@ static struct ctl_table ipv4_route_table
.data = &ip_rt_min_pmtu,
.maxlen = sizeof(int),
.mode = 0644,
- .proc_handler = proc_dointvec,
+ .proc_handler = proc_dointvec_minmax,
+ .extra1 = &ip_min_valid_pmtu,
},
{
.procname = "min_adv_mss",
Patches currently in stable-queue which might be from sd(a)queasysnail.net are
queue-4.9/net-ipv4-don-t-allow-setting-net.ipv4.route.min_pmtu-below-68.patch
This is a note to let you know that I've just added the patch titled
net: fix race on decreasing number of TX queues
to the 4.9-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
net-fix-race-on-decreasing-number-of-tx-queues.patch
and it can be found in the queue-4.9 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Thu Mar 8 06:55:02 PST 2018
From: Jakub Kicinski <jakub.kicinski(a)netronome.com>
Date: Mon, 12 Feb 2018 21:35:31 -0800
Subject: net: fix race on decreasing number of TX queues
From: Jakub Kicinski <jakub.kicinski(a)netronome.com>
[ Upstream commit ac5b70198adc25c73fba28de4f78adcee8f6be0b ]
netif_set_real_num_tx_queues() can be called when netdev is up.
That usually happens when user requests change of number of
channels/rings with ethtool -L. The procedure for changing
the number of queues involves resetting the qdiscs and setting
dev->num_tx_queues to the new value. When the new value is
lower than the old one, extra care has to be taken to ensure
ordering of accesses to the number of queues vs qdisc reset.
Currently the queues are reset before new dev->num_tx_queues
is assigned, leaving a window of time where packets can be
enqueued onto the queues going down, leading to a likely
crash in the drivers, since most drivers don't check if TX
skbs are assigned to an active queue.
Fixes: e6484930d7c7 ("net: allocate tx queues in register_netdevice")
Signed-off-by: Jakub Kicinski <jakub.kicinski(a)netronome.com>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
net/core/dev.c | 11 +++++++++--
1 file changed, 9 insertions(+), 2 deletions(-)
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -2199,8 +2199,11 @@ EXPORT_SYMBOL(netif_set_xps_queue);
*/
int netif_set_real_num_tx_queues(struct net_device *dev, unsigned int txq)
{
+ bool disabling;
int rc;
+ disabling = txq < dev->real_num_tx_queues;
+
if (txq < 1 || txq > dev->num_tx_queues)
return -EINVAL;
@@ -2216,15 +2219,19 @@ int netif_set_real_num_tx_queues(struct
if (dev->num_tc)
netif_setup_tc(dev, txq);
- if (txq < dev->real_num_tx_queues) {
+ dev->real_num_tx_queues = txq;
+
+ if (disabling) {
+ synchronize_net();
qdisc_reset_all_tx_gt(dev, txq);
#ifdef CONFIG_XPS
netif_reset_xps_queues_gt(dev, txq);
#endif
}
+ } else {
+ dev->real_num_tx_queues = txq;
}
- dev->real_num_tx_queues = txq;
return 0;
}
EXPORT_SYMBOL(netif_set_real_num_tx_queues);
Patches currently in stable-queue which might be from jakub.kicinski(a)netronome.com are
queue-4.9/net-fix-race-on-decreasing-number-of-tx-queues.patch
This is a note to let you know that I've just added the patch titled
mlxsw: spectrum_switchdev: Check success of FDB add operation
to the 4.9-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
mlxsw-spectrum_switchdev-check-success-of-fdb-add-operation.patch
and it can be found in the queue-4.9 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Thu Mar 8 06:55:02 PST 2018
From: Shalom Toledo <shalomt(a)mellanox.com>
Date: Thu, 1 Mar 2018 11:37:05 +0100
Subject: mlxsw: spectrum_switchdev: Check success of FDB add operation
From: Shalom Toledo <shalomt(a)mellanox.com>
[ Upstream commit 0a8a1bf17e3af34f1f8d2368916a6327f8b3bfd5 ]
Until now, we assumed that in case of error when adding FDB entries, the
write operation will fail, but this is not the case. Instead, we need to
check that the number of entries reported in the response is equal to
the number of entries specified in the request.
Fixes: 56ade8fe3fe1 ("mlxsw: spectrum: Add initial support for Spectrum ASIC")
Reported-by: Ido Schimmel <idosch(a)mellanox.com>
Signed-off-by: Shalom Toledo <shalomt(a)mellanox.com>
Reviewed-by: Ido Schimmel <idosch(a)mellanox.com>
Signed-off-by: Jiri Pirko <jiri(a)mellanox.com>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/net/ethernet/mellanox/mlxsw/spectrum_switchdev.c | 29 +++++++++++++--
1 file changed, 27 insertions(+), 2 deletions(-)
--- a/drivers/net/ethernet/mellanox/mlxsw/spectrum_switchdev.c
+++ b/drivers/net/ethernet/mellanox/mlxsw/spectrum_switchdev.c
@@ -809,6 +809,7 @@ static int __mlxsw_sp_port_fdb_uc_op(str
bool dynamic)
{
char *sfd_pl;
+ u8 num_rec;
int err;
sfd_pl = kmalloc(MLXSW_REG_SFD_LEN, GFP_KERNEL);
@@ -818,9 +819,16 @@ static int __mlxsw_sp_port_fdb_uc_op(str
mlxsw_reg_sfd_pack(sfd_pl, mlxsw_sp_sfd_op(adding), 0);
mlxsw_reg_sfd_uc_pack(sfd_pl, 0, mlxsw_sp_sfd_rec_policy(dynamic),
mac, fid, action, local_port);
+ num_rec = mlxsw_reg_sfd_num_rec_get(sfd_pl);
err = mlxsw_reg_write(mlxsw_sp->core, MLXSW_REG(sfd), sfd_pl);
- kfree(sfd_pl);
+ if (err)
+ goto out;
+
+ if (num_rec != mlxsw_reg_sfd_num_rec_get(sfd_pl))
+ err = -EBUSY;
+out:
+ kfree(sfd_pl);
return err;
}
@@ -845,6 +853,7 @@ static int mlxsw_sp_port_fdb_uc_lag_op(s
bool adding, bool dynamic)
{
char *sfd_pl;
+ u8 num_rec;
int err;
sfd_pl = kmalloc(MLXSW_REG_SFD_LEN, GFP_KERNEL);
@@ -855,9 +864,16 @@ static int mlxsw_sp_port_fdb_uc_lag_op(s
mlxsw_reg_sfd_uc_lag_pack(sfd_pl, 0, mlxsw_sp_sfd_rec_policy(dynamic),
mac, fid, MLXSW_REG_SFD_REC_ACTION_NOP,
lag_vid, lag_id);
+ num_rec = mlxsw_reg_sfd_num_rec_get(sfd_pl);
err = mlxsw_reg_write(mlxsw_sp->core, MLXSW_REG(sfd), sfd_pl);
- kfree(sfd_pl);
+ if (err)
+ goto out;
+ if (num_rec != mlxsw_reg_sfd_num_rec_get(sfd_pl))
+ err = -EBUSY;
+
+out:
+ kfree(sfd_pl);
return err;
}
@@ -891,6 +907,7 @@ static int mlxsw_sp_port_mdb_op(struct m
u16 fid, u16 mid, bool adding)
{
char *sfd_pl;
+ u8 num_rec;
int err;
sfd_pl = kmalloc(MLXSW_REG_SFD_LEN, GFP_KERNEL);
@@ -900,7 +917,15 @@ static int mlxsw_sp_port_mdb_op(struct m
mlxsw_reg_sfd_pack(sfd_pl, mlxsw_sp_sfd_op(adding), 0);
mlxsw_reg_sfd_mc_pack(sfd_pl, 0, addr, fid,
MLXSW_REG_SFD_REC_ACTION_NOP, mid);
+ num_rec = mlxsw_reg_sfd_num_rec_get(sfd_pl);
err = mlxsw_reg_write(mlxsw_sp->core, MLXSW_REG(sfd), sfd_pl);
+ if (err)
+ goto out;
+
+ if (num_rec != mlxsw_reg_sfd_num_rec_get(sfd_pl))
+ err = -EBUSY;
+
+out:
kfree(sfd_pl);
return err;
}
Patches currently in stable-queue which might be from shalomt(a)mellanox.com are
queue-4.9/mlxsw-spectrum_switchdev-check-success-of-fdb-add-operation.patch
This is a note to let you know that I've just added the patch titled
ipv6 sit: work around bogus gcc-8 -Wrestrict warning
to the 4.9-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
ipv6-sit-work-around-bogus-gcc-8-wrestrict-warning.patch
and it can be found in the queue-4.9 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Thu Mar 8 06:55:02 PST 2018
From: Arnd Bergmann <arnd(a)arndb.de>
Date: Thu, 22 Feb 2018 16:55:34 +0100
Subject: ipv6 sit: work around bogus gcc-8 -Wrestrict warning
From: Arnd Bergmann <arnd(a)arndb.de>
[ Upstream commit ca79bec237f5809a7c3c59bd41cd0880aa889966 ]
gcc-8 has a new warning that detects overlapping input and output arguments
in memcpy(). It triggers for sit_init_net() calling ipip6_tunnel_clone_6rd(),
which is actually correct:
net/ipv6/sit.c: In function 'sit_init_net':
net/ipv6/sit.c:192:3: error: 'memcpy' source argument is the same as destination [-Werror=restrict]
The problem here is that the logic detecting the memcpy() arguments finds them
to be the same, but the conditional that tests for the input and output of
ipip6_tunnel_clone_6rd() to be identical is not a compile-time constant.
We know that netdev_priv(t->dev) is the same as t for a tunnel device,
and comparing "dev" directly here lets the compiler figure out as well
that 'dev == sitn->fb_tunnel_dev' when called from sit_init_net(), so
it no longer warns.
This code is old, so Cc stable to make sure that we don't get the warning
for older kernels built with new gcc.
Cc: Martin Sebor <msebor(a)gmail.com>
Link: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83456
Signed-off-by: Arnd Bergmann <arnd(a)arndb.de>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
net/ipv6/sit.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
--- a/net/ipv6/sit.c
+++ b/net/ipv6/sit.c
@@ -176,7 +176,7 @@ static void ipip6_tunnel_clone_6rd(struc
#ifdef CONFIG_IPV6_SIT_6RD
struct ip_tunnel *t = netdev_priv(dev);
- if (t->dev == sitn->fb_tunnel_dev) {
+ if (dev == sitn->fb_tunnel_dev) {
ipv6_addr_set(&t->ip6rd.prefix, htonl(0x20020000), 0, 0, 0);
t->ip6rd.relay_prefix = 0;
t->ip6rd.prefixlen = 16;
Patches currently in stable-queue which might be from arnd(a)arndb.de are
queue-4.9/tpm-constify-transmit-data-pointers.patch
queue-4.9/ipv6-sit-work-around-bogus-gcc-8-wrestrict-warning.patch
queue-4.9/arm-kvm-fix-building-with-gcc-8.patch
This is a note to let you know that I've just added the patch titled
hdlc_ppp: carrier detect ok, don't turn off negotiation
to the 4.9-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
hdlc_ppp-carrier-detect-ok-don-t-turn-off-negotiation.patch
and it can be found in the queue-4.9 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Thu Mar 8 06:55:02 PST 2018
From: Denis Du <dudenis2000(a)yahoo.ca>
Date: Sat, 24 Feb 2018 16:51:42 -0500
Subject: hdlc_ppp: carrier detect ok, don't turn off negotiation
From: Denis Du <dudenis2000(a)yahoo.ca>
[ Upstream commit b6c3bad1ba83af1062a7ff6986d9edc4f3d7fc8e ]
Sometimes when physical lines have a just good noise to make the protocol
handshaking fail, but the carrier detect still good. Then after remove of
the noise, nobody will trigger this protocol to be start again to cause
the link to never come back. The fix is when the carrier is still on, not
terminate the protocol handshaking.
Signed-off-by: Denis Du <dudenis2000(a)yahoo.ca>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/net/wan/hdlc_ppp.c | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)
--- a/drivers/net/wan/hdlc_ppp.c
+++ b/drivers/net/wan/hdlc_ppp.c
@@ -574,7 +574,10 @@ static void ppp_timer(unsigned long arg)
ppp_cp_event(proto->dev, proto->pid, TO_GOOD, 0, 0,
0, NULL);
proto->restart_counter--;
- } else
+ } else if (netif_carrier_ok(proto->dev))
+ ppp_cp_event(proto->dev, proto->pid, TO_GOOD, 0, 0,
+ 0, NULL);
+ else
ppp_cp_event(proto->dev, proto->pid, TO_BAD, 0, 0,
0, NULL);
break;
Patches currently in stable-queue which might be from dudenis2000(a)yahoo.ca are
queue-4.9/hdlc_ppp-carrier-detect-ok-don-t-turn-off-negotiation.patch
This is a note to let you know that I've just added the patch titled
fib_semantics: Don't match route with mismatching tclassid
to the 4.9-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
fib_semantics-don-t-match-route-with-mismatching-tclassid.patch
and it can be found in the queue-4.9 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Thu Mar 8 06:55:02 PST 2018
From: Stefano Brivio <sbrivio(a)redhat.com>
Date: Thu, 15 Feb 2018 09:46:03 +0100
Subject: fib_semantics: Don't match route with mismatching tclassid
From: Stefano Brivio <sbrivio(a)redhat.com>
[ Upstream commit a8c6db1dfd1b1d18359241372bb204054f2c3174 ]
In fib_nh_match(), if output interface or gateway are passed in
the FIB configuration, we don't have to check next hops of
multipath routes to conclude whether we have a match or not.
However, we might still have routes with different realms
matching the same output interface and gateway configuration,
and this needs to cause the match to fail. Otherwise the first
route inserted in the FIB will match, regardless of the realms:
# ip route add 1.1.1.1 dev eth0 table 1234 realms 1/2
# ip route append 1.1.1.1 dev eth0 table 1234 realms 3/4
# ip route list table 1234
1.1.1.1 dev eth0 scope link realms 1/2
1.1.1.1 dev eth0 scope link realms 3/4
# ip route del 1.1.1.1 dev ens3 table 1234 realms 3/4
# ip route list table 1234
1.1.1.1 dev ens3 scope link realms 3/4
whereas route with realms 3/4 should have been deleted instead.
Explicitly check for fc_flow passed in the FIB configuration
(this comes from RTA_FLOW extracted by rtm_to_fib_config()) and
fail matching if it differs from nh_tclassid.
The handling of RTA_FLOW for multipath routes later in
fib_nh_match() is still needed, as we can have multiple RTA_FLOW
attributes that need to be matched against the tclassid of each
next hop.
v2: Check that fc_flow is set before discarding the match, so
that the user can still select the first matching rule by
not specifying any realm, as suggested by David Ahern.
Reported-by: Jianlin Shi <jishi(a)redhat.com>
Signed-off-by: Stefano Brivio <sbrivio(a)redhat.com>
Acked-by: David Ahern <dsahern(a)gmail.com>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
net/ipv4/fib_semantics.c | 5 +++++
1 file changed, 5 insertions(+)
--- a/net/ipv4/fib_semantics.c
+++ b/net/ipv4/fib_semantics.c
@@ -640,6 +640,11 @@ int fib_nh_match(struct fib_config *cfg,
fi->fib_nh, cfg))
return 1;
}
+#ifdef CONFIG_IP_ROUTE_CLASSID
+ if (cfg->fc_flow &&
+ cfg->fc_flow != fi->fib_nh->nh_tclassid)
+ return 1;
+#endif
if ((!cfg->fc_oif || cfg->fc_oif == fi->fib_nh->nh_oif) &&
(!cfg->fc_gw || cfg->fc_gw == fi->fib_nh->nh_gw))
return 0;
Patches currently in stable-queue which might be from sbrivio(a)redhat.com are
queue-4.9/net-ipv4-don-t-allow-setting-net.ipv4.route.min_pmtu-below-68.patch
queue-4.9/fib_semantics-don-t-match-route-with-mismatching-tclassid.patch
This is a note to let you know that I've just added the patch titled
bridge: check brport attr show in brport_show
to the 4.9-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
bridge-check-brport-attr-show-in-brport_show.patch
and it can be found in the queue-4.9 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Thu Mar 8 06:55:02 PST 2018
From: Xin Long <lucien.xin(a)gmail.com>
Date: Mon, 12 Feb 2018 17:15:40 +0800
Subject: bridge: check brport attr show in brport_show
From: Xin Long <lucien.xin(a)gmail.com>
[ Upstream commit 1b12580af1d0677c3c3a19e35bfe5d59b03f737f ]
Now br_sysfs_if file flush doesn't have attr show. To read it will
cause kernel panic after users chmod u+r this file.
Xiong found this issue when running the commands:
ip link add br0 type bridge
ip link add type veth
ip link set veth0 master br0
chmod u+r /sys/devices/virtual/net/veth0/brport/flush
timeout 3 cat /sys/devices/virtual/net/veth0/brport/flush
kernel crashed with NULL a pointer dereference call trace.
This patch is to fix it by return -EINVAL when brport_attr->show
is null, just the same as the check for brport_attr->store in
brport_store().
Fixes: 9cf637473c85 ("bridge: add sysfs hook to flush forwarding table")
Reported-by: Xiong Zhou <xzhou(a)redhat.com>
Signed-off-by: Xin Long <lucien.xin(a)gmail.com>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
net/bridge/br_sysfs_if.c | 3 +++
1 file changed, 3 insertions(+)
--- a/net/bridge/br_sysfs_if.c
+++ b/net/bridge/br_sysfs_if.c
@@ -230,6 +230,9 @@ static ssize_t brport_show(struct kobjec
struct brport_attribute *brport_attr = to_brport_attr(attr);
struct net_bridge_port *p = to_brport(kobj);
+ if (!brport_attr->show)
+ return -EINVAL;
+
return brport_attr->show(p, buf);
}
Patches currently in stable-queue which might be from lucien.xin(a)gmail.com are
queue-4.9/bridge-check-brport-attr-show-in-brport_show.patch
This is a note to let you know that I've just added the patch titled
usb: host: xhci-rcar: add support for r8a77965
to my usb git tree which can be found at
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb.git
in the usb-linus branch.
The patch will show up in the next release of the linux-next tree
(usually sometime within the next 24 hours during the week.)
The patch will hopefully also be merged in Linus's tree for the
next -rc kernel release.
If you have any questions about this process, please let me know.
>From 015dbeb2282030bf56762e21d25f09422edfd750 Mon Sep 17 00:00:00 2001
From: Yoshihiro Shimoda <yoshihiro.shimoda.uh(a)renesas.com>
Date: Tue, 27 Feb 2018 17:15:20 +0900
Subject: usb: host: xhci-rcar: add support for r8a77965
This patch adds support for r8a77965 (R-Car M3-N).
Signed-off-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh(a)renesas.com>
Reviewed-by: Simon Horman <horms+renesas(a)verge.net.au>
Reviewed-by: Rob Herring <robh(a)kernel.org>
Cc: stable <stable(a)vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
Documentation/devicetree/bindings/usb/usb-xhci.txt | 1 +
drivers/usb/host/xhci-rcar.c | 4 ++++
2 files changed, 5 insertions(+)
diff --git a/Documentation/devicetree/bindings/usb/usb-xhci.txt b/Documentation/devicetree/bindings/usb/usb-xhci.txt
index e2ea59bbca93..1651483a7048 100644
--- a/Documentation/devicetree/bindings/usb/usb-xhci.txt
+++ b/Documentation/devicetree/bindings/usb/usb-xhci.txt
@@ -13,6 +13,7 @@ Required properties:
- "renesas,xhci-r8a7793" for r8a7793 SoC
- "renesas,xhci-r8a7795" for r8a7795 SoC
- "renesas,xhci-r8a7796" for r8a7796 SoC
+ - "renesas,xhci-r8a77965" for r8a77965 SoC
- "renesas,rcar-gen2-xhci" for a generic R-Car Gen2 or RZ/G1 compatible
device
- "renesas,rcar-gen3-xhci" for a generic R-Car Gen3 compatible device
diff --git a/drivers/usb/host/xhci-rcar.c b/drivers/usb/host/xhci-rcar.c
index f0b559660007..f33ffc2bc4ed 100644
--- a/drivers/usb/host/xhci-rcar.c
+++ b/drivers/usb/host/xhci-rcar.c
@@ -83,6 +83,10 @@ static const struct soc_device_attribute rcar_quirks_match[] = {
.soc_id = "r8a7796",
.data = (void *)RCAR_XHCI_FIRMWARE_V3,
},
+ {
+ .soc_id = "r8a77965",
+ .data = (void *)RCAR_XHCI_FIRMWARE_V3,
+ },
{ /* sentinel */ },
};
--
2.16.2
This is the start of the stable review cycle for the 4.15.8 release.
There are 122 patches in this series, all will be posted as a response
to this one. If anyone has any issues with these being applied, please
let me know.
Responses should be made by Fri Mar 9 19:16:43 UTC 2018.
Anything received after that time might be too late.
The whole patch series can be found in one patch at:
https://www.kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.15.8-rc1…
or in the git tree and branch at:
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-4.15.y
and the diffstat can be found below.
thanks,
greg k-h
-------------
Pseudo-Shortlog of commits:
Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Linux 4.15.8-rc1
NeilBrown <neilb(a)suse.com>
md: only allow remove_and_add_spares when no sync_thread running.
Nicholas Piggin <npiggin(a)gmail.com>
powerpc/64s/radix: Boot-time NULL pointer protection using a guard-PID
Adam Ford <aford173(a)gmail.com>
ARM: dts: LogicPD Torpedo: Fix I2C1 pinmux
Adam Ford <aford173(a)gmail.com>
ARM: dts: LogicPD SOM-LV: Fix I2C1 pinmux
Kai Heng Feng <kai.heng.feng(a)canonical.com>
ACPI / bus: Parse tables as term_list for Dell XPS 9570 and Precision M5530
Eric Biggers <ebiggers(a)google.com>
KVM/x86: remove WARN_ON() for when vm_munmap() fails
Radim Krčmář <rkrcmar(a)redhat.com>
KVM: x86: fix vcpu initialization with userspace lapic
Paolo Bonzini <pbonzini(a)redhat.com>
KVM/VMX: Optimize vmx_vcpu_run() and svm_vcpu_run() by marking the RDMSR path as unlikely()
Paolo Bonzini <pbonzini(a)redhat.com>
KVM: x86: move LAPIC initialization after VMCS creation
Paolo Bonzini <pbonzini(a)redhat.com>
KVM/x86: Remove indirect MSR op calls from SPEC_CTRL
Wanpeng Li <wanpeng.li(a)hotmail.com>
KVM: mmu: Fix overlap between public and private memslots
Wanpeng Li <wanpengli(a)tencent.com>
KVM: X86: Fix SMRAM accessing even if VM is shutdown
Arnd Bergmann <arnd(a)arndb.de>
ARM: kvm: fix building with gcc-8
Ulf Magnusson <ulfalizer(a)gmail.com>
ARM: mvebu: Fix broken PL310_ERRATA_753970 selects
Daniel Schultz <d.schultz(a)phytec.de>
ARM: dts: rockchip: Remove 1.8 GHz operation point from phycore som
Arnd Bergmann <arnd(a)arndb.de>
ARM: orion: fix orion_ge00_switch_board_info initialization
Jan Beulich <JBeulich(a)suse.com>
x86/mm: Fix {pmd,pud}_{set,clear}_flags()
Rasmus Villemoes <linux(a)rasmusvillemoes.dk>
nospec: Allow index argument to have const-qualified type
David Hildenbrand <david(a)redhat.com>
KVM: s390: consider epoch index on TOD clock syncs
David Hildenbrand <david(a)redhat.com>
KVM: s390: consider epoch index on hotplugged CPUs
David Hildenbrand <david(a)redhat.com>
KVM: s390: provide only a single function for setting the tod (fix SCK)
David Hildenbrand <david(a)redhat.com>
KVM: s390: take care of clock-comparator sign control
Anna Karbownik <anna.karbownik(a)intel.com>
EDAC, sb_edac: Fix out of bound writes during DIMM configuration on KNL
Mauro Carvalho Chehab <mchehab(a)kernel.org>
media: m88ds3103: don't call a non-initalized function
Ming Lei <ming.lei(a)redhat.com>
blk-mq: don't call io sched's .requeue_request when requeueing rq to ->dispatch
Yuchung Cheng <ycheng(a)google.com>
tcp: revert F-RTO extension to detect more spurious timeouts
Yuchung Cheng <ycheng(a)google.com>
tcp: revert F-RTO middle-box workaround
Julian Wiedmann <jwi(a)linux.vnet.ibm.com>
s390/qeth: fix IPA command submission race
Julian Wiedmann <jwi(a)linux.vnet.ibm.com>
s390/qeth: fix IP address lookup for L3 devices
Julian Wiedmann <jwi(a)linux.vnet.ibm.com>
Revert "s390/qeth: fix using of ref counter for rxip addresses"
Julian Wiedmann <jwi(a)linux.vnet.ibm.com>
s390/qeth: fix double-free on IP add/remove race
Julian Wiedmann <jwi(a)linux.vnet.ibm.com>
s390/qeth: fix IP removal on offline cards
Julian Wiedmann <jwi(a)linux.vnet.ibm.com>
s390/qeth: fix overestimated count of buffer elements
Julian Wiedmann <jwi(a)linux.vnet.ibm.com>
s390/qeth: fix SETIP command handling
Ursula Braun <ubraun(a)linux.vnet.ibm.com>
s390/qeth: fix underestimated count of buffer elements
James Chapman <jchapman(a)katalix.com>
l2tp: fix tunnel lookup use-after-free race
James Chapman <jchapman(a)katalix.com>
l2tp: fix race in pppol2tp_release with session object destroy
James Chapman <jchapman(a)katalix.com>
l2tp: fix races with tunnel socket close
James Chapman <jchapman(a)katalix.com>
l2tp: don't use inet_shutdown on ppp session destroy
James Chapman <jchapman(a)katalix.com>
l2tp: don't use inet_shutdown on tunnel destroy
Song Liu <songliubraving(a)fb.com>
tcp: tracepoint: only call trace_tcp_send_reset with full socket
Andrew Lunn <andrew(a)lunn.ch>
net: phy: Restore phy_resume() locking assumption
Vlad Buslov <vladbu(a)mellanox.com>
net/mlx5: Fix error handling when adding flow rules
Rahul Lakkireddy <rahul.lakkireddy(a)chelsio.com>
cxgb4: fix trailing zero in CIM LA dump
Jason Wang <jasowang(a)redhat.com>
virtio-net: disable NAPI only when enabled during XDP set
Jason Wang <jasowang(a)redhat.com>
tuntap: disable preemption during XDP processing
Jason Wang <jasowang(a)redhat.com>
tuntap: correctly add the missing XDP flush
Soheil Hassas Yeganeh <soheil(a)google.com>
tcp: purge write queue upon RST
Jason A. Donenfeld <Jason(a)zx2c4.com>
netlink: put module reference if dump start fails
Ido Schimmel <idosch(a)mellanox.com>
mlxsw: spectrum_router: Do not unconditionally clear route offload indication
Paolo Abeni <pabeni(a)redhat.com>
cls_u32: fix use after free in u32_destroy_key()
Tom Lendacky <thomas.lendacky(a)amd.com>
amd-xgbe: Restore PCI interrupt enablement setting on resume
Boris Pismenny <borisp(a)mellanox.com>
tls: Use correct sk->sk_prot for IPV6
Eran Ben Elisha <eranbe(a)mellanox.com>
net/mlx5e: Verify inline header size do not exceed SKB linear size
Ido Schimmel <idosch(a)mellanox.com>
bridge: Fix VLAN reference count problem
Alexey Kodanev <alexey.kodanev(a)oracle.com>
sctp: fix dst refcnt leak in sctp_v6_get_dst()
David Ahern <dsahern(a)gmail.com>
net: ipv4: Set addr_type in hash_keys for forwarded case
Jiri Pirko <jiri(a)mellanox.com>
mlxsw: spectrum_router: Fix error path in mlxsw_sp_vr_create
Xin Long <lucien.xin(a)gmail.com>
sctp: do not pr_err for the duplicated node in transport rhlist
Ivan Vecera <ivecera(a)redhat.com>
net/sched: cls_u32: fix cls_u32 on filter replace
Eric Dumazet <edumazet(a)google.com>
net_sched: gen_estimator: fix broken estimators based on percpu stats
Inbar Karmy <inbark(a)mellanox.com>
net/mlx5e: Fix loopback self test when GRO is off
Tonghao Zhang <xiangxia.m.yue(a)gmail.com>
doc: Change the min default value of tcp_wmem/tcp_rmem.
Eric Dumazet <edumazet(a)google.com>
tcp_bbr: better deal with suboptimal GSO
David Howells <dhowells(a)redhat.com>
rxrpc: Fix send in rxrpc_send_data_packet()
Ilya Lesokhin <ilyal(a)mellanox.com>
tcp: Honor the eor bit in tcp_mtu_probe
Heiner Kallweit <hkallweit1(a)gmail.com>
net: phy: fix phy_start to consider PHY_IGNORE_INTERRUPT
Gal Pressman <galp(a)mellanox.com>
net/mlx5e: Specify numa node when allocating drop rq
Shalom Toledo <shalomt(a)mellanox.com>
mlxsw: spectrum_switchdev: Check success of FDB add operation
Tommi Rantala <tommi.t.rantala(a)nokia.com>
sctp: fix dst refcnt leak in sctp_v4_get_dst
Gal Pressman <galp(a)mellanox.com>
net/mlx5e: Fix TCP checksum in LRO buffers
Alexey Kodanev <alexey.kodanev(a)oracle.com>
udplite: fix partial checksum initialization
Alexey Kodanev <alexey.kodanev(a)oracle.com>
sctp: verify size of a new chunk in _sctp_make_chunk()
Guillaume Nault <g.nault(a)alphalink.fr>
ppp: prevent unregistered channels from connecting to PPP units
Roman Kapl <code(a)rkapl.cz>
net: sched: report if filter is too large to dump
Nicolas Dichtel <nicolas.dichtel(a)6wind.com>
netlink: ensure to loop over all netns in genlmsg_multicast_allns()
Sabrina Dubroca <sd(a)queasysnail.net>
net: ipv4: don't allow setting net.ipv4.route.min_pmtu below 68
Jakub Kicinski <jakub.kicinski(a)netronome.com>
net: fix race on decreasing number of TX queues
Grygorii Strashko <grygorii.strashko(a)ti.com>
net: ethernet: ti: cpsw: fix net watchdog timeout
Wolfram Sang <wsa+renesas(a)sang-engineering.com>
net: amd-xgbe: fix comparison to bitshift when dealing with a mask
Arnd Bergmann <arnd(a)arndb.de>
ipv6 sit: work around bogus gcc-8 -Wrestrict warning
Denis Du <dudenis2000(a)yahoo.ca>
hdlc_ppp: carrier detect ok, don't turn off negotiation
Stefano Brivio <sbrivio(a)redhat.com>
fib_semantics: Don't match route with mismatching tclassid
Xin Long <lucien.xin(a)gmail.com>
bridge: check brport attr show in brport_show
Thomas Gleixner <tglx(a)linutronix.de>
x86/cpu_entry_area: Sync cpu_entry_area to initial_page_table
Sebastian Panceac <sebastian(a)resin.io>
x86/platform/intel-mid: Handle Intel Edison reboot correctly
Juergen Gross <jgross(a)suse.com>
x86/xen: Zero MSR_IA32_SPEC_CTRL before suspend
Jan Kara <jack(a)suse.cz>
direct-io: Fix sleep in atomic due to sync AIO
Dan Williams <dan.j.williams(a)intel.com>
dax: fix vma_is_fsdax() helper
Viresh Kumar <viresh.kumar(a)linaro.org>
cpufreq: s3c24xx: Fix broken s3c_cpufreq_init()
Dan Williams <dan.j.williams(a)intel.com>
vfio: disable filesystem-dax page pinning
Ming Lei <ming.lei(a)redhat.com>
block: pass inclusive 'lend' parameter to truncate_inode_pages_range
Ming Lei <ming.lei(a)redhat.com>
block: kyber: fix domain token leak during requeue
Jiufei Xue <jiufei.xue(a)linux.alibaba.com>
block: fix the count of PGPGOUT for WRITE_SAME
Anand Jain <anand.jain(a)oracle.com>
btrfs: use proper endianness accessors for super_copy
Helge Deller <deller(a)gmx.de>
parisc: Hide virtual kernel memory layout
John David Anglin <dave.anglin(a)bell.net>
parisc: Fix ordering of cache and TLB flushes
Helge Deller <deller(a)gmx.de>
parisc: Reduce irq overhead when run in qemu
Helge Deller <deller(a)gmx.de>
parisc: Use cr16 interval timers unconditionally on qemu
Lingutla Chandrasekhar <clingutla(a)codeaurora.org>
timers: Forward timer base before migrating timers
Shawn Lin <shawn.lin(a)rock-chips.com>
mmc: dw_mmc: Fix out-of-bounds access for slot's caps
Shawn Lin <shawn.lin(a)rock-chips.com>
mmc: dw_mmc: Factor out dw_mci_init_slot_caps
Shawn Lin <shawn.lin(a)rock-chips.com>
mmc: dw_mmc: Avoid accessing registers in runtime suspended state
Geert Uytterhoeven <geert+renesas(a)glider.be>
mmc: dw_mmc-k3: Fix out-of-bounds access through DT alias
Adrian Hunter <adrian.hunter(a)intel.com>
mmc: sdhci-pci: Fix S0i3 for Intel BYT-based controllers
Takashi Iwai <tiwai(a)suse.de>
ALSA: hda - Fix pincfg at resume on Lenovo T470 dock
Hans de Goede <hdegoede(a)redhat.com>
ALSA: hda: Add a power_save blacklist
Takashi Iwai <tiwai(a)suse.de>
ALSA: x86: Fix missing spinlock and mutex initializations
Richard Fitzgerald <rf(a)opensource.cirrus.com>
ALSA: control: Fix memory corruption risk in snd_ctl_elem_read
Erik Veijola <erik.veijola(a)gmail.com>
ALSA: usb-audio: Add a quirck for B&W PX headphones
Jeremy Boone <jeremy.boone(a)nccgroup.trust>
tpm_tis: fix potential buffer overruns caused by bit glitches on the bus
Jeremy Boone <jeremy.boone(a)nccgroup.trust>
tpm_i2c_nuvoton: fix potential buffer overruns caused by bit glitches on the bus
Jeremy Boone <jeremy.boone(a)nccgroup.trust>
tpm_i2c_infineon: fix potential buffer overruns caused by bit glitches on the bus
Jeremy Boone <jeremy.boone(a)nccgroup.trust>
tpm: fix potential buffer overruns caused by bit glitches on the bus
Jeremy Boone <jeremy.boone(a)nccgroup.trust>
tpm: st33zp24: fix potential buffer overruns caused by bit glitches on the bus
Emil Tantilov <emil.s.tantilov(a)intel.com>
ixgbe: fix crash in build_skb Rx code path
Hans de Goede <hdegoede(a)redhat.com>
Bluetooth: btusb: Use DMI matching for QCA reset_resume quirking
Sam Bobroff <sam.bobroff(a)au1.ibm.com>
powerpc/pseries: Enable RAS hotplug events later
Mario Limonciello <mario.limonciello(a)dell.com>
platform/x86: dell-laptop: Allocate buffer on heap rather than globally
Corey Minyard <cminyard(a)mvista.com>
ipmi_si: Fix error handling of platform device
Anna-Maria Gleixner <anna-maria(a)linutronix.de>
hrtimer: Ensure POSIX compliance (relative CLOCK_REALTIME hrtimers)
Adam Borowski <kilobyte(a)angband.pl>
vsprintf: avoid misleading "(null)" for %px
-------------
Diffstat:
Documentation/networking/ip-sysctl.txt | 4 +-
Makefile | 4 +-
arch/arm/boot/dts/logicpd-som-lv.dtsi | 9 +-
arch/arm/boot/dts/logicpd-torpedo-som.dtsi | 8 +
arch/arm/boot/dts/rk3288-phycore-som.dtsi | 20 ---
arch/arm/kvm/hyp/Makefile | 5 +
arch/arm/kvm/hyp/banked-sr.c | 4 +
arch/arm/mach-mvebu/Kconfig | 4 +-
arch/arm/plat-orion/common.c | 23 ++-
arch/parisc/include/asm/cacheflush.h | 1 +
arch/parisc/include/asm/processor.h | 2 +
arch/parisc/kernel/cache.c | 57 ++++---
arch/parisc/kernel/pacache.S | 22 +++
arch/parisc/kernel/time.c | 11 +-
arch/parisc/mm/init.c | 7 +-
arch/powerpc/mm/pgtable-radix.c | 20 +++
arch/powerpc/platforms/pseries/ras.c | 31 +++-
arch/s390/kvm/interrupt.c | 25 ++-
arch/s390/kvm/kvm-s390.c | 79 +++++----
arch/s390/kvm/kvm-s390.h | 5 +-
arch/s390/kvm/priv.c | 9 +-
arch/x86/include/asm/pgtable.h | 8 +-
arch/x86/include/asm/pgtable_32.h | 1 +
arch/x86/include/asm/pgtable_64.h | 1 +
arch/x86/include/asm/pgtable_types.h | 10 ++
arch/x86/kernel/setup.c | 17 +-
arch/x86/kernel/setup_percpu.c | 17 +-
arch/x86/kvm/lapic.c | 11 +-
arch/x86/kvm/mmu.c | 2 +-
arch/x86/kvm/svm.c | 9 +-
arch/x86/kvm/vmx.c | 9 +-
arch/x86/kvm/x86.c | 8 +-
arch/x86/mm/cpu_entry_area.c | 6 +
arch/x86/mm/init_32.c | 15 ++
arch/x86/platform/intel-mid/intel-mid.c | 2 +-
arch/x86/xen/suspend.c | 16 ++
block/blk-core.c | 2 +-
block/blk-mq.c | 4 +-
block/ioctl.c | 2 +-
block/kyber-iosched.c | 1 +
drivers/acpi/bus.c | 38 ++++-
drivers/bluetooth/btusb.c | 25 ++-
drivers/char/ipmi/ipmi_si_intf.c | 9 +-
drivers/char/tpm/st33zp24/st33zp24.c | 4 +-
drivers/char/tpm/tpm-interface.c | 4 +
drivers/char/tpm/tpm2-cmd.c | 4 +
drivers/char/tpm/tpm_i2c_infineon.c | 5 +-
drivers/char/tpm/tpm_i2c_nuvoton.c | 8 +-
drivers/char/tpm/tpm_tis_core.c | 5 +-
drivers/cpufreq/s3c24xx-cpufreq.c | 8 +-
drivers/edac/sb_edac.c | 2 +-
drivers/md/md.c | 4 +
drivers/media/dvb-frontends/m88ds3103.c | 7 +-
drivers/mmc/host/dw_mmc-exynos.c | 1 +
drivers/mmc/host/dw_mmc-k3.c | 4 +
drivers/mmc/host/dw_mmc-rockchip.c | 1 +
drivers/mmc/host/dw_mmc-zx.c | 1 +
drivers/mmc/host/dw_mmc.c | 84 +++++----
drivers/mmc/host/dw_mmc.h | 2 +
drivers/mmc/host/sdhci-pci-core.c | 35 +++-
drivers/net/ethernet/amd/xgbe/xgbe-drv.c | 2 +-
drivers/net/ethernet/amd/xgbe/xgbe-pci.c | 2 +
drivers/net/ethernet/chelsio/cxgb4/cudbg_lib.c | 2 +-
drivers/net/ethernet/chelsio/cxgb4/cxgb4_cudbg.c | 2 +-
drivers/net/ethernet/intel/ixgbe/ixgbe_main.c | 8 +
drivers/net/ethernet/mellanox/mlx5/core/en_main.c | 10 +-
drivers/net/ethernet/mellanox/mlx5/core/en_rx.c | 49 ++++--
.../net/ethernet/mellanox/mlx5/core/en_selftest.c | 3 +-
drivers/net/ethernet/mellanox/mlx5/core/en_tx.c | 2 +-
drivers/net/ethernet/mellanox/mlx5/core/fs_core.c | 10 +-
.../net/ethernet/mellanox/mlxsw/spectrum_router.c | 35 ++--
.../ethernet/mellanox/mlxsw/spectrum_switchdev.c | 29 +++-
drivers/net/ethernet/ti/cpsw.c | 16 +-
drivers/net/phy/phy.c | 4 +-
drivers/net/phy/phy_device.c | 18 +-
drivers/net/ppp/ppp_generic.c | 9 +
drivers/net/tun.c | 7 +
drivers/net/virtio_net.c | 8 +-
drivers/net/wan/hdlc_ppp.c | 5 +-
drivers/platform/x86/dell-laptop.c | 188 +++++++++++----------
drivers/s390/net/qeth_core.h | 7 +-
drivers/s390/net/qeth_core_main.c | 43 ++---
drivers/s390/net/qeth_l3.h | 34 +++-
drivers/s390/net/qeth_l3_main.c | 123 ++++++--------
drivers/vfio/vfio_iommu_type1.c | 18 +-
fs/btrfs/sysfs.c | 8 +-
fs/btrfs/transaction.c | 20 ++-
fs/direct-io.c | 3 +-
include/linux/fs.h | 2 +-
include/linux/nospec.h | 3 +-
include/linux/phy.h | 1 +
include/net/udplite.h | 1 +
kernel/time/hrtimer.c | 7 +-
kernel/time/timer.c | 6 +
lib/vsprintf.c | 2 +-
net/bridge/br_sysfs_if.c | 3 +
net/bridge/br_vlan.c | 2 +
net/core/dev.c | 11 +-
net/core/gen_estimator.c | 1 +
net/ipv4/fib_semantics.c | 5 +
net/ipv4/route.c | 10 +-
net/ipv4/tcp_input.c | 24 +--
net/ipv4/tcp_ipv4.c | 3 +-
net/ipv4/tcp_output.c | 34 +++-
net/ipv4/udp.c | 5 +
net/ipv6/ip6_checksum.c | 5 +
net/ipv6/sit.c | 2 +-
net/ipv6/tcp_ipv6.c | 3 +-
net/l2tp/l2tp_core.c | 142 +++++-----------
net/l2tp/l2tp_core.h | 23 +--
net/l2tp/l2tp_ip.c | 10 +-
net/l2tp/l2tp_ip6.c | 8 +-
net/l2tp/l2tp_ppp.c | 60 +++----
net/netlink/af_netlink.c | 4 +-
net/netlink/genetlink.c | 12 +-
net/rxrpc/output.c | 2 +-
net/sched/cls_api.c | 7 +-
net/sched/cls_u32.c | 24 +--
net/sctp/input.c | 5 +-
net/sctp/ipv6.c | 10 +-
net/sctp/protocol.c | 10 +-
net/sctp/sm_make_chunk.c | 7 +-
net/tls/tls_main.c | 52 ++++--
sound/core/control.c | 2 +-
sound/pci/hda/hda_intel.c | 38 ++++-
sound/pci/hda/patch_realtek.c | 3 +-
sound/usb/quirks-table.h | 47 ++++++
sound/x86/intel_hdmi_audio.c | 2 +
virt/kvm/kvm_main.c | 3 +-
129 files changed, 1271 insertions(+), 737 deletions(-)
This is a note to let you know that I've just added the patch titled
x86/spectre: Fix an error message
to the 4.4-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
x86-spectre-fix-an-error-message.patch
and it can be found in the queue-4.4 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From 9de29eac8d2189424d81c0d840cd0469aa3d41c8 Mon Sep 17 00:00:00 2001
From: Dan Carpenter <dan.carpenter(a)oracle.com>
Date: Wed, 14 Feb 2018 10:14:17 +0300
Subject: x86/spectre: Fix an error message
From: Dan Carpenter <dan.carpenter(a)oracle.com>
commit 9de29eac8d2189424d81c0d840cd0469aa3d41c8 upstream.
If i == ARRAY_SIZE(mitigation_options) then we accidentally print
garbage from one space beyond the end of the mitigation_options[] array.
Signed-off-by: Dan Carpenter <dan.carpenter(a)oracle.com>
Cc: Andy Lutomirski <luto(a)kernel.org>
Cc: Borislav Petkov <bp(a)suse.de>
Cc: David Woodhouse <dwmw(a)amazon.co.uk>
Cc: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Cc: KarimAllah Ahmed <karahmed(a)amazon.de>
Cc: Linus Torvalds <torvalds(a)linux-foundation.org>
Cc: Peter Zijlstra <peterz(a)infradead.org>
Cc: Thomas Gleixner <tglx(a)linutronix.de>
Cc: kernel-janitors(a)vger.kernel.org
Fixes: 9005c6834c0f ("x86/spectre: Simplify spectre_v2 command line parsing")
Link: http://lkml.kernel.org/r/20180214071416.GA26677@mwanda
Signed-off-by: Ingo Molnar <mingo(a)kernel.org>
Cc: Ben Hutchings <ben.hutchings(a)codethink.co.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
arch/x86/kernel/cpu/bugs.c | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)
--- a/arch/x86/kernel/cpu/bugs.c
+++ b/arch/x86/kernel/cpu/bugs.c
@@ -175,8 +175,7 @@ static enum spectre_v2_mitigation_cmd __
}
if (i >= ARRAY_SIZE(mitigation_options)) {
- pr_err("unknown option (%s). Switching to AUTO select\n",
- mitigation_options[i].option);
+ pr_err("unknown option (%s). Switching to AUTO select\n", arg);
return SPECTRE_V2_CMD_AUTO;
}
}
Patches currently in stable-queue which might be from dan.carpenter(a)oracle.com are
queue-4.4/x86-spectre-fix-an-error-message.patch
This reverts commit 20ac8f72514b3af8b62c520d55656ded865eff00, which
was commit 2b83ff96f51d0b039c4561b9f95c824d7bddb85c upstream.
The bug that it should fix was only introduced in Linux 4.7, and
in 4.4 it causes a regression.
Reported-by: Jacek Anaszewski <jacek.anaszewski(a)gmail.com>
Cc: Matthieu CASTET <matthieu.castet(a)parrot.com>
Signed-off-by: Ben Hutchings <ben.hutchings(a)codethink.co.uk>
---
drivers/leds/led-core.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/leds/led-core.c b/drivers/leds/led-core.c
index 92b6798ef5b3..c1c3af089634 100644
--- a/drivers/leds/led-core.c
+++ b/drivers/leds/led-core.c
@@ -149,7 +149,7 @@ void led_blink_set(struct led_classdev *led_cdev,
unsigned long *delay_on,
unsigned long *delay_off)
{
- led_stop_software_blink(led_cdev);
+ del_timer_sync(&led_cdev->blink_timer);
led_cdev->flags &= ~LED_BLINK_ONESHOT;
led_cdev->flags &= ~LED_BLINK_ONESHOT_STOP;
--
2.15.0.rc0
Please pick this fix for 4.4-stable:
commit 9de29eac8d2189424d81c0d840cd0469aa3d41c8
Author: Dan Carpenter <dan.carpenter(a)oracle.com>
Date: Wed Feb 14 10:14:17 2018 +0300
x86/spectre: Fix an error message
(It has already been applied to your later branches.)
Ben.
--
Ben Hutchings
Software Developer, Codethink Ltd.
2018-03-08 16:09 GMT+09:00 James Hogan <jhogan(a)kernel.org>:
> On Wed, Mar 07, 2018 at 03:19:11PM -0800, Frank Rowand wrote:
>> On 03/07/18 12:25, James Hogan wrote:
>> > On Wed, Mar 07, 2018 at 12:11:41PM -0800, Frank Rowand wrote:
>> >> On 03/07/18 06:06, James Hogan wrote:
>> >>> Quite a lot of dts files have hyphens, but its only a problem on MIPS
>> >>> where such files can be built into the kernel. For example when
>> >>> CONFIG_DT_NETGEAR_CVG834G=y, or on BMIPS kernels when the dtbs target is
>> >>> used (in the latter case it admitedly shouldn't really build all the
>> >>> dtb.o files, but thats a separate issue).
>
>> > I'll keep the paragraph about MIPS and the example configuration though,
>> > as I think its important information to reproduce the problem, and to
>> > justify why it wouldn't be appropriate to just rename the files (which
>> > was my first reaction).
>>
>> Other than the part that says "its only a problem on MIPS". That is
>> pedantically correct because no other architecture (that I am aware
>> of, not that I searched) currently has a devicetree source file name
>> with a hyphen in it, where that file is compiled into the kernel as
>> an asm file. But it is potentially a problem on any architecture
>> to it is misleading to label it as MIPS only.
>
> Okay I'll reword to make it clearer and do a v2.
>
> Thanks
> James
The code looks good.
If you send v2, I can shortly apply it to the fixes branch.
If possible, I want to send a PR this weekend.
--
Best Regards
Masahiro Yamada
While UBI and UBIFS seem to work at first sight with MLC NAND, you will
most likely lose all your data upon a power-cut or due to read/write
disturb.
In order to protect users from bad surprises, refuse to attach to MLC
NAND.
Cc: stable(a)vger.kernel.org
Signed-off-by: Richard Weinberger <richard(a)nod.at>
---
drivers/mtd/ubi/build.c | 11 +++++++++++
1 file changed, 11 insertions(+)
diff --git a/drivers/mtd/ubi/build.c b/drivers/mtd/ubi/build.c
index e941395de3ae..753494e042d5 100644
--- a/drivers/mtd/ubi/build.c
+++ b/drivers/mtd/ubi/build.c
@@ -854,6 +854,17 @@ int ubi_attach_mtd_dev(struct mtd_info *mtd, int ubi_num,
return -EINVAL;
}
+ /*
+ * Both UBI and UBIFS have been designed for SLC NAND and NOR flashes.
+ * MLC NAND is different and needs special care, otherwise UBI or UBIFS
+ * will die soon and you will lose all your data.
+ */
+ if (mtd->type == MTD_MLCNANDFLASH) {
+ pr_err("ubi: refuse attaching mtd%d - MLC NAND is not supported\n",
+ mtd->index);
+ return -EINVAL;
+ }
+
if (ubi_num == UBI_DEV_NUM_AUTO) {
/* Search for an empty slot in the @ubi_devices array */
for (ubi_num = 0; ubi_num < UBI_MAX_DEVICES; ubi_num++)
--
2.13.6
PARTITION_CONFIG is cached in mmc_card->ext_csd.part_config and the
currently active partition in mmc_blk_data->part_curr. These caches do
not always reflect changes if the ioctl call modifies the
PARTITION_CONFIG registers, e.g. by changing BOOT_PARTITION_ENABLE.
Write the PARTITION_CONFIG value extracted from the ioctl call to the
cache and update the currently active partition accordingly. This
ensures that the user space cannot change the values behind the
kernel's back. The next call to mmc_blk_part_switch() will operate on
the data set by the ioctl and reflect the changes appropriately.
Signed-off-by: Bastian Stender <bst(a)pengutronix.de>
Signed-off-by: Jan Luebbe <jlu(a)pengutronix.de>
Cc: stable(a)vger.kernel.org
---
drivers/mmc/core/block.c | 19 +++++++++++++++++++
1 file changed, 19 insertions(+)
diff --git a/drivers/mmc/core/block.c b/drivers/mmc/core/block.c
index 20135a5de748..2cfb963d9f37 100644
--- a/drivers/mmc/core/block.c
+++ b/drivers/mmc/core/block.c
@@ -72,6 +72,7 @@ MODULE_ALIAS("mmc:block");
#define MMC_BLK_TIMEOUT_MS (10 * 1000)
#define MMC_SANITIZE_REQ_TIMEOUT 240000
#define MMC_EXTRACT_INDEX_FROM_ARG(x) ((x & 0x00FF0000) >> 16)
+#define MMC_EXTRACT_VALUE_FROM_ARG(x) ((x & 0x0000FF00) >> 8)
#define mmc_req_rel_wr(req) ((req->cmd_flags & REQ_FUA) && \
(rq_data_dir(req) == WRITE))
@@ -586,6 +587,24 @@ static int __mmc_blk_ioctl_cmd(struct mmc_card *card, struct mmc_blk_data *md,
return data.error;
}
+ /*
+ * Make sure the cache of the PARTITION_CONFIG register and
+ * PARTITION_ACCESS bits is updated in case the ioctl ext_csd write
+ * changed it successfully.
+ */
+ if ((MMC_EXTRACT_INDEX_FROM_ARG(cmd.arg) == EXT_CSD_PART_CONFIG) &&
+ (cmd.opcode == MMC_SWITCH)) {
+ struct mmc_blk_data *main_md = dev_get_drvdata(&card->dev);
+ u8 value = MMC_EXTRACT_VALUE_FROM_ARG(cmd.arg);
+
+ /*
+ * Update cache so the next mmc_blk_part_switch call operates
+ * on up-to-date data.
+ */
+ card->ext_csd.part_config = value;
+ main_md->part_curr = value & EXT_CSD_PART_CONFIG_ACC_MASK;
+ }
+
/*
* According to the SD specs, some commands require a delay after
* issuing the command.
--
2.16.1
This is a note to let you know that I've just added the patch titled
usb: dwc3: Fix lock-up on ID change during system suspend/resume
to my usb git tree which can be found at
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb.git
in the usb-linus branch.
The patch will show up in the next release of the linux-next tree
(usually sometime within the next 24 hours during the week.)
The patch will hopefully also be merged in Linus's tree for the
next -rc kernel release.
If you have any questions about this process, please let me know.
>From 084a804e01205bcd74cd0849bc72cb5c88f8e648 Mon Sep 17 00:00:00 2001
From: Roger Quadros <rogerq(a)ti.com>
Date: Tue, 27 Feb 2018 12:41:41 +0200
Subject: usb: dwc3: Fix lock-up on ID change during system suspend/resume
To reproduce the lock up do the following
- connect otg host adapter and a USB device to the dual-role port
so that it is in host mode.
- suspend to mem.
- disconnect otg adapter.
- resume the system.
If we call dwc3_host_exit() before tasks are thawed
xhci_plat_remove() seems to lock up at the second usb_remove_hcd() call.
To work around this we queue the _dwc3_set_mode() work on
the system_freezable_wq.
Fixes: 41ce1456e1db ("usb: dwc3: core: make dwc3_set_mode() work properly")
Cc: <stable(a)vger.kernel.org> # v4.12+
Suggested-by: Manu Gautam <mgautam(a)codeaurora.org>
Signed-off-by: Roger Quadros <rogerq(a)ti.com>
Signed-off-by: Felipe Balbi <felipe.balbi(a)linux.intel.com>
---
drivers/usb/dwc3/core.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/usb/dwc3/core.c b/drivers/usb/dwc3/core.c
index f1d838a4acd6..e94bf91cc58a 100644
--- a/drivers/usb/dwc3/core.c
+++ b/drivers/usb/dwc3/core.c
@@ -175,7 +175,7 @@ void dwc3_set_mode(struct dwc3 *dwc, u32 mode)
dwc->desired_dr_role = mode;
spin_unlock_irqrestore(&dwc->lock, flags);
- queue_work(system_power_efficient_wq, &dwc->drd_work);
+ queue_work(system_freezable_wq, &dwc->drd_work);
}
u32 dwc3_core_fifo_space(struct dwc3_ep *dep, u8 type)
--
2.16.2
On Wed, Mar 07, 2018 at 09:29:10PM +0100, Nikola Ciprich wrote:
> Hi,
>
> > > > I'd like to report that when upgrading our cluster from 4.14.18 to
> > > > 4.14.24-rc1 (with live guests migration), almost none of guests survived..
> > > What's your hardware setup, intel with IBPB enabled microcode?
> > Intel(R) Xeon(R) CPU E5-2650 0 @ 2.00GHz
> >
> > therefore I suppose no IBPB (at least meltdown checker reports so)
> >
> >
> > > Does guests hang right after live migration?
> > yes, just tried it.
> >
> >
> > >
> > > Are you able to reproduce the problem, does it work with latest upstream?
> > yup, so I'm able to reproduce quickly. I'll revert the cluster to 4.14.18 now,
> > but setup test system just afterwards, so and test the patch you've proposed.
> >
> > >
> > > Not sure it helps, but following patch is missing in 4.14.24
> > >
> > > commit 37b95951c58fdf08dc10afa9d02066ed9f176fb5 upstream.
> > >
> > > kvm_valid_sregs() should use X86_CR0_PG and X86_CR4_PAE to check bit
> > > status rather than X86_CR0_PG_BIT and X86_CR4_PAE_BIT. This patch is
> > > to fix it.
> > >
> > > Fixes: f29810335965a(KVM/x86: Check input paging mode when cs.l is set)
> > > Reported-by: Jeremi Piotrowski <jeremi.piotrowski(a)gmail.com>
> > > Cc: Paolo Bonzini <pbonzini(a)redhat.com>
> > > Cc: Radim Krčmář <rkrcmar(a)redhat.com>
> > > Signed-off-by: Tianyu Lan <Tianyu.Lan(a)microsoft.com>
> > > Signed-off-by: Radim Krčmář <rkrcmar(a)redhat.com>
> >
> > I'll test and report.
>
> so indeed, this one on top of 4.14.24-rc1 fixes the migration for me.
> Greg, could you queue this one up please?
As was already pointed out, this is already queued up to be in the next
release.
thanks,
greg k-h
This is a note to let you know that I've just added the patch titled
leds: do not overflow sysfs buffer in led_trigger_show
to the 4.4-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
leds-do-not-overflow-sysfs-buffer-in-led_trigger_show.patch
and it can be found in the queue-4.4 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From 3b9b95363c45365d606ad4bbba16acca75fdf6d3 Mon Sep 17 00:00:00 2001
From: Nathan Sullivan <nathan.sullivan(a)ni.com>
Date: Mon, 15 Aug 2016 17:20:14 -0500
Subject: leds: do not overflow sysfs buffer in led_trigger_show
From: Nathan Sullivan <nathan.sullivan(a)ni.com>
commit 3b9b95363c45365d606ad4bbba16acca75fdf6d3 upstream.
Per the documentation, use scnprintf instead of sprintf to ensure there
is never more than PAGE_SIZE bytes of trigger names put into the
buffer.
Signed-off-by: Nathan Sullivan <nathan.sullivan(a)ni.com>
Signed-off-by: Zach Brown <zach.brown(a)ni.com>
Signed-off-by: Jacek Anaszewski <j.anaszewski(a)samsung.com>
Cc: Willy Tarreau <w(a)1wt.eu>
Cc: Vlastimil Babka <vbabka(a)suse.cz>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/leds/led-triggers.c | 12 +++++++-----
1 file changed, 7 insertions(+), 5 deletions(-)
--- a/drivers/leds/led-triggers.c
+++ b/drivers/leds/led-triggers.c
@@ -88,21 +88,23 @@ ssize_t led_trigger_show(struct device *
down_read(&led_cdev->trigger_lock);
if (!led_cdev->trigger)
- len += sprintf(buf+len, "[none] ");
+ len += scnprintf(buf+len, PAGE_SIZE - len, "[none] ");
else
- len += sprintf(buf+len, "none ");
+ len += scnprintf(buf+len, PAGE_SIZE - len, "none ");
list_for_each_entry(trig, &trigger_list, next_trig) {
if (led_cdev->trigger && !strcmp(led_cdev->trigger->name,
trig->name))
- len += sprintf(buf+len, "[%s] ", trig->name);
+ len += scnprintf(buf+len, PAGE_SIZE - len, "[%s] ",
+ trig->name);
else
- len += sprintf(buf+len, "%s ", trig->name);
+ len += scnprintf(buf+len, PAGE_SIZE - len, "%s ",
+ trig->name);
}
up_read(&led_cdev->trigger_lock);
up_read(&triggers_list_lock);
- len += sprintf(len+buf, "\n");
+ len += scnprintf(len+buf, PAGE_SIZE - len, "\n");
return len;
}
EXPORT_SYMBOL_GPL(led_trigger_show);
Patches currently in stable-queue which might be from nathan.sullivan(a)ni.com are
queue-4.4/leds-do-not-overflow-sysfs-buffer-in-led_trigger_show.patch
This is a note to let you know that I've just added the patch titled
leds: do not overflow sysfs buffer in led_trigger_show
to the 3.18-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
leds-do-not-overflow-sysfs-buffer-in-led_trigger_show.patch
and it can be found in the queue-3.18 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From 3b9b95363c45365d606ad4bbba16acca75fdf6d3 Mon Sep 17 00:00:00 2001
From: Nathan Sullivan <nathan.sullivan(a)ni.com>
Date: Mon, 15 Aug 2016 17:20:14 -0500
Subject: leds: do not overflow sysfs buffer in led_trigger_show
From: Nathan Sullivan <nathan.sullivan(a)ni.com>
commit 3b9b95363c45365d606ad4bbba16acca75fdf6d3 upstream.
Per the documentation, use scnprintf instead of sprintf to ensure there
is never more than PAGE_SIZE bytes of trigger names put into the
buffer.
Signed-off-by: Nathan Sullivan <nathan.sullivan(a)ni.com>
Signed-off-by: Zach Brown <zach.brown(a)ni.com>
Signed-off-by: Jacek Anaszewski <j.anaszewski(a)samsung.com>
Cc: Willy Tarreau <w(a)1wt.eu>
Cc: Vlastimil Babka <vbabka(a)suse.cz>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/leds/led-triggers.c | 12 +++++++-----
1 file changed, 7 insertions(+), 5 deletions(-)
--- a/drivers/leds/led-triggers.c
+++ b/drivers/leds/led-triggers.c
@@ -78,21 +78,23 @@ ssize_t led_trigger_show(struct device *
down_read(&led_cdev->trigger_lock);
if (!led_cdev->trigger)
- len += sprintf(buf+len, "[none] ");
+ len += scnprintf(buf+len, PAGE_SIZE - len, "[none] ");
else
- len += sprintf(buf+len, "none ");
+ len += scnprintf(buf+len, PAGE_SIZE - len, "none ");
list_for_each_entry(trig, &trigger_list, next_trig) {
if (led_cdev->trigger && !strcmp(led_cdev->trigger->name,
trig->name))
- len += sprintf(buf+len, "[%s] ", trig->name);
+ len += scnprintf(buf+len, PAGE_SIZE - len, "[%s] ",
+ trig->name);
else
- len += sprintf(buf+len, "%s ", trig->name);
+ len += scnprintf(buf+len, PAGE_SIZE - len, "%s ",
+ trig->name);
}
up_read(&led_cdev->trigger_lock);
up_read(&triggers_list_lock);
- len += sprintf(len+buf, "\n");
+ len += scnprintf(len+buf, PAGE_SIZE - len, "\n");
return len;
}
EXPORT_SYMBOL_GPL(led_trigger_show);
Patches currently in stable-queue which might be from nathan.sullivan(a)ni.com are
queue-3.18/leds-do-not-overflow-sysfs-buffer-in-led_trigger_show.patch
On Thursday 08 March 2018 13:27:15 Pavel Machek wrote:
> Hi!
>
> > Resent without non-upstream patches.
> >
> > This backport patchset fixed the spectre issue, it's original branch:
> > https://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux.git/log/?h=kpti
> > A few dependency or fixingpatches are also picked up, if they are necessary
> > and no functional changes.
> >
> > No bug found from kernelci.org and lkft testing. It also could be gotten from:
> >
> > git://git.linaro.org/kernel/linux-linaro-stable.git v4.9-spectre-upstream-only
> >
> > Comments are appreciated!
>
> Not entirely related to this patched, but... I have few older ARM
> boards here, and Nokia N9000 I really care about.
>
> AFAICT Meltdown is arm64 only?
IIRC ARMv7 is not affected by meltdown.
> Spectre affects the older boards, too, right? Was there any work done
> on that? cpuinfo says "ARMv7" for N900.
I remember that I saw some spectre patches for ARMv7 on LKML.
In general for ARMv7 it is problematic as mitigation needs to change IBE
bit which is not possible on OMAP HS devices. But for Nokia N900 there
is special code which do it via smc instruction (function
rx51_secure_update_aux_cr(), see also nokia_n900_legacy_init()).
--
Pali Rohár
pali.rohar(a)gmail.com
From: Jack Wang <jinpu.wang(a)profitbricks.com>
Hi Greg,
I noticed 2 fixes for kvm are missing in your queue-4.14, both are bugfix,
can be cherry pick cleanly.
The patch from Tianyu should close bug below, also included in 3.16
https://bugzilla.kernel.org/show_bug.cgi?id=198991
Eric Biggers (1):
KVM/x86: remove WARN_ON() for when vm_munmap() fails
Tianyu Lan (1):
KVM/x86: Fix wrong macro references of X86_CR0_PG_BIT and
X86_CR4_PAE_BIT in kvm_valid_sregs()
arch/x86/kvm/x86.c | 10 ++++------
1 file changed, 4 insertions(+), 6 deletions(-)
--
2.7.4
On dtb files which contain hyphens, the dt_S_dtb command to build the
dtb.S files (which allow DTB files to be built into the kernel) results
in errors like the following:
bcm3368-netgear-cvg834g.dtb.S: Assembler messages:
bcm3368-netgear-cvg834g.dtb.S:5: Error: : no such section
bcm3368-netgear-cvg834g.dtb.S:5: Error: junk at end of line, first unrecognized character is `-'
bcm3368-netgear-cvg834g.dtb.S:6: Error: unrecognized opcode `__dtb_bcm3368-netgear-cvg834g_begin:'
bcm3368-netgear-cvg834g.dtb.S:8: Error: unrecognized opcode `__dtb_bcm3368-netgear-cvg834g_end:'
bcm3368-netgear-cvg834g.dtb.S:9: Error: : no such section
bcm3368-netgear-cvg834g.dtb.S:9: Error: junk at end of line, first unrecognized character is `-'
This is due to the hyphen being used in symbol names. Replace all
hyphens with underscores in the dt_S_dtb command to avoid this problem.
Quite a lot of dts files have hyphens, but its only a problem on MIPS
where such files can be built into the kernel. For example when
CONFIG_DT_NETGEAR_CVG834G=y, or on BMIPS kernels when the dtbs target is
used (in the latter case it admitedly shouldn't really build all the
dtb.o files, but thats a separate issue).
Fixes: 695835511f96 ("MIPS: BMIPS: rename bcm96358nb4ser to bcm6358-neufbox4-sercom")
Signed-off-by: James Hogan <jhogan(a)kernel.org>
Cc: Rob Herring <robh+dt(a)kernel.org>
Cc: Frank Rowand <frowand.list(a)gmail.com>
Cc: Masahiro Yamada <yamada.masahiro(a)socionext.com>
Cc: Michal Marek <michal.lkml(a)markovi.net>
Cc: Ralf Baechle <ralf(a)linux-mips.org>
Cc: Florian Fainelli <f.fainelli(a)gmail.com>
Cc: Kevin Cernekee <cernekee(a)gmail.com>
Cc: devicetree(a)vger.kernel.org
Cc: linux-kbuild(a)vger.kernel.org
Cc: linux-mips(a)linux-mips.org
Cc: <stable(a)vger.kernel.org> # 4.9+
---
scripts/Makefile.lib | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/scripts/Makefile.lib b/scripts/Makefile.lib
index 5589bae34af6..a6f538b31ad6 100644
--- a/scripts/Makefile.lib
+++ b/scripts/Makefile.lib
@@ -297,11 +297,11 @@ cmd_dt_S_dtb= \
echo '\#include <asm-generic/vmlinux.lds.h>'; \
echo '.section .dtb.init.rodata,"a"'; \
echo '.balign STRUCT_ALIGNMENT'; \
- echo '.global __dtb_$(*F)_begin'; \
- echo '__dtb_$(*F)_begin:'; \
+ echo '.global __dtb_$(subst -,_,$(*F))_begin'; \
+ echo '__dtb_$(subst -,_,$(*F))_begin:'; \
echo '.incbin "$<" '; \
- echo '__dtb_$(*F)_end:'; \
- echo '.global __dtb_$(*F)_end'; \
+ echo '__dtb_$(subst -,_,$(*F))_end:'; \
+ echo '.global __dtb_$(subst -,_,$(*F))_end'; \
echo '.balign STRUCT_ALIGNMENT'; \
) > $@
--
2.13.6
From: John Stultz <john.stultz(a)linaro.org>
[ Upstream commit dad3f793f20fbb5c0c342f0f5a0bdf69a4d76089 ]
I had seen some odd behavior with HiKey's usb-gadget interface
that I finally seemed to have chased down. Basically every other
time I plugged in the OTG port, the gadget interface would
properly initialize. The other times, I'd get a big WARN_ON
in dwc2_hsotg_init_fifo() about the fifo_map not being clear.
Ends up if we don't disconnect the gadget state, the fifo-map
doesn't get cleared properly, which causes WARN_ON messages and
also results in the device not properly being setup as a gadget
every other time the OTG port is connected.
So this patch adds a call to dwc2_hsotg_disconnect() in the
reset path so the state is properly cleared.
With it, the gadget interface initializes properly on every
plug in.
Cc: Wei Xu <xuwei5(a)hisilicon.com>
Cc: Guodong Xu <guodong.xu(a)linaro.org>
Cc: Amit Pundir <amit.pundir(a)linaro.org>
Cc: Rob Herring <robh+dt(a)kernel.org>
Cc: John Youn <johnyoun(a)synopsys.com>
Cc: Douglas Anderson <dianders(a)chromium.org>
Cc: Chen Yu <chenyu56(a)huawei.com>
Cc: Felipe Balbi <felipe.balbi(a)linux.intel.com>
Cc: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Cc: linux-usb(a)vger.kernel.org
Acked-by: John Youn <johnyoun(a)synopsys.com>
Signed-off-by: John Stultz <john.stultz(a)linaro.org>
Signed-off-by: Felipe Balbi <felipe.balbi(a)linux.intel.com>
Signed-off-by: Sasha Levin <alexander.levin(a)microsoft.com>
---
drivers/usb/dwc2/hcd.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/drivers/usb/dwc2/hcd.c b/drivers/usb/dwc2/hcd.c
index 571c21727ff9..88bd950665fa 100644
--- a/drivers/usb/dwc2/hcd.c
+++ b/drivers/usb/dwc2/hcd.c
@@ -1385,6 +1385,7 @@ static void dwc2_conn_id_status_change(struct work_struct *work)
dwc2_core_init(hsotg, false, -1);
dwc2_enable_global_interrupts(hsotg);
spin_lock_irqsave(&hsotg->lock, flags);
+ dwc2_hsotg_disconnect(hsotg);
dwc2_hsotg_core_init_disconnected(hsotg, false);
spin_unlock_irqrestore(&hsotg->lock, flags);
dwc2_hsotg_core_connect(hsotg);
--
2.14.1
This is a note to let you know that I've just added the patch titled
platform/x86: dell-laptop: fix kbd_get_state's request value
to the 4.15-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
platform-x86-dell-laptop-fix-kbd_get_state-s-request-value.patch
and it can be found in the queue-4.15 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From eca39e7f0cdb9bde4003a29149fa695e876c6f73 Mon Sep 17 00:00:00 2001
From: Laszlo Toth <laszlth(a)gmail.com>
Date: Tue, 13 Feb 2018 21:43:43 +0100
Subject: platform/x86: dell-laptop: fix kbd_get_state's request value
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
From: Laszlo Toth <laszlth(a)gmail.com>
commit eca39e7f0cdb9bde4003a29149fa695e876c6f73 upstream.
Commit 9862b43624a5 ("platform/x86: dell-laptop: Allocate buffer on heap
rather than globally")
broke one request, changed it back to the original value.
Tested on a Dell E6540, backlight came back.
Fixes: 9862b43624a5 ("platform/x86: dell-laptop: Allocate buffer on heap rather than globally")
Signed-off-by: Laszlo Toth <laszlth(a)gmail.com>
Reviewed-by: Pali Rohár <pali.rohar(a)gmail.com>
Reviewed-by: Mario Limonciello <mario.limonciello(a)dell.com>
Signed-off-by: Andy Shevchenko <andriy.shevchenko(a)linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/platform/x86/dell-laptop.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/platform/x86/dell-laptop.c
+++ b/drivers/platform/x86/dell-laptop.c
@@ -1254,7 +1254,7 @@ static int kbd_get_state(struct kbd_stat
struct calling_interface_buffer buffer;
int ret;
- dell_fill_request(&buffer, 0, 0, 0, 0);
+ dell_fill_request(&buffer, 0x1, 0, 0, 0);
ret = dell_send_request(&buffer,
CLASS_KBD_BACKLIGHT, SELECT_KBD_BACKLIGHT);
if (ret)
Patches currently in stable-queue which might be from laszlth(a)gmail.com are
queue-4.15/platform-x86-dell-laptop-fix-kbd_get_state-s-request-value.patch
On Wed 2018-03-07 22:11:13, David Woodhouse wrote:
>
>
> On Wed, 2018-03-07 at 14:08 -0800, Steve deRosier wrote:
> >
> > To clarify one thing: the reason for this is MLC has actually never
> > been supported, nor worked properly. The fact that it kinda worked was
> > incidental and the cause of major problems for people due to that not
> > being clear. This patch only makes it explicit and avoids people
> > mistakenly trying to use UBIFS on MLC flash and risking their data and
> > products. To me, that's what's important.
> >
> > This is an important patch, even if all it does is keep people from
> > loosing data. It also changes the conversation from "I have a
> > corrupted UBIFS device, BTW it's on MLC..." to "What can we do to get
> > UBIFS to work on MLC".
Well, for -stable I'd suggest printk(KERN_ALERT ...) but keep the
system running.
> This is a bug fix.
>
> UBI on MLC never worked. It was a bug that we ever permitted it. This
> is now fixed.
Yeah, well, so lets say I have a working hardware (maybe using
read-only UBI on MLC), update to next stable kernel, and now kernel
refuses to see the partition.
I'll certainly not consider this patch a bug fix.
Removing support for hardware that "only works by mistake" may be good
idea, but maybe it is slightly too surprising for a -stable.
Pavel
--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
On 03/07/18 12:25, James Hogan wrote:
> On Wed, Mar 07, 2018 at 12:11:41PM -0800, Frank Rowand wrote:
>> I initially misread the patch description (and imagined an entirely
>> different problem).
>>
>>
>> On 03/07/18 06:06, James Hogan wrote:
>>> On dtb files which contain hyphens, the dt_S_dtb command to build the> dtb.S files (which allow DTB files to be built into the kernel) results> in errors like the following:> > bcm3368-netgear-cvg834g.dtb.S: Assembler messages:> bcm3368-netgear-cvg834g.dtb.S:5: Error: : no such section> bcm3368-netgear-cvg834g.dtb.S:5: Error: junk at end of line, first unrecognized character is `-'> bcm3368-netgear-cvg834g.dtb.S:6: Error: unrecognized opcode `__dtb_bcm3368-netgear-cvg834g_begin:'> bcm3368-netgear-cvg834g.dtb.S:8: Error: unrecognized opcode `__dtb_bcm3368-netgear-cvg834g_end:'> bcm3368-netgear-cvg834g.dtb.S:9: Error: : no such section> bcm3368-netgear-cvg834g.dtb.S:9: Error: junk at end of line, first unrecognized character is `-'
>> Please replace the following section:
>>
>>> This is due to the hyphen being used in symbol names. Replace all
>>> hyphens
>>> with underscores in the dt_S_dtb command to avoid this problem.
>>>
>>> Quite a lot of dts files have hyphens, but its only a problem on MIPS
>>> where such files can be built into the kernel. For example when
>>> CONFIG_DT_NETGEAR_CVG834G=y, or on BMIPS kernels when the dtbs target is
>>> used (in the latter case it admitedly shouldn't really build all the
>>> dtb.o files, but thats a separate issue).
>>
>> with:
>>
>> cmd_dt_S_dtb constructs the assembly source to incorporate a devicetree
>> FDT (that is, the .dtb file) as binary data in the kernel image.
>> This assembly source contains labels before and after the binary data.
>> The label names incorporate the file name of the corresponding .dtb
>> file. Hyphens are not legal characters in labels, so transform all
>> hyphens from the file name to underscores when constructing the labels.
>
> Thanks, that is clearer.
>
> I'll keep the paragraph about MIPS and the example configuration though,
> as I think its important information to reproduce the problem, and to
> justify why it wouldn't be appropriate to just rename the files (which
> was my first reaction).
Other than the part that says "its only a problem on MIPS". That is
pedantically correct because no other architecture (that I am aware
of, not that I searched) currently has a devicetree source file name
with a hyphen in it, where that file is compiled into the kernel as
an asm file. But it is potentially a problem on any architecture
to it is misleading to label it as MIPS only.
>
>> Reviewed-by: Frank Rowand <frowand.list(a)gmail.com>
>
> Thanks
> James
>
On Wed, 7 Mar 2018 22:43:42 +0100
Pavel Machek <pavel(a)ucw.cz> wrote:
> On Wed 2018-03-07 09:01:16, Richard Weinberger wrote:
> > Pavel,
> >
> > Am Mittwoch, 7. März 2018, 00:18:05 CET schrieb Pavel Machek:
> > > On Sat 2018-03-03 11:45:54, Richard Weinberger wrote:
> > > > While UBI and UBIFS seem to work at first sight with MLC NAND, you will
> > > > most likely lose all your data upon a power-cut or due to read/write
> > > > disturb.
> > > > In order to protect users from bad surprises, refuse to attach to MLC
> > > > NAND.
> > > >
> > > > Cc: stable(a)vger.kernel.org
> > >
> > > That sounds like _really_ bad idea for stable. All it does is it
> > > removes support for hardware that somehow works.
> >
> > MLC is not supported and does not work. Full stop.
> > If someone manages to get it somehow work, either with hardware or software
> > hacks they are on their own.
> > Having it in stable is the only chance we have to get it into vendor
> > kernels.
>
> Can you show how it meets the stable kernel criteria? They are
> documented in tree. This should not be in stable.
>
> And I'd like to see changelog improved. Real reason MLC is not
> supported is upper/lower page parts on MLC. And real fix to work with
> bigger pages in UBI.
Come on! Don't you think this would have been fixed already if it was
that easy?! Have you looked at an MLC datasheet to see how paired pages
are combined? If you had you would now that paired pages are almost all
the time not contiguous, thus preventing the trick you're suggesting
here. Please document yourself before doing such presumptuous
statements (you can have a look at these slides if you want some details
about why this is not so simple [1]).
I'm definitely not saying supporting MLC NANDs in Linux is impossible,
and if you're interested in working on this topic I'd be happy to help.
But please don't block this patch without understanding what supporting
MLC NANDs implies.
[1]https://events.static.linuxfound.org/sites/events/files/slides/ubi-mlc.pdf
--
Boris Brezillon, Bootlin (formerly Free Electrons)
Embedded Linux and Kernel engineering
https://bootlin.com
Hi Pavel,
On Wed, Mar 7, 2018 at 1:43 PM, Pavel Machek <pavel(a)ucw.cz> wrote:
> On Wed 2018-03-07 09:01:16, Richard Weinberger wrote:
>> Pavel,
>>
>> Am Mittwoch, 7. März 2018, 00:18:05 CET schrieb Pavel Machek:
>> > On Sat 2018-03-03 11:45:54, Richard Weinberger wrote:
>> > > While UBI and UBIFS seem to work at first sight with MLC NAND, you will
>> > > most likely lose all your data upon a power-cut or due to read/write
>> > > disturb.
>> > > In order to protect users from bad surprises, refuse to attach to MLC
>> > > NAND.
>> > >
>> > > Cc: stable(a)vger.kernel.org
>> >
>> > That sounds like _really_ bad idea for stable. All it does is it
>> > removes support for hardware that somehow works.
>>
>> MLC is not supported and does not work. Full stop.
>> If someone manages to get it somehow work, either with hardware or software
>> hacks they are on their own.
>> Having it in stable is the only chance we have to get it into vendor
>> kernels.
>
> Can you show how it meets the stable kernel criteria? They are
> documented in tree. This should not be in stable.
>
> And I'd like to see changelog improved. Real reason MLC is not
> supported is upper/lower page parts on MLC. And real fix to work with
> bigger pages in UBI.
>
To clarify one thing: the reason for this is MLC has actually never
been supported, nor worked properly. The fact that it kinda worked was
incidental and the cause of major problems for people due to that not
being clear. This patch only makes it explicit and avoids people
mistakenly trying to use UBIFS on MLC flash and risking their data and
products. To me, that's what's important.
This is an important patch, even if all it does is keep people from
loosing data. It also changes the conversation from "I have a
corrupted UBIFS device, BTW it's on MLC..." to "What can we do to get
UBIFS to work on MLC".
I don't know what the stable criteria is with re: to this patch. But
what I do know is if it doesn't go back into the various stables,
there will be manufacturers who will continue to try to use UBIFS on
MLC in ignorance for the next several years until the current stable
kernels EOL, despite there being a known patch that would make it
immediately obvious they shouldn't.
Thanks,
- Steve
On Wed, Mar 07, 2018 at 02:02:13PM -0800, Paul Lawrence wrote:
> Great! We need to make sure this gets backported to 4.4 and 4.9, and to
> 3.18 with the original dependency, please.
That will happen when it lands in Linus's tree, which should be later
this week if all goes well.
thanks,
greg k-h
This is a note to let you know that I've just added the patch titled
nvme-rdma: don't suppress send completions
to the 4.14-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
nvme-rdma-don-t-suppress-send-completions.patch
and it can be found in the queue-4.14 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From b4b591c87f2b0f4ebaf3a68d4f13873b241aa584 Mon Sep 17 00:00:00 2001
From: Sagi Grimberg <sagi(a)grimberg.me>
Date: Thu, 23 Nov 2017 17:35:21 +0200
Subject: nvme-rdma: don't suppress send completions
From: Sagi Grimberg <sagi(a)grimberg.me>
commit b4b591c87f2b0f4ebaf3a68d4f13873b241aa584 upstream.
The entire completions suppress mechanism is currently broken because the
HCA might retry a send operation (due to dropped ack) after the nvme
transaction has completed.
In order to handle this, we signal all send completions and introduce a
separate done handler for async events as they will be handled differently
(as they don't include in-capsule data by definition).
Signed-off-by: Sagi Grimberg <sagi(a)grimberg.me>
Reviewed-by: Max Gurtovoy <maxg(a)mellanox.com>
Signed-off-by: Christoph Hellwig <hch(a)lst.de>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/nvme/host/rdma.c | 54 ++++++++++++-----------------------------------
1 file changed, 14 insertions(+), 40 deletions(-)
--- a/drivers/nvme/host/rdma.c
+++ b/drivers/nvme/host/rdma.c
@@ -88,7 +88,6 @@ enum nvme_rdma_queue_flags {
struct nvme_rdma_queue {
struct nvme_rdma_qe *rsp_ring;
- atomic_t sig_count;
int queue_size;
size_t cmnd_capsule_len;
struct nvme_rdma_ctrl *ctrl;
@@ -521,7 +520,6 @@ static int nvme_rdma_alloc_queue(struct
queue->cmnd_capsule_len = sizeof(struct nvme_command);
queue->queue_size = queue_size;
- atomic_set(&queue->sig_count, 0);
queue->cm_id = rdma_create_id(&init_net, nvme_rdma_cm_handler, queue,
RDMA_PS_TCP, IB_QPT_RC);
@@ -1232,21 +1230,9 @@ static void nvme_rdma_send_done(struct i
nvme_end_request(rq, req->status, req->result);
}
-/*
- * We want to signal completion at least every queue depth/2. This returns the
- * largest power of two that is not above half of (queue size + 1) to optimize
- * (avoid divisions).
- */
-static inline bool nvme_rdma_queue_sig_limit(struct nvme_rdma_queue *queue)
-{
- int limit = 1 << ilog2((queue->queue_size + 1) / 2);
-
- return (atomic_inc_return(&queue->sig_count) & (limit - 1)) == 0;
-}
-
static int nvme_rdma_post_send(struct nvme_rdma_queue *queue,
struct nvme_rdma_qe *qe, struct ib_sge *sge, u32 num_sge,
- struct ib_send_wr *first, bool flush)
+ struct ib_send_wr *first)
{
struct ib_send_wr wr, *bad_wr;
int ret;
@@ -1255,31 +1241,12 @@ static int nvme_rdma_post_send(struct nv
sge->length = sizeof(struct nvme_command),
sge->lkey = queue->device->pd->local_dma_lkey;
- qe->cqe.done = nvme_rdma_send_done;
-
wr.next = NULL;
wr.wr_cqe = &qe->cqe;
wr.sg_list = sge;
wr.num_sge = num_sge;
wr.opcode = IB_WR_SEND;
- wr.send_flags = 0;
-
- /*
- * Unsignalled send completions are another giant desaster in the
- * IB Verbs spec: If we don't regularly post signalled sends
- * the send queue will fill up and only a QP reset will rescue us.
- * Would have been way to obvious to handle this in hardware or
- * at least the RDMA stack..
- *
- * Always signal the flushes. The magic request used for the flush
- * sequencer is not allocated in our driver's tagset and it's
- * triggered to be freed by blk_cleanup_queue(). So we need to
- * always mark it as signaled to ensure that the "wr_cqe", which is
- * embedded in request's payload, is not freed when __ib_process_cq()
- * calls wr_cqe->done().
- */
- if (nvme_rdma_queue_sig_limit(queue) || flush)
- wr.send_flags |= IB_SEND_SIGNALED;
+ wr.send_flags = IB_SEND_SIGNALED;
if (first)
first->next = ≀
@@ -1329,6 +1296,12 @@ static struct blk_mq_tags *nvme_rdma_tag
return queue->ctrl->tag_set.tags[queue_idx - 1];
}
+static void nvme_rdma_async_done(struct ib_cq *cq, struct ib_wc *wc)
+{
+ if (unlikely(wc->status != IB_WC_SUCCESS))
+ nvme_rdma_wr_error(cq, wc, "ASYNC");
+}
+
static void nvme_rdma_submit_async_event(struct nvme_ctrl *arg, int aer_idx)
{
struct nvme_rdma_ctrl *ctrl = to_rdma_ctrl(arg);
@@ -1350,10 +1323,12 @@ static void nvme_rdma_submit_async_event
cmd->common.flags |= NVME_CMD_SGL_METABUF;
nvme_rdma_set_sg_null(cmd);
+ sqe->cqe.done = nvme_rdma_async_done;
+
ib_dma_sync_single_for_device(dev, sqe->dma, sizeof(*cmd),
DMA_TO_DEVICE);
- ret = nvme_rdma_post_send(queue, sqe, &sge, 1, NULL, false);
+ ret = nvme_rdma_post_send(queue, sqe, &sge, 1, NULL);
WARN_ON_ONCE(ret);
}
@@ -1639,7 +1614,6 @@ static blk_status_t nvme_rdma_queue_rq(s
struct nvme_rdma_request *req = blk_mq_rq_to_pdu(rq);
struct nvme_rdma_qe *sqe = &req->sqe;
struct nvme_command *c = sqe->data;
- bool flush = false;
struct ib_device *dev;
blk_status_t ret;
int err;
@@ -1668,13 +1642,13 @@ static blk_status_t nvme_rdma_queue_rq(s
goto err;
}
+ sqe->cqe.done = nvme_rdma_send_done;
+
ib_dma_sync_single_for_device(dev, sqe->dma,
sizeof(struct nvme_command), DMA_TO_DEVICE);
- if (req_op(rq) == REQ_OP_FLUSH)
- flush = true;
err = nvme_rdma_post_send(queue, sqe, req->sge, req->num_sge,
- req->mr->need_inval ? &req->reg_wr.wr : NULL, flush);
+ req->mr->need_inval ? &req->reg_wr.wr : NULL);
if (unlikely(err)) {
nvme_rdma_unmap_data(queue, rq);
goto err;
Patches currently in stable-queue which might be from sagi(a)grimberg.me are
queue-4.14/nvme-rdma-don-t-suppress-send-completions.patch
This is a note to let you know that I've just added the patch titled
netlink: put module reference if dump start fails
to the 4.9-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
netlink-put-module-reference-if-dump-start-fails.patch
and it can be found in the queue-4.9 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From b87b6194be631c94785fe93398651e804ed43e28 Mon Sep 17 00:00:00 2001
From: "Jason A. Donenfeld" <Jason(a)zx2c4.com>
Date: Wed, 21 Feb 2018 04:41:59 +0100
Subject: netlink: put module reference if dump start fails
From: Jason A. Donenfeld <Jason(a)zx2c4.com>
commit b87b6194be631c94785fe93398651e804ed43e28 upstream.
Before, if cb->start() failed, the module reference would never be put,
because cb->cb_running is intentionally false at this point. Users are
generally annoyed by this because they can no longer unload modules that
leak references. Also, it may be possible to tediously wrap a reference
counter back to zero, especially since module.c still uses atomic_inc
instead of refcount_inc.
This patch expands the error path to simply call module_put if
cb->start() fails.
Fixes: 41c87425a1ac ("netlink: do not set cb_running if dump's start() errs")
Signed-off-by: Jason A. Donenfeld <Jason(a)zx2c4.com>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
net/netlink/af_netlink.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
--- a/net/netlink/af_netlink.c
+++ b/net/netlink/af_netlink.c
@@ -2258,7 +2258,7 @@ int __netlink_dump_start(struct sock *ss
if (cb->start) {
ret = cb->start(cb);
if (ret)
- goto error_unlock;
+ goto error_put;
}
nlk->cb_running = true;
@@ -2278,6 +2278,8 @@ int __netlink_dump_start(struct sock *ss
*/
return -EINTR;
+error_put:
+ module_put(control->module);
error_unlock:
sock_put(sk);
mutex_unlock(nlk->cb_mutex);
Patches currently in stable-queue which might be from Jason(a)zx2c4.com are
queue-4.9/netlink-put-module-reference-if-dump-start-fails.patch
This is a note to let you know that I've just added the patch titled
netlink: put module reference if dump start fails
to the 4.14-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
netlink-put-module-reference-if-dump-start-fails.patch
and it can be found in the queue-4.14 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Tue Mar 6 19:02:12 PST 2018
From: "Jason A. Donenfeld" <Jason(a)zx2c4.com>
Date: Wed, 21 Feb 2018 04:41:59 +0100
Subject: netlink: put module reference if dump start fails
From: "Jason A. Donenfeld" <Jason(a)zx2c4.com>
[ Upstream commit b87b6194be631c94785fe93398651e804ed43e28 ]
Before, if cb->start() failed, the module reference would never be put,
because cb->cb_running is intentionally false at this point. Users are
generally annoyed by this because they can no longer unload modules that
leak references. Also, it may be possible to tediously wrap a reference
counter back to zero, especially since module.c still uses atomic_inc
instead of refcount_inc.
This patch expands the error path to simply call module_put if
cb->start() fails.
Fixes: 41c87425a1ac ("netlink: do not set cb_running if dump's start() errs")
Signed-off-by: Jason A. Donenfeld <Jason(a)zx2c4.com>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
net/netlink/af_netlink.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
--- a/net/netlink/af_netlink.c
+++ b/net/netlink/af_netlink.c
@@ -2276,7 +2276,7 @@ int __netlink_dump_start(struct sock *ss
if (cb->start) {
ret = cb->start(cb);
if (ret)
- goto error_unlock;
+ goto error_put;
}
nlk->cb_running = true;
@@ -2296,6 +2296,8 @@ int __netlink_dump_start(struct sock *ss
*/
return -EINTR;
+error_put:
+ module_put(control->module);
error_unlock:
sock_put(sk);
mutex_unlock(nlk->cb_mutex);
Patches currently in stable-queue which might be from Jason(a)zx2c4.com are
queue-4.14/netlink-put-module-reference-if-dump-start-fails.patch
This is a note to let you know that I've just added the patch titled
x86/speculation: Use Indirect Branch Prediction Barrier in context switch
to the 4.9-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
x86-speculation-use-indirect-branch-prediction-barrier-in-context-switch.patch
and it can be found in the queue-4.9 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From 18bf3c3ea8ece8f03b6fc58508f2dfd23c7711c7 Mon Sep 17 00:00:00 2001
From: Tim Chen <tim.c.chen(a)linux.intel.com>
Date: Mon, 29 Jan 2018 22:04:47 +0000
Subject: x86/speculation: Use Indirect Branch Prediction Barrier in context switch
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
From: Tim Chen <tim.c.chen(a)linux.intel.com>
commit 18bf3c3ea8ece8f03b6fc58508f2dfd23c7711c7 upstream.
Flush indirect branches when switching into a process that marked itself
non dumpable. This protects high value processes like gpg better,
without having too high performance overhead.
If done naïvely, we could switch to a kernel idle thread and then back
to the original process, such as:
process A -> idle -> process A
In such scenario, we do not have to do IBPB here even though the process
is non-dumpable, as we are switching back to the same process after a
hiatus.
To avoid the redundant IBPB, which is expensive, we track the last mm
user context ID. The cost is to have an extra u64 mm context id to track
the last mm we were using before switching to the init_mm used by idle.
Avoiding the extra IBPB is probably worth the extra memory for this
common scenario.
For those cases where tlb_defer_switch_to_init_mm() returns true (non
PCID), lazy tlb will defer switch to init_mm, so we will not be changing
the mm for the process A -> idle -> process A switch. So IBPB will be
skipped for this case.
Thanks to the reviewers and Andy Lutomirski for the suggestion of
using ctx_id which got rid of the problem of mm pointer recycling.
Signed-off-by: Tim Chen <tim.c.chen(a)linux.intel.com>
Signed-off-by: David Woodhouse <dwmw(a)amazon.co.uk>
Signed-off-by: Thomas Gleixner <tglx(a)linutronix.de>
Cc: ak(a)linux.intel.com
Cc: karahmed(a)amazon.de
Cc: arjan(a)linux.intel.com
Cc: torvalds(a)linux-foundation.org
Cc: linux(a)dominikbrodowski.net
Cc: peterz(a)infradead.org
Cc: bp(a)alien8.de
Cc: luto(a)kernel.org
Cc: pbonzini(a)redhat.com
Link: https://lkml.kernel.org/r/1517263487-3708-1-git-send-email-dwmw@amazon.co.uk
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
arch/x86/include/asm/tlbflush.h | 2 ++
arch/x86/mm/tlb.c | 31 +++++++++++++++++++++++++++++++
2 files changed, 33 insertions(+)
--- a/arch/x86/include/asm/tlbflush.h
+++ b/arch/x86/include/asm/tlbflush.h
@@ -68,6 +68,8 @@ static inline void invpcid_flush_all_non
struct tlb_state {
struct mm_struct *active_mm;
int state;
+ /* last user mm's ctx id */
+ u64 last_ctx_id;
/*
* Access to this CR4 shadow and to H/W CR4 is protected by
--- a/arch/x86/mm/tlb.c
+++ b/arch/x86/mm/tlb.c
@@ -10,6 +10,7 @@
#include <asm/tlbflush.h>
#include <asm/mmu_context.h>
+#include <asm/nospec-branch.h>
#include <asm/cache.h>
#include <asm/apic.h>
#include <asm/uv/uv.h>
@@ -106,6 +107,28 @@ void switch_mm_irqs_off(struct mm_struct
unsigned cpu = smp_processor_id();
if (likely(prev != next)) {
+ u64 last_ctx_id = this_cpu_read(cpu_tlbstate.last_ctx_id);
+
+ /*
+ * Avoid user/user BTB poisoning by flushing the branch
+ * predictor when switching between processes. This stops
+ * one process from doing Spectre-v2 attacks on another.
+ *
+ * As an optimization, flush indirect branches only when
+ * switching into processes that disable dumping. This
+ * protects high value processes like gpg, without having
+ * too high performance overhead. IBPB is *expensive*!
+ *
+ * This will not flush branches when switching into kernel
+ * threads. It will also not flush if we switch to idle
+ * thread and back to the same process. It will flush if we
+ * switch to a different non-dumpable process.
+ */
+ if (tsk && tsk->mm &&
+ tsk->mm->context.ctx_id != last_ctx_id &&
+ get_dumpable(tsk->mm) != SUID_DUMP_USER)
+ indirect_branch_prediction_barrier();
+
if (IS_ENABLED(CONFIG_VMAP_STACK)) {
/*
* If our current stack is in vmalloc space and isn't
@@ -120,6 +143,14 @@ void switch_mm_irqs_off(struct mm_struct
set_pgd(pgd, init_mm.pgd[stack_pgd_index]);
}
+ /*
+ * Record last user mm's context id, so we can avoid
+ * flushing branch buffer with IBPB if we switch back
+ * to the same user.
+ */
+ if (next != &init_mm)
+ this_cpu_write(cpu_tlbstate.last_ctx_id, next->context.ctx_id);
+
this_cpu_write(cpu_tlbstate.state, TLBSTATE_OK);
this_cpu_write(cpu_tlbstate.active_mm, next);
Patches currently in stable-queue which might be from tim.c.chen(a)linux.intel.com are
queue-4.9/x86-speculation-use-indirect-branch-prediction-barrier-in-context-switch.patch
queue-4.9/x86-mm-give-each-mm-tlb-flush-generation-a-unique-id.patch
This is a note to let you know that I've just added the patch titled
md: only allow remove_and_add_spares when no sync_thread running.
to the 4.9-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
md-only-allow-remove_and_add_spares-when-no-sync_thread-running.patch
and it can be found in the queue-4.9 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From 39772f0a7be3b3dc26c74ea13fe7847fd1522c8b Mon Sep 17 00:00:00 2001
From: NeilBrown <neilb(a)suse.com>
Date: Sat, 3 Feb 2018 09:19:30 +1100
Subject: md: only allow remove_and_add_spares when no sync_thread running.
From: NeilBrown <neilb(a)suse.com>
commit 39772f0a7be3b3dc26c74ea13fe7847fd1522c8b upstream.
The locking protocols in md assume that a device will
never be removed from an array during resync/recovery/reshape.
When that isn't happening, rcu or reconfig_mutex is needed
to protect an rdev pointer while taking a refcount. When
it is happening, that protection isn't needed.
Unfortunately there are cases were remove_and_add_spares() is
called when recovery might be happening: is state_store(),
slot_store() and hot_remove_disk().
In each case, this is just an optimization, to try to expedite
removal from the personality so the device can be removed from
the array. If resync etc is happening, we just have to wait
for md_check_recover to find a suitable time to call
remove_and_add_spares().
This optimization and not essential so it doesn't
matter if it fails.
So change remove_and_add_spares() to abort early if
resync/recovery/reshape is happening, unless it is called
from md_check_recovery() as part of a newly started recovery.
The parameter "this" is only NULL when called from
md_check_recovery() so when it is NULL, there is no need to abort.
As this can result in a NULL dereference, the fix is suitable
for -stable.
cc: yuyufen <yuyufen(a)huawei.com>
Cc: Tomasz Majchrzak <tomasz.majchrzak(a)intel.com>
Fixes: 8430e7e0af9a ("md: disconnect device from personality before trying to remove it.")
Cc: stable(a)ver.kernel.org (v4.8+)
Signed-off-by: NeilBrown <neilb(a)suse.com>
Signed-off-by: Shaohua Li <sh.li(a)alibaba-inc.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/md/md.c | 4 ++++
1 file changed, 4 insertions(+)
--- a/drivers/md/md.c
+++ b/drivers/md/md.c
@@ -8224,6 +8224,10 @@ static int remove_and_add_spares(struct
int removed = 0;
bool remove_some = false;
+ if (this && test_bit(MD_RECOVERY_RUNNING, &mddev->recovery))
+ /* Mustn't remove devices when resync thread is running */
+ return 0;
+
rdev_for_each(rdev, mddev) {
if ((this == NULL || rdev == this) &&
rdev->raid_disk >= 0 &&
Patches currently in stable-queue which might be from neilb(a)suse.com are
queue-4.9/md-only-allow-remove_and_add_spares-when-no-sync_thread-running.patch
This is a note to let you know that I've just added the patch titled
dm io: fix duplicate bio completion due to missing ref count
to the 4.9-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
dm-io-fix-duplicate-bio-completion-due-to-missing-ref-count.patch
and it can be found in the queue-4.9 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From feb7695fe9fb83084aa29de0094774f4c9d4c9fc Mon Sep 17 00:00:00 2001
From: Mike Snitzer <snitzer(a)redhat.com>
Date: Tue, 20 Jun 2017 19:14:30 -0400
Subject: dm io: fix duplicate bio completion due to missing ref count
From: Mike Snitzer <snitzer(a)redhat.com>
commit feb7695fe9fb83084aa29de0094774f4c9d4c9fc upstream.
If only a subset of the devices associated with multiple regions support
a given special operation (eg. DISCARD) then the dec_count() that is
used to set error for the region must increment the io->count.
Otherwise, when the dec_count() is called it can cause the dm-io
caller's bio to be completed multiple times. As was reported against
the dm-mirror target that had mirror legs with a mix of discard
capabilities.
Bug: https://bugzilla.kernel.org/show_bug.cgi?id=196077
Reported-by: Zhang Yi <yizhan(a)redhat.com>
Signed-off-by: Mike Snitzer <snitzer(a)redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/md/dm-io.c | 1 +
1 file changed, 1 insertion(+)
--- a/drivers/md/dm-io.c
+++ b/drivers/md/dm-io.c
@@ -302,6 +302,7 @@ static void do_region(int op, int op_fla
special_cmd_max_sectors = q->limits.max_write_same_sectors;
if ((op == REQ_OP_DISCARD || op == REQ_OP_WRITE_SAME) &&
special_cmd_max_sectors == 0) {
+ atomic_inc(&io->count);
dec_count(io, region, -EOPNOTSUPP);
return;
}
Patches currently in stable-queue which might be from snitzer(a)redhat.com are
queue-4.9/dm-io-fix-duplicate-bio-completion-due-to-missing-ref-count.patch
This is a note to let you know that I've just added the patch titled
ARM: dts: LogicPD Torpedo: Fix I2C1 pinmux
to the 4.9-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
arm-dts-logicpd-torpedo-fix-i2c1-pinmux.patch
and it can be found in the queue-4.9 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From 74402055a2d3ec998a1ded599e86185a27d9bbf4 Mon Sep 17 00:00:00 2001
From: Adam Ford <aford173(a)gmail.com>
Date: Thu, 25 Jan 2018 14:10:37 -0600
Subject: ARM: dts: LogicPD Torpedo: Fix I2C1 pinmux
From: Adam Ford <aford173(a)gmail.com>
commit 74402055a2d3ec998a1ded599e86185a27d9bbf4 upstream.
The pinmuxing was missing for I2C1 which was causing intermittent issues
with the PMIC which is connected to I2C1. The bootloader did not quite
configure the I2C1 either, so when running at 2.6MHz, it was generating
errors at time.
This correctly sets the I2C1 pinmuxing so it can operate at 2.6MHz
Fixes: 687c27676151 ("ARM: dts: Add minimal support for LogicPD Torpedo
DM3730 devkit")
Signed-off-by: Adam Ford <aford173(a)gmail.com>
Signed-off-by: Tony Lindgren <tony(a)atomide.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
arch/arm/boot/dts/logicpd-torpedo-som.dtsi | 8 ++++++++
1 file changed, 8 insertions(+)
--- a/arch/arm/boot/dts/logicpd-torpedo-som.dtsi
+++ b/arch/arm/boot/dts/logicpd-torpedo-som.dtsi
@@ -100,6 +100,8 @@
};
&i2c1 {
+ pinctrl-names = "default";
+ pinctrl-0 = <&i2c1_pins>;
clock-frequency = <2600000>;
twl: twl@48 {
@@ -207,6 +209,12 @@
OMAP3_CORE1_IOPAD(0x21b8, PIN_INPUT | MUX_MODE0) /* hsusb0_data7.hsusb0_data7 */
>;
};
+ i2c1_pins: pinmux_i2c1_pins {
+ pinctrl-single,pins = <
+ OMAP3_CORE1_IOPAD(0x21ba, PIN_INPUT | MUX_MODE0) /* i2c1_scl.i2c1_scl */
+ OMAP3_CORE1_IOPAD(0x21bc, PIN_INPUT | MUX_MODE0) /* i2c1_sda.i2c1_sda */
+ >;
+ };
};
&uart2 {
Patches currently in stable-queue which might be from aford173(a)gmail.com are
queue-4.9/arm-dts-logicpd-torpedo-fix-i2c1-pinmux.patch
queue-4.9/arm-dts-logicpd-som-lv-fix-i2c1-pinmux.patch
This is a note to let you know that I've just added the patch titled
ARM: dts: LogicPD SOM-LV: Fix I2C1 pinmux
to the 4.9-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
arm-dts-logicpd-som-lv-fix-i2c1-pinmux.patch
and it can be found in the queue-4.9 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From 84c7efd607e7fb6933920322086db64654f669b2 Mon Sep 17 00:00:00 2001
From: Adam Ford <aford173(a)gmail.com>
Date: Sat, 27 Jan 2018 15:27:05 -0600
Subject: ARM: dts: LogicPD SOM-LV: Fix I2C1 pinmux
From: Adam Ford <aford173(a)gmail.com>
commit 84c7efd607e7fb6933920322086db64654f669b2 upstream.
The pinmuxing was missing for I2C1 which was causing intermittent issues
with the PMIC which is connected to I2C1. The bootloader did not quite
configure the I2C1 either, so when running at 2.6MHz, it was generating
errors at times.
This correctly sets the I2C1 pinmuxing so it can operate at 2.6MHz
Fixes: ab8dd3aed011 ("ARM: DTS: Add minimal Support for Logic PD DM3730
SOM-LV")
Signed-off-by: Adam Ford <aford173(a)gmail.com>
Signed-off-by: Tony Lindgren <tony(a)atomide.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
arch/arm/boot/dts/logicpd-som-lv.dtsi | 9 ++++++++-
1 file changed, 8 insertions(+), 1 deletion(-)
--- a/arch/arm/boot/dts/logicpd-som-lv.dtsi
+++ b/arch/arm/boot/dts/logicpd-som-lv.dtsi
@@ -97,6 +97,8 @@
};
&i2c1 {
+ pinctrl-names = "default";
+ pinctrl-0 = <&i2c1_pins>;
clock-frequency = <2600000>;
twl: twl@48 {
@@ -215,7 +217,12 @@
>;
};
-
+ i2c1_pins: pinmux_i2c1_pins {
+ pinctrl-single,pins = <
+ OMAP3_CORE1_IOPAD(0x21ba, PIN_INPUT | MUX_MODE0) /* i2c1_scl.i2c1_scl */
+ OMAP3_CORE1_IOPAD(0x21bc, PIN_INPUT | MUX_MODE0) /* i2c1_sda.i2c1_sda */
+ >;
+ };
};
&omap3_pmx_wkup {
Patches currently in stable-queue which might be from aford173(a)gmail.com are
queue-4.9/arm-dts-logicpd-torpedo-fix-i2c1-pinmux.patch
queue-4.9/arm-dts-logicpd-som-lv-fix-i2c1-pinmux.patch
This is a note to let you know that I've just added the patch titled
ARM: dts: LogicPD Torpedo: Fix I2C1 pinmux
to the 4.4-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
arm-dts-logicpd-torpedo-fix-i2c1-pinmux.patch
and it can be found in the queue-4.4 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From 74402055a2d3ec998a1ded599e86185a27d9bbf4 Mon Sep 17 00:00:00 2001
From: Adam Ford <aford173(a)gmail.com>
Date: Thu, 25 Jan 2018 14:10:37 -0600
Subject: ARM: dts: LogicPD Torpedo: Fix I2C1 pinmux
From: Adam Ford <aford173(a)gmail.com>
commit 74402055a2d3ec998a1ded599e86185a27d9bbf4 upstream.
The pinmuxing was missing for I2C1 which was causing intermittent issues
with the PMIC which is connected to I2C1. The bootloader did not quite
configure the I2C1 either, so when running at 2.6MHz, it was generating
errors at time.
This correctly sets the I2C1 pinmuxing so it can operate at 2.6MHz
Fixes: 687c27676151 ("ARM: dts: Add minimal support for LogicPD Torpedo
DM3730 devkit")
Signed-off-by: Adam Ford <aford173(a)gmail.com>
Signed-off-by: Tony Lindgren <tony(a)atomide.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
arch/arm/boot/dts/logicpd-torpedo-som.dtsi | 8 ++++++++
1 file changed, 8 insertions(+)
--- a/arch/arm/boot/dts/logicpd-torpedo-som.dtsi
+++ b/arch/arm/boot/dts/logicpd-torpedo-som.dtsi
@@ -90,6 +90,8 @@
};
&i2c1 {
+ pinctrl-names = "default";
+ pinctrl-0 = <&i2c1_pins>;
clock-frequency = <2600000>;
twl: twl@48 {
@@ -146,6 +148,12 @@
OMAP3630_CORE2_IOPAD(0x25da, PIN_INPUT_PULLUP | MUX_MODE2) /* etk_ctl.sdmmc3_cmd */
>;
};
+ i2c1_pins: pinmux_i2c1_pins {
+ pinctrl-single,pins = <
+ OMAP3_CORE1_IOPAD(0x21ba, PIN_INPUT | MUX_MODE0) /* i2c1_scl.i2c1_scl */
+ OMAP3_CORE1_IOPAD(0x21bc, PIN_INPUT | MUX_MODE0) /* i2c1_sda.i2c1_sda */
+ >;
+ };
};
#include "twl4030.dtsi"
Patches currently in stable-queue which might be from aford173(a)gmail.com are
queue-4.4/arm-dts-logicpd-torpedo-fix-i2c1-pinmux.patch
This is a note to let you know that I've just added the patch titled
powerpc/64s/radix: Boot-time NULL pointer protection using a guard-PID
to the 4.15-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
powerpc-64s-radix-boot-time-null-pointer-protection-using-a-guard-pid.patch
and it can be found in the queue-4.15 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From eeb715c3e995fbdda0cc05e61216c6c5609bce66 Mon Sep 17 00:00:00 2001
From: Nicholas Piggin <npiggin(a)gmail.com>
Date: Wed, 7 Feb 2018 11:20:02 +1000
Subject: powerpc/64s/radix: Boot-time NULL pointer protection using a guard-PID
From: Nicholas Piggin <npiggin(a)gmail.com>
commit eeb715c3e995fbdda0cc05e61216c6c5609bce66 upstream.
This change restores and formalises the behaviour that access to NULL
or other user addresses by the kernel during boot should fault rather
than succeed and modify memory. This was inadvertently broken when
fixing another bug, because it was previously not well defined and
only worked by chance.
powerpc/64s/radix uses high address bits to select an address space
"quadrant", which determines which PID and LPID are used to translate
the rest of the address (effective PID, effective LPID). The kernel
mapping at 0xC... selects quadrant 3, which uses PID=0 and LPID=0. So
the kernel page tables are installed in the PID 0 process table entry.
An address at 0x0... selects quadrant 0, which uses PID=PIDR for
translating the rest of the address (that is, it uses the value of the
PIDR register as the effective PID). If PIDR=0, then the translation
is performed with the PID 0 process table entry page tables. This is
the kernel mapping, so we effectively get another copy of the kernel
address space at 0. A NULL pointer access will access physical memory
address 0.
To prevent duplicating the kernel address space in quadrant 0, this
patch allocates a guard PID containing no translations, and
initializes PIDR with this during boot, before the MMU is switched on.
Any kernel access to quadrant 0 will use this guard PID for
translation and find no valid mappings, and therefore fault.
After boot, this PID will be switchd away to user context PIDs, but
those contain user mappings (and usually NULL pointer protection)
rather than kernel mapping, which is much safer (and by design). It
may be in future this is tightened further, which the guard PID could
be used for.
Commit 371b8044 ("powerpc/64s: Initialize ISAv3 MMU registers before
setting partition table"), introduced this problem because it zeroes
PIDR at boot. However previously the value was inherited from firmware
or kexec, which is not robust and can be zero (e.g., mambo).
Fixes: 371b80447ff3 ("powerpc/64s: Initialize ISAv3 MMU registers before setting partition table")
Cc: stable(a)vger.kernel.org # v4.15+
Reported-by: Florian Weimer <fweimer(a)redhat.com>
Tested-by: Mauricio Faria de Oliveira <mauricfo(a)linux.vnet.ibm.com>
Signed-off-by: Nicholas Piggin <npiggin(a)gmail.com>
Signed-off-by: Michael Ellerman <mpe(a)ellerman.id.au>
[mauricfo: backport to v4.15.7 (context line updates only) and re-test]
Signed-off-by: Mauricio Faria de Oliveira <mauricfo(a)linux.vnet.ibm.com>
Acked-by: Michael Ellerman <mpe(a)ellerman.id.au>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
arch/powerpc/mm/pgtable-radix.c | 20 ++++++++++++++++++++
1 file changed, 20 insertions(+)
--- a/arch/powerpc/mm/pgtable-radix.c
+++ b/arch/powerpc/mm/pgtable-radix.c
@@ -21,6 +21,7 @@
#include <asm/pgtable.h>
#include <asm/pgalloc.h>
+#include <asm/mmu_context.h>
#include <asm/dma.h>
#include <asm/machdep.h>
#include <asm/mmu.h>
@@ -334,6 +335,22 @@ static void __init radix_init_pgtable(vo
"r" (TLBIEL_INVAL_SET_LPID), "r" (0));
asm volatile("eieio; tlbsync; ptesync" : : : "memory");
trace_tlbie(0, 0, TLBIEL_INVAL_SET_LPID, 0, 2, 1, 1);
+
+ /*
+ * The init_mm context is given the first available (non-zero) PID,
+ * which is the "guard PID" and contains no page table. PIDR should
+ * never be set to zero because that duplicates the kernel address
+ * space at the 0x0... offset (quadrant 0)!
+ *
+ * An arbitrary PID that may later be allocated by the PID allocator
+ * for userspace processes must not be used either, because that
+ * would cause stale user mappings for that PID on CPUs outside of
+ * the TLB invalidation scheme (because it won't be in mm_cpumask).
+ *
+ * So permanently carve out one PID for the purpose of a guard PID.
+ */
+ init_mm.context.id = mmu_base_pid;
+ mmu_base_pid++;
}
static void __init radix_init_partition_table(void)
@@ -580,6 +597,8 @@ void __init radix__early_init_mmu(void)
radix_init_iamr();
radix_init_pgtable();
+ /* Switch to the guard PID before turning on MMU */
+ radix__switch_mmu_context(NULL, &init_mm);
}
void radix__early_init_mmu_secondary(void)
@@ -601,6 +620,7 @@ void radix__early_init_mmu_secondary(voi
radix_init_amor();
}
radix_init_iamr();
+ radix__switch_mmu_context(NULL, &init_mm);
}
void radix__mmu_cleanup_all(void)
Patches currently in stable-queue which might be from npiggin(a)gmail.com are
queue-4.15/powerpc-64s-radix-boot-time-null-pointer-protection-using-a-guard-pid.patch
This is a note to let you know that I've just added the patch titled
md: only allow remove_and_add_spares when no sync_thread running.
to the 4.15-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
md-only-allow-remove_and_add_spares-when-no-sync_thread-running.patch
and it can be found in the queue-4.15 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From 39772f0a7be3b3dc26c74ea13fe7847fd1522c8b Mon Sep 17 00:00:00 2001
From: NeilBrown <neilb(a)suse.com>
Date: Sat, 3 Feb 2018 09:19:30 +1100
Subject: md: only allow remove_and_add_spares when no sync_thread running.
From: NeilBrown <neilb(a)suse.com>
commit 39772f0a7be3b3dc26c74ea13fe7847fd1522c8b upstream.
The locking protocols in md assume that a device will
never be removed from an array during resync/recovery/reshape.
When that isn't happening, rcu or reconfig_mutex is needed
to protect an rdev pointer while taking a refcount. When
it is happening, that protection isn't needed.
Unfortunately there are cases were remove_and_add_spares() is
called when recovery might be happening: is state_store(),
slot_store() and hot_remove_disk().
In each case, this is just an optimization, to try to expedite
removal from the personality so the device can be removed from
the array. If resync etc is happening, we just have to wait
for md_check_recover to find a suitable time to call
remove_and_add_spares().
This optimization and not essential so it doesn't
matter if it fails.
So change remove_and_add_spares() to abort early if
resync/recovery/reshape is happening, unless it is called
from md_check_recovery() as part of a newly started recovery.
The parameter "this" is only NULL when called from
md_check_recovery() so when it is NULL, there is no need to abort.
As this can result in a NULL dereference, the fix is suitable
for -stable.
cc: yuyufen <yuyufen(a)huawei.com>
Cc: Tomasz Majchrzak <tomasz.majchrzak(a)intel.com>
Fixes: 8430e7e0af9a ("md: disconnect device from personality before trying to remove it.")
Cc: stable(a)ver.kernel.org (v4.8+)
Signed-off-by: NeilBrown <neilb(a)suse.com>
Signed-off-by: Shaohua Li <sh.li(a)alibaba-inc.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/md/md.c | 4 ++++
1 file changed, 4 insertions(+)
--- a/drivers/md/md.c
+++ b/drivers/md/md.c
@@ -8554,6 +8554,10 @@ static int remove_and_add_spares(struct
int removed = 0;
bool remove_some = false;
+ if (this && test_bit(MD_RECOVERY_RUNNING, &mddev->recovery))
+ /* Mustn't remove devices when resync thread is running */
+ return 0;
+
rdev_for_each(rdev, mddev) {
if ((this == NULL || rdev == this) &&
rdev->raid_disk >= 0 &&
Patches currently in stable-queue which might be from neilb(a)suse.com are
queue-4.15/md-only-allow-remove_and_add_spares-when-no-sync_thread-running.patch
This is a note to let you know that I've just added the patch titled
ARM: dts: LogicPD Torpedo: Fix I2C1 pinmux
to the 4.15-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
arm-dts-logicpd-torpedo-fix-i2c1-pinmux.patch
and it can be found in the queue-4.15 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From 74402055a2d3ec998a1ded599e86185a27d9bbf4 Mon Sep 17 00:00:00 2001
From: Adam Ford <aford173(a)gmail.com>
Date: Thu, 25 Jan 2018 14:10:37 -0600
Subject: ARM: dts: LogicPD Torpedo: Fix I2C1 pinmux
From: Adam Ford <aford173(a)gmail.com>
commit 74402055a2d3ec998a1ded599e86185a27d9bbf4 upstream.
The pinmuxing was missing for I2C1 which was causing intermittent issues
with the PMIC which is connected to I2C1. The bootloader did not quite
configure the I2C1 either, so when running at 2.6MHz, it was generating
errors at time.
This correctly sets the I2C1 pinmuxing so it can operate at 2.6MHz
Fixes: 687c27676151 ("ARM: dts: Add minimal support for LogicPD Torpedo
DM3730 devkit")
Signed-off-by: Adam Ford <aford173(a)gmail.com>
Signed-off-by: Tony Lindgren <tony(a)atomide.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
arch/arm/boot/dts/logicpd-torpedo-som.dtsi | 8 ++++++++
1 file changed, 8 insertions(+)
--- a/arch/arm/boot/dts/logicpd-torpedo-som.dtsi
+++ b/arch/arm/boot/dts/logicpd-torpedo-som.dtsi
@@ -104,6 +104,8 @@
};
&i2c1 {
+ pinctrl-names = "default";
+ pinctrl-0 = <&i2c1_pins>;
clock-frequency = <2600000>;
twl: twl@48 {
@@ -211,6 +213,12 @@
OMAP3_CORE1_IOPAD(0x21b8, PIN_INPUT | MUX_MODE0) /* hsusb0_data7.hsusb0_data7 */
>;
};
+ i2c1_pins: pinmux_i2c1_pins {
+ pinctrl-single,pins = <
+ OMAP3_CORE1_IOPAD(0x21ba, PIN_INPUT | MUX_MODE0) /* i2c1_scl.i2c1_scl */
+ OMAP3_CORE1_IOPAD(0x21bc, PIN_INPUT | MUX_MODE0) /* i2c1_sda.i2c1_sda */
+ >;
+ };
};
&uart2 {
Patches currently in stable-queue which might be from aford173(a)gmail.com are
queue-4.15/arm-dts-logicpd-torpedo-fix-i2c1-pinmux.patch
queue-4.15/arm-dts-logicpd-som-lv-fix-i2c1-pinmux.patch
This is a note to let you know that I've just added the patch titled
ARM: dts: LogicPD SOM-LV: Fix I2C1 pinmux
to the 4.15-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
arm-dts-logicpd-som-lv-fix-i2c1-pinmux.patch
and it can be found in the queue-4.15 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From 84c7efd607e7fb6933920322086db64654f669b2 Mon Sep 17 00:00:00 2001
From: Adam Ford <aford173(a)gmail.com>
Date: Sat, 27 Jan 2018 15:27:05 -0600
Subject: ARM: dts: LogicPD SOM-LV: Fix I2C1 pinmux
From: Adam Ford <aford173(a)gmail.com>
commit 84c7efd607e7fb6933920322086db64654f669b2 upstream.
The pinmuxing was missing for I2C1 which was causing intermittent issues
with the PMIC which is connected to I2C1. The bootloader did not quite
configure the I2C1 either, so when running at 2.6MHz, it was generating
errors at times.
This correctly sets the I2C1 pinmuxing so it can operate at 2.6MHz
Fixes: ab8dd3aed011 ("ARM: DTS: Add minimal Support for Logic PD DM3730
SOM-LV")
Signed-off-by: Adam Ford <aford173(a)gmail.com>
Signed-off-by: Tony Lindgren <tony(a)atomide.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
arch/arm/boot/dts/logicpd-som-lv.dtsi | 9 ++++++++-
1 file changed, 8 insertions(+), 1 deletion(-)
--- a/arch/arm/boot/dts/logicpd-som-lv.dtsi
+++ b/arch/arm/boot/dts/logicpd-som-lv.dtsi
@@ -98,6 +98,8 @@
};
&i2c1 {
+ pinctrl-names = "default";
+ pinctrl-0 = <&i2c1_pins>;
clock-frequency = <2600000>;
twl: twl@48 {
@@ -216,7 +218,12 @@
>;
};
-
+ i2c1_pins: pinmux_i2c1_pins {
+ pinctrl-single,pins = <
+ OMAP3_CORE1_IOPAD(0x21ba, PIN_INPUT | MUX_MODE0) /* i2c1_scl.i2c1_scl */
+ OMAP3_CORE1_IOPAD(0x21bc, PIN_INPUT | MUX_MODE0) /* i2c1_sda.i2c1_sda */
+ >;
+ };
};
&omap3_pmx_wkup {
Patches currently in stable-queue which might be from aford173(a)gmail.com are
queue-4.15/arm-dts-logicpd-torpedo-fix-i2c1-pinmux.patch
queue-4.15/arm-dts-logicpd-som-lv-fix-i2c1-pinmux.patch
This is a note to let you know that I've just added the patch titled
ACPI / bus: Parse tables as term_list for Dell XPS 9570 and Precision M5530
to the 4.15-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
acpi-bus-parse-tables-as-term_list-for-dell-xps-9570-and-precision-m5530.patch
and it can be found in the queue-4.15 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From 36904703aeeeb6cd31993f1353c8325006229f9a Mon Sep 17 00:00:00 2001
From: Kai Heng Feng <kai.heng.feng(a)canonical.com>
Date: Mon, 5 Feb 2018 13:19:24 +0800
Subject: ACPI / bus: Parse tables as term_list for Dell XPS 9570 and Precision M5530
From: Kai Heng Feng <kai.heng.feng(a)canonical.com>
commit 36904703aeeeb6cd31993f1353c8325006229f9a upstream.
The i2c touchpad on Dell XPS 9570 and Precision M5530 doesn't work out
of box.
The touchpad relies on its _INI method to update its _HID value from
XXXX0000 to SYNA2393.
Also, the _STA relies on value of I2CN to report correct status.
Set acpi_gbl_parse_table_as_term_list so the value of I2CN can be
correctly set up, and _INI can get run. The ACPI table in this machine
is designed to get parsed this way.
Also, change the quirk table to a more generic name.
Link: https://bugzilla.kernel.org/show_bug.cgi?id=198515
Signed-off-by: Kai-Heng Feng <kai.heng.feng(a)canonical.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki(a)intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/acpi/bus.c | 38 +++++++++++++++++++++++++++++++-------
1 file changed, 31 insertions(+), 7 deletions(-)
--- a/drivers/acpi/bus.c
+++ b/drivers/acpi/bus.c
@@ -66,10 +66,37 @@ static int set_copy_dsdt(const struct dm
return 0;
}
#endif
+static int set_gbl_term_list(const struct dmi_system_id *id)
+{
+ acpi_gbl_parse_table_as_term_list = 1;
+ return 0;
+}
-static const struct dmi_system_id dsdt_dmi_table[] __initconst = {
+static const struct dmi_system_id acpi_quirks_dmi_table[] __initconst = {
+ /*
+ * Touchpad on Dell XPS 9570/Precision M5530 doesn't work under I2C
+ * mode.
+ * https://bugzilla.kernel.org/show_bug.cgi?id=198515
+ */
+ {
+ .callback = set_gbl_term_list,
+ .ident = "Dell Precision M5530",
+ .matches = {
+ DMI_MATCH(DMI_SYS_VENDOR, "Dell Inc."),
+ DMI_MATCH(DMI_PRODUCT_NAME, "Precision M5530"),
+ },
+ },
+ {
+ .callback = set_gbl_term_list,
+ .ident = "Dell XPS 15 9570",
+ .matches = {
+ DMI_MATCH(DMI_SYS_VENDOR, "Dell Inc."),
+ DMI_MATCH(DMI_PRODUCT_NAME, "XPS 15 9570"),
+ },
+ },
/*
* Invoke DSDT corruption work-around on all Toshiba Satellite.
+ * DSDT will be copied to memory.
* https://bugzilla.kernel.org/show_bug.cgi?id=14679
*/
{
@@ -83,7 +110,7 @@ static const struct dmi_system_id dsdt_d
{}
};
#else
-static const struct dmi_system_id dsdt_dmi_table[] __initconst = {
+static const struct dmi_system_id acpi_quirks_dmi_table[] __initconst = {
{}
};
#endif
@@ -1001,11 +1028,8 @@ void __init acpi_early_init(void)
acpi_permanent_mmap = true;
- /*
- * If the machine falls into the DMI check table,
- * DSDT will be copied to memory
- */
- dmi_check_system(dsdt_dmi_table);
+ /* Check machine-specific quirks */
+ dmi_check_system(acpi_quirks_dmi_table);
status = acpi_reallocate_root_table();
if (ACPI_FAILURE(status)) {
Patches currently in stable-queue which might be from kai.heng.feng(a)canonical.com are
queue-4.15/bluetooth-btusb-use-dmi-matching-for-qca-reset_resume-quirking.patch
queue-4.15/acpi-bus-parse-tables-as-term_list-for-dell-xps-9570-and-precision-m5530.patch
This is a note to let you know that I've just added the patch titled
md: only allow remove_and_add_spares when no sync_thread running.
to the 4.14-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
md-only-allow-remove_and_add_spares-when-no-sync_thread-running.patch
and it can be found in the queue-4.14 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From 39772f0a7be3b3dc26c74ea13fe7847fd1522c8b Mon Sep 17 00:00:00 2001
From: NeilBrown <neilb(a)suse.com>
Date: Sat, 3 Feb 2018 09:19:30 +1100
Subject: md: only allow remove_and_add_spares when no sync_thread running.
From: NeilBrown <neilb(a)suse.com>
commit 39772f0a7be3b3dc26c74ea13fe7847fd1522c8b upstream.
The locking protocols in md assume that a device will
never be removed from an array during resync/recovery/reshape.
When that isn't happening, rcu or reconfig_mutex is needed
to protect an rdev pointer while taking a refcount. When
it is happening, that protection isn't needed.
Unfortunately there are cases were remove_and_add_spares() is
called when recovery might be happening: is state_store(),
slot_store() and hot_remove_disk().
In each case, this is just an optimization, to try to expedite
removal from the personality so the device can be removed from
the array. If resync etc is happening, we just have to wait
for md_check_recover to find a suitable time to call
remove_and_add_spares().
This optimization and not essential so it doesn't
matter if it fails.
So change remove_and_add_spares() to abort early if
resync/recovery/reshape is happening, unless it is called
from md_check_recovery() as part of a newly started recovery.
The parameter "this" is only NULL when called from
md_check_recovery() so when it is NULL, there is no need to abort.
As this can result in a NULL dereference, the fix is suitable
for -stable.
cc: yuyufen <yuyufen(a)huawei.com>
Cc: Tomasz Majchrzak <tomasz.majchrzak(a)intel.com>
Fixes: 8430e7e0af9a ("md: disconnect device from personality before trying to remove it.")
Cc: stable(a)ver.kernel.org (v4.8+)
Signed-off-by: NeilBrown <neilb(a)suse.com>
Signed-off-by: Shaohua Li <sh.li(a)alibaba-inc.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/md/md.c | 4 ++++
1 file changed, 4 insertions(+)
--- a/drivers/md/md.c
+++ b/drivers/md/md.c
@@ -8522,6 +8522,10 @@ static int remove_and_add_spares(struct
int removed = 0;
bool remove_some = false;
+ if (this && test_bit(MD_RECOVERY_RUNNING, &mddev->recovery))
+ /* Mustn't remove devices when resync thread is running */
+ return 0;
+
rdev_for_each(rdev, mddev) {
if ((this == NULL || rdev == this) &&
rdev->raid_disk >= 0 &&
Patches currently in stable-queue which might be from neilb(a)suse.com are
queue-4.14/md-only-allow-remove_and_add_spares-when-no-sync_thread-running.patch
This is a note to let you know that I've just added the patch titled
ARM: dts: LogicPD Torpedo: Fix I2C1 pinmux
to the 4.14-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
arm-dts-logicpd-torpedo-fix-i2c1-pinmux.patch
and it can be found in the queue-4.14 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From 74402055a2d3ec998a1ded599e86185a27d9bbf4 Mon Sep 17 00:00:00 2001
From: Adam Ford <aford173(a)gmail.com>
Date: Thu, 25 Jan 2018 14:10:37 -0600
Subject: ARM: dts: LogicPD Torpedo: Fix I2C1 pinmux
From: Adam Ford <aford173(a)gmail.com>
commit 74402055a2d3ec998a1ded599e86185a27d9bbf4 upstream.
The pinmuxing was missing for I2C1 which was causing intermittent issues
with the PMIC which is connected to I2C1. The bootloader did not quite
configure the I2C1 either, so when running at 2.6MHz, it was generating
errors at time.
This correctly sets the I2C1 pinmuxing so it can operate at 2.6MHz
Fixes: 687c27676151 ("ARM: dts: Add minimal support for LogicPD Torpedo
DM3730 devkit")
Signed-off-by: Adam Ford <aford173(a)gmail.com>
Signed-off-by: Tony Lindgren <tony(a)atomide.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
arch/arm/boot/dts/logicpd-torpedo-som.dtsi | 8 ++++++++
1 file changed, 8 insertions(+)
--- a/arch/arm/boot/dts/logicpd-torpedo-som.dtsi
+++ b/arch/arm/boot/dts/logicpd-torpedo-som.dtsi
@@ -104,6 +104,8 @@
};
&i2c1 {
+ pinctrl-names = "default";
+ pinctrl-0 = <&i2c1_pins>;
clock-frequency = <2600000>;
twl: twl@48 {
@@ -211,6 +213,12 @@
OMAP3_CORE1_IOPAD(0x21b8, PIN_INPUT | MUX_MODE0) /* hsusb0_data7.hsusb0_data7 */
>;
};
+ i2c1_pins: pinmux_i2c1_pins {
+ pinctrl-single,pins = <
+ OMAP3_CORE1_IOPAD(0x21ba, PIN_INPUT | MUX_MODE0) /* i2c1_scl.i2c1_scl */
+ OMAP3_CORE1_IOPAD(0x21bc, PIN_INPUT | MUX_MODE0) /* i2c1_sda.i2c1_sda */
+ >;
+ };
};
&uart2 {
Patches currently in stable-queue which might be from aford173(a)gmail.com are
queue-4.14/arm-dts-logicpd-torpedo-fix-i2c1-pinmux.patch
queue-4.14/arm-dts-logicpd-som-lv-fix-i2c1-pinmux.patch
This is a note to let you know that I've just added the patch titled
ARM: dts: LogicPD SOM-LV: Fix I2C1 pinmux
to the 4.14-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
arm-dts-logicpd-som-lv-fix-i2c1-pinmux.patch
and it can be found in the queue-4.14 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From 84c7efd607e7fb6933920322086db64654f669b2 Mon Sep 17 00:00:00 2001
From: Adam Ford <aford173(a)gmail.com>
Date: Sat, 27 Jan 2018 15:27:05 -0600
Subject: ARM: dts: LogicPD SOM-LV: Fix I2C1 pinmux
From: Adam Ford <aford173(a)gmail.com>
commit 84c7efd607e7fb6933920322086db64654f669b2 upstream.
The pinmuxing was missing for I2C1 which was causing intermittent issues
with the PMIC which is connected to I2C1. The bootloader did not quite
configure the I2C1 either, so when running at 2.6MHz, it was generating
errors at times.
This correctly sets the I2C1 pinmuxing so it can operate at 2.6MHz
Fixes: ab8dd3aed011 ("ARM: DTS: Add minimal Support for Logic PD DM3730
SOM-LV")
Signed-off-by: Adam Ford <aford173(a)gmail.com>
Signed-off-by: Tony Lindgren <tony(a)atomide.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
arch/arm/boot/dts/logicpd-som-lv.dtsi | 9 ++++++++-
1 file changed, 8 insertions(+), 1 deletion(-)
--- a/arch/arm/boot/dts/logicpd-som-lv.dtsi
+++ b/arch/arm/boot/dts/logicpd-som-lv.dtsi
@@ -97,6 +97,8 @@
};
&i2c1 {
+ pinctrl-names = "default";
+ pinctrl-0 = <&i2c1_pins>;
clock-frequency = <2600000>;
twl: twl@48 {
@@ -215,7 +217,12 @@
>;
};
-
+ i2c1_pins: pinmux_i2c1_pins {
+ pinctrl-single,pins = <
+ OMAP3_CORE1_IOPAD(0x21ba, PIN_INPUT | MUX_MODE0) /* i2c1_scl.i2c1_scl */
+ OMAP3_CORE1_IOPAD(0x21bc, PIN_INPUT | MUX_MODE0) /* i2c1_sda.i2c1_sda */
+ >;
+ };
};
&omap3_pmx_wkup {
Patches currently in stable-queue which might be from aford173(a)gmail.com are
queue-4.14/arm-dts-logicpd-torpedo-fix-i2c1-pinmux.patch
queue-4.14/arm-dts-logicpd-som-lv-fix-i2c1-pinmux.patch
This is a note to let you know that I've just added the patch titled
ACPI / bus: Parse tables as term_list for Dell XPS 9570 and Precision M5530
to the 4.14-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
acpi-bus-parse-tables-as-term_list-for-dell-xps-9570-and-precision-m5530.patch
and it can be found in the queue-4.14 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From 36904703aeeeb6cd31993f1353c8325006229f9a Mon Sep 17 00:00:00 2001
From: Kai Heng Feng <kai.heng.feng(a)canonical.com>
Date: Mon, 5 Feb 2018 13:19:24 +0800
Subject: ACPI / bus: Parse tables as term_list for Dell XPS 9570 and Precision M5530
From: Kai Heng Feng <kai.heng.feng(a)canonical.com>
commit 36904703aeeeb6cd31993f1353c8325006229f9a upstream.
The i2c touchpad on Dell XPS 9570 and Precision M5530 doesn't work out
of box.
The touchpad relies on its _INI method to update its _HID value from
XXXX0000 to SYNA2393.
Also, the _STA relies on value of I2CN to report correct status.
Set acpi_gbl_parse_table_as_term_list so the value of I2CN can be
correctly set up, and _INI can get run. The ACPI table in this machine
is designed to get parsed this way.
Also, change the quirk table to a more generic name.
Link: https://bugzilla.kernel.org/show_bug.cgi?id=198515
Signed-off-by: Kai-Heng Feng <kai.heng.feng(a)canonical.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki(a)intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/acpi/bus.c | 38 +++++++++++++++++++++++++++++++-------
1 file changed, 31 insertions(+), 7 deletions(-)
--- a/drivers/acpi/bus.c
+++ b/drivers/acpi/bus.c
@@ -66,10 +66,37 @@ static int set_copy_dsdt(const struct dm
return 0;
}
#endif
+static int set_gbl_term_list(const struct dmi_system_id *id)
+{
+ acpi_gbl_parse_table_as_term_list = 1;
+ return 0;
+}
-static const struct dmi_system_id dsdt_dmi_table[] __initconst = {
+static const struct dmi_system_id acpi_quirks_dmi_table[] __initconst = {
+ /*
+ * Touchpad on Dell XPS 9570/Precision M5530 doesn't work under I2C
+ * mode.
+ * https://bugzilla.kernel.org/show_bug.cgi?id=198515
+ */
+ {
+ .callback = set_gbl_term_list,
+ .ident = "Dell Precision M5530",
+ .matches = {
+ DMI_MATCH(DMI_SYS_VENDOR, "Dell Inc."),
+ DMI_MATCH(DMI_PRODUCT_NAME, "Precision M5530"),
+ },
+ },
+ {
+ .callback = set_gbl_term_list,
+ .ident = "Dell XPS 15 9570",
+ .matches = {
+ DMI_MATCH(DMI_SYS_VENDOR, "Dell Inc."),
+ DMI_MATCH(DMI_PRODUCT_NAME, "XPS 15 9570"),
+ },
+ },
/*
* Invoke DSDT corruption work-around on all Toshiba Satellite.
+ * DSDT will be copied to memory.
* https://bugzilla.kernel.org/show_bug.cgi?id=14679
*/
{
@@ -83,7 +110,7 @@ static const struct dmi_system_id dsdt_d
{}
};
#else
-static const struct dmi_system_id dsdt_dmi_table[] __initconst = {
+static const struct dmi_system_id acpi_quirks_dmi_table[] __initconst = {
{}
};
#endif
@@ -1001,11 +1028,8 @@ void __init acpi_early_init(void)
acpi_permanent_mmap = true;
- /*
- * If the machine falls into the DMI check table,
- * DSDT will be copied to memory
- */
- dmi_check_system(dsdt_dmi_table);
+ /* Check machine-specific quirks */
+ dmi_check_system(acpi_quirks_dmi_table);
status = acpi_reallocate_root_table();
if (ACPI_FAILURE(status)) {
Patches currently in stable-queue which might be from kai.heng.feng(a)canonical.com are
queue-4.14/bluetooth-btusb-use-dmi-matching-for-qca-reset_resume-quirking.patch
queue-4.14/acpi-bus-parse-tables-as-term_list-for-dell-xps-9570-and-precision-m5530.patch
This is a note to let you know that I've just added the patch titled
net: fec: introduce fec_ptp_stop and use in probe fail path
to the 3.18-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
net-fec-introduce-fec_ptp_stop-and-use-in-probe-fail-path.patch
and it can be found in the queue-3.18 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From 32cba57ba74be58589aeb4cb6496183e46a5e3e5 Mon Sep 17 00:00:00 2001
From: Lucas Stach <l.stach(a)pengutronix.de>
Date: Thu, 23 Jul 2015 16:06:20 +0200
Subject: net: fec: introduce fec_ptp_stop and use in probe fail path
From: Lucas Stach <l.stach(a)pengutronix.de>
commit 32cba57ba74be58589aeb4cb6496183e46a5e3e5 upstream.
This function frees resources and cancels delayed work item that
have been initialized in fec_ptp_init().
Use this to do proper error handling if something goes wrong in
probe function after fec_ptp_init has been called.
Signed-off-by: Lucas Stach <l.stach(a)pengutronix.de>
Acked-by: Fugang Duan <B38611(a)freescale.com>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
[groeck: backport: context changes in .../fec_main.c]
Signed-off-by: Guenter Roeck <linux(a)roeck-us.net>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/net/ethernet/freescale/fec.h | 1 +
drivers/net/ethernet/freescale/fec_main.c | 5 ++---
drivers/net/ethernet/freescale/fec_ptp.c | 10 ++++++++++
3 files changed, 13 insertions(+), 3 deletions(-)
--- a/drivers/net/ethernet/freescale/fec.h
+++ b/drivers/net/ethernet/freescale/fec.h
@@ -546,6 +546,7 @@ struct fec_enet_private {
};
void fec_ptp_init(struct platform_device *pdev);
+void fec_ptp_stop(struct platform_device *pdev);
void fec_ptp_start_cyclecounter(struct net_device *ndev);
int fec_ptp_set(struct net_device *ndev, struct ifreq *ifr);
int fec_ptp_get(struct net_device *ndev, struct ifreq *ifr);
--- a/drivers/net/ethernet/freescale/fec_main.c
+++ b/drivers/net/ethernet/freescale/fec_main.c
@@ -3312,6 +3312,7 @@ failed_register:
failed_mii_init:
failed_irq:
failed_init:
+ fec_ptp_stop(pdev);
if (fep->reg_phy)
regulator_disable(fep->reg_phy);
failed_regulator:
@@ -3331,14 +3332,12 @@ fec_drv_remove(struct platform_device *p
struct net_device *ndev = platform_get_drvdata(pdev);
struct fec_enet_private *fep = netdev_priv(ndev);
- cancel_delayed_work_sync(&fep->time_keep);
cancel_work_sync(&fep->tx_timeout_work);
+ fec_ptp_stop(pdev);
unregister_netdev(ndev);
fec_enet_mii_remove(fep);
if (fep->reg_phy)
regulator_disable(fep->reg_phy);
- if (fep->ptp_clock)
- ptp_clock_unregister(fep->ptp_clock);
fec_enet_clk_enable(ndev, false);
of_node_put(fep->phy_node);
free_netdev(ndev);
--- a/drivers/net/ethernet/freescale/fec_ptp.c
+++ b/drivers/net/ethernet/freescale/fec_ptp.c
@@ -620,6 +620,16 @@ void fec_ptp_init(struct platform_device
schedule_delayed_work(&fep->time_keep, HZ);
}
+void fec_ptp_stop(struct platform_device *pdev)
+{
+ struct net_device *ndev = platform_get_drvdata(pdev);
+ struct fec_enet_private *fep = netdev_priv(ndev);
+
+ cancel_delayed_work_sync(&fep->time_keep);
+ if (fep->ptp_clock)
+ ptp_clock_unregister(fep->ptp_clock);
+}
+
/**
* fec_ptp_check_pps_event
* @fep: the fec_enet_private structure handle
Patches currently in stable-queue which might be from l.stach(a)pengutronix.de are
queue-3.18/net-fec-introduce-fec_ptp_stop-and-use-in-probe-fail-path.patch
The patch below does not apply to the 4.15-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From eeb715c3e995fbdda0cc05e61216c6c5609bce66 Mon Sep 17 00:00:00 2001
From: Nicholas Piggin <npiggin(a)gmail.com>
Date: Wed, 7 Feb 2018 11:20:02 +1000
Subject: [PATCH] powerpc/64s/radix: Boot-time NULL pointer protection using a
guard-PID
This change restores and formalises the behaviour that access to NULL
or other user addresses by the kernel during boot should fault rather
than succeed and modify memory. This was inadvertently broken when
fixing another bug, because it was previously not well defined and
only worked by chance.
powerpc/64s/radix uses high address bits to select an address space
"quadrant", which determines which PID and LPID are used to translate
the rest of the address (effective PID, effective LPID). The kernel
mapping at 0xC... selects quadrant 3, which uses PID=0 and LPID=0. So
the kernel page tables are installed in the PID 0 process table entry.
An address at 0x0... selects quadrant 0, which uses PID=PIDR for
translating the rest of the address (that is, it uses the value of the
PIDR register as the effective PID). If PIDR=0, then the translation
is performed with the PID 0 process table entry page tables. This is
the kernel mapping, so we effectively get another copy of the kernel
address space at 0. A NULL pointer access will access physical memory
address 0.
To prevent duplicating the kernel address space in quadrant 0, this
patch allocates a guard PID containing no translations, and
initializes PIDR with this during boot, before the MMU is switched on.
Any kernel access to quadrant 0 will use this guard PID for
translation and find no valid mappings, and therefore fault.
After boot, this PID will be switchd away to user context PIDs, but
those contain user mappings (and usually NULL pointer protection)
rather than kernel mapping, which is much safer (and by design). It
may be in future this is tightened further, which the guard PID could
be used for.
Commit 371b8044 ("powerpc/64s: Initialize ISAv3 MMU registers before
setting partition table"), introduced this problem because it zeroes
PIDR at boot. However previously the value was inherited from firmware
or kexec, which is not robust and can be zero (e.g., mambo).
Fixes: 371b80447ff3 ("powerpc/64s: Initialize ISAv3 MMU registers before setting partition table")
Cc: stable(a)vger.kernel.org # v4.15+
Reported-by: Florian Weimer <fweimer(a)redhat.com>
Tested-by: Mauricio Faria de Oliveira <mauricfo(a)linux.vnet.ibm.com>
Signed-off-by: Nicholas Piggin <npiggin(a)gmail.com>
Signed-off-by: Michael Ellerman <mpe(a)ellerman.id.au>
diff --git a/arch/powerpc/mm/pgtable-radix.c b/arch/powerpc/mm/pgtable-radix.c
index 573a9a2ee455..96e07d1f673d 100644
--- a/arch/powerpc/mm/pgtable-radix.c
+++ b/arch/powerpc/mm/pgtable-radix.c
@@ -20,6 +20,7 @@
#include <asm/pgtable.h>
#include <asm/pgalloc.h>
+#include <asm/mmu_context.h>
#include <asm/dma.h>
#include <asm/machdep.h>
#include <asm/mmu.h>
@@ -333,6 +334,22 @@ static void __init radix_init_pgtable(void)
"r" (TLBIEL_INVAL_SET_LPID), "r" (0));
asm volatile("eieio; tlbsync; ptesync" : : : "memory");
trace_tlbie(0, 0, TLBIEL_INVAL_SET_LPID, 0, 2, 1, 1);
+
+ /*
+ * The init_mm context is given the first available (non-zero) PID,
+ * which is the "guard PID" and contains no page table. PIDR should
+ * never be set to zero because that duplicates the kernel address
+ * space at the 0x0... offset (quadrant 0)!
+ *
+ * An arbitrary PID that may later be allocated by the PID allocator
+ * for userspace processes must not be used either, because that
+ * would cause stale user mappings for that PID on CPUs outside of
+ * the TLB invalidation scheme (because it won't be in mm_cpumask).
+ *
+ * So permanently carve out one PID for the purpose of a guard PID.
+ */
+ init_mm.context.id = mmu_base_pid;
+ mmu_base_pid++;
}
static void __init radix_init_partition_table(void)
@@ -579,7 +596,8 @@ void __init radix__early_init_mmu(void)
radix_init_iamr();
radix_init_pgtable();
-
+ /* Switch to the guard PID before turning on MMU */
+ radix__switch_mmu_context(NULL, &init_mm);
if (cpu_has_feature(CPU_FTR_HVMODE))
tlbiel_all();
}
@@ -604,6 +622,7 @@ void radix__early_init_mmu_secondary(void)
}
radix_init_iamr();
+ radix__switch_mmu_context(NULL, &init_mm);
if (cpu_has_feature(CPU_FTR_HVMODE))
tlbiel_all();
}
Can you please apply 84c7efd607e7 ("ARM: dts: LogicPD SOM-LV: Fix
I2C1 pinmux") to the 4.9 and 4.15 kernels?
Like the Torpedo, this fixes an intermittent issue with the RTC and
Audio which are integrated into the PMIC on I2C1.
Thank you,
adam
Can you please apply 74402055a2d3 ("ARM: dts: LogicPD Torpedo: Fix
I2C1 pinmux") to the 4.4, 4.9 and 4.15 stable kernels?
There is a problem with older versions of the bootloader not setting
this correctly and causes intermittent issues. This fixes the I2C1
bus to stablize communication with the PMIC which contains the RTC and
audio.
Thank you,
adam
This is backport of the upstream commit that fixes memory corruption in
dm-io. It is suitable for stable kernels 4.8 to 4.11. (the bug was already
fixed in 4.12)
Mikulas
commit feb7695fe9fb83084aa29de0094774f4c9d4c9fc
Author: Mike Snitzer <snitzer(a)redhat.com>
Date: Tue Jun 20 19:14:30 2017 -0400
dm io: fix duplicate bio completion due to missing ref count
If only a subset of the devices associated with multiple regions support
a given special operation (eg. DISCARD) then the dec_count() that is
used to set error for the region must increment the io->count.
Otherwise, when the dec_count() is called it can cause the dm-io
caller's bio to be completed multiple times. As was reported against
the dm-mirror target that had mirror legs with a mix of discard
capabilities.
Bug: https://bugzilla.kernel.org/show_bug.cgi?id=196077
Reported-by: Zhang Yi <yizhan(a)redhat.com>
Signed-off-by: Mike Snitzer <snitzer(a)redhat.com>
---
drivers/md/dm-io.c | 1 +
1 file changed, 1 insertion(+)
Index: linux-stable/drivers/md/dm-io.c
===================================================================
--- linux-stable.orig/drivers/md/dm-io.c 2018-03-06 14:13:59.000000000 +0100
+++ linux-stable/drivers/md/dm-io.c 2018-03-06 14:14:23.000000000 +0100
@@ -316,6 +316,7 @@ static void do_region(int op, int op_fla
special_cmd_max_sectors = q->limits.max_write_same_sectors;
if ((op == REQ_OP_DISCARD || op == REQ_OP_WRITE_SAME) &&
special_cmd_max_sectors == 0) {
+ atomic_inc(&io->count);
dec_count(io, region, -EOPNOTSUPP);
return;
}
From: Lucas Stach <l.stach(a)pengutronix.de>
[ upstream commit 32cba57ba74be58589aeb4cb6496183e46a5e3e5 ]
This function frees resources and cancels delayed work item that
have been initialized in fec_ptp_init().
Use this to do proper error handling if something goes wrong in
probe function after fec_ptp_init has been called.
Signed-off-by: Lucas Stach <l.stach(a)pengutronix.de>
Acked-by: Fugang Duan <B38611(a)freescale.com>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
[groeck: backport: context changes in .../fec_main.c]
Signed-off-by: Guenter Roeck <linux(a)roeck-us.net>
---
Not really sure if I should send this one to David or Greg, since v3.18 is no
longer officially supported, so I am sending it to both. Sorry for the noise.
This patch fixes a crash seen when running sabrelite images in a version
of qemu which fixes https://bugs.launchpad.net/qemu/+bug/1753309.
drivers/net/ethernet/freescale/fec.h | 1 +
drivers/net/ethernet/freescale/fec_main.c | 5 ++---
drivers/net/ethernet/freescale/fec_ptp.c | 10 ++++++++++
3 files changed, 13 insertions(+), 3 deletions(-)
diff --git a/drivers/net/ethernet/freescale/fec.h b/drivers/net/ethernet/freescale/fec.h
index 9af296a1ca99..c16b6deff5ea 100644
--- a/drivers/net/ethernet/freescale/fec.h
+++ b/drivers/net/ethernet/freescale/fec.h
@@ -546,6 +546,7 @@ struct fec_enet_private {
};
void fec_ptp_init(struct platform_device *pdev);
+void fec_ptp_stop(struct platform_device *pdev);
void fec_ptp_start_cyclecounter(struct net_device *ndev);
int fec_ptp_set(struct net_device *ndev, struct ifreq *ifr);
int fec_ptp_get(struct net_device *ndev, struct ifreq *ifr);
diff --git a/drivers/net/ethernet/freescale/fec_main.c b/drivers/net/ethernet/freescale/fec_main.c
index 065a7616e961..f1224c2d112a 100644
--- a/drivers/net/ethernet/freescale/fec_main.c
+++ b/drivers/net/ethernet/freescale/fec_main.c
@@ -3312,6 +3312,7 @@ failed_register:
failed_mii_init:
failed_irq:
failed_init:
+ fec_ptp_stop(pdev);
if (fep->reg_phy)
regulator_disable(fep->reg_phy);
failed_regulator:
@@ -3331,14 +3332,12 @@ fec_drv_remove(struct platform_device *pdev)
struct net_device *ndev = platform_get_drvdata(pdev);
struct fec_enet_private *fep = netdev_priv(ndev);
- cancel_delayed_work_sync(&fep->time_keep);
cancel_work_sync(&fep->tx_timeout_work);
+ fec_ptp_stop(pdev);
unregister_netdev(ndev);
fec_enet_mii_remove(fep);
if (fep->reg_phy)
regulator_disable(fep->reg_phy);
- if (fep->ptp_clock)
- ptp_clock_unregister(fep->ptp_clock);
fec_enet_clk_enable(ndev, false);
of_node_put(fep->phy_node);
free_netdev(ndev);
diff --git a/drivers/net/ethernet/freescale/fec_ptp.c b/drivers/net/ethernet/freescale/fec_ptp.c
index 992c8c3db553..9b2dcf4261a5 100644
--- a/drivers/net/ethernet/freescale/fec_ptp.c
+++ b/drivers/net/ethernet/freescale/fec_ptp.c
@@ -620,6 +620,16 @@ void fec_ptp_init(struct platform_device *pdev)
schedule_delayed_work(&fep->time_keep, HZ);
}
+void fec_ptp_stop(struct platform_device *pdev)
+{
+ struct net_device *ndev = platform_get_drvdata(pdev);
+ struct fec_enet_private *fep = netdev_priv(ndev);
+
+ cancel_delayed_work_sync(&fep->time_keep);
+ if (fep->ptp_clock)
+ ptp_clock_unregister(fep->ptp_clock);
+}
+
/**
* fec_ptp_check_pps_event
* @fep: the fec_enet_private structure handle
--
2.7.4
Hi Greg,
Please include commit 36904703aeee ("ACPI / bus: Parse tables as term_list
for Dell XPS 9570 and Precision M5530")
for v4.14+.
This can be backported to older versions, but this system won't function on
v4.4 or v4.9.
It requires new drivers that don't exist in v4.4 or v4.9.
Thanks!
Kai-Heng
This is a note to let you know that I've just added the patch titled
btrfs: Don't clear SGID when inheriting ACLs
to the 4.4-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
btrfs-don-t-clear-sgid-when-inheriting-acls.patch
and it can be found in the queue-4.4 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From b7f8a09f8097db776b8d160862540e4fc1f51296 Mon Sep 17 00:00:00 2001
From: Jan Kara <jack(a)suse.cz>
Date: Thu, 22 Jun 2017 15:31:07 +0200
Subject: btrfs: Don't clear SGID when inheriting ACLs
From: Jan Kara <jack(a)suse.cz>
commit b7f8a09f8097db776b8d160862540e4fc1f51296 upstream.
When new directory 'DIR1' is created in a directory 'DIR0' with SGID bit
set, DIR1 is expected to have SGID bit set (and owning group equal to
the owning group of 'DIR0'). However when 'DIR0' also has some default
ACLs that 'DIR1' inherits, setting these ACLs will result in SGID bit on
'DIR1' to get cleared if user is not member of the owning group.
Fix the problem by moving posix_acl_update_mode() out of
__btrfs_set_acl() into btrfs_set_acl(). That way the function will not be
called when inheriting ACLs which is what we want as it prevents SGID
bit clearing and the mode has been properly set by posix_acl_create()
anyway.
Fixes: 073931017b49d9458aa351605b43a7e34598caef
CC: stable(a)vger.kernel.org
CC: linux-btrfs(a)vger.kernel.org
CC: David Sterba <dsterba(a)suse.com>
Signed-off-by: Jan Kara <jack(a)suse.cz>
Signed-off-by: David Sterba <dsterba(a)suse.com>
Signed-off-by: Nikolay Borisov <nborisov(a)suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
fs/btrfs/acl.c | 13 +++++++------
1 file changed, 7 insertions(+), 6 deletions(-)
--- a/fs/btrfs/acl.c
+++ b/fs/btrfs/acl.c
@@ -82,12 +82,6 @@ static int __btrfs_set_acl(struct btrfs_
switch (type) {
case ACL_TYPE_ACCESS:
name = POSIX_ACL_XATTR_ACCESS;
- if (acl) {
- ret = posix_acl_update_mode(inode, &inode->i_mode, &acl);
- if (ret)
- return ret;
- }
- ret = 0;
break;
case ACL_TYPE_DEFAULT:
if (!S_ISDIR(inode->i_mode))
@@ -123,6 +117,13 @@ out:
int btrfs_set_acl(struct inode *inode, struct posix_acl *acl, int type)
{
+ int ret;
+
+ if (type == ACL_TYPE_ACCESS && acl) {
+ ret = posix_acl_update_mode(inode, &inode->i_mode, &acl);
+ if (ret)
+ return ret;
+ }
return __btrfs_set_acl(NULL, inode, acl, type);
}
Patches currently in stable-queue which might be from jack(a)suse.cz are
queue-4.4/btrfs-don-t-clear-sgid-when-inheriting-acls.patch
This is a note to let you know that I've just added the patch titled
KVM/x86: remove WARN_ON() for when vm_munmap() fails
to the 4.15-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
kvm-x86-remove-warn_on-for-when-vm_munmap-fails.patch
and it can be found in the queue-4.15 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From 103c763c72dd2df3e8c91f2d7ec88f98ed391111 Mon Sep 17 00:00:00 2001
From: Eric Biggers <ebiggers(a)google.com>
Date: Wed, 31 Jan 2018 17:30:21 -0800
Subject: KVM/x86: remove WARN_ON() for when vm_munmap() fails
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
From: Eric Biggers <ebiggers(a)google.com>
commit 103c763c72dd2df3e8c91f2d7ec88f98ed391111 upstream.
On x86, special KVM memslots such as the TSS region have anonymous
memory mappings created on behalf of userspace, and these mappings are
removed when the VM is destroyed.
It is however possible for removing these mappings via vm_munmap() to
fail. This can most easily happen if the thread receives SIGKILL while
it's waiting to acquire ->mmap_sem. This triggers the 'WARN_ON(r < 0)'
in __x86_set_memory_region(). syzkaller was able to hit this, using
'exit()' to send the SIGKILL. Note that while the vm_munmap() failure
results in the mapping not being removed immediately, it is not leaked
forever but rather will be freed when the process exits.
It's not really possible to handle this failure properly, so almost
every other caller of vm_munmap() doesn't check the return value. It's
a limitation of having the kernel manage these mappings rather than
userspace.
So just remove the WARN_ON() so that users can't spam the kernel log
with this warning.
Fixes: f0d648bdf0a5 ("KVM: x86: map/unmap private slots in __x86_set_memory_region")
Reported-by: syzbot <syzkaller(a)googlegroups.com>
Signed-off-by: Eric Biggers <ebiggers(a)google.com>
Signed-off-by: Radim Krčmář <rkrcmar(a)redhat.com>
Signed-off-by: Jack Wang <jinpu.wang(a)profitbricks.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
arch/x86/kvm/x86.c | 6 ++----
1 file changed, 2 insertions(+), 4 deletions(-)
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -8281,10 +8281,8 @@ int __x86_set_memory_region(struct kvm *
return r;
}
- if (!size) {
- r = vm_munmap(old.userspace_addr, old.npages * PAGE_SIZE);
- WARN_ON(r < 0);
- }
+ if (!size)
+ vm_munmap(old.userspace_addr, old.npages * PAGE_SIZE);
return 0;
}
Patches currently in stable-queue which might be from ebiggers(a)google.com are
queue-4.15/kvm-x86-remove-warn_on-for-when-vm_munmap-fails.patch
This is a note to let you know that I've just added the patch titled
KVM/x86: remove WARN_ON() for when vm_munmap() fails
to the 4.14-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
kvm-x86-remove-warn_on-for-when-vm_munmap-fails.patch
and it can be found in the queue-4.14 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From 103c763c72dd2df3e8c91f2d7ec88f98ed391111 Mon Sep 17 00:00:00 2001
From: Eric Biggers <ebiggers(a)google.com>
Date: Wed, 31 Jan 2018 17:30:21 -0800
Subject: KVM/x86: remove WARN_ON() for when vm_munmap() fails
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
From: Eric Biggers <ebiggers(a)google.com>
commit 103c763c72dd2df3e8c91f2d7ec88f98ed391111 upstream.
On x86, special KVM memslots such as the TSS region have anonymous
memory mappings created on behalf of userspace, and these mappings are
removed when the VM is destroyed.
It is however possible for removing these mappings via vm_munmap() to
fail. This can most easily happen if the thread receives SIGKILL while
it's waiting to acquire ->mmap_sem. This triggers the 'WARN_ON(r < 0)'
in __x86_set_memory_region(). syzkaller was able to hit this, using
'exit()' to send the SIGKILL. Note that while the vm_munmap() failure
results in the mapping not being removed immediately, it is not leaked
forever but rather will be freed when the process exits.
It's not really possible to handle this failure properly, so almost
every other caller of vm_munmap() doesn't check the return value. It's
a limitation of having the kernel manage these mappings rather than
userspace.
So just remove the WARN_ON() so that users can't spam the kernel log
with this warning.
Fixes: f0d648bdf0a5 ("KVM: x86: map/unmap private slots in __x86_set_memory_region")
Reported-by: syzbot <syzkaller(a)googlegroups.com>
Signed-off-by: Eric Biggers <ebiggers(a)google.com>
Signed-off-by: Radim Krčmář <rkrcmar(a)redhat.com>
Signed-off-by: Jack Wang <jinpu.wang(a)profitbricks.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
arch/x86/kvm/x86.c | 6 ++----
1 file changed, 2 insertions(+), 4 deletions(-)
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -8251,10 +8251,8 @@ int __x86_set_memory_region(struct kvm *
return r;
}
- if (!size) {
- r = vm_munmap(old.userspace_addr, old.npages * PAGE_SIZE);
- WARN_ON(r < 0);
- }
+ if (!size)
+ vm_munmap(old.userspace_addr, old.npages * PAGE_SIZE);
return 0;
}
Patches currently in stable-queue which might be from ebiggers(a)google.com are
queue-4.14/kvm-x86-remove-warn_on-for-when-vm_munmap-fails.patch
This is a note to let you know that I've just added the patch titled
KVM/x86: Fix wrong macro references of X86_CR0_PG_BIT and X86_CR4_PAE_BIT in kvm_valid_sregs()
to the 4.14-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
kvm-x86-fix-wrong-macro-references-of-x86_cr0_pg_bit-and-x86_cr4_pae_bit-in-kvm_valid_sregs.patch
and it can be found in the queue-4.14 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From 37b95951c58fdf08dc10afa9d02066ed9f176fb5 Mon Sep 17 00:00:00 2001
From: Tianyu Lan <lantianyu1986(a)gmail.com>
Date: Tue, 16 Jan 2018 17:34:07 +0800
Subject: KVM/x86: Fix wrong macro references of X86_CR0_PG_BIT and X86_CR4_PAE_BIT in kvm_valid_sregs()
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
From: Tianyu Lan <lantianyu1986(a)gmail.com>
commit 37b95951c58fdf08dc10afa9d02066ed9f176fb5 upstream.
kvm_valid_sregs() should use X86_CR0_PG and X86_CR4_PAE to check bit
status rather than X86_CR0_PG_BIT and X86_CR4_PAE_BIT. This patch is
to fix it.
Fixes: f29810335965a(KVM/x86: Check input paging mode when cs.l is set)
Reported-by: Jeremi Piotrowski <jeremi.piotrowski(a)gmail.com>
Cc: Paolo Bonzini <pbonzini(a)redhat.com>
Cc: Radim Krčmář <rkrcmar(a)redhat.com>
Signed-off-by: Tianyu Lan <Tianyu.Lan(a)microsoft.com>
Signed-off-by: Radim Krčmář <rkrcmar(a)redhat.com>
Signed-off-by: Jack Wang <jinpu.wang(a)profitbricks.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
arch/x86/kvm/x86.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -7482,13 +7482,13 @@ EXPORT_SYMBOL_GPL(kvm_task_switch);
int kvm_valid_sregs(struct kvm_vcpu *vcpu, struct kvm_sregs *sregs)
{
- if ((sregs->efer & EFER_LME) && (sregs->cr0 & X86_CR0_PG_BIT)) {
+ if ((sregs->efer & EFER_LME) && (sregs->cr0 & X86_CR0_PG)) {
/*
* When EFER.LME and CR0.PG are set, the processor is in
* 64-bit mode (though maybe in a 32-bit code segment).
* CR4.PAE and EFER.LMA must be set.
*/
- if (!(sregs->cr4 & X86_CR4_PAE_BIT)
+ if (!(sregs->cr4 & X86_CR4_PAE)
|| !(sregs->efer & EFER_LMA))
return -EINVAL;
} else {
Patches currently in stable-queue which might be from lantianyu1986(a)gmail.com are
queue-4.14/kvm-x86-fix-wrong-macro-references-of-x86_cr0_pg_bit-and-x86_cr4_pae_bit-in-kvm_valid_sregs.patch
This is a note to let you know that I've just added the patch titled
PCI/ASPM: Deal with missing root ports in link state handling
to the 4.9-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
pci-aspm-deal-with-missing-root-ports-in-link-state-handling.patch
and it can be found in the queue-4.9 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From ee8bdfb6568d86bb93f55f8d99c4c643e77304ee Mon Sep 17 00:00:00 2001
From: Ard Biesheuvel <ard.biesheuvel(a)linaro.org>
Date: Mon, 2 Oct 2017 15:08:40 +0100
Subject: PCI/ASPM: Deal with missing root ports in link state handling
From: Ard Biesheuvel <ard.biesheuvel(a)linaro.org>
commit ee8bdfb6568d86bb93f55f8d99c4c643e77304ee upstream.
Even though it is unconventional, some PCIe host implementations omit the
root ports entirely, and simply consist of a host bridge (which is not
modeled as a device in the PCI hierarchy) and a link.
When the downstream device is an endpoint, our current code does not seem
to mind this unusual configuration. However, when PCIe switches are
involved, the ASPM code assumes that any downstream switch port has a
parent, and blindly dereferences the bus->parent->self field of the pci_dev
struct to chain the downstream link state to the link state of the root
port. Given that the root port is missing, the link is not modeled at all,
and nor is the link state, and attempting to access it results in a NULL
pointer dereference and a crash.
Avoid this by allowing the link state chain to terminate at the downstream
port if no root port exists.
Signed-off-by: Ard Biesheuvel <ard.biesheuvel(a)linaro.org>
Signed-off-by: Bjorn Helgaas <bhelgaas(a)google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/pci/pcie/aspm.c | 8 ++++++--
1 file changed, 6 insertions(+), 2 deletions(-)
--- a/drivers/pci/pcie/aspm.c
+++ b/drivers/pci/pcie/aspm.c
@@ -526,10 +526,14 @@ static struct pcie_link_state *alloc_pci
/*
* Root Ports and PCI/PCI-X to PCIe Bridges are roots of PCIe
- * hierarchies.
+ * hierarchies. Note that some PCIe host implementations omit
+ * the root ports entirely, in which case a downstream port on
+ * a switch may become the root of the link state chain for all
+ * its subordinate endpoints.
*/
if (pci_pcie_type(pdev) == PCI_EXP_TYPE_ROOT_PORT ||
- pci_pcie_type(pdev) == PCI_EXP_TYPE_PCIE_BRIDGE) {
+ pci_pcie_type(pdev) == PCI_EXP_TYPE_PCIE_BRIDGE ||
+ !pdev->bus->parent->self) {
link->root = link;
} else {
struct pcie_link_state *parent;
Patches currently in stable-queue which might be from ard.biesheuvel(a)linaro.org are
queue-4.9/pci-aspm-deal-with-missing-root-ports-in-link-state-handling.patch
Pavel,
Am Mittwoch, 7. März 2018, 00:18:05 CET schrieb Pavel Machek:
> On Sat 2018-03-03 11:45:54, Richard Weinberger wrote:
> > While UBI and UBIFS seem to work at first sight with MLC NAND, you will
> > most likely lose all your data upon a power-cut or due to read/write
> > disturb.
> > In order to protect users from bad surprises, refuse to attach to MLC
> > NAND.
> >
> > Cc: stable(a)vger.kernel.org
>
> That sounds like _really_ bad idea for stable. All it does is it
> removes support for hardware that somehow works.
MLC is not supported and does not work. Full stop.
If someone manages to get it somehow work, either with hardware or software
hacks they are on their own.
Having it in stable is the only chance we have to get it into vendor kernels.
All this patch does is representing the current state of UBI/UBIFS.
In the last months I got a way too much boards with MLC NAND on my desk where
UBIFS broke. Customers wished they had known before...
Thanks,
//richard
This is a note to let you know that I've just added the patch titled
PCI/ASPM: Deal with missing root ports in link state handling
to the 4.14-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
pci-aspm-deal-with-missing-root-ports-in-link-state-handling.patch
and it can be found in the queue-4.14 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From ee8bdfb6568d86bb93f55f8d99c4c643e77304ee Mon Sep 17 00:00:00 2001
From: Ard Biesheuvel <ard.biesheuvel(a)linaro.org>
Date: Mon, 2 Oct 2017 15:08:40 +0100
Subject: PCI/ASPM: Deal with missing root ports in link state handling
From: Ard Biesheuvel <ard.biesheuvel(a)linaro.org>
commit ee8bdfb6568d86bb93f55f8d99c4c643e77304ee upstream.
Even though it is unconventional, some PCIe host implementations omit the
root ports entirely, and simply consist of a host bridge (which is not
modeled as a device in the PCI hierarchy) and a link.
When the downstream device is an endpoint, our current code does not seem
to mind this unusual configuration. However, when PCIe switches are
involved, the ASPM code assumes that any downstream switch port has a
parent, and blindly dereferences the bus->parent->self field of the pci_dev
struct to chain the downstream link state to the link state of the root
port. Given that the root port is missing, the link is not modeled at all,
and nor is the link state, and attempting to access it results in a NULL
pointer dereference and a crash.
Avoid this by allowing the link state chain to terminate at the downstream
port if no root port exists.
Signed-off-by: Ard Biesheuvel <ard.biesheuvel(a)linaro.org>
Signed-off-by: Bjorn Helgaas <bhelgaas(a)google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/pci/pcie/aspm.c | 8 ++++++--
1 file changed, 6 insertions(+), 2 deletions(-)
--- a/drivers/pci/pcie/aspm.c
+++ b/drivers/pci/pcie/aspm.c
@@ -803,10 +803,14 @@ static struct pcie_link_state *alloc_pci
/*
* Root Ports and PCI/PCI-X to PCIe Bridges are roots of PCIe
- * hierarchies.
+ * hierarchies. Note that some PCIe host implementations omit
+ * the root ports entirely, in which case a downstream port on
+ * a switch may become the root of the link state chain for all
+ * its subordinate endpoints.
*/
if (pci_pcie_type(pdev) == PCI_EXP_TYPE_ROOT_PORT ||
- pci_pcie_type(pdev) == PCI_EXP_TYPE_PCIE_BRIDGE) {
+ pci_pcie_type(pdev) == PCI_EXP_TYPE_PCIE_BRIDGE ||
+ !pdev->bus->parent->self) {
link->root = link;
} else {
struct pcie_link_state *parent;
Patches currently in stable-queue which might be from ard.biesheuvel(a)linaro.org are
queue-4.14/pci-aspm-deal-with-missing-root-ports-in-link-state-handling.patch
L.S.,
Please apply
ee8bdfb6568d PCI/ASPM: Deal with missing root ports in link state handling
to the -stable trees going back to v4.9 LTS
It fixes an annoying boot crash that only occurs on systems with an
unusual PCIe topology, but is rather hard to diagnose for unsuspecting
users.
Regards,
Ard.
This is a note to let you know that I've just added the patch titled
KVM: X86: Fix SMRAM accessing even if VM is shutdown
to the 4.14-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
kvm-x86-fix-smram-accessing-even-if-vm-is-shutdown.patch
and it can be found in the queue-4.14 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From 95e057e25892eaa48cad1e2d637b80d0f1a4fac5 Mon Sep 17 00:00:00 2001
From: Wanpeng Li <wanpengli(a)tencent.com>
Date: Thu, 8 Feb 2018 15:32:45 +0800
Subject: KVM: X86: Fix SMRAM accessing even if VM is shutdown
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
From: Wanpeng Li <wanpengli(a)tencent.com>
commit 95e057e25892eaa48cad1e2d637b80d0f1a4fac5 upstream.
Reported by syzkaller:
WARNING: CPU: 6 PID: 2434 at arch/x86/kvm/vmx.c:6660 handle_ept_misconfig+0x54/0x1e0 [kvm_intel]
CPU: 6 PID: 2434 Comm: repro_test Not tainted 4.15.0+ #4
RIP: 0010:handle_ept_misconfig+0x54/0x1e0 [kvm_intel]
Call Trace:
vmx_handle_exit+0xbd/0xe20 [kvm_intel]
kvm_arch_vcpu_ioctl_run+0xdaf/0x1d50 [kvm]
kvm_vcpu_ioctl+0x3e9/0x720 [kvm]
do_vfs_ioctl+0xa4/0x6a0
SyS_ioctl+0x79/0x90
entry_SYSCALL_64_fastpath+0x25/0x9c
The testcase creates a first thread to issue KVM_SMI ioctl, and then creates
a second thread to mmap and operate on the same vCPU. This triggers a race
condition when running the testcase with multiple threads. Sometimes one thread
exits with a triple fault while another thread mmaps and operates on the same
vCPU. Because CS=0x3000/IP=0x8000 is not mapped, accessing the SMI handler
results in an EPT misconfig. This patch fixes it by returning RET_PF_EMULATE
in kvm_handle_bad_page(), which will go on to cause an emulation failure and an
exit with KVM_EXIT_INTERNAL_ERROR.
Reported-by: syzbot+c1d9517cab094dae65e446c0c5b4de6c40f4dc58(a)syzkaller.appspotmail.com
Cc: Paolo Bonzini <pbonzini(a)redhat.com>
Cc: Radim Krčmář <rkrcmar(a)redhat.com>
Cc: stable(a)vger.kernel.org
Signed-off-by: Wanpeng Li <wanpengli(a)tencent.com>
Signed-off-by: Paolo Bonzini <pbonzini(a)redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
arch/x86/kvm/mmu.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
--- a/arch/x86/kvm/mmu.c
+++ b/arch/x86/kvm/mmu.c
@@ -3004,7 +3004,7 @@ static int kvm_handle_bad_page(struct kv
return 0;
}
- return -EFAULT;
+ return RET_PF_EMULATE;
}
static void transparent_hugepage_adjust(struct kvm_vcpu *vcpu,
Patches currently in stable-queue which might be from wanpengli(a)tencent.com are
queue-4.14/kvm-x86-fix-smram-accessing-even-if-vm-is-shutdown.patch
This is a note to let you know that I've just added the patch titled
KVM: x86: fix vcpu initialization with userspace lapic
to the 4.15-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
kvm-x86-fix-vcpu-initialization-with-userspace-lapic.patch
and it can be found in the queue-4.15 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From b7e31be385584afe7f073130e8e570d53c95f7fe Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Radim=20Kr=C4=8Dm=C3=A1=C5=99?= <rkrcmar(a)redhat.com>
Date: Thu, 1 Mar 2018 15:24:25 +0100
Subject: KVM: x86: fix vcpu initialization with userspace lapic
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
From: Radim Krčmář <rkrcmar(a)redhat.com>
commit b7e31be385584afe7f073130e8e570d53c95f7fe upstream.
Moving the code around broke this rare configuration.
Use this opportunity to finally call lapic reset from vcpu reset.
Reported-by: syzbot+fb7a33a4b6c35007a72b(a)syzkaller.appspotmail.com
Suggested-by: Paolo Bonzini <pbonzini(a)redhat.com>
Fixes: 0b2e9904c159 ("KVM: x86: move LAPIC initialization after VMCS creation")
Cc: stable(a)vger.kernel.org
Signed-off-by: Radim Krčmář <rkrcmar(a)redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
arch/x86/kvm/lapic.c | 10 ++++------
arch/x86/kvm/x86.c | 3 ++-
2 files changed, 6 insertions(+), 7 deletions(-)
--- a/arch/x86/kvm/lapic.c
+++ b/arch/x86/kvm/lapic.c
@@ -1993,14 +1993,13 @@ void kvm_lapic_set_base(struct kvm_vcpu
void kvm_lapic_reset(struct kvm_vcpu *vcpu, bool init_event)
{
- struct kvm_lapic *apic;
+ struct kvm_lapic *apic = vcpu->arch.apic;
int i;
- apic_debug("%s\n", __func__);
+ if (!apic)
+ return;
- ASSERT(vcpu);
- apic = vcpu->arch.apic;
- ASSERT(apic != NULL);
+ apic_debug("%s\n", __func__);
/* Stop the timer in case it's a reset to an active apic */
hrtimer_cancel(&apic->lapic_timer.timer);
@@ -2559,7 +2558,6 @@ void kvm_apic_accept_events(struct kvm_v
pe = xchg(&apic->pending_events, 0);
if (test_bit(KVM_APIC_INIT, &pe)) {
- kvm_lapic_reset(vcpu, true);
kvm_vcpu_reset(vcpu, true);
if (kvm_vcpu_is_bsp(apic->vcpu))
vcpu->arch.mp_state = KVM_MP_STATE_RUNNABLE;
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -7793,7 +7793,6 @@ int kvm_arch_vcpu_setup(struct kvm_vcpu
if (r)
return r;
kvm_vcpu_reset(vcpu, false);
- kvm_lapic_reset(vcpu, false);
kvm_mmu_setup(vcpu);
vcpu_put(vcpu);
return r;
@@ -7836,6 +7835,8 @@ void kvm_arch_vcpu_destroy(struct kvm_vc
void kvm_vcpu_reset(struct kvm_vcpu *vcpu, bool init_event)
{
+ kvm_lapic_reset(vcpu, init_event);
+
vcpu->arch.hflags = 0;
vcpu->arch.smi_pending = 0;
Patches currently in stable-queue which might be from rkrcmar(a)redhat.com are
queue-4.15/kvm-x86-fix-smram-accessing-even-if-vm-is-shutdown.patch
queue-4.15/kvm-x86-remove-indirect-msr-op-calls-from-spec_ctrl.patch
queue-4.15/kvm-mmu-fix-overlap-between-public-and-private-memslots.patch
queue-4.15/kvm-x86-fix-vcpu-initialization-with-userspace-lapic.patch
queue-4.15/kvm-vmx-optimize-vmx_vcpu_run-and-svm_vcpu_run-by-marking-the-rdmsr-path-as-unlikely.patch
This is a note to let you know that I've just added the patch titled
KVM: x86: fix vcpu initialization with userspace lapic
to the 4.14-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
kvm-x86-fix-vcpu-initialization-with-userspace-lapic.patch
and it can be found in the queue-4.14 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From b7e31be385584afe7f073130e8e570d53c95f7fe Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Radim=20Kr=C4=8Dm=C3=A1=C5=99?= <rkrcmar(a)redhat.com>
Date: Thu, 1 Mar 2018 15:24:25 +0100
Subject: KVM: x86: fix vcpu initialization with userspace lapic
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
From: Radim Krčmář <rkrcmar(a)redhat.com>
commit b7e31be385584afe7f073130e8e570d53c95f7fe upstream.
Moving the code around broke this rare configuration.
Use this opportunity to finally call lapic reset from vcpu reset.
Reported-by: syzbot+fb7a33a4b6c35007a72b(a)syzkaller.appspotmail.com
Suggested-by: Paolo Bonzini <pbonzini(a)redhat.com>
Fixes: 0b2e9904c159 ("KVM: x86: move LAPIC initialization after VMCS creation")
Cc: stable(a)vger.kernel.org
Signed-off-by: Radim Krčmář <rkrcmar(a)redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
arch/x86/kvm/lapic.c | 10 ++++------
arch/x86/kvm/x86.c | 3 ++-
2 files changed, 6 insertions(+), 7 deletions(-)
--- a/arch/x86/kvm/lapic.c
+++ b/arch/x86/kvm/lapic.c
@@ -1944,14 +1944,13 @@ void kvm_lapic_set_base(struct kvm_vcpu
void kvm_lapic_reset(struct kvm_vcpu *vcpu, bool init_event)
{
- struct kvm_lapic *apic;
+ struct kvm_lapic *apic = vcpu->arch.apic;
int i;
- apic_debug("%s\n", __func__);
+ if (!apic)
+ return;
- ASSERT(vcpu);
- apic = vcpu->arch.apic;
- ASSERT(apic != NULL);
+ apic_debug("%s\n", __func__);
/* Stop the timer in case it's a reset to an active apic */
hrtimer_cancel(&apic->lapic_timer.timer);
@@ -2510,7 +2509,6 @@ void kvm_apic_accept_events(struct kvm_v
pe = xchg(&apic->pending_events, 0);
if (test_bit(KVM_APIC_INIT, &pe)) {
- kvm_lapic_reset(vcpu, true);
kvm_vcpu_reset(vcpu, true);
if (kvm_vcpu_is_bsp(apic->vcpu))
vcpu->arch.mp_state = KVM_MP_STATE_RUNNABLE;
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -7779,7 +7779,6 @@ int kvm_arch_vcpu_setup(struct kvm_vcpu
if (r)
return r;
kvm_vcpu_reset(vcpu, false);
- kvm_lapic_reset(vcpu, false);
kvm_mmu_setup(vcpu);
vcpu_put(vcpu);
return r;
@@ -7822,6 +7821,8 @@ void kvm_arch_vcpu_destroy(struct kvm_vc
void kvm_vcpu_reset(struct kvm_vcpu *vcpu, bool init_event)
{
+ kvm_lapic_reset(vcpu, init_event);
+
vcpu->arch.hflags = 0;
vcpu->arch.smi_pending = 0;
Patches currently in stable-queue which might be from rkrcmar(a)redhat.com are
queue-4.14/kvm-x86-fix-smram-accessing-even-if-vm-is-shutdown.patch
queue-4.14/kvm-x86-remove-indirect-msr-op-calls-from-spec_ctrl.patch
queue-4.14/kvm-mmu-fix-overlap-between-public-and-private-memslots.patch
queue-4.14/kvm-x86-fix-vcpu-initialization-with-userspace-lapic.patch
queue-4.14/kvm-vmx-optimize-vmx_vcpu_run-and-svm_vcpu_run-by-marking-the-rdmsr-path-as-unlikely.patch
----- gregkh(a)linuxfoundation.org wrote:
> This is a note to let you know that I've just added the patch titled
>
> KVM: x86: move LAPIC initialization after VMCS creation
>
> to the 4.14-stable tree which can be found at:
>
> https://urldefense.proofpoint.com/v2/url?u=http-3A__www.kernel.org_git_-3Fp…
>
> The filename of the patch is:
> kvm-x86-move-lapic-initialization-after-vmcs-creation.patch
> and it can be found in the queue-4.14 subdirectory.
>
> If you, or anyone else, feels it should not be added to the stable
> tree,
> please let <stable(a)vger.kernel.org> know about it.
>
>
> From 0b2e9904c15963e715d33e5f3f1387f17d19333a Mon Sep 17 00:00:00
> 2001
> From: Paolo Bonzini <pbonzini(a)redhat.com>
> Date: Fri, 23 Feb 2018 23:29:32 +0100
> Subject: KVM: x86: move LAPIC initialization after VMCS creation
>
> From: Paolo Bonzini <pbonzini(a)redhat.com>
>
> commit 0b2e9904c15963e715d33e5f3f1387f17d19333a upstream.
>
> The initial reset of the local APIC is performed before the VMCS has
> been
> created, but it tries to do a vmwrite:
>
> vmwrite error: reg 810 value 4a00 (err 18944)
> CPU: 54 PID: 38652 Comm: qemu-kvm Tainted: G W I
> 4.16.0-0.rc2.git0.1.fc28.x86_64 #1
> Hardware name: Intel Corporation S2600CW/S2600CW, BIOS
> SE5C610.86B.01.01.0003.090520141303 09/05/2014
> Call Trace:
> vmx_set_rvi [kvm_intel]
> vmx_hwapic_irr_update [kvm_intel]
> kvm_lapic_reset [kvm]
> kvm_create_lapic [kvm]
> kvm_arch_vcpu_init [kvm]
> kvm_vcpu_init [kvm]
> vmx_create_vcpu [kvm_intel]
> kvm_vm_ioctl [kvm]
>
> Move it later, after the VMCS has been created.
>
> Fixes: 4191db26b714 ("KVM: x86: Update APICv on APIC reset")
> Cc: stable(a)vger.kernel.org
> Cc: Liran Alon <liran.alon(a)oracle.com>
> Signed-off-by: Paolo Bonzini <pbonzini(a)redhat.com>
> Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
This commit broke a rare configuration.
Don't we want to apply this to stable-tree together with the fix for that?
The fix is at upstream commit b7e31be38558 ("KVM: x86: fix vcpu initialization with userspace lapic")
-Liran
Hi Ingo,
Please consider pulling,
- Arnaldo
Test results at the end of this message, as usual.
The following changes since commit 58bdf601c2de6071d0386a7a6fa707bd04761c47:
Merge branch 'i2c/for-current-fixed' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux (2018-03-03 14:55:20 -0800)
are available in the Git repository at:
git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git tags/perf-urgent-for-mingo-4.16-20180306
for you to fetch changes up to 8f2c9efabe1ed212b88ce1c5cf5e768385c9222e:
perf record: Combine some auxtrace initialization into a single function (2018-03-06 12:03:39 -0300)
----------------------------------------------------------------
perf/urgent fixes:
- Be more robust when drawing arrows in the annotation TUI, avoiding a
segfault when jump instructions have as a target addresses in functions
other that the one currently being annotated. The full fix will come in
the following days, when jumping to other functions will work as call
instructions (Arnaldo Carvalho de Melo)
- Prevent auxtrace_queues__process_index() from queuing AUX area data for
decoding when the --no-itrace option has been used (Adrian Hunter)
- Sync copy of kvm UAPI headers and x86's cpufeatures.h (Arnaldo Carvalho de Melo)
- Fix 'perf stat' CVS output format for non-supported counters (Ilya Pronin)
- Fix crash in 'perf record|perf report' pipe mode (Jiri Olsa)
- Fix annoying 'perf top' overwrite fallback message on older kernels (Kan Liang)
- Fix the usage on the 'perf kallsyms' man page (Sangwon Hong)
Signed-off-by: Arnaldo Carvalho de Melo <acme(a)redhat.com>
----------------------------------------------------------------
The following changes since commit 317660940fd9dddd3201c2f92e25c27902c753fa:
perf/x86/intel/uncore: Fix Skylake UPI event format (2018-03-04 09:59:00 +0100)
are available in the Git repository at:
git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git tags/perf-urgent-for-mingo-4.16-20180306
for you to fetch changes up to de19e5c3c51fdb1ff20d0f61d099db902ff7494b:
perf tools: Fix trigger class trigger_on() (2018-03-06 11:31:14 -0300)
----------------------------------------------------------------
perf/urgent fixes:
- Be more robust when drawing arrows in the annotation TUI, avoiding a
segfault when jump instructions have as a target addresses in functions
other that the one currently being annotated. The full fix will come in
the following days, when jumping to other functions will work as call
instructions (Arnaldo Carvalho de Melo)
- Prevent auxtrace_queues__process_index() from queuing AUX area data for
decoding when the --no-itrace option has been used (Adrian Hunter)
- Sync copy of kvm UAPI headers and x86's cpufeatures.h (Arnaldo Carvalho de Melo)
- Fix 'perf stat' CSV output format for non-supported counters (Ilya Pronin)
- Fix crash in 'perf record|perf report' pipe mode (Jiri Olsa)
- Fix annoying 'perf top' overwrite fallback message on older kernels (Kan Liang)
- Fix the usage on the 'perf kallsyms' man page (Sangwon Hong)
Signed-off-by: Arnaldo Carvalho de Melo <acme(a)redhat.com>
----------------------------------------------------------------
Adrian Hunter (2):
perf auxtrace: Prevent decoding when --no-itrace
perf tools: Fix trigger class trigger_on()
Arnaldo Carvalho de Melo (3):
perf annotate browser: Be more robust when drawing jump arrows
tools headers: Sync copy of kvm UAPI headers
tools headers: Sync x86's cpufeatures.h
Ilya Pronin (1):
perf stat: Fix CVS output format for non-supported counters
Jiri Olsa (1):
perf record: Fix crash in pipe mode
Kan Liang (1):
perf top: Fix annoying fallback message on older kernels
Sangwon Hong (1):
perf kallsyms: Fix the usage on the man page
tools/arch/x86/include/asm/cpufeatures.h | 1 +
tools/include/uapi/linux/kvm.h | 2 ++
tools/perf/Documentation/perf-kallsyms.txt | 2 +-
tools/perf/builtin-record.c | 9 +++++++++
tools/perf/builtin-stat.c | 2 +-
tools/perf/builtin-top.c | 2 +-
tools/perf/perf.h | 1 +
tools/perf/ui/browsers/annotate.c | 25 +++++++++++++++++++++++++
tools/perf/util/auxtrace.c | 15 +++++++++------
tools/perf/util/record.c | 8 ++++++--
tools/perf/util/trigger.h | 9 +++++----
11 files changed, 61 insertions(+), 15 deletions(-)
Test results:
The first ones are container (docker) based builds of tools/perf with and
without libelf support. Where clang is available, it is also used to build
perf with/without libelf.
The objtool and samples/bpf/ builds are disabled now that I'm switching from
using the sources in a local volume to fetching them from a http server to
build it inside the container, to make it easier to build in a container cluster.
Those will come back later.
Several are cross builds, the ones with -x-ARCH and the android one, and those
may not have all the features built, due to lack of multi-arch devel packages,
available and being used so far on just a few, like
debian:experimental-x-{arm64,mipsel}.
The 'perf test' one will perform a variety of tests exercising
tools/perf/util/, tools/lib/{bpf,traceevent,etc}, as well as run perf commands
with a variety of command line event specifications to then intercept the
sys_perf_event syscall to check that the perf_event_attr fields are set up as
expected, among a variety of other unit tests.
Then there is the 'make -C tools/perf build-test' ones, that build tools/perf/
with a variety of feature sets, exercising the build with an incomplete set of
features as well as with a complete one. It is planned to have it run on each
of the containers mentioned above, using some container orchestration
infrastructure. Get in contact if interested in helping having this in place.
# dm
1 alpine:3.4 : Ok gcc (Alpine 5.3.0) 5.3.0
2 alpine:3.5 : Ok gcc (Alpine 6.2.1) 6.2.1 20160822
3 alpine:3.6 : Ok gcc (Alpine 6.3.0) 6.3.0
4 alpine:edge : Ok gcc (Alpine 6.4.0) 6.4.0
5 amazonlinux:1 : Ok gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-11)
6 amazonlinux:2 : Ok gcc (GCC) 7.2.1 20170915 (Red Hat 7.2.1-2)
7 android-ndk:r12b-arm : Ok arm-linux-androideabi-gcc (GCC) 4.9.x 20150123 (prerelease)
8 android-ndk:r15c-arm : Ok arm-linux-androideabi-gcc (GCC) 4.9.x 20150123 (prerelease)
9 centos:5 : Ok gcc (GCC) 4.1.2 20080704 (Red Hat 4.1.2-55)
10 centos:6 : Ok gcc (GCC) 4.4.7 20120313 (Red Hat 4.4.7-18)
11 centos:7 : Ok gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-16)
12 debian:7 : Ok gcc (Debian 4.7.2-5) 4.7.2
13 debian:8 : Ok gcc (Debian 4.9.2-10+deb8u1) 4.9.2
14 debian:9 : Ok gcc (Debian 6.3.0-18) 6.3.0 20170516
15 debian:experimental : Ok gcc (Debian 7.2.0-17) 7.2.1 20171205
16 debian:experimental-x-arm64 : Ok aarch64-linux-gnu-gcc (Debian 7.2.0-11) 7.2.0
17 debian:experimental-x-mips : Ok mips-linux-gnu-gcc (Debian 7.2.0-11) 7.2.0
18 debian:experimental-x-mips64 : Ok mips64-linux-gnuabi64-gcc (Debian 7.2.0-11) 7.2.0
19 debian:experimental-x-mipsel : Ok mipsel-linux-gnu-gcc (Debian 7.2.0-11) 7.2.0
20 fedora:20 : Ok gcc (GCC) 4.8.3 20140911 (Red Hat 4.8.3-7)
21 fedora:21 : Ok gcc (GCC) 4.9.2 20150212 (Red Hat 4.9.2-6)
22 fedora:22 : Ok gcc (GCC) 5.3.1 20160406 (Red Hat 5.3.1-6)
23 fedora:23 : Ok gcc (GCC) 5.3.1 20160406 (Red Hat 5.3.1-6)
24 fedora:24 : Ok gcc (GCC) 6.3.1 20161221 (Red Hat 6.3.1-1)
25 fedora:24-x-ARC-uClibc : Ok arc-linux-gcc (ARCompact ISA Linux uClibc toolchain 2017.09-rc2) 7.1.1 20170710
26 fedora:25 : Ok gcc (GCC) 6.4.1 20170727 (Red Hat 6.4.1-1)
27 fedora:26 : Ok gcc (GCC) 7.2.1 20170915 (Red Hat 7.2.1-2)
28 fedora:27 : Ok gcc (GCC) 7.2.1 20170915 (Red Hat 7.2.1-2)
29 fedora:rawhide : Ok gcc (GCC) 7.2.1 20170829 (Red Hat 7.2.1-1)
30 gentoo-stage3-amd64:latest : Ok gcc (Gentoo 6.4.0-r1 p1.3) 6.4.0
31 mageia:5 : Ok gcc (GCC) 4.9.2
32 mageia:6 : Ok gcc (Mageia 5.4.0-5.mga6) 5.4.0
33 opensuse:42.1 : Ok gcc (SUSE Linux) 4.8.5
34 opensuse:42.2 : Ok gcc (SUSE Linux) 4.8.5
35 opensuse:42.3 : Ok gcc (SUSE Linux) 4.8.5
36 opensuse:tumbleweed : Ok gcc (SUSE Linux) 7.3.0
37 oraclelinux:6 : Ok gcc (GCC) 4.4.7 20120313 (Red Hat 4.4.7-18)
38 oraclelinux:7 : Ok gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-16)
39 ubuntu:12.04.5 : Ok gcc (Ubuntu/Linaro 4.6.3-1ubuntu5) 4.6.3
40 ubuntu:14.04.4 : Ok gcc (Ubuntu 4.8.4-2ubuntu1~14.04.3) 4.8.4
41 ubuntu:14.04.4-x-linaro-arm64 : Ok aarch64-linux-gnu-gcc (Linaro GCC 5.4-2017.05) 5.4.1 20170404
42 ubuntu:15.04 : Ok gcc (Ubuntu 4.9.2-10ubuntu13) 4.9.2
43 ubuntu:16.04 : Ok gcc (Ubuntu 5.4.0-6ubuntu1~16.04.5) 5.4.0 20160609
44 ubuntu:16.04-x-arm : Ok arm-linux-gnueabihf-gcc (Ubuntu/Linaro 5.4.0-6ubuntu1~16.04.4) 5.4.0 20160609
45 ubuntu:16.04-x-arm64 : Ok aarch64-linux-gnu-gcc (Ubuntu/Linaro 5.4.0-6ubuntu1~16.04.4) 5.4.0 20160609
46 ubuntu:16.04-x-powerpc : Ok powerpc-linux-gnu-gcc (Ubuntu 5.4.0-6ubuntu1~16.04.4) 5.4.0 20160609
47 ubuntu:16.04-x-powerpc64 : Ok powerpc64-linux-gnu-gcc (Ubuntu/IBM 5.4.0-6ubuntu1~16.04.1) 5.4.0 20160609
48 ubuntu:16.04-x-powerpc64el : Ok powerpc64le-linux-gnu-gcc (Ubuntu/IBM 5.4.0-6ubuntu1~16.04.4) 5.4.0 20160609
49 ubuntu:16.04-x-s390 : Ok s390x-linux-gnu-gcc (Ubuntu 5.4.0-6ubuntu1~16.04.4) 5.4.0 20160609
50 ubuntu:16.10 : Ok gcc (Ubuntu 6.2.0-5ubuntu12) 6.2.0 20161005
51 ubuntu:17.04 : Ok gcc (Ubuntu 6.3.0-12ubuntu2) 6.3.0 20170406
52 ubuntu:17.10 : Ok gcc (Ubuntu 7.2.0-8ubuntu3) 7.2.0
53 ubuntu:18.04 : Ok gcc (Ubuntu 7.2.0-16ubuntu1) 7.2.0
# uname -a
Linux jouet 4.16.0-rc4 #1 SMP Mon Mar 5 12:18:05 -03 2018 x86_64 x86_64 x86_64 GNU/Linux
# perf test
1: vmlinux symtab matches kallsyms : Ok
2: Detect openat syscall event : Ok
3: Detect openat syscall event on all cpus : Ok
4: Read samples using the mmap interface : Ok
5: Test data source output : Ok
6: Parse event definition strings : Ok
7: Simple expression parser : Ok
8: PERF_RECORD_* events & perf_sample fields : Ok
9: Parse perf pmu format : Ok
10: DSO data read : Ok
11: DSO data cache : Ok
12: DSO data reopen : Ok
13: Roundtrip evsel->name : Ok
14: Parse sched tracepoints fields : Ok
15: syscalls:sys_enter_openat event fields : Ok
16: Setup struct perf_event_attr : Ok
17: Match and link multiple hists : Ok
18: 'import perf' in python : Ok
19: Breakpoint overflow signal handler : Ok
20: Breakpoint overflow sampling : Ok
21: Number of exit events of a simple workload : Ok
22: Software clock events period values : Ok
23: Object code reading : Ok
24: Sample parsing : Ok
25: Use a dummy software event to keep tracking : Ok
26: Parse with no sample_id_all bit set : Ok
27: Filter hist entries : Ok
28: Lookup mmap thread : Ok
29: Share thread mg : Ok
30: Sort output of hist entries : Ok
31: Cumulate child hist entries : Ok
32: Track with sched_switch : Ok
33: Filter fds with revents mask in a fdarray : Ok
34: Add fd to a fdarray, making it autogrow : Ok
35: kmod_path__parse : Ok
36: Thread map : Ok
37: LLVM search and compile :
37.1: Basic BPF llvm compile : Ok
37.2: kbuild searching : Ok
37.3: Compile source for BPF prologue generation : Ok
37.4: Compile source for BPF relocation : Ok
38: Session topology : Ok
39: BPF filter :
39.1: Basic BPF filtering : Ok
39.2: BPF pinning : Ok
39.3: BPF prologue generation : Ok
39.4: BPF relocation checker : Ok
40: Synthesize thread map : Ok
41: Remove thread map : Ok
42: Synthesize cpu map : Ok
43: Synthesize stat config : Ok
44: Synthesize stat : Ok
45: Synthesize stat round : Ok
46: Synthesize attr update : Ok
47: Event times : Ok
48: Read backward ring buffer : Ok
49: Print cpu map : Ok
50: Probe SDT events : Ok
51: is_printable_array : Ok
52: Print bitmap : Ok
53: perf hooks : Ok
54: builtin clang support : Skip (not compiled in)
55: unit_number__scnprintf : Ok
56: x86 rdpmc : Ok
57: Convert perf time to TSC : Ok
58: DWARF unwind : Ok
59: x86 instruction decoder - new instructions : Ok
60: Use vfs_getname probe to get syscall args filenames : Ok
61: probe libc's inet_pton & backtrace it with ping : Ok
62: Check open filename arg using perf trace + vfs_getname: Ok
63: probe libc's inet_pton & backtrace it with ping : Ok
64: Add vfs_getname probe to get syscall args filenames : Ok
#
$ make -C tools/perf build-test
make: Entering directory '/home/acme/git/perf/tools/perf'
- tarpkg: ./tests/perf-targz-src-pkg .
make_no_auxtrace_O: make NO_AUXTRACE=1
make_debug_O: make DEBUG=1
make_no_gtk2_O: make NO_GTK2=1
make_install_bin_O: make install-bin
make_no_slang_O: make NO_SLANG=1
make_no_newt_O: make NO_NEWT=1
make_no_scripts_O: make NO_LIBPYTHON=1 NO_LIBPERL=1
make_no_ui_O: make NO_NEWT=1 NO_SLANG=1 NO_GTK2=1
make_perf_o_O: make perf.o
make_util_map_o_O: make util/map.o
make_pure_O: make
make_install_prefix_O: make install prefix=/tmp/krava
make_with_clangllvm_O: make LIBCLANGLLVM=1
make_util_pmu_bison_o_O: make util/pmu-bison.o
make_no_libpython_O: make NO_LIBPYTHON=1
make_no_libelf_O: make NO_LIBELF=1
make_install_O: make install
make_help_O: make help
make_clean_all_O: make clean all
make_no_libaudit_O: make NO_LIBAUDIT=1
make_doc_O: make doc
make_tags_O: make tags
make_no_libdw_dwarf_unwind_O: make NO_LIBDW_DWARF_UNWIND=1
make_install_prefix_slash_O: make install prefix=/tmp/krava/
make_no_backtrace_O: make NO_BACKTRACE=1
make_no_demangle_O: make NO_DEMANGLE=1
make_no_libbionic_O: make NO_LIBBIONIC=1
make_no_libnuma_O: make NO_LIBNUMA=1
make_with_babeltrace_O: make LIBBABELTRACE=1
make_no_libunwind_O: make NO_LIBUNWIND=1
make_minimal_O: make NO_LIBPERL=1 NO_LIBPYTHON=1 NO_NEWT=1 NO_GTK2=1 NO_DEMANGLE=1 NO_LIBELF=1 NO_LIBUNWIND=1 NO_BACKTRACE=1 NO_LIBNUMA=1 NO_LIBAUDIT=1 NO_LIBBIONIC=1 NO_LIBDW_DWARF_UNWIND=1 NO_AUXTRACE=1 NO_LIBBPF=1 NO_LIBCRYPTO=1 NO_SDT=1 NO_JVMTI=1
make_no_libperl_O: make NO_LIBPERL=1
make_no_libbpf_O: make NO_LIBBPF=1
make_static_O: make LDFLAGS=-static
OK
make: Leaving directory '/home/acme/git/perf/tools/perf'
$
The msleep() when processing EXT4_GOING_FLAGS_NOLOGFLUSH was a hack to
avoid some races (that are now fixed), but in fact it introduced its
own race.
Signed-off-by: Theodore Ts'o <tytso(a)mit.edu>
Cc: stable(a)vger.kernel.org
---
fs/ext4/ioctl.c | 4 +---
1 file changed, 1 insertion(+), 3 deletions(-)
diff --git a/fs/ext4/ioctl.c b/fs/ext4/ioctl.c
index 4d1b1575f8ac..16d3d1325f5b 100644
--- a/fs/ext4/ioctl.c
+++ b/fs/ext4/ioctl.c
@@ -498,10 +498,8 @@ static int ext4_shutdown(struct super_block *sb, unsigned long arg)
break;
case EXT4_GOING_FLAGS_NOLOGFLUSH:
set_bit(EXT4_FLAGS_SHUTDOWN, &sbi->s_ext4_flags);
- if (sbi->s_journal && !is_journal_aborted(sbi->s_journal)) {
- msleep(100);
+ if (sbi->s_journal && !is_journal_aborted(sbi->s_journal))
jbd2_journal_abort(sbi->s_journal, 0);
- }
break;
default:
return -EINVAL;
--
2.16.1.72.g5be1f00a9a
This updates the jbd2 superblock unnecessarily, and on an abort we
shouldn't truncate the log.
Signed-off-by: Theodore Ts'o <tytso(a)mit.edu>
Cc: stable(a)vger.kernel.org
---
fs/jbd2/journal.c | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/fs/jbd2/journal.c b/fs/jbd2/journal.c
index efa0c72a0b9f..dfb057900e79 100644
--- a/fs/jbd2/journal.c
+++ b/fs/jbd2/journal.c
@@ -974,7 +974,7 @@ int __jbd2_update_log_tail(journal_t *journal, tid_t tid, unsigned long block)
}
/*
- * This is a variaon of __jbd2_update_log_tail which checks for validity of
+ * This is a variation of __jbd2_update_log_tail which checks for validity of
* provided log tail and locks j_checkpoint_mutex. So it is safe against races
* with other threads updating log tail.
*/
@@ -1417,6 +1417,9 @@ int jbd2_journal_update_sb_log_tail(journal_t *journal, tid_t tail_tid,
journal_superblock_t *sb = journal->j_superblock;
int ret;
+ if (is_journal_aborted(journal))
+ return -EIO;
+
BUG_ON(!mutex_is_locked(&journal->j_checkpoint_mutex));
jbd_debug(1, "JBD2: updating superblock (start %lu, seq %u)\n",
tail_block, tail_tid);
--
2.16.1.72.g5be1f00a9a
On Thu, Feb 22, 2018 at 5:04 PM, Mike Snitzer <snitzer(a)redhat.com> wrote:
> On Thu, Feb 22 2018 at 10:56am -0500,
> Arnd Bergmann <arnd(a)arndb.de> wrote:
>
> Mikulas already sent a fix for this:
> https://patchwork.kernel.org/patch/10211631/
>
> But I like yours a bit better, though I'll likely move the declaration
> of 'noio_flag' temporary inside the conditional.
>
> Anyway, I'll get this fixed up shortly, thanks.
I see the fix made it into linux-next on the same day, but the build bots still
report the warning for mainline kernels and now also for stable kernels
that got a backport of the patch that introduced it on arm64.
I assume you had not planned to send it for mainline, any chance you
could change that and send it as a bugfix with a 'Cc:
stable(a)vger.kernel.org' tag to restore a clean build?
Arnd
On Tue, Mar 06, 2018 at 02:26:34PM +0000, Mark Brown wrote:
> On Mon, Mar 05, 2018 at 02:08:59PM +0100, Greg KH wrote:
>
> > I know there is lots more than Android to ARM, but the huge majority by
> > quantity is Android.
>
> > What I'm saying here is look at all of the backports that were required
> > to get this working in the android tree. It was non-trivial by a long
> > shot, and based on that work, this series feels really "small" and I'm
> > really worried that it's not really working or solving the problem here.
>
> Unfortunately what's been coming over was just the bit about using
> android-common, not the bit about why you're worried about the code. :(
Sorry, it's been a long few months, my ability to communicate well about
this topic is tough at times without assuming everyone else has been
dealing with it for as long as some of us have.
> > There are major features that were backported to the android trees for
> > ARM that the upstream features for Spectre and Meltdown built on top of
> > to get their solution. To not backport all of that is a huge risk,
> > right?
>
> I'm not far enough into the details to comment on the specifics here;
> there's other people in the CCs who are. Let's let people look at the
> code and see if they think some of the fixes are useful in LTS. The
> Android tree does have things beyond what's in LTS and there's been more
> time for analysis since the changes were made there.
I suggest looking at the backports in the android-common tree that are
needed for this "feature" to work properly, and pull them out and test
them if you really want it in your Linaro trees. If you think some of
them should be added to the LTS kernels, I'll be glad to consider them,
but don't do a hack to try to work around the lack of these features,
otherwise you will not be happy in the long-run.
Again, look at the mess we have for x86 in 4.4.y and 4.9.y. You do not
want that for ARM for the simple reason that ARM systems usually last
"longer" with those old kernels than the x86 systems do.
> > So that's why I keep pointing people at the android trees. Look at what
> > they did there. There's nothing stoping anyone who is really insistant
> > on staying on these old kernel versions from pulling from those branches
> > to get these bugfixes in a known stable, and tested, implementation.
>
> I think there's enough stuff going on in the Android tree to make that
> unpalatable for a good segment of users.
Really? Like what? Last I looked it's only about 300 or so patches.
Something like less than .5% of the normal SoC backport size for any ARM
system recently. There were some numbers published a few months ago
about the real count, I can dig them up if you are curious.
> > Or just move to 4.14.y. Seriously, that's probably the safest thing in
> > the long run for anyone here. And when you realize you can't do that,
> > go yell at your SoC for forcing you into the nightmare that they conned
> > you into by their 3+ million lines added to their kernel tree. You were
> > always living on borowed time, and it looks like that time is finally
> > up...
>
> Yes, there are some people who are stuck with enormous out of tree patch
> sets on most architectures (just look at the enterprise distros!) - but
> there are also people who are at or very close to vanilla and just
> trying to control their validation costs by not changing too much when
> they don't need to.
Great, then move to 4.14.y :)
And before someone says "but it takes more to validate a new kernel
version than it does to just validate a core backport for the
architecture code", well...
> There's a good discussion to be had about it being sensible for people
> to accept more change in that segment of the market but equally those
> same attitudes have been an important part of the pressure that's been
> placed on vendors long term to get things in mainline.
>
> > [1] It's also why I keep doing the LTS merges into the android-common
> > trees within days of the upstream LTS release (today being an
> > exception). That way once you do a pull/merge, you can just keep
> > always merging to keep a secure device that is always up to date
> > with the latest LTS releases in a simple way. How much easier can I
> > make it for the ARM ecosystem here, really?
>
> That's great for the Android ecosystem, it's fantastic work and is doing
> a lot to overcome resistances people had there to merging up the LTS
> which is going to help many people. While that's a very large part of
> ARM ecosystem it's not all of it, there are also chip vendors and system
> integrators who have made deliberate choices to minimize out of tree
> code just as we've been encouraging them to.
Again great, go use 4.14.y for those systems please. It's better in the
long run.
thanks,
greg k-h
This is a note to let you know that I've just added the patch titled
nospec: Allow index argument to have const-qualified type
to the 4.9-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
nospec-allow-index-argument-to-have-const-qualified-type.patch
and it can be found in the queue-4.9 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From b98c6a160a057d5686a8c54c79cc6c8c94a7d0c8 Mon Sep 17 00:00:00 2001
From: Rasmus Villemoes <linux(a)rasmusvillemoes.dk>
Date: Fri, 16 Feb 2018 13:20:48 -0800
Subject: nospec: Allow index argument to have const-qualified type
From: Rasmus Villemoes <linux(a)rasmusvillemoes.dk>
commit b98c6a160a057d5686a8c54c79cc6c8c94a7d0c8 upstream.
The last expression in a statement expression need not be a bare
variable, quoting gcc docs
The last thing in the compound statement should be an expression
followed by a semicolon; the value of this subexpression serves as the
value of the entire construct.
and we already use that in e.g. the min/max macros which end with a
ternary expression.
This way, we can allow index to have const-qualified type, which will in
some cases avoid the need for introducing a local copy of index of
non-const qualified type. That, in turn, can prevent readers not
familiar with the internals of array_index_nospec from wondering about
the seemingly redundant extra variable, and I think that's worthwhile
considering how confusing the whole _nospec business is.
The expression _i&_mask has type unsigned long (since that is the type
of _mask, and the BUILD_BUG_ONs guarantee that _i will get promoted to
that), so in order not to change the type of the whole expression, add
a cast back to typeof(_i).
Signed-off-by: Rasmus Villemoes <linux(a)rasmusvillemoes.dk>
Signed-off-by: Dan Williams <dan.j.williams(a)intel.com>
Acked-by: Linus Torvalds <torvalds(a)linux-foundation.org>
Cc: Andy Lutomirski <luto(a)kernel.org>
Cc: Arjan van de Ven <arjan(a)linux.intel.com>
Cc: Borislav Petkov <bp(a)alien8.de>
Cc: Dave Hansen <dave.hansen(a)linux.intel.com>
Cc: David Woodhouse <dwmw2(a)infradead.org>
Cc: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Cc: Josh Poimboeuf <jpoimboe(a)redhat.com>
Cc: Peter Zijlstra <peterz(a)infradead.org>
Cc: Thomas Gleixner <tglx(a)linutronix.de>
Cc: Will Deacon <will.deacon(a)arm.com>
Cc: linux-arch(a)vger.kernel.org
Cc: stable(a)vger.kernel.org
Link: http://lkml.kernel.org/r/151881604837.17395.10812767547837568328.stgit@dwil…
Signed-off-by: Ingo Molnar <mingo(a)kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
include/linux/nospec.h | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)
--- a/include/linux/nospec.h
+++ b/include/linux/nospec.h
@@ -72,7 +72,6 @@ static inline unsigned long array_index_
BUILD_BUG_ON(sizeof(_i) > sizeof(long)); \
BUILD_BUG_ON(sizeof(_s) > sizeof(long)); \
\
- _i &= _mask; \
- _i; \
+ (typeof(_i)) (_i & _mask); \
})
#endif /* _LINUX_NOSPEC_H */
Patches currently in stable-queue which might be from linux(a)rasmusvillemoes.dk are
queue-4.9/nospec-allow-index-argument-to-have-const-qualified-type.patch
This is a note to let you know that I've just added the patch titled
KVM/x86: Remove indirect MSR op calls from SPEC_CTRL
to the 4.9-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
kvm-x86-remove-indirect-msr-op-calls-from-spec_ctrl.patch
and it can be found in the queue-4.9 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From ecb586bd29c99fb4de599dec388658e74388daad Mon Sep 17 00:00:00 2001
From: Paolo Bonzini <pbonzini(a)redhat.com>
Date: Thu, 22 Feb 2018 16:43:17 +0100
Subject: KVM/x86: Remove indirect MSR op calls from SPEC_CTRL
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
From: Paolo Bonzini <pbonzini(a)redhat.com>
commit ecb586bd29c99fb4de599dec388658e74388daad upstream.
Having a paravirt indirect call in the IBRS restore path is not a
good idea, since we are trying to protect from speculative execution
of bogus indirect branch targets. It is also slower, so use
native_wrmsrl() on the vmentry path too.
Signed-off-by: Paolo Bonzini <pbonzini(a)redhat.com>
Reviewed-by: Jim Mattson <jmattson(a)google.com>
Cc: David Woodhouse <dwmw(a)amazon.co.uk>
Cc: KarimAllah Ahmed <karahmed(a)amazon.de>
Cc: Linus Torvalds <torvalds(a)linux-foundation.org>
Cc: Peter Zijlstra <peterz(a)infradead.org>
Cc: Radim Krčmář <rkrcmar(a)redhat.com>
Cc: Thomas Gleixner <tglx(a)linutronix.de>
Cc: kvm(a)vger.kernel.org
Cc: stable(a)vger.kernel.org
Fixes: d28b387fb74da95d69d2615732f50cceb38e9a4d
Link: http://lkml.kernel.org/r/20180222154318.20361-2-pbonzini@redhat.com
Signed-off-by: Ingo Molnar <mingo(a)kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
arch/x86/kvm/svm.c | 7 ++++---
arch/x86/kvm/vmx.c | 7 ++++---
2 files changed, 8 insertions(+), 6 deletions(-)
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -44,6 +44,7 @@
#include <asm/debugreg.h>
#include <asm/kvm_para.h>
#include <asm/irq_remapping.h>
+#include <asm/microcode.h>
#include <asm/nospec-branch.h>
#include <asm/virtext.h>
@@ -4919,7 +4920,7 @@ static void svm_vcpu_run(struct kvm_vcpu
* being speculatively taken.
*/
if (svm->spec_ctrl)
- wrmsrl(MSR_IA32_SPEC_CTRL, svm->spec_ctrl);
+ native_wrmsrl(MSR_IA32_SPEC_CTRL, svm->spec_ctrl);
asm volatile (
"push %%" _ASM_BP "; \n\t"
@@ -5029,10 +5030,10 @@ static void svm_vcpu_run(struct kvm_vcpu
* save it.
*/
if (!msr_write_intercepted(vcpu, MSR_IA32_SPEC_CTRL))
- rdmsrl(MSR_IA32_SPEC_CTRL, svm->spec_ctrl);
+ svm->spec_ctrl = native_read_msr(MSR_IA32_SPEC_CTRL);
if (svm->spec_ctrl)
- wrmsrl(MSR_IA32_SPEC_CTRL, 0);
+ native_wrmsrl(MSR_IA32_SPEC_CTRL, 0);
/* Eliminate branch target predictions from guest mode */
vmexit_fill_RSB();
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -49,6 +49,7 @@
#include <asm/kexec.h>
#include <asm/apic.h>
#include <asm/irq_remapping.h>
+#include <asm/microcode.h>
#include <asm/nospec-branch.h>
#include "trace.h"
@@ -8906,7 +8907,7 @@ static void __noclone vmx_vcpu_run(struc
* being speculatively taken.
*/
if (vmx->spec_ctrl)
- wrmsrl(MSR_IA32_SPEC_CTRL, vmx->spec_ctrl);
+ native_wrmsrl(MSR_IA32_SPEC_CTRL, vmx->spec_ctrl);
vmx->__launched = vmx->loaded_vmcs->launched;
asm(
@@ -9042,10 +9043,10 @@ static void __noclone vmx_vcpu_run(struc
* save it.
*/
if (!msr_write_intercepted(vcpu, MSR_IA32_SPEC_CTRL))
- rdmsrl(MSR_IA32_SPEC_CTRL, vmx->spec_ctrl);
+ vmx->spec_ctrl = native_read_msr(MSR_IA32_SPEC_CTRL);
if (vmx->spec_ctrl)
- wrmsrl(MSR_IA32_SPEC_CTRL, 0);
+ native_wrmsrl(MSR_IA32_SPEC_CTRL, 0);
/* Eliminate branch target predictions from guest mode */
vmexit_fill_RSB();
Patches currently in stable-queue which might be from pbonzini(a)redhat.com are
queue-4.9/kvm-x86-fix-smram-accessing-even-if-vm-is-shutdown.patch
queue-4.9/kvm-x86-remove-indirect-msr-op-calls-from-spec_ctrl.patch
queue-4.9/kvm-mmu-fix-overlap-between-public-and-private-memslots.patch
queue-4.9/kvm-vmx-optimize-vmx_vcpu_run-and-svm_vcpu_run-by-marking-the-rdmsr-path-as-unlikely.patch
This is a note to let you know that I've just added the patch titled
KVM: X86: Fix SMRAM accessing even if VM is shutdown
to the 4.9-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
kvm-x86-fix-smram-accessing-even-if-vm-is-shutdown.patch
and it can be found in the queue-4.9 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From 95e057e25892eaa48cad1e2d637b80d0f1a4fac5 Mon Sep 17 00:00:00 2001
From: Wanpeng Li <wanpengli(a)tencent.com>
Date: Thu, 8 Feb 2018 15:32:45 +0800
Subject: KVM: X86: Fix SMRAM accessing even if VM is shutdown
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
From: Wanpeng Li <wanpengli(a)tencent.com>
commit 95e057e25892eaa48cad1e2d637b80d0f1a4fac5 upstream.
Reported by syzkaller:
WARNING: CPU: 6 PID: 2434 at arch/x86/kvm/vmx.c:6660 handle_ept_misconfig+0x54/0x1e0 [kvm_intel]
CPU: 6 PID: 2434 Comm: repro_test Not tainted 4.15.0+ #4
RIP: 0010:handle_ept_misconfig+0x54/0x1e0 [kvm_intel]
Call Trace:
vmx_handle_exit+0xbd/0xe20 [kvm_intel]
kvm_arch_vcpu_ioctl_run+0xdaf/0x1d50 [kvm]
kvm_vcpu_ioctl+0x3e9/0x720 [kvm]
do_vfs_ioctl+0xa4/0x6a0
SyS_ioctl+0x79/0x90
entry_SYSCALL_64_fastpath+0x25/0x9c
The testcase creates a first thread to issue KVM_SMI ioctl, and then creates
a second thread to mmap and operate on the same vCPU. This triggers a race
condition when running the testcase with multiple threads. Sometimes one thread
exits with a triple fault while another thread mmaps and operates on the same
vCPU. Because CS=0x3000/IP=0x8000 is not mapped, accessing the SMI handler
results in an EPT misconfig. This patch fixes it by returning RET_PF_EMULATE
in kvm_handle_bad_page(), which will go on to cause an emulation failure and an
exit with KVM_EXIT_INTERNAL_ERROR.
Reported-by: syzbot+c1d9517cab094dae65e446c0c5b4de6c40f4dc58(a)syzkaller.appspotmail.com
Cc: Paolo Bonzini <pbonzini(a)redhat.com>
Cc: Radim Krčmář <rkrcmar(a)redhat.com>
Cc: stable(a)vger.kernel.org
Signed-off-by: Wanpeng Li <wanpengli(a)tencent.com>
Signed-off-by: Paolo Bonzini <pbonzini(a)redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
arch/x86/kvm/mmu.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
--- a/arch/x86/kvm/mmu.c
+++ b/arch/x86/kvm/mmu.c
@@ -2808,7 +2808,7 @@ static int kvm_handle_bad_page(struct kv
return 0;
}
- return -EFAULT;
+ return RET_PF_EMULATE;
}
static void transparent_hugepage_adjust(struct kvm_vcpu *vcpu,
Patches currently in stable-queue which might be from wanpengli(a)tencent.com are
queue-4.9/kvm-x86-fix-smram-accessing-even-if-vm-is-shutdown.patch
This is a note to let you know that I've just added the patch titled
KVM/VMX: Optimize vmx_vcpu_run() and svm_vcpu_run() by marking the RDMSR path as unlikely()
to the 4.9-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
kvm-vmx-optimize-vmx_vcpu_run-and-svm_vcpu_run-by-marking-the-rdmsr-path-as-unlikely.patch
and it can be found in the queue-4.9 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From 946fbbc13dce68902f64515b610eeb2a6c3d7a64 Mon Sep 17 00:00:00 2001
From: Paolo Bonzini <pbonzini(a)redhat.com>
Date: Thu, 22 Feb 2018 16:43:18 +0100
Subject: KVM/VMX: Optimize vmx_vcpu_run() and svm_vcpu_run() by marking the RDMSR path as unlikely()
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
From: Paolo Bonzini <pbonzini(a)redhat.com>
commit 946fbbc13dce68902f64515b610eeb2a6c3d7a64 upstream.
vmx_vcpu_run() and svm_vcpu_run() are large functions, and giving
branch hints to the compiler can actually make a substantial cycle
difference by keeping the fast path contiguous in memory.
With this optimization, the retpoline-guest/retpoline-host case is
about 50 cycles faster.
Signed-off-by: Paolo Bonzini <pbonzini(a)redhat.com>
Reviewed-by: Jim Mattson <jmattson(a)google.com>
Cc: David Woodhouse <dwmw(a)amazon.co.uk>
Cc: KarimAllah Ahmed <karahmed(a)amazon.de>
Cc: Linus Torvalds <torvalds(a)linux-foundation.org>
Cc: Peter Zijlstra <peterz(a)infradead.org>
Cc: Radim Krčmář <rkrcmar(a)redhat.com>
Cc: Thomas Gleixner <tglx(a)linutronix.de>
Cc: kvm(a)vger.kernel.org
Cc: stable(a)vger.kernel.org
Link: http://lkml.kernel.org/r/20180222154318.20361-3-pbonzini@redhat.com
Signed-off-by: Ingo Molnar <mingo(a)kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
arch/x86/kvm/svm.c | 2 +-
arch/x86/kvm/vmx.c | 2 +-
2 files changed, 2 insertions(+), 2 deletions(-)
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -5029,7 +5029,7 @@ static void svm_vcpu_run(struct kvm_vcpu
* If the L02 MSR bitmap does not intercept the MSR, then we need to
* save it.
*/
- if (!msr_write_intercepted(vcpu, MSR_IA32_SPEC_CTRL))
+ if (unlikely(!msr_write_intercepted(vcpu, MSR_IA32_SPEC_CTRL)))
svm->spec_ctrl = native_read_msr(MSR_IA32_SPEC_CTRL);
if (svm->spec_ctrl)
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -9042,7 +9042,7 @@ static void __noclone vmx_vcpu_run(struc
* If the L02 MSR bitmap does not intercept the MSR, then we need to
* save it.
*/
- if (!msr_write_intercepted(vcpu, MSR_IA32_SPEC_CTRL))
+ if (unlikely(!msr_write_intercepted(vcpu, MSR_IA32_SPEC_CTRL)))
vmx->spec_ctrl = native_read_msr(MSR_IA32_SPEC_CTRL);
if (vmx->spec_ctrl)
Patches currently in stable-queue which might be from pbonzini(a)redhat.com are
queue-4.9/kvm-x86-fix-smram-accessing-even-if-vm-is-shutdown.patch
queue-4.9/kvm-x86-remove-indirect-msr-op-calls-from-spec_ctrl.patch
queue-4.9/kvm-mmu-fix-overlap-between-public-and-private-memslots.patch
queue-4.9/kvm-vmx-optimize-vmx_vcpu_run-and-svm_vcpu_run-by-marking-the-rdmsr-path-as-unlikely.patch
This is a note to let you know that I've just added the patch titled
ARM: mvebu: Fix broken PL310_ERRATA_753970 selects
to the 4.9-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
arm-mvebu-fix-broken-pl310_errata_753970-selects.patch
and it can be found in the queue-4.9 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From 8aa36a8dcde3183d84db7b0d622ffddcebb61077 Mon Sep 17 00:00:00 2001
From: Ulf Magnusson <ulfalizer(a)gmail.com>
Date: Mon, 5 Feb 2018 02:21:13 +0100
Subject: ARM: mvebu: Fix broken PL310_ERRATA_753970 selects
From: Ulf Magnusson <ulfalizer(a)gmail.com>
commit 8aa36a8dcde3183d84db7b0d622ffddcebb61077 upstream.
The MACH_ARMADA_375 and MACH_ARMADA_38X boards select ARM_ERRATA_753970,
but it was renamed to PL310_ERRATA_753970 by commit fa0ce4035d48 ("ARM:
7162/1: errata: tidy up Kconfig options for PL310 errata workarounds").
Fix the selects to use the new name.
Discovered with the
https://github.com/ulfalizer/Kconfiglib/blob/master/examples/list_undefined…
script.
Fixes: fa0ce4035d48 ("ARM: 7162/1: errata: tidy up Kconfig options for
PL310 errata workarounds"
cc: stable(a)vger.kernel.org
Signed-off-by: Ulf Magnusson <ulfalizer(a)gmail.com>
Signed-off-by: Gregory CLEMENT <gregory.clement(a)bootlin.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
arch/arm/mach-mvebu/Kconfig | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
--- a/arch/arm/mach-mvebu/Kconfig
+++ b/arch/arm/mach-mvebu/Kconfig
@@ -42,7 +42,7 @@ config MACH_ARMADA_375
depends on ARCH_MULTI_V7
select ARMADA_370_XP_IRQ
select ARM_ERRATA_720789
- select ARM_ERRATA_753970
+ select PL310_ERRATA_753970
select ARM_GIC
select ARMADA_375_CLK
select HAVE_ARM_SCU
@@ -58,7 +58,7 @@ config MACH_ARMADA_38X
bool "Marvell Armada 380/385 boards"
depends on ARCH_MULTI_V7
select ARM_ERRATA_720789
- select ARM_ERRATA_753970
+ select PL310_ERRATA_753970
select ARM_GIC
select ARMADA_370_XP_IRQ
select ARMADA_38X_CLK
Patches currently in stable-queue which might be from ulfalizer(a)gmail.com are
queue-4.9/arm-mvebu-fix-broken-pl310_errata_753970-selects.patch
This is a note to let you know that I've just added the patch titled
KVM: mmu: Fix overlap between public and private memslots
to the 4.9-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
kvm-mmu-fix-overlap-between-public-and-private-memslots.patch
and it can be found in the queue-4.9 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From b28676bb8ae4569cced423dc2a88f7cb319d5379 Mon Sep 17 00:00:00 2001
From: Wanpeng Li <wanpeng.li(a)hotmail.com>
Date: Tue, 13 Feb 2018 15:36:00 +0100
Subject: KVM: mmu: Fix overlap between public and private memslots
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
From: Wanpeng Li <wanpeng.li(a)hotmail.com>
commit b28676bb8ae4569cced423dc2a88f7cb319d5379 upstream.
Reported by syzkaller:
pte_list_remove: ffff9714eb1f8078 0->BUG
------------[ cut here ]------------
kernel BUG at arch/x86/kvm/mmu.c:1157!
invalid opcode: 0000 [#1] SMP
RIP: 0010:pte_list_remove+0x11b/0x120 [kvm]
Call Trace:
drop_spte+0x83/0xb0 [kvm]
mmu_page_zap_pte+0xcc/0xe0 [kvm]
kvm_mmu_prepare_zap_page+0x81/0x4a0 [kvm]
kvm_mmu_invalidate_zap_all_pages+0x159/0x220 [kvm]
kvm_arch_flush_shadow_all+0xe/0x10 [kvm]
kvm_mmu_notifier_release+0x6c/0xa0 [kvm]
? kvm_mmu_notifier_release+0x5/0xa0 [kvm]
__mmu_notifier_release+0x79/0x110
? __mmu_notifier_release+0x5/0x110
exit_mmap+0x15a/0x170
? do_exit+0x281/0xcb0
mmput+0x66/0x160
do_exit+0x2c9/0xcb0
? __context_tracking_exit.part.5+0x4a/0x150
do_group_exit+0x50/0xd0
SyS_exit_group+0x14/0x20
do_syscall_64+0x73/0x1f0
entry_SYSCALL64_slow_path+0x25/0x25
The reason is that when creates new memslot, there is no guarantee for new
memslot not overlap with private memslots. This can be triggered by the
following program:
#include <fcntl.h>
#include <pthread.h>
#include <setjmp.h>
#include <signal.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/ioctl.h>
#include <sys/stat.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>
#include <linux/kvm.h>
long r[16];
int main()
{
void *p = valloc(0x4000);
r[2] = open("/dev/kvm", 0);
r[3] = ioctl(r[2], KVM_CREATE_VM, 0x0ul);
uint64_t addr = 0xf000;
ioctl(r[3], KVM_SET_IDENTITY_MAP_ADDR, &addr);
r[6] = ioctl(r[3], KVM_CREATE_VCPU, 0x0ul);
ioctl(r[3], KVM_SET_TSS_ADDR, 0x0ul);
ioctl(r[6], KVM_RUN, 0);
ioctl(r[6], KVM_RUN, 0);
struct kvm_userspace_memory_region mr = {
.slot = 0,
.flags = KVM_MEM_LOG_DIRTY_PAGES,
.guest_phys_addr = 0xf000,
.memory_size = 0x4000,
.userspace_addr = (uintptr_t) p
};
ioctl(r[3], KVM_SET_USER_MEMORY_REGION, &mr);
return 0;
}
This patch fixes the bug by not adding a new memslot even if it
overlaps with private memslots.
Reported-by: Dmitry Vyukov <dvyukov(a)google.com>
Cc: Paolo Bonzini <pbonzini(a)redhat.com>
Cc: Radim Krčmář <rkrcmar(a)redhat.com>
Cc: Dmitry Vyukov <dvyukov(a)google.com>
Cc: Eric Biggers <ebiggers3(a)gmail.com>
Cc: stable(a)vger.kernel.org
Signed-off-by: Wanpeng Li <wanpeng.li(a)hotmail.com>
---
virt/kvm/kvm_main.c | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -976,8 +976,7 @@ int __kvm_set_memory_region(struct kvm *
/* Check for overlaps */
r = -EEXIST;
kvm_for_each_memslot(slot, __kvm_memslots(kvm, as_id)) {
- if ((slot->id >= KVM_USER_MEM_SLOTS) ||
- (slot->id == id))
+ if (slot->id == id)
continue;
if (!((base_gfn + npages <= slot->base_gfn) ||
(base_gfn >= slot->base_gfn + slot->npages)))
Patches currently in stable-queue which might be from wanpeng.li(a)hotmail.com are
queue-4.9/kvm-mmu-fix-overlap-between-public-and-private-memslots.patch
This is a note to let you know that I've just added the patch titled
ARM: kvm: fix building with gcc-8
to the 4.9-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
arm-kvm-fix-building-with-gcc-8.patch
and it can be found in the queue-4.9 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From 67870eb1204223598ea6d8a4467b482e9f5875b5 Mon Sep 17 00:00:00 2001
From: Arnd Bergmann <arnd(a)arndb.de>
Date: Fri, 2 Feb 2018 16:07:34 +0100
Subject: ARM: kvm: fix building with gcc-8
From: Arnd Bergmann <arnd(a)arndb.de>
commit 67870eb1204223598ea6d8a4467b482e9f5875b5 upstream.
In banked-sr.c, we use a top-level '__asm__(".arch_extension virt")'
statement to allow compilation of a multi-CPU kernel for ARMv6
and older ARMv7-A that don't normally support access to the banked
registers.
This is considered to be a programming error by the gcc developers
and will no longer work in gcc-8, where we now get a build error:
/tmp/cc4Qy7GR.s:34: Error: Banked registers are not available with this architecture. -- `mrs r3,SP_usr'
/tmp/cc4Qy7GR.s:41: Error: Banked registers are not available with this architecture. -- `mrs r3,ELR_hyp'
/tmp/cc4Qy7GR.s:55: Error: Banked registers are not available with this architecture. -- `mrs r3,SP_svc'
/tmp/cc4Qy7GR.s:62: Error: Banked registers are not available with this architecture. -- `mrs r3,LR_svc'
/tmp/cc4Qy7GR.s:69: Error: Banked registers are not available with this architecture. -- `mrs r3,SPSR_svc'
/tmp/cc4Qy7GR.s:76: Error: Banked registers are not available with this architecture. -- `mrs r3,SP_abt'
Passign the '-march-armv7ve' flag to gcc works, and is ok here, because
we know the functions won't ever be called on pre-ARMv7VE machines.
Unfortunately, older compiler versions (4.8 and earlier) do not understand
that flag, so we still need to keep the asm around.
Backporting to stable kernels (4.6+) is needed to allow those to be built
with future compilers as well.
Link: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84129
Fixes: 33280b4cd1dc ("ARM: KVM: Add banked registers save/restore")
Cc: stable(a)vger.kernel.org
Signed-off-by: Arnd Bergmann <arnd(a)arndb.de>
Signed-off-by: Christoffer Dall <christoffer.dall(a)linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
arch/arm/kvm/hyp/Makefile | 5 +++++
arch/arm/kvm/hyp/banked-sr.c | 4 ++++
2 files changed, 9 insertions(+)
--- a/arch/arm/kvm/hyp/Makefile
+++ b/arch/arm/kvm/hyp/Makefile
@@ -6,6 +6,8 @@ ccflags-y += -fno-stack-protector -DDISA
KVM=../../../../virt/kvm
+CFLAGS_ARMV7VE :=$(call cc-option, -march=armv7ve)
+
obj-$(CONFIG_KVM_ARM_HOST) += $(KVM)/arm/hyp/vgic-v2-sr.o
obj-$(CONFIG_KVM_ARM_HOST) += $(KVM)/arm/hyp/vgic-v3-sr.o
obj-$(CONFIG_KVM_ARM_HOST) += $(KVM)/arm/hyp/timer-sr.o
@@ -14,7 +16,10 @@ obj-$(CONFIG_KVM_ARM_HOST) += tlb.o
obj-$(CONFIG_KVM_ARM_HOST) += cp15-sr.o
obj-$(CONFIG_KVM_ARM_HOST) += vfp.o
obj-$(CONFIG_KVM_ARM_HOST) += banked-sr.o
+CFLAGS_banked-sr.o += $(CFLAGS_ARMV7VE)
+
obj-$(CONFIG_KVM_ARM_HOST) += entry.o
obj-$(CONFIG_KVM_ARM_HOST) += hyp-entry.o
obj-$(CONFIG_KVM_ARM_HOST) += switch.o
+CFLAGS_switch.o += $(CFLAGS_ARMV7VE)
obj-$(CONFIG_KVM_ARM_HOST) += s2-setup.o
--- a/arch/arm/kvm/hyp/banked-sr.c
+++ b/arch/arm/kvm/hyp/banked-sr.c
@@ -20,6 +20,10 @@
#include <asm/kvm_hyp.h>
+/*
+ * gcc before 4.9 doesn't understand -march=armv7ve, so we have to
+ * trick the assembler.
+ */
__asm__(".arch_extension virt");
void __hyp_text __banked_save_state(struct kvm_cpu_context *ctxt)
Patches currently in stable-queue which might be from arnd(a)arndb.de are
queue-4.9/tpm-constify-transmit-data-pointers.patch
queue-4.9/arm-kvm-fix-building-with-gcc-8.patch
This is a note to let you know that I've just added the patch titled
nospec: Allow index argument to have const-qualified type
to the 4.4-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
nospec-allow-index-argument-to-have-const-qualified-type.patch
and it can be found in the queue-4.4 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From b98c6a160a057d5686a8c54c79cc6c8c94a7d0c8 Mon Sep 17 00:00:00 2001
From: Rasmus Villemoes <linux(a)rasmusvillemoes.dk>
Date: Fri, 16 Feb 2018 13:20:48 -0800
Subject: nospec: Allow index argument to have const-qualified type
From: Rasmus Villemoes <linux(a)rasmusvillemoes.dk>
commit b98c6a160a057d5686a8c54c79cc6c8c94a7d0c8 upstream.
The last expression in a statement expression need not be a bare
variable, quoting gcc docs
The last thing in the compound statement should be an expression
followed by a semicolon; the value of this subexpression serves as the
value of the entire construct.
and we already use that in e.g. the min/max macros which end with a
ternary expression.
This way, we can allow index to have const-qualified type, which will in
some cases avoid the need for introducing a local copy of index of
non-const qualified type. That, in turn, can prevent readers not
familiar with the internals of array_index_nospec from wondering about
the seemingly redundant extra variable, and I think that's worthwhile
considering how confusing the whole _nospec business is.
The expression _i&_mask has type unsigned long (since that is the type
of _mask, and the BUILD_BUG_ONs guarantee that _i will get promoted to
that), so in order not to change the type of the whole expression, add
a cast back to typeof(_i).
Signed-off-by: Rasmus Villemoes <linux(a)rasmusvillemoes.dk>
Signed-off-by: Dan Williams <dan.j.williams(a)intel.com>
Acked-by: Linus Torvalds <torvalds(a)linux-foundation.org>
Cc: Andy Lutomirski <luto(a)kernel.org>
Cc: Arjan van de Ven <arjan(a)linux.intel.com>
Cc: Borislav Petkov <bp(a)alien8.de>
Cc: Dave Hansen <dave.hansen(a)linux.intel.com>
Cc: David Woodhouse <dwmw2(a)infradead.org>
Cc: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Cc: Josh Poimboeuf <jpoimboe(a)redhat.com>
Cc: Peter Zijlstra <peterz(a)infradead.org>
Cc: Thomas Gleixner <tglx(a)linutronix.de>
Cc: Will Deacon <will.deacon(a)arm.com>
Cc: linux-arch(a)vger.kernel.org
Cc: stable(a)vger.kernel.org
Link: http://lkml.kernel.org/r/151881604837.17395.10812767547837568328.stgit@dwil…
Signed-off-by: Ingo Molnar <mingo(a)kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
include/linux/nospec.h | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)
--- a/include/linux/nospec.h
+++ b/include/linux/nospec.h
@@ -66,7 +66,6 @@ static inline unsigned long array_index_
BUILD_BUG_ON(sizeof(_i) > sizeof(long)); \
BUILD_BUG_ON(sizeof(_s) > sizeof(long)); \
\
- _i &= _mask; \
- _i; \
+ (typeof(_i)) (_i & _mask); \
})
#endif /* _LINUX_NOSPEC_H */
Patches currently in stable-queue which might be from linux(a)rasmusvillemoes.dk are
queue-4.4/nospec-allow-index-argument-to-have-const-qualified-type.patch
This is a note to let you know that I've just added the patch titled
KVM: X86: Fix SMRAM accessing even if VM is shutdown
to the 4.4-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
kvm-x86-fix-smram-accessing-even-if-vm-is-shutdown.patch
and it can be found in the queue-4.4 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From 95e057e25892eaa48cad1e2d637b80d0f1a4fac5 Mon Sep 17 00:00:00 2001
From: Wanpeng Li <wanpengli(a)tencent.com>
Date: Thu, 8 Feb 2018 15:32:45 +0800
Subject: KVM: X86: Fix SMRAM accessing even if VM is shutdown
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
From: Wanpeng Li <wanpengli(a)tencent.com>
commit 95e057e25892eaa48cad1e2d637b80d0f1a4fac5 upstream.
Reported by syzkaller:
WARNING: CPU: 6 PID: 2434 at arch/x86/kvm/vmx.c:6660 handle_ept_misconfig+0x54/0x1e0 [kvm_intel]
CPU: 6 PID: 2434 Comm: repro_test Not tainted 4.15.0+ #4
RIP: 0010:handle_ept_misconfig+0x54/0x1e0 [kvm_intel]
Call Trace:
vmx_handle_exit+0xbd/0xe20 [kvm_intel]
kvm_arch_vcpu_ioctl_run+0xdaf/0x1d50 [kvm]
kvm_vcpu_ioctl+0x3e9/0x720 [kvm]
do_vfs_ioctl+0xa4/0x6a0
SyS_ioctl+0x79/0x90
entry_SYSCALL_64_fastpath+0x25/0x9c
The testcase creates a first thread to issue KVM_SMI ioctl, and then creates
a second thread to mmap and operate on the same vCPU. This triggers a race
condition when running the testcase with multiple threads. Sometimes one thread
exits with a triple fault while another thread mmaps and operates on the same
vCPU. Because CS=0x3000/IP=0x8000 is not mapped, accessing the SMI handler
results in an EPT misconfig. This patch fixes it by returning RET_PF_EMULATE
in kvm_handle_bad_page(), which will go on to cause an emulation failure and an
exit with KVM_EXIT_INTERNAL_ERROR.
Reported-by: syzbot+c1d9517cab094dae65e446c0c5b4de6c40f4dc58(a)syzkaller.appspotmail.com
Cc: Paolo Bonzini <pbonzini(a)redhat.com>
Cc: Radim Krčmář <rkrcmar(a)redhat.com>
Cc: stable(a)vger.kernel.org
Signed-off-by: Wanpeng Li <wanpengli(a)tencent.com>
Signed-off-by: Paolo Bonzini <pbonzini(a)redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
arch/x86/kvm/mmu.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
--- a/arch/x86/kvm/mmu.c
+++ b/arch/x86/kvm/mmu.c
@@ -2775,7 +2775,7 @@ static int kvm_handle_bad_page(struct kv
return 0;
}
- return -EFAULT;
+ return RET_PF_EMULATE;
}
static void transparent_hugepage_adjust(struct kvm_vcpu *vcpu,
Patches currently in stable-queue which might be from wanpengli(a)tencent.com are
queue-4.4/kvm-x86-fix-smram-accessing-even-if-vm-is-shutdown.patch
This is a note to let you know that I've just added the patch titled
ARM: mvebu: Fix broken PL310_ERRATA_753970 selects
to the 4.4-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
arm-mvebu-fix-broken-pl310_errata_753970-selects.patch
and it can be found in the queue-4.4 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From 8aa36a8dcde3183d84db7b0d622ffddcebb61077 Mon Sep 17 00:00:00 2001
From: Ulf Magnusson <ulfalizer(a)gmail.com>
Date: Mon, 5 Feb 2018 02:21:13 +0100
Subject: ARM: mvebu: Fix broken PL310_ERRATA_753970 selects
From: Ulf Magnusson <ulfalizer(a)gmail.com>
commit 8aa36a8dcde3183d84db7b0d622ffddcebb61077 upstream.
The MACH_ARMADA_375 and MACH_ARMADA_38X boards select ARM_ERRATA_753970,
but it was renamed to PL310_ERRATA_753970 by commit fa0ce4035d48 ("ARM:
7162/1: errata: tidy up Kconfig options for PL310 errata workarounds").
Fix the selects to use the new name.
Discovered with the
https://github.com/ulfalizer/Kconfiglib/blob/master/examples/list_undefined…
script.
Fixes: fa0ce4035d48 ("ARM: 7162/1: errata: tidy up Kconfig options for
PL310 errata workarounds"
cc: stable(a)vger.kernel.org
Signed-off-by: Ulf Magnusson <ulfalizer(a)gmail.com>
Signed-off-by: Gregory CLEMENT <gregory.clement(a)bootlin.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
arch/arm/mach-mvebu/Kconfig | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
--- a/arch/arm/mach-mvebu/Kconfig
+++ b/arch/arm/mach-mvebu/Kconfig
@@ -37,7 +37,7 @@ config MACH_ARMADA_370
config MACH_ARMADA_375
bool "Marvell Armada 375 boards" if ARCH_MULTI_V7
select ARM_ERRATA_720789
- select ARM_ERRATA_753970
+ select PL310_ERRATA_753970
select ARM_GIC
select ARMADA_375_CLK
select HAVE_ARM_SCU
@@ -52,7 +52,7 @@ config MACH_ARMADA_375
config MACH_ARMADA_38X
bool "Marvell Armada 380/385 boards" if ARCH_MULTI_V7
select ARM_ERRATA_720789
- select ARM_ERRATA_753970
+ select PL310_ERRATA_753970
select ARM_GIC
select ARMADA_38X_CLK
select HAVE_ARM_SCU
Patches currently in stable-queue which might be from ulfalizer(a)gmail.com are
queue-4.4/arm-mvebu-fix-broken-pl310_errata_753970-selects.patch
This is a note to let you know that I've just added the patch titled
KVM: mmu: Fix overlap between public and private memslots
to the 4.4-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
kvm-mmu-fix-overlap-between-public-and-private-memslots.patch
and it can be found in the queue-4.4 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From b28676bb8ae4569cced423dc2a88f7cb319d5379 Mon Sep 17 00:00:00 2001
From: Wanpeng Li <wanpeng.li(a)hotmail.com>
Date: Tue, 13 Feb 2018 15:36:00 +0100
Subject: KVM: mmu: Fix overlap between public and private memslots
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
From: Wanpeng Li <wanpeng.li(a)hotmail.com>
commit b28676bb8ae4569cced423dc2a88f7cb319d5379 upstream.
Reported by syzkaller:
pte_list_remove: ffff9714eb1f8078 0->BUG
------------[ cut here ]------------
kernel BUG at arch/x86/kvm/mmu.c:1157!
invalid opcode: 0000 [#1] SMP
RIP: 0010:pte_list_remove+0x11b/0x120 [kvm]
Call Trace:
drop_spte+0x83/0xb0 [kvm]
mmu_page_zap_pte+0xcc/0xe0 [kvm]
kvm_mmu_prepare_zap_page+0x81/0x4a0 [kvm]
kvm_mmu_invalidate_zap_all_pages+0x159/0x220 [kvm]
kvm_arch_flush_shadow_all+0xe/0x10 [kvm]
kvm_mmu_notifier_release+0x6c/0xa0 [kvm]
? kvm_mmu_notifier_release+0x5/0xa0 [kvm]
__mmu_notifier_release+0x79/0x110
? __mmu_notifier_release+0x5/0x110
exit_mmap+0x15a/0x170
? do_exit+0x281/0xcb0
mmput+0x66/0x160
do_exit+0x2c9/0xcb0
? __context_tracking_exit.part.5+0x4a/0x150
do_group_exit+0x50/0xd0
SyS_exit_group+0x14/0x20
do_syscall_64+0x73/0x1f0
entry_SYSCALL64_slow_path+0x25/0x25
The reason is that when creates new memslot, there is no guarantee for new
memslot not overlap with private memslots. This can be triggered by the
following program:
#include <fcntl.h>
#include <pthread.h>
#include <setjmp.h>
#include <signal.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/ioctl.h>
#include <sys/stat.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>
#include <linux/kvm.h>
long r[16];
int main()
{
void *p = valloc(0x4000);
r[2] = open("/dev/kvm", 0);
r[3] = ioctl(r[2], KVM_CREATE_VM, 0x0ul);
uint64_t addr = 0xf000;
ioctl(r[3], KVM_SET_IDENTITY_MAP_ADDR, &addr);
r[6] = ioctl(r[3], KVM_CREATE_VCPU, 0x0ul);
ioctl(r[3], KVM_SET_TSS_ADDR, 0x0ul);
ioctl(r[6], KVM_RUN, 0);
ioctl(r[6], KVM_RUN, 0);
struct kvm_userspace_memory_region mr = {
.slot = 0,
.flags = KVM_MEM_LOG_DIRTY_PAGES,
.guest_phys_addr = 0xf000,
.memory_size = 0x4000,
.userspace_addr = (uintptr_t) p
};
ioctl(r[3], KVM_SET_USER_MEMORY_REGION, &mr);
return 0;
}
This patch fixes the bug by not adding a new memslot even if it
overlaps with private memslots.
Reported-by: Dmitry Vyukov <dvyukov(a)google.com>
Cc: Paolo Bonzini <pbonzini(a)redhat.com>
Cc: Radim Krčmář <rkrcmar(a)redhat.com>
Cc: Dmitry Vyukov <dvyukov(a)google.com>
Cc: Eric Biggers <ebiggers3(a)gmail.com>
Cc: stable(a)vger.kernel.org
Signed-off-by: Wanpeng Li <wanpeng.li(a)hotmail.com>
---
virt/kvm/kvm_main.c | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -902,8 +902,7 @@ int __kvm_set_memory_region(struct kvm *
/* Check for overlaps */
r = -EEXIST;
kvm_for_each_memslot(slot, __kvm_memslots(kvm, as_id)) {
- if ((slot->id >= KVM_USER_MEM_SLOTS) ||
- (slot->id == id))
+ if (slot->id == id)
continue;
if (!((base_gfn + npages <= slot->base_gfn) ||
(base_gfn >= slot->base_gfn + slot->npages)))
Patches currently in stable-queue which might be from wanpeng.li(a)hotmail.com are
queue-4.4/kvm-mmu-fix-overlap-between-public-and-private-memslots.patch
This is a note to let you know that I've just added the patch titled
x86/mm: Fix {pmd,pud}_{set,clear}_flags()
to the 4.15-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
x86-mm-fix-pmd-pud-_-set-clear-_flags.patch
and it can be found in the queue-4.15 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From 842cef9113c2120f74f645111ded1e020193d84c Mon Sep 17 00:00:00 2001
From: Jan Beulich <JBeulich(a)suse.com>
Date: Mon, 19 Feb 2018 07:48:11 -0700
Subject: x86/mm: Fix {pmd,pud}_{set,clear}_flags()
From: Jan Beulich <JBeulich(a)suse.com>
commit 842cef9113c2120f74f645111ded1e020193d84c upstream.
Just like pte_{set,clear}_flags() their PMD and PUD counterparts should
not do any address translation. This was outright wrong under Xen
(causing a dead boot with no useful output on "suitable" systems), and
produced needlessly more complicated code (even if just slightly) when
paravirt was enabled.
Signed-off-by: Jan Beulich <jbeulich(a)suse.com>
Reviewed-by: Juergen Gross <jgross(a)suse.com>
Acked-by: Thomas Gleixner <tglx(a)linutronix.de>
Cc: Andy Lutomirski <luto(a)kernel.org>
Cc: Boris Ostrovsky <boris.ostrovsky(a)oracle.com>
Cc: Borislav Petkov <bp(a)alien8.de>
Cc: Brian Gerst <brgerst(a)gmail.com>
Cc: Denys Vlasenko <dvlasenk(a)redhat.com>
Cc: H. Peter Anvin <hpa(a)zytor.com>
Cc: Josh Poimboeuf <jpoimboe(a)redhat.com>
Cc: Linus Torvalds <torvalds(a)linux-foundation.org>
Cc: Peter Zijlstra <peterz(a)infradead.org>
Cc: stable(a)vger.kernel.org
Link: http://lkml.kernel.org/r/5A8AF1BB02000078001A91C3@prv-mh.provo.novell.com
Signed-off-by: Ingo Molnar <mingo(a)kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
arch/x86/include/asm/pgtable.h | 8 ++++----
arch/x86/include/asm/pgtable_types.h | 10 ++++++++++
2 files changed, 14 insertions(+), 4 deletions(-)
--- a/arch/x86/include/asm/pgtable.h
+++ b/arch/x86/include/asm/pgtable.h
@@ -350,14 +350,14 @@ static inline pmd_t pmd_set_flags(pmd_t
{
pmdval_t v = native_pmd_val(pmd);
- return __pmd(v | set);
+ return native_make_pmd(v | set);
}
static inline pmd_t pmd_clear_flags(pmd_t pmd, pmdval_t clear)
{
pmdval_t v = native_pmd_val(pmd);
- return __pmd(v & ~clear);
+ return native_make_pmd(v & ~clear);
}
static inline pmd_t pmd_mkold(pmd_t pmd)
@@ -409,14 +409,14 @@ static inline pud_t pud_set_flags(pud_t
{
pudval_t v = native_pud_val(pud);
- return __pud(v | set);
+ return native_make_pud(v | set);
}
static inline pud_t pud_clear_flags(pud_t pud, pudval_t clear)
{
pudval_t v = native_pud_val(pud);
- return __pud(v & ~clear);
+ return native_make_pud(v & ~clear);
}
static inline pud_t pud_mkold(pud_t pud)
--- a/arch/x86/include/asm/pgtable_types.h
+++ b/arch/x86/include/asm/pgtable_types.h
@@ -323,6 +323,11 @@ static inline pudval_t native_pud_val(pu
#else
#include <asm-generic/pgtable-nopud.h>
+static inline pud_t native_make_pud(pudval_t val)
+{
+ return (pud_t) { .p4d.pgd = native_make_pgd(val) };
+}
+
static inline pudval_t native_pud_val(pud_t pud)
{
return native_pgd_val(pud.p4d.pgd);
@@ -344,6 +349,11 @@ static inline pmdval_t native_pmd_val(pm
#else
#include <asm-generic/pgtable-nopmd.h>
+static inline pmd_t native_make_pmd(pmdval_t val)
+{
+ return (pmd_t) { .pud.p4d.pgd = native_make_pgd(val) };
+}
+
static inline pmdval_t native_pmd_val(pmd_t pmd)
{
return native_pgd_val(pmd.pud.p4d.pgd);
Patches currently in stable-queue which might be from JBeulich(a)suse.com are
queue-4.15/x86-mm-fix-pmd-pud-_-set-clear-_flags.patch
This is a note to let you know that I've just added the patch titled
nospec: Allow index argument to have const-qualified type
to the 4.15-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
nospec-allow-index-argument-to-have-const-qualified-type.patch
and it can be found in the queue-4.15 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From b98c6a160a057d5686a8c54c79cc6c8c94a7d0c8 Mon Sep 17 00:00:00 2001
From: Rasmus Villemoes <linux(a)rasmusvillemoes.dk>
Date: Fri, 16 Feb 2018 13:20:48 -0800
Subject: nospec: Allow index argument to have const-qualified type
From: Rasmus Villemoes <linux(a)rasmusvillemoes.dk>
commit b98c6a160a057d5686a8c54c79cc6c8c94a7d0c8 upstream.
The last expression in a statement expression need not be a bare
variable, quoting gcc docs
The last thing in the compound statement should be an expression
followed by a semicolon; the value of this subexpression serves as the
value of the entire construct.
and we already use that in e.g. the min/max macros which end with a
ternary expression.
This way, we can allow index to have const-qualified type, which will in
some cases avoid the need for introducing a local copy of index of
non-const qualified type. That, in turn, can prevent readers not
familiar with the internals of array_index_nospec from wondering about
the seemingly redundant extra variable, and I think that's worthwhile
considering how confusing the whole _nospec business is.
The expression _i&_mask has type unsigned long (since that is the type
of _mask, and the BUILD_BUG_ONs guarantee that _i will get promoted to
that), so in order not to change the type of the whole expression, add
a cast back to typeof(_i).
Signed-off-by: Rasmus Villemoes <linux(a)rasmusvillemoes.dk>
Signed-off-by: Dan Williams <dan.j.williams(a)intel.com>
Acked-by: Linus Torvalds <torvalds(a)linux-foundation.org>
Cc: Andy Lutomirski <luto(a)kernel.org>
Cc: Arjan van de Ven <arjan(a)linux.intel.com>
Cc: Borislav Petkov <bp(a)alien8.de>
Cc: Dave Hansen <dave.hansen(a)linux.intel.com>
Cc: David Woodhouse <dwmw2(a)infradead.org>
Cc: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Cc: Josh Poimboeuf <jpoimboe(a)redhat.com>
Cc: Peter Zijlstra <peterz(a)infradead.org>
Cc: Thomas Gleixner <tglx(a)linutronix.de>
Cc: Will Deacon <will.deacon(a)arm.com>
Cc: linux-arch(a)vger.kernel.org
Cc: stable(a)vger.kernel.org
Link: http://lkml.kernel.org/r/151881604837.17395.10812767547837568328.stgit@dwil…
Signed-off-by: Ingo Molnar <mingo(a)kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
include/linux/nospec.h | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)
--- a/include/linux/nospec.h
+++ b/include/linux/nospec.h
@@ -72,7 +72,6 @@ static inline unsigned long array_index_
BUILD_BUG_ON(sizeof(_i) > sizeof(long)); \
BUILD_BUG_ON(sizeof(_s) > sizeof(long)); \
\
- _i &= _mask; \
- _i; \
+ (typeof(_i)) (_i & _mask); \
})
#endif /* _LINUX_NOSPEC_H */
Patches currently in stable-queue which might be from linux(a)rasmusvillemoes.dk are
queue-4.15/nospec-allow-index-argument-to-have-const-qualified-type.patch
This is a note to let you know that I've just added the patch titled
KVM/x86: Remove indirect MSR op calls from SPEC_CTRL
to the 4.15-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
kvm-x86-remove-indirect-msr-op-calls-from-spec_ctrl.patch
and it can be found in the queue-4.15 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From ecb586bd29c99fb4de599dec388658e74388daad Mon Sep 17 00:00:00 2001
From: Paolo Bonzini <pbonzini(a)redhat.com>
Date: Thu, 22 Feb 2018 16:43:17 +0100
Subject: KVM/x86: Remove indirect MSR op calls from SPEC_CTRL
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
From: Paolo Bonzini <pbonzini(a)redhat.com>
commit ecb586bd29c99fb4de599dec388658e74388daad upstream.
Having a paravirt indirect call in the IBRS restore path is not a
good idea, since we are trying to protect from speculative execution
of bogus indirect branch targets. It is also slower, so use
native_wrmsrl() on the vmentry path too.
Signed-off-by: Paolo Bonzini <pbonzini(a)redhat.com>
Reviewed-by: Jim Mattson <jmattson(a)google.com>
Cc: David Woodhouse <dwmw(a)amazon.co.uk>
Cc: KarimAllah Ahmed <karahmed(a)amazon.de>
Cc: Linus Torvalds <torvalds(a)linux-foundation.org>
Cc: Peter Zijlstra <peterz(a)infradead.org>
Cc: Radim Krčmář <rkrcmar(a)redhat.com>
Cc: Thomas Gleixner <tglx(a)linutronix.de>
Cc: kvm(a)vger.kernel.org
Cc: stable(a)vger.kernel.org
Fixes: d28b387fb74da95d69d2615732f50cceb38e9a4d
Link: http://lkml.kernel.org/r/20180222154318.20361-2-pbonzini@redhat.com
Signed-off-by: Ingo Molnar <mingo(a)kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
arch/x86/kvm/svm.c | 7 ++++---
arch/x86/kvm/vmx.c | 7 ++++---
2 files changed, 8 insertions(+), 6 deletions(-)
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -45,6 +45,7 @@
#include <asm/debugreg.h>
#include <asm/kvm_para.h>
#include <asm/irq_remapping.h>
+#include <asm/microcode.h>
#include <asm/nospec-branch.h>
#include <asm/virtext.h>
@@ -5029,7 +5030,7 @@ static void svm_vcpu_run(struct kvm_vcpu
* being speculatively taken.
*/
if (svm->spec_ctrl)
- wrmsrl(MSR_IA32_SPEC_CTRL, svm->spec_ctrl);
+ native_wrmsrl(MSR_IA32_SPEC_CTRL, svm->spec_ctrl);
asm volatile (
"push %%" _ASM_BP "; \n\t"
@@ -5139,10 +5140,10 @@ static void svm_vcpu_run(struct kvm_vcpu
* save it.
*/
if (!msr_write_intercepted(vcpu, MSR_IA32_SPEC_CTRL))
- rdmsrl(MSR_IA32_SPEC_CTRL, svm->spec_ctrl);
+ svm->spec_ctrl = native_read_msr(MSR_IA32_SPEC_CTRL);
if (svm->spec_ctrl)
- wrmsrl(MSR_IA32_SPEC_CTRL, 0);
+ native_wrmsrl(MSR_IA32_SPEC_CTRL, 0);
/* Eliminate branch target predictions from guest mode */
vmexit_fill_RSB();
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -51,6 +51,7 @@
#include <asm/apic.h>
#include <asm/irq_remapping.h>
#include <asm/mmu_context.h>
+#include <asm/microcode.h>
#include <asm/nospec-branch.h>
#include "trace.h"
@@ -9443,7 +9444,7 @@ static void __noclone vmx_vcpu_run(struc
* being speculatively taken.
*/
if (vmx->spec_ctrl)
- wrmsrl(MSR_IA32_SPEC_CTRL, vmx->spec_ctrl);
+ native_wrmsrl(MSR_IA32_SPEC_CTRL, vmx->spec_ctrl);
vmx->__launched = vmx->loaded_vmcs->launched;
asm(
@@ -9579,10 +9580,10 @@ static void __noclone vmx_vcpu_run(struc
* save it.
*/
if (!msr_write_intercepted(vcpu, MSR_IA32_SPEC_CTRL))
- rdmsrl(MSR_IA32_SPEC_CTRL, vmx->spec_ctrl);
+ vmx->spec_ctrl = native_read_msr(MSR_IA32_SPEC_CTRL);
if (vmx->spec_ctrl)
- wrmsrl(MSR_IA32_SPEC_CTRL, 0);
+ native_wrmsrl(MSR_IA32_SPEC_CTRL, 0);
/* Eliminate branch target predictions from guest mode */
vmexit_fill_RSB();
Patches currently in stable-queue which might be from pbonzini(a)redhat.com are
queue-4.15/kvm-x86-fix-smram-accessing-even-if-vm-is-shutdown.patch
queue-4.15/kvm-x86-remove-indirect-msr-op-calls-from-spec_ctrl.patch
queue-4.15/kvm-x86-move-lapic-initialization-after-vmcs-creation.patch
queue-4.15/kvm-mmu-fix-overlap-between-public-and-private-memslots.patch
queue-4.15/kvm-vmx-optimize-vmx_vcpu_run-and-svm_vcpu_run-by-marking-the-rdmsr-path-as-unlikely.patch
This is a note to let you know that I've just added the patch titled
KVM: X86: Fix SMRAM accessing even if VM is shutdown
to the 4.15-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
kvm-x86-fix-smram-accessing-even-if-vm-is-shutdown.patch
and it can be found in the queue-4.15 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From 95e057e25892eaa48cad1e2d637b80d0f1a4fac5 Mon Sep 17 00:00:00 2001
From: Wanpeng Li <wanpengli(a)tencent.com>
Date: Thu, 8 Feb 2018 15:32:45 +0800
Subject: KVM: X86: Fix SMRAM accessing even if VM is shutdown
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
From: Wanpeng Li <wanpengli(a)tencent.com>
commit 95e057e25892eaa48cad1e2d637b80d0f1a4fac5 upstream.
Reported by syzkaller:
WARNING: CPU: 6 PID: 2434 at arch/x86/kvm/vmx.c:6660 handle_ept_misconfig+0x54/0x1e0 [kvm_intel]
CPU: 6 PID: 2434 Comm: repro_test Not tainted 4.15.0+ #4
RIP: 0010:handle_ept_misconfig+0x54/0x1e0 [kvm_intel]
Call Trace:
vmx_handle_exit+0xbd/0xe20 [kvm_intel]
kvm_arch_vcpu_ioctl_run+0xdaf/0x1d50 [kvm]
kvm_vcpu_ioctl+0x3e9/0x720 [kvm]
do_vfs_ioctl+0xa4/0x6a0
SyS_ioctl+0x79/0x90
entry_SYSCALL_64_fastpath+0x25/0x9c
The testcase creates a first thread to issue KVM_SMI ioctl, and then creates
a second thread to mmap and operate on the same vCPU. This triggers a race
condition when running the testcase with multiple threads. Sometimes one thread
exits with a triple fault while another thread mmaps and operates on the same
vCPU. Because CS=0x3000/IP=0x8000 is not mapped, accessing the SMI handler
results in an EPT misconfig. This patch fixes it by returning RET_PF_EMULATE
in kvm_handle_bad_page(), which will go on to cause an emulation failure and an
exit with KVM_EXIT_INTERNAL_ERROR.
Reported-by: syzbot+c1d9517cab094dae65e446c0c5b4de6c40f4dc58(a)syzkaller.appspotmail.com
Cc: Paolo Bonzini <pbonzini(a)redhat.com>
Cc: Radim Krčmář <rkrcmar(a)redhat.com>
Cc: stable(a)vger.kernel.org
Signed-off-by: Wanpeng Li <wanpengli(a)tencent.com>
Signed-off-by: Paolo Bonzini <pbonzini(a)redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
arch/x86/kvm/mmu.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
--- a/arch/x86/kvm/mmu.c
+++ b/arch/x86/kvm/mmu.c
@@ -3017,7 +3017,7 @@ static int kvm_handle_bad_page(struct kv
return RET_PF_RETRY;
}
- return -EFAULT;
+ return RET_PF_EMULATE;
}
static void transparent_hugepage_adjust(struct kvm_vcpu *vcpu,
Patches currently in stable-queue which might be from wanpengli(a)tencent.com are
queue-4.15/kvm-x86-fix-smram-accessing-even-if-vm-is-shutdown.patch
This is a note to let you know that I've just added the patch titled
KVM: x86: move LAPIC initialization after VMCS creation
to the 4.15-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
kvm-x86-move-lapic-initialization-after-vmcs-creation.patch
and it can be found in the queue-4.15 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From 0b2e9904c15963e715d33e5f3f1387f17d19333a Mon Sep 17 00:00:00 2001
From: Paolo Bonzini <pbonzini(a)redhat.com>
Date: Fri, 23 Feb 2018 23:29:32 +0100
Subject: KVM: x86: move LAPIC initialization after VMCS creation
From: Paolo Bonzini <pbonzini(a)redhat.com>
commit 0b2e9904c15963e715d33e5f3f1387f17d19333a upstream.
The initial reset of the local APIC is performed before the VMCS has been
created, but it tries to do a vmwrite:
vmwrite error: reg 810 value 4a00 (err 18944)
CPU: 54 PID: 38652 Comm: qemu-kvm Tainted: G W I 4.16.0-0.rc2.git0.1.fc28.x86_64 #1
Hardware name: Intel Corporation S2600CW/S2600CW, BIOS SE5C610.86B.01.01.0003.090520141303 09/05/2014
Call Trace:
vmx_set_rvi [kvm_intel]
vmx_hwapic_irr_update [kvm_intel]
kvm_lapic_reset [kvm]
kvm_create_lapic [kvm]
kvm_arch_vcpu_init [kvm]
kvm_vcpu_init [kvm]
vmx_create_vcpu [kvm_intel]
kvm_vm_ioctl [kvm]
Move it later, after the VMCS has been created.
Fixes: 4191db26b714 ("KVM: x86: Update APICv on APIC reset")
Cc: stable(a)vger.kernel.org
Cc: Liran Alon <liran.alon(a)oracle.com>
Signed-off-by: Paolo Bonzini <pbonzini(a)redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
arch/x86/kvm/lapic.c | 1 -
arch/x86/kvm/x86.c | 1 +
2 files changed, 1 insertion(+), 1 deletion(-)
--- a/arch/x86/kvm/lapic.c
+++ b/arch/x86/kvm/lapic.c
@@ -2156,7 +2156,6 @@ int kvm_create_lapic(struct kvm_vcpu *vc
*/
vcpu->arch.apic_base = MSR_IA32_APICBASE_ENABLE;
static_key_slow_inc(&apic_sw_disabled.key); /* sw disabled at reset */
- kvm_lapic_reset(vcpu, false);
kvm_iodevice_init(&apic->dev, &apic_mmio_ops);
return 0;
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -7793,6 +7793,7 @@ int kvm_arch_vcpu_setup(struct kvm_vcpu
if (r)
return r;
kvm_vcpu_reset(vcpu, false);
+ kvm_lapic_reset(vcpu, false);
kvm_mmu_setup(vcpu);
vcpu_put(vcpu);
return r;
Patches currently in stable-queue which might be from pbonzini(a)redhat.com are
queue-4.15/kvm-x86-fix-smram-accessing-even-if-vm-is-shutdown.patch
queue-4.15/kvm-x86-remove-indirect-msr-op-calls-from-spec_ctrl.patch
queue-4.15/kvm-x86-move-lapic-initialization-after-vmcs-creation.patch
queue-4.15/kvm-mmu-fix-overlap-between-public-and-private-memslots.patch
queue-4.15/kvm-vmx-optimize-vmx_vcpu_run-and-svm_vcpu_run-by-marking-the-rdmsr-path-as-unlikely.patch
This is a note to let you know that I've just added the patch titled
KVM: s390: take care of clock-comparator sign control
to the 4.15-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
kvm-s390-take-care-of-clock-comparator-sign-control.patch
and it can be found in the queue-4.15 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From 5fe01793dd953ab947fababe8abaf5ed5258c8df Mon Sep 17 00:00:00 2001
From: David Hildenbrand <david(a)redhat.com>
Date: Wed, 7 Feb 2018 12:46:42 +0100
Subject: KVM: s390: take care of clock-comparator sign control
From: David Hildenbrand <david(a)redhat.com>
commit 5fe01793dd953ab947fababe8abaf5ed5258c8df upstream.
Missed when enabling the Multiple-epoch facility. If the facility is
installed and the control is set, a sign based comaprison has to be
performed.
Right now we would inject wrong interrupts and ignore interrupt
conditions. Also the sleep time is calculated in a wrong way.
Signed-off-by: David Hildenbrand <david(a)redhat.com>
Message-Id: <20180207114647.6220-2-david(a)redhat.com>
Fixes: 8fa1696ea781 ("KVM: s390: Multiple Epoch Facility support")
Cc: stable(a)vger.kernel.org
Reviewed-by: Christian Borntraeger <borntraeger(a)de.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger(a)de.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
arch/s390/kvm/interrupt.c | 25 +++++++++++++++++++------
1 file changed, 19 insertions(+), 6 deletions(-)
--- a/arch/s390/kvm/interrupt.c
+++ b/arch/s390/kvm/interrupt.c
@@ -170,8 +170,15 @@ static int ckc_interrupts_enabled(struct
static int ckc_irq_pending(struct kvm_vcpu *vcpu)
{
- if (vcpu->arch.sie_block->ckc >= kvm_s390_get_tod_clock_fast(vcpu->kvm))
+ const u64 now = kvm_s390_get_tod_clock_fast(vcpu->kvm);
+ const u64 ckc = vcpu->arch.sie_block->ckc;
+
+ if (vcpu->arch.sie_block->gcr[0] & 0x0020000000000000ul) {
+ if ((s64)ckc >= (s64)now)
+ return 0;
+ } else if (ckc >= now) {
return 0;
+ }
return ckc_interrupts_enabled(vcpu);
}
@@ -1011,13 +1018,19 @@ int kvm_cpu_has_pending_timer(struct kvm
static u64 __calculate_sltime(struct kvm_vcpu *vcpu)
{
- u64 now, cputm, sltime = 0;
+ const u64 now = kvm_s390_get_tod_clock_fast(vcpu->kvm);
+ const u64 ckc = vcpu->arch.sie_block->ckc;
+ u64 cputm, sltime = 0;
if (ckc_interrupts_enabled(vcpu)) {
- now = kvm_s390_get_tod_clock_fast(vcpu->kvm);
- sltime = tod_to_ns(vcpu->arch.sie_block->ckc - now);
- /* already expired or overflow? */
- if (!sltime || vcpu->arch.sie_block->ckc <= now)
+ if (vcpu->arch.sie_block->gcr[0] & 0x0020000000000000ul) {
+ if ((s64)now < (s64)ckc)
+ sltime = tod_to_ns((s64)ckc - (s64)now);
+ } else if (now < ckc) {
+ sltime = tod_to_ns(ckc - now);
+ }
+ /* already expired */
+ if (!sltime)
return 0;
if (cpu_timer_interrupts_enabled(vcpu)) {
cputm = kvm_s390_get_cpu_timer(vcpu);
Patches currently in stable-queue which might be from david(a)redhat.com are
queue-4.15/kvm-s390-take-care-of-clock-comparator-sign-control.patch
queue-4.15/kvm-s390-provide-only-a-single-function-for-setting-the-tod-fix-sck.patch
queue-4.15/kvm-s390-consider-epoch-index-on-tod-clock-syncs.patch
queue-4.15/kvm-s390-consider-epoch-index-on-hotplugged-cpus.patch
This is a note to let you know that I've just added the patch titled
KVM/VMX: Optimize vmx_vcpu_run() and svm_vcpu_run() by marking the RDMSR path as unlikely()
to the 4.15-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
kvm-vmx-optimize-vmx_vcpu_run-and-svm_vcpu_run-by-marking-the-rdmsr-path-as-unlikely.patch
and it can be found in the queue-4.15 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From 946fbbc13dce68902f64515b610eeb2a6c3d7a64 Mon Sep 17 00:00:00 2001
From: Paolo Bonzini <pbonzini(a)redhat.com>
Date: Thu, 22 Feb 2018 16:43:18 +0100
Subject: KVM/VMX: Optimize vmx_vcpu_run() and svm_vcpu_run() by marking the RDMSR path as unlikely()
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
From: Paolo Bonzini <pbonzini(a)redhat.com>
commit 946fbbc13dce68902f64515b610eeb2a6c3d7a64 upstream.
vmx_vcpu_run() and svm_vcpu_run() are large functions, and giving
branch hints to the compiler can actually make a substantial cycle
difference by keeping the fast path contiguous in memory.
With this optimization, the retpoline-guest/retpoline-host case is
about 50 cycles faster.
Signed-off-by: Paolo Bonzini <pbonzini(a)redhat.com>
Reviewed-by: Jim Mattson <jmattson(a)google.com>
Cc: David Woodhouse <dwmw(a)amazon.co.uk>
Cc: KarimAllah Ahmed <karahmed(a)amazon.de>
Cc: Linus Torvalds <torvalds(a)linux-foundation.org>
Cc: Peter Zijlstra <peterz(a)infradead.org>
Cc: Radim Krčmář <rkrcmar(a)redhat.com>
Cc: Thomas Gleixner <tglx(a)linutronix.de>
Cc: kvm(a)vger.kernel.org
Cc: stable(a)vger.kernel.org
Link: http://lkml.kernel.org/r/20180222154318.20361-3-pbonzini@redhat.com
Signed-off-by: Ingo Molnar <mingo(a)kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
arch/x86/kvm/svm.c | 2 +-
arch/x86/kvm/vmx.c | 2 +-
2 files changed, 2 insertions(+), 2 deletions(-)
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -5139,7 +5139,7 @@ static void svm_vcpu_run(struct kvm_vcpu
* If the L02 MSR bitmap does not intercept the MSR, then we need to
* save it.
*/
- if (!msr_write_intercepted(vcpu, MSR_IA32_SPEC_CTRL))
+ if (unlikely(!msr_write_intercepted(vcpu, MSR_IA32_SPEC_CTRL)))
svm->spec_ctrl = native_read_msr(MSR_IA32_SPEC_CTRL);
if (svm->spec_ctrl)
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -9579,7 +9579,7 @@ static void __noclone vmx_vcpu_run(struc
* If the L02 MSR bitmap does not intercept the MSR, then we need to
* save it.
*/
- if (!msr_write_intercepted(vcpu, MSR_IA32_SPEC_CTRL))
+ if (unlikely(!msr_write_intercepted(vcpu, MSR_IA32_SPEC_CTRL)))
vmx->spec_ctrl = native_read_msr(MSR_IA32_SPEC_CTRL);
if (vmx->spec_ctrl)
Patches currently in stable-queue which might be from pbonzini(a)redhat.com are
queue-4.15/kvm-x86-fix-smram-accessing-even-if-vm-is-shutdown.patch
queue-4.15/kvm-x86-remove-indirect-msr-op-calls-from-spec_ctrl.patch
queue-4.15/kvm-x86-move-lapic-initialization-after-vmcs-creation.patch
queue-4.15/kvm-mmu-fix-overlap-between-public-and-private-memslots.patch
queue-4.15/kvm-vmx-optimize-vmx_vcpu_run-and-svm_vcpu_run-by-marking-the-rdmsr-path-as-unlikely.patch
This is a note to let you know that I've just added the patch titled
KVM: s390: consider epoch index on TOD clock syncs
to the 4.15-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
kvm-s390-consider-epoch-index-on-tod-clock-syncs.patch
and it can be found in the queue-4.15 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From 1575767ef3cf5326701d2ae3075b7732cbc855e4 Mon Sep 17 00:00:00 2001
From: David Hildenbrand <david(a)redhat.com>
Date: Wed, 7 Feb 2018 12:46:45 +0100
Subject: KVM: s390: consider epoch index on TOD clock syncs
From: David Hildenbrand <david(a)redhat.com>
commit 1575767ef3cf5326701d2ae3075b7732cbc855e4 upstream.
For now, we don't take care of over/underflows. Especially underflows
are critical:
Assume the epoch is currently 0 and we get a sync request for delta=1,
meaning the TOD is moved forward by 1 and we have to fix it up by
subtracting 1 from the epoch. Right now, this will leave the epoch
index untouched, resulting in epoch=-1, epoch_idx=0, which is wrong.
We have to take care of over and underflows, also for the VSIE case. So
let's factor out calculation into a separate function.
Signed-off-by: David Hildenbrand <david(a)redhat.com>
Message-Id: <20180207114647.6220-5-david(a)redhat.com>
Reviewed-by: Christian Borntraeger <borntraeger(a)de.ibm.com>
Fixes: 8fa1696ea781 ("KVM: s390: Multiple Epoch Facility support")
Cc: stable(a)vger.kernel.org
Signed-off-by: Christian Borntraeger <borntraeger(a)de.ibm.com>
[use u8 for idx]
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
arch/s390/kvm/kvm-s390.c | 32 +++++++++++++++++++++++++++++---
1 file changed, 29 insertions(+), 3 deletions(-)
--- a/arch/s390/kvm/kvm-s390.c
+++ b/arch/s390/kvm/kvm-s390.c
@@ -166,6 +166,28 @@ int kvm_arch_hardware_enable(void)
static void kvm_gmap_notifier(struct gmap *gmap, unsigned long start,
unsigned long end);
+static void kvm_clock_sync_scb(struct kvm_s390_sie_block *scb, u64 delta)
+{
+ u8 delta_idx = 0;
+
+ /*
+ * The TOD jumps by delta, we have to compensate this by adding
+ * -delta to the epoch.
+ */
+ delta = -delta;
+
+ /* sign-extension - we're adding to signed values below */
+ if ((s64)delta < 0)
+ delta_idx = -1;
+
+ scb->epoch += delta;
+ if (scb->ecd & ECD_MEF) {
+ scb->epdx += delta_idx;
+ if (scb->epoch < delta)
+ scb->epdx += 1;
+ }
+}
+
/*
* This callback is executed during stop_machine(). All CPUs are therefore
* temporarily stopped. In order not to change guest behavior, we have to
@@ -181,13 +203,17 @@ static int kvm_clock_sync(struct notifie
unsigned long long *delta = v;
list_for_each_entry(kvm, &vm_list, vm_list) {
- kvm->arch.epoch -= *delta;
kvm_for_each_vcpu(i, vcpu, kvm) {
- vcpu->arch.sie_block->epoch -= *delta;
+ kvm_clock_sync_scb(vcpu->arch.sie_block, *delta);
+ if (i == 0) {
+ kvm->arch.epoch = vcpu->arch.sie_block->epoch;
+ kvm->arch.epdx = vcpu->arch.sie_block->epdx;
+ }
if (vcpu->arch.cputm_enabled)
vcpu->arch.cputm_start += *delta;
if (vcpu->arch.vsie_block)
- vcpu->arch.vsie_block->epoch -= *delta;
+ kvm_clock_sync_scb(vcpu->arch.vsie_block,
+ *delta);
}
}
return NOTIFY_OK;
Patches currently in stable-queue which might be from david(a)redhat.com are
queue-4.15/kvm-s390-take-care-of-clock-comparator-sign-control.patch
queue-4.15/kvm-s390-provide-only-a-single-function-for-setting-the-tod-fix-sck.patch
queue-4.15/kvm-s390-consider-epoch-index-on-tod-clock-syncs.patch
queue-4.15/kvm-s390-consider-epoch-index-on-hotplugged-cpus.patch
This is a note to let you know that I've just added the patch titled
KVM: s390: provide only a single function for setting the tod (fix SCK)
to the 4.15-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
kvm-s390-provide-only-a-single-function-for-setting-the-tod-fix-sck.patch
and it can be found in the queue-4.15 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From 0e7def5fb0dc53ddbb9f62a497d15f1e11ccdc36 Mon Sep 17 00:00:00 2001
From: David Hildenbrand <david(a)redhat.com>
Date: Wed, 7 Feb 2018 12:46:43 +0100
Subject: KVM: s390: provide only a single function for setting the tod (fix SCK)
From: David Hildenbrand <david(a)redhat.com>
commit 0e7def5fb0dc53ddbb9f62a497d15f1e11ccdc36 upstream.
Right now, SET CLOCK called in the guest does not properly take care of
the epoch index, as the call goes via the old kvm_s390_set_tod_clock()
interface. So the epoch index is neither reset to 0, if required, nor
properly set to e.g. 0xff on negative values.
Fix this by providing a single kvm_s390_set_tod_clock() function. Move
Multiple-epoch facility handling into it.
Signed-off-by: David Hildenbrand <david(a)redhat.com>
Message-Id: <20180207114647.6220-3-david(a)redhat.com>
Reviewed-by: Christian Borntraeger <borntraeger(a)de.ibm.com>
Fixes: 8fa1696ea781 ("KVM: s390: Multiple Epoch Facility support")
Cc: stable(a)vger.kernel.org
Signed-off-by: Christian Borntraeger <borntraeger(a)de.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
arch/s390/kvm/kvm-s390.c | 46 +++++++++++++++-------------------------------
arch/s390/kvm/kvm-s390.h | 5 ++---
arch/s390/kvm/priv.c | 9 +++++----
3 files changed, 22 insertions(+), 38 deletions(-)
--- a/arch/s390/kvm/kvm-s390.c
+++ b/arch/s390/kvm/kvm-s390.c
@@ -889,12 +889,9 @@ static int kvm_s390_set_tod_ext(struct k
if (copy_from_user(>od, (void __user *)attr->addr, sizeof(gtod)))
return -EFAULT;
- if (test_kvm_facility(kvm, 139))
- kvm_s390_set_tod_clock_ext(kvm, >od);
- else if (gtod.epoch_idx == 0)
- kvm_s390_set_tod_clock(kvm, gtod.tod);
- else
+ if (!test_kvm_facility(kvm, 139) && gtod.epoch_idx)
return -EINVAL;
+ kvm_s390_set_tod_clock(kvm, >od);
VM_EVENT(kvm, 3, "SET: TOD extension: 0x%x, TOD base: 0x%llx",
gtod.epoch_idx, gtod.tod);
@@ -919,13 +916,14 @@ static int kvm_s390_set_tod_high(struct
static int kvm_s390_set_tod_low(struct kvm *kvm, struct kvm_device_attr *attr)
{
- u64 gtod;
+ struct kvm_s390_vm_tod_clock gtod = { 0 };
- if (copy_from_user(>od, (void __user *)attr->addr, sizeof(gtod)))
+ if (copy_from_user(>od.tod, (void __user *)attr->addr,
+ sizeof(gtod.tod)))
return -EFAULT;
- kvm_s390_set_tod_clock(kvm, gtod);
- VM_EVENT(kvm, 3, "SET: TOD base: 0x%llx", gtod);
+ kvm_s390_set_tod_clock(kvm, >od);
+ VM_EVENT(kvm, 3, "SET: TOD base: 0x%llx", gtod.tod);
return 0;
}
@@ -2947,8 +2945,8 @@ retry:
return 0;
}
-void kvm_s390_set_tod_clock_ext(struct kvm *kvm,
- const struct kvm_s390_vm_tod_clock *gtod)
+void kvm_s390_set_tod_clock(struct kvm *kvm,
+ const struct kvm_s390_vm_tod_clock *gtod)
{
struct kvm_vcpu *vcpu;
struct kvm_s390_tod_clock_ext htod;
@@ -2960,10 +2958,12 @@ void kvm_s390_set_tod_clock_ext(struct k
get_tod_clock_ext((char *)&htod);
kvm->arch.epoch = gtod->tod - htod.tod;
- kvm->arch.epdx = gtod->epoch_idx - htod.epoch_idx;
-
- if (kvm->arch.epoch > gtod->tod)
- kvm->arch.epdx -= 1;
+ kvm->arch.epdx = 0;
+ if (test_kvm_facility(kvm, 139)) {
+ kvm->arch.epdx = gtod->epoch_idx - htod.epoch_idx;
+ if (kvm->arch.epoch > gtod->tod)
+ kvm->arch.epdx -= 1;
+ }
kvm_s390_vcpu_block_all(kvm);
kvm_for_each_vcpu(i, vcpu, kvm) {
@@ -2974,22 +2974,6 @@ void kvm_s390_set_tod_clock_ext(struct k
kvm_s390_vcpu_unblock_all(kvm);
preempt_enable();
mutex_unlock(&kvm->lock);
-}
-
-void kvm_s390_set_tod_clock(struct kvm *kvm, u64 tod)
-{
- struct kvm_vcpu *vcpu;
- int i;
-
- mutex_lock(&kvm->lock);
- preempt_disable();
- kvm->arch.epoch = tod - get_tod_clock();
- kvm_s390_vcpu_block_all(kvm);
- kvm_for_each_vcpu(i, vcpu, kvm)
- vcpu->arch.sie_block->epoch = kvm->arch.epoch;
- kvm_s390_vcpu_unblock_all(kvm);
- preempt_enable();
- mutex_unlock(&kvm->lock);
}
/**
--- a/arch/s390/kvm/kvm-s390.h
+++ b/arch/s390/kvm/kvm-s390.h
@@ -268,9 +268,8 @@ int kvm_s390_handle_sigp(struct kvm_vcpu
int kvm_s390_handle_sigp_pei(struct kvm_vcpu *vcpu);
/* implemented in kvm-s390.c */
-void kvm_s390_set_tod_clock_ext(struct kvm *kvm,
- const struct kvm_s390_vm_tod_clock *gtod);
-void kvm_s390_set_tod_clock(struct kvm *kvm, u64 tod);
+void kvm_s390_set_tod_clock(struct kvm *kvm,
+ const struct kvm_s390_vm_tod_clock *gtod);
long kvm_arch_fault_in_page(struct kvm_vcpu *vcpu, gpa_t gpa, int writable);
int kvm_s390_store_status_unloaded(struct kvm_vcpu *vcpu, unsigned long addr);
int kvm_s390_vcpu_store_status(struct kvm_vcpu *vcpu, unsigned long addr);
--- a/arch/s390/kvm/priv.c
+++ b/arch/s390/kvm/priv.c
@@ -81,9 +81,10 @@ int kvm_s390_handle_e3(struct kvm_vcpu *
/* Handle SCK (SET CLOCK) interception */
static int handle_set_clock(struct kvm_vcpu *vcpu)
{
+ struct kvm_s390_vm_tod_clock gtod = { 0 };
int rc;
u8 ar;
- u64 op2, val;
+ u64 op2;
if (vcpu->arch.sie_block->gpsw.mask & PSW_MASK_PSTATE)
return kvm_s390_inject_program_int(vcpu, PGM_PRIVILEGED_OP);
@@ -91,12 +92,12 @@ static int handle_set_clock(struct kvm_v
op2 = kvm_s390_get_base_disp_s(vcpu, &ar);
if (op2 & 7) /* Operand must be on a doubleword boundary */
return kvm_s390_inject_program_int(vcpu, PGM_SPECIFICATION);
- rc = read_guest(vcpu, op2, ar, &val, sizeof(val));
+ rc = read_guest(vcpu, op2, ar, >od.tod, sizeof(gtod.tod));
if (rc)
return kvm_s390_inject_prog_cond(vcpu, rc);
- VCPU_EVENT(vcpu, 3, "SCK: setting guest TOD to 0x%llx", val);
- kvm_s390_set_tod_clock(vcpu->kvm, val);
+ VCPU_EVENT(vcpu, 3, "SCK: setting guest TOD to 0x%llx", gtod.tod);
+ kvm_s390_set_tod_clock(vcpu->kvm, >od);
kvm_s390_set_psw_cc(vcpu, 0);
return 0;
Patches currently in stable-queue which might be from david(a)redhat.com are
queue-4.15/kvm-s390-take-care-of-clock-comparator-sign-control.patch
queue-4.15/kvm-s390-provide-only-a-single-function-for-setting-the-tod-fix-sck.patch
queue-4.15/kvm-s390-consider-epoch-index-on-tod-clock-syncs.patch
queue-4.15/kvm-s390-consider-epoch-index-on-hotplugged-cpus.patch
This is a note to let you know that I've just added the patch titled
KVM: mmu: Fix overlap between public and private memslots
to the 4.15-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
kvm-mmu-fix-overlap-between-public-and-private-memslots.patch
and it can be found in the queue-4.15 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From b28676bb8ae4569cced423dc2a88f7cb319d5379 Mon Sep 17 00:00:00 2001
From: Wanpeng Li <wanpeng.li(a)hotmail.com>
Date: Tue, 13 Feb 2018 15:36:00 +0100
Subject: KVM: mmu: Fix overlap between public and private memslots
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
From: Wanpeng Li <wanpeng.li(a)hotmail.com>
commit b28676bb8ae4569cced423dc2a88f7cb319d5379 upstream.
Reported by syzkaller:
pte_list_remove: ffff9714eb1f8078 0->BUG
------------[ cut here ]------------
kernel BUG at arch/x86/kvm/mmu.c:1157!
invalid opcode: 0000 [#1] SMP
RIP: 0010:pte_list_remove+0x11b/0x120 [kvm]
Call Trace:
drop_spte+0x83/0xb0 [kvm]
mmu_page_zap_pte+0xcc/0xe0 [kvm]
kvm_mmu_prepare_zap_page+0x81/0x4a0 [kvm]
kvm_mmu_invalidate_zap_all_pages+0x159/0x220 [kvm]
kvm_arch_flush_shadow_all+0xe/0x10 [kvm]
kvm_mmu_notifier_release+0x6c/0xa0 [kvm]
? kvm_mmu_notifier_release+0x5/0xa0 [kvm]
__mmu_notifier_release+0x79/0x110
? __mmu_notifier_release+0x5/0x110
exit_mmap+0x15a/0x170
? do_exit+0x281/0xcb0
mmput+0x66/0x160
do_exit+0x2c9/0xcb0
? __context_tracking_exit.part.5+0x4a/0x150
do_group_exit+0x50/0xd0
SyS_exit_group+0x14/0x20
do_syscall_64+0x73/0x1f0
entry_SYSCALL64_slow_path+0x25/0x25
The reason is that when creates new memslot, there is no guarantee for new
memslot not overlap with private memslots. This can be triggered by the
following program:
#include <fcntl.h>
#include <pthread.h>
#include <setjmp.h>
#include <signal.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/ioctl.h>
#include <sys/stat.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>
#include <linux/kvm.h>
long r[16];
int main()
{
void *p = valloc(0x4000);
r[2] = open("/dev/kvm", 0);
r[3] = ioctl(r[2], KVM_CREATE_VM, 0x0ul);
uint64_t addr = 0xf000;
ioctl(r[3], KVM_SET_IDENTITY_MAP_ADDR, &addr);
r[6] = ioctl(r[3], KVM_CREATE_VCPU, 0x0ul);
ioctl(r[3], KVM_SET_TSS_ADDR, 0x0ul);
ioctl(r[6], KVM_RUN, 0);
ioctl(r[6], KVM_RUN, 0);
struct kvm_userspace_memory_region mr = {
.slot = 0,
.flags = KVM_MEM_LOG_DIRTY_PAGES,
.guest_phys_addr = 0xf000,
.memory_size = 0x4000,
.userspace_addr = (uintptr_t) p
};
ioctl(r[3], KVM_SET_USER_MEMORY_REGION, &mr);
return 0;
}
This patch fixes the bug by not adding a new memslot even if it
overlaps with private memslots.
Reported-by: Dmitry Vyukov <dvyukov(a)google.com>
Cc: Paolo Bonzini <pbonzini(a)redhat.com>
Cc: Radim Krčmář <rkrcmar(a)redhat.com>
Cc: Dmitry Vyukov <dvyukov(a)google.com>
Cc: Eric Biggers <ebiggers3(a)gmail.com>
Cc: stable(a)vger.kernel.org
Signed-off-by: Wanpeng Li <wanpeng.li(a)hotmail.com>
---
virt/kvm/kvm_main.c | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -974,8 +974,7 @@ int __kvm_set_memory_region(struct kvm *
/* Check for overlaps */
r = -EEXIST;
kvm_for_each_memslot(slot, __kvm_memslots(kvm, as_id)) {
- if ((slot->id >= KVM_USER_MEM_SLOTS) ||
- (slot->id == id))
+ if (slot->id == id)
continue;
if (!((base_gfn + npages <= slot->base_gfn) ||
(base_gfn >= slot->base_gfn + slot->npages)))
Patches currently in stable-queue which might be from wanpeng.li(a)hotmail.com are
queue-4.15/kvm-mmu-fix-overlap-between-public-and-private-memslots.patch
This is a note to let you know that I've just added the patch titled
KVM: s390: consider epoch index on hotplugged CPUs
to the 4.15-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
kvm-s390-consider-epoch-index-on-hotplugged-cpus.patch
and it can be found in the queue-4.15 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From d16b52cb9cdb6f06dea8ab2f0a428e7d7f0b0a81 Mon Sep 17 00:00:00 2001
From: David Hildenbrand <david(a)redhat.com>
Date: Wed, 7 Feb 2018 12:46:44 +0100
Subject: KVM: s390: consider epoch index on hotplugged CPUs
From: David Hildenbrand <david(a)redhat.com>
commit d16b52cb9cdb6f06dea8ab2f0a428e7d7f0b0a81 upstream.
We must copy both, the epoch and the epoch_idx.
Signed-off-by: David Hildenbrand <david(a)redhat.com>
Message-Id: <20180207114647.6220-4-david(a)redhat.com>
Fixes: 8fa1696ea781 ("KVM: s390: Multiple Epoch Facility support")
Reviewed-by: Cornelia Huck <cohuck(a)redhat.com>
Reviewed-by: Christian Borntraeger <borntraeger(a)de.ibm.com>
Fixes: 8fa1696ea781 ("KVM: s390: Multiple Epoch Facility support")
Cc: stable(a)vger.kernel.org
Signed-off-by: Christian Borntraeger <borntraeger(a)de.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
arch/s390/kvm/kvm-s390.c | 1 +
1 file changed, 1 insertion(+)
--- a/arch/s390/kvm/kvm-s390.c
+++ b/arch/s390/kvm/kvm-s390.c
@@ -2359,6 +2359,7 @@ void kvm_arch_vcpu_postcreate(struct kvm
mutex_lock(&vcpu->kvm->lock);
preempt_disable();
vcpu->arch.sie_block->epoch = vcpu->kvm->arch.epoch;
+ vcpu->arch.sie_block->epdx = vcpu->kvm->arch.epdx;
preempt_enable();
mutex_unlock(&vcpu->kvm->lock);
if (!kvm_is_ucontrol(vcpu->kvm)) {
Patches currently in stable-queue which might be from david(a)redhat.com are
queue-4.15/kvm-s390-take-care-of-clock-comparator-sign-control.patch
queue-4.15/kvm-s390-provide-only-a-single-function-for-setting-the-tod-fix-sck.patch
queue-4.15/kvm-s390-consider-epoch-index-on-tod-clock-syncs.patch
queue-4.15/kvm-s390-consider-epoch-index-on-hotplugged-cpus.patch
This is a note to let you know that I've just added the patch titled
blk-mq: don't call io sched's .requeue_request when requeueing rq to ->dispatch
to the 4.15-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
blk-mq-don-t-call-io-sched-s-.requeue_request-when-requeueing-rq-to-dispatch.patch
and it can be found in the queue-4.15 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From 105976f517791aed3b11f8f53b308a2069d42055 Mon Sep 17 00:00:00 2001
From: Ming Lei <ming.lei(a)redhat.com>
Date: Fri, 23 Feb 2018 23:36:56 +0800
Subject: blk-mq: don't call io sched's .requeue_request when requeueing rq to ->dispatch
From: Ming Lei <ming.lei(a)redhat.com>
commit 105976f517791aed3b11f8f53b308a2069d42055 upstream.
__blk_mq_requeue_request() covers two cases:
- one is that the requeued request is added to hctx->dispatch, such as
blk_mq_dispatch_rq_list()
- another case is that the request is requeued to io scheduler, such as
blk_mq_requeue_request().
We should call io sched's .requeue_request callback only for the 2nd
case.
Cc: Paolo Valente <paolo.valente(a)linaro.org>
Cc: Omar Sandoval <osandov(a)fb.com>
Fixes: bd166ef183c2 ("blk-mq-sched: add framework for MQ capable IO schedulers")
Cc: stable(a)vger.kernel.org
Reviewed-by: Bart Van Assche <bart.vanassche(a)wdc.com>
Acked-by: Paolo Valente <paolo.valente(a)linaro.org>
Signed-off-by: Ming Lei <ming.lei(a)redhat.com>
Signed-off-by: Jens Axboe <axboe(a)kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
block/blk-mq.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -655,7 +655,6 @@ static void __blk_mq_requeue_request(str
trace_block_rq_requeue(q, rq);
wbt_requeue(q->rq_wb, &rq->issue_stat);
- blk_mq_sched_requeue_request(rq);
if (test_and_clear_bit(REQ_ATOM_STARTED, &rq->atomic_flags)) {
if (q->dma_drain_size && blk_rq_bytes(rq))
@@ -667,6 +666,9 @@ void blk_mq_requeue_request(struct reque
{
__blk_mq_requeue_request(rq);
+ /* this request will be re-inserted to io scheduler queue */
+ blk_mq_sched_requeue_request(rq);
+
BUG_ON(blk_queued_rq(rq));
blk_mq_add_to_requeue_list(rq, true, kick_requeue_list);
}
Patches currently in stable-queue which might be from ming.lei(a)redhat.com are
queue-4.15/block-pass-inclusive-lend-parameter-to-truncate_inode_pages_range.patch
queue-4.15/blk-mq-don-t-call-io-sched-s-.requeue_request-when-requeueing-rq-to-dispatch.patch
queue-4.15/block-kyber-fix-domain-token-leak-during-requeue.patch
This is a note to let you know that I've just added the patch titled
EDAC, sb_edac: Fix out of bound writes during DIMM configuration on KNL
to the 4.15-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
edac-sb_edac-fix-out-of-bound-writes-during-dimm-configuration-on-knl.patch
and it can be found in the queue-4.15 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From bf8486709ac7fad99e4040dea73fe466c57a4ae1 Mon Sep 17 00:00:00 2001
From: Anna Karbownik <anna.karbownik(a)intel.com>
Date: Thu, 22 Feb 2018 16:18:13 +0100
Subject: EDAC, sb_edac: Fix out of bound writes during DIMM configuration on KNL
From: Anna Karbownik <anna.karbownik(a)intel.com>
commit bf8486709ac7fad99e4040dea73fe466c57a4ae1 upstream.
Commit
3286d3eb906c ("EDAC, sb_edac: Drop NUM_CHANNELS from 8 back to 4")
decreased NUM_CHANNELS from 8 to 4, but this is not enough for Knights
Landing which supports up to 6 channels.
This caused out-of-bounds writes to pvt->mirror_mode and pvt->tolm
variables which don't pay critical role on KNL code path, so the memory
corruption wasn't causing any visible driver failures.
The easiest way of fixing it is to change NUM_CHANNELS to 6. Do that.
An alternative solution would be to restructure the KNL part of the
driver to 2MC/3channel representation.
Reported-by: Dan Carpenter <dan.carpenter(a)oracle.com>
Signed-off-by: Anna Karbownik <anna.karbownik(a)intel.com>
Cc: Mauro Carvalho Chehab <mchehab(a)kernel.org>
Cc: Tony Luck <tony.luck(a)intel.com>
Cc: jim.m.snow(a)intel.com
Cc: krzysztof.paliswiat(a)intel.com
Cc: lukasz.odzioba(a)intel.com
Cc: qiuxu.zhuo(a)intel.com
Cc: linux-edac <linux-edac(a)vger.kernel.org>
Cc: <stable(a)vger.kernel.org>
Fixes: 3286d3eb906c ("EDAC, sb_edac: Drop NUM_CHANNELS from 8 back to 4")
Link: http://lkml.kernel.org/r/1519312693-4789-1-git-send-email-anna.karbownik@in…
[ Massage commit message. ]
Signed-off-by: Borislav Petkov <bp(a)suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/edac/sb_edac.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/edac/sb_edac.c
+++ b/drivers/edac/sb_edac.c
@@ -279,7 +279,7 @@ static const u32 correrrthrsld[] = {
* sbridge structs
*/
-#define NUM_CHANNELS 4 /* Max channels per MC */
+#define NUM_CHANNELS 6 /* Max channels per MC */
#define MAX_DIMMS 3 /* Max DIMMS per channel */
#define KNL_MAX_CHAS 38 /* KNL max num. of Cache Home Agents */
#define KNL_MAX_CHANNELS 6 /* KNL max num. of PCI channels */
Patches currently in stable-queue which might be from anna.karbownik(a)intel.com are
queue-4.15/edac-sb_edac-fix-out-of-bound-writes-during-dimm-configuration-on-knl.patch
This is a note to let you know that I've just added the patch titled
ARM: orion: fix orion_ge00_switch_board_info initialization
to the 4.15-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
arm-orion-fix-orion_ge00_switch_board_info-initialization.patch
and it can be found in the queue-4.15 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From 8337d083507b9827dfb36d545538b7789df834fd Mon Sep 17 00:00:00 2001
From: Arnd Bergmann <arnd(a)arndb.de>
Date: Wed, 21 Feb 2018 13:18:49 +0100
Subject: ARM: orion: fix orion_ge00_switch_board_info initialization
From: Arnd Bergmann <arnd(a)arndb.de>
commit 8337d083507b9827dfb36d545538b7789df834fd upstream.
A section type mismatch warning shows up when building with LTO,
since orion_ge00_mvmdio_bus_name was put in __initconst but not marked
const itself:
include/linux/of.h: In function 'spear_setup_of_timer':
arch/arm/mach-spear/time.c:207:34: error: 'timer_of_match' causes a section type conflict with 'orion_ge00_mvmdio_bus_name'
static const struct of_device_id timer_of_match[] __initconst = {
^
arch/arm/plat-orion/common.c:475:32: note: 'orion_ge00_mvmdio_bus_name' was declared here
static __initconst const char *orion_ge00_mvmdio_bus_name = "orion-mii";
^
As pointed out by Andrew Lunn, it should in fact be 'const' but not
'__initconst' because the string is never copied but may be accessed
after the init sections are freed. To fix that, I get rid of the
extra symbol and rewrite the initialization in a simpler way that
assigns both the bus_id and modalias statically.
I spotted another theoretical bug in the same place, where d->netdev[i]
may be an out of bounds access, this can be fixed by moving the device
assignment into the loop.
Cc: stable(a)vger.kernel.org
Reviewed-by: Andrew Lunn <andrew(a)lunn.ch>
Signed-off-by: Arnd Bergmann <arnd(a)arndb.de>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
arch/arm/plat-orion/common.c | 23 +++++++++++------------
1 file changed, 11 insertions(+), 12 deletions(-)
--- a/arch/arm/plat-orion/common.c
+++ b/arch/arm/plat-orion/common.c
@@ -472,28 +472,27 @@ void __init orion_ge11_init(struct mv643
/*****************************************************************************
* Ethernet switch
****************************************************************************/
-static __initconst const char *orion_ge00_mvmdio_bus_name = "orion-mii";
-static __initdata struct mdio_board_info
- orion_ge00_switch_board_info;
+static __initdata struct mdio_board_info orion_ge00_switch_board_info = {
+ .bus_id = "orion-mii",
+ .modalias = "mv88e6085",
+};
void __init orion_ge00_switch_init(struct dsa_chip_data *d)
{
- struct mdio_board_info *bd;
unsigned int i;
if (!IS_BUILTIN(CONFIG_PHYLIB))
return;
- for (i = 0; i < ARRAY_SIZE(d->port_names); i++)
- if (!strcmp(d->port_names[i], "cpu"))
+ for (i = 0; i < ARRAY_SIZE(d->port_names); i++) {
+ if (!strcmp(d->port_names[i], "cpu")) {
+ d->netdev[i] = &orion_ge00.dev;
break;
+ }
+ }
- bd = &orion_ge00_switch_board_info;
- bd->bus_id = orion_ge00_mvmdio_bus_name;
- bd->mdio_addr = d->sw_addr;
- d->netdev[i] = &orion_ge00.dev;
- strcpy(bd->modalias, "mv88e6085");
- bd->platform_data = d;
+ orion_ge00_switch_board_info.mdio_addr = d->sw_addr;
+ orion_ge00_switch_board_info.platform_data = d;
mdiobus_register_board_info(&orion_ge00_switch_board_info, 1);
}
Patches currently in stable-queue which might be from arnd(a)arndb.de are
queue-4.15/ipv6-sit-work-around-bogus-gcc-8-wrestrict-warning.patch
queue-4.15/arm-kvm-fix-building-with-gcc-8.patch
queue-4.15/arm-orion-fix-orion_ge00_switch_board_info-initialization.patch
This is a note to let you know that I've just added the patch titled
ARM: kvm: fix building with gcc-8
to the 4.15-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
arm-kvm-fix-building-with-gcc-8.patch
and it can be found in the queue-4.15 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From 67870eb1204223598ea6d8a4467b482e9f5875b5 Mon Sep 17 00:00:00 2001
From: Arnd Bergmann <arnd(a)arndb.de>
Date: Fri, 2 Feb 2018 16:07:34 +0100
Subject: ARM: kvm: fix building with gcc-8
From: Arnd Bergmann <arnd(a)arndb.de>
commit 67870eb1204223598ea6d8a4467b482e9f5875b5 upstream.
In banked-sr.c, we use a top-level '__asm__(".arch_extension virt")'
statement to allow compilation of a multi-CPU kernel for ARMv6
and older ARMv7-A that don't normally support access to the banked
registers.
This is considered to be a programming error by the gcc developers
and will no longer work in gcc-8, where we now get a build error:
/tmp/cc4Qy7GR.s:34: Error: Banked registers are not available with this architecture. -- `mrs r3,SP_usr'
/tmp/cc4Qy7GR.s:41: Error: Banked registers are not available with this architecture. -- `mrs r3,ELR_hyp'
/tmp/cc4Qy7GR.s:55: Error: Banked registers are not available with this architecture. -- `mrs r3,SP_svc'
/tmp/cc4Qy7GR.s:62: Error: Banked registers are not available with this architecture. -- `mrs r3,LR_svc'
/tmp/cc4Qy7GR.s:69: Error: Banked registers are not available with this architecture. -- `mrs r3,SPSR_svc'
/tmp/cc4Qy7GR.s:76: Error: Banked registers are not available with this architecture. -- `mrs r3,SP_abt'
Passign the '-march-armv7ve' flag to gcc works, and is ok here, because
we know the functions won't ever be called on pre-ARMv7VE machines.
Unfortunately, older compiler versions (4.8 and earlier) do not understand
that flag, so we still need to keep the asm around.
Backporting to stable kernels (4.6+) is needed to allow those to be built
with future compilers as well.
Link: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84129
Fixes: 33280b4cd1dc ("ARM: KVM: Add banked registers save/restore")
Cc: stable(a)vger.kernel.org
Signed-off-by: Arnd Bergmann <arnd(a)arndb.de>
Signed-off-by: Christoffer Dall <christoffer.dall(a)linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
arch/arm/kvm/hyp/Makefile | 5 +++++
arch/arm/kvm/hyp/banked-sr.c | 4 ++++
2 files changed, 9 insertions(+)
--- a/arch/arm/kvm/hyp/Makefile
+++ b/arch/arm/kvm/hyp/Makefile
@@ -7,6 +7,8 @@ ccflags-y += -fno-stack-protector -DDISA
KVM=../../../../virt/kvm
+CFLAGS_ARMV7VE :=$(call cc-option, -march=armv7ve)
+
obj-$(CONFIG_KVM_ARM_HOST) += $(KVM)/arm/hyp/vgic-v2-sr.o
obj-$(CONFIG_KVM_ARM_HOST) += $(KVM)/arm/hyp/vgic-v3-sr.o
obj-$(CONFIG_KVM_ARM_HOST) += $(KVM)/arm/hyp/timer-sr.o
@@ -15,7 +17,10 @@ obj-$(CONFIG_KVM_ARM_HOST) += tlb.o
obj-$(CONFIG_KVM_ARM_HOST) += cp15-sr.o
obj-$(CONFIG_KVM_ARM_HOST) += vfp.o
obj-$(CONFIG_KVM_ARM_HOST) += banked-sr.o
+CFLAGS_banked-sr.o += $(CFLAGS_ARMV7VE)
+
obj-$(CONFIG_KVM_ARM_HOST) += entry.o
obj-$(CONFIG_KVM_ARM_HOST) += hyp-entry.o
obj-$(CONFIG_KVM_ARM_HOST) += switch.o
+CFLAGS_switch.o += $(CFLAGS_ARMV7VE)
obj-$(CONFIG_KVM_ARM_HOST) += s2-setup.o
--- a/arch/arm/kvm/hyp/banked-sr.c
+++ b/arch/arm/kvm/hyp/banked-sr.c
@@ -20,6 +20,10 @@
#include <asm/kvm_hyp.h>
+/*
+ * gcc before 4.9 doesn't understand -march=armv7ve, so we have to
+ * trick the assembler.
+ */
__asm__(".arch_extension virt");
void __hyp_text __banked_save_state(struct kvm_cpu_context *ctxt)
Patches currently in stable-queue which might be from arnd(a)arndb.de are
queue-4.15/ipv6-sit-work-around-bogus-gcc-8-wrestrict-warning.patch
queue-4.15/arm-kvm-fix-building-with-gcc-8.patch
queue-4.15/arm-orion-fix-orion_ge00_switch_board_info-initialization.patch
This is a note to let you know that I've just added the patch titled
ARM: mvebu: Fix broken PL310_ERRATA_753970 selects
to the 4.15-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
arm-mvebu-fix-broken-pl310_errata_753970-selects.patch
and it can be found in the queue-4.15 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From 8aa36a8dcde3183d84db7b0d622ffddcebb61077 Mon Sep 17 00:00:00 2001
From: Ulf Magnusson <ulfalizer(a)gmail.com>
Date: Mon, 5 Feb 2018 02:21:13 +0100
Subject: ARM: mvebu: Fix broken PL310_ERRATA_753970 selects
From: Ulf Magnusson <ulfalizer(a)gmail.com>
commit 8aa36a8dcde3183d84db7b0d622ffddcebb61077 upstream.
The MACH_ARMADA_375 and MACH_ARMADA_38X boards select ARM_ERRATA_753970,
but it was renamed to PL310_ERRATA_753970 by commit fa0ce4035d48 ("ARM:
7162/1: errata: tidy up Kconfig options for PL310 errata workarounds").
Fix the selects to use the new name.
Discovered with the
https://github.com/ulfalizer/Kconfiglib/blob/master/examples/list_undefined…
script.
Fixes: fa0ce4035d48 ("ARM: 7162/1: errata: tidy up Kconfig options for
PL310 errata workarounds"
cc: stable(a)vger.kernel.org
Signed-off-by: Ulf Magnusson <ulfalizer(a)gmail.com>
Signed-off-by: Gregory CLEMENT <gregory.clement(a)bootlin.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
arch/arm/mach-mvebu/Kconfig | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
--- a/arch/arm/mach-mvebu/Kconfig
+++ b/arch/arm/mach-mvebu/Kconfig
@@ -42,7 +42,7 @@ config MACH_ARMADA_375
depends on ARCH_MULTI_V7
select ARMADA_370_XP_IRQ
select ARM_ERRATA_720789
- select ARM_ERRATA_753970
+ select PL310_ERRATA_753970
select ARM_GIC
select ARMADA_375_CLK
select HAVE_ARM_SCU
@@ -58,7 +58,7 @@ config MACH_ARMADA_38X
bool "Marvell Armada 380/385 boards"
depends on ARCH_MULTI_V7
select ARM_ERRATA_720789
- select ARM_ERRATA_753970
+ select PL310_ERRATA_753970
select ARM_GIC
select ARM_GLOBAL_TIMER
select CLKSRC_ARM_GLOBAL_TIMER_SCHED_CLOCK
Patches currently in stable-queue which might be from ulfalizer(a)gmail.com are
queue-4.15/arm-mvebu-fix-broken-pl310_errata_753970-selects.patch
This is a note to let you know that I've just added the patch titled
ARM: dts: rockchip: Remove 1.8 GHz operation point from phycore som
to the 4.15-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
arm-dts-rockchip-remove-1.8-ghz-operation-point-from-phycore-som.patch
and it can be found in the queue-4.15 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From 5ce0bad4ccd04c8a989e94d3c89e4e796ac22e48 Mon Sep 17 00:00:00 2001
From: Daniel Schultz <d.schultz(a)phytec.de>
Date: Tue, 13 Feb 2018 10:44:32 +0100
Subject: ARM: dts: rockchip: Remove 1.8 GHz operation point from phycore som
From: Daniel Schultz <d.schultz(a)phytec.de>
commit 5ce0bad4ccd04c8a989e94d3c89e4e796ac22e48 upstream.
Rockchip recommends to run the CPU cores only with operations points of
1.6 GHz or lower.
Removed the cpu0 node with too high operation points and use the default
values instead.
Fixes: 903d31e34628 ("ARM: dts: rockchip: Add support for phyCORE-RK3288 SoM")
Cc: stable(a)vger.kernel.org
Signed-off-by: Daniel Schultz <d.schultz(a)phytec.de>
Signed-off-by: Heiko Stuebner <heiko(a)sntech.de>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
arch/arm/boot/dts/rk3288-phycore-som.dtsi | 20 --------------------
1 file changed, 20 deletions(-)
--- a/arch/arm/boot/dts/rk3288-phycore-som.dtsi
+++ b/arch/arm/boot/dts/rk3288-phycore-som.dtsi
@@ -110,26 +110,6 @@
};
};
-&cpu0 {
- cpu0-supply = <&vdd_cpu>;
- operating-points = <
- /* KHz uV */
- 1800000 1400000
- 1608000 1350000
- 1512000 1300000
- 1416000 1200000
- 1200000 1100000
- 1008000 1050000
- 816000 1000000
- 696000 950000
- 600000 900000
- 408000 900000
- 312000 900000
- 216000 900000
- 126000 900000
- >;
-};
-
&emmc {
status = "okay";
bus-width = <8>;
Patches currently in stable-queue which might be from d.schultz(a)phytec.de are
queue-4.15/arm-dts-rockchip-remove-1.8-ghz-operation-point-from-phycore-som.patch
This is a note to let you know that I've just added the patch titled
nospec: Allow index argument to have const-qualified type
to the 4.14-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
nospec-allow-index-argument-to-have-const-qualified-type.patch
and it can be found in the queue-4.14 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From b98c6a160a057d5686a8c54c79cc6c8c94a7d0c8 Mon Sep 17 00:00:00 2001
From: Rasmus Villemoes <linux(a)rasmusvillemoes.dk>
Date: Fri, 16 Feb 2018 13:20:48 -0800
Subject: nospec: Allow index argument to have const-qualified type
From: Rasmus Villemoes <linux(a)rasmusvillemoes.dk>
commit b98c6a160a057d5686a8c54c79cc6c8c94a7d0c8 upstream.
The last expression in a statement expression need not be a bare
variable, quoting gcc docs
The last thing in the compound statement should be an expression
followed by a semicolon; the value of this subexpression serves as the
value of the entire construct.
and we already use that in e.g. the min/max macros which end with a
ternary expression.
This way, we can allow index to have const-qualified type, which will in
some cases avoid the need for introducing a local copy of index of
non-const qualified type. That, in turn, can prevent readers not
familiar with the internals of array_index_nospec from wondering about
the seemingly redundant extra variable, and I think that's worthwhile
considering how confusing the whole _nospec business is.
The expression _i&_mask has type unsigned long (since that is the type
of _mask, and the BUILD_BUG_ONs guarantee that _i will get promoted to
that), so in order not to change the type of the whole expression, add
a cast back to typeof(_i).
Signed-off-by: Rasmus Villemoes <linux(a)rasmusvillemoes.dk>
Signed-off-by: Dan Williams <dan.j.williams(a)intel.com>
Acked-by: Linus Torvalds <torvalds(a)linux-foundation.org>
Cc: Andy Lutomirski <luto(a)kernel.org>
Cc: Arjan van de Ven <arjan(a)linux.intel.com>
Cc: Borislav Petkov <bp(a)alien8.de>
Cc: Dave Hansen <dave.hansen(a)linux.intel.com>
Cc: David Woodhouse <dwmw2(a)infradead.org>
Cc: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Cc: Josh Poimboeuf <jpoimboe(a)redhat.com>
Cc: Peter Zijlstra <peterz(a)infradead.org>
Cc: Thomas Gleixner <tglx(a)linutronix.de>
Cc: Will Deacon <will.deacon(a)arm.com>
Cc: linux-arch(a)vger.kernel.org
Cc: stable(a)vger.kernel.org
Link: http://lkml.kernel.org/r/151881604837.17395.10812767547837568328.stgit@dwil…
Signed-off-by: Ingo Molnar <mingo(a)kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
include/linux/nospec.h | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)
--- a/include/linux/nospec.h
+++ b/include/linux/nospec.h
@@ -72,7 +72,6 @@ static inline unsigned long array_index_
BUILD_BUG_ON(sizeof(_i) > sizeof(long)); \
BUILD_BUG_ON(sizeof(_s) > sizeof(long)); \
\
- _i &= _mask; \
- _i; \
+ (typeof(_i)) (_i & _mask); \
})
#endif /* _LINUX_NOSPEC_H */
Patches currently in stable-queue which might be from linux(a)rasmusvillemoes.dk are
queue-4.14/nospec-allow-index-argument-to-have-const-qualified-type.patch
This is a note to let you know that I've just added the patch titled
x86/mm: Fix {pmd,pud}_{set,clear}_flags()
to the 4.14-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
x86-mm-fix-pmd-pud-_-set-clear-_flags.patch
and it can be found in the queue-4.14 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From 842cef9113c2120f74f645111ded1e020193d84c Mon Sep 17 00:00:00 2001
From: Jan Beulich <JBeulich(a)suse.com>
Date: Mon, 19 Feb 2018 07:48:11 -0700
Subject: x86/mm: Fix {pmd,pud}_{set,clear}_flags()
From: Jan Beulich <JBeulich(a)suse.com>
commit 842cef9113c2120f74f645111ded1e020193d84c upstream.
Just like pte_{set,clear}_flags() their PMD and PUD counterparts should
not do any address translation. This was outright wrong under Xen
(causing a dead boot with no useful output on "suitable" systems), and
produced needlessly more complicated code (even if just slightly) when
paravirt was enabled.
Signed-off-by: Jan Beulich <jbeulich(a)suse.com>
Reviewed-by: Juergen Gross <jgross(a)suse.com>
Acked-by: Thomas Gleixner <tglx(a)linutronix.de>
Cc: Andy Lutomirski <luto(a)kernel.org>
Cc: Boris Ostrovsky <boris.ostrovsky(a)oracle.com>
Cc: Borislav Petkov <bp(a)alien8.de>
Cc: Brian Gerst <brgerst(a)gmail.com>
Cc: Denys Vlasenko <dvlasenk(a)redhat.com>
Cc: H. Peter Anvin <hpa(a)zytor.com>
Cc: Josh Poimboeuf <jpoimboe(a)redhat.com>
Cc: Linus Torvalds <torvalds(a)linux-foundation.org>
Cc: Peter Zijlstra <peterz(a)infradead.org>
Cc: stable(a)vger.kernel.org
Link: http://lkml.kernel.org/r/5A8AF1BB02000078001A91C3@prv-mh.provo.novell.com
Signed-off-by: Ingo Molnar <mingo(a)kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
arch/x86/include/asm/pgtable.h | 8 ++++----
arch/x86/include/asm/pgtable_types.h | 10 ++++++++++
2 files changed, 14 insertions(+), 4 deletions(-)
--- a/arch/x86/include/asm/pgtable.h
+++ b/arch/x86/include/asm/pgtable.h
@@ -350,14 +350,14 @@ static inline pmd_t pmd_set_flags(pmd_t
{
pmdval_t v = native_pmd_val(pmd);
- return __pmd(v | set);
+ return native_make_pmd(v | set);
}
static inline pmd_t pmd_clear_flags(pmd_t pmd, pmdval_t clear)
{
pmdval_t v = native_pmd_val(pmd);
- return __pmd(v & ~clear);
+ return native_make_pmd(v & ~clear);
}
static inline pmd_t pmd_mkold(pmd_t pmd)
@@ -409,14 +409,14 @@ static inline pud_t pud_set_flags(pud_t
{
pudval_t v = native_pud_val(pud);
- return __pud(v | set);
+ return native_make_pud(v | set);
}
static inline pud_t pud_clear_flags(pud_t pud, pudval_t clear)
{
pudval_t v = native_pud_val(pud);
- return __pud(v & ~clear);
+ return native_make_pud(v & ~clear);
}
static inline pud_t pud_mkold(pud_t pud)
--- a/arch/x86/include/asm/pgtable_types.h
+++ b/arch/x86/include/asm/pgtable_types.h
@@ -323,6 +323,11 @@ static inline pudval_t native_pud_val(pu
#else
#include <asm-generic/pgtable-nopud.h>
+static inline pud_t native_make_pud(pudval_t val)
+{
+ return (pud_t) { .p4d.pgd = native_make_pgd(val) };
+}
+
static inline pudval_t native_pud_val(pud_t pud)
{
return native_pgd_val(pud.p4d.pgd);
@@ -344,6 +349,11 @@ static inline pmdval_t native_pmd_val(pm
#else
#include <asm-generic/pgtable-nopmd.h>
+static inline pmd_t native_make_pmd(pmdval_t val)
+{
+ return (pmd_t) { .pud.p4d.pgd = native_make_pgd(val) };
+}
+
static inline pmdval_t native_pmd_val(pmd_t pmd)
{
return native_pgd_val(pmd.pud.p4d.pgd);
Patches currently in stable-queue which might be from JBeulich(a)suse.com are
queue-4.14/x86-mm-fix-pmd-pud-_-set-clear-_flags.patch
This is a note to let you know that I've just added the patch titled
KVM: x86: move LAPIC initialization after VMCS creation
to the 4.14-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
kvm-x86-move-lapic-initialization-after-vmcs-creation.patch
and it can be found in the queue-4.14 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From 0b2e9904c15963e715d33e5f3f1387f17d19333a Mon Sep 17 00:00:00 2001
From: Paolo Bonzini <pbonzini(a)redhat.com>
Date: Fri, 23 Feb 2018 23:29:32 +0100
Subject: KVM: x86: move LAPIC initialization after VMCS creation
From: Paolo Bonzini <pbonzini(a)redhat.com>
commit 0b2e9904c15963e715d33e5f3f1387f17d19333a upstream.
The initial reset of the local APIC is performed before the VMCS has been
created, but it tries to do a vmwrite:
vmwrite error: reg 810 value 4a00 (err 18944)
CPU: 54 PID: 38652 Comm: qemu-kvm Tainted: G W I 4.16.0-0.rc2.git0.1.fc28.x86_64 #1
Hardware name: Intel Corporation S2600CW/S2600CW, BIOS SE5C610.86B.01.01.0003.090520141303 09/05/2014
Call Trace:
vmx_set_rvi [kvm_intel]
vmx_hwapic_irr_update [kvm_intel]
kvm_lapic_reset [kvm]
kvm_create_lapic [kvm]
kvm_arch_vcpu_init [kvm]
kvm_vcpu_init [kvm]
vmx_create_vcpu [kvm_intel]
kvm_vm_ioctl [kvm]
Move it later, after the VMCS has been created.
Fixes: 4191db26b714 ("KVM: x86: Update APICv on APIC reset")
Cc: stable(a)vger.kernel.org
Cc: Liran Alon <liran.alon(a)oracle.com>
Signed-off-by: Paolo Bonzini <pbonzini(a)redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
arch/x86/kvm/lapic.c | 1 -
arch/x86/kvm/x86.c | 1 +
2 files changed, 1 insertion(+), 1 deletion(-)
--- a/arch/x86/kvm/lapic.c
+++ b/arch/x86/kvm/lapic.c
@@ -2107,7 +2107,6 @@ int kvm_create_lapic(struct kvm_vcpu *vc
*/
vcpu->arch.apic_base = MSR_IA32_APICBASE_ENABLE;
static_key_slow_inc(&apic_sw_disabled.key); /* sw disabled at reset */
- kvm_lapic_reset(vcpu, false);
kvm_iodevice_init(&apic->dev, &apic_mmio_ops);
return 0;
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -7779,6 +7779,7 @@ int kvm_arch_vcpu_setup(struct kvm_vcpu
if (r)
return r;
kvm_vcpu_reset(vcpu, false);
+ kvm_lapic_reset(vcpu, false);
kvm_mmu_setup(vcpu);
vcpu_put(vcpu);
return r;
Patches currently in stable-queue which might be from pbonzini(a)redhat.com are
queue-4.14/kvm-x86-fix-smram-accessing-even-if-vm-is-shutdown.patch
queue-4.14/kvm-x86-remove-indirect-msr-op-calls-from-spec_ctrl.patch
queue-4.14/kvm-x86-move-lapic-initialization-after-vmcs-creation.patch
queue-4.14/kvm-mmu-fix-overlap-between-public-and-private-memslots.patch
queue-4.14/kvm-vmx-optimize-vmx_vcpu_run-and-svm_vcpu_run-by-marking-the-rdmsr-path-as-unlikely.patch
This is a note to let you know that I've just added the patch titled
KVM/x86: Remove indirect MSR op calls from SPEC_CTRL
to the 4.14-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
kvm-x86-remove-indirect-msr-op-calls-from-spec_ctrl.patch
and it can be found in the queue-4.14 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From ecb586bd29c99fb4de599dec388658e74388daad Mon Sep 17 00:00:00 2001
From: Paolo Bonzini <pbonzini(a)redhat.com>
Date: Thu, 22 Feb 2018 16:43:17 +0100
Subject: KVM/x86: Remove indirect MSR op calls from SPEC_CTRL
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
From: Paolo Bonzini <pbonzini(a)redhat.com>
commit ecb586bd29c99fb4de599dec388658e74388daad upstream.
Having a paravirt indirect call in the IBRS restore path is not a
good idea, since we are trying to protect from speculative execution
of bogus indirect branch targets. It is also slower, so use
native_wrmsrl() on the vmentry path too.
Signed-off-by: Paolo Bonzini <pbonzini(a)redhat.com>
Reviewed-by: Jim Mattson <jmattson(a)google.com>
Cc: David Woodhouse <dwmw(a)amazon.co.uk>
Cc: KarimAllah Ahmed <karahmed(a)amazon.de>
Cc: Linus Torvalds <torvalds(a)linux-foundation.org>
Cc: Peter Zijlstra <peterz(a)infradead.org>
Cc: Radim Krčmář <rkrcmar(a)redhat.com>
Cc: Thomas Gleixner <tglx(a)linutronix.de>
Cc: kvm(a)vger.kernel.org
Cc: stable(a)vger.kernel.org
Fixes: d28b387fb74da95d69d2615732f50cceb38e9a4d
Link: http://lkml.kernel.org/r/20180222154318.20361-2-pbonzini@redhat.com
Signed-off-by: Ingo Molnar <mingo(a)kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
arch/x86/kvm/svm.c | 7 ++++---
arch/x86/kvm/vmx.c | 7 ++++---
2 files changed, 8 insertions(+), 6 deletions(-)
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -45,6 +45,7 @@
#include <asm/debugreg.h>
#include <asm/kvm_para.h>
#include <asm/irq_remapping.h>
+#include <asm/microcode.h>
#include <asm/nospec-branch.h>
#include <asm/virtext.h>
@@ -5015,7 +5016,7 @@ static void svm_vcpu_run(struct kvm_vcpu
* being speculatively taken.
*/
if (svm->spec_ctrl)
- wrmsrl(MSR_IA32_SPEC_CTRL, svm->spec_ctrl);
+ native_wrmsrl(MSR_IA32_SPEC_CTRL, svm->spec_ctrl);
asm volatile (
"push %%" _ASM_BP "; \n\t"
@@ -5125,10 +5126,10 @@ static void svm_vcpu_run(struct kvm_vcpu
* save it.
*/
if (!msr_write_intercepted(vcpu, MSR_IA32_SPEC_CTRL))
- rdmsrl(MSR_IA32_SPEC_CTRL, svm->spec_ctrl);
+ svm->spec_ctrl = native_read_msr(MSR_IA32_SPEC_CTRL);
if (svm->spec_ctrl)
- wrmsrl(MSR_IA32_SPEC_CTRL, 0);
+ native_wrmsrl(MSR_IA32_SPEC_CTRL, 0);
/* Eliminate branch target predictions from guest mode */
vmexit_fill_RSB();
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -51,6 +51,7 @@
#include <asm/apic.h>
#include <asm/irq_remapping.h>
#include <asm/mmu_context.h>
+#include <asm/microcode.h>
#include <asm/nospec-branch.h>
#include "trace.h"
@@ -9431,7 +9432,7 @@ static void __noclone vmx_vcpu_run(struc
* being speculatively taken.
*/
if (vmx->spec_ctrl)
- wrmsrl(MSR_IA32_SPEC_CTRL, vmx->spec_ctrl);
+ native_wrmsrl(MSR_IA32_SPEC_CTRL, vmx->spec_ctrl);
vmx->__launched = vmx->loaded_vmcs->launched;
asm(
@@ -9567,10 +9568,10 @@ static void __noclone vmx_vcpu_run(struc
* save it.
*/
if (!msr_write_intercepted(vcpu, MSR_IA32_SPEC_CTRL))
- rdmsrl(MSR_IA32_SPEC_CTRL, vmx->spec_ctrl);
+ vmx->spec_ctrl = native_read_msr(MSR_IA32_SPEC_CTRL);
if (vmx->spec_ctrl)
- wrmsrl(MSR_IA32_SPEC_CTRL, 0);
+ native_wrmsrl(MSR_IA32_SPEC_CTRL, 0);
/* Eliminate branch target predictions from guest mode */
vmexit_fill_RSB();
Patches currently in stable-queue which might be from pbonzini(a)redhat.com are
queue-4.14/kvm-x86-fix-smram-accessing-even-if-vm-is-shutdown.patch
queue-4.14/kvm-x86-remove-indirect-msr-op-calls-from-spec_ctrl.patch
queue-4.14/kvm-x86-move-lapic-initialization-after-vmcs-creation.patch
queue-4.14/kvm-mmu-fix-overlap-between-public-and-private-memslots.patch
queue-4.14/kvm-vmx-optimize-vmx_vcpu_run-and-svm_vcpu_run-by-marking-the-rdmsr-path-as-unlikely.patch
This is a note to let you know that I've just added the patch titled
KVM: s390: take care of clock-comparator sign control
to the 4.14-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
kvm-s390-take-care-of-clock-comparator-sign-control.patch
and it can be found in the queue-4.14 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From 5fe01793dd953ab947fababe8abaf5ed5258c8df Mon Sep 17 00:00:00 2001
From: David Hildenbrand <david(a)redhat.com>
Date: Wed, 7 Feb 2018 12:46:42 +0100
Subject: KVM: s390: take care of clock-comparator sign control
From: David Hildenbrand <david(a)redhat.com>
commit 5fe01793dd953ab947fababe8abaf5ed5258c8df upstream.
Missed when enabling the Multiple-epoch facility. If the facility is
installed and the control is set, a sign based comaprison has to be
performed.
Right now we would inject wrong interrupts and ignore interrupt
conditions. Also the sleep time is calculated in a wrong way.
Signed-off-by: David Hildenbrand <david(a)redhat.com>
Message-Id: <20180207114647.6220-2-david(a)redhat.com>
Fixes: 8fa1696ea781 ("KVM: s390: Multiple Epoch Facility support")
Cc: stable(a)vger.kernel.org
Reviewed-by: Christian Borntraeger <borntraeger(a)de.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger(a)de.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
arch/s390/kvm/interrupt.c | 25 +++++++++++++++++++------
1 file changed, 19 insertions(+), 6 deletions(-)
--- a/arch/s390/kvm/interrupt.c
+++ b/arch/s390/kvm/interrupt.c
@@ -173,8 +173,15 @@ static int ckc_interrupts_enabled(struct
static int ckc_irq_pending(struct kvm_vcpu *vcpu)
{
- if (vcpu->arch.sie_block->ckc >= kvm_s390_get_tod_clock_fast(vcpu->kvm))
+ const u64 now = kvm_s390_get_tod_clock_fast(vcpu->kvm);
+ const u64 ckc = vcpu->arch.sie_block->ckc;
+
+ if (vcpu->arch.sie_block->gcr[0] & 0x0020000000000000ul) {
+ if ((s64)ckc >= (s64)now)
+ return 0;
+ } else if (ckc >= now) {
return 0;
+ }
return ckc_interrupts_enabled(vcpu);
}
@@ -1004,13 +1011,19 @@ int kvm_cpu_has_pending_timer(struct kvm
static u64 __calculate_sltime(struct kvm_vcpu *vcpu)
{
- u64 now, cputm, sltime = 0;
+ const u64 now = kvm_s390_get_tod_clock_fast(vcpu->kvm);
+ const u64 ckc = vcpu->arch.sie_block->ckc;
+ u64 cputm, sltime = 0;
if (ckc_interrupts_enabled(vcpu)) {
- now = kvm_s390_get_tod_clock_fast(vcpu->kvm);
- sltime = tod_to_ns(vcpu->arch.sie_block->ckc - now);
- /* already expired or overflow? */
- if (!sltime || vcpu->arch.sie_block->ckc <= now)
+ if (vcpu->arch.sie_block->gcr[0] & 0x0020000000000000ul) {
+ if ((s64)now < (s64)ckc)
+ sltime = tod_to_ns((s64)ckc - (s64)now);
+ } else if (now < ckc) {
+ sltime = tod_to_ns(ckc - now);
+ }
+ /* already expired */
+ if (!sltime)
return 0;
if (cpu_timer_interrupts_enabled(vcpu)) {
cputm = kvm_s390_get_cpu_timer(vcpu);
Patches currently in stable-queue which might be from david(a)redhat.com are
queue-4.14/kvm-s390-take-care-of-clock-comparator-sign-control.patch
queue-4.14/kvm-s390-provide-only-a-single-function-for-setting-the-tod-fix-sck.patch
queue-4.14/kvm-s390-consider-epoch-index-on-tod-clock-syncs.patch
queue-4.14/kvm-s390-consider-epoch-index-on-hotplugged-cpus.patch
This is a note to let you know that I've just added the patch titled
KVM/VMX: Optimize vmx_vcpu_run() and svm_vcpu_run() by marking the RDMSR path as unlikely()
to the 4.14-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
kvm-vmx-optimize-vmx_vcpu_run-and-svm_vcpu_run-by-marking-the-rdmsr-path-as-unlikely.patch
and it can be found in the queue-4.14 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From 946fbbc13dce68902f64515b610eeb2a6c3d7a64 Mon Sep 17 00:00:00 2001
From: Paolo Bonzini <pbonzini(a)redhat.com>
Date: Thu, 22 Feb 2018 16:43:18 +0100
Subject: KVM/VMX: Optimize vmx_vcpu_run() and svm_vcpu_run() by marking the RDMSR path as unlikely()
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
From: Paolo Bonzini <pbonzini(a)redhat.com>
commit 946fbbc13dce68902f64515b610eeb2a6c3d7a64 upstream.
vmx_vcpu_run() and svm_vcpu_run() are large functions, and giving
branch hints to the compiler can actually make a substantial cycle
difference by keeping the fast path contiguous in memory.
With this optimization, the retpoline-guest/retpoline-host case is
about 50 cycles faster.
Signed-off-by: Paolo Bonzini <pbonzini(a)redhat.com>
Reviewed-by: Jim Mattson <jmattson(a)google.com>
Cc: David Woodhouse <dwmw(a)amazon.co.uk>
Cc: KarimAllah Ahmed <karahmed(a)amazon.de>
Cc: Linus Torvalds <torvalds(a)linux-foundation.org>
Cc: Peter Zijlstra <peterz(a)infradead.org>
Cc: Radim Krčmář <rkrcmar(a)redhat.com>
Cc: Thomas Gleixner <tglx(a)linutronix.de>
Cc: kvm(a)vger.kernel.org
Cc: stable(a)vger.kernel.org
Link: http://lkml.kernel.org/r/20180222154318.20361-3-pbonzini@redhat.com
Signed-off-by: Ingo Molnar <mingo(a)kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
arch/x86/kvm/svm.c | 2 +-
arch/x86/kvm/vmx.c | 2 +-
2 files changed, 2 insertions(+), 2 deletions(-)
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -5125,7 +5125,7 @@ static void svm_vcpu_run(struct kvm_vcpu
* If the L02 MSR bitmap does not intercept the MSR, then we need to
* save it.
*/
- if (!msr_write_intercepted(vcpu, MSR_IA32_SPEC_CTRL))
+ if (unlikely(!msr_write_intercepted(vcpu, MSR_IA32_SPEC_CTRL)))
svm->spec_ctrl = native_read_msr(MSR_IA32_SPEC_CTRL);
if (svm->spec_ctrl)
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -9567,7 +9567,7 @@ static void __noclone vmx_vcpu_run(struc
* If the L02 MSR bitmap does not intercept the MSR, then we need to
* save it.
*/
- if (!msr_write_intercepted(vcpu, MSR_IA32_SPEC_CTRL))
+ if (unlikely(!msr_write_intercepted(vcpu, MSR_IA32_SPEC_CTRL)))
vmx->spec_ctrl = native_read_msr(MSR_IA32_SPEC_CTRL);
if (vmx->spec_ctrl)
Patches currently in stable-queue which might be from pbonzini(a)redhat.com are
queue-4.14/kvm-x86-fix-smram-accessing-even-if-vm-is-shutdown.patch
queue-4.14/kvm-x86-remove-indirect-msr-op-calls-from-spec_ctrl.patch
queue-4.14/kvm-x86-move-lapic-initialization-after-vmcs-creation.patch
queue-4.14/kvm-mmu-fix-overlap-between-public-and-private-memslots.patch
queue-4.14/kvm-vmx-optimize-vmx_vcpu_run-and-svm_vcpu_run-by-marking-the-rdmsr-path-as-unlikely.patch
This is a note to let you know that I've just added the patch titled
KVM: s390: consider epoch index on TOD clock syncs
to the 4.14-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
kvm-s390-consider-epoch-index-on-tod-clock-syncs.patch
and it can be found in the queue-4.14 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From 1575767ef3cf5326701d2ae3075b7732cbc855e4 Mon Sep 17 00:00:00 2001
From: David Hildenbrand <david(a)redhat.com>
Date: Wed, 7 Feb 2018 12:46:45 +0100
Subject: KVM: s390: consider epoch index on TOD clock syncs
From: David Hildenbrand <david(a)redhat.com>
commit 1575767ef3cf5326701d2ae3075b7732cbc855e4 upstream.
For now, we don't take care of over/underflows. Especially underflows
are critical:
Assume the epoch is currently 0 and we get a sync request for delta=1,
meaning the TOD is moved forward by 1 and we have to fix it up by
subtracting 1 from the epoch. Right now, this will leave the epoch
index untouched, resulting in epoch=-1, epoch_idx=0, which is wrong.
We have to take care of over and underflows, also for the VSIE case. So
let's factor out calculation into a separate function.
Signed-off-by: David Hildenbrand <david(a)redhat.com>
Message-Id: <20180207114647.6220-5-david(a)redhat.com>
Reviewed-by: Christian Borntraeger <borntraeger(a)de.ibm.com>
Fixes: 8fa1696ea781 ("KVM: s390: Multiple Epoch Facility support")
Cc: stable(a)vger.kernel.org
Signed-off-by: Christian Borntraeger <borntraeger(a)de.ibm.com>
[use u8 for idx]
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
arch/s390/kvm/kvm-s390.c | 32 +++++++++++++++++++++++++++++---
1 file changed, 29 insertions(+), 3 deletions(-)
--- a/arch/s390/kvm/kvm-s390.c
+++ b/arch/s390/kvm/kvm-s390.c
@@ -169,6 +169,28 @@ int kvm_arch_hardware_enable(void)
static void kvm_gmap_notifier(struct gmap *gmap, unsigned long start,
unsigned long end);
+static void kvm_clock_sync_scb(struct kvm_s390_sie_block *scb, u64 delta)
+{
+ u8 delta_idx = 0;
+
+ /*
+ * The TOD jumps by delta, we have to compensate this by adding
+ * -delta to the epoch.
+ */
+ delta = -delta;
+
+ /* sign-extension - we're adding to signed values below */
+ if ((s64)delta < 0)
+ delta_idx = -1;
+
+ scb->epoch += delta;
+ if (scb->ecd & ECD_MEF) {
+ scb->epdx += delta_idx;
+ if (scb->epoch < delta)
+ scb->epdx += 1;
+ }
+}
+
/*
* This callback is executed during stop_machine(). All CPUs are therefore
* temporarily stopped. In order not to change guest behavior, we have to
@@ -184,13 +206,17 @@ static int kvm_clock_sync(struct notifie
unsigned long long *delta = v;
list_for_each_entry(kvm, &vm_list, vm_list) {
- kvm->arch.epoch -= *delta;
kvm_for_each_vcpu(i, vcpu, kvm) {
- vcpu->arch.sie_block->epoch -= *delta;
+ kvm_clock_sync_scb(vcpu->arch.sie_block, *delta);
+ if (i == 0) {
+ kvm->arch.epoch = vcpu->arch.sie_block->epoch;
+ kvm->arch.epdx = vcpu->arch.sie_block->epdx;
+ }
if (vcpu->arch.cputm_enabled)
vcpu->arch.cputm_start += *delta;
if (vcpu->arch.vsie_block)
- vcpu->arch.vsie_block->epoch -= *delta;
+ kvm_clock_sync_scb(vcpu->arch.vsie_block,
+ *delta);
}
}
return NOTIFY_OK;
Patches currently in stable-queue which might be from david(a)redhat.com are
queue-4.14/kvm-s390-take-care-of-clock-comparator-sign-control.patch
queue-4.14/kvm-s390-provide-only-a-single-function-for-setting-the-tod-fix-sck.patch
queue-4.14/kvm-s390-consider-epoch-index-on-tod-clock-syncs.patch
queue-4.14/kvm-s390-consider-epoch-index-on-hotplugged-cpus.patch
This is a note to let you know that I've just added the patch titled
KVM: s390: provide only a single function for setting the tod (fix SCK)
to the 4.14-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
kvm-s390-provide-only-a-single-function-for-setting-the-tod-fix-sck.patch
and it can be found in the queue-4.14 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From 0e7def5fb0dc53ddbb9f62a497d15f1e11ccdc36 Mon Sep 17 00:00:00 2001
From: David Hildenbrand <david(a)redhat.com>
Date: Wed, 7 Feb 2018 12:46:43 +0100
Subject: KVM: s390: provide only a single function for setting the tod (fix SCK)
From: David Hildenbrand <david(a)redhat.com>
commit 0e7def5fb0dc53ddbb9f62a497d15f1e11ccdc36 upstream.
Right now, SET CLOCK called in the guest does not properly take care of
the epoch index, as the call goes via the old kvm_s390_set_tod_clock()
interface. So the epoch index is neither reset to 0, if required, nor
properly set to e.g. 0xff on negative values.
Fix this by providing a single kvm_s390_set_tod_clock() function. Move
Multiple-epoch facility handling into it.
Signed-off-by: David Hildenbrand <david(a)redhat.com>
Message-Id: <20180207114647.6220-3-david(a)redhat.com>
Reviewed-by: Christian Borntraeger <borntraeger(a)de.ibm.com>
Fixes: 8fa1696ea781 ("KVM: s390: Multiple Epoch Facility support")
Cc: stable(a)vger.kernel.org
Signed-off-by: Christian Borntraeger <borntraeger(a)de.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
arch/s390/kvm/kvm-s390.c | 46 +++++++++++++++-------------------------------
arch/s390/kvm/kvm-s390.h | 5 ++---
arch/s390/kvm/priv.c | 9 +++++----
3 files changed, 22 insertions(+), 38 deletions(-)
--- a/arch/s390/kvm/kvm-s390.c
+++ b/arch/s390/kvm/kvm-s390.c
@@ -888,12 +888,9 @@ static int kvm_s390_set_tod_ext(struct k
if (copy_from_user(>od, (void __user *)attr->addr, sizeof(gtod)))
return -EFAULT;
- if (test_kvm_facility(kvm, 139))
- kvm_s390_set_tod_clock_ext(kvm, >od);
- else if (gtod.epoch_idx == 0)
- kvm_s390_set_tod_clock(kvm, gtod.tod);
- else
+ if (!test_kvm_facility(kvm, 139) && gtod.epoch_idx)
return -EINVAL;
+ kvm_s390_set_tod_clock(kvm, >od);
VM_EVENT(kvm, 3, "SET: TOD extension: 0x%x, TOD base: 0x%llx",
gtod.epoch_idx, gtod.tod);
@@ -918,13 +915,14 @@ static int kvm_s390_set_tod_high(struct
static int kvm_s390_set_tod_low(struct kvm *kvm, struct kvm_device_attr *attr)
{
- u64 gtod;
+ struct kvm_s390_vm_tod_clock gtod = { 0 };
- if (copy_from_user(>od, (void __user *)attr->addr, sizeof(gtod)))
+ if (copy_from_user(>od.tod, (void __user *)attr->addr,
+ sizeof(gtod.tod)))
return -EFAULT;
- kvm_s390_set_tod_clock(kvm, gtod);
- VM_EVENT(kvm, 3, "SET: TOD base: 0x%llx", gtod);
+ kvm_s390_set_tod_clock(kvm, >od);
+ VM_EVENT(kvm, 3, "SET: TOD base: 0x%llx", gtod.tod);
return 0;
}
@@ -2945,8 +2943,8 @@ retry:
return 0;
}
-void kvm_s390_set_tod_clock_ext(struct kvm *kvm,
- const struct kvm_s390_vm_tod_clock *gtod)
+void kvm_s390_set_tod_clock(struct kvm *kvm,
+ const struct kvm_s390_vm_tod_clock *gtod)
{
struct kvm_vcpu *vcpu;
struct kvm_s390_tod_clock_ext htod;
@@ -2958,10 +2956,12 @@ void kvm_s390_set_tod_clock_ext(struct k
get_tod_clock_ext((char *)&htod);
kvm->arch.epoch = gtod->tod - htod.tod;
- kvm->arch.epdx = gtod->epoch_idx - htod.epoch_idx;
-
- if (kvm->arch.epoch > gtod->tod)
- kvm->arch.epdx -= 1;
+ kvm->arch.epdx = 0;
+ if (test_kvm_facility(kvm, 139)) {
+ kvm->arch.epdx = gtod->epoch_idx - htod.epoch_idx;
+ if (kvm->arch.epoch > gtod->tod)
+ kvm->arch.epdx -= 1;
+ }
kvm_s390_vcpu_block_all(kvm);
kvm_for_each_vcpu(i, vcpu, kvm) {
@@ -2972,22 +2972,6 @@ void kvm_s390_set_tod_clock_ext(struct k
kvm_s390_vcpu_unblock_all(kvm);
preempt_enable();
mutex_unlock(&kvm->lock);
-}
-
-void kvm_s390_set_tod_clock(struct kvm *kvm, u64 tod)
-{
- struct kvm_vcpu *vcpu;
- int i;
-
- mutex_lock(&kvm->lock);
- preempt_disable();
- kvm->arch.epoch = tod - get_tod_clock();
- kvm_s390_vcpu_block_all(kvm);
- kvm_for_each_vcpu(i, vcpu, kvm)
- vcpu->arch.sie_block->epoch = kvm->arch.epoch;
- kvm_s390_vcpu_unblock_all(kvm);
- preempt_enable();
- mutex_unlock(&kvm->lock);
}
/**
--- a/arch/s390/kvm/kvm-s390.h
+++ b/arch/s390/kvm/kvm-s390.h
@@ -272,9 +272,8 @@ int kvm_s390_handle_sigp_pei(struct kvm_
int handle_sthyi(struct kvm_vcpu *vcpu);
/* implemented in kvm-s390.c */
-void kvm_s390_set_tod_clock_ext(struct kvm *kvm,
- const struct kvm_s390_vm_tod_clock *gtod);
-void kvm_s390_set_tod_clock(struct kvm *kvm, u64 tod);
+void kvm_s390_set_tod_clock(struct kvm *kvm,
+ const struct kvm_s390_vm_tod_clock *gtod);
long kvm_arch_fault_in_page(struct kvm_vcpu *vcpu, gpa_t gpa, int writable);
int kvm_s390_store_status_unloaded(struct kvm_vcpu *vcpu, unsigned long addr);
int kvm_s390_vcpu_store_status(struct kvm_vcpu *vcpu, unsigned long addr);
--- a/arch/s390/kvm/priv.c
+++ b/arch/s390/kvm/priv.c
@@ -84,9 +84,10 @@ int kvm_s390_handle_e3(struct kvm_vcpu *
/* Handle SCK (SET CLOCK) interception */
static int handle_set_clock(struct kvm_vcpu *vcpu)
{
+ struct kvm_s390_vm_tod_clock gtod = { 0 };
int rc;
u8 ar;
- u64 op2, val;
+ u64 op2;
if (vcpu->arch.sie_block->gpsw.mask & PSW_MASK_PSTATE)
return kvm_s390_inject_program_int(vcpu, PGM_PRIVILEGED_OP);
@@ -94,12 +95,12 @@ static int handle_set_clock(struct kvm_v
op2 = kvm_s390_get_base_disp_s(vcpu, &ar);
if (op2 & 7) /* Operand must be on a doubleword boundary */
return kvm_s390_inject_program_int(vcpu, PGM_SPECIFICATION);
- rc = read_guest(vcpu, op2, ar, &val, sizeof(val));
+ rc = read_guest(vcpu, op2, ar, >od.tod, sizeof(gtod.tod));
if (rc)
return kvm_s390_inject_prog_cond(vcpu, rc);
- VCPU_EVENT(vcpu, 3, "SCK: setting guest TOD to 0x%llx", val);
- kvm_s390_set_tod_clock(vcpu->kvm, val);
+ VCPU_EVENT(vcpu, 3, "SCK: setting guest TOD to 0x%llx", gtod.tod);
+ kvm_s390_set_tod_clock(vcpu->kvm, >od);
kvm_s390_set_psw_cc(vcpu, 0);
return 0;
Patches currently in stable-queue which might be from david(a)redhat.com are
queue-4.14/kvm-s390-take-care-of-clock-comparator-sign-control.patch
queue-4.14/kvm-s390-provide-only-a-single-function-for-setting-the-tod-fix-sck.patch
queue-4.14/kvm-s390-consider-epoch-index-on-tod-clock-syncs.patch
queue-4.14/kvm-s390-consider-epoch-index-on-hotplugged-cpus.patch
This is a note to let you know that I've just added the patch titled
EDAC, sb_edac: Fix out of bound writes during DIMM configuration on KNL
to the 4.14-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
edac-sb_edac-fix-out-of-bound-writes-during-dimm-configuration-on-knl.patch
and it can be found in the queue-4.14 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From bf8486709ac7fad99e4040dea73fe466c57a4ae1 Mon Sep 17 00:00:00 2001
From: Anna Karbownik <anna.karbownik(a)intel.com>
Date: Thu, 22 Feb 2018 16:18:13 +0100
Subject: EDAC, sb_edac: Fix out of bound writes during DIMM configuration on KNL
From: Anna Karbownik <anna.karbownik(a)intel.com>
commit bf8486709ac7fad99e4040dea73fe466c57a4ae1 upstream.
Commit
3286d3eb906c ("EDAC, sb_edac: Drop NUM_CHANNELS from 8 back to 4")
decreased NUM_CHANNELS from 8 to 4, but this is not enough for Knights
Landing which supports up to 6 channels.
This caused out-of-bounds writes to pvt->mirror_mode and pvt->tolm
variables which don't pay critical role on KNL code path, so the memory
corruption wasn't causing any visible driver failures.
The easiest way of fixing it is to change NUM_CHANNELS to 6. Do that.
An alternative solution would be to restructure the KNL part of the
driver to 2MC/3channel representation.
Reported-by: Dan Carpenter <dan.carpenter(a)oracle.com>
Signed-off-by: Anna Karbownik <anna.karbownik(a)intel.com>
Cc: Mauro Carvalho Chehab <mchehab(a)kernel.org>
Cc: Tony Luck <tony.luck(a)intel.com>
Cc: jim.m.snow(a)intel.com
Cc: krzysztof.paliswiat(a)intel.com
Cc: lukasz.odzioba(a)intel.com
Cc: qiuxu.zhuo(a)intel.com
Cc: linux-edac <linux-edac(a)vger.kernel.org>
Cc: <stable(a)vger.kernel.org>
Fixes: 3286d3eb906c ("EDAC, sb_edac: Drop NUM_CHANNELS from 8 back to 4")
Link: http://lkml.kernel.org/r/1519312693-4789-1-git-send-email-anna.karbownik@in…
[ Massage commit message. ]
Signed-off-by: Borislav Petkov <bp(a)suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/edac/sb_edac.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/edac/sb_edac.c
+++ b/drivers/edac/sb_edac.c
@@ -279,7 +279,7 @@ static const u32 correrrthrsld[] = {
* sbridge structs
*/
-#define NUM_CHANNELS 4 /* Max channels per MC */
+#define NUM_CHANNELS 6 /* Max channels per MC */
#define MAX_DIMMS 3 /* Max DIMMS per channel */
#define KNL_MAX_CHAS 38 /* KNL max num. of Cache Home Agents */
#define KNL_MAX_CHANNELS 6 /* KNL max num. of PCI channels */
Patches currently in stable-queue which might be from anna.karbownik(a)intel.com are
queue-4.14/edac-sb_edac-fix-out-of-bound-writes-during-dimm-configuration-on-knl.patch
This is a note to let you know that I've just added the patch titled
KVM: s390: consider epoch index on hotplugged CPUs
to the 4.14-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
kvm-s390-consider-epoch-index-on-hotplugged-cpus.patch
and it can be found in the queue-4.14 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From d16b52cb9cdb6f06dea8ab2f0a428e7d7f0b0a81 Mon Sep 17 00:00:00 2001
From: David Hildenbrand <david(a)redhat.com>
Date: Wed, 7 Feb 2018 12:46:44 +0100
Subject: KVM: s390: consider epoch index on hotplugged CPUs
From: David Hildenbrand <david(a)redhat.com>
commit d16b52cb9cdb6f06dea8ab2f0a428e7d7f0b0a81 upstream.
We must copy both, the epoch and the epoch_idx.
Signed-off-by: David Hildenbrand <david(a)redhat.com>
Message-Id: <20180207114647.6220-4-david(a)redhat.com>
Fixes: 8fa1696ea781 ("KVM: s390: Multiple Epoch Facility support")
Reviewed-by: Cornelia Huck <cohuck(a)redhat.com>
Reviewed-by: Christian Borntraeger <borntraeger(a)de.ibm.com>
Fixes: 8fa1696ea781 ("KVM: s390: Multiple Epoch Facility support")
Cc: stable(a)vger.kernel.org
Signed-off-by: Christian Borntraeger <borntraeger(a)de.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
arch/s390/kvm/kvm-s390.c | 1 +
1 file changed, 1 insertion(+)
--- a/arch/s390/kvm/kvm-s390.c
+++ b/arch/s390/kvm/kvm-s390.c
@@ -2357,6 +2357,7 @@ void kvm_arch_vcpu_postcreate(struct kvm
mutex_lock(&vcpu->kvm->lock);
preempt_disable();
vcpu->arch.sie_block->epoch = vcpu->kvm->arch.epoch;
+ vcpu->arch.sie_block->epdx = vcpu->kvm->arch.epdx;
preempt_enable();
mutex_unlock(&vcpu->kvm->lock);
if (!kvm_is_ucontrol(vcpu->kvm)) {
Patches currently in stable-queue which might be from david(a)redhat.com are
queue-4.14/kvm-s390-take-care-of-clock-comparator-sign-control.patch
queue-4.14/kvm-s390-provide-only-a-single-function-for-setting-the-tod-fix-sck.patch
queue-4.14/kvm-s390-consider-epoch-index-on-tod-clock-syncs.patch
queue-4.14/kvm-s390-consider-epoch-index-on-hotplugged-cpus.patch
This is a note to let you know that I've just added the patch titled
KVM: mmu: Fix overlap between public and private memslots
to the 4.14-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
kvm-mmu-fix-overlap-between-public-and-private-memslots.patch
and it can be found in the queue-4.14 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From b28676bb8ae4569cced423dc2a88f7cb319d5379 Mon Sep 17 00:00:00 2001
From: Wanpeng Li <wanpeng.li(a)hotmail.com>
Date: Tue, 13 Feb 2018 15:36:00 +0100
Subject: KVM: mmu: Fix overlap between public and private memslots
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
From: Wanpeng Li <wanpeng.li(a)hotmail.com>
commit b28676bb8ae4569cced423dc2a88f7cb319d5379 upstream.
Reported by syzkaller:
pte_list_remove: ffff9714eb1f8078 0->BUG
------------[ cut here ]------------
kernel BUG at arch/x86/kvm/mmu.c:1157!
invalid opcode: 0000 [#1] SMP
RIP: 0010:pte_list_remove+0x11b/0x120 [kvm]
Call Trace:
drop_spte+0x83/0xb0 [kvm]
mmu_page_zap_pte+0xcc/0xe0 [kvm]
kvm_mmu_prepare_zap_page+0x81/0x4a0 [kvm]
kvm_mmu_invalidate_zap_all_pages+0x159/0x220 [kvm]
kvm_arch_flush_shadow_all+0xe/0x10 [kvm]
kvm_mmu_notifier_release+0x6c/0xa0 [kvm]
? kvm_mmu_notifier_release+0x5/0xa0 [kvm]
__mmu_notifier_release+0x79/0x110
? __mmu_notifier_release+0x5/0x110
exit_mmap+0x15a/0x170
? do_exit+0x281/0xcb0
mmput+0x66/0x160
do_exit+0x2c9/0xcb0
? __context_tracking_exit.part.5+0x4a/0x150
do_group_exit+0x50/0xd0
SyS_exit_group+0x14/0x20
do_syscall_64+0x73/0x1f0
entry_SYSCALL64_slow_path+0x25/0x25
The reason is that when creates new memslot, there is no guarantee for new
memslot not overlap with private memslots. This can be triggered by the
following program:
#include <fcntl.h>
#include <pthread.h>
#include <setjmp.h>
#include <signal.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/ioctl.h>
#include <sys/stat.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>
#include <linux/kvm.h>
long r[16];
int main()
{
void *p = valloc(0x4000);
r[2] = open("/dev/kvm", 0);
r[3] = ioctl(r[2], KVM_CREATE_VM, 0x0ul);
uint64_t addr = 0xf000;
ioctl(r[3], KVM_SET_IDENTITY_MAP_ADDR, &addr);
r[6] = ioctl(r[3], KVM_CREATE_VCPU, 0x0ul);
ioctl(r[3], KVM_SET_TSS_ADDR, 0x0ul);
ioctl(r[6], KVM_RUN, 0);
ioctl(r[6], KVM_RUN, 0);
struct kvm_userspace_memory_region mr = {
.slot = 0,
.flags = KVM_MEM_LOG_DIRTY_PAGES,
.guest_phys_addr = 0xf000,
.memory_size = 0x4000,
.userspace_addr = (uintptr_t) p
};
ioctl(r[3], KVM_SET_USER_MEMORY_REGION, &mr);
return 0;
}
This patch fixes the bug by not adding a new memslot even if it
overlaps with private memslots.
Reported-by: Dmitry Vyukov <dvyukov(a)google.com>
Cc: Paolo Bonzini <pbonzini(a)redhat.com>
Cc: Radim Krčmář <rkrcmar(a)redhat.com>
Cc: Dmitry Vyukov <dvyukov(a)google.com>
Cc: Eric Biggers <ebiggers3(a)gmail.com>
Cc: stable(a)vger.kernel.org
Signed-off-by: Wanpeng Li <wanpeng.li(a)hotmail.com>
---
virt/kvm/kvm_main.c | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -975,8 +975,7 @@ int __kvm_set_memory_region(struct kvm *
/* Check for overlaps */
r = -EEXIST;
kvm_for_each_memslot(slot, __kvm_memslots(kvm, as_id)) {
- if ((slot->id >= KVM_USER_MEM_SLOTS) ||
- (slot->id == id))
+ if (slot->id == id)
continue;
if (!((base_gfn + npages <= slot->base_gfn) ||
(base_gfn >= slot->base_gfn + slot->npages)))
Patches currently in stable-queue which might be from wanpeng.li(a)hotmail.com are
queue-4.14/kvm-mmu-fix-overlap-between-public-and-private-memslots.patch
This is a note to let you know that I've just added the patch titled
blk-mq: don't call io sched's .requeue_request when requeueing rq to ->dispatch
to the 4.14-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
blk-mq-don-t-call-io-sched-s-.requeue_request-when-requeueing-rq-to-dispatch.patch
and it can be found in the queue-4.14 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From 105976f517791aed3b11f8f53b308a2069d42055 Mon Sep 17 00:00:00 2001
From: Ming Lei <ming.lei(a)redhat.com>
Date: Fri, 23 Feb 2018 23:36:56 +0800
Subject: blk-mq: don't call io sched's .requeue_request when requeueing rq to ->dispatch
From: Ming Lei <ming.lei(a)redhat.com>
commit 105976f517791aed3b11f8f53b308a2069d42055 upstream.
__blk_mq_requeue_request() covers two cases:
- one is that the requeued request is added to hctx->dispatch, such as
blk_mq_dispatch_rq_list()
- another case is that the request is requeued to io scheduler, such as
blk_mq_requeue_request().
We should call io sched's .requeue_request callback only for the 2nd
case.
Cc: Paolo Valente <paolo.valente(a)linaro.org>
Cc: Omar Sandoval <osandov(a)fb.com>
Fixes: bd166ef183c2 ("blk-mq-sched: add framework for MQ capable IO schedulers")
Cc: stable(a)vger.kernel.org
Reviewed-by: Bart Van Assche <bart.vanassche(a)wdc.com>
Acked-by: Paolo Valente <paolo.valente(a)linaro.org>
Signed-off-by: Ming Lei <ming.lei(a)redhat.com>
Signed-off-by: Jens Axboe <axboe(a)kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
block/blk-mq.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -638,7 +638,6 @@ static void __blk_mq_requeue_request(str
trace_block_rq_requeue(q, rq);
wbt_requeue(q->rq_wb, &rq->issue_stat);
- blk_mq_sched_requeue_request(rq);
if (test_and_clear_bit(REQ_ATOM_STARTED, &rq->atomic_flags)) {
if (q->dma_drain_size && blk_rq_bytes(rq))
@@ -650,6 +649,9 @@ void blk_mq_requeue_request(struct reque
{
__blk_mq_requeue_request(rq);
+ /* this request will be re-inserted to io scheduler queue */
+ blk_mq_sched_requeue_request(rq);
+
BUG_ON(blk_queued_rq(rq));
blk_mq_add_to_requeue_list(rq, true, kick_requeue_list);
}
Patches currently in stable-queue which might be from ming.lei(a)redhat.com are
queue-4.14/blk-mq-don-t-call-io-sched-s-.requeue_request-when-requeueing-rq-to-dispatch.patch
queue-4.14/block-kyber-fix-domain-token-leak-during-requeue.patch
This is a note to let you know that I've just added the patch titled
ARM: mvebu: Fix broken PL310_ERRATA_753970 selects
to the 4.14-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
arm-mvebu-fix-broken-pl310_errata_753970-selects.patch
and it can be found in the queue-4.14 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From 8aa36a8dcde3183d84db7b0d622ffddcebb61077 Mon Sep 17 00:00:00 2001
From: Ulf Magnusson <ulfalizer(a)gmail.com>
Date: Mon, 5 Feb 2018 02:21:13 +0100
Subject: ARM: mvebu: Fix broken PL310_ERRATA_753970 selects
From: Ulf Magnusson <ulfalizer(a)gmail.com>
commit 8aa36a8dcde3183d84db7b0d622ffddcebb61077 upstream.
The MACH_ARMADA_375 and MACH_ARMADA_38X boards select ARM_ERRATA_753970,
but it was renamed to PL310_ERRATA_753970 by commit fa0ce4035d48 ("ARM:
7162/1: errata: tidy up Kconfig options for PL310 errata workarounds").
Fix the selects to use the new name.
Discovered with the
https://github.com/ulfalizer/Kconfiglib/blob/master/examples/list_undefined…
script.
Fixes: fa0ce4035d48 ("ARM: 7162/1: errata: tidy up Kconfig options for
PL310 errata workarounds"
cc: stable(a)vger.kernel.org
Signed-off-by: Ulf Magnusson <ulfalizer(a)gmail.com>
Signed-off-by: Gregory CLEMENT <gregory.clement(a)bootlin.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
arch/arm/mach-mvebu/Kconfig | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
--- a/arch/arm/mach-mvebu/Kconfig
+++ b/arch/arm/mach-mvebu/Kconfig
@@ -42,7 +42,7 @@ config MACH_ARMADA_375
depends on ARCH_MULTI_V7
select ARMADA_370_XP_IRQ
select ARM_ERRATA_720789
- select ARM_ERRATA_753970
+ select PL310_ERRATA_753970
select ARM_GIC
select ARMADA_375_CLK
select HAVE_ARM_SCU
@@ -58,7 +58,7 @@ config MACH_ARMADA_38X
bool "Marvell Armada 380/385 boards"
depends on ARCH_MULTI_V7
select ARM_ERRATA_720789
- select ARM_ERRATA_753970
+ select PL310_ERRATA_753970
select ARM_GIC
select ARM_GLOBAL_TIMER
select CLKSRC_ARM_GLOBAL_TIMER_SCHED_CLOCK
Patches currently in stable-queue which might be from ulfalizer(a)gmail.com are
queue-4.14/arm-mvebu-fix-broken-pl310_errata_753970-selects.patch
This is a note to let you know that I've just added the patch titled
ARM: orion: fix orion_ge00_switch_board_info initialization
to the 4.14-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
arm-orion-fix-orion_ge00_switch_board_info-initialization.patch
and it can be found in the queue-4.14 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From 8337d083507b9827dfb36d545538b7789df834fd Mon Sep 17 00:00:00 2001
From: Arnd Bergmann <arnd(a)arndb.de>
Date: Wed, 21 Feb 2018 13:18:49 +0100
Subject: ARM: orion: fix orion_ge00_switch_board_info initialization
From: Arnd Bergmann <arnd(a)arndb.de>
commit 8337d083507b9827dfb36d545538b7789df834fd upstream.
A section type mismatch warning shows up when building with LTO,
since orion_ge00_mvmdio_bus_name was put in __initconst but not marked
const itself:
include/linux/of.h: In function 'spear_setup_of_timer':
arch/arm/mach-spear/time.c:207:34: error: 'timer_of_match' causes a section type conflict with 'orion_ge00_mvmdio_bus_name'
static const struct of_device_id timer_of_match[] __initconst = {
^
arch/arm/plat-orion/common.c:475:32: note: 'orion_ge00_mvmdio_bus_name' was declared here
static __initconst const char *orion_ge00_mvmdio_bus_name = "orion-mii";
^
As pointed out by Andrew Lunn, it should in fact be 'const' but not
'__initconst' because the string is never copied but may be accessed
after the init sections are freed. To fix that, I get rid of the
extra symbol and rewrite the initialization in a simpler way that
assigns both the bus_id and modalias statically.
I spotted another theoretical bug in the same place, where d->netdev[i]
may be an out of bounds access, this can be fixed by moving the device
assignment into the loop.
Cc: stable(a)vger.kernel.org
Reviewed-by: Andrew Lunn <andrew(a)lunn.ch>
Signed-off-by: Arnd Bergmann <arnd(a)arndb.de>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
arch/arm/plat-orion/common.c | 23 +++++++++++------------
1 file changed, 11 insertions(+), 12 deletions(-)
--- a/arch/arm/plat-orion/common.c
+++ b/arch/arm/plat-orion/common.c
@@ -472,28 +472,27 @@ void __init orion_ge11_init(struct mv643
/*****************************************************************************
* Ethernet switch
****************************************************************************/
-static __initconst const char *orion_ge00_mvmdio_bus_name = "orion-mii";
-static __initdata struct mdio_board_info
- orion_ge00_switch_board_info;
+static __initdata struct mdio_board_info orion_ge00_switch_board_info = {
+ .bus_id = "orion-mii",
+ .modalias = "mv88e6085",
+};
void __init orion_ge00_switch_init(struct dsa_chip_data *d)
{
- struct mdio_board_info *bd;
unsigned int i;
if (!IS_BUILTIN(CONFIG_PHYLIB))
return;
- for (i = 0; i < ARRAY_SIZE(d->port_names); i++)
- if (!strcmp(d->port_names[i], "cpu"))
+ for (i = 0; i < ARRAY_SIZE(d->port_names); i++) {
+ if (!strcmp(d->port_names[i], "cpu")) {
+ d->netdev[i] = &orion_ge00.dev;
break;
+ }
+ }
- bd = &orion_ge00_switch_board_info;
- bd->bus_id = orion_ge00_mvmdio_bus_name;
- bd->mdio_addr = d->sw_addr;
- d->netdev[i] = &orion_ge00.dev;
- strcpy(bd->modalias, "mv88e6085");
- bd->platform_data = d;
+ orion_ge00_switch_board_info.mdio_addr = d->sw_addr;
+ orion_ge00_switch_board_info.platform_data = d;
mdiobus_register_board_info(&orion_ge00_switch_board_info, 1);
}
Patches currently in stable-queue which might be from arnd(a)arndb.de are
queue-4.14/tpm-constify-transmit-data-pointers.patch
queue-4.14/ipv6-sit-work-around-bogus-gcc-8-wrestrict-warning.patch
queue-4.14/arm-kvm-fix-building-with-gcc-8.patch
queue-4.14/arm-orion-fix-orion_ge00_switch_board_info-initialization.patch
This is a note to let you know that I've just added the patch titled
ARM: kvm: fix building with gcc-8
to the 4.14-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
arm-kvm-fix-building-with-gcc-8.patch
and it can be found in the queue-4.14 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From 67870eb1204223598ea6d8a4467b482e9f5875b5 Mon Sep 17 00:00:00 2001
From: Arnd Bergmann <arnd(a)arndb.de>
Date: Fri, 2 Feb 2018 16:07:34 +0100
Subject: ARM: kvm: fix building with gcc-8
From: Arnd Bergmann <arnd(a)arndb.de>
commit 67870eb1204223598ea6d8a4467b482e9f5875b5 upstream.
In banked-sr.c, we use a top-level '__asm__(".arch_extension virt")'
statement to allow compilation of a multi-CPU kernel for ARMv6
and older ARMv7-A that don't normally support access to the banked
registers.
This is considered to be a programming error by the gcc developers
and will no longer work in gcc-8, where we now get a build error:
/tmp/cc4Qy7GR.s:34: Error: Banked registers are not available with this architecture. -- `mrs r3,SP_usr'
/tmp/cc4Qy7GR.s:41: Error: Banked registers are not available with this architecture. -- `mrs r3,ELR_hyp'
/tmp/cc4Qy7GR.s:55: Error: Banked registers are not available with this architecture. -- `mrs r3,SP_svc'
/tmp/cc4Qy7GR.s:62: Error: Banked registers are not available with this architecture. -- `mrs r3,LR_svc'
/tmp/cc4Qy7GR.s:69: Error: Banked registers are not available with this architecture. -- `mrs r3,SPSR_svc'
/tmp/cc4Qy7GR.s:76: Error: Banked registers are not available with this architecture. -- `mrs r3,SP_abt'
Passign the '-march-armv7ve' flag to gcc works, and is ok here, because
we know the functions won't ever be called on pre-ARMv7VE machines.
Unfortunately, older compiler versions (4.8 and earlier) do not understand
that flag, so we still need to keep the asm around.
Backporting to stable kernels (4.6+) is needed to allow those to be built
with future compilers as well.
Link: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84129
Fixes: 33280b4cd1dc ("ARM: KVM: Add banked registers save/restore")
Cc: stable(a)vger.kernel.org
Signed-off-by: Arnd Bergmann <arnd(a)arndb.de>
Signed-off-by: Christoffer Dall <christoffer.dall(a)linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
arch/arm/kvm/hyp/Makefile | 5 +++++
arch/arm/kvm/hyp/banked-sr.c | 4 ++++
2 files changed, 9 insertions(+)
--- a/arch/arm/kvm/hyp/Makefile
+++ b/arch/arm/kvm/hyp/Makefile
@@ -7,6 +7,8 @@ ccflags-y += -fno-stack-protector -DDISA
KVM=../../../../virt/kvm
+CFLAGS_ARMV7VE :=$(call cc-option, -march=armv7ve)
+
obj-$(CONFIG_KVM_ARM_HOST) += $(KVM)/arm/hyp/vgic-v2-sr.o
obj-$(CONFIG_KVM_ARM_HOST) += $(KVM)/arm/hyp/vgic-v3-sr.o
obj-$(CONFIG_KVM_ARM_HOST) += $(KVM)/arm/hyp/timer-sr.o
@@ -15,7 +17,10 @@ obj-$(CONFIG_KVM_ARM_HOST) += tlb.o
obj-$(CONFIG_KVM_ARM_HOST) += cp15-sr.o
obj-$(CONFIG_KVM_ARM_HOST) += vfp.o
obj-$(CONFIG_KVM_ARM_HOST) += banked-sr.o
+CFLAGS_banked-sr.o += $(CFLAGS_ARMV7VE)
+
obj-$(CONFIG_KVM_ARM_HOST) += entry.o
obj-$(CONFIG_KVM_ARM_HOST) += hyp-entry.o
obj-$(CONFIG_KVM_ARM_HOST) += switch.o
+CFLAGS_switch.o += $(CFLAGS_ARMV7VE)
obj-$(CONFIG_KVM_ARM_HOST) += s2-setup.o
--- a/arch/arm/kvm/hyp/banked-sr.c
+++ b/arch/arm/kvm/hyp/banked-sr.c
@@ -20,6 +20,10 @@
#include <asm/kvm_hyp.h>
+/*
+ * gcc before 4.9 doesn't understand -march=armv7ve, so we have to
+ * trick the assembler.
+ */
__asm__(".arch_extension virt");
void __hyp_text __banked_save_state(struct kvm_cpu_context *ctxt)
Patches currently in stable-queue which might be from arnd(a)arndb.de are
queue-4.14/tpm-constify-transmit-data-pointers.patch
queue-4.14/ipv6-sit-work-around-bogus-gcc-8-wrestrict-warning.patch
queue-4.14/arm-kvm-fix-building-with-gcc-8.patch
queue-4.14/arm-orion-fix-orion_ge00_switch_board_info-initialization.patch
This is a note to let you know that I've just added the patch titled
ARM: dts: rockchip: Remove 1.8 GHz operation point from phycore som
to the 4.14-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
arm-dts-rockchip-remove-1.8-ghz-operation-point-from-phycore-som.patch
and it can be found in the queue-4.14 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From 5ce0bad4ccd04c8a989e94d3c89e4e796ac22e48 Mon Sep 17 00:00:00 2001
From: Daniel Schultz <d.schultz(a)phytec.de>
Date: Tue, 13 Feb 2018 10:44:32 +0100
Subject: ARM: dts: rockchip: Remove 1.8 GHz operation point from phycore som
From: Daniel Schultz <d.schultz(a)phytec.de>
commit 5ce0bad4ccd04c8a989e94d3c89e4e796ac22e48 upstream.
Rockchip recommends to run the CPU cores only with operations points of
1.6 GHz or lower.
Removed the cpu0 node with too high operation points and use the default
values instead.
Fixes: 903d31e34628 ("ARM: dts: rockchip: Add support for phyCORE-RK3288 SoM")
Cc: stable(a)vger.kernel.org
Signed-off-by: Daniel Schultz <d.schultz(a)phytec.de>
Signed-off-by: Heiko Stuebner <heiko(a)sntech.de>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
arch/arm/boot/dts/rk3288-phycore-som.dtsi | 20 --------------------
1 file changed, 20 deletions(-)
--- a/arch/arm/boot/dts/rk3288-phycore-som.dtsi
+++ b/arch/arm/boot/dts/rk3288-phycore-som.dtsi
@@ -110,26 +110,6 @@
};
};
-&cpu0 {
- cpu0-supply = <&vdd_cpu>;
- operating-points = <
- /* KHz uV */
- 1800000 1400000
- 1608000 1350000
- 1512000 1300000
- 1416000 1200000
- 1200000 1100000
- 1008000 1050000
- 816000 1000000
- 696000 950000
- 600000 900000
- 408000 900000
- 312000 900000
- 216000 900000
- 126000 900000
- >;
-};
-
&emmc {
status = "okay";
bus-width = <8>;
Patches currently in stable-queue which might be from d.schultz(a)phytec.de are
queue-4.14/arm-dts-rockchip-remove-1.8-ghz-operation-point-from-phycore-som.patch
This is a note to let you know that I've just added the patch titled
KVM: X86: Fix SMRAM accessing even if VM is shutdown
to the 3.18-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
kvm-x86-fix-smram-accessing-even-if-vm-is-shutdown.patch
and it can be found in the queue-3.18 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From 95e057e25892eaa48cad1e2d637b80d0f1a4fac5 Mon Sep 17 00:00:00 2001
From: Wanpeng Li <wanpengli(a)tencent.com>
Date: Thu, 8 Feb 2018 15:32:45 +0800
Subject: KVM: X86: Fix SMRAM accessing even if VM is shutdown
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
From: Wanpeng Li <wanpengli(a)tencent.com>
commit 95e057e25892eaa48cad1e2d637b80d0f1a4fac5 upstream.
Reported by syzkaller:
WARNING: CPU: 6 PID: 2434 at arch/x86/kvm/vmx.c:6660 handle_ept_misconfig+0x54/0x1e0 [kvm_intel]
CPU: 6 PID: 2434 Comm: repro_test Not tainted 4.15.0+ #4
RIP: 0010:handle_ept_misconfig+0x54/0x1e0 [kvm_intel]
Call Trace:
vmx_handle_exit+0xbd/0xe20 [kvm_intel]
kvm_arch_vcpu_ioctl_run+0xdaf/0x1d50 [kvm]
kvm_vcpu_ioctl+0x3e9/0x720 [kvm]
do_vfs_ioctl+0xa4/0x6a0
SyS_ioctl+0x79/0x90
entry_SYSCALL_64_fastpath+0x25/0x9c
The testcase creates a first thread to issue KVM_SMI ioctl, and then creates
a second thread to mmap and operate on the same vCPU. This triggers a race
condition when running the testcase with multiple threads. Sometimes one thread
exits with a triple fault while another thread mmaps and operates on the same
vCPU. Because CS=0x3000/IP=0x8000 is not mapped, accessing the SMI handler
results in an EPT misconfig. This patch fixes it by returning RET_PF_EMULATE
in kvm_handle_bad_page(), which will go on to cause an emulation failure and an
exit with KVM_EXIT_INTERNAL_ERROR.
Reported-by: syzbot+c1d9517cab094dae65e446c0c5b4de6c40f4dc58(a)syzkaller.appspotmail.com
Cc: Paolo Bonzini <pbonzini(a)redhat.com>
Cc: Radim Krčmář <rkrcmar(a)redhat.com>
Cc: stable(a)vger.kernel.org
Signed-off-by: Wanpeng Li <wanpengli(a)tencent.com>
Signed-off-by: Paolo Bonzini <pbonzini(a)redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
arch/x86/kvm/mmu.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
--- a/arch/x86/kvm/mmu.c
+++ b/arch/x86/kvm/mmu.c
@@ -2698,7 +2698,7 @@ static int kvm_handle_bad_page(struct kv
return 0;
}
- return -EFAULT;
+ return RET_PF_EMULATE;
}
static void transparent_hugepage_adjust(struct kvm_vcpu *vcpu,
Patches currently in stable-queue which might be from wanpengli(a)tencent.com are
queue-3.18/kvm-x86-fix-smram-accessing-even-if-vm-is-shutdown.patch
This is a note to let you know that I've just added the patch titled
ARM: mvebu: Fix broken PL310_ERRATA_753970 selects
to the 3.18-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
arm-mvebu-fix-broken-pl310_errata_753970-selects.patch
and it can be found in the queue-3.18 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From 8aa36a8dcde3183d84db7b0d622ffddcebb61077 Mon Sep 17 00:00:00 2001
From: Ulf Magnusson <ulfalizer(a)gmail.com>
Date: Mon, 5 Feb 2018 02:21:13 +0100
Subject: ARM: mvebu: Fix broken PL310_ERRATA_753970 selects
From: Ulf Magnusson <ulfalizer(a)gmail.com>
commit 8aa36a8dcde3183d84db7b0d622ffddcebb61077 upstream.
The MACH_ARMADA_375 and MACH_ARMADA_38X boards select ARM_ERRATA_753970,
but it was renamed to PL310_ERRATA_753970 by commit fa0ce4035d48 ("ARM:
7162/1: errata: tidy up Kconfig options for PL310 errata workarounds").
Fix the selects to use the new name.
Discovered with the
https://github.com/ulfalizer/Kconfiglib/blob/master/examples/list_undefined…
script.
Fixes: fa0ce4035d48 ("ARM: 7162/1: errata: tidy up Kconfig options for
PL310 errata workarounds"
cc: stable(a)vger.kernel.org
Signed-off-by: Ulf Magnusson <ulfalizer(a)gmail.com>
Signed-off-by: Gregory CLEMENT <gregory.clement(a)bootlin.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
arch/arm/mach-mvebu/Kconfig | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
--- a/arch/arm/mach-mvebu/Kconfig
+++ b/arch/arm/mach-mvebu/Kconfig
@@ -37,7 +37,7 @@ config MACH_ARMADA_370
config MACH_ARMADA_375
bool "Marvell Armada 375 boards" if ARCH_MULTI_V7
select ARM_ERRATA_720789
- select ARM_ERRATA_753970
+ select PL310_ERRATA_753970
select ARM_GIC
select ARMADA_375_CLK
select HAVE_ARM_SCU
@@ -52,7 +52,7 @@ config MACH_ARMADA_375
config MACH_ARMADA_38X
bool "Marvell Armada 380/385 boards" if ARCH_MULTI_V7
select ARM_ERRATA_720789
- select ARM_ERRATA_753970
+ select PL310_ERRATA_753970
select ARM_GIC
select ARMADA_38X_CLK
select HAVE_ARM_SCU
Patches currently in stable-queue which might be from ulfalizer(a)gmail.com are
queue-3.18/arm-mvebu-fix-broken-pl310_errata_753970-selects.patch
The patch below does not apply to the 4.14-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From fe2a3027e74e40a3ece3a4c1e4e51403090a907a Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Radim=20Kr=C4=8Dm=C3=A1=C5=99?= <rkrcmar(a)redhat.com>
Date: Thu, 1 Feb 2018 22:16:21 +0100
Subject: [PATCH] KVM: x86: fix backward migration with async_PF
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
Guests on new hypersiors might set KVM_ASYNC_PF_DELIVERY_AS_PF_VMEXIT
bit when enabling async_PF, but this bit is reserved on old hypervisors,
which results in a failure upon migration.
To avoid breaking different cases, we are checking for CPUID feature bit
before enabling the feature and nothing else.
Fixes: 52a5c155cf79 ("KVM: async_pf: Let guest support delivery of async_pf from guest mode")
Cc: <stable(a)vger.kernel.org>
Reviewed-by: Wanpeng Li <wanpengli(a)tencent.com>
Reviewed-by: David Hildenbrand <david(a)redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar(a)redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini(a)redhat.com>
diff --git a/Documentation/virtual/kvm/cpuid.txt b/Documentation/virtual/kvm/cpuid.txt
index dcab6dc11e3b..87a7506f31c2 100644
--- a/Documentation/virtual/kvm/cpuid.txt
+++ b/Documentation/virtual/kvm/cpuid.txt
@@ -58,6 +58,10 @@ KVM_FEATURE_PV_TLB_FLUSH || 9 || guest checks this feature bit
|| || before enabling paravirtualized
|| || tlb flush.
------------------------------------------------------------------------------
+KVM_FEATURE_ASYNC_PF_VMEXIT || 10 || paravirtualized async PF VM exit
+ || || can be enabled by setting bit 2
+ || || when writing to msr 0x4b564d02
+------------------------------------------------------------------------------
KVM_FEATURE_CLOCKSOURCE_STABLE_BIT || 24 || host will warn if no guest-side
|| || per-cpu warps are expected in
|| || kvmclock.
diff --git a/Documentation/virtual/kvm/msr.txt b/Documentation/virtual/kvm/msr.txt
index 1ebecc115dc6..f3f0d57ced8e 100644
--- a/Documentation/virtual/kvm/msr.txt
+++ b/Documentation/virtual/kvm/msr.txt
@@ -170,7 +170,8 @@ MSR_KVM_ASYNC_PF_EN: 0x4b564d02
when asynchronous page faults are enabled on the vcpu 0 when
disabled. Bit 1 is 1 if asynchronous page faults can be injected
when vcpu is in cpl == 0. Bit 2 is 1 if asynchronous page faults
- are delivered to L1 as #PF vmexits.
+ are delivered to L1 as #PF vmexits. Bit 2 can be set only if
+ KVM_FEATURE_ASYNC_PF_VMEXIT is present in CPUID.
First 4 byte of 64 byte memory location will be written to by
the hypervisor at the time of asynchronous page fault (APF)
diff --git a/arch/x86/include/uapi/asm/kvm_para.h b/arch/x86/include/uapi/asm/kvm_para.h
index 7a2ade4aa235..6cfa9c8cb7d6 100644
--- a/arch/x86/include/uapi/asm/kvm_para.h
+++ b/arch/x86/include/uapi/asm/kvm_para.h
@@ -26,6 +26,7 @@
#define KVM_FEATURE_PV_EOI 6
#define KVM_FEATURE_PV_UNHALT 7
#define KVM_FEATURE_PV_TLB_FLUSH 9
+#define KVM_FEATURE_ASYNC_PF_VMEXIT 10
/* The last 8 bits are used to indicate how to interpret the flags field
* in pvclock structure. If no bits are set, all flags are ignored.
diff --git a/arch/x86/kernel/kvm.c b/arch/x86/kernel/kvm.c
index 4e37d1a851a6..971babe964d2 100644
--- a/arch/x86/kernel/kvm.c
+++ b/arch/x86/kernel/kvm.c
@@ -341,10 +341,10 @@ static void kvm_guest_cpu_init(void)
#endif
pa |= KVM_ASYNC_PF_ENABLED;
- /* Async page fault support for L1 hypervisor is optional */
- if (wrmsr_safe(MSR_KVM_ASYNC_PF_EN,
- (pa | KVM_ASYNC_PF_DELIVERY_AS_PF_VMEXIT) & 0xffffffff, pa >> 32) < 0)
- wrmsrl(MSR_KVM_ASYNC_PF_EN, pa);
+ if (kvm_para_has_feature(KVM_FEATURE_ASYNC_PF_VMEXIT))
+ pa |= KVM_ASYNC_PF_DELIVERY_AS_PF_VMEXIT;
+
+ wrmsrl(MSR_KVM_ASYNC_PF_EN, pa);
__this_cpu_write(apf_reason.enabled, 1);
printk(KERN_INFO"KVM setup async PF for cpu %d\n",
smp_processor_id());
diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
index a0c5a69bc7c4..b671fc2d0422 100644
--- a/arch/x86/kvm/cpuid.c
+++ b/arch/x86/kvm/cpuid.c
@@ -607,7 +607,8 @@ static inline int __do_cpuid_ent(struct kvm_cpuid_entry2 *entry, u32 function,
(1 << KVM_FEATURE_PV_EOI) |
(1 << KVM_FEATURE_CLOCKSOURCE_STABLE_BIT) |
(1 << KVM_FEATURE_PV_UNHALT) |
- (1 << KVM_FEATURE_PV_TLB_FLUSH);
+ (1 << KVM_FEATURE_PV_TLB_FLUSH) |
+ (1 << KVM_FEATURE_ASYNC_PF_VMEXIT);
if (sched_info_on())
entry->eax |= (1 << KVM_FEATURE_STEAL_TIME);
The patch below does not apply to the 4.15-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From fe2a3027e74e40a3ece3a4c1e4e51403090a907a Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Radim=20Kr=C4=8Dm=C3=A1=C5=99?= <rkrcmar(a)redhat.com>
Date: Thu, 1 Feb 2018 22:16:21 +0100
Subject: [PATCH] KVM: x86: fix backward migration with async_PF
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
Guests on new hypersiors might set KVM_ASYNC_PF_DELIVERY_AS_PF_VMEXIT
bit when enabling async_PF, but this bit is reserved on old hypervisors,
which results in a failure upon migration.
To avoid breaking different cases, we are checking for CPUID feature bit
before enabling the feature and nothing else.
Fixes: 52a5c155cf79 ("KVM: async_pf: Let guest support delivery of async_pf from guest mode")
Cc: <stable(a)vger.kernel.org>
Reviewed-by: Wanpeng Li <wanpengli(a)tencent.com>
Reviewed-by: David Hildenbrand <david(a)redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar(a)redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini(a)redhat.com>
diff --git a/Documentation/virtual/kvm/cpuid.txt b/Documentation/virtual/kvm/cpuid.txt
index dcab6dc11e3b..87a7506f31c2 100644
--- a/Documentation/virtual/kvm/cpuid.txt
+++ b/Documentation/virtual/kvm/cpuid.txt
@@ -58,6 +58,10 @@ KVM_FEATURE_PV_TLB_FLUSH || 9 || guest checks this feature bit
|| || before enabling paravirtualized
|| || tlb flush.
------------------------------------------------------------------------------
+KVM_FEATURE_ASYNC_PF_VMEXIT || 10 || paravirtualized async PF VM exit
+ || || can be enabled by setting bit 2
+ || || when writing to msr 0x4b564d02
+------------------------------------------------------------------------------
KVM_FEATURE_CLOCKSOURCE_STABLE_BIT || 24 || host will warn if no guest-side
|| || per-cpu warps are expected in
|| || kvmclock.
diff --git a/Documentation/virtual/kvm/msr.txt b/Documentation/virtual/kvm/msr.txt
index 1ebecc115dc6..f3f0d57ced8e 100644
--- a/Documentation/virtual/kvm/msr.txt
+++ b/Documentation/virtual/kvm/msr.txt
@@ -170,7 +170,8 @@ MSR_KVM_ASYNC_PF_EN: 0x4b564d02
when asynchronous page faults are enabled on the vcpu 0 when
disabled. Bit 1 is 1 if asynchronous page faults can be injected
when vcpu is in cpl == 0. Bit 2 is 1 if asynchronous page faults
- are delivered to L1 as #PF vmexits.
+ are delivered to L1 as #PF vmexits. Bit 2 can be set only if
+ KVM_FEATURE_ASYNC_PF_VMEXIT is present in CPUID.
First 4 byte of 64 byte memory location will be written to by
the hypervisor at the time of asynchronous page fault (APF)
diff --git a/arch/x86/include/uapi/asm/kvm_para.h b/arch/x86/include/uapi/asm/kvm_para.h
index 7a2ade4aa235..6cfa9c8cb7d6 100644
--- a/arch/x86/include/uapi/asm/kvm_para.h
+++ b/arch/x86/include/uapi/asm/kvm_para.h
@@ -26,6 +26,7 @@
#define KVM_FEATURE_PV_EOI 6
#define KVM_FEATURE_PV_UNHALT 7
#define KVM_FEATURE_PV_TLB_FLUSH 9
+#define KVM_FEATURE_ASYNC_PF_VMEXIT 10
/* The last 8 bits are used to indicate how to interpret the flags field
* in pvclock structure. If no bits are set, all flags are ignored.
diff --git a/arch/x86/kernel/kvm.c b/arch/x86/kernel/kvm.c
index 4e37d1a851a6..971babe964d2 100644
--- a/arch/x86/kernel/kvm.c
+++ b/arch/x86/kernel/kvm.c
@@ -341,10 +341,10 @@ static void kvm_guest_cpu_init(void)
#endif
pa |= KVM_ASYNC_PF_ENABLED;
- /* Async page fault support for L1 hypervisor is optional */
- if (wrmsr_safe(MSR_KVM_ASYNC_PF_EN,
- (pa | KVM_ASYNC_PF_DELIVERY_AS_PF_VMEXIT) & 0xffffffff, pa >> 32) < 0)
- wrmsrl(MSR_KVM_ASYNC_PF_EN, pa);
+ if (kvm_para_has_feature(KVM_FEATURE_ASYNC_PF_VMEXIT))
+ pa |= KVM_ASYNC_PF_DELIVERY_AS_PF_VMEXIT;
+
+ wrmsrl(MSR_KVM_ASYNC_PF_EN, pa);
__this_cpu_write(apf_reason.enabled, 1);
printk(KERN_INFO"KVM setup async PF for cpu %d\n",
smp_processor_id());
diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
index a0c5a69bc7c4..b671fc2d0422 100644
--- a/arch/x86/kvm/cpuid.c
+++ b/arch/x86/kvm/cpuid.c
@@ -607,7 +607,8 @@ static inline int __do_cpuid_ent(struct kvm_cpuid_entry2 *entry, u32 function,
(1 << KVM_FEATURE_PV_EOI) |
(1 << KVM_FEATURE_CLOCKSOURCE_STABLE_BIT) |
(1 << KVM_FEATURE_PV_UNHALT) |
- (1 << KVM_FEATURE_PV_TLB_FLUSH);
+ (1 << KVM_FEATURE_PV_TLB_FLUSH) |
+ (1 << KVM_FEATURE_ASYNC_PF_VMEXIT);
if (sched_info_on())
entry->eax |= (1 << KVM_FEATURE_STEAL_TIME);
The patch below does not apply to the 4.14-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From d60d8b64280c8b36c085eda7821585c1ce911795 Mon Sep 17 00:00:00 2001
From: Christoffer Dall <christoffer.dall(a)linaro.org>
Date: Fri, 26 Jan 2018 16:06:51 +0100
Subject: [PATCH] KVM: arm/arm64: Fix arch timers with userspace irqchips
When introducing support for irqchip in userspace we needed a way to
mask the timer signal to prevent the guest continuously exiting due to a
screaming timer.
We did this by disabling the corresponding percpu interrupt on the
host interrupt controller, because we cannot rely on the host system
having a GIC, and therefore cannot make any assumptions about having an
active state to hide the timer signal.
Unfortunately, when introducing this feature, it became entirely
possible that a VCPU which belongs to a VM that has a userspace irqchip
can disable the vtimer irq on the host on some physical CPU, and then go
away without ever enabling the vtimer irq on that physical CPU again.
This means that using irqchips in userspace on a system that also
supports running VMs with an in-kernel GIC can prevent forward progress
from in-kernel GIC VMs.
Later on, when we started taking virtual timer interrupts in the arch
timer code, we would also leave this timer state active for userspace
irqchip VMs, because we leave it up to a VGIC-enabled guest to
deactivate the hardware IRQ using the HW bit in the LR.
Both issues are solved by only using the enable/disable trick on systems
that do not have a host GIC which supports the active state, because all
VMs on such systems must use irqchips in userspace. Systems that have a
working GIC with support for an active state use the active state to
mask the timer signal for both userspace and in-kernel irqchips.
Cc: Alexander Graf <agraf(a)suse.de>
Reviewed-by: Marc Zyngier <marc.zyngier(a)arm.com>
Cc: <stable(a)vger.kernel.org> # v4.12+
Fixes: d9e139778376 ("KVM: arm/arm64: Support arch timers with a userspace gic")
Signed-off-by: Christoffer Dall <christoffer.dall(a)linaro.org>
diff --git a/virt/kvm/arm/arch_timer.c b/virt/kvm/arm/arch_timer.c
index 70268c0bec79..70f4c30918eb 100644
--- a/virt/kvm/arm/arch_timer.c
+++ b/virt/kvm/arm/arch_timer.c
@@ -36,6 +36,8 @@ static struct timecounter *timecounter;
static unsigned int host_vtimer_irq;
static u32 host_vtimer_irq_flags;
+static DEFINE_STATIC_KEY_FALSE(has_gic_active_state);
+
static const struct kvm_irq_level default_ptimer_irq = {
.irq = 30,
.level = 1,
@@ -56,6 +58,12 @@ u64 kvm_phys_timer_read(void)
return timecounter->cc->read(timecounter->cc);
}
+static inline bool userspace_irqchip(struct kvm *kvm)
+{
+ return static_branch_unlikely(&userspace_irqchip_in_use) &&
+ unlikely(!irqchip_in_kernel(kvm));
+}
+
static void soft_timer_start(struct hrtimer *hrt, u64 ns)
{
hrtimer_start(hrt, ktime_add_ns(ktime_get(), ns),
@@ -69,25 +77,6 @@ static void soft_timer_cancel(struct hrtimer *hrt, struct work_struct *work)
cancel_work_sync(work);
}
-static void kvm_vtimer_update_mask_user(struct kvm_vcpu *vcpu)
-{
- struct arch_timer_context *vtimer = vcpu_vtimer(vcpu);
-
- /*
- * When using a userspace irqchip with the architected timers, we must
- * prevent continuously exiting from the guest, and therefore mask the
- * physical interrupt by disabling it on the host interrupt controller
- * when the virtual level is high, such that the guest can make
- * forward progress. Once we detect the output level being
- * de-asserted, we unmask the interrupt again so that we exit from the
- * guest when the timer fires.
- */
- if (vtimer->irq.level)
- disable_percpu_irq(host_vtimer_irq);
- else
- enable_percpu_irq(host_vtimer_irq, 0);
-}
-
static irqreturn_t kvm_arch_timer_handler(int irq, void *dev_id)
{
struct kvm_vcpu *vcpu = *(struct kvm_vcpu **)dev_id;
@@ -106,9 +95,9 @@ static irqreturn_t kvm_arch_timer_handler(int irq, void *dev_id)
if (kvm_timer_should_fire(vtimer))
kvm_timer_update_irq(vcpu, true, vtimer);
- if (static_branch_unlikely(&userspace_irqchip_in_use) &&
- unlikely(!irqchip_in_kernel(vcpu->kvm)))
- kvm_vtimer_update_mask_user(vcpu);
+ if (userspace_irqchip(vcpu->kvm) &&
+ !static_branch_unlikely(&has_gic_active_state))
+ disable_percpu_irq(host_vtimer_irq);
return IRQ_HANDLED;
}
@@ -290,8 +279,7 @@ static void kvm_timer_update_irq(struct kvm_vcpu *vcpu, bool new_level,
trace_kvm_timer_update_irq(vcpu->vcpu_id, timer_ctx->irq.irq,
timer_ctx->irq.level);
- if (!static_branch_unlikely(&userspace_irqchip_in_use) ||
- likely(irqchip_in_kernel(vcpu->kvm))) {
+ if (!userspace_irqchip(vcpu->kvm)) {
ret = kvm_vgic_inject_irq(vcpu->kvm, vcpu->vcpu_id,
timer_ctx->irq.irq,
timer_ctx->irq.level,
@@ -350,12 +338,6 @@ static void kvm_timer_update_state(struct kvm_vcpu *vcpu)
phys_timer_emulate(vcpu);
}
-static void __timer_snapshot_state(struct arch_timer_context *timer)
-{
- timer->cnt_ctl = read_sysreg_el0(cntv_ctl);
- timer->cnt_cval = read_sysreg_el0(cntv_cval);
-}
-
static void vtimer_save_state(struct kvm_vcpu *vcpu)
{
struct arch_timer_cpu *timer = &vcpu->arch.timer_cpu;
@@ -367,8 +349,10 @@ static void vtimer_save_state(struct kvm_vcpu *vcpu)
if (!vtimer->loaded)
goto out;
- if (timer->enabled)
- __timer_snapshot_state(vtimer);
+ if (timer->enabled) {
+ vtimer->cnt_ctl = read_sysreg_el0(cntv_ctl);
+ vtimer->cnt_cval = read_sysreg_el0(cntv_cval);
+ }
/* Disable the virtual timer */
write_sysreg_el0(0, cntv_ctl);
@@ -460,23 +444,43 @@ static void set_cntvoff(u64 cntvoff)
kvm_call_hyp(__kvm_timer_set_cntvoff, low, high);
}
-static void kvm_timer_vcpu_load_vgic(struct kvm_vcpu *vcpu)
+static inline void set_vtimer_irq_phys_active(struct kvm_vcpu *vcpu, bool active)
+{
+ int r;
+ r = irq_set_irqchip_state(host_vtimer_irq, IRQCHIP_STATE_ACTIVE, active);
+ WARN_ON(r);
+}
+
+static void kvm_timer_vcpu_load_gic(struct kvm_vcpu *vcpu)
{
struct arch_timer_context *vtimer = vcpu_vtimer(vcpu);
bool phys_active;
- int ret;
- phys_active = kvm_vgic_map_is_active(vcpu, vtimer->irq.irq);
-
- ret = irq_set_irqchip_state(host_vtimer_irq,
- IRQCHIP_STATE_ACTIVE,
- phys_active);
- WARN_ON(ret);
+ if (irqchip_in_kernel(vcpu->kvm))
+ phys_active = kvm_vgic_map_is_active(vcpu, vtimer->irq.irq);
+ else
+ phys_active = vtimer->irq.level;
+ set_vtimer_irq_phys_active(vcpu, phys_active);
}
-static void kvm_timer_vcpu_load_user(struct kvm_vcpu *vcpu)
+static void kvm_timer_vcpu_load_nogic(struct kvm_vcpu *vcpu)
{
- kvm_vtimer_update_mask_user(vcpu);
+ struct arch_timer_context *vtimer = vcpu_vtimer(vcpu);
+
+ /*
+ * When using a userspace irqchip with the architected timers and a
+ * host interrupt controller that doesn't support an active state, we
+ * must still prevent continuously exiting from the guest, and
+ * therefore mask the physical interrupt by disabling it on the host
+ * interrupt controller when the virtual level is high, such that the
+ * guest can make forward progress. Once we detect the output level
+ * being de-asserted, we unmask the interrupt again so that we exit
+ * from the guest when the timer fires.
+ */
+ if (vtimer->irq.level)
+ disable_percpu_irq(host_vtimer_irq);
+ else
+ enable_percpu_irq(host_vtimer_irq, host_vtimer_irq_flags);
}
void kvm_timer_vcpu_load(struct kvm_vcpu *vcpu)
@@ -487,10 +491,10 @@ void kvm_timer_vcpu_load(struct kvm_vcpu *vcpu)
if (unlikely(!timer->enabled))
return;
- if (unlikely(!irqchip_in_kernel(vcpu->kvm)))
- kvm_timer_vcpu_load_user(vcpu);
+ if (static_branch_likely(&has_gic_active_state))
+ kvm_timer_vcpu_load_gic(vcpu);
else
- kvm_timer_vcpu_load_vgic(vcpu);
+ kvm_timer_vcpu_load_nogic(vcpu);
set_cntvoff(vtimer->cntvoff);
@@ -555,18 +559,24 @@ static void unmask_vtimer_irq_user(struct kvm_vcpu *vcpu)
{
struct arch_timer_context *vtimer = vcpu_vtimer(vcpu);
- if (unlikely(!irqchip_in_kernel(vcpu->kvm))) {
- __timer_snapshot_state(vtimer);
- if (!kvm_timer_should_fire(vtimer)) {
- kvm_timer_update_irq(vcpu, false, vtimer);
- kvm_vtimer_update_mask_user(vcpu);
- }
+ if (!kvm_timer_should_fire(vtimer)) {
+ kvm_timer_update_irq(vcpu, false, vtimer);
+ if (static_branch_likely(&has_gic_active_state))
+ set_vtimer_irq_phys_active(vcpu, false);
+ else
+ enable_percpu_irq(host_vtimer_irq, host_vtimer_irq_flags);
}
}
void kvm_timer_sync_hwstate(struct kvm_vcpu *vcpu)
{
- unmask_vtimer_irq_user(vcpu);
+ struct arch_timer_cpu *timer = &vcpu->arch.timer_cpu;
+
+ if (unlikely(!timer->enabled))
+ return;
+
+ if (unlikely(!irqchip_in_kernel(vcpu->kvm)))
+ unmask_vtimer_irq_user(vcpu);
}
int kvm_timer_vcpu_reset(struct kvm_vcpu *vcpu)
@@ -753,6 +763,8 @@ int kvm_timer_hyp_init(bool has_gic)
kvm_err("kvm_arch_timer: error setting vcpu affinity\n");
goto out_free_irq;
}
+
+ static_branch_enable(&has_gic_active_state);
}
kvm_info("virtual timer IRQ%d\n", host_vtimer_irq);
The patch below does not apply to the 4.15-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From d60d8b64280c8b36c085eda7821585c1ce911795 Mon Sep 17 00:00:00 2001
From: Christoffer Dall <christoffer.dall(a)linaro.org>
Date: Fri, 26 Jan 2018 16:06:51 +0100
Subject: [PATCH] KVM: arm/arm64: Fix arch timers with userspace irqchips
When introducing support for irqchip in userspace we needed a way to
mask the timer signal to prevent the guest continuously exiting due to a
screaming timer.
We did this by disabling the corresponding percpu interrupt on the
host interrupt controller, because we cannot rely on the host system
having a GIC, and therefore cannot make any assumptions about having an
active state to hide the timer signal.
Unfortunately, when introducing this feature, it became entirely
possible that a VCPU which belongs to a VM that has a userspace irqchip
can disable the vtimer irq on the host on some physical CPU, and then go
away without ever enabling the vtimer irq on that physical CPU again.
This means that using irqchips in userspace on a system that also
supports running VMs with an in-kernel GIC can prevent forward progress
from in-kernel GIC VMs.
Later on, when we started taking virtual timer interrupts in the arch
timer code, we would also leave this timer state active for userspace
irqchip VMs, because we leave it up to a VGIC-enabled guest to
deactivate the hardware IRQ using the HW bit in the LR.
Both issues are solved by only using the enable/disable trick on systems
that do not have a host GIC which supports the active state, because all
VMs on such systems must use irqchips in userspace. Systems that have a
working GIC with support for an active state use the active state to
mask the timer signal for both userspace and in-kernel irqchips.
Cc: Alexander Graf <agraf(a)suse.de>
Reviewed-by: Marc Zyngier <marc.zyngier(a)arm.com>
Cc: <stable(a)vger.kernel.org> # v4.12+
Fixes: d9e139778376 ("KVM: arm/arm64: Support arch timers with a userspace gic")
Signed-off-by: Christoffer Dall <christoffer.dall(a)linaro.org>
diff --git a/virt/kvm/arm/arch_timer.c b/virt/kvm/arm/arch_timer.c
index 70268c0bec79..70f4c30918eb 100644
--- a/virt/kvm/arm/arch_timer.c
+++ b/virt/kvm/arm/arch_timer.c
@@ -36,6 +36,8 @@ static struct timecounter *timecounter;
static unsigned int host_vtimer_irq;
static u32 host_vtimer_irq_flags;
+static DEFINE_STATIC_KEY_FALSE(has_gic_active_state);
+
static const struct kvm_irq_level default_ptimer_irq = {
.irq = 30,
.level = 1,
@@ -56,6 +58,12 @@ u64 kvm_phys_timer_read(void)
return timecounter->cc->read(timecounter->cc);
}
+static inline bool userspace_irqchip(struct kvm *kvm)
+{
+ return static_branch_unlikely(&userspace_irqchip_in_use) &&
+ unlikely(!irqchip_in_kernel(kvm));
+}
+
static void soft_timer_start(struct hrtimer *hrt, u64 ns)
{
hrtimer_start(hrt, ktime_add_ns(ktime_get(), ns),
@@ -69,25 +77,6 @@ static void soft_timer_cancel(struct hrtimer *hrt, struct work_struct *work)
cancel_work_sync(work);
}
-static void kvm_vtimer_update_mask_user(struct kvm_vcpu *vcpu)
-{
- struct arch_timer_context *vtimer = vcpu_vtimer(vcpu);
-
- /*
- * When using a userspace irqchip with the architected timers, we must
- * prevent continuously exiting from the guest, and therefore mask the
- * physical interrupt by disabling it on the host interrupt controller
- * when the virtual level is high, such that the guest can make
- * forward progress. Once we detect the output level being
- * de-asserted, we unmask the interrupt again so that we exit from the
- * guest when the timer fires.
- */
- if (vtimer->irq.level)
- disable_percpu_irq(host_vtimer_irq);
- else
- enable_percpu_irq(host_vtimer_irq, 0);
-}
-
static irqreturn_t kvm_arch_timer_handler(int irq, void *dev_id)
{
struct kvm_vcpu *vcpu = *(struct kvm_vcpu **)dev_id;
@@ -106,9 +95,9 @@ static irqreturn_t kvm_arch_timer_handler(int irq, void *dev_id)
if (kvm_timer_should_fire(vtimer))
kvm_timer_update_irq(vcpu, true, vtimer);
- if (static_branch_unlikely(&userspace_irqchip_in_use) &&
- unlikely(!irqchip_in_kernel(vcpu->kvm)))
- kvm_vtimer_update_mask_user(vcpu);
+ if (userspace_irqchip(vcpu->kvm) &&
+ !static_branch_unlikely(&has_gic_active_state))
+ disable_percpu_irq(host_vtimer_irq);
return IRQ_HANDLED;
}
@@ -290,8 +279,7 @@ static void kvm_timer_update_irq(struct kvm_vcpu *vcpu, bool new_level,
trace_kvm_timer_update_irq(vcpu->vcpu_id, timer_ctx->irq.irq,
timer_ctx->irq.level);
- if (!static_branch_unlikely(&userspace_irqchip_in_use) ||
- likely(irqchip_in_kernel(vcpu->kvm))) {
+ if (!userspace_irqchip(vcpu->kvm)) {
ret = kvm_vgic_inject_irq(vcpu->kvm, vcpu->vcpu_id,
timer_ctx->irq.irq,
timer_ctx->irq.level,
@@ -350,12 +338,6 @@ static void kvm_timer_update_state(struct kvm_vcpu *vcpu)
phys_timer_emulate(vcpu);
}
-static void __timer_snapshot_state(struct arch_timer_context *timer)
-{
- timer->cnt_ctl = read_sysreg_el0(cntv_ctl);
- timer->cnt_cval = read_sysreg_el0(cntv_cval);
-}
-
static void vtimer_save_state(struct kvm_vcpu *vcpu)
{
struct arch_timer_cpu *timer = &vcpu->arch.timer_cpu;
@@ -367,8 +349,10 @@ static void vtimer_save_state(struct kvm_vcpu *vcpu)
if (!vtimer->loaded)
goto out;
- if (timer->enabled)
- __timer_snapshot_state(vtimer);
+ if (timer->enabled) {
+ vtimer->cnt_ctl = read_sysreg_el0(cntv_ctl);
+ vtimer->cnt_cval = read_sysreg_el0(cntv_cval);
+ }
/* Disable the virtual timer */
write_sysreg_el0(0, cntv_ctl);
@@ -460,23 +444,43 @@ static void set_cntvoff(u64 cntvoff)
kvm_call_hyp(__kvm_timer_set_cntvoff, low, high);
}
-static void kvm_timer_vcpu_load_vgic(struct kvm_vcpu *vcpu)
+static inline void set_vtimer_irq_phys_active(struct kvm_vcpu *vcpu, bool active)
+{
+ int r;
+ r = irq_set_irqchip_state(host_vtimer_irq, IRQCHIP_STATE_ACTIVE, active);
+ WARN_ON(r);
+}
+
+static void kvm_timer_vcpu_load_gic(struct kvm_vcpu *vcpu)
{
struct arch_timer_context *vtimer = vcpu_vtimer(vcpu);
bool phys_active;
- int ret;
- phys_active = kvm_vgic_map_is_active(vcpu, vtimer->irq.irq);
-
- ret = irq_set_irqchip_state(host_vtimer_irq,
- IRQCHIP_STATE_ACTIVE,
- phys_active);
- WARN_ON(ret);
+ if (irqchip_in_kernel(vcpu->kvm))
+ phys_active = kvm_vgic_map_is_active(vcpu, vtimer->irq.irq);
+ else
+ phys_active = vtimer->irq.level;
+ set_vtimer_irq_phys_active(vcpu, phys_active);
}
-static void kvm_timer_vcpu_load_user(struct kvm_vcpu *vcpu)
+static void kvm_timer_vcpu_load_nogic(struct kvm_vcpu *vcpu)
{
- kvm_vtimer_update_mask_user(vcpu);
+ struct arch_timer_context *vtimer = vcpu_vtimer(vcpu);
+
+ /*
+ * When using a userspace irqchip with the architected timers and a
+ * host interrupt controller that doesn't support an active state, we
+ * must still prevent continuously exiting from the guest, and
+ * therefore mask the physical interrupt by disabling it on the host
+ * interrupt controller when the virtual level is high, such that the
+ * guest can make forward progress. Once we detect the output level
+ * being de-asserted, we unmask the interrupt again so that we exit
+ * from the guest when the timer fires.
+ */
+ if (vtimer->irq.level)
+ disable_percpu_irq(host_vtimer_irq);
+ else
+ enable_percpu_irq(host_vtimer_irq, host_vtimer_irq_flags);
}
void kvm_timer_vcpu_load(struct kvm_vcpu *vcpu)
@@ -487,10 +491,10 @@ void kvm_timer_vcpu_load(struct kvm_vcpu *vcpu)
if (unlikely(!timer->enabled))
return;
- if (unlikely(!irqchip_in_kernel(vcpu->kvm)))
- kvm_timer_vcpu_load_user(vcpu);
+ if (static_branch_likely(&has_gic_active_state))
+ kvm_timer_vcpu_load_gic(vcpu);
else
- kvm_timer_vcpu_load_vgic(vcpu);
+ kvm_timer_vcpu_load_nogic(vcpu);
set_cntvoff(vtimer->cntvoff);
@@ -555,18 +559,24 @@ static void unmask_vtimer_irq_user(struct kvm_vcpu *vcpu)
{
struct arch_timer_context *vtimer = vcpu_vtimer(vcpu);
- if (unlikely(!irqchip_in_kernel(vcpu->kvm))) {
- __timer_snapshot_state(vtimer);
- if (!kvm_timer_should_fire(vtimer)) {
- kvm_timer_update_irq(vcpu, false, vtimer);
- kvm_vtimer_update_mask_user(vcpu);
- }
+ if (!kvm_timer_should_fire(vtimer)) {
+ kvm_timer_update_irq(vcpu, false, vtimer);
+ if (static_branch_likely(&has_gic_active_state))
+ set_vtimer_irq_phys_active(vcpu, false);
+ else
+ enable_percpu_irq(host_vtimer_irq, host_vtimer_irq_flags);
}
}
void kvm_timer_sync_hwstate(struct kvm_vcpu *vcpu)
{
- unmask_vtimer_irq_user(vcpu);
+ struct arch_timer_cpu *timer = &vcpu->arch.timer_cpu;
+
+ if (unlikely(!timer->enabled))
+ return;
+
+ if (unlikely(!irqchip_in_kernel(vcpu->kvm)))
+ unmask_vtimer_irq_user(vcpu);
}
int kvm_timer_vcpu_reset(struct kvm_vcpu *vcpu)
@@ -753,6 +763,8 @@ int kvm_timer_hyp_init(bool has_gic)
kvm_err("kvm_arch_timer: error setting vcpu affinity\n");
goto out_free_irq;
}
+
+ static_branch_enable(&has_gic_active_state);
}
kvm_info("virtual timer IRQ%d\n", host_vtimer_irq);
The patch below does not apply to the 4.15-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From c37406e05d1e541df40b8b81c4bd40753fcaf414 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Christian=20K=C3=B6nig?= <ckoenig.leichtzumerken(a)gmail.com>
Date: Mon, 26 Feb 2018 14:51:13 -0600
Subject: [PATCH] PCI: Allow release of resources that were never assigned
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
It is entirely possible that the BIOS wasn't able to assign resources to a
device. In this case don't crash in pci_release_resource() when we try to
resize the resource.
Fixes: 8bb705e3e79d ("PCI: Add pci_resize_resource() for resizing BARs")
Signed-off-by: Christian König <christian.koenig(a)amd.com>
Signed-off-by: Bjorn Helgaas <bhelgaas(a)google.com>
Reviewed-by: Andy Shevchenko <andy.shevchenko(a)gmail.com>
CC: stable(a)vger.kernel.org # v4.15+
diff --git a/drivers/pci/setup-res.c b/drivers/pci/setup-res.c
index 369d48d6c6f1..365447240d95 100644
--- a/drivers/pci/setup-res.c
+++ b/drivers/pci/setup-res.c
@@ -401,6 +401,10 @@ void pci_release_resource(struct pci_dev *dev, int resno)
struct resource *res = dev->resource + resno;
pci_info(dev, "BAR %d: releasing %pR\n", resno, res);
+
+ if (!res->parent)
+ return;
+
release_resource(res);
res->end = resource_size(res) - 1;
res->start = 0;
This is a note to let you know that I've just added the patch titled
virtio-net: disable NAPI only when enabled during XDP set
to the 4.15-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
virtio-net-disable-napi-only-when-enabled-during-xdp-set.patch
and it can be found in the queue-4.15 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Tue Mar 6 19:02:56 PST 2018
From: Jason Wang <jasowang(a)redhat.com>
Date: Wed, 28 Feb 2018 18:20:04 +0800
Subject: virtio-net: disable NAPI only when enabled during XDP set
From: Jason Wang <jasowang(a)redhat.com>
[ Upstream commit 4e09ff5362843dff3accfa84c805c7f3a99de9cd ]
We try to disable NAPI to prevent a single XDP TX queue being used by
multiple cpus. But we don't check if device is up (NAPI is enabled),
this could result stall because of infinite wait in
napi_disable(). Fixing this by checking device state through
netif_running() before.
Fixes: 4941d472bf95b ("virtio-net: do not reset during XDP set")
Signed-off-by: Jason Wang <jasowang(a)redhat.com>
Acked-by: Michael S. Tsirkin <mst(a)redhat.com>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/net/virtio_net.c | 8 +++++---
1 file changed, 5 insertions(+), 3 deletions(-)
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -2040,8 +2040,9 @@ static int virtnet_xdp_set(struct net_de
}
/* Make sure NAPI is not using any XDP TX queues for RX. */
- for (i = 0; i < vi->max_queue_pairs; i++)
- napi_disable(&vi->rq[i].napi);
+ if (netif_running(dev))
+ for (i = 0; i < vi->max_queue_pairs; i++)
+ napi_disable(&vi->rq[i].napi);
netif_set_real_num_rx_queues(dev, curr_qp + xdp_qp);
err = _virtnet_set_queues(vi, curr_qp + xdp_qp);
@@ -2060,7 +2061,8 @@ static int virtnet_xdp_set(struct net_de
}
if (old_prog)
bpf_prog_put(old_prog);
- virtnet_napi_enable(vi->rq[i].vq, &vi->rq[i].napi);
+ if (netif_running(dev))
+ virtnet_napi_enable(vi->rq[i].vq, &vi->rq[i].napi);
}
return 0;
Patches currently in stable-queue which might be from jasowang(a)redhat.com are
queue-4.15/tuntap-correctly-add-the-missing-xdp-flush.patch
queue-4.15/virtio-net-disable-napi-only-when-enabled-during-xdp-set.patch
queue-4.15/tuntap-disable-preemption-during-xdp-processing.patch
This is a note to let you know that I've just added the patch titled
udplite: fix partial checksum initialization
to the 4.15-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
udplite-fix-partial-checksum-initialization.patch
and it can be found in the queue-4.15 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Tue Mar 6 19:02:56 PST 2018
From: Alexey Kodanev <alexey.kodanev(a)oracle.com>
Date: Thu, 15 Feb 2018 20:18:43 +0300
Subject: udplite: fix partial checksum initialization
From: Alexey Kodanev <alexey.kodanev(a)oracle.com>
[ Upstream commit 15f35d49c93f4fa9875235e7bf3e3783d2dd7a1b ]
Since UDP-Lite is always using checksum, the following path is
triggered when calculating pseudo header for it:
udp4_csum_init() or udp6_csum_init()
skb_checksum_init_zero_check()
__skb_checksum_validate_complete()
The problem can appear if skb->len is less than CHECKSUM_BREAK. In
this particular case __skb_checksum_validate_complete() also invokes
__skb_checksum_complete(skb). If UDP-Lite is using partial checksum
that covers only part of a packet, the function will return bad
checksum and the packet will be dropped.
It can be fixed if we skip skb_checksum_init_zero_check() and only
set the required pseudo header checksum for UDP-Lite with partial
checksum before udp4_csum_init()/udp6_csum_init() functions return.
Fixes: ed70fcfcee95 ("net: Call skb_checksum_init in IPv4")
Fixes: e4f45b7f40bd ("net: Call skb_checksum_init in IPv6")
Signed-off-by: Alexey Kodanev <alexey.kodanev(a)oracle.com>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
include/net/udplite.h | 1 +
net/ipv4/udp.c | 5 +++++
net/ipv6/ip6_checksum.c | 5 +++++
3 files changed, 11 insertions(+)
--- a/include/net/udplite.h
+++ b/include/net/udplite.h
@@ -64,6 +64,7 @@ static inline int udplite_checksum_init(
UDP_SKB_CB(skb)->cscov = cscov;
if (skb->ip_summed == CHECKSUM_COMPLETE)
skb->ip_summed = CHECKSUM_NONE;
+ skb->csum_valid = 0;
}
return 0;
--- a/net/ipv4/udp.c
+++ b/net/ipv4/udp.c
@@ -2031,6 +2031,11 @@ static inline int udp4_csum_init(struct
err = udplite_checksum_init(skb, uh);
if (err)
return err;
+
+ if (UDP_SKB_CB(skb)->partial_cov) {
+ skb->csum = inet_compute_pseudo(skb, proto);
+ return 0;
+ }
}
/* Note, we are only interested in != 0 or == 0, thus the
--- a/net/ipv6/ip6_checksum.c
+++ b/net/ipv6/ip6_checksum.c
@@ -73,6 +73,11 @@ int udp6_csum_init(struct sk_buff *skb,
err = udplite_checksum_init(skb, uh);
if (err)
return err;
+
+ if (UDP_SKB_CB(skb)->partial_cov) {
+ skb->csum = ip6_compute_pseudo(skb, proto);
+ return 0;
+ }
}
/* To support RFC 6936 (allow zero checksum in UDP/IPV6 for tunnels)
Patches currently in stable-queue which might be from alexey.kodanev(a)oracle.com are
queue-4.15/sctp-fix-dst-refcnt-leak-in-sctp_v6_get_dst.patch
queue-4.15/udplite-fix-partial-checksum-initialization.patch
queue-4.15/sctp-verify-size-of-a-new-chunk-in-_sctp_make_chunk.patch
This is a note to let you know that I've just added the patch titled
tuntap: disable preemption during XDP processing
to the 4.15-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
tuntap-disable-preemption-during-xdp-processing.patch
and it can be found in the queue-4.15 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Tue Mar 6 19:02:56 PST 2018
From: Jason Wang <jasowang(a)redhat.com>
Date: Sat, 24 Feb 2018 11:32:25 +0800
Subject: tuntap: disable preemption during XDP processing
From: Jason Wang <jasowang(a)redhat.com>
[ Upstream commit 23e43f07f896f8578318cfcc9466f1e8b8ab21b6 ]
Except for tuntap, all other drivers' XDP was implemented at NAPI
poll() routine in a bh. This guarantees all XDP operation were done at
the same CPU which is required by e.g BFP_MAP_TYPE_PERCPU_ARRAY. But
for tuntap, we do it in process context and we try to protect XDP
processing by RCU reader lock. This is insufficient since
CONFIG_PREEMPT_RCU can preempt the RCU reader critical section which
breaks the assumption that all XDP were processed in the same CPU.
Fixing this by simply disabling preemption during XDP processing.
Fixes: 761876c857cb ("tap: XDP support")
Signed-off-by: Jason Wang <jasowang(a)redhat.com>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/net/tun.c | 6 ++++++
1 file changed, 6 insertions(+)
--- a/drivers/net/tun.c
+++ b/drivers/net/tun.c
@@ -1471,6 +1471,7 @@ static struct sk_buff *tun_build_skb(str
else
*skb_xdp = 0;
+ preempt_disable();
rcu_read_lock();
xdp_prog = rcu_dereference(tun->xdp_prog);
if (xdp_prog && !*skb_xdp) {
@@ -1494,6 +1495,7 @@ static struct sk_buff *tun_build_skb(str
if (err)
goto err_redirect;
rcu_read_unlock();
+ preempt_enable();
return NULL;
case XDP_TX:
xdp_xmit = true;
@@ -1515,6 +1517,7 @@ static struct sk_buff *tun_build_skb(str
skb = build_skb(buf, buflen);
if (!skb) {
rcu_read_unlock();
+ preempt_enable();
return ERR_PTR(-ENOMEM);
}
@@ -1527,10 +1530,12 @@ static struct sk_buff *tun_build_skb(str
skb->dev = tun->dev;
generic_xdp_tx(skb, xdp_prog);
rcu_read_unlock();
+ preempt_enable();
return NULL;
}
rcu_read_unlock();
+ preempt_enable();
return skb;
@@ -1538,6 +1543,7 @@ err_redirect:
put_page(alloc_frag->page);
err_xdp:
rcu_read_unlock();
+ preempt_enable();
this_cpu_inc(tun->pcpu_stats->rx_dropped);
return NULL;
}
Patches currently in stable-queue which might be from jasowang(a)redhat.com are
queue-4.15/tuntap-correctly-add-the-missing-xdp-flush.patch
queue-4.15/virtio-net-disable-napi-only-when-enabled-during-xdp-set.patch
queue-4.15/tuntap-disable-preemption-during-xdp-processing.patch
This is a note to let you know that I've just added the patch titled
tuntap: correctly add the missing XDP flush
to the 4.15-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
tuntap-correctly-add-the-missing-xdp-flush.patch
and it can be found in the queue-4.15 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Tue Mar 6 19:02:56 PST 2018
From: Jason Wang <jasowang(a)redhat.com>
Date: Sat, 24 Feb 2018 11:32:26 +0800
Subject: tuntap: correctly add the missing XDP flush
From: Jason Wang <jasowang(a)redhat.com>
[ Upstream commit 1bb4f2e868a2891ab8bc668b8173d6ccb8c4ce6f ]
We don't flush batched XDP packets through xdp_do_flush_map(), this
will cause packets stall at TX queue. Consider we don't do XDP on NAPI
poll(), the only possible fix is to call xdp_do_flush_map()
immediately after xdp_do_redirect().
Note, this in fact won't try to batch packets through devmap, we could
address in the future.
Reported-by: Christoffer Dall <christoffer.dall(a)linaro.org>
Fixes: 761876c857cb ("tap: XDP support")
Signed-off-by: Jason Wang <jasowang(a)redhat.com>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/net/tun.c | 1 +
1 file changed, 1 insertion(+)
--- a/drivers/net/tun.c
+++ b/drivers/net/tun.c
@@ -1490,6 +1490,7 @@ static struct sk_buff *tun_build_skb(str
get_page(alloc_frag->page);
alloc_frag->offset += buflen;
err = xdp_do_redirect(tun->dev, &xdp, xdp_prog);
+ xdp_do_flush_map();
if (err)
goto err_redirect;
rcu_read_unlock();
Patches currently in stable-queue which might be from jasowang(a)redhat.com are
queue-4.15/tuntap-correctly-add-the-missing-xdp-flush.patch
queue-4.15/virtio-net-disable-napi-only-when-enabled-during-xdp-set.patch
queue-4.15/tuntap-disable-preemption-during-xdp-processing.patch
This is a note to let you know that I've just added the patch titled
tls: Use correct sk->sk_prot for IPV6
to the 4.15-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
tls-use-correct-sk-sk_prot-for-ipv6.patch
and it can be found in the queue-4.15 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Tue Mar 6 19:02:56 PST 2018
From: Boris Pismenny <borisp(a)mellanox.com>
Date: Tue, 27 Feb 2018 14:18:39 +0200
Subject: tls: Use correct sk->sk_prot for IPV6
From: Boris Pismenny <borisp(a)mellanox.com>
[ Upstream commit c113187d38ff85dc302a1bb55864b203ebb2ba10 ]
The tls ulp overrides sk->prot with a new tls specific proto structs.
The tls specific structs were previously based on the ipv4 specific
tcp_prot sturct.
As a result, attaching the tls ulp to an ipv6 tcp socket replaced
some ipv6 callback with the ipv4 equivalents.
This patch adds ipv6 tls proto structs and uses them when
attached to ipv6 sockets.
Fixes: 3c4d7559159b ('tls: kernel TLS support')
Signed-off-by: Boris Pismenny <borisp(a)mellanox.com>
Signed-off-by: Ilya Lesokhin <ilyal(a)mellanox.com>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
net/tls/tls_main.c | 52 +++++++++++++++++++++++++++++++++++++---------------
1 file changed, 37 insertions(+), 15 deletions(-)
--- a/net/tls/tls_main.c
+++ b/net/tls/tls_main.c
@@ -46,16 +46,26 @@ MODULE_DESCRIPTION("Transport Layer Secu
MODULE_LICENSE("Dual BSD/GPL");
enum {
+ TLSV4,
+ TLSV6,
+ TLS_NUM_PROTS,
+};
+
+enum {
TLS_BASE_TX,
TLS_SW_TX,
TLS_NUM_CONFIG,
};
-static struct proto tls_prots[TLS_NUM_CONFIG];
+static struct proto *saved_tcpv6_prot;
+static DEFINE_MUTEX(tcpv6_prot_mutex);
+static struct proto tls_prots[TLS_NUM_PROTS][TLS_NUM_CONFIG];
static inline void update_sk_prot(struct sock *sk, struct tls_context *ctx)
{
- sk->sk_prot = &tls_prots[ctx->tx_conf];
+ int ip_ver = sk->sk_family == AF_INET6 ? TLSV6 : TLSV4;
+
+ sk->sk_prot = &tls_prots[ip_ver][ctx->tx_conf];
}
int wait_on_pending_writer(struct sock *sk, long *timeo)
@@ -450,8 +460,21 @@ static int tls_setsockopt(struct sock *s
return do_tls_setsockopt(sk, optname, optval, optlen);
}
+static void build_protos(struct proto *prot, struct proto *base)
+{
+ prot[TLS_BASE_TX] = *base;
+ prot[TLS_BASE_TX].setsockopt = tls_setsockopt;
+ prot[TLS_BASE_TX].getsockopt = tls_getsockopt;
+ prot[TLS_BASE_TX].close = tls_sk_proto_close;
+
+ prot[TLS_SW_TX] = prot[TLS_BASE_TX];
+ prot[TLS_SW_TX].sendmsg = tls_sw_sendmsg;
+ prot[TLS_SW_TX].sendpage = tls_sw_sendpage;
+}
+
static int tls_init(struct sock *sk)
{
+ int ip_ver = sk->sk_family == AF_INET6 ? TLSV6 : TLSV4;
struct inet_connection_sock *icsk = inet_csk(sk);
struct tls_context *ctx;
int rc = 0;
@@ -476,6 +499,17 @@ static int tls_init(struct sock *sk)
ctx->getsockopt = sk->sk_prot->getsockopt;
ctx->sk_proto_close = sk->sk_prot->close;
+ /* Build IPv6 TLS whenever the address of tcpv6_prot changes */
+ if (ip_ver == TLSV6 &&
+ unlikely(sk->sk_prot != smp_load_acquire(&saved_tcpv6_prot))) {
+ mutex_lock(&tcpv6_prot_mutex);
+ if (likely(sk->sk_prot != saved_tcpv6_prot)) {
+ build_protos(tls_prots[TLSV6], sk->sk_prot);
+ smp_store_release(&saved_tcpv6_prot, sk->sk_prot);
+ }
+ mutex_unlock(&tcpv6_prot_mutex);
+ }
+
ctx->tx_conf = TLS_BASE_TX;
update_sk_prot(sk, ctx);
out:
@@ -488,21 +522,9 @@ static struct tcp_ulp_ops tcp_tls_ulp_op
.init = tls_init,
};
-static void build_protos(struct proto *prot, struct proto *base)
-{
- prot[TLS_BASE_TX] = *base;
- prot[TLS_BASE_TX].setsockopt = tls_setsockopt;
- prot[TLS_BASE_TX].getsockopt = tls_getsockopt;
- prot[TLS_BASE_TX].close = tls_sk_proto_close;
-
- prot[TLS_SW_TX] = prot[TLS_BASE_TX];
- prot[TLS_SW_TX].sendmsg = tls_sw_sendmsg;
- prot[TLS_SW_TX].sendpage = tls_sw_sendpage;
-}
-
static int __init tls_register(void)
{
- build_protos(tls_prots, &tcp_prot);
+ build_protos(tls_prots[TLSV4], &tcp_prot);
tcp_register_ulp(&tcp_tls_ulp_ops);
Patches currently in stable-queue which might be from borisp(a)mellanox.com are
queue-4.15/tls-use-correct-sk-sk_prot-for-ipv6.patch
This is a note to let you know that I've just added the patch titled
tcp_bbr: better deal with suboptimal GSO
to the 4.15-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
tcp_bbr-better-deal-with-suboptimal-gso.patch
and it can be found in the queue-4.15 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Tue Mar 6 19:02:56 PST 2018
From: Eric Dumazet <edumazet(a)google.com>
Date: Wed, 21 Feb 2018 06:43:03 -0800
Subject: tcp_bbr: better deal with suboptimal GSO
From: Eric Dumazet <edumazet(a)google.com>
[ Upstream commit 350c9f484bde93ef229682eedd98cd5f74350f7f ]
BBR uses tcp_tso_autosize() in an attempt to probe what would be the
burst sizes and to adjust cwnd in bbr_target_cwnd() with following
gold formula :
/* Allow enough full-sized skbs in flight to utilize end systems. */
cwnd += 3 * bbr->tso_segs_goal;
But GSO can be lacking or be constrained to very small
units (ip link set dev ... gso_max_segs 2)
What we really want is to have enough packets in flight so that both
GSO and GRO are efficient.
So in the case GSO is off or downgraded, we still want to have the same
number of packets in flight as if GSO/TSO was fully operational, so
that GRO can hopefully be working efficiently.
To fix this issue, we make tcp_tso_autosize() unaware of
sk->sk_gso_max_segs
Only tcp_tso_segs() has to enforce the gso_max_segs limit.
Tested:
ethtool -K eth0 tso off gso off
tc qd replace dev eth0 root pfifo_fast
Before patch:
for f in {1..5}; do ./super_netperf 1 -H lpaa24 -- -K bbr; done
691 (ss -temoi shows cwnd is stuck around 6 )
667
651
631
517
After patch :
# for f in {1..5}; do ./super_netperf 1 -H lpaa24 -- -K bbr; done
1733 (ss -temoi shows cwnd is around 386 )
1778
1746
1781
1718
Fixes: 0f8782ea1497 ("tcp_bbr: add BBR congestion control")
Signed-off-by: Eric Dumazet <edumazet(a)google.com>
Reported-by: Oleksandr Natalenko <oleksandr(a)natalenko.name>
Acked-by: Neal Cardwell <ncardwell(a)google.com>
Acked-by: Soheil Hassas Yeganeh <soheil(a)google.com>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
net/ipv4/tcp_output.c | 9 +++++----
1 file changed, 5 insertions(+), 4 deletions(-)
--- a/net/ipv4/tcp_output.c
+++ b/net/ipv4/tcp_output.c
@@ -1730,7 +1730,7 @@ u32 tcp_tso_autosize(const struct sock *
*/
segs = max_t(u32, bytes / mss_now, min_tso_segs);
- return min_t(u32, segs, sk->sk_gso_max_segs);
+ return segs;
}
EXPORT_SYMBOL(tcp_tso_autosize);
@@ -1742,9 +1742,10 @@ static u32 tcp_tso_segs(struct sock *sk,
const struct tcp_congestion_ops *ca_ops = inet_csk(sk)->icsk_ca_ops;
u32 tso_segs = ca_ops->tso_segs_goal ? ca_ops->tso_segs_goal(sk) : 0;
- return tso_segs ? :
- tcp_tso_autosize(sk, mss_now,
- sock_net(sk)->ipv4.sysctl_tcp_min_tso_segs);
+ if (!tso_segs)
+ tso_segs = tcp_tso_autosize(sk, mss_now,
+ sock_net(sk)->ipv4.sysctl_tcp_min_tso_segs);
+ return min_t(u32, tso_segs, sk->sk_gso_max_segs);
}
/* Returns the portion of skb which can be sent right away */
Patches currently in stable-queue which might be from edumazet(a)google.com are
queue-4.15/doc-change-the-min-default-value-of-tcp_wmem-tcp_rmem.patch
queue-4.15/tcp-purge-write-queue-upon-rst.patch
queue-4.15/net_sched-gen_estimator-fix-broken-estimators-based-on-percpu-stats.patch
queue-4.15/tcp_bbr-better-deal-with-suboptimal-gso.patch
This is a note to let you know that I've just added the patch titled
tcp: tracepoint: only call trace_tcp_send_reset with full socket
to the 4.15-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
tcp-tracepoint-only-call-trace_tcp_send_reset-with-full-socket.patch
and it can be found in the queue-4.15 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Tue Mar 6 19:02:56 PST 2018
From: Song Liu <songliubraving(a)fb.com>
Date: Tue, 6 Feb 2018 20:50:23 -0800
Subject: tcp: tracepoint: only call trace_tcp_send_reset with full socket
From: Song Liu <songliubraving(a)fb.com>
[ Upstream commit 5c487bb9adddbc1d23433e09d2548759375c2b52 ]
tracepoint tcp_send_reset requires a full socket to work. However, it
may be called when in TCP_TIME_WAIT:
case TCP_TW_RST:
tcp_v6_send_reset(sk, skb);
inet_twsk_deschedule_put(inet_twsk(sk));
goto discard_it;
To avoid this problem, this patch checks the socket with sk_fullsock()
before calling trace_tcp_send_reset().
Fixes: c24b14c46bb8 ("tcp: add tracepoint trace_tcp_send_reset")
Signed-off-by: Song Liu <songliubraving(a)fb.com>
Reviewed-by: Lawrence Brakmo <brakmo(a)fb.com>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
net/ipv4/tcp_ipv4.c | 3 ++-
net/ipv6/tcp_ipv6.c | 3 ++-
2 files changed, 4 insertions(+), 2 deletions(-)
--- a/net/ipv4/tcp_ipv4.c
+++ b/net/ipv4/tcp_ipv4.c
@@ -705,7 +705,8 @@ static void tcp_v4_send_reset(const stru
*/
if (sk) {
arg.bound_dev_if = sk->sk_bound_dev_if;
- trace_tcp_send_reset(sk, skb);
+ if (sk_fullsock(sk))
+ trace_tcp_send_reset(sk, skb);
}
BUILD_BUG_ON(offsetof(struct sock, sk_bound_dev_if) !=
--- a/net/ipv6/tcp_ipv6.c
+++ b/net/ipv6/tcp_ipv6.c
@@ -943,7 +943,8 @@ static void tcp_v6_send_reset(const stru
if (sk) {
oif = sk->sk_bound_dev_if;
- trace_tcp_send_reset(sk, skb);
+ if (sk_fullsock(sk))
+ trace_tcp_send_reset(sk, skb);
}
tcp_v6_send_response(sk, skb, seq, ack_seq, 0, 0, 0, oif, key, 1, 0, 0);
Patches currently in stable-queue which might be from songliubraving(a)fb.com are
queue-4.15/tcp-tracepoint-only-call-trace_tcp_send_reset-with-full-socket.patch
This is a note to let you know that I've just added the patch titled
tcp: revert F-RTO middle-box workaround
to the 4.15-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
tcp-revert-f-rto-middle-box-workaround.patch
and it can be found in the queue-4.15 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Tue Mar 6 19:02:56 PST 2018
From: Yuchung Cheng <ycheng(a)google.com>
Date: Tue, 27 Feb 2018 14:15:01 -0800
Subject: tcp: revert F-RTO middle-box workaround
From: Yuchung Cheng <ycheng(a)google.com>
[ Upstream commit d4131f09770d9b7471c9da65e6ecd2477746ac5c ]
This reverts commit cc663f4d4c97b7297fb45135ab23cfd508b35a77. While fixing
some broken middle-boxes that modifies receive window fields, it does not
address middle-boxes that strip off SACK options. The best solution is
to fully revert this patch and the root F-RTO enhancement.
Fixes: cc663f4d4c97 ("tcp: restrict F-RTO to work-around broken middle-boxes")
Reported-by: Teodor Milkov <tm(a)del.bg>
Signed-off-by: Yuchung Cheng <ycheng(a)google.com>
Signed-off-by: Neal Cardwell <ncardwell(a)google.com>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
net/ipv4/tcp_input.c | 17 +++++++----------
1 file changed, 7 insertions(+), 10 deletions(-)
diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
index 45f750e85714..50963f92a67d 100644
--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -1915,7 +1915,6 @@ void tcp_enter_loss(struct sock *sk)
struct tcp_sock *tp = tcp_sk(sk);
struct net *net = sock_net(sk);
struct sk_buff *skb;
- bool new_recovery = icsk->icsk_ca_state < TCP_CA_Recovery;
bool is_reneg; /* is receiver reneging on SACKs? */
bool mark_lost;
@@ -1974,17 +1973,15 @@ void tcp_enter_loss(struct sock *sk)
tp->high_seq = tp->snd_nxt;
tcp_ecn_queue_cwr(tp);
- /* F-RTO RFC5682 sec 3.1 step 1: retransmit SND.UNA if no previous
- * loss recovery is underway except recurring timeout(s) on
- * the same SND.UNA (sec 3.2). Disable F-RTO on path MTU probing
- *
- * In theory F-RTO can be used repeatedly during loss recovery.
- * In practice this interacts badly with broken middle-boxes that
- * falsely raise the receive window, which results in repeated
- * timeouts and stop-and-go behavior.
+ /* F-RTO RFC5682 sec 3.1 step 1 mandates to disable F-RTO
+ * if a previous recovery is underway, otherwise it may incorrectly
+ * call a timeout spurious if some previously retransmitted packets
+ * are s/acked (sec 3.2). We do not apply that retriction since
+ * retransmitted skbs are permanently tagged with TCPCB_EVER_RETRANS
+ * so FLAG_ORIG_SACK_ACKED is always correct. But we do disable F-RTO
+ * on PTMU discovery to avoid sending new data.
*/
tp->frto = net->ipv4.sysctl_tcp_frto &&
- (new_recovery || icsk->icsk_retransmits) &&
!inet_csk(sk)->icsk_mtup.probe_size;
}
--
2.14.3
Patches currently in stable-queue which might be from ycheng(a)google.com are
queue-4.15/tcp-purge-write-queue-upon-rst.patch
queue-4.15/tcp-revert-f-rto-extension-to-detect-more-spurious-timeouts.patch
queue-4.15/tcp-revert-f-rto-middle-box-workaround.patch
This is a note to let you know that I've just added the patch titled
tcp: revert F-RTO extension to detect more spurious timeouts
to the 4.15-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
tcp-revert-f-rto-extension-to-detect-more-spurious-timeouts.patch
and it can be found in the queue-4.15 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Tue Mar 6 19:02:56 PST 2018
From: Yuchung Cheng <ycheng(a)google.com>
Date: Tue, 27 Feb 2018 14:15:02 -0800
Subject: tcp: revert F-RTO extension to detect more spurious timeouts
From: Yuchung Cheng <ycheng(a)google.com>
[ Upstream commit fc68e171d376c322e6777a3d7ac2f0278b68b17f ]
This reverts commit 89fe18e44f7ee5ab1c90d0dff5835acee7751427.
While the patch could detect more spurious timeouts, it could cause
poor TCP performance on broken middle-boxes that modifies TCP packets
(e.g. receive window, SACK options). Since the performance gain is
much smaller compared to the potential loss. The best solution is
to fully revert the change.
Fixes: 89fe18e44f7e ("tcp: extend F-RTO to catch more spurious timeouts")
Reported-by: Teodor Milkov <tm(a)del.bg>
Signed-off-by: Yuchung Cheng <ycheng(a)google.com>
Signed-off-by: Neal Cardwell <ncardwell(a)google.com>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
net/ipv4/tcp_input.c | 30 ++++++++++++------------------
1 file changed, 12 insertions(+), 18 deletions(-)
--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -1915,6 +1915,7 @@ void tcp_enter_loss(struct sock *sk)
struct tcp_sock *tp = tcp_sk(sk);
struct net *net = sock_net(sk);
struct sk_buff *skb;
+ bool new_recovery = icsk->icsk_ca_state < TCP_CA_Recovery;
bool is_reneg; /* is receiver reneging on SACKs? */
bool mark_lost;
@@ -1973,15 +1974,12 @@ void tcp_enter_loss(struct sock *sk)
tp->high_seq = tp->snd_nxt;
tcp_ecn_queue_cwr(tp);
- /* F-RTO RFC5682 sec 3.1 step 1 mandates to disable F-RTO
- * if a previous recovery is underway, otherwise it may incorrectly
- * call a timeout spurious if some previously retransmitted packets
- * are s/acked (sec 3.2). We do not apply that retriction since
- * retransmitted skbs are permanently tagged with TCPCB_EVER_RETRANS
- * so FLAG_ORIG_SACK_ACKED is always correct. But we do disable F-RTO
- * on PTMU discovery to avoid sending new data.
+ /* F-RTO RFC5682 sec 3.1 step 1: retransmit SND.UNA if no previous
+ * loss recovery is underway except recurring timeout(s) on
+ * the same SND.UNA (sec 3.2). Disable F-RTO on path MTU probing
*/
tp->frto = net->ipv4.sysctl_tcp_frto &&
+ (new_recovery || icsk->icsk_retransmits) &&
!inet_csk(sk)->icsk_mtup.probe_size;
}
@@ -2634,18 +2632,14 @@ static void tcp_process_loss(struct sock
tcp_try_undo_loss(sk, false))
return;
- /* The ACK (s)acks some never-retransmitted data meaning not all
- * the data packets before the timeout were lost. Therefore we
- * undo the congestion window and state. This is essentially
- * the operation in F-RTO (RFC5682 section 3.1 step 3.b). Since
- * a retransmitted skb is permantly marked, we can apply such an
- * operation even if F-RTO was not used.
- */
- if ((flag & FLAG_ORIG_SACK_ACKED) &&
- tcp_try_undo_loss(sk, tp->undo_marker))
- return;
-
if (tp->frto) { /* F-RTO RFC5682 sec 3.1 (sack enhanced version). */
+ /* Step 3.b. A timeout is spurious if not all data are
+ * lost, i.e., never-retransmitted data are (s)acked.
+ */
+ if ((flag & FLAG_ORIG_SACK_ACKED) &&
+ tcp_try_undo_loss(sk, true))
+ return;
+
if (after(tp->snd_nxt, tp->high_seq)) {
if (flag & FLAG_DATA_SACKED || is_dupack)
tp->frto = 0; /* Step 3.a. loss was real */
Patches currently in stable-queue which might be from ycheng(a)google.com are
queue-4.15/tcp-purge-write-queue-upon-rst.patch
queue-4.15/tcp-revert-f-rto-extension-to-detect-more-spurious-timeouts.patch
queue-4.15/tcp-revert-f-rto-middle-box-workaround.patch
This is a note to let you know that I've just added the patch titled
tcp: purge write queue upon RST
to the 4.15-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
tcp-purge-write-queue-upon-rst.patch
and it can be found in the queue-4.15 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Tue Mar 6 19:02:56 PST 2018
From: Soheil Hassas Yeganeh <soheil(a)google.com>
Date: Tue, 27 Feb 2018 18:32:18 -0500
Subject: tcp: purge write queue upon RST
From: Soheil Hassas Yeganeh <soheil(a)google.com>
[ Upstream commit a27fd7a8ed3856faaf5a2ff1c8c5f00c0667aaa0 ]
When the connection is reset, there is no point in
keeping the packets on the write queue until the connection
is closed.
RFC 793 (page 70) and RFC 793-bis (page 64) both suggest
purging the write queue upon RST:
https://tools.ietf.org/html/draft-ietf-tcpm-rfc793bis-07
Moreover, this is essential for a correct MSG_ZEROCOPY
implementation, because userspace cannot call close(fd)
before receiving zerocopy signals even when the connection
is reset.
Fixes: f214f915e7db ("tcp: enable MSG_ZEROCOPY")
Signed-off-by: Soheil Hassas Yeganeh <soheil(a)google.com>
Reviewed-by: Eric Dumazet <edumazet(a)google.com>
Signed-off-by: Yuchung Cheng <ycheng(a)google.com>
Signed-off-by: Neal Cardwell <ncardwell(a)google.com>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
net/ipv4/tcp_input.c | 1 +
1 file changed, 1 insertion(+)
--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -3988,6 +3988,7 @@ void tcp_reset(struct sock *sk)
/* This barrier is coupled with smp_rmb() in tcp_poll() */
smp_wmb();
+ tcp_write_queue_purge(sk);
tcp_done(sk);
if (!sock_flag(sk, SOCK_DEAD))
Patches currently in stable-queue which might be from soheil(a)google.com are
queue-4.15/tcp-purge-write-queue-upon-rst.patch
queue-4.15/tcp_bbr-better-deal-with-suboptimal-gso.patch
This is a note to let you know that I've just added the patch titled
sctp: verify size of a new chunk in _sctp_make_chunk()
to the 4.15-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
sctp-verify-size-of-a-new-chunk-in-_sctp_make_chunk.patch
and it can be found in the queue-4.15 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Tue Mar 6 19:02:56 PST 2018
From: Alexey Kodanev <alexey.kodanev(a)oracle.com>
Date: Fri, 9 Feb 2018 17:35:23 +0300
Subject: sctp: verify size of a new chunk in _sctp_make_chunk()
From: Alexey Kodanev <alexey.kodanev(a)oracle.com>
[ Upstream commit 07f2c7ab6f8d0a7e7c5764c4e6cc9c52951b9d9c ]
When SCTP makes INIT or INIT_ACK packet the total chunk length
can exceed SCTP_MAX_CHUNK_LEN which leads to kernel panic when
transmitting these packets, e.g. the crash on sending INIT_ACK:
[ 597.804948] skbuff: skb_over_panic: text:00000000ffae06e4 len:120168
put:120156 head:000000007aa47635 data:00000000d991c2de
tail:0x1d640 end:0xfec0 dev:<NULL>
...
[ 597.976970] ------------[ cut here ]------------
[ 598.033408] kernel BUG at net/core/skbuff.c:104!
[ 600.314841] Call Trace:
[ 600.345829] <IRQ>
[ 600.371639] ? sctp_packet_transmit+0x2095/0x26d0 [sctp]
[ 600.436934] skb_put+0x16c/0x200
[ 600.477295] sctp_packet_transmit+0x2095/0x26d0 [sctp]
[ 600.540630] ? sctp_packet_config+0x890/0x890 [sctp]
[ 600.601781] ? __sctp_packet_append_chunk+0x3b4/0xd00 [sctp]
[ 600.671356] ? sctp_cmp_addr_exact+0x3f/0x90 [sctp]
[ 600.731482] sctp_outq_flush+0x663/0x30d0 [sctp]
[ 600.788565] ? sctp_make_init+0xbf0/0xbf0 [sctp]
[ 600.845555] ? sctp_check_transmitted+0x18f0/0x18f0 [sctp]
[ 600.912945] ? sctp_outq_tail+0x631/0x9d0 [sctp]
[ 600.969936] sctp_cmd_interpreter.isra.22+0x3be1/0x5cb0 [sctp]
[ 601.041593] ? sctp_sf_do_5_1B_init+0x85f/0xc30 [sctp]
[ 601.104837] ? sctp_generate_t1_cookie_event+0x20/0x20 [sctp]
[ 601.175436] ? sctp_eat_data+0x1710/0x1710 [sctp]
[ 601.233575] sctp_do_sm+0x182/0x560 [sctp]
[ 601.284328] ? sctp_has_association+0x70/0x70 [sctp]
[ 601.345586] ? sctp_rcv+0xef4/0x32f0 [sctp]
[ 601.397478] ? sctp6_rcv+0xa/0x20 [sctp]
...
Here the chunk size for INIT_ACK packet becomes too big, mostly
because of the state cookie (INIT packet has large size with
many address parameters), plus additional server parameters.
Later this chunk causes the panic in skb_put_data():
skb_packet_transmit()
sctp_packet_pack()
skb_put_data(nskb, chunk->skb->data, chunk->skb->len);
'nskb' (head skb) was previously allocated with packet->size
from u16 'chunk->chunk_hdr->length'.
As suggested by Marcelo we should check the chunk's length in
_sctp_make_chunk() before trying to allocate skb for it and
discard a chunk if its size bigger than SCTP_MAX_CHUNK_LEN.
Signed-off-by: Alexey Kodanev <alexey.kodanev(a)oracle.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leinter(a)gmail.com>
Acked-by: Neil Horman <nhorman(a)tuxdriver.com>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
net/sctp/sm_make_chunk.c | 7 ++++++-
1 file changed, 6 insertions(+), 1 deletion(-)
--- a/net/sctp/sm_make_chunk.c
+++ b/net/sctp/sm_make_chunk.c
@@ -1378,9 +1378,14 @@ static struct sctp_chunk *_sctp_make_chu
struct sctp_chunk *retval;
struct sk_buff *skb;
struct sock *sk;
+ int chunklen;
+
+ chunklen = SCTP_PAD4(sizeof(*chunk_hdr) + paylen);
+ if (chunklen > SCTP_MAX_CHUNK_LEN)
+ goto nodata;
/* No need to allocate LL here, as this is only a chunk. */
- skb = alloc_skb(SCTP_PAD4(sizeof(*chunk_hdr) + paylen), gfp);
+ skb = alloc_skb(chunklen, gfp);
if (!skb)
goto nodata;
Patches currently in stable-queue which might be from alexey.kodanev(a)oracle.com are
queue-4.15/sctp-fix-dst-refcnt-leak-in-sctp_v6_get_dst.patch
queue-4.15/udplite-fix-partial-checksum-initialization.patch
queue-4.15/sctp-verify-size-of-a-new-chunk-in-_sctp_make_chunk.patch
This is a note to let you know that I've just added the patch titled
tcp: Honor the eor bit in tcp_mtu_probe
to the 4.15-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
tcp-honor-the-eor-bit-in-tcp_mtu_probe.patch
and it can be found in the queue-4.15 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Tue Mar 6 19:02:56 PST 2018
From: Ilya Lesokhin <ilyal(a)mellanox.com>
Date: Mon, 12 Feb 2018 12:57:04 +0200
Subject: tcp: Honor the eor bit in tcp_mtu_probe
From: Ilya Lesokhin <ilyal(a)mellanox.com>
[ Upstream commit 808cf9e38cd7923036a99f459ccc8cf2955e47af ]
Avoid SKB coalescing if eor bit is set in one of the relevant
SKBs.
Fixes: c134ecb87817 ("tcp: Make use of MSG_EOR in tcp_sendmsg")
Signed-off-by: Ilya Lesokhin <ilyal(a)mellanox.com>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
net/ipv4/tcp_output.c | 25 +++++++++++++++++++++++++
1 file changed, 25 insertions(+)
--- a/net/ipv4/tcp_output.c
+++ b/net/ipv4/tcp_output.c
@@ -2026,6 +2026,24 @@ static inline void tcp_mtu_check_reprobe
}
}
+static bool tcp_can_coalesce_send_queue_head(struct sock *sk, int len)
+{
+ struct sk_buff *skb, *next;
+
+ skb = tcp_send_head(sk);
+ tcp_for_write_queue_from_safe(skb, next, sk) {
+ if (len <= skb->len)
+ break;
+
+ if (unlikely(TCP_SKB_CB(skb)->eor))
+ return false;
+
+ len -= skb->len;
+ }
+
+ return true;
+}
+
/* Create a new MTU probe if we are ready.
* MTU probe is regularly attempting to increase the path MTU by
* deliberately sending larger packets. This discovers routing
@@ -2098,6 +2116,9 @@ static int tcp_mtu_probe(struct sock *sk
return 0;
}
+ if (!tcp_can_coalesce_send_queue_head(sk, probe_size))
+ return -1;
+
/* We're allowed to probe. Build it now. */
nskb = sk_stream_alloc_skb(sk, probe_size, GFP_ATOMIC, false);
if (!nskb)
@@ -2133,6 +2154,10 @@ static int tcp_mtu_probe(struct sock *sk
/* We've eaten all the data from this skb.
* Throw it away. */
TCP_SKB_CB(nskb)->tcp_flags |= TCP_SKB_CB(skb)->tcp_flags;
+ /* If this is the last SKB we copy and eor is set
+ * we need to propagate it to the new skb.
+ */
+ TCP_SKB_CB(nskb)->eor = TCP_SKB_CB(skb)->eor;
tcp_unlink_write_queue(skb, sk);
sk_wmem_free_skb(sk, skb);
} else {
Patches currently in stable-queue which might be from ilyal(a)mellanox.com are
queue-4.15/tls-use-correct-sk-sk_prot-for-ipv6.patch
queue-4.15/tcp-honor-the-eor-bit-in-tcp_mtu_probe.patch
This is a note to let you know that I've just added the patch titled
sctp: fix dst refcnt leak in sctp_v6_get_dst()
to the 4.15-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
sctp-fix-dst-refcnt-leak-in-sctp_v6_get_dst.patch
and it can be found in the queue-4.15 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Tue Mar 6 19:02:56 PST 2018
From: Alexey Kodanev <alexey.kodanev(a)oracle.com>
Date: Mon, 5 Feb 2018 15:10:35 +0300
Subject: sctp: fix dst refcnt leak in sctp_v6_get_dst()
From: Alexey Kodanev <alexey.kodanev(a)oracle.com>
[ Upstream commit 957d761cf91cdbb175ad7d8f5472336a4d54dbf2 ]
When going through the bind address list in sctp_v6_get_dst() and
the previously found address is better ('matchlen > bmatchlen'),
the code continues to the next iteration without releasing currently
held destination.
Fix it by releasing 'bdst' before continue to the next iteration, and
instead of introducing one more '!IS_ERR(bdst)' check for dst_release(),
move the already existed one right after ip6_dst_lookup_flow(), i.e. we
shouldn't proceed further if we get an error for the route lookup.
Fixes: dbc2b5e9a09e ("sctp: fix src address selection if using secondary addresses for ipv6")
Signed-off-by: Alexey Kodanev <alexey.kodanev(a)oracle.com>
Acked-by: Neil Horman <nhorman(a)tuxdriver.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner(a)gmail.com>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
net/sctp/ipv6.c | 10 +++++++---
1 file changed, 7 insertions(+), 3 deletions(-)
--- a/net/sctp/ipv6.c
+++ b/net/sctp/ipv6.c
@@ -326,8 +326,10 @@ static void sctp_v6_get_dst(struct sctp_
final_p = fl6_update_dst(fl6, rcu_dereference(np->opt), &final);
bdst = ip6_dst_lookup_flow(sk, fl6, final_p);
- if (!IS_ERR(bdst) &&
- ipv6_chk_addr(dev_net(bdst->dev),
+ if (IS_ERR(bdst))
+ continue;
+
+ if (ipv6_chk_addr(dev_net(bdst->dev),
&laddr->a.v6.sin6_addr, bdst->dev, 1)) {
if (!IS_ERR_OR_NULL(dst))
dst_release(dst);
@@ -336,8 +338,10 @@ static void sctp_v6_get_dst(struct sctp_
}
bmatchlen = sctp_v6_addr_match_len(daddr, &laddr->a);
- if (matchlen > bmatchlen)
+ if (matchlen > bmatchlen) {
+ dst_release(bdst);
continue;
+ }
if (!IS_ERR_OR_NULL(dst))
dst_release(dst);
Patches currently in stable-queue which might be from alexey.kodanev(a)oracle.com are
queue-4.15/sctp-fix-dst-refcnt-leak-in-sctp_v6_get_dst.patch
queue-4.15/udplite-fix-partial-checksum-initialization.patch
queue-4.15/sctp-verify-size-of-a-new-chunk-in-_sctp_make_chunk.patch
This is a note to let you know that I've just added the patch titled
sctp: fix dst refcnt leak in sctp_v4_get_dst
to the 4.15-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
sctp-fix-dst-refcnt-leak-in-sctp_v4_get_dst.patch
and it can be found in the queue-4.15 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Tue Mar 6 19:02:56 PST 2018
From: Tommi Rantala <tommi.t.rantala(a)nokia.com>
Date: Mon, 5 Feb 2018 21:48:14 +0200
Subject: sctp: fix dst refcnt leak in sctp_v4_get_dst
From: Tommi Rantala <tommi.t.rantala(a)nokia.com>
[ Upstream commit 4a31a6b19f9ddf498c81f5c9b089742b7472a6f8 ]
Fix dst reference count leak in sctp_v4_get_dst() introduced in commit
410f03831 ("sctp: add routing output fallback"):
When walking the address_list, successive ip_route_output_key() calls
may return the same rt->dst with the reference incremented on each call.
The code would not decrement the dst refcount when the dst pointer was
identical from the previous iteration, causing the dst refcnt leak.
Testcase:
ip netns add TEST
ip netns exec TEST ip link set lo up
ip link add dummy0 type dummy
ip link add dummy1 type dummy
ip link add dummy2 type dummy
ip link set dev dummy0 netns TEST
ip link set dev dummy1 netns TEST
ip link set dev dummy2 netns TEST
ip netns exec TEST ip addr add 192.168.1.1/24 dev dummy0
ip netns exec TEST ip link set dummy0 up
ip netns exec TEST ip addr add 192.168.1.2/24 dev dummy1
ip netns exec TEST ip link set dummy1 up
ip netns exec TEST ip addr add 192.168.1.3/24 dev dummy2
ip netns exec TEST ip link set dummy2 up
ip netns exec TEST sctp_test -H 192.168.1.2 -P 20002 -h 192.168.1.1 -p 20000 -s -B 192.168.1.3
ip netns del TEST
In 4.4 and 4.9 kernels this results to:
[ 354.179591] unregister_netdevice: waiting for lo to become free. Usage count = 1
[ 364.419674] unregister_netdevice: waiting for lo to become free. Usage count = 1
[ 374.663664] unregister_netdevice: waiting for lo to become free. Usage count = 1
[ 384.903717] unregister_netdevice: waiting for lo to become free. Usage count = 1
[ 395.143724] unregister_netdevice: waiting for lo to become free. Usage count = 1
[ 405.383645] unregister_netdevice: waiting for lo to become free. Usage count = 1
...
Fixes: 410f03831 ("sctp: add routing output fallback")
Fixes: 0ca50d12f ("sctp: fix src address selection if using secondary addresses")
Signed-off-by: Tommi Rantala <tommi.t.rantala(a)nokia.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner(a)gmail.com>
Acked-by: Neil Horman <nhorman(a)tuxdriver.com>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
net/sctp/protocol.c | 10 ++++------
1 file changed, 4 insertions(+), 6 deletions(-)
--- a/net/sctp/protocol.c
+++ b/net/sctp/protocol.c
@@ -514,22 +514,20 @@ static void sctp_v4_get_dst(struct sctp_
if (IS_ERR(rt))
continue;
- if (!dst)
- dst = &rt->dst;
-
/* Ensure the src address belongs to the output
* interface.
*/
odev = __ip_dev_find(sock_net(sk), laddr->a.v4.sin_addr.s_addr,
false);
if (!odev || odev->ifindex != fl4->flowi4_oif) {
- if (&rt->dst != dst)
+ if (!dst)
+ dst = &rt->dst;
+ else
dst_release(&rt->dst);
continue;
}
- if (dst != &rt->dst)
- dst_release(dst);
+ dst_release(dst);
dst = &rt->dst;
break;
}
Patches currently in stable-queue which might be from tommi.t.rantala(a)nokia.com are
queue-4.15/sctp-fix-dst-refcnt-leak-in-sctp_v4_get_dst.patch
This is a note to let you know that I've just added the patch titled
sctp: do not pr_err for the duplicated node in transport rhlist
to the 4.15-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
sctp-do-not-pr_err-for-the-duplicated-node-in-transport-rhlist.patch
and it can be found in the queue-4.15 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Tue Mar 6 19:02:56 PST 2018
From: Xin Long <lucien.xin(a)gmail.com>
Date: Mon, 12 Feb 2018 18:29:06 +0800
Subject: sctp: do not pr_err for the duplicated node in transport rhlist
From: Xin Long <lucien.xin(a)gmail.com>
[ Upstream commit 27af86bb038d9c8b8066cd17854ddaf2ea92bce1 ]
The pr_err in sctp_hash_transport was supposed to report a sctp bug
for using rhashtable/rhlist.
The err '-EEXIST' introduced in Commit cd2b70875058 ("sctp: check
duplicate node before inserting a new transport") doesn't belong
to that case.
So just return -EEXIST back without pr_err any kmsg.
Fixes: cd2b70875058 ("sctp: check duplicate node before inserting a new transport")
Reported-by: Wei Chen <weichen(a)redhat.com>
Signed-off-by: Xin Long <lucien.xin(a)gmail.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner(a)gmail.com>
Acked-by: Neil Horman <nhorman(a)tuxdriver.com>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
net/sctp/input.c | 5 +----
1 file changed, 1 insertion(+), 4 deletions(-)
--- a/net/sctp/input.c
+++ b/net/sctp/input.c
@@ -897,15 +897,12 @@ int sctp_hash_transport(struct sctp_tran
rhl_for_each_entry_rcu(transport, tmp, list, node)
if (transport->asoc->ep == t->asoc->ep) {
rcu_read_unlock();
- err = -EEXIST;
- goto out;
+ return -EEXIST;
}
rcu_read_unlock();
err = rhltable_insert_key(&sctp_transport_hashtable, &arg,
&t->node, sctp_hash_params);
-
-out:
if (err)
pr_err_once("insert transport fail, errno %d\n", err);
Patches currently in stable-queue which might be from lucien.xin(a)gmail.com are
queue-4.15/bridge-check-brport-attr-show-in-brport_show.patch
queue-4.15/sctp-do-not-pr_err-for-the-duplicated-node-in-transport-rhlist.patch
This is a note to let you know that I've just added the patch titled
s390/qeth: fix overestimated count of buffer elements
to the 4.15-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
s390-qeth-fix-overestimated-count-of-buffer-elements.patch
and it can be found in the queue-4.15 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Tue Mar 6 19:02:57 PST 2018
From: Julian Wiedmann <jwi(a)linux.vnet.ibm.com>
Date: Tue, 27 Feb 2018 18:58:12 +0100
Subject: s390/qeth: fix overestimated count of buffer elements
From: Julian Wiedmann <jwi(a)linux.vnet.ibm.com>
[ Upstream commit 12472af89632beb1ed8dea29d4efe208ca05b06a ]
qeth_get_elements_for_range() doesn't know how to handle a 0-length
range (ie. start == end), and returns 1 when it should return 0.
Such ranges occur on TSO skbs, where the L2/L3/L4 headers (and thus all
of the skb's linear data) are skipped when mapping the skb into regular
buffer elements.
This overestimation may cause several performance-related issues:
1. sub-optimal IO buffer selection, where the next buffer gets selected
even though the skb would actually still fit into the current buffer.
2. forced linearization, if the element count for a non-linear skb
exceeds QETH_MAX_BUFFER_ELEMENTS.
Rather than modifying qeth_get_elements_for_range() and adding overhead
to every caller, fix up those callers that are in risk of passing a
0-length range.
Fixes: 2863c61334aa ("qeth: refactor calculation of SBALE count")
Signed-off-by: Julian Wiedmann <jwi(a)linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/s390/net/qeth_core_main.c | 10 ++++++----
drivers/s390/net/qeth_l3_main.c | 11 ++++++-----
2 files changed, 12 insertions(+), 9 deletions(-)
--- a/drivers/s390/net/qeth_core_main.c
+++ b/drivers/s390/net/qeth_core_main.c
@@ -3835,10 +3835,12 @@ EXPORT_SYMBOL_GPL(qeth_get_elements_for_
int qeth_get_elements_no(struct qeth_card *card,
struct sk_buff *skb, int extra_elems, int data_offset)
{
- int elements = qeth_get_elements_for_range(
- (addr_t)skb->data + data_offset,
- (addr_t)skb->data + skb_headlen(skb)) +
- qeth_get_elements_for_frags(skb);
+ addr_t end = (addr_t)skb->data + skb_headlen(skb);
+ int elements = qeth_get_elements_for_frags(skb);
+ addr_t start = (addr_t)skb->data + data_offset;
+
+ if (start != end)
+ elements += qeth_get_elements_for_range(start, end);
if ((elements + extra_elems) > QETH_MAX_BUFFER_ELEMENTS(card)) {
QETH_DBF_MESSAGE(2, "Invalid size of IP packet "
--- a/drivers/s390/net/qeth_l3_main.c
+++ b/drivers/s390/net/qeth_l3_main.c
@@ -2629,11 +2629,12 @@ static void qeth_tso_fill_header(struct
static int qeth_l3_get_elements_no_tso(struct qeth_card *card,
struct sk_buff *skb, int extra_elems)
{
- addr_t tcpdptr = (addr_t)tcp_hdr(skb) + tcp_hdrlen(skb);
- int elements = qeth_get_elements_for_range(
- tcpdptr,
- (addr_t)skb->data + skb_headlen(skb)) +
- qeth_get_elements_for_frags(skb);
+ addr_t start = (addr_t)tcp_hdr(skb) + tcp_hdrlen(skb);
+ addr_t end = (addr_t)skb->data + skb_headlen(skb);
+ int elements = qeth_get_elements_for_frags(skb);
+
+ if (start != end)
+ elements += qeth_get_elements_for_range(start, end);
if ((elements + extra_elems) > QETH_MAX_BUFFER_ELEMENTS(card)) {
QETH_DBF_MESSAGE(2,
Patches currently in stable-queue which might be from jwi(a)linux.vnet.ibm.com are
queue-4.15/s390-qeth-fix-setip-command-handling.patch
queue-4.15/s390-qeth-fix-ip-address-lookup-for-l3-devices.patch
queue-4.15/s390-qeth-fix-ipa-command-submission-race.patch
queue-4.15/revert-s390-qeth-fix-using-of-ref-counter-for-rxip-addresses.patch
queue-4.15/s390-qeth-fix-overestimated-count-of-buffer-elements.patch
queue-4.15/s390-qeth-fix-double-free-on-ip-add-remove-race.patch
queue-4.15/s390-qeth-fix-ip-removal-on-offline-cards.patch
queue-4.15/s390-qeth-fix-underestimated-count-of-buffer-elements.patch
This is a note to let you know that I've just added the patch titled
s390/qeth: fix underestimated count of buffer elements
to the 4.15-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
s390-qeth-fix-underestimated-count-of-buffer-elements.patch
and it can be found in the queue-4.15 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Tue Mar 6 19:02:57 PST 2018
From: Ursula Braun <ubraun(a)linux.vnet.ibm.com>
Date: Fri, 9 Feb 2018 11:03:49 +0100
Subject: s390/qeth: fix underestimated count of buffer elements
From: Ursula Braun <ubraun(a)linux.vnet.ibm.com>
[ Upstream commit 89271c65edd599207dd982007900506283c90ae3 ]
For a memory range/skb where the last byte falls onto a page boundary
(ie. 'end' is of the form xxx...xxx001), the PFN_UP() part of the
calculation currently doesn't round up to the next PFN due to an
off-by-one error.
Thus qeth believes that the skb occupies one page less than it
actually does, and may select a IO buffer that doesn't have enough spare
buffer elements to fit all of the skb's data.
HW detects this as a malformed buffer descriptor, and raises an
exception which then triggers device recovery.
Fixes: 2863c61334aa ("qeth: refactor calculation of SBALE count")
Signed-off-by: Ursula Braun <ubraun(a)linux.vnet.ibm.com>
Signed-off-by: Julian Wiedmann <jwi(a)linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/s390/net/qeth_core.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/s390/net/qeth_core.h
+++ b/drivers/s390/net/qeth_core.h
@@ -836,7 +836,7 @@ struct qeth_trap_id {
*/
static inline int qeth_get_elements_for_range(addr_t start, addr_t end)
{
- return PFN_UP(end - 1) - PFN_DOWN(start);
+ return PFN_UP(end) - PFN_DOWN(start);
}
static inline int qeth_get_micros(void)
Patches currently in stable-queue which might be from ubraun(a)linux.vnet.ibm.com are
queue-4.15/s390-qeth-fix-underestimated-count-of-buffer-elements.patch
This is a note to let you know that I've just added the patch titled
s390/qeth: fix SETIP command handling
to the 4.15-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
s390-qeth-fix-setip-command-handling.patch
and it can be found in the queue-4.15 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Tue Mar 6 19:02:57 PST 2018
From: Julian Wiedmann <jwi(a)linux.vnet.ibm.com>
Date: Fri, 9 Feb 2018 11:03:50 +0100
Subject: s390/qeth: fix SETIP command handling
From: Julian Wiedmann <jwi(a)linux.vnet.ibm.com>
[ Upstream commit 1c5b2216fbb973a9410e0b06389740b5c1289171 ]
send_control_data() applies some special handling to SETIP v4 IPA
commands. But current code parses *all* command types for the SETIP
command code. Limit the command code check to IPA commands.
Fixes: 5b54e16f1a54 ("qeth: do not spin for SETIP ip assist command")
Signed-off-by: Julian Wiedmann <jwi(a)linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/s390/net/qeth_core.h | 5 +++++
drivers/s390/net/qeth_core_main.c | 14 ++++++++------
2 files changed, 13 insertions(+), 6 deletions(-)
--- a/drivers/s390/net/qeth_core.h
+++ b/drivers/s390/net/qeth_core.h
@@ -581,6 +581,11 @@ struct qeth_cmd_buffer {
void (*callback) (struct qeth_channel *, struct qeth_cmd_buffer *);
};
+static inline struct qeth_ipa_cmd *__ipa_cmd(struct qeth_cmd_buffer *iob)
+{
+ return (struct qeth_ipa_cmd *)(iob->data + IPA_PDU_HEADER_SIZE);
+}
+
/**
* definition of a qeth channel, used for read and write
*/
--- a/drivers/s390/net/qeth_core_main.c
+++ b/drivers/s390/net/qeth_core_main.c
@@ -2057,7 +2057,7 @@ int qeth_send_control_data(struct qeth_c
unsigned long flags;
struct qeth_reply *reply = NULL;
unsigned long timeout, event_timeout;
- struct qeth_ipa_cmd *cmd;
+ struct qeth_ipa_cmd *cmd = NULL;
QETH_CARD_TEXT(card, 2, "sendctl");
@@ -2083,10 +2083,13 @@ int qeth_send_control_data(struct qeth_c
while (atomic_cmpxchg(&card->write.irq_pending, 0, 1)) ;
qeth_prepare_control_data(card, len, iob);
- if (IS_IPA(iob->data))
+ if (IS_IPA(iob->data)) {
+ cmd = __ipa_cmd(iob);
event_timeout = QETH_IPA_TIMEOUT;
- else
+ } else {
event_timeout = QETH_TIMEOUT;
+ }
+
timeout = jiffies + event_timeout;
QETH_CARD_TEXT(card, 6, "noirqpnd");
@@ -2111,9 +2114,8 @@ int qeth_send_control_data(struct qeth_c
/* we have only one long running ipassist, since we can ensure
process context of this command we can sleep */
- cmd = (struct qeth_ipa_cmd *)(iob->data+IPA_PDU_HEADER_SIZE);
- if ((cmd->hdr.command == IPA_CMD_SETIP) &&
- (cmd->hdr.prot_version == QETH_PROT_IPV4)) {
+ if (cmd && cmd->hdr.command == IPA_CMD_SETIP &&
+ cmd->hdr.prot_version == QETH_PROT_IPV4) {
if (!wait_event_timeout(reply->wait_q,
atomic_read(&reply->received), event_timeout))
goto time_err;
Patches currently in stable-queue which might be from jwi(a)linux.vnet.ibm.com are
queue-4.15/s390-qeth-fix-setip-command-handling.patch
queue-4.15/s390-qeth-fix-ip-address-lookup-for-l3-devices.patch
queue-4.15/s390-qeth-fix-ipa-command-submission-race.patch
queue-4.15/revert-s390-qeth-fix-using-of-ref-counter-for-rxip-addresses.patch
queue-4.15/s390-qeth-fix-overestimated-count-of-buffer-elements.patch
queue-4.15/s390-qeth-fix-double-free-on-ip-add-remove-race.patch
queue-4.15/s390-qeth-fix-ip-removal-on-offline-cards.patch
queue-4.15/s390-qeth-fix-underestimated-count-of-buffer-elements.patch
This is a note to let you know that I've just added the patch titled
s390/qeth: fix IPA command submission race
to the 4.15-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
s390-qeth-fix-ipa-command-submission-race.patch
and it can be found in the queue-4.15 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Tue Mar 6 19:02:57 PST 2018
From: Julian Wiedmann <jwi(a)linux.vnet.ibm.com>
Date: Tue, 27 Feb 2018 18:58:17 +0100
Subject: s390/qeth: fix IPA command submission race
From: Julian Wiedmann <jwi(a)linux.vnet.ibm.com>
[ Upstream commit d22ffb5a712f9211ffd104c38fc17cbfb1b5e2b0 ]
If multiple IPA commands are build & sent out concurrently,
fill_ipacmd_header() may assign a seqno value to a command that's
different from what send_control_data() later assigns to this command's
reply.
This is due to other commands passing through send_control_data(),
and incrementing card->seqno.ipa along the way.
So one IPA command has no reply that's waiting for its seqno, while some
other IPA command has multiple reply objects waiting for it.
Only one of those waiting replies wins, and the other(s) times out and
triggers a recovery via send_ipa_cmd().
Fix this by making sure that the same seqno value is assigned to
a command and its reply object.
Do so immediately before submitting the command & while holding the
irq_pending "lock", to produce nicely ascending seqnos.
As a side effect, *all* IPA commands now use a reply object that's
waiting for its actual seqno. Previously, early IPA commands that were
submitted while the card was still DOWN used the "catch-all" IDX seqno.
Signed-off-by: Julian Wiedmann <jwi(a)linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/s390/net/qeth_core_main.c | 19 ++++++++++---------
1 file changed, 10 insertions(+), 9 deletions(-)
--- a/drivers/s390/net/qeth_core_main.c
+++ b/drivers/s390/net/qeth_core_main.c
@@ -2071,24 +2071,25 @@ int qeth_send_control_data(struct qeth_c
}
reply->callback = reply_cb;
reply->param = reply_param;
- if (card->state == CARD_STATE_DOWN)
- reply->seqno = QETH_IDX_COMMAND_SEQNO;
- else
- reply->seqno = card->seqno.ipa++;
+
init_waitqueue_head(&reply->wait_q);
- spin_lock_irqsave(&card->lock, flags);
- list_add_tail(&reply->list, &card->cmd_waiter_list);
- spin_unlock_irqrestore(&card->lock, flags);
while (atomic_cmpxchg(&card->write.irq_pending, 0, 1)) ;
- qeth_prepare_control_data(card, len, iob);
if (IS_IPA(iob->data)) {
cmd = __ipa_cmd(iob);
+ cmd->hdr.seqno = card->seqno.ipa++;
+ reply->seqno = cmd->hdr.seqno;
event_timeout = QETH_IPA_TIMEOUT;
} else {
+ reply->seqno = QETH_IDX_COMMAND_SEQNO;
event_timeout = QETH_TIMEOUT;
}
+ qeth_prepare_control_data(card, len, iob);
+
+ spin_lock_irqsave(&card->lock, flags);
+ list_add_tail(&reply->list, &card->cmd_waiter_list);
+ spin_unlock_irqrestore(&card->lock, flags);
timeout = jiffies + event_timeout;
@@ -2870,7 +2871,7 @@ static void qeth_fill_ipacmd_header(stru
memset(cmd, 0, sizeof(struct qeth_ipa_cmd));
cmd->hdr.command = command;
cmd->hdr.initiator = IPA_CMD_INITIATOR_HOST;
- cmd->hdr.seqno = card->seqno.ipa;
+ /* cmd->hdr.seqno is set by qeth_send_control_data() */
cmd->hdr.adapter_type = qeth_get_ipa_adp_type(card->info.link_type);
cmd->hdr.rel_adapter_no = (__u8) card->info.portno;
if (card->options.layer2)
Patches currently in stable-queue which might be from jwi(a)linux.vnet.ibm.com are
queue-4.15/s390-qeth-fix-setip-command-handling.patch
queue-4.15/s390-qeth-fix-ip-address-lookup-for-l3-devices.patch
queue-4.15/s390-qeth-fix-ipa-command-submission-race.patch
queue-4.15/revert-s390-qeth-fix-using-of-ref-counter-for-rxip-addresses.patch
queue-4.15/s390-qeth-fix-overestimated-count-of-buffer-elements.patch
queue-4.15/s390-qeth-fix-double-free-on-ip-add-remove-race.patch
queue-4.15/s390-qeth-fix-ip-removal-on-offline-cards.patch
queue-4.15/s390-qeth-fix-underestimated-count-of-buffer-elements.patch
This is a note to let you know that I've just added the patch titled
s390/qeth: fix IP address lookup for L3 devices
to the 4.15-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
s390-qeth-fix-ip-address-lookup-for-l3-devices.patch
and it can be found in the queue-4.15 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Tue Mar 6 19:02:57 PST 2018
From: Julian Wiedmann <jwi(a)linux.vnet.ibm.com>
Date: Tue, 27 Feb 2018 18:58:16 +0100
Subject: s390/qeth: fix IP address lookup for L3 devices
From: Julian Wiedmann <jwi(a)linux.vnet.ibm.com>
[ Upstream commit c5c48c58b259bb8f0482398370ee539d7a12df3e ]
Current code ("qeth_l3_ip_from_hash()") matches a queried address object
against objects in the IP table by IP address, Mask/Prefix Length and
MAC address ("qeth_l3_ipaddrs_is_equal()"). But what callers actually
require is either
a) "is this IP address registered" (ie. match by IP address only),
before adding a new address.
b) or "is this address object registered" (ie. match all relevant
attributes), before deleting an address.
Right now
1. the ADD path is too strict in its lookup, and eg. doesn't detect
conflicts between an existing NORMAL address and a new VIPA address
(because the NORMAL address will have mask != 0, while VIPA has
a mask == 0),
2. the DELETE path is not strict enough, and eg. allows del_rxip() to
delete a VIPA address as long as the IP address matches.
Fix all this by adding helpers (_addr_match_ip() and _addr_match_all())
that do the appropriate checking.
Note that the ADD path for NORMAL addresses is special, as qeth keeps
track of how many times such an address is in use (and there is no
immediate way of returning errors to the caller). So when a requested
NORMAL address _fully_ matches an existing one, it's not considered a
conflict and we merely increment the refcount.
Fixes: 5f78e29ceebf ("qeth: optimize IP handling in rx_mode callback")
Signed-off-by: Julian Wiedmann <jwi(a)linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/s390/net/qeth_l3.h | 34 ++++++++++++++
drivers/s390/net/qeth_l3_main.c | 91 ++++++++++++++++++----------------------
2 files changed, 74 insertions(+), 51 deletions(-)
--- a/drivers/s390/net/qeth_l3.h
+++ b/drivers/s390/net/qeth_l3.h
@@ -40,8 +40,40 @@ struct qeth_ipaddr {
unsigned int pfxlen;
} a6;
} u;
-
};
+
+static inline bool qeth_l3_addr_match_ip(struct qeth_ipaddr *a1,
+ struct qeth_ipaddr *a2)
+{
+ if (a1->proto != a2->proto)
+ return false;
+ if (a1->proto == QETH_PROT_IPV6)
+ return ipv6_addr_equal(&a1->u.a6.addr, &a2->u.a6.addr);
+ return a1->u.a4.addr == a2->u.a4.addr;
+}
+
+static inline bool qeth_l3_addr_match_all(struct qeth_ipaddr *a1,
+ struct qeth_ipaddr *a2)
+{
+ /* Assumes that the pair was obtained via qeth_l3_addr_find_by_ip(),
+ * so 'proto' and 'addr' match for sure.
+ *
+ * For ucast:
+ * - 'mac' is always 0.
+ * - 'mask'/'pfxlen' for RXIP/VIPA is always 0. For NORMAL, matching
+ * values are required to avoid mixups in takeover eligibility.
+ *
+ * For mcast,
+ * - 'mac' is mapped from the IP, and thus always matches.
+ * - 'mask'/'pfxlen' is always 0.
+ */
+ if (a1->type != a2->type)
+ return false;
+ if (a1->proto == QETH_PROT_IPV6)
+ return a1->u.a6.pfxlen == a2->u.a6.pfxlen;
+ return a1->u.a4.mask == a2->u.a4.mask;
+}
+
static inline u64 qeth_l3_ipaddr_hash(struct qeth_ipaddr *addr)
{
u64 ret = 0;
--- a/drivers/s390/net/qeth_l3_main.c
+++ b/drivers/s390/net/qeth_l3_main.c
@@ -150,6 +150,24 @@ int qeth_l3_string_to_ipaddr(const char
return -EINVAL;
}
+static struct qeth_ipaddr *qeth_l3_find_addr_by_ip(struct qeth_card *card,
+ struct qeth_ipaddr *query)
+{
+ u64 key = qeth_l3_ipaddr_hash(query);
+ struct qeth_ipaddr *addr;
+
+ if (query->is_multicast) {
+ hash_for_each_possible(card->ip_mc_htable, addr, hnode, key)
+ if (qeth_l3_addr_match_ip(addr, query))
+ return addr;
+ } else {
+ hash_for_each_possible(card->ip_htable, addr, hnode, key)
+ if (qeth_l3_addr_match_ip(addr, query))
+ return addr;
+ }
+ return NULL;
+}
+
static void qeth_l3_convert_addr_to_bits(u8 *addr, u8 *bits, int len)
{
int i, j;
@@ -203,34 +221,6 @@ static bool qeth_l3_is_addr_covered_by_i
return rc;
}
-inline int
-qeth_l3_ipaddrs_is_equal(struct qeth_ipaddr *addr1, struct qeth_ipaddr *addr2)
-{
- return addr1->proto == addr2->proto &&
- !memcmp(&addr1->u, &addr2->u, sizeof(addr1->u)) &&
- !memcmp(&addr1->mac, &addr2->mac, sizeof(addr1->mac));
-}
-
-static struct qeth_ipaddr *
-qeth_l3_ip_from_hash(struct qeth_card *card, struct qeth_ipaddr *tmp_addr)
-{
- struct qeth_ipaddr *addr;
-
- if (tmp_addr->is_multicast) {
- hash_for_each_possible(card->ip_mc_htable, addr,
- hnode, qeth_l3_ipaddr_hash(tmp_addr))
- if (qeth_l3_ipaddrs_is_equal(tmp_addr, addr))
- return addr;
- } else {
- hash_for_each_possible(card->ip_htable, addr,
- hnode, qeth_l3_ipaddr_hash(tmp_addr))
- if (qeth_l3_ipaddrs_is_equal(tmp_addr, addr))
- return addr;
- }
-
- return NULL;
-}
-
int qeth_l3_delete_ip(struct qeth_card *card, struct qeth_ipaddr *tmp_addr)
{
int rc = 0;
@@ -245,8 +235,8 @@ int qeth_l3_delete_ip(struct qeth_card *
QETH_CARD_HEX(card, 4, ((char *)&tmp_addr->u.a6.addr) + 8, 8);
}
- addr = qeth_l3_ip_from_hash(card, tmp_addr);
- if (!addr)
+ addr = qeth_l3_find_addr_by_ip(card, tmp_addr);
+ if (!addr || !qeth_l3_addr_match_all(addr, tmp_addr))
return -ENOENT;
addr->ref_counter--;
@@ -268,6 +258,7 @@ int qeth_l3_add_ip(struct qeth_card *car
{
int rc = 0;
struct qeth_ipaddr *addr;
+ char buf[40];
QETH_CARD_TEXT(card, 4, "addip");
@@ -278,8 +269,20 @@ int qeth_l3_add_ip(struct qeth_card *car
QETH_CARD_HEX(card, 4, ((char *)&tmp_addr->u.a6.addr) + 8, 8);
}
- addr = qeth_l3_ip_from_hash(card, tmp_addr);
- if (!addr) {
+ addr = qeth_l3_find_addr_by_ip(card, tmp_addr);
+ if (addr) {
+ if (tmp_addr->type != QETH_IP_TYPE_NORMAL)
+ return -EADDRINUSE;
+ if (qeth_l3_addr_match_all(addr, tmp_addr)) {
+ addr->ref_counter++;
+ return 0;
+ }
+ qeth_l3_ipaddr_to_string(tmp_addr->proto, (u8 *)&tmp_addr->u,
+ buf);
+ dev_warn(&card->gdev->dev,
+ "Registering IP address %s failed\n", buf);
+ return -EADDRINUSE;
+ } else {
addr = qeth_l3_get_addr_buffer(tmp_addr->proto);
if (!addr)
return -ENOMEM;
@@ -327,11 +330,7 @@ int qeth_l3_add_ip(struct qeth_card *car
hash_del(&addr->hnode);
kfree(addr);
}
- } else {
- if (addr->type == QETH_IP_TYPE_NORMAL)
- addr->ref_counter++;
}
-
return rc;
}
@@ -715,12 +714,7 @@ int qeth_l3_add_vipa(struct qeth_card *c
return -ENOMEM;
spin_lock_bh(&card->ip_lock);
-
- if (qeth_l3_ip_from_hash(card, ipaddr))
- rc = -EEXIST;
- else
- qeth_l3_add_ip(card, ipaddr);
-
+ rc = qeth_l3_add_ip(card, ipaddr);
spin_unlock_bh(&card->ip_lock);
kfree(ipaddr);
@@ -783,12 +777,7 @@ int qeth_l3_add_rxip(struct qeth_card *c
return -ENOMEM;
spin_lock_bh(&card->ip_lock);
-
- if (qeth_l3_ip_from_hash(card, ipaddr))
- rc = -EEXIST;
- else
- qeth_l3_add_ip(card, ipaddr);
-
+ rc = qeth_l3_add_ip(card, ipaddr);
spin_unlock_bh(&card->ip_lock);
kfree(ipaddr);
@@ -1396,8 +1385,9 @@ qeth_l3_add_mc_to_hash(struct qeth_card
memcpy(tmp->mac, buf, sizeof(tmp->mac));
tmp->is_multicast = 1;
- ipm = qeth_l3_ip_from_hash(card, tmp);
+ ipm = qeth_l3_find_addr_by_ip(card, tmp);
if (ipm) {
+ /* for mcast, by-IP match means full match */
ipm->disp_flag = QETH_DISP_ADDR_DO_NOTHING;
} else {
ipm = qeth_l3_get_addr_buffer(QETH_PROT_IPV4);
@@ -1480,8 +1470,9 @@ qeth_l3_add_mc6_to_hash(struct qeth_card
sizeof(struct in6_addr));
tmp->is_multicast = 1;
- ipm = qeth_l3_ip_from_hash(card, tmp);
+ ipm = qeth_l3_find_addr_by_ip(card, tmp);
if (ipm) {
+ /* for mcast, by-IP match means full match */
ipm->disp_flag = QETH_DISP_ADDR_DO_NOTHING;
continue;
}
Patches currently in stable-queue which might be from jwi(a)linux.vnet.ibm.com are
queue-4.15/s390-qeth-fix-setip-command-handling.patch
queue-4.15/s390-qeth-fix-ip-address-lookup-for-l3-devices.patch
queue-4.15/s390-qeth-fix-ipa-command-submission-race.patch
queue-4.15/revert-s390-qeth-fix-using-of-ref-counter-for-rxip-addresses.patch
queue-4.15/s390-qeth-fix-overestimated-count-of-buffer-elements.patch
queue-4.15/s390-qeth-fix-double-free-on-ip-add-remove-race.patch
queue-4.15/s390-qeth-fix-ip-removal-on-offline-cards.patch
queue-4.15/s390-qeth-fix-underestimated-count-of-buffer-elements.patch
This is a note to let you know that I've just added the patch titled
s390/qeth: fix IP removal on offline cards
to the 4.15-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
s390-qeth-fix-ip-removal-on-offline-cards.patch
and it can be found in the queue-4.15 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Tue Mar 6 19:02:57 PST 2018
From: Julian Wiedmann <jwi(a)linux.vnet.ibm.com>
Date: Tue, 27 Feb 2018 18:58:13 +0100
Subject: s390/qeth: fix IP removal on offline cards
From: Julian Wiedmann <jwi(a)linux.vnet.ibm.com>
[ Upstream commit 98d823ab1fbdcb13abc25b420f9bb71bade42056 ]
If the HW is not reachable, then none of the IPs in qeth's internal
table has been registered with the HW yet. So when deleting such an IP,
there's no need to stage it for deregistration - just drop it from
the table.
This fixes the "add-delete-add" scenario on an offline card, where the
the second "add" merely increments the IP's use count. But as the IP is
still set to DISP_ADDR_DELETE from the previous "delete" step,
l3_recover_ip() won't register it with the HW when the card goes online.
Fixes: 5f78e29ceebf ("qeth: optimize IP handling in rx_mode callback")
Signed-off-by: Julian Wiedmann <jwi(a)linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/s390/net/qeth_l3_main.c | 14 +++-----------
1 file changed, 3 insertions(+), 11 deletions(-)
--- a/drivers/s390/net/qeth_l3_main.c
+++ b/drivers/s390/net/qeth_l3_main.c
@@ -256,12 +256,8 @@ int qeth_l3_delete_ip(struct qeth_card *
if (addr->in_progress)
return -EINPROGRESS;
- if (!qeth_card_hw_is_reachable(card)) {
- addr->disp_flag = QETH_DISP_ADDR_DELETE;
- return 0;
- }
-
- rc = qeth_l3_deregister_addr_entry(card, addr);
+ if (qeth_card_hw_is_reachable(card))
+ rc = qeth_l3_deregister_addr_entry(card, addr);
hash_del(&addr->hnode);
kfree(addr);
@@ -404,11 +400,7 @@ static void qeth_l3_recover_ip(struct qe
spin_lock_bh(&card->ip_lock);
hash_for_each_safe(card->ip_htable, i, tmp, addr, hnode) {
- if (addr->disp_flag == QETH_DISP_ADDR_DELETE) {
- qeth_l3_deregister_addr_entry(card, addr);
- hash_del(&addr->hnode);
- kfree(addr);
- } else if (addr->disp_flag == QETH_DISP_ADDR_ADD) {
+ if (addr->disp_flag == QETH_DISP_ADDR_ADD) {
if (addr->proto == QETH_PROT_IPV4) {
addr->in_progress = 1;
spin_unlock_bh(&card->ip_lock);
Patches currently in stable-queue which might be from jwi(a)linux.vnet.ibm.com are
queue-4.15/s390-qeth-fix-setip-command-handling.patch
queue-4.15/s390-qeth-fix-ip-address-lookup-for-l3-devices.patch
queue-4.15/s390-qeth-fix-ipa-command-submission-race.patch
queue-4.15/revert-s390-qeth-fix-using-of-ref-counter-for-rxip-addresses.patch
queue-4.15/s390-qeth-fix-overestimated-count-of-buffer-elements.patch
queue-4.15/s390-qeth-fix-double-free-on-ip-add-remove-race.patch
queue-4.15/s390-qeth-fix-ip-removal-on-offline-cards.patch
queue-4.15/s390-qeth-fix-underestimated-count-of-buffer-elements.patch
This is a note to let you know that I've just added the patch titled
s390/qeth: fix double-free on IP add/remove race
to the 4.15-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
s390-qeth-fix-double-free-on-ip-add-remove-race.patch
and it can be found in the queue-4.15 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Tue Mar 6 19:02:57 PST 2018
From: Julian Wiedmann <jwi(a)linux.vnet.ibm.com>
Date: Tue, 27 Feb 2018 18:58:14 +0100
Subject: s390/qeth: fix double-free on IP add/remove race
From: Julian Wiedmann <jwi(a)linux.vnet.ibm.com>
[ Upstream commit 14d066c3531a87f727968cacd85bd95c75f59843 ]
Registering an IPv4 address with the HW takes quite a while, so we
temporarily drop the ip_htable lock. Any concurrent add/remove of the
same IP adjusts the IP's use count, and (on remove) is then blocked by
addr->in_progress.
After the register call has completed, we check the use count for
concurrently attempted add/remove calls - and possibly straight-away
deregister the IP again. This happens via l3_delete_ip(), which
1) looks up the queried IP in the htable (getting a reference to the
*same* queried object),
2) deregisters the IP from the HW, and
3) frees the IP object.
The caller in l3_add_ip() then does a second free on the same object.
For this case, skip all the extra checks and lookups in l3_delete_ip()
and just deregister & free the IP object ourselves.
Fixes: 5f78e29ceebf ("qeth: optimize IP handling in rx_mode callback")
Signed-off-by: Julian Wiedmann <jwi(a)linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/s390/net/qeth_l3_main.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
--- a/drivers/s390/net/qeth_l3_main.c
+++ b/drivers/s390/net/qeth_l3_main.c
@@ -320,7 +320,8 @@ int qeth_l3_add_ip(struct qeth_card *car
(rc == IPA_RC_LAN_OFFLINE)) {
addr->disp_flag = QETH_DISP_ADDR_DO_NOTHING;
if (addr->ref_counter < 1) {
- qeth_l3_delete_ip(card, addr);
+ qeth_l3_deregister_addr_entry(card, addr);
+ hash_del(&addr->hnode);
kfree(addr);
}
} else {
Patches currently in stable-queue which might be from jwi(a)linux.vnet.ibm.com are
queue-4.15/s390-qeth-fix-setip-command-handling.patch
queue-4.15/s390-qeth-fix-ip-address-lookup-for-l3-devices.patch
queue-4.15/s390-qeth-fix-ipa-command-submission-race.patch
queue-4.15/revert-s390-qeth-fix-using-of-ref-counter-for-rxip-addresses.patch
queue-4.15/s390-qeth-fix-overestimated-count-of-buffer-elements.patch
queue-4.15/s390-qeth-fix-double-free-on-ip-add-remove-race.patch
queue-4.15/s390-qeth-fix-ip-removal-on-offline-cards.patch
queue-4.15/s390-qeth-fix-underestimated-count-of-buffer-elements.patch