From: YueHaibing yuehaibing@huawei.com
[ Upstream commit b805d78d300bcf2c83d6df7da0c818b0fee41427 ]
UBSAN report this:
UBSAN: Undefined behaviour in net/xfrm/xfrm_policy.c:1289:24 index 6 is out of range for type 'unsigned int [6]' CPU: 1 PID: 0 Comm: swapper/1 Not tainted 4.4.162-514.55.6.9.x86_64+ #13 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1ubuntu1 04/01/2014 0000000000000000 1466cf39b41b23c9 ffff8801f6b07a58 ffffffff81cb35f4 0000000041b58ab3 ffffffff83230f9c ffffffff81cb34e0 ffff8801f6b07a80 ffff8801f6b07a20 1466cf39b41b23c9 ffffffff851706e0 ffff8801f6b07ae8 Call Trace: <IRQ> [<ffffffff81cb35f4>] __dump_stack lib/dump_stack.c:15 [inline] <IRQ> [<ffffffff81cb35f4>] dump_stack+0x114/0x1a0 lib/dump_stack.c:51 [<ffffffff81d94225>] ubsan_epilogue+0x12/0x8f lib/ubsan.c:164 [<ffffffff81d954db>] __ubsan_handle_out_of_bounds+0x16e/0x1b2 lib/ubsan.c:382 [<ffffffff82a25acd>] __xfrm_policy_unlink+0x3dd/0x5b0 net/xfrm/xfrm_policy.c:1289 [<ffffffff82a2e572>] xfrm_policy_delete+0x52/0xb0 net/xfrm/xfrm_policy.c:1309 [<ffffffff82a3319b>] xfrm_policy_timer+0x30b/0x590 net/xfrm/xfrm_policy.c:243 [<ffffffff813d3927>] call_timer_fn+0x237/0x990 kernel/time/timer.c:1144 [<ffffffff813d8e7e>] __run_timers kernel/time/timer.c:1218 [inline] [<ffffffff813d8e7e>] run_timer_softirq+0x6ce/0xb80 kernel/time/timer.c:1401 [<ffffffff8120d6f9>] __do_softirq+0x299/0xe10 kernel/softirq.c:273 [<ffffffff8120e676>] invoke_softirq kernel/softirq.c:350 [inline] [<ffffffff8120e676>] irq_exit+0x216/0x2c0 kernel/softirq.c:391 [<ffffffff82c5edab>] exiting_irq arch/x86/include/asm/apic.h:652 [inline] [<ffffffff82c5edab>] smp_apic_timer_interrupt+0x8b/0xc0 arch/x86/kernel/apic/apic.c:926 [<ffffffff82c5c985>] apic_timer_interrupt+0xa5/0xb0 arch/x86/entry/entry_64.S:735 <EOI> [<ffffffff81188096>] ? native_safe_halt+0x6/0x10 arch/x86/include/asm/irqflags.h:52 [<ffffffff810834d7>] arch_safe_halt arch/x86/include/asm/paravirt.h:111 [inline] [<ffffffff810834d7>] default_idle+0x27/0x430 arch/x86/kernel/process.c:446 [<ffffffff81085f05>] arch_cpu_idle+0x15/0x20 arch/x86/kernel/process.c:437 [<ffffffff8132abc3>] default_idle_call+0x53/0x90 kernel/sched/idle.c:92 [<ffffffff8132b32d>] cpuidle_idle_call kernel/sched/idle.c:156 [inline] [<ffffffff8132b32d>] cpu_idle_loop kernel/sched/idle.c:251 [inline] [<ffffffff8132b32d>] cpu_startup_entry+0x60d/0x9a0 kernel/sched/idle.c:299 [<ffffffff8113e119>] start_secondary+0x3c9/0x560 arch/x86/kernel/smpboot.c:245
The issue is triggered as this:
xfrm_add_policy -->verify_newpolicy_info //check the index provided by user with XFRM_POLICY_MAX //In my case, the index is 0x6E6BB6, so it pass the check. -->xfrm_policy_construct //copy the user's policy and set xfrm_policy_timer -->xfrm_policy_insert --> __xfrm_policy_link //use the orgin dir, in my case is 2 --> xfrm_gen_index //generate policy index, there is 0x6E6BB6
then xfrm_policy_timer be fired
xfrm_policy_timer --> xfrm_policy_id2dir //get dir from (policy index & 7), in my case is 6 --> xfrm_policy_delete --> __xfrm_policy_unlink //access policy_count[dir], trigger out of range access
Add xfrm_policy_id2dir check in verify_newpolicy_info, make sure the computed dir is valid, to fix the issue.
Reported-by: Hulk Robot hulkci@huawei.com Fixes: e682adf021be ("xfrm: Try to honor policy index if it's supplied by user") Signed-off-by: YueHaibing yuehaibing@huawei.com Acked-by: Herbert Xu herbert@gondor.apana.org.au Signed-off-by: Steffen Klassert steffen.klassert@secunet.com Signed-off-by: Sasha Levin sashal@kernel.org --- net/xfrm/xfrm_user.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/net/xfrm/xfrm_user.c b/net/xfrm/xfrm_user.c index a131f9ff979e1..8d4d52fd457b2 100644 --- a/net/xfrm/xfrm_user.c +++ b/net/xfrm/xfrm_user.c @@ -1424,7 +1424,7 @@ static int verify_newpolicy_info(struct xfrm_userpolicy_info *p) ret = verify_policy_dir(p->dir); if (ret) return ret; - if (p->index && ((p->index & XFRM_POLICY_MAX) != p->dir)) + if (p->index && (xfrm_policy_id2dir(p->index) != p->dir)) return -EINVAL;
return 0;
From: Myungho Jung mhjungk@gmail.com
[ Upstream commit 6ed69184ed9c43873b8a1ee721e3bf3c08c2c6be ]
In esp4_gro_receive() and esp6_gro_receive(), secpath can be allocated without adding xfrm state to xvec. Then, sp->xvec[sp->len - 1] would fail and result in dereferencing invalid pointer in esp4_gso_segment() and esp6_gso_segment(). Reset secpath if xfrm function returns error.
Fixes: 7785bba299a8 ("esp: Add a software GRO codepath") Reported-by: syzbot+b69368fd933c6c592f4c@syzkaller.appspotmail.com Signed-off-by: Myungho Jung mhjungk@gmail.com Signed-off-by: Steffen Klassert steffen.klassert@secunet.com Signed-off-by: Sasha Levin sashal@kernel.org --- net/ipv4/esp4_offload.c | 8 +++++--- net/ipv6/esp6_offload.c | 8 +++++--- 2 files changed, 10 insertions(+), 6 deletions(-)
diff --git a/net/ipv4/esp4_offload.c b/net/ipv4/esp4_offload.c index 8756e0e790d2a..d3170a8001b2a 100644 --- a/net/ipv4/esp4_offload.c +++ b/net/ipv4/esp4_offload.c @@ -52,13 +52,13 @@ static struct sk_buff *esp4_gro_receive(struct list_head *head, goto out;
if (sp->len == XFRM_MAX_DEPTH) - goto out; + goto out_reset;
x = xfrm_state_lookup(dev_net(skb->dev), skb->mark, (xfrm_address_t *)&ip_hdr(skb)->daddr, spi, IPPROTO_ESP, AF_INET); if (!x) - goto out; + goto out_reset;
sp->xvec[sp->len++] = x; sp->olen++; @@ -66,7 +66,7 @@ static struct sk_buff *esp4_gro_receive(struct list_head *head, xo = xfrm_offload(skb); if (!xo) { xfrm_state_put(x); - goto out; + goto out_reset; } }
@@ -82,6 +82,8 @@ static struct sk_buff *esp4_gro_receive(struct list_head *head, xfrm_input(skb, IPPROTO_ESP, spi, -2);
return ERR_PTR(-EINPROGRESS); +out_reset: + secpath_reset(skb); out: skb_push(skb, offset); NAPI_GRO_CB(skb)->same_flow = 0; diff --git a/net/ipv6/esp6_offload.c b/net/ipv6/esp6_offload.c index d46b4eb645c2e..cb99f6fb79b79 100644 --- a/net/ipv6/esp6_offload.c +++ b/net/ipv6/esp6_offload.c @@ -74,13 +74,13 @@ static struct sk_buff *esp6_gro_receive(struct list_head *head, goto out;
if (sp->len == XFRM_MAX_DEPTH) - goto out; + goto out_reset;
x = xfrm_state_lookup(dev_net(skb->dev), skb->mark, (xfrm_address_t *)&ipv6_hdr(skb)->daddr, spi, IPPROTO_ESP, AF_INET6); if (!x) - goto out; + goto out_reset;
sp->xvec[sp->len++] = x; sp->olen++; @@ -88,7 +88,7 @@ static struct sk_buff *esp6_gro_receive(struct list_head *head, xo = xfrm_offload(skb); if (!xo) { xfrm_state_put(x); - goto out; + goto out_reset; } }
@@ -109,6 +109,8 @@ static struct sk_buff *esp6_gro_receive(struct list_head *head, xfrm_input(skb, IPPROTO_ESP, spi, -2);
return ERR_PTR(-EINPROGRESS); +out_reset: + secpath_reset(skb); out: skb_push(skb, offset); NAPI_GRO_CB(skb)->same_flow = 0;
From: Su Yanjun suyj.fnst@cn.fujitsu.com
[ Upstream commit 6ee02a54ef990a71bf542b6f0a4e3321de9d9c66 ]
When unloading xfrm6_tunnel module, xfrm6_tunnel_fini directly frees the xfrm6_tunnel_spi_kmem. Maybe someone has gotten the xfrm6_tunnel_spi, so need to wait it.
Fixes: 91cc3bb0b04ff("xfrm6_tunnel: RCU conversion") Signed-off-by: Su Yanjun suyj.fnst@cn.fujitsu.com Acked-by: Herbert Xu herbert@gondor.apana.org.au Signed-off-by: Steffen Klassert steffen.klassert@secunet.com Signed-off-by: Sasha Levin sashal@kernel.org --- net/ipv6/xfrm6_tunnel.c | 4 ++++ 1 file changed, 4 insertions(+)
diff --git a/net/ipv6/xfrm6_tunnel.c b/net/ipv6/xfrm6_tunnel.c index bc65db782bfb1..12cb3aa990af4 100644 --- a/net/ipv6/xfrm6_tunnel.c +++ b/net/ipv6/xfrm6_tunnel.c @@ -402,6 +402,10 @@ static void __exit xfrm6_tunnel_fini(void) xfrm6_tunnel_deregister(&xfrm6_tunnel_handler, AF_INET6); xfrm_unregister_type(&xfrm6_tunnel_type, AF_INET6); unregister_pernet_subsys(&xfrm6_tunnel_net_ops); + /* Someone maybe has gotten the xfrm6_tunnel_spi. + * So need to wait it. + */ + rcu_barrier(); kmem_cache_destroy(xfrm6_tunnel_spi_kmem); }
From: Jeremy Sowden jeremy@azazel.net
[ Upstream commit 5483844c3fc18474de29f5d6733003526e0a9f78 ]
If tunnel registration failed during module initialization, the module would fail to deregister the IPPROTO_COMP protocol and would attempt to deregister the tunnel.
The tunnel was not deregistered during module-exit.
Fixes: dd9ee3444014e ("vti4: Fix a ipip packet processing bug in 'IPCOMP' virtual tunnel") Signed-off-by: Jeremy Sowden jeremy@azazel.net Signed-off-by: Steffen Klassert steffen.klassert@secunet.com Signed-off-by: Sasha Levin sashal@kernel.org --- net/ipv4/ip_vti.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-)
diff --git a/net/ipv4/ip_vti.c b/net/ipv4/ip_vti.c index 68a21bf75dd0b..b6235ca09fa53 100644 --- a/net/ipv4/ip_vti.c +++ b/net/ipv4/ip_vti.c @@ -659,9 +659,9 @@ static int __init vti_init(void) return err;
rtnl_link_failed: - xfrm4_protocol_deregister(&vti_ipcomp4_protocol, IPPROTO_COMP); -xfrm_tunnel_failed: xfrm4_tunnel_deregister(&ipip_handler, AF_INET); +xfrm_tunnel_failed: + xfrm4_protocol_deregister(&vti_ipcomp4_protocol, IPPROTO_COMP); xfrm_proto_comp_failed: xfrm4_protocol_deregister(&vti_ah4_protocol, IPPROTO_AH); xfrm_proto_ah_failed: @@ -676,6 +676,7 @@ static int __init vti_init(void) static void __exit vti_fini(void) { rtnl_link_unregister(&vti_link_ops); + xfrm4_tunnel_deregister(&ipip_handler, AF_INET); xfrm4_protocol_deregister(&vti_ipcomp4_protocol, IPPROTO_COMP); xfrm4_protocol_deregister(&vti_ah4_protocol, IPPROTO_AH); xfrm4_protocol_deregister(&vti_esp4_protocol, IPPROTO_ESP);
From: Cong Wang xiyou.wangcong@gmail.com
[ Upstream commit dbb2483b2a46fbaf833cfb5deb5ed9cace9c7399 ]
In commit 6a53b7593233 ("xfrm: check id proto in validate_tmpl()") I introduced a check for xfrm protocol, but according to Herbert IPSEC_PROTO_ANY should only be used as a wildcard for lookup, so it should be removed from validate_tmpl().
And, IPSEC_PROTO_ANY is expected to only match 3 IPSec-specific protocols, this is why xfrm_state_flush() could still miss IPPROTO_ROUTING, which leads that those entries are left in net->xfrm.state_all before exit net. Fix this by replacing IPSEC_PROTO_ANY with zero.
This patch also extracts the check from validate_tmpl() to xfrm_id_proto_valid() and uses it in parse_ipsecrequest(). With this, no other protocols should be added into xfrm.
Fixes: 6a53b7593233 ("xfrm: check id proto in validate_tmpl()") Reported-by: syzbot+0bf0519d6e0de15914fe@syzkaller.appspotmail.com Cc: Steffen Klassert steffen.klassert@secunet.com Cc: Herbert Xu herbert@gondor.apana.org.au Signed-off-by: Cong Wang xiyou.wangcong@gmail.com Acked-by: Herbert Xu herbert@gondor.apana.org.au Signed-off-by: Steffen Klassert steffen.klassert@secunet.com Signed-off-by: Sasha Levin sashal@kernel.org --- include/net/xfrm.h | 17 +++++++++++++++++ net/ipv6/xfrm6_tunnel.c | 2 +- net/key/af_key.c | 4 +++- net/xfrm/xfrm_state.c | 2 +- net/xfrm/xfrm_user.c | 14 +------------- 5 files changed, 23 insertions(+), 16 deletions(-)
diff --git a/include/net/xfrm.h b/include/net/xfrm.h index 85386becbaea2..902437dfbce77 100644 --- a/include/net/xfrm.h +++ b/include/net/xfrm.h @@ -1404,6 +1404,23 @@ static inline int xfrm_state_kern(const struct xfrm_state *x) return atomic_read(&x->tunnel_users); }
+static inline bool xfrm_id_proto_valid(u8 proto) +{ + switch (proto) { + case IPPROTO_AH: + case IPPROTO_ESP: + case IPPROTO_COMP: +#if IS_ENABLED(CONFIG_IPV6) + case IPPROTO_ROUTING: + case IPPROTO_DSTOPTS: +#endif + return true; + default: + return false; + } +} + +/* IPSEC_PROTO_ANY only matches 3 IPsec protocols, 0 could match all. */ static inline int xfrm_id_proto_match(u8 proto, u8 userproto) { return (!userproto || proto == userproto || diff --git a/net/ipv6/xfrm6_tunnel.c b/net/ipv6/xfrm6_tunnel.c index 12cb3aa990af4..d9e5f6808811a 100644 --- a/net/ipv6/xfrm6_tunnel.c +++ b/net/ipv6/xfrm6_tunnel.c @@ -345,7 +345,7 @@ static void __net_exit xfrm6_tunnel_net_exit(struct net *net) unsigned int i;
xfrm_flush_gc(); - xfrm_state_flush(net, IPSEC_PROTO_ANY, false, true); + xfrm_state_flush(net, 0, false, true);
for (i = 0; i < XFRM6_TUNNEL_SPI_BYADDR_HSIZE; i++) WARN_ON_ONCE(!hlist_empty(&xfrm6_tn->spi_byaddr[i])); diff --git a/net/key/af_key.c b/net/key/af_key.c index 5651c29cb5bd0..4af1e1d60b9f2 100644 --- a/net/key/af_key.c +++ b/net/key/af_key.c @@ -1951,8 +1951,10 @@ parse_ipsecrequest(struct xfrm_policy *xp, struct sadb_x_ipsecrequest *rq)
if (rq->sadb_x_ipsecrequest_mode == 0) return -EINVAL; + if (!xfrm_id_proto_valid(rq->sadb_x_ipsecrequest_proto)) + return -EINVAL;
- t->id.proto = rq->sadb_x_ipsecrequest_proto; /* XXX check proto */ + t->id.proto = rq->sadb_x_ipsecrequest_proto; if ((mode = pfkey_mode_to_xfrm(rq->sadb_x_ipsecrequest_mode)) < 0) return -EINVAL; t->mode = mode; diff --git a/net/xfrm/xfrm_state.c b/net/xfrm/xfrm_state.c index 1bb971f46fc6f..178baaa037e5b 100644 --- a/net/xfrm/xfrm_state.c +++ b/net/xfrm/xfrm_state.c @@ -2384,7 +2384,7 @@ void xfrm_state_fini(struct net *net)
flush_work(&net->xfrm.state_hash_work); flush_work(&xfrm_state_gc_work); - xfrm_state_flush(net, IPSEC_PROTO_ANY, false, true); + xfrm_state_flush(net, 0, false, true);
WARN_ON(!list_empty(&net->xfrm.state_all));
diff --git a/net/xfrm/xfrm_user.c b/net/xfrm/xfrm_user.c index 8d4d52fd457b2..6916931b1de1c 100644 --- a/net/xfrm/xfrm_user.c +++ b/net/xfrm/xfrm_user.c @@ -1513,20 +1513,8 @@ static int validate_tmpl(int nr, struct xfrm_user_tmpl *ut, u16 family) return -EINVAL; }
- switch (ut[i].id.proto) { - case IPPROTO_AH: - case IPPROTO_ESP: - case IPPROTO_COMP: -#if IS_ENABLED(CONFIG_IPV6) - case IPPROTO_ROUTING: - case IPPROTO_DSTOPTS: -#endif - case IPSEC_PROTO_ANY: - break; - default: + if (!xfrm_id_proto_valid(ut[i].id.proto)) return -EINVAL; - } - }
return 0;
From: Sabrina Dubroca sd@queasysnail.net
[ Upstream commit 8dfb4eba4100e7cdd161a8baef2d8d61b7a7e62e ]
esp_output_udp_encap can produce a length that doesn't fit in the 16 bits of a UDP header's length field. In that case, we'll send a fragmented packet whose length is larger than IP_MAX_MTU (resulting in "Oversized IP packet" warnings on receive) and with a bogus UDP length.
To prevent this, add a length check to esp_output_udp_encap and return -EMSGSIZE on failure.
This seems to be older than git history.
Signed-off-by: Sabrina Dubroca sd@queasysnail.net Signed-off-by: Steffen Klassert steffen.klassert@secunet.com Signed-off-by: Sasha Levin sashal@kernel.org --- net/ipv4/esp4.c | 20 +++++++++++++++----- 1 file changed, 15 insertions(+), 5 deletions(-)
diff --git a/net/ipv4/esp4.c b/net/ipv4/esp4.c index 10e809b296ec8..fb065a8937ea2 100644 --- a/net/ipv4/esp4.c +++ b/net/ipv4/esp4.c @@ -226,7 +226,7 @@ static void esp_output_fill_trailer(u8 *tail, int tfclen, int plen, __u8 proto) tail[plen - 1] = proto; }
-static void esp_output_udp_encap(struct xfrm_state *x, struct sk_buff *skb, struct esp_info *esp) +static int esp_output_udp_encap(struct xfrm_state *x, struct sk_buff *skb, struct esp_info *esp) { int encap_type; struct udphdr *uh; @@ -234,6 +234,7 @@ static void esp_output_udp_encap(struct xfrm_state *x, struct sk_buff *skb, stru __be16 sport, dport; struct xfrm_encap_tmpl *encap = x->encap; struct ip_esp_hdr *esph = esp->esph; + unsigned int len;
spin_lock_bh(&x->lock); sport = encap->encap_sport; @@ -241,11 +242,14 @@ static void esp_output_udp_encap(struct xfrm_state *x, struct sk_buff *skb, stru encap_type = encap->encap_type; spin_unlock_bh(&x->lock);
+ len = skb->len + esp->tailen - skb_transport_offset(skb); + if (len + sizeof(struct iphdr) >= IP_MAX_MTU) + return -EMSGSIZE; + uh = (struct udphdr *)esph; uh->source = sport; uh->dest = dport; - uh->len = htons(skb->len + esp->tailen - - skb_transport_offset(skb)); + uh->len = htons(len); uh->check = 0;
switch (encap_type) { @@ -262,6 +266,8 @@ static void esp_output_udp_encap(struct xfrm_state *x, struct sk_buff *skb, stru
*skb_mac_header(skb) = IPPROTO_UDP; esp->esph = esph; + + return 0; }
int esp_output_head(struct xfrm_state *x, struct sk_buff *skb, struct esp_info *esp) @@ -275,8 +281,12 @@ int esp_output_head(struct xfrm_state *x, struct sk_buff *skb, struct esp_info * int tailen = esp->tailen;
/* this is non-NULL only with UDP Encapsulation */ - if (x->encap) - esp_output_udp_encap(x, skb, esp); + if (x->encap) { + int err = esp_output_udp_encap(x, skb, esp); + + if (err < 0) + return err; + }
if (!skb_cloned(skb)) { if (tailen <= skb_tailroom(skb)) {
From: Martin Willi martin@strongswan.org
[ Upstream commit 025c65e119bf58b610549ca359c9ecc5dee6a8d2 ]
If an xfrmi is associated to a vrf layer 3 master device, xfrm_policy_check() fails after traffic decapsulation. The input interface is replaced by the layer 3 master device, and hence xfrmi_decode_session() can't match the xfrmi anymore to satisfy policy checking.
Extend ingress xfrmi lookup to honor the original layer 3 slave device, allowing xfrm interfaces to operate within a vrf domain.
Fixes: f203b76d7809 ("xfrm: Add virtual xfrm interfaces") Signed-off-by: Martin Willi martin@strongswan.org Signed-off-by: Steffen Klassert steffen.klassert@secunet.com Signed-off-by: Sasha Levin sashal@kernel.org --- include/net/xfrm.h | 3 ++- net/xfrm/xfrm_interface.c | 17 ++++++++++++++--- net/xfrm/xfrm_policy.c | 2 +- 3 files changed, 17 insertions(+), 5 deletions(-)
diff --git a/include/net/xfrm.h b/include/net/xfrm.h index 902437dfbce77..c9b0b2b5d672f 100644 --- a/include/net/xfrm.h +++ b/include/net/xfrm.h @@ -295,7 +295,8 @@ struct xfrm_replay { };
struct xfrm_if_cb { - struct xfrm_if *(*decode_session)(struct sk_buff *skb); + struct xfrm_if *(*decode_session)(struct sk_buff *skb, + unsigned short family); };
void xfrm_if_register_cb(const struct xfrm_if_cb *ifcb); diff --git a/net/xfrm/xfrm_interface.c b/net/xfrm/xfrm_interface.c index dbb3c1945b5c9..85fec98676d34 100644 --- a/net/xfrm/xfrm_interface.c +++ b/net/xfrm/xfrm_interface.c @@ -70,17 +70,28 @@ static struct xfrm_if *xfrmi_lookup(struct net *net, struct xfrm_state *x) return NULL; }
-static struct xfrm_if *xfrmi_decode_session(struct sk_buff *skb) +static struct xfrm_if *xfrmi_decode_session(struct sk_buff *skb, + unsigned short family) { struct xfrmi_net *xfrmn; - int ifindex; struct xfrm_if *xi; + int ifindex = 0;
if (!secpath_exists(skb) || !skb->dev) return NULL;
+ switch (family) { + case AF_INET6: + ifindex = inet6_sdif(skb); + break; + case AF_INET: + ifindex = inet_sdif(skb); + break; + } + if (!ifindex) + ifindex = skb->dev->ifindex; + xfrmn = net_generic(xs_net(xfrm_input_state(skb)), xfrmi_net_id); - ifindex = skb->dev->ifindex;
for_each_xfrmi_rcu(xfrmn->xfrmi[0], xi) { if (ifindex == xi->dev->ifindex && diff --git a/net/xfrm/xfrm_policy.c b/net/xfrm/xfrm_policy.c index 8d1a898d0ba56..a6b58df7a70f6 100644 --- a/net/xfrm/xfrm_policy.c +++ b/net/xfrm/xfrm_policy.c @@ -3313,7 +3313,7 @@ int __xfrm_policy_check(struct sock *sk, int dir, struct sk_buff *skb, ifcb = xfrm_if_get_cb();
if (ifcb) { - xi = ifcb->decode_session(skb); + xi = ifcb->decode_session(skb, family); if (xi) { if_id = xi->p.if_id; net = xi->net;
From: Steffen Klassert steffen.klassert@secunet.com
[ Upstream commit 8742dc86d0c7a9628117a989c11f04a9b6b898f3 ]
We currently don't reload pointers pointing into skb header after doing pskb_may_pull() in _decode_session4(). So in case pskb_may_pull() changed the pointers, we read from random memory. Fix this by putting all the needed infos on the stack, so that we don't need to access the header pointers after doing pskb_may_pull().
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Steffen Klassert steffen.klassert@secunet.com Signed-off-by: Sasha Levin sashal@kernel.org --- net/ipv4/xfrm4_policy.c | 24 +++++++++++++----------- 1 file changed, 13 insertions(+), 11 deletions(-)
diff --git a/net/ipv4/xfrm4_policy.c b/net/ipv4/xfrm4_policy.c index d73a6d6652f60..2b144b92ae46a 100644 --- a/net/ipv4/xfrm4_policy.c +++ b/net/ipv4/xfrm4_policy.c @@ -111,7 +111,8 @@ static void _decode_session4(struct sk_buff *skb, struct flowi *fl, int reverse) { const struct iphdr *iph = ip_hdr(skb); - u8 *xprth = skb_network_header(skb) + iph->ihl * 4; + int ihl = iph->ihl; + u8 *xprth = skb_network_header(skb) + ihl * 4; struct flowi4 *fl4 = &fl->u.ip4; int oif = 0;
@@ -122,6 +123,11 @@ _decode_session4(struct sk_buff *skb, struct flowi *fl, int reverse) fl4->flowi4_mark = skb->mark; fl4->flowi4_oif = reverse ? skb->skb_iif : oif;
+ fl4->flowi4_proto = iph->protocol; + fl4->daddr = reverse ? iph->saddr : iph->daddr; + fl4->saddr = reverse ? iph->daddr : iph->saddr; + fl4->flowi4_tos = iph->tos; + if (!ip_is_fragment(iph)) { switch (iph->protocol) { case IPPROTO_UDP: @@ -133,7 +139,7 @@ _decode_session4(struct sk_buff *skb, struct flowi *fl, int reverse) pskb_may_pull(skb, xprth + 4 - skb->data)) { __be16 *ports;
- xprth = skb_network_header(skb) + iph->ihl * 4; + xprth = skb_network_header(skb) + ihl * 4; ports = (__be16 *)xprth;
fl4->fl4_sport = ports[!!reverse]; @@ -146,7 +152,7 @@ _decode_session4(struct sk_buff *skb, struct flowi *fl, int reverse) pskb_may_pull(skb, xprth + 2 - skb->data)) { u8 *icmp;
- xprth = skb_network_header(skb) + iph->ihl * 4; + xprth = skb_network_header(skb) + ihl * 4; icmp = xprth;
fl4->fl4_icmp_type = icmp[0]; @@ -159,7 +165,7 @@ _decode_session4(struct sk_buff *skb, struct flowi *fl, int reverse) pskb_may_pull(skb, xprth + 4 - skb->data)) { __be32 *ehdr;
- xprth = skb_network_header(skb) + iph->ihl * 4; + xprth = skb_network_header(skb) + ihl * 4; ehdr = (__be32 *)xprth;
fl4->fl4_ipsec_spi = ehdr[0]; @@ -171,7 +177,7 @@ _decode_session4(struct sk_buff *skb, struct flowi *fl, int reverse) pskb_may_pull(skb, xprth + 8 - skb->data)) { __be32 *ah_hdr;
- xprth = skb_network_header(skb) + iph->ihl * 4; + xprth = skb_network_header(skb) + ihl * 4; ah_hdr = (__be32 *)xprth;
fl4->fl4_ipsec_spi = ah_hdr[1]; @@ -183,7 +189,7 @@ _decode_session4(struct sk_buff *skb, struct flowi *fl, int reverse) pskb_may_pull(skb, xprth + 4 - skb->data)) { __be16 *ipcomp_hdr;
- xprth = skb_network_header(skb) + iph->ihl * 4; + xprth = skb_network_header(skb) + ihl * 4; ipcomp_hdr = (__be16 *)xprth;
fl4->fl4_ipsec_spi = htonl(ntohs(ipcomp_hdr[1])); @@ -196,7 +202,7 @@ _decode_session4(struct sk_buff *skb, struct flowi *fl, int reverse) __be16 *greflags; __be32 *gre_hdr;
- xprth = skb_network_header(skb) + iph->ihl * 4; + xprth = skb_network_header(skb) + ihl * 4; greflags = (__be16 *)xprth; gre_hdr = (__be32 *)xprth;
@@ -213,10 +219,6 @@ _decode_session4(struct sk_buff *skb, struct flowi *fl, int reverse) break; } } - fl4->flowi4_proto = iph->protocol; - fl4->daddr = reverse ? iph->saddr : iph->daddr; - fl4->saddr = reverse ? iph->daddr : iph->saddr; - fl4->flowi4_tos = iph->tos; }
static void xfrm4_update_pmtu(struct dst_entry *dst, struct sock *sk,
From: Vineet Gupta vgupta@synopsys.com
[ Upstream commit 99bd5fcc505d65ea9c60619202f0b2d926eabbe9 ]
HSDK currently panics when built for HIGHMEM/ARC_HAS_PAE40 because ioc is enabled with default which doesn't work for the 2 non contiguous memory nodes. So get PAE working by disabling ioc instead.
Tested with !PAE40 by forcing @ioc_enable=0 and running the glibc testsuite over ssh
Signed-off-by: Vineet Gupta vgupta@synopsys.com Signed-off-by: Sasha Levin sashal@kernel.org --- arch/arc/mm/cache.c | 31 ++++++++++++++++--------------- 1 file changed, 16 insertions(+), 15 deletions(-)
diff --git a/arch/arc/mm/cache.c b/arch/arc/mm/cache.c index 4135abec3fb09..63e6e65046992 100644 --- a/arch/arc/mm/cache.c +++ b/arch/arc/mm/cache.c @@ -113,10 +113,24 @@ static void read_decode_cache_bcr_arcv2(int cpu) }
READ_BCR(ARC_REG_CLUSTER_BCR, cbcr); - if (cbcr.c) + if (cbcr.c) { ioc_exists = 1; - else + + /* + * As for today we don't support both IOC and ZONE_HIGHMEM enabled + * simultaneously. This happens because as of today IOC aperture covers + * only ZONE_NORMAL (low mem) and any dma transactions outside this + * region won't be HW coherent. + * If we want to use both IOC and ZONE_HIGHMEM we can use + * bounce_buffer to handle dma transactions to HIGHMEM. + * Also it is possible to modify dma_direct cache ops or increase IOC + * aperture size if we are planning to use HIGHMEM without PAE. + */ + if (IS_ENABLED(CONFIG_HIGHMEM) || is_pae40_enabled()) + ioc_enable = 0; + } else { ioc_enable = 0; + }
/* HS 2.0 didn't have AUX_VOL */ if (cpuinfo_arc700[cpu].core.family > 0x51) { @@ -1158,19 +1172,6 @@ noinline void __init arc_ioc_setup(void) if (!ioc_enable) return;
- /* - * As for today we don't support both IOC and ZONE_HIGHMEM enabled - * simultaneously. This happens because as of today IOC aperture covers - * only ZONE_NORMAL (low mem) and any dma transactions outside this - * region won't be HW coherent. - * If we want to use both IOC and ZONE_HIGHMEM we can use - * bounce_buffer to handle dma transactions to HIGHMEM. - * Also it is possible to modify dma_direct cache ops or increase IOC - * aperture size if we are planning to use HIGHMEM without PAE. - */ - if (IS_ENABLED(CONFIG_HIGHMEM)) - panic("IOC and HIGHMEM can't be used simultaneously"); - /* Flush + invalidate + disable L1 dcache */ __dc_disable();
From: Jernej Skrabec jernej.skrabec@siol.net
[ Upstream commit 2abc330c514fe56c570bb1a6318b054b06a4f72e ]
Sometimes one of the nkmp factors is unused. This means that one of the factors shift and width values are set to 0. Current nkmp clock code generates a mask for each factor with GENMASK(width + shift - 1, shift). For unused factor this translates to GENMASK(-1, 0). This code is further expanded by C preprocessor to final version: (((~0UL) - (1UL << (0)) + 1) & (~0UL >> (BITS_PER_LONG - 1 - (-1)))) or a bit simplified: (~0UL & (~0UL >> BITS_PER_LONG))
It turns out that result of the second part (~0UL >> BITS_PER_LONG) is actually undefined by C standard, which clearly specifies:
"If the value of the right operand is negative or is greater than or equal to the width of the promoted left operand, the behavior is undefined."
Additionally, compiling kernel with aarch64-linux-gnu-gcc 8.3.0 gave different results whether literals or variables with same values as literals were used. GENMASK with literals -1 and 0 gives zero and with variables gives 0xFFFFFFFFFFFFFFF (~0UL). Because nkmp driver uses GENMASK with variables as parameter, expression calculates mask as ~0UL instead of 0. This has further consequences that LSB in register is always set to 1 (1 is neutral value for a factor and shift is 0).
For example, H6 pll-de clock is set to 600 MHz by sun4i-drm driver, but due to this bug ends up being 300 MHz. Additionally, 300 MHz seems to be too low because following warning can be found in dmesg:
[ 1.752763] WARNING: CPU: 2 PID: 41 at drivers/clk/sunxi-ng/ccu_common.c:41 ccu_helper_wait_for_lock.part.0+0x6c/0x90 [ 1.763378] Modules linked in: [ 1.766441] CPU: 2 PID: 41 Comm: kworker/2:1 Not tainted 5.1.0-rc2-next-20190401 #138 [ 1.774269] Hardware name: Pine H64 (DT) [ 1.778200] Workqueue: events deferred_probe_work_func [ 1.783341] pstate: 40000005 (nZcv daif -PAN -UAO) [ 1.788135] pc : ccu_helper_wait_for_lock.part.0+0x6c/0x90 [ 1.793623] lr : ccu_helper_wait_for_lock.part.0+0x48/0x90 [ 1.799107] sp : ffff000010f93840 [ 1.802422] x29: ffff000010f93840 x28: 0000000000000000 [ 1.807735] x27: ffff800073ce9d80 x26: ffff000010afd1b8 [ 1.813049] x25: ffffffffffffffff x24: 00000000ffffffff [ 1.818362] x23: 0000000000000001 x22: ffff000010abd5c8 [ 1.823675] x21: 0000000010000000 x20: 00000000685f367e [ 1.828987] x19: 0000000000001801 x18: 0000000000000001 [ 1.834300] x17: 0000000000000001 x16: 0000000000000000 [ 1.839613] x15: 0000000000000000 x14: ffff000010789858 [ 1.844926] x13: 0000000000000000 x12: 0000000000000001 [ 1.850239] x11: 0000000000000000 x10: 0000000000000970 [ 1.855551] x9 : ffff000010f936c0 x8 : ffff800074cec0d0 [ 1.860864] x7 : 0000800067117000 x6 : 0000000115c30b41 [ 1.866177] x5 : 00ffffffffffffff x4 : 002c959300bfe500 [ 1.871490] x3 : 0000000000000018 x2 : 0000000029aaaaab [ 1.876802] x1 : 00000000000002e6 x0 : 00000000686072bc [ 1.882114] Call trace: [ 1.884565] ccu_helper_wait_for_lock.part.0+0x6c/0x90 [ 1.889705] ccu_helper_wait_for_lock+0x10/0x20 [ 1.894236] ccu_nkmp_set_rate+0x244/0x2a8 [ 1.898334] clk_change_rate+0x144/0x290 [ 1.902258] clk_core_set_rate_nolock+0x180/0x1b8 [ 1.906963] clk_set_rate+0x34/0xa0 [ 1.910455] sun8i_mixer_bind+0x484/0x558 [ 1.914466] component_bind_all+0x10c/0x230 [ 1.918651] sun4i_drv_bind+0xc4/0x1a0 [ 1.922401] try_to_bring_up_master+0x164/0x1c0 [ 1.926932] __component_add+0xa0/0x168 [ 1.930769] component_add+0x10/0x18 [ 1.934346] sun8i_dw_hdmi_probe+0x18/0x20 [ 1.938443] platform_drv_probe+0x50/0xa0 [ 1.942455] really_probe+0xcc/0x280 [ 1.946032] driver_probe_device+0x54/0xe8 [ 1.950130] __device_attach_driver+0x80/0xb8 [ 1.954488] bus_for_each_drv+0x78/0xc8 [ 1.958326] __device_attach+0xd4/0x130 [ 1.962163] device_initial_probe+0x10/0x18 [ 1.966348] bus_probe_device+0x90/0x98 [ 1.970185] deferred_probe_work_func+0x6c/0xa0 [ 1.974720] process_one_work+0x1e0/0x320 [ 1.978732] worker_thread+0x228/0x428 [ 1.982484] kthread+0x120/0x128 [ 1.985714] ret_from_fork+0x10/0x18 [ 1.989290] ---[ end trace 9babd42e1ca4b84f ]---
This commit solves the issue by first checking value of the factor width. If it is equal to 0 (unused factor), mask is set to 0, otherwise GENMASK() macro is used as before.
Fixes: d897ef56faf9 ("clk: sunxi-ng: Mask nkmp factors when setting register") Signed-off-by: Jernej Skrabec jernej.skrabec@siol.net Signed-off-by: Maxime Ripard maxime.ripard@bootlin.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/clk/sunxi-ng/ccu_nkmp.c | 18 +++++++++++++----- 1 file changed, 13 insertions(+), 5 deletions(-)
diff --git a/drivers/clk/sunxi-ng/ccu_nkmp.c b/drivers/clk/sunxi-ng/ccu_nkmp.c index 9b49adb20d07c..69dfc6de1c4e6 100644 --- a/drivers/clk/sunxi-ng/ccu_nkmp.c +++ b/drivers/clk/sunxi-ng/ccu_nkmp.c @@ -167,7 +167,7 @@ static int ccu_nkmp_set_rate(struct clk_hw *hw, unsigned long rate, unsigned long parent_rate) { struct ccu_nkmp *nkmp = hw_to_ccu_nkmp(hw); - u32 n_mask, k_mask, m_mask, p_mask; + u32 n_mask = 0, k_mask = 0, m_mask = 0, p_mask = 0; struct _ccu_nkmp _nkmp; unsigned long flags; u32 reg; @@ -186,10 +186,18 @@ static int ccu_nkmp_set_rate(struct clk_hw *hw, unsigned long rate,
ccu_nkmp_find_best(parent_rate, rate, &_nkmp);
- n_mask = GENMASK(nkmp->n.width + nkmp->n.shift - 1, nkmp->n.shift); - k_mask = GENMASK(nkmp->k.width + nkmp->k.shift - 1, nkmp->k.shift); - m_mask = GENMASK(nkmp->m.width + nkmp->m.shift - 1, nkmp->m.shift); - p_mask = GENMASK(nkmp->p.width + nkmp->p.shift - 1, nkmp->p.shift); + if (nkmp->n.width) + n_mask = GENMASK(nkmp->n.width + nkmp->n.shift - 1, + nkmp->n.shift); + if (nkmp->k.width) + k_mask = GENMASK(nkmp->k.width + nkmp->k.shift - 1, + nkmp->k.shift); + if (nkmp->m.width) + m_mask = GENMASK(nkmp->m.width + nkmp->m.shift - 1, + nkmp->m.shift); + if (nkmp->p.width) + p_mask = GENMASK(nkmp->p.width + nkmp->p.shift - 1, + nkmp->p.shift);
spin_lock_irqsave(nkmp->common.lock, flags);
From: Suraj Jitindar Singh sjitindarsingh@gmail.com
[ Upstream commit 7cb9eb106d7a4efab6bcf30ec9503f1d703c77f5 ]
There is a hardware bug in some POWER9 processors where a treclaim in fake suspend mode can cause an inconsistency in the XER[SO] bit across the threads of a core, the workaround being to force the core into SMT4 when doing the treclaim.
The FAKE_SUSPEND bit (bit 10) in the PSSCR is used to control whether a thread is in fake suspend or real suspend. The important difference here being that thread reconfiguration is blocked in real suspend but not fake suspend mode.
When we exit a guest which was in fake suspend mode, we force the core into SMT4 while we do the treclaim in kvmppc_save_tm_hv(). However on the new exit path introduced with the function kvmhv_run_single_vcpu() we restore the host PSSCR before calling kvmppc_save_tm_hv() which means that if we were in fake suspend mode we put the thread into real suspend mode when we clear the PSSCR[FAKE_SUSPEND] bit. This means that we block thread reconfiguration and the thread which is trying to get the core into SMT4 before it can do the treclaim spins forever since it itself is blocking thread reconfiguration. The result is that that core is essentially lost.
This results in a trace such as: [ 93.512904] CPU: 7 PID: 13352 Comm: qemu-system-ppc Not tainted 5.0.0 #4 [ 93.512905] NIP: c000000000098a04 LR: c0000000000cc59c CTR: 0000000000000000 [ 93.512908] REGS: c000003fffd2bd70 TRAP: 0100 Not tainted (5.0.0) [ 93.512908] MSR: 9000000302883033 <SF,HV,VEC,VSX,FP,ME,IR,DR,RI,LE,TM[SE]> CR: 22222444 XER: 00000000 [ 93.512914] CFAR: c000000000098a5c IRQMASK: 3 [ 93.512915] PACATMSCRATCH: 0000000000000001 [ 93.512916] GPR00: 0000000000000001 c000003f6cc1b830 c000000001033100 0000000000000004 [ 93.512928] GPR04: 0000000000000004 0000000000000002 0000000000000004 0000000000000007 [ 93.512930] GPR08: 0000000000000000 0000000000000004 0000000000000000 0000000000000004 [ 93.512932] GPR12: c000203fff7fc000 c000003fffff9500 0000000000000000 0000000000000000 [ 93.512935] GPR16: 2000000000300375 000000000000059f 0000000000000000 0000000000000000 [ 93.512951] GPR20: 0000000000000000 0000000000080053 004000000256f41f c000003f6aa88ef0 [ 93.512953] GPR24: c000003f6aa89100 0000000000000010 0000000000000000 0000000000000000 [ 93.512956] GPR28: c000003f9e9a0800 0000000000000000 0000000000000001 c000203fff7fc000 [ 93.512959] NIP [c000000000098a04] pnv_power9_force_smt4_catch+0x1b4/0x2c0 [ 93.512960] LR [c0000000000cc59c] kvmppc_save_tm_hv+0x40/0x88 [ 93.512960] Call Trace: [ 93.512961] [c000003f6cc1b830] [0000000000080053] 0x80053 (unreliable) [ 93.512965] [c000003f6cc1b8a0] [c00800001e9cb030] kvmhv_p9_guest_entry+0x508/0x6b0 [kvm_hv] [ 93.512967] [c000003f6cc1b940] [c00800001e9cba44] kvmhv_run_single_vcpu+0x2dc/0xb90 [kvm_hv] [ 93.512968] [c000003f6cc1ba10] [c00800001e9cc948] kvmppc_vcpu_run_hv+0x650/0xb90 [kvm_hv] [ 93.512969] [c000003f6cc1bae0] [c00800001e8f620c] kvmppc_vcpu_run+0x34/0x48 [kvm] [ 93.512971] [c000003f6cc1bb00] [c00800001e8f2d4c] kvm_arch_vcpu_ioctl_run+0x2f4/0x400 [kvm] [ 93.512972] [c000003f6cc1bb90] [c00800001e8e3918] kvm_vcpu_ioctl+0x460/0x7d0 [kvm] [ 93.512974] [c000003f6cc1bd00] [c0000000003ae2c0] do_vfs_ioctl+0xe0/0x8e0 [ 93.512975] [c000003f6cc1bdb0] [c0000000003aeb24] ksys_ioctl+0x64/0xe0 [ 93.512978] [c000003f6cc1be00] [c0000000003aebc8] sys_ioctl+0x28/0x80 [ 93.512981] [c000003f6cc1be20] [c00000000000b3a4] system_call+0x5c/0x70 [ 93.512983] Instruction dump: [ 93.512986] 419dffbc e98c0000 2e8b0000 38000001 60000000 60000000 60000000 40950068 [ 93.512993] 392bffff 39400000 79290020 39290001 <7d2903a6> 60000000 60000000 7d235214
To fix this we preserve the PSSCR[FAKE_SUSPEND] bit until we call kvmppc_save_tm_hv() which will mean the core can get into SMT4 and perform the treclaim. Note kvmppc_save_tm_hv() clears the PSSCR[FAKE_SUSPEND] bit again so there is no need to explicitly do that.
Fixes: 95a6432ce9038 ("KVM: PPC: Book3S HV: Streamlined guest entry/exit path on P9 for radix guests")
Signed-off-by: Suraj Jitindar Singh sjitindarsingh@gmail.com Signed-off-by: Paul Mackerras paulus@ozlabs.org Signed-off-by: Sasha Levin sashal@kernel.org --- arch/powerpc/kvm/book3s_hv.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c index 5a066fc299e17..f17065f2c962f 100644 --- a/arch/powerpc/kvm/book3s_hv.c +++ b/arch/powerpc/kvm/book3s_hv.c @@ -3407,7 +3407,9 @@ static int kvmhv_load_hv_regs_and_go(struct kvm_vcpu *vcpu, u64 time_limit, vcpu->arch.shregs.sprg2 = mfspr(SPRN_SPRG2); vcpu->arch.shregs.sprg3 = mfspr(SPRN_SPRG3);
- mtspr(SPRN_PSSCR, host_psscr); + /* Preserve PSSCR[FAKE_SUSPEND] until we've called kvmppc_save_tm_hv */ + mtspr(SPRN_PSSCR, host_psscr | + (local_paca->kvm_hstate.fake_suspend << PSSCR_FAKE_SUSPEND_LG)); mtspr(SPRN_HFSCR, host_hfscr); mtspr(SPRN_CIABR, host_ciabr); mtspr(SPRN_DAWR, host_dawr);
From: Alexey Kardashevskiy aik@ozlabs.ru
[ Upstream commit 345077c8e172c255ea0707214303ccd099e5656b ]
Guest physical to user address translation uses KVM memslots and reading these requires holding the kvm->srcu lock. However recently introduced kvmppc_tce_validate() broke the rule (see the lockdep warning below).
This moves srcu_read_lock(&vcpu->kvm->srcu) earlier to protect kvmppc_tce_validate() as well.
============================= WARNING: suspicious RCU usage 5.1.0-rc2-le_nv2_aikATfstn1-p1 #380 Not tainted ----------------------------- include/linux/kvm_host.h:605 suspicious rcu_dereference_check() usage!
other info that might help us debug this:
rcu_scheduler_active = 2, debug_locks = 1 1 lock held by qemu-system-ppc/8020: #0: 0000000094972fe9 (&vcpu->mutex){+.+.}, at: kvm_vcpu_ioctl+0xdc/0x850 [kvm]
stack backtrace: CPU: 44 PID: 8020 Comm: qemu-system-ppc Not tainted 5.1.0-rc2-le_nv2_aikATfstn1-p1 #380 Call Trace: [c000003fece8f740] [c000000000bcc134] dump_stack+0xe8/0x164 (unreliable) [c000003fece8f790] [c000000000181be0] lockdep_rcu_suspicious+0x130/0x170 [c000003fece8f810] [c0000000000d5f50] kvmppc_tce_to_ua+0x280/0x290 [c000003fece8f870] [c00800001a7e2c78] kvmppc_tce_validate+0x80/0x1b0 [kvm] [c000003fece8f8e0] [c00800001a7e3fac] kvmppc_h_put_tce+0x94/0x3e4 [kvm] [c000003fece8f9a0] [c00800001a8baac4] kvmppc_pseries_do_hcall+0x30c/0xce0 [kvm_hv] [c000003fece8fa10] [c00800001a8bd89c] kvmppc_vcpu_run_hv+0x694/0xec0 [kvm_hv] [c000003fece8fae0] [c00800001a7d95dc] kvmppc_vcpu_run+0x34/0x48 [kvm] [c000003fece8fb00] [c00800001a7d56bc] kvm_arch_vcpu_ioctl_run+0x2f4/0x400 [kvm] [c000003fece8fb90] [c00800001a7c3618] kvm_vcpu_ioctl+0x460/0x850 [kvm] [c000003fece8fd00] [c00000000041c4f4] do_vfs_ioctl+0xe4/0x930 [c000003fece8fdb0] [c00000000041ce04] ksys_ioctl+0xc4/0x110 [c000003fece8fe00] [c00000000041ce78] sys_ioctl+0x28/0x80 [c000003fece8fe20] [c00000000000b5a4] system_call+0x5c/0x70
Fixes: 42de7b9e2167 ("KVM: PPC: Validate TCEs against preregistered memory page sizes", 2018-09-10) Signed-off-by: Alexey Kardashevskiy aik@ozlabs.ru Signed-off-by: Paul Mackerras paulus@ozlabs.org Signed-off-by: Sasha Levin sashal@kernel.org --- arch/powerpc/kvm/book3s_64_vio.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/arch/powerpc/kvm/book3s_64_vio.c b/arch/powerpc/kvm/book3s_64_vio.c index 532ab79734c7a..d43e8fe6d4241 100644 --- a/arch/powerpc/kvm/book3s_64_vio.c +++ b/arch/powerpc/kvm/book3s_64_vio.c @@ -543,14 +543,14 @@ long kvmppc_h_put_tce(struct kvm_vcpu *vcpu, unsigned long liobn, if (ret != H_SUCCESS) return ret;
+ idx = srcu_read_lock(&vcpu->kvm->srcu); + ret = kvmppc_tce_validate(stt, tce); if (ret != H_SUCCESS) - return ret; + goto unlock_exit;
dir = iommu_tce_direction(tce);
- idx = srcu_read_lock(&vcpu->kvm->srcu); - if ((dir != DMA_NONE) && kvmppc_tce_to_ua(vcpu->kvm, tce, &ua, NULL)) { ret = H_PARAMETER; goto unlock_exit;
From: Tony Lindgren tony@atomide.com
[ Upstream commit dbe7208c6c4aec083571f2ec742870a0d0edbea3 ]
If called fast enough so samples do not increment, we can get division by zero in kernel:
__div0 cpcap_battery_cc_raw_div cpcap_battery_get_property power_supply_get_property.part.1 power_supply_get_property power_supply_show_property power_supply_uevent
Fixes: 874b2adbed12 ("power: supply: cpcap-battery: Add a battery driver") Signed-off-by: Tony Lindgren tony@atomide.com Acked-by: Pavel Machek pavel@ucw.cz Signed-off-by: Sebastian Reichel sebastian.reichel@collabora.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/power/supply/cpcap-battery.c | 3 +++ 1 file changed, 3 insertions(+)
diff --git a/drivers/power/supply/cpcap-battery.c b/drivers/power/supply/cpcap-battery.c index 08d5037fd0521..6887870ba32c3 100644 --- a/drivers/power/supply/cpcap-battery.c +++ b/drivers/power/supply/cpcap-battery.c @@ -221,6 +221,9 @@ static int cpcap_battery_cc_raw_div(struct cpcap_battery_ddata *ddata, int avg_current; u32 cc_lsb;
+ if (!divider) + return 0; + sample &= 0xffffff; /* 24-bits, unsigned */ offset &= 0x7ff; /* 10-bits, signed */
From: Al Viro viro@zeniv.linux.org.uk
[ Upstream commit 46c874419652bbefdfed17420fd6e88d8a31d9ec ]
symlink body shouldn't be freed without an RCU delay. Switch securityfs to ->destroy_inode() and use of call_rcu(); free both the inode and symlink body in the callback.
Signed-off-by: Al Viro viro@zeniv.linux.org.uk Signed-off-by: Sasha Levin sashal@kernel.org --- security/inode.c | 13 +++++++++---- 1 file changed, 9 insertions(+), 4 deletions(-)
diff --git a/security/inode.c b/security/inode.c index b7772a9b315ee..421dd72b58767 100644 --- a/security/inode.c +++ b/security/inode.c @@ -27,17 +27,22 @@ static struct vfsmount *mount; static int mount_count;
-static void securityfs_evict_inode(struct inode *inode) +static void securityfs_i_callback(struct rcu_head *head) { - truncate_inode_pages_final(&inode->i_data); - clear_inode(inode); + struct inode *inode = container_of(head, struct inode, i_rcu); if (S_ISLNK(inode->i_mode)) kfree(inode->i_link); + free_inode_nonrcu(inode); +} + +static void securityfs_destroy_inode(struct inode *inode) +{ + call_rcu(&inode->i_rcu, securityfs_i_callback); }
static const struct super_operations securityfs_super_operations = { .statfs = simple_statfs, - .evict_inode = securityfs_evict_inode, + .destroy_inode = securityfs_destroy_inode, };
static int fill_super(struct super_block *sb, void *data, int silent)
From: Al Viro viro@zeniv.linux.org.uk
[ Upstream commit f51dcd0f621caac5380ce90fbbeafc32ce4517ae ]
symlink body shouldn't be freed without an RCU delay. Switch apparmorfs to ->destroy_inode() and use of call_rcu(); free both the inode and symlink body in the callback.
Signed-off-by: Al Viro viro@zeniv.linux.org.uk Signed-off-by: Sasha Levin sashal@kernel.org --- security/apparmor/apparmorfs.c | 13 +++++++++---- 1 file changed, 9 insertions(+), 4 deletions(-)
diff --git a/security/apparmor/apparmorfs.c b/security/apparmor/apparmorfs.c index 3f80a684c232a..665853dd517ca 100644 --- a/security/apparmor/apparmorfs.c +++ b/security/apparmor/apparmorfs.c @@ -123,17 +123,22 @@ static int aafs_show_path(struct seq_file *seq, struct dentry *dentry) return 0; }
-static void aafs_evict_inode(struct inode *inode) +static void aafs_i_callback(struct rcu_head *head) { - truncate_inode_pages_final(&inode->i_data); - clear_inode(inode); + struct inode *inode = container_of(head, struct inode, i_rcu); if (S_ISLNK(inode->i_mode)) kfree(inode->i_link); + free_inode_nonrcu(inode); +} + +static void aafs_destroy_inode(struct inode *inode) +{ + call_rcu(&inode->i_rcu, aafs_i_callback); }
static const struct super_operations aafs_super_ops = { .statfs = simple_statfs, - .evict_inode = aafs_evict_inode, + .destroy_inode = aafs_destroy_inode, .show_path = aafs_show_path, };
From: Logan Gunthorpe logang@deltatee.com
[ Upstream commit d5bc73f34cc97c4b4b9202cc93182c2515076edf ]
In most cases, kmalloc() will not be available early in boot when pci_setup() is called. Thus, the kstrdup() call that was added to fix the __initdata bug with the disable_acs_redir parameter usually returns NULL, so the parameter is discarded and has no effect.
To fix this, store the string that's in initdata until an initcall function can allocate the memory appropriately. This way we don't need any additional static memory.
Fixes: d2fd6e81912a ("PCI: Fix __initdata issue with "pci=disable_acs_redir" parameter") Signed-off-by: Logan Gunthorpe logang@deltatee.com Signed-off-by: Bjorn Helgaas bhelgaas@google.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/pci/pci.c | 19 +++++++++++++++++-- 1 file changed, 17 insertions(+), 2 deletions(-)
diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c index e91005d0f20c7..3f77bab698ced 100644 --- a/drivers/pci/pci.c +++ b/drivers/pci/pci.c @@ -6266,8 +6266,7 @@ static int __init pci_setup(char *str) } else if (!strncmp(str, "pcie_scan_all", 13)) { pci_add_flags(PCI_SCAN_ALL_PCIE_DEVS); } else if (!strncmp(str, "disable_acs_redir=", 18)) { - disable_acs_redir_param = - kstrdup(str + 18, GFP_KERNEL); + disable_acs_redir_param = str + 18; } else { printk(KERN_ERR "PCI: Unknown option `%s'\n", str); @@ -6278,3 +6277,19 @@ static int __init pci_setup(char *str) return 0; } early_param("pci", pci_setup); + +/* + * 'disable_acs_redir_param' is initialized in pci_setup(), above, to point + * to data in the __initdata section which will be freed after the init + * sequence is complete. We can't allocate memory in pci_setup() because some + * architectures do not have any memory allocation service available during + * an early_param() call. So we allocate memory and copy the variable here + * before the init section is freed. + */ +static int __init pci_realloc_setup_params(void) +{ + disable_acs_redir_param = kstrdup(disable_acs_redir_param, GFP_KERNEL); + + return 0; +} +pure_initcall(pci_realloc_setup_params);
From: Vitaly Kuznetsov vkuznets@redhat.com
[ Upstream commit da66761c2d93a46270d69001abb5692717495a68 ]
It was reported that with some special Multi Processor Group configuration, e.g: bcdedit.exe /set groupsize 1 bcdedit.exe /set maxgroup on bcdedit.exe /set groupaware on for a 16-vCPU guest WS2012 shows BSOD on boot when PV TLB flush mechanism is in use.
Tracing kvm_hv_flush_tlb immediately reveals the issue:
kvm_hv_flush_tlb: processor_mask 0x0 address_space 0x0 flags 0x2
The only flag set in this request is HV_FLUSH_ALL_VIRTUAL_ADDRESS_SPACES, however, processor_mask is 0x0 and no HV_FLUSH_ALL_PROCESSORS is specified. We don't flush anything and apparently it's not what Windows expects.
TLFS doesn't say anything about such requests and newer Windows versions seem to be unaffected. This all feels like a WS2012 bug, which is, however, easy to workaround in KVM: let's flush everything when we see an empty flush request, over-flushing doesn't hurt.
Signed-off-by: Vitaly Kuznetsov vkuznets@redhat.com Signed-off-by: Paolo Bonzini pbonzini@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- arch/x86/kvm/hyperv.c | 11 ++++++++++- 1 file changed, 10 insertions(+), 1 deletion(-)
diff --git a/arch/x86/kvm/hyperv.c b/arch/x86/kvm/hyperv.c index 371c669696d70..610c0f1fbdd71 100644 --- a/arch/x86/kvm/hyperv.c +++ b/arch/x86/kvm/hyperv.c @@ -1371,7 +1371,16 @@ static u64 kvm_hv_flush_tlb(struct kvm_vcpu *current_vcpu, u64 ingpa,
valid_bank_mask = BIT_ULL(0); sparse_banks[0] = flush.processor_mask; - all_cpus = flush.flags & HV_FLUSH_ALL_PROCESSORS; + + /* + * Work around possible WS2012 bug: it sends hypercalls + * with processor_mask = 0x0 and HV_FLUSH_ALL_PROCESSORS clear, + * while also expecting us to flush something and crashing if + * we don't. Let's treat processor_mask == 0 same as + * HV_FLUSH_ALL_PROCESSORS. + */ + all_cpus = (flush.flags & HV_FLUSH_ALL_PROCESSORS) || + flush.processor_mask == 0; } else { if (unlikely(kvm_read_guest(kvm, ingpa, &flush_ex, sizeof(flush_ex))))
From: Bhagavathi Perumal S bperumal@codeaurora.org
[ Upstream commit f1267cf3c01b12e0f843fb6a7450a7f0b2efab8a ]
The txq of vif is added to active_txqs list for ATF TXQ scheduling in the function ieee80211_queue_skb(), but it was not properly removed before freeing the txq object. It was causing use after free of the txq objects from the active_txqs list, result was kernel panic due to invalid memory access.
Fix kernel invalid memory access by properly removing txq object from active_txqs list before free the object.
Signed-off-by: Bhagavathi Perumal S bperumal@codeaurora.org Acked-by: Toke Høiland-Jørgensen toke@redhat.com Signed-off-by: Johannes Berg johannes.berg@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- net/mac80211/iface.c | 3 +++ 1 file changed, 3 insertions(+)
diff --git a/net/mac80211/iface.c b/net/mac80211/iface.c index 4a6ff1482a9ff..02d2e6f11e936 100644 --- a/net/mac80211/iface.c +++ b/net/mac80211/iface.c @@ -1908,6 +1908,9 @@ void ieee80211_if_remove(struct ieee80211_sub_if_data *sdata) list_del_rcu(&sdata->list); mutex_unlock(&sdata->local->iflist_mtx);
+ if (sdata->vif.txq) + ieee80211_txq_purge(sdata->local, to_txq_info(sdata->vif.txq)); + synchronize_rcu();
if (sdata->dev) {
From: Kangjie Lu kjlu@umn.edu
[ Upstream commit 22e8860cf8f777fbf6a83f2fb7127f682a8e9de4 ]
regmap_update_bits could fail and deserves a check.
The patch adds the checks and if it fails, returns its error code upstream.
Signed-off-by: Kangjie Lu kjlu@umn.edu Reviewed-by: Mukesh Ojha mojha@codeaurora.org Signed-off-by: Stefan Schmidt stefan@datenfreihafen.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ieee802154/mcr20a.c | 6 ++++++ 1 file changed, 6 insertions(+)
diff --git a/drivers/net/ieee802154/mcr20a.c b/drivers/net/ieee802154/mcr20a.c index c589f5ae75bb5..8bb53ec8d9cf2 100644 --- a/drivers/net/ieee802154/mcr20a.c +++ b/drivers/net/ieee802154/mcr20a.c @@ -533,6 +533,8 @@ mcr20a_start(struct ieee802154_hw *hw) dev_dbg(printdev(lp), "no slotted operation\n"); ret = regmap_update_bits(lp->regmap_dar, DAR_PHY_CTRL1, DAR_PHY_CTRL1_SLOTTED, 0x0); + if (ret < 0) + return ret;
/* enable irq */ enable_irq(lp->spi->irq); @@ -540,11 +542,15 @@ mcr20a_start(struct ieee802154_hw *hw) /* Unmask SEQ interrupt */ ret = regmap_update_bits(lp->regmap_dar, DAR_PHY_CTRL2, DAR_PHY_CTRL2_SEQMSK, 0x0); + if (ret < 0) + return ret;
/* Start the RX sequence */ dev_dbg(printdev(lp), "start the RX sequence\n"); ret = regmap_update_bits(lp->regmap_dar, DAR_PHY_CTRL1, DAR_PHY_CTRL1_XCVSEQ_MASK, MCR20A_XCVSEQ_RX); + if (ret < 0) + return ret;
return 0; }
From: Andrew Jones drjones@redhat.com
[ Upstream commit 811328fc3222f7b55846de0cd0404339e2e1e6d7 ]
A failed KVM_ARM_VCPU_INIT should not set the vcpu target, as the vcpu target is used by kvm_vcpu_initialized() to determine if other vcpu ioctls may proceed. We need to set the target before calling kvm_reset_vcpu(), but if that call fails, we should then unset it and clear the feature bitmap while we're at it.
Signed-off-by: Andrew Jones drjones@redhat.com [maz: Simplified patch, completed commit message] Signed-off-by: Marc Zyngier marc.zyngier@arm.com Signed-off-by: Sasha Levin sashal@kernel.org --- virt/kvm/arm/arm.c | 11 ++++++++--- 1 file changed, 8 insertions(+), 3 deletions(-)
diff --git a/virt/kvm/arm/arm.c b/virt/kvm/arm/arm.c index 9c486fad3f9f8..6202b4f718cea 100644 --- a/virt/kvm/arm/arm.c +++ b/virt/kvm/arm/arm.c @@ -949,7 +949,7 @@ int kvm_vm_ioctl_irq_line(struct kvm *kvm, struct kvm_irq_level *irq_level, static int kvm_vcpu_set_target(struct kvm_vcpu *vcpu, const struct kvm_vcpu_init *init) { - unsigned int i; + unsigned int i, ret; int phys_target = kvm_target_cpu();
if (init->target != phys_target) @@ -984,9 +984,14 @@ static int kvm_vcpu_set_target(struct kvm_vcpu *vcpu, vcpu->arch.target = phys_target;
/* Now we know what it is, we can reset it. */ - return kvm_reset_vcpu(vcpu); -} + ret = kvm_reset_vcpu(vcpu); + if (ret) { + vcpu->arch.target = -1; + bitmap_zero(vcpu->arch.features, KVM_VCPU_MAX_FEATURES); + }
+ return ret; +}
static int kvm_arch_vcpu_ioctl_vcpu_init(struct kvm_vcpu *vcpu, struct kvm_vcpu_init *init)
From: Andrey Smirnov andrew.smirnov@gmail.com
[ Upstream commit 349ced9984ff540ce74ca8a0b2e9b03dc434b9dd ]
Fix a similar endless event loop as was done in commit 8dcf32175b4e ("i2c: prevent endless uevent loop with CONFIG_I2C_DEBUG_CORE"):
The culprit is the dev_dbg printk in the i2c uevent handler. If this is activated (for instance by CONFIG_I2C_DEBUG_CORE) it results in an endless loop with systemd-journald.
This happens if user-space scans the system log and reads the uevent file to get information about a newly created device, which seems fair use to me. Unfortunately reading the "uevent" file uses the same function that runs for creating the uevent for a new device, generating the next syslog entry
Both CONFIG_I2C_DEBUG_CORE and CONFIG_POWER_SUPPLY_DEBUG were reported in https://bugs.freedesktop.org/show_bug.cgi?id=76886 but only former seems to have been fixed. Drop debug prints as it was done in I2C subsystem to resolve the issue.
Signed-off-by: Andrey Smirnov andrew.smirnov@gmail.com Cc: Chris Healy cphealy@gmail.com Cc: linux-pm@vger.kernel.org Signed-off-by: Sebastian Reichel sebastian.reichel@collabora.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/power/supply/power_supply_sysfs.c | 6 ------ 1 file changed, 6 deletions(-)
diff --git a/drivers/power/supply/power_supply_sysfs.c b/drivers/power/supply/power_supply_sysfs.c index dce24f5961609..5358a80d854f9 100644 --- a/drivers/power/supply/power_supply_sysfs.c +++ b/drivers/power/supply/power_supply_sysfs.c @@ -383,15 +383,11 @@ int power_supply_uevent(struct device *dev, struct kobj_uevent_env *env) char *prop_buf; char *attrname;
- dev_dbg(dev, "uevent\n"); - if (!psy || !psy->desc) { dev_dbg(dev, "No power supply yet\n"); return ret; }
- dev_dbg(dev, "POWER_SUPPLY_NAME=%s\n", psy->desc->name); - ret = add_uevent_var(env, "POWER_SUPPLY_NAME=%s", psy->desc->name); if (ret) return ret; @@ -427,8 +423,6 @@ int power_supply_uevent(struct device *dev, struct kobj_uevent_env *env) goto out; }
- dev_dbg(dev, "prop %s=%s\n", attrname, prop_buf); - ret = add_uevent_var(env, "POWER_SUPPLY_%s=%s", attrname, prop_buf); kfree(attrname); if (ret)
From: Alban Crequy alban@kinvolk.io
[ Upstream commit 8694d8c1f82cccec9380e0d3720b84eee315dfb7 ]
"bpftool map create" has an infinite loop on "while (argc)". The error case is missing.
Symptoms: when forgetting to type the keyword 'type' in front of 'hash': $ sudo bpftool map create /sys/fs/bpf/dir/foobar hash key 8 value 8 entries 128 (infinite loop, taking all the CPU) ^C
After the patch: $ sudo bpftool map create /sys/fs/bpf/dir/foobar hash key 8 value 8 entries 128 Error: unknown arg hash
Fixes: 0b592b5a01be ("tools: bpftool: add map create command") Signed-off-by: Alban Crequy alban@kinvolk.io Reviewed-by: Quentin Monnet quentin.monnet@netronome.com Acked-by: Song Liu songliubraving@fb.com Reviewed-by: Jakub Kicinski jakub.kicinski@netronome.com Signed-off-by: Daniel Borkmann daniel@iogearbox.net Signed-off-by: Alexei Starovoitov ast@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- tools/bpf/bpftool/map.c | 3 +++ 1 file changed, 3 insertions(+)
diff --git a/tools/bpf/bpftool/map.c b/tools/bpf/bpftool/map.c index 1ef1ee2280a28..227766d9f43b1 100644 --- a/tools/bpf/bpftool/map.c +++ b/tools/bpf/bpftool/map.c @@ -1111,6 +1111,9 @@ static int do_create(int argc, char **argv) return -1; } NEXT_ARG(); + } else { + p_err("unknown arg %s", *argv); + return -1; } }
From: Peter Zijlstra peterz@infradead.org
[ Upstream commit 0edd6b64d1939e9e9168ff27947995bb7751db5d ]
Unless the very next line is schedule(), or implies it, one must not use preempt_enable_no_resched(). It can cause a preemption to go missing and thereby cause arbitrary delays, breaking the PREEMPT=y invariant.
Cc: Roman Gushchin guro@fb.com Cc: Alexei Starovoitov ast@kernel.org Cc: Daniel Borkmann daniel@iogearbox.net Signed-off-by: Peter Zijlstra (Intel) peterz@infradead.org Signed-off-by: Alexei Starovoitov ast@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- include/linux/bpf.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/include/linux/bpf.h b/include/linux/bpf.h index e734f163bd0b9..1fbd7672e4b37 100644 --- a/include/linux/bpf.h +++ b/include/linux/bpf.h @@ -455,7 +455,7 @@ int bpf_prog_array_copy(struct bpf_prog_array __rcu *old_array, } \ _out: \ rcu_read_unlock(); \ - preempt_enable_no_resched(); \ + preempt_enable(); \ _ret; \ })
From: Bjørn Mork bjorn@mork.no
[ Upstream commit 88ef66a28391ea7b624bfb7508a5b015c13b28f3 ]
Adding device entries found in vendor modified versions of this driver. Function maps for some of the devices follow:
WNC D16Q1, D16Q5, D18Q1 LTE CAT3 module (1435:0918)
MI_00 Qualcomm HS-USB Diagnostics MI_01 Android Debug interface MI_02 Qualcomm HS-USB Modem MI_03 Qualcomm Wireless HS-USB Ethernet Adapter MI_04 Qualcomm Wireless HS-USB Ethernet Adapter MI_05 Qualcomm Wireless HS-USB Ethernet Adapter MI_06 USB Mass Storage Device
T: Bus=02 Lev=01 Prnt=01 Port=00 Cnt=01 Dev#= 2 Spd=480 MxCh= 0 D: Ver= 2.00 Cls=00(>ifc ) Sub=00 Prot=00 MxPS=64 #Cfgs= 1 P: Vendor=1435 ProdID=0918 Rev= 2.32 S: Manufacturer=Android S: Product=Android S: SerialNumber=0123456789ABCDEF C:* #Ifs= 7 Cfg#= 1 Atr=80 MxPwr=500mA I:* If#= 0 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=ff Prot=ff Driver=option E: Ad=81(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=01(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms I:* If#= 1 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=42 Prot=01 Driver=(none) E: Ad=82(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=02(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms I:* If#= 2 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=00 Prot=00 Driver=option E: Ad=84(I) Atr=03(Int.) MxPS= 64 Ivl=32ms E: Ad=83(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=03(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms I:* If#= 3 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=ff Driver=qmi_wwan E: Ad=86(I) Atr=03(Int.) MxPS= 64 Ivl=32ms E: Ad=85(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=04(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms I:* If#= 4 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=ff Driver=qmi_wwan E: Ad=88(I) Atr=03(Int.) MxPS= 64 Ivl=32ms E: Ad=87(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms I:* If#= 5 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=ff Driver=qmi_wwan E: Ad=8a(I) Atr=03(Int.) MxPS= 64 Ivl=32ms E: Ad=89(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=06(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms
WNC D18 LTE CAT3 module (1435:d182)
MI_00 Qualcomm HS-USB Diagnostics MI_01 Androd Debug interface MI_02 Qualcomm HS-USB Modem MI_03 Qualcomm HS-USB NMEA MI_04 Qualcomm Wireless HS-USB Ethernet Adapter MI_05 Qualcomm Wireless HS-USB Ethernet Adapter MI_06 USB Mass Storage Device
ZM8510/ZM8620/ME3960 (19d2:0396)
MI_00 ZTE Mobile Broadband Diagnostics Port MI_01 ZTE Mobile Broadband AT Port MI_02 ZTE Mobile Broadband Modem MI_03 ZTE Mobile Broadband NDIS Port (qmi_wwan) MI_04 ZTE Mobile Broadband ADB Port
ME3620_X (19d2:1432)
MI_00 ZTE Diagnostics Device MI_01 ZTE UI AT Interface MI_02 ZTE Modem Device MI_03 ZTE Mobile Broadband Network Adapter MI_04 ZTE Composite ADB Interface
Reported-by: Lars Melin larsm17@gmail.com Signed-off-by: Bjørn Mork bjorn@mork.no Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/usb/qmi_wwan.c | 10 ++++++++++ 1 file changed, 10 insertions(+)
diff --git a/drivers/net/usb/qmi_wwan.c b/drivers/net/usb/qmi_wwan.c index 9195f3476b1d7..679e404a5224f 100644 --- a/drivers/net/usb/qmi_wwan.c +++ b/drivers/net/usb/qmi_wwan.c @@ -1122,9 +1122,16 @@ static const struct usb_device_id products[] = { {QMI_FIXED_INTF(0x0846, 0x68d3, 8)}, /* Netgear Aircard 779S */ {QMI_FIXED_INTF(0x12d1, 0x140c, 1)}, /* Huawei E173 */ {QMI_FIXED_INTF(0x12d1, 0x14ac, 1)}, /* Huawei E1820 */ + {QMI_FIXED_INTF(0x1435, 0x0918, 3)}, /* Wistron NeWeb D16Q1 */ + {QMI_FIXED_INTF(0x1435, 0x0918, 4)}, /* Wistron NeWeb D16Q1 */ + {QMI_FIXED_INTF(0x1435, 0x0918, 5)}, /* Wistron NeWeb D16Q1 */ + {QMI_FIXED_INTF(0x1435, 0x3185, 4)}, /* Wistron NeWeb M18Q5 */ + {QMI_FIXED_INTF(0x1435, 0xd111, 4)}, /* M9615A DM11-1 D51QC */ {QMI_FIXED_INTF(0x1435, 0xd181, 3)}, /* Wistron NeWeb D18Q1 */ {QMI_FIXED_INTF(0x1435, 0xd181, 4)}, /* Wistron NeWeb D18Q1 */ {QMI_FIXED_INTF(0x1435, 0xd181, 5)}, /* Wistron NeWeb D18Q1 */ + {QMI_FIXED_INTF(0x1435, 0xd182, 4)}, /* Wistron NeWeb D18 */ + {QMI_FIXED_INTF(0x1435, 0xd182, 5)}, /* Wistron NeWeb D18 */ {QMI_FIXED_INTF(0x1435, 0xd191, 4)}, /* Wistron NeWeb D19Q1 */ {QMI_QUIRK_SET_DTR(0x1508, 0x1001, 4)}, /* Fibocom NL668 series */ {QMI_FIXED_INTF(0x16d8, 0x6003, 0)}, /* CMOTech 6003 */ @@ -1180,6 +1187,7 @@ static const struct usb_device_id products[] = { {QMI_FIXED_INTF(0x19d2, 0x0265, 4)}, /* ONDA MT8205 4G LTE */ {QMI_FIXED_INTF(0x19d2, 0x0284, 4)}, /* ZTE MF880 */ {QMI_FIXED_INTF(0x19d2, 0x0326, 4)}, /* ZTE MF821D */ + {QMI_FIXED_INTF(0x19d2, 0x0396, 3)}, /* ZTE ZM8620 */ {QMI_FIXED_INTF(0x19d2, 0x0412, 4)}, /* Telewell TW-LTE 4G */ {QMI_FIXED_INTF(0x19d2, 0x1008, 4)}, /* ZTE (Vodafone) K3570-Z */ {QMI_FIXED_INTF(0x19d2, 0x1010, 4)}, /* ZTE (Vodafone) K3571-Z */ @@ -1200,7 +1208,9 @@ static const struct usb_device_id products[] = { {QMI_FIXED_INTF(0x19d2, 0x1425, 2)}, {QMI_FIXED_INTF(0x19d2, 0x1426, 2)}, /* ZTE MF91 */ {QMI_FIXED_INTF(0x19d2, 0x1428, 2)}, /* Telewell TW-LTE 4G v2 */ + {QMI_FIXED_INTF(0x19d2, 0x1432, 3)}, /* ZTE ME3620 */ {QMI_FIXED_INTF(0x19d2, 0x2002, 4)}, /* ZTE (Vodafone) K3765-Z */ + {QMI_FIXED_INTF(0x2001, 0x7e16, 3)}, /* D-Link DWM-221 */ {QMI_FIXED_INTF(0x2001, 0x7e19, 4)}, /* D-Link DWM-221 B1 */ {QMI_FIXED_INTF(0x2001, 0x7e35, 4)}, /* D-Link DWM-222 */ {QMI_FIXED_INTF(0x2020, 0x2031, 4)}, /* Olicard 600 */
From: Luca Coelho luciano.coelho@intel.com
[ Upstream commit de1887c064b9996ac03120d90d0a909a3f678f98 ]
We don't check for the validity of the lengths in the packet received from the firmware. If the MPDU length received in the rx descriptor is too short to contain the header length and the crypt length together, we may end up trying to copy a negative number of bytes (headlen - hdrlen < 0) which will underflow and cause us to try to copy a huge amount of data. This causes oopses such as this one:
BUG: unable to handle kernel paging request at ffff896be2970000 PGD 5e201067 P4D 5e201067 PUD 5e205067 PMD 16110d063 PTE 8000000162970161 Oops: 0003 [#1] PREEMPT SMP NOPTI CPU: 2 PID: 1824 Comm: irq/134-iwlwifi Not tainted 4.19.33-04308-geea41cf4930f #1 Hardware name: [...] RIP: 0010:memcpy_erms+0x6/0x10 Code: 90 90 90 90 eb 1e 0f 1f 00 48 89 f8 48 89 d1 48 c1 e9 03 83 e2 07 f3 48 a5 89 d1 f3 a4 c3 66 0f 1f 44 00 00 48 89 f8 48 89 d1 <f3> a4 c3 0f 1f 80 00 00 00 00 48 89 f8 48 83 fa 20 72 7e 40 38 fe RSP: 0018:ffffa4630196fc60 EFLAGS: 00010287 RAX: ffff896be2924618 RBX: ffff896bc8ecc600 RCX: 00000000fffb4610 RDX: 00000000fffffff8 RSI: ffff896a835e2a38 RDI: ffff896be2970000 RBP: ffffa4630196fd30 R08: ffff896bc8ecc600 R09: ffff896a83597000 R10: ffff896bd6998400 R11: 000000000200407f R12: ffff896a83597050 R13: 00000000fffffff8 R14: 0000000000000010 R15: ffff896a83597038 FS: 0000000000000000(0000) GS:ffff896be8280000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: ffff896be2970000 CR3: 000000005dc12002 CR4: 00000000003606e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: iwl_mvm_rx_mpdu_mq+0xb51/0x121b [iwlmvm] iwl_pcie_rx_handle+0x58c/0xa89 [iwlwifi] iwl_pcie_irq_rx_msix_handler+0xd9/0x12a [iwlwifi] irq_thread_fn+0x24/0x49 irq_thread+0xb0/0x122 kthread+0x138/0x140 ret_from_fork+0x1f/0x40
Fix that by checking the lengths for correctness and trigger a warning to show that we have received wrong data.
Signed-off-by: Luca Coelho luciano.coelho@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/wireless/intel/iwlwifi/mvm/rxmq.c | 28 ++++++++++++++++--- 1 file changed, 24 insertions(+), 4 deletions(-)
diff --git a/drivers/net/wireless/intel/iwlwifi/mvm/rxmq.c b/drivers/net/wireless/intel/iwlwifi/mvm/rxmq.c index 7bd8676508f54..519c7dd47f694 100644 --- a/drivers/net/wireless/intel/iwlwifi/mvm/rxmq.c +++ b/drivers/net/wireless/intel/iwlwifi/mvm/rxmq.c @@ -143,9 +143,9 @@ static inline int iwl_mvm_check_pn(struct iwl_mvm *mvm, struct sk_buff *skb, }
/* iwl_mvm_create_skb Adds the rxb to a new skb */ -static void iwl_mvm_create_skb(struct sk_buff *skb, struct ieee80211_hdr *hdr, - u16 len, u8 crypt_len, - struct iwl_rx_cmd_buffer *rxb) +static int iwl_mvm_create_skb(struct iwl_mvm *mvm, struct sk_buff *skb, + struct ieee80211_hdr *hdr, u16 len, u8 crypt_len, + struct iwl_rx_cmd_buffer *rxb) { struct iwl_rx_packet *pkt = rxb_addr(rxb); struct iwl_rx_mpdu_desc *desc = (void *)pkt->data; @@ -178,6 +178,20 @@ static void iwl_mvm_create_skb(struct sk_buff *skb, struct ieee80211_hdr *hdr, * present before copying packet data. */ hdrlen += crypt_len; + + if (WARN_ONCE(headlen < hdrlen, + "invalid packet lengths (hdrlen=%d, len=%d, crypt_len=%d)\n", + hdrlen, len, crypt_len)) { + /* + * We warn and trace because we want to be able to see + * it in trace-cmd as well. + */ + IWL_DEBUG_RX(mvm, + "invalid packet lengths (hdrlen=%d, len=%d, crypt_len=%d)\n", + hdrlen, len, crypt_len); + return -EINVAL; + } + skb_put_data(skb, hdr, hdrlen); skb_put_data(skb, (u8 *)hdr + hdrlen + pad_len, headlen - hdrlen);
@@ -190,6 +204,8 @@ static void iwl_mvm_create_skb(struct sk_buff *skb, struct ieee80211_hdr *hdr, skb_add_rx_frag(skb, 0, rxb_steal_page(rxb), offset, fraglen, rxb->truesize); } + + return 0; }
/* iwl_mvm_pass_packet_to_mac80211 - passes the packet for mac80211 */ @@ -1600,7 +1616,11 @@ void iwl_mvm_rx_mpdu_mq(struct iwl_mvm *mvm, struct napi_struct *napi, rx_status->boottime_ns = ktime_get_boot_ns(); }
- iwl_mvm_create_skb(skb, hdr, len, crypt_len, rxb); + if (iwl_mvm_create_skb(mvm, skb, hdr, len, crypt_len, rxb)) { + kfree_skb(skb); + goto out; + } + if (!iwl_mvm_reorder(mvm, napi, queue, sta, skb, desc)) iwl_mvm_pass_packet_to_mac80211(mvm, napi, skb, queue, sta); out:
From: "Tobin C. Harding" tobin@kernel.org
[ Upstream commit 9a4f26cc98d81b67ecc23b890c28e2df324e29f3 ]
Currently the error return path from kobject_init_and_add() is not followed by a call to kobject_put() - which means we are leaking the kobject.
Fix it by adding a call to kobject_put() in the error path of kobject_init_and_add().
Signed-off-by: Tobin C. Harding tobin@kernel.org Cc: Greg Kroah-Hartman gregkh@linuxfoundation.org Cc: Linus Torvalds torvalds@linux-foundation.org Cc: Peter Zijlstra peterz@infradead.org Cc: Rafael J. Wysocki rafael.j.wysocki@intel.com Cc: Thomas Gleixner tglx@linutronix.de Cc: Tobin C. Harding tobin@kernel.org Cc: Vincent Guittot vincent.guittot@linaro.org Cc: Viresh Kumar viresh.kumar@linaro.org Link: http://lkml.kernel.org/r/20190430001144.24890-1-tobin@kernel.org Signed-off-by: Ingo Molnar mingo@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- kernel/sched/cpufreq_schedutil.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/kernel/sched/cpufreq_schedutil.c b/kernel/sched/cpufreq_schedutil.c index 1ccf77f6d346d..d4ab9245e016c 100644 --- a/kernel/sched/cpufreq_schedutil.c +++ b/kernel/sched/cpufreq_schedutil.c @@ -771,6 +771,7 @@ static int sugov_init(struct cpufreq_policy *policy) return 0;
fail: + kobject_put(&tunables->attr_set.kobj); policy->governor_data = NULL; sugov_tunables_free(tunables);
From: Gary Hook Gary.Hook@amd.com
[ Upstream commit b51ce3744f115850166f3d6c292b9c8cb849ad4f ]
Enablement of AMD's Secure Memory Encryption feature is determined very early after start_kernel() is entered. Part of this procedure involves scanning the command line for the parameter 'mem_encrypt'.
To determine intended state, the function sme_enable() uses library functions cmdline_find_option() and strncmp(). Their use occurs early enough such that it cannot be assumed that any instrumentation subsystem is initialized.
For example, making calls to a KASAN-instrumented function before KASAN is set up will result in the use of uninitialized memory and a boot failure.
When AMD's SME support is enabled, conditionally disable instrumentation of these dependent functions in lib/string.c and arch/x86/lib/cmdline.c.
[ bp: Get rid of intermediary nostackp var and cleanup whitespace. ]
Fixes: aca20d546214 ("x86/mm: Add support to make use of Secure Memory Encryption") Reported-by: Li RongQing lirongqing@baidu.com Signed-off-by: Gary R Hook gary.hook@amd.com Signed-off-by: Borislav Petkov bp@suse.de Cc: Alexander Shishkin alexander.shishkin@linux.intel.com Cc: Andrew Morton akpm@linux-foundation.org Cc: Andy Shevchenko andriy.shevchenko@linux.intel.com Cc: Boris Brezillon bbrezillon@kernel.org Cc: Coly Li colyli@suse.de Cc: "dave.hansen@linux.intel.com" dave.hansen@linux.intel.com Cc: "H. Peter Anvin" hpa@zytor.com Cc: Ingo Molnar mingo@redhat.com Cc: Kees Cook keescook@chromium.org Cc: Kent Overstreet kent.overstreet@gmail.com Cc: "luto@kernel.org" luto@kernel.org Cc: Masahiro Yamada yamada.masahiro@socionext.com Cc: Matthew Wilcox willy@infradead.org Cc: "mingo@redhat.com" mingo@redhat.com Cc: "peterz@infradead.org" peterz@infradead.org Cc: Sebastian Andrzej Siewior bigeasy@linutronix.de Cc: Thomas Gleixner tglx@linutronix.de Cc: x86-ml x86@kernel.org Link: https://lkml.kernel.org/r/155657657552.7116.18363762932464011367.stgit@sosrh... Signed-off-by: Sasha Levin sashal@kernel.org --- arch/x86/lib/Makefile | 12 ++++++++++++ lib/Makefile | 11 +++++++++++ 2 files changed, 23 insertions(+)
diff --git a/arch/x86/lib/Makefile b/arch/x86/lib/Makefile index 140e61843a079..3cb3af51ec897 100644 --- a/arch/x86/lib/Makefile +++ b/arch/x86/lib/Makefile @@ -6,6 +6,18 @@ # Produces uninteresting flaky coverage. KCOV_INSTRUMENT_delay.o := n
+# Early boot use of cmdline; don't instrument it +ifdef CONFIG_AMD_MEM_ENCRYPT +KCOV_INSTRUMENT_cmdline.o := n +KASAN_SANITIZE_cmdline.o := n + +ifdef CONFIG_FUNCTION_TRACER +CFLAGS_REMOVE_cmdline.o = -pg +endif + +CFLAGS_cmdline.o := $(call cc-option, -fno-stack-protector) +endif + inat_tables_script = $(srctree)/arch/x86/tools/gen-insn-attr-x86.awk inat_tables_maps = $(srctree)/arch/x86/lib/x86-opcode-map.txt quiet_cmd_inat_tables = GEN $@ diff --git a/lib/Makefile b/lib/Makefile index e1b59da714186..d1f312096bec5 100644 --- a/lib/Makefile +++ b/lib/Makefile @@ -17,6 +17,17 @@ KCOV_INSTRUMENT_list_debug.o := n KCOV_INSTRUMENT_debugobjects.o := n KCOV_INSTRUMENT_dynamic_debug.o := n
+# Early boot use of cmdline, don't instrument it +ifdef CONFIG_AMD_MEM_ENCRYPT +KASAN_SANITIZE_string.o := n + +ifdef CONFIG_FUNCTION_TRACER +CFLAGS_REMOVE_string.o = -pg +endif + +CFLAGS_string.o := $(call cc-option, -fno-stack-protector) +endif + lib-y := ctype.o string.o vsprintf.o cmdline.o \ rbtree.o radix-tree.o timerqueue.o xarray.o \ idr.o int_sqrt.o extable.o \
From: Paolo Bonzini pbonzini@redhat.com
[ Upstream commit 76d58e0f07ec203bbdfcaabd9a9fc10a5a3ed5ea ]
If a memory slot's size is not a multiple of 64 pages (256K), then the KVM_CLEAR_DIRTY_LOG API is unusable: clearing the final 64 pages either requires the requested page range to go beyond memslot->npages, or requires log->num_pages to be unaligned, and kvm_clear_dirty_log_protect requires log->num_pages to be both in range and aligned.
To allow this case, allow log->num_pages not to be a multiple of 64 if it ends exactly on the last page of the slot.
Reported-by: Peter Xu peterx@redhat.com Fixes: 98938aa8edd6 ("KVM: validate userspace input in kvm_clear_dirty_log_protect()", 2019-01-02) Signed-off-by: Paolo Bonzini pbonzini@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- Documentation/virtual/kvm/api.txt | 5 +++-- tools/testing/selftests/kvm/dirty_log_test.c | 9 ++++++--- virt/kvm/kvm_main.c | 7 ++++--- 3 files changed, 13 insertions(+), 8 deletions(-)
diff --git a/Documentation/virtual/kvm/api.txt b/Documentation/virtual/kvm/api.txt index ba8927c0d45c4..a1b8e6d92298c 100644 --- a/Documentation/virtual/kvm/api.txt +++ b/Documentation/virtual/kvm/api.txt @@ -3790,8 +3790,9 @@ The ioctl clears the dirty status of pages in a memory slot, according to the bitmap that is passed in struct kvm_clear_dirty_log's dirty_bitmap field. Bit 0 of the bitmap corresponds to page "first_page" in the memory slot, and num_pages is the size in bits of the input bitmap. -Both first_page and num_pages must be a multiple of 64. For each bit -that is set in the input bitmap, the corresponding page is marked "clean" +first_page must be a multiple of 64; num_pages must also be a multiple of +64 unless first_page + num_pages is the size of the memory slot. For each +bit that is set in the input bitmap, the corresponding page is marked "clean" in KVM's dirty bitmap, and dirty tracking is re-enabled for that page (for example via write-protection, or by clearing the dirty bit in a page table entry). diff --git a/tools/testing/selftests/kvm/dirty_log_test.c b/tools/testing/selftests/kvm/dirty_log_test.c index 4715cfba20dce..93f99c6b7d79e 100644 --- a/tools/testing/selftests/kvm/dirty_log_test.c +++ b/tools/testing/selftests/kvm/dirty_log_test.c @@ -288,8 +288,11 @@ static void run_test(enum vm_guest_mode mode, unsigned long iterations, #endif max_gfn = (1ul << (guest_pa_bits - guest_page_shift)) - 1; guest_page_size = (1ul << guest_page_shift); - /* 1G of guest page sized pages */ - guest_num_pages = (1ul << (30 - guest_page_shift)); + /* + * A little more than 1G of guest page sized pages. Cover the + * case where the size is not aligned to 64 pages. + */ + guest_num_pages = (1ul << (30 - guest_page_shift)) + 3; host_page_size = getpagesize(); host_num_pages = (guest_num_pages * guest_page_size) / host_page_size + !!((guest_num_pages * guest_page_size) % host_page_size); @@ -359,7 +362,7 @@ static void run_test(enum vm_guest_mode mode, unsigned long iterations, kvm_vm_get_dirty_log(vm, TEST_MEM_SLOT_INDEX, bmap); #ifdef USE_CLEAR_DIRTY_LOG kvm_vm_clear_dirty_log(vm, TEST_MEM_SLOT_INDEX, bmap, 0, - DIV_ROUND_UP(host_num_pages, 64) * 64); + host_num_pages); #endif vm_dirty_log_verify(bmap); iteration++; diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c index b4f2d892a1d36..c9b102e65a3a7 100644 --- a/virt/kvm/kvm_main.c +++ b/virt/kvm/kvm_main.c @@ -1241,7 +1241,7 @@ int kvm_clear_dirty_log_protect(struct kvm *kvm, if (as_id >= KVM_ADDRESS_SPACE_NUM || id >= KVM_USER_MEM_SLOTS) return -EINVAL;
- if ((log->first_page & 63) || (log->num_pages & 63)) + if (log->first_page & 63) return -EINVAL;
slots = __kvm_memslots(kvm, as_id); @@ -1254,8 +1254,9 @@ int kvm_clear_dirty_log_protect(struct kvm *kvm, n = kvm_dirty_bitmap_bytes(memslot);
if (log->first_page > memslot->npages || - log->num_pages > memslot->npages - log->first_page) - return -EINVAL; + log->num_pages > memslot->npages - log->first_page || + (log->num_pages < memslot->npages - log->first_page && (log->num_pages & 63))) + return -EINVAL;
*flush = false; dirty_bitmap_buffer = kvm_second_dirty_bitmap(memslot);
From: Vitaly Kuznetsov vkuznets@redhat.com
[ Upstream commit eba3afde1cea7dbd7881683232f2a85e2ed86bfe ]
Enlightened VMCS is only supported on Intel CPUs but the test shouldn't fail completely.
Signed-off-by: Vitaly Kuznetsov vkuznets@redhat.com Signed-off-by: Paolo Bonzini pbonzini@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- tools/testing/selftests/kvm/x86_64/hyperv_cpuid.c | 9 ++++++++- 1 file changed, 8 insertions(+), 1 deletion(-)
diff --git a/tools/testing/selftests/kvm/x86_64/hyperv_cpuid.c b/tools/testing/selftests/kvm/x86_64/hyperv_cpuid.c index 264425f75806b..9a21e912097c4 100644 --- a/tools/testing/selftests/kvm/x86_64/hyperv_cpuid.c +++ b/tools/testing/selftests/kvm/x86_64/hyperv_cpuid.c @@ -141,7 +141,13 @@ int main(int argc, char *argv[])
free(hv_cpuid_entries);
- vcpu_ioctl(vm, VCPU_ID, KVM_ENABLE_CAP, &enable_evmcs_cap); + rv = _vcpu_ioctl(vm, VCPU_ID, KVM_ENABLE_CAP, &enable_evmcs_cap); + + if (rv) { + fprintf(stderr, + "Enlightened VMCS is unsupported, skip related test\n"); + goto vm_free; + }
hv_cpuid_entries = kvm_get_supported_hv_cpuid(vm); if (!hv_cpuid_entries) @@ -151,6 +157,7 @@ int main(int argc, char *argv[])
free(hv_cpuid_entries);
+vm_free: kvm_vm_free(vm);
return 0;
From: Al Viro viro@zeniv.linux.org.uk
[ Upstream commit 4e9036042fedaffcd868d7f7aa948756c48c637d ]
To choose whether to pick the GID from the old (16bit) or new (32bit) field, we should check if the old gid field is set to 0xffff. Mainline checks the old *UID* field instead - cut'n'paste from the corresponding code in ufs_get_inode_uid().
Fixes: 252e211e90ce Signed-off-by: Al Viro viro@zeniv.linux.org.uk Signed-off-by: Sasha Levin sashal@kernel.org --- fs/ufs/util.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/fs/ufs/util.h b/fs/ufs/util.h index 1fd3011ea6236..7fd4802222b8c 100644 --- a/fs/ufs/util.h +++ b/fs/ufs/util.h @@ -229,7 +229,7 @@ ufs_get_inode_gid(struct super_block *sb, struct ufs_inode *inode) case UFS_UID_44BSD: return fs32_to_cpu(sb, inode->ui_u3.ui_44.ui_gid); case UFS_UID_EFT: - if (inode->ui_u1.oldids.ui_suid == 0xFFFF) + if (inode->ui_u1.oldids.ui_sgid == 0xFFFF) return fs32_to_cpu(sb, inode->ui_u3.ui_sun.ui_gid); /* Fall through */ default:
From: Wolfram Sang wsa+renesas@sang-engineering.com
[ Upstream commit 6bac9bc273cdab6157ad7a2ead09400aabfc445b ]
There are two problems with dev_err() here. One: It is not ratelimited. Two: We don't see which driver tried to transfer something with a suspended adapter. Switch to dev_WARN_ONCE to fix both issues. Drawback is that we don't see if multiple drivers are trying to transfer while suspended. They need to be discovered one after the other now. This is better than a high CPU load because a really broken driver might try to resend endlessly.
Link: https://bugs.archlinux.org/task/62391 Fixes: 275154155538 ("i2c: designware: Do not allow i2c_dw_xfer() calls while suspended") Signed-off-by: Wolfram Sang wsa+renesas@sang-engineering.com Reported-by: skidnik skidnik@gmail.com Acked-by: Jarkko Nikula jarkko.nikula@linux.intel.com Reviewed-by: Andy Shevchenko andriy.shevchenko@linux.intel.com Tested-by: skidnik skidnik@gmail.com Signed-off-by: Wolfram Sang wsa@the-dreams.de Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/i2c/busses/i2c-designware-master.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-)
diff --git a/drivers/i2c/busses/i2c-designware-master.c b/drivers/i2c/busses/i2c-designware-master.c index bb8e3f1499796..d464799e40a30 100644 --- a/drivers/i2c/busses/i2c-designware-master.c +++ b/drivers/i2c/busses/i2c-designware-master.c @@ -426,8 +426,7 @@ i2c_dw_xfer(struct i2c_adapter *adap, struct i2c_msg msgs[], int num)
pm_runtime_get_sync(dev->dev);
- if (dev->suspended) { - dev_err(dev->dev, "Error %s call while suspended\n", __func__); + if (dev_WARN_ONCE(dev->dev, dev->suspended, "Transfer while suspended\n")) { ret = -ESHUTDOWN; goto done_nolock; }
From: Arnaldo Carvalho de Melo acme@redhat.com
[ Upstream commit bf561d3c13423fc54daa19b5d49dc15fafdb7acc ]
While cross building perf to the ARC architecture on a fedora 30 host, we were failing with:
CC /tmp/build/perf/bench/numa.o bench/numa.c: In function ‘worker_thread’: bench/numa.c:1261:12: error: ‘RUSAGE_THREAD’ undeclared (first use in this function); did you mean ‘SIGEV_THREAD’? getrusage(RUSAGE_THREAD, &rusage); ^~~~~~~~~~~~~ SIGEV_THREAD bench/numa.c:1261:12: note: each undeclared identifier is reported only once for each function it appears in
[perfbuilder@60d5802468f6 perf]$ /arc_gnu_2019.03-rc1_prebuilt_uclibc_le_archs_linux_install/bin/arc-linux-gcc --version | head -1 arc-linux-gcc (ARCv2 ISA Linux uClibc toolchain 2019.03-rc1) 8.3.1 20190225 [perfbuilder@60d5802468f6 perf]$
Trying to reproduce a report by Vineet, I noticed that, with just cross-built zlib and numactl libraries, I ended up with the above failure.
So, since RUSAGE_THREAD is available as a define, check for that and numactl libraries, I ended up with the above failure.
So, since RUSAGE_THREAD is available as a define in the system headers, check if it is defined in the 'perf bench numa' sources and define it if not.
Now it builds and I have to figure out if the problem reported by Vineet only takes place if we have libelf or some other library available.
Cc: Arnd Bergmann arnd@arndb.de Cc: Jiri Olsa jolsa@kernel.org Cc: linux-snps-arc@lists.infradead.org Cc: Namhyung Kim namhyung@kernel.org Cc: Vineet Gupta Vineet.Gupta1@synopsys.com Link: https://lkml.kernel.org/n/tip-2wb4r1gir9xrevbpq7qp0amk@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo acme@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- tools/perf/bench/numa.c | 4 ++++ 1 file changed, 4 insertions(+)
diff --git a/tools/perf/bench/numa.c b/tools/perf/bench/numa.c index 44195514b19e6..fa56fde6e8d80 100644 --- a/tools/perf/bench/numa.c +++ b/tools/perf/bench/numa.c @@ -38,6 +38,10 @@ #include <numa.h> #include <numaif.h>
+#ifndef RUSAGE_THREAD +# define RUSAGE_THREAD 1 +#endif + /* * Regular printout to the terminal, supressed if -q is specified: */
From: Leo Yan leo.yan@linaro.org
[ Upstream commit 35bb59c10a6d0578806dd500477dae9cb4be344e ]
Robert Walker reported a segmentation fault is observed when process CoreSight trace data; this issue can be easily reproduced by the command 'perf report --itrace=i1000i' for decoding tracing data.
If neither the 'b' flag (synthesize branches events) nor 'l' flag (synthesize last branch entries) are specified to option '--itrace', cs_etm_queue::prev_packet will not been initialised. After merging the code to support exception packets and sample flags, there introduced a number of uses of cs_etm_queue::prev_packet without checking whether it is valid, for these cases any accessing to uninitialised prev_packet will cause crash.
As cs_etm_queue::prev_packet is used more widely now and it's already hard to follow which functions have been called in a context where the validity of cs_etm_queue::prev_packet has been checked, this patch always allocates memory for cs_etm_queue::prev_packet.
Reported-by: Robert Walker robert.walker@arm.com Suggested-by: Robert Walker robert.walker@arm.com Signed-off-by: Leo Yan leo.yan@linaro.org Tested-by: Robert Walker robert.walker@arm.com Cc: Alexander Shishkin alexander.shishkin@linux.intel.com Cc: Jiri Olsa jolsa@redhat.com Cc: Mathieu Poirier mathieu.poirier@linaro.org Cc: Mike Leach mike.leach@linaro.org Cc: Namhyung Kim namhyung@kernel.org Cc: Suzuki K Poulouse suzuki.poulose@arm.com Cc: linux-arm-kernel@lists.infradead.org Fixes: 7100b12cf474 ("perf cs-etm: Generate branch sample for exception packet") Fixes: 24fff5eb2b93 ("perf cs-etm: Avoid stale branch samples when flush packet") Link: http://lkml.kernel.org/r/20190428083228.20246-1-leo.yan@linaro.org Signed-off-by: Arnaldo Carvalho de Melo acme@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- tools/perf/util/cs-etm.c | 8 +++----- 1 file changed, 3 insertions(+), 5 deletions(-)
diff --git a/tools/perf/util/cs-etm.c b/tools/perf/util/cs-etm.c index 27a374ddf6615..947f1bb2fbdfb 100644 --- a/tools/perf/util/cs-etm.c +++ b/tools/perf/util/cs-etm.c @@ -345,11 +345,9 @@ static struct cs_etm_queue *cs_etm__alloc_queue(struct cs_etm_auxtrace *etm, if (!etmq->packet) goto out_free;
- if (etm->synth_opts.last_branch || etm->sample_branches) { - etmq->prev_packet = zalloc(szp); - if (!etmq->prev_packet) - goto out_free; - } + etmq->prev_packet = zalloc(szp); + if (!etmq->prev_packet) + goto out_free;
if (etm->synth_opts.last_branch) { size_t sz = sizeof(struct branch_stack);
From: Jiri Olsa jolsa@kernel.org
[ Upstream commit 6f55967ad9d9752813e36de6d5fdbd19741adfc7 ]
New race in x86_pmu_stop() was introduced by replacing the atomic __test_and_clear_bit() of cpuc->active_mask by separate test_bit() and __clear_bit() calls in the following commit:
3966c3feca3f ("x86/perf/amd: Remove need to check "running" bit in NMI handler")
The race causes panic for PEBS events with enabled callchains:
BUG: unable to handle kernel NULL pointer dereference at 0000000000000000 ... RIP: 0010:perf_prepare_sample+0x8c/0x530 Call Trace: <NMI> perf_event_output_forward+0x2a/0x80 __perf_event_overflow+0x51/0xe0 handle_pmi_common+0x19e/0x240 intel_pmu_handle_irq+0xad/0x170 perf_event_nmi_handler+0x2e/0x50 nmi_handle+0x69/0x110 default_do_nmi+0x3e/0x100 do_nmi+0x11a/0x180 end_repeat_nmi+0x16/0x1a RIP: 0010:native_write_msr+0x6/0x20 ... </NMI> intel_pmu_disable_event+0x98/0xf0 x86_pmu_stop+0x6e/0xb0 x86_pmu_del+0x46/0x140 event_sched_out.isra.97+0x7e/0x160 ...
The event is configured to make samples from PEBS drain code, but when it's disabled, we'll go through NMI path instead, where data->callchain will not get allocated and we'll crash:
x86_pmu_stop test_bit(hwc->idx, cpuc->active_mask) intel_pmu_disable_event(event) { ... intel_pmu_pebs_disable(event); ...
EVENT OVERFLOW -> <NMI> intel_pmu_handle_irq handle_pmi_common TEST PASSES -> test_bit(bit, cpuc->active_mask)) perf_event_overflow perf_prepare_sample { ... if (!(sample_type & __PERF_SAMPLE_CALLCHAIN_EARLY)) data->callchain = perf_callchain(event, regs);
CRASH -> size += data->callchain->nr; } </NMI> ... x86_pmu_disable_event(event) }
__clear_bit(hwc->idx, cpuc->active_mask);
Fixing this by disabling the event itself before setting off the PEBS bit.
Signed-off-by: Jiri Olsa jolsa@kernel.org Cc: Alexander Shishkin alexander.shishkin@linux.intel.com Cc: Arnaldo Carvalho de Melo acme@redhat.com Cc: David Arcari darcari@redhat.com Cc: Jiri Olsa jolsa@redhat.com Cc: Lendacky Thomas Thomas.Lendacky@amd.com Cc: Linus Torvalds torvalds@linux-foundation.org Cc: Peter Zijlstra peterz@infradead.org Cc: Stephane Eranian eranian@google.com Cc: Thomas Gleixner tglx@linutronix.de Cc: Vince Weaver vincent.weaver@maine.edu Fixes: 3966c3feca3f ("x86/perf/amd: Remove need to check "running" bit in NMI handler") Link: http://lkml.kernel.org/r/20190504151556.31031-1-jolsa@kernel.org Signed-off-by: Ingo Molnar mingo@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- arch/x86/events/intel/core.c | 10 +++++++--- 1 file changed, 7 insertions(+), 3 deletions(-)
diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c index 71fb8b7b29545..c87b06ad9f860 100644 --- a/arch/x86/events/intel/core.c +++ b/arch/x86/events/intel/core.c @@ -2090,15 +2090,19 @@ static void intel_pmu_disable_event(struct perf_event *event) cpuc->intel_ctrl_host_mask &= ~(1ull << hwc->idx); cpuc->intel_cp_status &= ~(1ull << hwc->idx);
- if (unlikely(event->attr.precise_ip)) - intel_pmu_pebs_disable(event); - if (unlikely(hwc->config_base == MSR_ARCH_PERFMON_FIXED_CTR_CTRL)) { intel_pmu_disable_fixed(hwc); return; }
x86_pmu_disable_event(event); + + /* + * Needs to be called after x86_pmu_disable_event, + * so we don't trigger the event without PEBS bit set. + */ + if (unlikely(event->attr.precise_ip)) + intel_pmu_pebs_disable(event); }
static void intel_pmu_del_event(struct perf_event *event)
linux-stable-mirror@lists.linaro.org