Hi folks,
I noticed a regression in USB power management introduced sometime
after 4.19.4. I have a 2015 MacBook Pro. When I try to do a suspend or
a suspend+hibernate, suspending usb2 fails with the error messages
below and the whole suspend fails. This works fine in 4.19.4:
Dec 22 13:50:36 eric-macbookpro kernel: Freezing remaining freezable
tasks ... (elapsed 0.001 seconds) done.
Dec 22 13:50:36 eric-macbookpro kernel: Suspending console(s) (use
no_console_suspend to debug)
Dec 22 13:50:36 eric-macbookpro kernel: dpm_run_callback():
usb_dev_freeze+0x0/0x10 returns -16
Dec 22 13:50:36 eric-macbookpro kernel: PM: Device usb2 failed to
freeze async: error -16
Dec 22 13:50:38 eric-macbookpro systemd[1]:
systemd-hybrid-sleep.service: Main process exited, code=exited,
status=1/FAILURE
Dec 22 13:50:38 eric-macbookpro systemd[1]:
systemd-hybrid-sleep.service: Failed with result 'exit-code'.
Dec 22 13:50:38 eric-macbookpro systemd[1]: Failed to start Hybrid
Suspend+Hibernate.
Dec 22 13:50:38 eric-macbookpro systemd[1]: Dependency failed for
Hybrid Suspend+Hibernate.
Dec 22 13:50:38 eric-macbookpro systemd[1]: hybrid-sleep.target: Job
hybrid-sleep.target/start failed with result 'dependency'.
Dec 22 13:50:38 eric-macbookpro systemd-logind[1573]: Operation
'sleep' finished.
Dec 22 13:50:38 eric-macbookpro systemd[1]: Stopped target Sleep.
The behavior exists in 4.19.8 and 4.19.11, the kernel versions I have
upgraded to with Arch Linux, so the regression was introduced sometime
between 4.19.4 and 4.19.8. Hibernate still works, but when I resume
from hibernate, a ksoftirqd thread and a kworker thread together take
up 100% of one core. If I turn off automatic power control for usb1
and usb2, the threads stop spinning, e.g.:
echo 'on' > '/sys/bus/usb/devices/usb1/power/control'
Any suggestions as to where this regression was introduced and what
can be done to fix it?
Thanks,
Eric
From: Stanley Chu <stanley.chu(a)mediatek.com>
Commit 356fd2663cff ("scsi: Set request queue runtime PM status back
to active on resume") fixed up the inconsistent runtime PM status
between the request queue and the device. However, the request queue's
RPM status should be changed only on a successful resume; otherwise
the two may still be inconsistent, for example:
Request queue: RPM_ACTIVE
Device: RPM_SUSPENDED
This ends up in a soft lockup, because requests can be submitted to
the underlying device while that device and the resources it requires
have not been resumed.
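To make the intended ordering concrete, here is an illustrative sketch of
the resume-side logic after this patch (the helper name is made up for the
example, and the header that declares blk_set_runtime_active() varies by
kernel version); the authoritative change is the hunk below:

#include <linux/pm_runtime.h>
#include <linux/blkdev.h>	/* blk_set_runtime_active(); <linux/blk-pm.h> on newer trees */
#include <scsi/scsi_device.h>

/* Illustrative helper mirroring the tail of scsi_dev_type_resume() after
 * this patch: the queue is promoted to RPM_ACTIVE only if the device
 * status change itself succeeded, so queue and device cannot disagree. */
static int scsi_resume_mark_active(struct device *dev)
{
	int err;

	pm_runtime_disable(dev);
	err = pm_runtime_set_active(dev);	/* may fail, e.g. -EBUSY */
	pm_runtime_enable(dev);

	if (!err && scsi_is_sdev_device(dev)) {
		struct scsi_device *sdev = to_scsi_device(dev);

		/* The queue takes part in runtime PM only if a device
		 * was attached via blk_pm_runtime_init(). */
		if (sdev->request_queue->dev)
			blk_set_runtime_active(sdev->request_queue);
	}

	return err;
}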
Fixes: 356fd2663cff ("scsi: Set request queue runtime PM status
back to active on resume")
Cc: stable(a)vger.kernel.org
Signed-off-by: Stanley Chu <stanley.chu(a)mediatek.com>
---
drivers/scsi/scsi_pm.c | 26 +++++++++++++++-----------
1 file changed, 15 insertions(+), 11 deletions(-)
diff --git a/drivers/scsi/scsi_pm.c b/drivers/scsi/scsi_pm.c
index a2b4179bfdf7..7639df91b110 100644
--- a/drivers/scsi/scsi_pm.c
+++ b/drivers/scsi/scsi_pm.c
@@ -80,8 +80,22 @@ static int scsi_dev_type_resume(struct device *dev,
if (err == 0) {
pm_runtime_disable(dev);
- pm_runtime_set_active(dev);
+ err = pm_runtime_set_active(dev);
pm_runtime_enable(dev);
+
+ /*
+ * Forcibly set runtime PM status of request queue to "active"
+ * to make sure we can again get requests from the queue
+ * (see also blk_pm_peek_request()).
+ *
+ * The resume hook will correct runtime PM status of the disk.
+ */
+ if (!err && scsi_is_sdev_device(dev)) {
+ struct scsi_device *sdev = to_scsi_device(dev);
+
+ if (sdev->request_queue->dev)
+ blk_set_runtime_active(sdev->request_queue);
+ }
}
return err;
@@ -140,16 +154,6 @@ static int scsi_bus_resume_common(struct device *dev,
else
fn = NULL;
- /*
- * Forcibly set runtime PM status of request queue to "active" to
- * make sure we can again get requests from the queue (see also
- * blk_pm_peek_request()).
- *
- * The resume hook will correct runtime PM status of the disk.
- */
- if (scsi_is_sdev_device(dev) && pm_runtime_suspended(dev))
- blk_set_runtime_active(to_scsi_device(dev)->request_queue);
-
if (fn) {
async_schedule_domain(fn, dev, &scsi_sd_pm_domain);
--
2.18.0
Commit 2b2ea09e74a5 ("staging:r8188eu: Use lib80211 to decrypt WEP-frames")
causes scheduling-while-atomic bugs, followed by a hard freeze, whenever
the driver tries to connect to a WEP-encrypted network. Experimentation
showed that the freezes were eliminated when the lib80211 module was
preloaded, which can be forced by calling lib80211_get_crypto_ops()
directly rather than indirectly through try_then_request_module().
With this change, no BUG messages are logged.
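For context, the unsafe path goes through the try_then_request_module()
macro, whose fallback loads the module synchronously and may sleep. The
sketch below paraphrases the macro from include/linux/kmod.h (check your
tree for the exact definition) and shows the direct-call pattern used by
this patch:

/*
 * Roughly, in include/linux/kmod.h:
 *
 *   #define try_then_request_module(x, mod...) \
 *           ((x) ?: (__request_module(true, mod), (x)))
 *
 * If the first lookup returns NULL, __request_module(true, ...) runs a
 * synchronous modprobe, which may sleep.  rtw_wep_encrypt() and
 * rtw_wep_decrypt() can be reached from atomic context, hence the
 * scheduling-while-atomic BUGs.  Looking the ops up directly avoids the
 * sleeping fallback and simply bails out if lib80211 is not loaded:
 */
struct lib80211_crypto_ops *crypto_ops = lib80211_get_crypto_ops("WEP");

if (!crypto_ops)
	return;		/* lib80211 not loaded; do not try to load it here */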
Fixes: 2b2ea09e74a5 ("staging:r8188eu: Use lib80211 to decrypt WEP-frames")
Cc: Stable <stable(a)vger.kernel.org> # v4.17+
Cc: Michael Straube <straube.linux(a)gmail.com>
Cc: Ivan Safonov <insafonov(a)gmail.com>
Signed-off-by: Larry Finger <Larry.Finger(a)lwfinger.net>
---
drivers/staging/rtl8188eu/core/rtw_security.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/staging/rtl8188eu/core/rtw_security.c b/drivers/staging/rtl8188eu/core/rtw_security.c
index 052656a22821..bab96c870042 100644
--- a/drivers/staging/rtl8188eu/core/rtw_security.c
+++ b/drivers/staging/rtl8188eu/core/rtw_security.c
@@ -154,7 +154,7 @@ void rtw_wep_encrypt(struct adapter *padapter, u8 *pxmitframe)
pframe = ((struct xmit_frame *)pxmitframe)->buf_addr + hw_hdr_offset;
- crypto_ops = try_then_request_module(lib80211_get_crypto_ops("WEP"), "lib80211_crypt_wep");
+ crypto_ops = lib80211_get_crypto_ops("WEP");
if (!crypto_ops)
return;
@@ -210,7 +210,7 @@ int rtw_wep_decrypt(struct adapter *padapter, u8 *precvframe)
void *crypto_private = NULL;
int status = _SUCCESS;
const int keyindex = prxattrib->key_index;
- struct lib80211_crypto_ops *crypto_ops = try_then_request_module(lib80211_get_crypto_ops("WEP"), "lib80211_crypt_wep");
+ struct lib80211_crypto_ops *crypto_ops = lib80211_get_crypto_ops("WEP");
char iv[4], icv[4];
if (!crypto_ops) {
--
2.16.4
An iptables rule like the following on a multicore system will result
in more connections being accepted than the limit set in the rule.
iptables -A INPUT -p tcp -m tcp --syn --dport 7777 -m connlimit \
--connlimit-above 2000 --connlimit-mask 0 -j DROP
In the check_hlist() function, connections that are found in the list
of saved connections but not in the netfilter conntrack table are
deleted, on the assumption that those connections no longer exist. But
on multicore systems there is a small time window in which a connection
has been added to the rb-tree maintained by xt_connlimit but has not
yet made it into the netfilter conntrack table. This causes the
concurrent-connection count to be wrong and lets connections exceed the
limit set in the iptables rule.
The fix has been partially backported from the above mentioned upstream
commit: introduce a timestamp and record the owning CPU.
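The "recently added on another CPU" test below keys off a 32-bit jiffies
delta. A small standalone userspace sketch (hypothetical values, not
kernel code) shows why the unsigned subtraction stays correct even when
the counter wraps:

#include <stdint.h>
#include <stdio.h>

int main(void)
{
	/* Low 32 bits of jiffies recorded when the entry was added,
	 * just before the counter wraps. */
	uint32_t added = 0xffffffffu;
	/* Current low 32 bits of jiffies, two ticks later. */
	uint32_t now = 0x00000001u;
	uint32_t age = now - added;	/* modular arithmetic: 2 */

	/* Mirrors the "conn->cpu != cpu && age <= 2" test in the patch. */
	printf("age=%u -> %s\n", age,
	       age <= 2 ? "keep (recently added elsewhere)" : "drop");
	return 0;
}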
Signed-off-by: Alakesh Haloi <alakeshh(a)amazon.com>
Cc: Pablo Neira Ayuso <pablo(a)netfilter.org>
Cc: Jozsef Kadlecsik <kadlec(a)blackhole.kfki.hu>
Cc: Florian Westphal <fw(a)strlen.de>
Cc: "David S. Miller" <davem(a)davemloft.net>
Cc: stable(a)vger.kernel.org # v4.15 and before
Cc: netdev(a)vger.kernel.org
Cc: Dmitry Andrianov <dmitry.andrianov(a)alertme.com>
Cc: Justin Pettit <jpettit(a)vmware.com>
Cc: Yi-Hung Wei <yihung.wei(a)gmail.com>
---
net/netfilter/xt_connlimit.c | 28 ++++++++++++++++++++++++++--
1 file changed, 26 insertions(+), 2 deletions(-)
diff --git a/net/netfilter/xt_connlimit.c b/net/netfilter/xt_connlimit.c
index ffa8eec..e7b092b 100644
--- a/net/netfilter/xt_connlimit.c
+++ b/net/netfilter/xt_connlimit.c
@@ -47,6 +47,8 @@ struct xt_connlimit_conn {
struct hlist_node node;
struct nf_conntrack_tuple tuple;
union nf_inet_addr addr;
+ int cpu;
+ u32 jiffies32;
};
struct xt_connlimit_rb {
@@ -126,6 +128,8 @@ static bool add_hlist(struct hlist_head *head,
return false;
conn->tuple = *tuple;
conn->addr = *addr;
+ conn->cpu = raw_smp_processor_id();
+ conn->jiffies32 = (u32)jiffies;
hlist_add_head(&conn->node, head);
return true;
}
@@ -148,8 +152,26 @@ static unsigned int check_hlist(struct net *net,
hlist_for_each_entry_safe(conn, n, head, node) {
found = nf_conntrack_find_get(net, zone, &conn->tuple);
if (found == NULL) {
- hlist_del(&conn->node);
- kmem_cache_free(connlimit_conn_cachep, conn);
+ /* If the connection is not found, it may be because
+ * it has not made it into the conntrack table yet. We
+ * check whether it is a recently created connection
+ * on a different core and do not delete it in that
+ * case.
+ */
+
+ unsigned long a, b;
+ int cpu = raw_smp_processor_id();
+ __u32 age;
+
+ b = conn->jiffies32;
+ a = (u32)jiffies;
+ age = a - b;
+ if (conn->cpu != cpu && age <= 2) {
+ length++;
+ } else {
+ hlist_del(&conn->node);
+ kmem_cache_free(connlimit_conn_cachep, conn);
+ }
continue;
}
@@ -271,6 +293,8 @@ static void tree_nodes_free(struct rb_root *root,
conn->tuple = *tuple;
conn->addr = *addr;
+ conn->cpu = raw_smp_processor_id();
+ conn->jiffies32 = (u32)jiffies;
rbconn->addr = *addr;
INIT_HLIST_HEAD(&rbconn->hhead);
--
1.8.3.1
An iptables rule like the following on a multicore system will result
in more connections being accepted than the limit set in the rule.
iptables -A INPUT -p tcp -m tcp --syn --dport 7777 -m connlimit \
--connlimit-above 2000 --connlimit-mask 0 -j DROP
In the check_hlist() function, connections that are found in the list
of saved connections but not in the netfilter conntrack table are
deleted, on the assumption that those connections no longer exist. But
on multicore systems there is a small time window in which a connection
has been added to the rb-tree maintained by xt_connlimit but has not
yet made it into the netfilter conntrack table. This causes the
concurrent-connection count to be wrong and lets connections exceed the
limit set in the iptables rule.
Connection 1 on Core 1:

  list_length = N
  conntrack_table_len = N

  spin_lock_bh()
  In check_hlist():
    a. loop over saved connections
       1. call nf_conntrack_find_get()
       2. if not found in 1,
          i. call hlist_del()
    b. return total count to caller
    c. connection 1 gets added to the list of saved connections
  spin_unlock_bh()

  list_length = N + 1

Connection 2 on Core 2:

  spin_lock_bh() on core 2
  In check_hlist():
    a. loop over saved connections
       1. call nf_conntrack_find_get()
       2. if not found in 1,
          i. call hlist_del()
             [connection 1 was in the list but not in nf_conntrack yet]
          ii. connection 1 gets deleted
              list_length = N
              conntrack_table_len = N
    b. return total count to caller
    c. connection 2 gets added to the list of saved connections
  spin_unlock_bh()

Back on Core 1:

  d. connection 1 gets added to nf_conntrack
     list_length = N + 1
     conntrack_table_len = N + 1

And on Core 2:

  e. connection 2 gets added to nf_conntrack
     list_length = N + 1
     conntrack_table_len = N + 2
So we end up with N + 1 connections in the list but N + 2 in
nf_conntrack, eventually allowing more connections than the limit set
in the rule.
This fix adds an additional field to track such pending connections and
prevents them from being deleted by another execution thread on a
different core, so that the correct count is returned.
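As a toy illustration of the effect of the new flag (userspace C, not the
actual kernel data structures; the names are invented for the example), an
entry that is on the saved list but not yet visible in conntrack is now
counted and kept instead of being freed:

#include <stdbool.h>
#include <stdio.h>

struct entry {
	bool in_conntrack;	/* stands in for nf_conntrack_find_get() */
	bool pending_add;	/* the flag introduced by this patch */
	bool on_list;
};

/* check_hlist()-like pass over one entry; returns its count contribution
 * (0 if it was removed from the saved list). */
static unsigned int check_one(struct entry *e)
{
	if (!e->in_conntrack) {
		if (e->pending_add)
			return 1;	/* keep: not in conntrack *yet* */
		e->on_list = false;	/* stale entry: delete as before */
		return 0;
	}
	e->pending_add = false;		/* now visible in conntrack */
	return 1;
}

int main(void)
{
	struct entry conn1 = {
		.in_conntrack = false, .pending_add = true, .on_list = true,
	};

	/* Core 2 runs the check before Core 1 has inserted conn1 into
	 * conntrack: without pending_add this would have deleted conn1. */
	printf("count contribution: %u, still on list: %s\n",
	       check_one(&conn1), conn1.on_list ? "yes" : "no");
	return 0;
}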
Signed-off-by: Alakesh Haloi <alakeshh(a)amazon.com>
Cc: Pablo Neira Ayuso <pablo(a)netfilter.org>
Cc: Jozsef Kadlecsik <kadlec(a)blackhole.kfki.hu>
Cc: Florian Westphal <fw(a)strlen.de>
Cc: "David S. Miller" <davem(a)davemloft.net>
Cc: stable(a)vger.kernel.org # v4.15 and before
---
net/netfilter/xt_connlimit.c | 24 +++++++++++++++++++++---
1 file changed, 21 insertions(+), 3 deletions(-)
diff --git a/net/netfilter/xt_connlimit.c b/net/netfilter/xt_connlimit.c
index ffa8eec980e9..bd7563c209a4 100644
--- a/net/netfilter/xt_connlimit.c
+++ b/net/netfilter/xt_connlimit.c
@@ -47,6 +47,7 @@ struct xt_connlimit_conn {
struct hlist_node node;
struct nf_conntrack_tuple tuple;
union nf_inet_addr addr;
+ bool pending_add;
};
struct xt_connlimit_rb {
@@ -126,6 +127,7 @@ static bool add_hlist(struct hlist_head *head,
return false;
conn->tuple = *tuple;
conn->addr = *addr;
+ conn->pending_add = true;
hlist_add_head(&conn->node, head);
return true;
}
@@ -144,15 +146,31 @@ static unsigned int check_hlist(struct net *net,
*addit = true;
- /* check the saved connections */
+ /* check the saved connections
+ */
hlist_for_each_entry_safe(conn, n, head, node) {
found = nf_conntrack_find_get(net, zone, &conn->tuple);
if (found == NULL) {
- hlist_del(&conn->node);
- kmem_cache_free(connlimit_conn_cachep, conn);
+ /* It could be an already deleted connection or
+ * a new connection that is not there in conntrack
+ * yet. If former delete it from the list, else
+ * increase count and move on.
+ */
+ if (conn->pending_add) {
+ length++;
+ } else {
+ hlist_del(&conn->node);
+ kmem_cache_free(connlimit_conn_cachep, conn);
+ }
continue;
}
+ /* If it is a connection that was pending insertion to
+ * connection tracking table before, then it's time to clear
+ * the flag.
+ */
+ conn->pending_add = false;
+
found_ct = nf_ct_tuplehash_to_ctrack(found);
if (nf_ct_tuple_equal(&conn->tuple, tuple)) {
--
2.14.4
Hi x86 maintainers,
This is an important fix that I believe needs to be merged for 4.21.
Without it, applications calling fork() can potentially double-allocate
a protection key, causing lots of strange problems.
Thomas's Reviewed-by is on the actual fix, but not the selftest.
I would also be happy to send this as a pull request if you would
prefer.
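For readers unfamiliar with the failure mode, here is a rough userspace
sketch of the kind of scenario the selftest covers (an assumption-laden
illustration, not the selftest itself; it needs pkeys-capable x86 hardware
and a glibc that provides pkey_alloc()):

#define _GNU_SOURCE
#include <sys/mman.h>
#include <sys/wait.h>
#include <stdio.h>
#include <unistd.h>

int main(void)
{
	int parent_key = pkey_alloc(0, 0);

	if (parent_key < 0) {
		perror("pkey_alloc");	/* no pkeys support on this system */
		return 1;
	}

	if (fork() == 0) {
		/* With the bug, the child's pkey allocation state can be
		 * reset at fork(), so this may return the same key number
		 * the parent already allocated, i.e. a double allocation. */
		int child_key = pkey_alloc(0, 0);

		printf("parent key %d, child key %d\n",
		       parent_key, child_key);
		_exit(0);
	}
	wait(NULL);
	return 0;
}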
Cc: x86(a)kernel.org
Cc: Thomas Gleixner <tglx(a)linutronix.de>
Cc: Ingo Molnar <mingo(a)redhat.com>
Cc: Borislav Petkov <bp(a)alien8.de>
Cc: "H. Peter Anvin" <hpa(a)zytor.com>
Cc: Peter Zijlstra <peterz(a)infradead.org>
Cc: Michael Ellerman <mpe(a)ellerman.id.au>
Cc: Will Deacon <will.deacon(a)arm.com>
Cc: Andy Lutomirski <luto(a)kernel.org>
Cc: Joerg Roedel <jroedel(a)suse.de>
Cc: stable(a)vger.kernel.org