On Fri, Jun 24, 2022 at 1:44 PM Jason A. Donenfeld Jason@zx2c4.com wrote:
Even though hwrng provides a `wait` parameter, it doesn't work very well when waiting for a long time. There are numerous deadlocks that emerge related to shutdown. Work around this API limitation by waiting for a shorter amount of time and erroring more frequently. This commit also prevents hwrng from splatting messages to dmesg when there's a timeout and prevents calling msleep_interruptible() for tons of time when a thread is supposed to be shutting down, since msleep_interruptible() isn't actually interrupted by kthread_stop().
Reported-by: Gregory Erwin gregerwin256@gmail.com
Cc: Toke Høiland-Jørgensen toke@redhat.com
Cc: Kalle Valo kvalo@kernel.org
Cc: Rui Salvaterra rsalvaterra@gmail.com
Cc: Herbert Xu herbert@gondor.apana.org.au
Cc: stable@vger.kernel.org
Fixes: fcd09c90c3c5 ("ath9k: use hw_random API instead of directly dumping into random.c")
Link: https://lore.kernel.org/all/CAO+Okf6ZJC5-nTE_EJUGQtd8JiCkiEHytGgDsFGTEjs0c00...
Link: https://lore.kernel.org/lkml/CAO+Okf5k+C+SE6pMVfPf-d8MfVPVq4PO7EY8Hys_DVXten...
Link: https://bugs.archlinux.org/task/75138
Signed-off-by: Jason A. Donenfeld Jason@zx2c4.com
I do not have an ath9k and therefore I can't test this myself. The analysis above was done completely statically, with no dynamic tracing and just a bug report of symptoms from Gregory. So it might be totally wrong. Thus, this patch very much requires Gregory's testing. Please don't apply it until we have his `Tested-by` line.
 drivers/char/hw_random/core.c        | 10 ++++++++--
 drivers/net/wireless/ath/ath9k/rng.c | 19 ++-----------------
 2 files changed, 10 insertions(+), 19 deletions(-)
diff --git a/drivers/char/hw_random/core.c b/drivers/char/hw_random/core.c
index 16f227b995e8..af1c1905bb7e 100644
--- a/drivers/char/hw_random/core.c
+++ b/drivers/char/hw_random/core.c
@@ -513,8 +513,13 @@ static int hwrng_fillfn(void *unused)
 			break;

 		if (rc <= 0) {
-			pr_warn("hwrng: no data available\n");
-			msleep_interruptible(10000);
+			int i;
+
+			for (i = 0; i < 100; ++i) {
+				if (kthread_should_stop() ||
+				    msleep_interruptible(10000 / 100))
+					goto out;
+			}
 			continue;
 		}

@@ -529,6 +534,7 @@ static int hwrng_fillfn(void *unused)
 		add_hwgenerator_randomness((void *)rng_fillbuf, rc,
 					   entropy >> 10);
 	}
+out:
 	hwrng_fill = NULL;
 	return 0;
 }
diff --git a/drivers/net/wireless/ath/ath9k/rng.c b/drivers/net/wireless/ath/ath9k/rng.c
index cb5414265a9b..883110c66e5e 100644
--- a/drivers/net/wireless/ath/ath9k/rng.c
+++ b/drivers/net/wireless/ath/ath9k/rng.c
@@ -52,20 +52,6 @@ static int ath9k_rng_data_read(struct ath_softc *sc, u32 *buf, u32 buf_size)
 	return j << 2;
 }

-static u32 ath9k_rng_delay_get(u32 fail_stats)
-{
-	u32 delay;
-
-	if (fail_stats < 100)
-		delay = 10;
-	else if (fail_stats < 105)
-		delay = 1000;
-	else
-		delay = 10000;
-
-	return delay;
-}
-
 static int ath9k_rng_read(struct hwrng *rng, void *buf, size_t max, bool wait)
 {
 	struct ath_softc *sc = container_of(rng, struct ath_softc, rng_ops);
@@ -80,10 +66,9 @@ static int ath9k_rng_read(struct hwrng *rng, void *buf, size_t max, bool wait)
 			bytes_read += max & 3UL;
 			memzero_explicit(&word, sizeof(word));
 		}
-		if (!wait || !max || likely(bytes_read) || fail_stats > 110)
+		if (!wait || !max || likely(bytes_read) ||
+		    ++fail_stats >= 100 || msleep_interruptible(5))
 			break;
-
-		msleep_interruptible(ath9k_rng_delay_get(++fail_stats));
 	}

 	if (wait && !bytes_read && max)
--
2.35.1
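The core.c hunk replaces one 10-second sleep with 100 short sleeps and a stop check before each one. A minimal user-space sketch of that pattern follows, for illustration only. It is not kernel code: `stop_requested` stands in for kthread_should_stop(), nanosleep() stands in for msleep_interruptible(), and every helper name is hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <time.h>

/* Assumption: plays the role of kthread_should_stop() in this sketch. */
static atomic_bool stop_requested;

/* Sleep for ms milliseconds; returns nonzero if the sleep was cut short. */
static int sleep_ms(long ms)
{
	struct timespec ts = {
		.tv_sec = ms / 1000,
		.tv_nsec = (ms % 1000) * 1000000L,
	};

	return nanosleep(&ts, NULL);
}

/*
 * Wait up to total_ms in `slices` equal pieces, checking the stop flag
 * before each piece. Returns true if the wait was abandoned early.
 */
static bool chunked_wait(long total_ms, int slices)
{
	for (int i = 0; i < slices; ++i) {
		if (atomic_load(&stop_requested) ||
		    sleep_ms(total_ms / slices))
			return true;
	}
	return false;
}

/* Worker loop: keep waiting in slices until a stop is requested. */
static void *worker(void *unused)
{
	(void)unused;
	while (!chunked_wait(10000, 100))
		;	/* a real worker would retry its read here */
	return NULL;
}

/* Start the worker, request a stop, and return the join time in ms. */
static long stop_latency_ms(void)
{
	pthread_t t;
	struct timespec a, b;

	atomic_store(&stop_requested, false);
	pthread_create(&t, NULL, worker, NULL);
	sleep_ms(50);	/* let the worker enter its first slice */

	clock_gettime(CLOCK_MONOTONIC, &a);
	atomic_store(&stop_requested, true);
	pthread_join(t, NULL);
	clock_gettime(CLOCK_MONOTONIC, &b);

	return (b.tv_sec - a.tv_sec) * 1000 +
	       (b.tv_nsec - a.tv_nsec) / 1000000;
}
```

In this sketch the worker notices a stop request within roughly one 100 ms slice. A single unsliced sleep would leave the caller of pthread_join() waiting for the rest of the 10 seconds.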
Jason,
This patch is working as you described. Trying to read from /dev/hwrng consistently blocks for only 1.3s before returning an IO error. The longest that I observed 'ip link set wlan0 down' to block was also about 1.3s, and that was immediately after 'cat /dev/hwrng'. Additionally, the longest duration that I observed for wiphy_suspend() to return was just under 100ms.
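For anyone wanting to repeat the read timing, one possible way to measure it is sketched below. This is an assumption about method, not a record of how the numbers above were taken, and `timed_read_ms` is a hypothetical helper name.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <time.h>
#include <unistd.h>

/*
 * Hypothetical helper: issue one read() of up to len bytes (capped at 64)
 * from path and return the time that read took in milliseconds.
 * Returns -1 if the file cannot be opened. *err receives the errno of a
 * failed open() or read(), or 0 if both succeeded.
 */
static long timed_read_ms(const char *path, size_t len, int *err)
{
	char buf[64];
	struct timespec a, b;
	int fd;

	*err = 0;
	fd = open(path, O_RDONLY);
	if (fd < 0) {
		*err = errno;
		return -1;
	}
	if (len > sizeof(buf))
		len = sizeof(buf);

	clock_gettime(CLOCK_MONOTONIC, &a);
	if (read(fd, buf, len) < 0)
		*err = errno;	/* saved before close() can change errno */
	clock_gettime(CLOCK_MONOTONIC, &b);
	close(fd);

	return (b.tv_sec - a.tv_sec) * 1000 +
	       (b.tv_nsec - a.tv_nsec) / 1000000;
}
```

Calling `timed_read_ms("/dev/hwrng", 16, &err)` as root on an affected machine should show how long the read blocks, and `err` should hold the error code behind the IO error (presumably EIO).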
Tested-by: Gregory Erwin gregerwin256@gmail.com