Syzkaller reported a hung task with uevent_show() on stack trace. That
specific issue was addressed by another commit [0], but even with that
fix applied (for example, running v6.12-rc5) we face another type of hung
task that comes from the same reproducer [1]. By investigating that, we
could narrow it to the following path:
(a) Syzkaller emulates a Realtek USB WiFi adapter using raw-gadget and
dummy_hcd infrastructure.
(b) During the probe of rtl8192cu, the driver ends-up performing an efuse
read procedure (which is related to EEPROM load IIUC), and here lies the
issue: the function read_efuse() calls read_efuse_byte() many times, as
loop iterations depending on the efuse size (in our example, 512 in total).
This procedure for reading efuse bytes relies in a loop that performs an
I/O read up to *10k* times in case of failures. We measured the time of
the loop inside read_efuse_byte() alone, and in this reproducer (which
involves the dummy_hcd emulation layer), it takes 15 seconds each. As a
consequence, we have the driver stuck in its probe routine for big time,
exposing a stack trace like below if we attempt to reboot the system, for
example:
task:kworker/0:3 state:D stack:0 pid:662 tgid:662 ppid:2 flags:0x00004000
Workqueue: usb_hub_wq hub_event
Call Trace:
__schedule+0xe22/0xeb6
schedule_timeout+0xe7/0x132
__wait_for_common+0xb5/0x12e
usb_start_wait_urb+0xc5/0x1ef
? usb_alloc_urb+0x95/0xa4
usb_control_msg+0xff/0x184
_usbctrl_vendorreq_sync+0xa0/0x161
_usb_read_sync+0xb3/0xc5
read_efuse_byte+0x13c/0x146
read_efuse+0x351/0x5f0
efuse_read_all_map+0x42/0x52
rtl_efuse_shadow_map_update+0x60/0xef
rtl_get_hwinfo+0x5d/0x1c2
rtl92cu_read_eeprom_info+0x10a/0x8d5
? rtl92c_read_chip_version+0x14f/0x17e
rtl_usb_probe+0x323/0x851
usb_probe_interface+0x278/0x34b
really_probe+0x202/0x4a4
__driver_probe_device+0x166/0x1b2
driver_probe_device+0x2f/0xd8
[...]
We propose hereby to drastically reduce the attempts of doing the I/O reads
in case of failures, restricted to USB devices (given that they're inherently
slower than PCIe ones). By retrying up to 10 times (instead of 10000), we
got responsiveness in the reproducer, while seems reasonable to believe
that there's no sane USB device implementation in the field requiring this
amount of retries (at every I/O read) in order to properly work. Based on
that assumption, it'd be good to have it backported to stable but maybe not
since driver implementation (the 10k number comes from day 0), perhaps up
to 6.x series makes sense.
[0] Commit 15fffc6a5624 ("driver core: Fix uevent_show() vs driver detach race").
[1] A note about that: this syzkaller report presents multiple reproducers
that differs by the type of emulated USB device. For this specific case,
check the entry from 2024/08/08 06:23 in the list of crashes; the C repro
is available at https://syzkaller.appspot.com/text?tag=ReproC&x=1521fc83980000.
Cc: stable(a)vger.kernel.org # v6.1+
Reported-by: syzbot+edd9fe0d3a65b14588d5(a)syzkaller.appspotmail.com
Tested-by: Bitterblue Smith <rtl8821cerfe2(a)gmail.com>
Signed-off-by: Guilherme G. Piccoli <gpiccoli(a)igalia.com>
---
V2:
- Restrict the change to USB device only (thanks Ping-Ke Shih).
- Tested in 2 USB devices by Bitterblue Smith - thanks a lot!
V1: https://lore.kernel.org/lkml/20241025150226.896613-1-gpiccoli@igalia.com/
drivers/net/wireless/realtek/rtlwifi/efuse.c | 14 ++++++++++++--
1 file changed, 12 insertions(+), 2 deletions(-)
diff --git a/drivers/net/wireless/realtek/rtlwifi/efuse.c b/drivers/net/wireless/realtek/rtlwifi/efuse.c
index 82cf5fb5175f..f741066c06de 100644
--- a/drivers/net/wireless/realtek/rtlwifi/efuse.c
+++ b/drivers/net/wireless/realtek/rtlwifi/efuse.c
@@ -164,7 +164,17 @@ void read_efuse_byte(struct ieee80211_hw *hw, u16 _offset, u8 *pbuf)
struct rtl_priv *rtlpriv = rtl_priv(hw);
u32 value32;
u8 readbyte;
- u16 retry;
+ u16 retry, max_attempts;
+
+ /*
+ * In case of USB devices, transfer speeds are limited, hence
+ * efuse I/O reads could be (way) slower. So, decrease (a lot)
+ * the read attempts in case of failures.
+ */
+ if (rtlpriv->rtlhal.interface == INTF_PCI)
+ max_attempts = 10000;
+ else
+ max_attempts = 10;
rtl_write_byte(rtlpriv, rtlpriv->cfg->maps[EFUSE_CTRL] + 1,
(_offset & 0xff));
@@ -178,7 +188,7 @@ void read_efuse_byte(struct ieee80211_hw *hw, u16 _offset, u8 *pbuf)
retry = 0;
value32 = rtl_read_dword(rtlpriv, rtlpriv->cfg->maps[EFUSE_CTRL]);
- while (!(((value32 >> 24) & 0xff) & 0x80) && (retry < 10000)) {
+ while (!(((value32 >> 24) & 0xff) & 0x80) && (retry < max_attempts)) {
value32 = rtl_read_dword(rtlpriv,
rtlpriv->cfg->maps[EFUSE_CTRL]);
retry++;
--
2.46.2
Since commit 92a81562e695 ("leds: lp55xx: Add multicolor framework
support to lp55xx") there are two subsequent tests if the chan_nr
(reg property) is in valid range. One in the lp55xx_init_led()
function and one in the lp55xx_parse_common_child() function that
was added with the mentioned commit.
There are two issues with that.
First is in the lp55xx_parse_common_child() function where the reg
property is tested right after it is read from the device tree.
Test for the upper range is not correct though. Valid reg values are
0 to (max_channel - 1) so it should be >=.
Second issue is that in case the parsed value is out of the range
the probe just fails and no error message is shown as the code never
reaches the second test that prints and error message.
Remove the test form lp55xx_parse_common_child() function completely
and keep the one in lp55xx_init_led() function to deal with it.
Fixes: 92a81562e695 ("leds: lp55xx: Add multicolor framework support to lp55xx")
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Michal Vokáč <michal.vokac(a)ysoft.com>
---
v2:
- Complete change of the approach to the problem.
In v1 I removed the test from lp55xx_init_led() but I failed to test that
solution properly. It could not work. In v2 I removed the test for chan_nr
being out of range from the lp55xx_parse_common_child() function.
- Re-worded the subject and commit message to fit the changes. It was:
"leds: lp55xx: Fix check for invalid channel number"
drivers/leds/leds-lp55xx-common.c | 3 ---
1 file changed, 3 deletions(-)
diff --git a/drivers/leds/leds-lp55xx-common.c b/drivers/leds/leds-lp55xx-common.c
index 5a2e259679cf..e71456a56ab8 100644
--- a/drivers/leds/leds-lp55xx-common.c
+++ b/drivers/leds/leds-lp55xx-common.c
@@ -1132,9 +1132,6 @@ static int lp55xx_parse_common_child(struct device_node *np,
if (ret)
return ret;
- if (*chan_nr < 0 || *chan_nr > cfg->max_channel)
- return -EINVAL;
-
return 0;
}
--
2.1.4
Move the EXYNOS_UFS_OPT_UFSPR_SECURE check inside exynos_ufs_config_smu().
This way all call sites will benefit from the check. This fixes a bug
currently in the exynos_ufs_resume() path on gs101 as it calls
exynos_ufs_config_smu() and we end up accessing registers that can only
be accessed from secure world which results in a serror.
Fixes: d11e0a318df8 ("scsi: ufs: exynos: Add support for Tensor gs101 SoC")
Signed-off-by: Peter Griffin <peter.griffin(a)linaro.org>
Reviewed-by: Tudor Ambarus <tudor.ambarus(a)linaro.org>
Cc: stable(a)vger.kernel.org
---
v3: CC stable and be more verbose in commit message (Tudor)
---
drivers/ufs/host/ufs-exynos.c | 7 +++++--
1 file changed, 5 insertions(+), 2 deletions(-)
diff --git a/drivers/ufs/host/ufs-exynos.c b/drivers/ufs/host/ufs-exynos.c
index 33de7ff747a2..f4454e89040f 100644
--- a/drivers/ufs/host/ufs-exynos.c
+++ b/drivers/ufs/host/ufs-exynos.c
@@ -719,6 +719,9 @@ static void exynos_ufs_config_smu(struct exynos_ufs *ufs)
{
u32 reg, val;
+ if (ufs->opts & EXYNOS_UFS_OPT_UFSPR_SECURE)
+ return;
+
exynos_ufs_disable_auto_ctrl_hcc_save(ufs, &val);
/* make encryption disabled by default */
@@ -1454,8 +1457,8 @@ static int exynos_ufs_init(struct ufs_hba *hba)
if (ret)
goto out;
exynos_ufs_specify_phy_time_attr(ufs);
- if (!(ufs->opts & EXYNOS_UFS_OPT_UFSPR_SECURE))
- exynos_ufs_config_smu(ufs);
+
+ exynos_ufs_config_smu(ufs);
hba->host->dma_alignment = DATA_UNIT_SIZE - 1;
return 0;
--
2.47.0.163.g1226f6d8fa-goog
From: Ard Biesheuvel <ardb(a)kernel.org>
The TPM event log table is a Linux specific construct, where the data
produced by the GetEventLog() boot service is cached in memory, and
passed on to the OS using a EFI configuration table.
The use of EFI_LOADER_DATA here results in the region being left
unreserved in the E820 memory map constructed by the EFI stub, and this
is the memory description that is passed on to the incoming kernel by
kexec, which is therefore unaware that the region should be reserved.
Even though the utility of the TPM2 event log after a kexec is
questionable, any corruption might send the parsing code off into the
weeds and crash the kernel. So let's use EFI_ACPI_RECLAIM_MEMORY
instead, which is always treated as reserved by the E820 conversion
logic.
Cc: <stable(a)vger.kernel.org>
Reported-by: Breno Leitao <leitao(a)debian.org>
Tested-by: Usama Arif <usamaarif642(a)gmail.com>
Signed-off-by: Ard Biesheuvel <ardb(a)kernel.org>
---
drivers/firmware/efi/libstub/tpm.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/firmware/efi/libstub/tpm.c b/drivers/firmware/efi/libstub/tpm.c
index df3182f2e63a..1fd6823248ab 100644
--- a/drivers/firmware/efi/libstub/tpm.c
+++ b/drivers/firmware/efi/libstub/tpm.c
@@ -96,7 +96,7 @@ static void efi_retrieve_tcg2_eventlog(int version, efi_physical_addr_t log_loca
}
/* Allocate space for the logs and copy them. */
- status = efi_bs_call(allocate_pool, EFI_LOADER_DATA,
+ status = efi_bs_call(allocate_pool, EFI_ACPI_RECLAIM_MEMORY,
sizeof(*log_tbl) + log_size, (void **)&log_tbl);
if (status != EFI_SUCCESS) {
--
2.46.0.662.g92d0881bb0-goog
The quilt patch titled
Subject: mm, mmap: limit THP alignment of anonymous mappings to PMD-aligned sizes
has been removed from the -mm tree. Its filename was
mm-mmap-limit-thp-aligment-of-anonymous-mappings-to-pmd-aligned-sizes.patch
This patch was dropped because it was merged into the mm-hotfixes-stable branch
of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
------------------------------------------------------
From: Vlastimil Babka <vbabka(a)suse.cz>
Subject: mm, mmap: limit THP alignment of anonymous mappings to PMD-aligned sizes
Date: Thu, 24 Oct 2024 17:12:29 +0200
Since commit efa7df3e3bb5 ("mm: align larger anonymous mappings on THP
boundaries") a mmap() of anonymous memory without a specific address hint
and of at least PMD_SIZE will be aligned to PMD so that it can benefit
from a THP backing page.
However this change has been shown to regress some workloads
significantly. [1] reports regressions in various spec benchmarks, with
up to 600% slowdown of the cactusBSSN benchmark on some platforms. The
benchmark seems to create many mappings of 4632kB, which would have merged
to a large THP-backed area before commit efa7df3e3bb5 and now they are
fragmented to multiple areas each aligned to PMD boundary with gaps
between. The regression then seems to be caused mainly due to the
benchmark's memory access pattern suffering from TLB or cache aliasing due
to the aligned boundaries of the individual areas.
Another known regression bisected to commit efa7df3e3bb5 is darktable [2]
[3] and early testing suggests this patch fixes the regression there as
well.
To fix the regression but still try to benefit from THP-friendly anonymous
mapping alignment, add a condition that the size of the mapping must be a
multiple of PMD size instead of at least PMD size. In case of many
odd-sized mapping like the cactusBSSN creates, those will stop being
aligned and with gaps between, and instead naturally merge again.
Link: https://lkml.kernel.org/r/20241024151228.101841-2-vbabka@suse.cz
Fixes: efa7df3e3bb5 ("mm: align larger anonymous mappings on THP boundaries")
Signed-off-by: Vlastimil Babka <vbabka(a)suse.cz>
Reported-by: Michael Matz <matz(a)suse.de>
Debugged-by: Gabriel Krisman Bertazi <gabriel(a)krisman.be>
Closes: https://bugzilla.suse.com/show_bug.cgi?id=1229012 [1]
Reported-by: Matthias Bodenbinder <matthias(a)bodenbinder.de>
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=219366 [2]
Closes: https://lore.kernel.org/all/2050f0d4-57b0-481d-bab8-05e8d48fed0c@leemhuis.i… [3]
Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes(a)oracle.com>
Reviewed-by: Yang Shi <yang(a)os.amperecomputing.com>
Cc: Rik van Riel <riel(a)surriel.com>
Cc: Jann Horn <jannh(a)google.com>
Cc: Liam R. Howlett <Liam.Howlett(a)Oracle.com>
Cc: Petr Tesarik <ptesarik(a)suse.com>
Cc: Thorsten Leemhuis <regressions(a)leemhuis.info>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/mmap.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
--- a/mm/mmap.c~mm-mmap-limit-thp-aligment-of-anonymous-mappings-to-pmd-aligned-sizes
+++ a/mm/mmap.c
@@ -900,7 +900,8 @@ __get_unmapped_area(struct file *file, u
if (get_area) {
addr = get_area(file, addr, len, pgoff, flags);
- } else if (IS_ENABLED(CONFIG_TRANSPARENT_HUGEPAGE)) {
+ } else if (IS_ENABLED(CONFIG_TRANSPARENT_HUGEPAGE)
+ && IS_ALIGNED(len, PMD_SIZE)) {
/* Ensures that larger anonymous mappings are THP aligned. */
addr = thp_get_unmapped_area_vmflags(file, addr, len,
pgoff, flags, vm_flags);
_
Patches currently in -mm which might be from vbabka(a)suse.cz are