Syzkaller reported a hung task with uevent_show() on stack trace. That
specific issue was addressed by another commit [0], but even with that
fix applied (for example, running v6.12-rc4) we face another type of hung
task that comes from the same reproducer [1]. By investigating that, we
could narrow it to the following path:
(a) Syzkaller emulates a Realtek USB WiFi adapter using raw-gadget and
dummy_hcd infrastructure.
(b) During the probe of rtl8192cu, the driver ends-up performing an efuse
read procedure (which is related to EEPROM load IIUC), and here lies the
issue: the function read_efuse() calls read_efuse_byte() many times, as
loop iterations depending on the efuse size (in our example, 512 in total).
This procedure for reading efuse bytes relies in a loop that performs an
I/O read up to *10k* times in case of failures. We measured the time of
the loop inside read_efuse_byte() alone, and in this reproducer (which
involves the dummy_hcd emulation layer), it takes 15 seconds each. As a
consequence, we have the driver stuck in its probe routine for big time,
exposing a stack trace like below if we attempt to reboot the system, for
example:
task:kworker/0:3 state:D stack:0 pid:662 tgid:662 ppid:2 flags:0x00004000
Workqueue: usb_hub_wq hub_event
Call Trace:
__schedule+0xe22/0xeb6
schedule_timeout+0xe7/0x132
__wait_for_common+0xb5/0x12e
usb_start_wait_urb+0xc5/0x1ef
? usb_alloc_urb+0x95/0xa4
usb_control_msg+0xff/0x184
_usbctrl_vendorreq_sync+0xa0/0x161
_usb_read_sync+0xb3/0xc5
read_efuse_byte+0x13c/0x146
read_efuse+0x351/0x5f0
efuse_read_all_map+0x42/0x52
rtl_efuse_shadow_map_update+0x60/0xef
rtl_get_hwinfo+0x5d/0x1c2
rtl92cu_read_eeprom_info+0x10a/0x8d5
? rtl92c_read_chip_version+0x14f/0x17e
rtl_usb_probe+0x323/0x851
usb_probe_interface+0x278/0x34b
really_probe+0x202/0x4a4
__driver_probe_device+0x166/0x1b2
driver_probe_device+0x2f/0xd8
[...]
We propose hereby to drastically reduce the attempts of doing the I/O read
in case of failures, from 10000 to 10. With that, we got reponsiveness in the
reproducer, while seems reasonable to believe that there's no sane device
implementation in the field requiring this amount of retries at every I/O
read in order to properly work. Based on that assumption it'd be good to
have it backported to stable but maybe not since driver implementation
(the 10k number comes from day 0), perhaps up to 6.x series makes sense.
[0] Commit 15fffc6a5624 ("driver core: Fix uevent_show() vs driver detach race").
[1] A note about that: this syzkaller report presents multiple reproducers
that differs by the type of emulated USB device. For this specific case,
check the entry from 2024/08/08 06:23 in the list of crashes; the C repro
is available at https://syzkaller.appspot.com/text?tag=ReproC&x=1521fc83980000.
Cc: stable(a)vger.kernel.org # v6.1+
Reported-by: syzbot+edd9fe0d3a65b14588d5(a)syzkaller.appspotmail.com
Signed-off-by: Guilherme G. Piccoli <gpiccoli(a)igalia.com>
---
drivers/net/wireless/realtek/rtlwifi/efuse.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/net/wireless/realtek/rtlwifi/efuse.c b/drivers/net/wireless/realtek/rtlwifi/efuse.c
index 82cf5fb5175f..2f75e376c0f6 100644
--- a/drivers/net/wireless/realtek/rtlwifi/efuse.c
+++ b/drivers/net/wireless/realtek/rtlwifi/efuse.c
@@ -178,7 +178,7 @@ void read_efuse_byte(struct ieee80211_hw *hw, u16 _offset, u8 *pbuf)
retry = 0;
value32 = rtl_read_dword(rtlpriv, rtlpriv->cfg->maps[EFUSE_CTRL]);
- while (!(((value32 >> 24) & 0xff) & 0x80) && (retry < 10000)) {
+ while (!(((value32 >> 24) & 0xff) & 0x80) && (retry < 10)) {
value32 = rtl_read_dword(rtlpriv,
rtlpriv->cfg->maps[EFUSE_CTRL]);
retry++;
--
2.46.2
Sometimes the hub driver does not recognize the USB device connected
to the external USB2.0 hub when the system resumes from S4.
After the SetPortFeature(PORT_RESET) request is completed, the hub
driver calls the HCD reset_device callback, which will issue a Reset
Device command and free all structures associated with endpoints
that were disabled.
This happens when the xHCI driver issue a Reset Device command to
inform the Etron xHCI host that the USB device associated with a
device slot has been reset. Seems that the Etron xHCI host can not
perform this command correctly, affecting the USB device.
To work around this, the xHCI driver should obtain a new device slot
with reference to commit 651aaf36a7d7 ("usb: xhci: Handle USB transaction
error on address command"), which is another way to inform the Etron
xHCI host that the USB device has been reset.
Add a new XHCI_ETRON_HOST quirk flag to invoke the workaround in
xhci_discover_or_reset_device().
Fixes: 2a8f82c4ceaf ("USB: xhci: Notify the xHC when a device is reset.")
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Kuangyi Chiang <ki.chiang65(a)gmail.com>
---
drivers/usb/host/xhci-pci.c | 1 +
drivers/usb/host/xhci.c | 19 +++++++++++++++++++
drivers/usb/host/xhci.h | 1 +
3 files changed, 21 insertions(+)
diff --git a/drivers/usb/host/xhci-pci.c b/drivers/usb/host/xhci-pci.c
index 33a6d99afc10..ddc9a82cceec 100644
--- a/drivers/usb/host/xhci-pci.c
+++ b/drivers/usb/host/xhci-pci.c
@@ -397,6 +397,7 @@ static void xhci_pci_quirks(struct device *dev, struct xhci_hcd *xhci)
if (pdev->vendor == PCI_VENDOR_ID_ETRON &&
(pdev->device == PCI_DEVICE_ID_EJ168 ||
pdev->device == PCI_DEVICE_ID_EJ188)) {
+ xhci->quirks |= XHCI_ETRON_HOST;
xhci->quirks |= XHCI_RESET_ON_RESUME;
xhci->quirks |= XHCI_BROKEN_STREAMS;
}
diff --git a/drivers/usb/host/xhci.c b/drivers/usb/host/xhci.c
index 899c0effb5d3..ef7ead6393d4 100644
--- a/drivers/usb/host/xhci.c
+++ b/drivers/usb/host/xhci.c
@@ -3692,6 +3692,8 @@ void xhci_free_device_endpoint_resources(struct xhci_hcd *xhci,
xhci->num_active_eps);
}
+static void xhci_free_dev(struct usb_hcd *hcd, struct usb_device *udev);
+
/*
* This submits a Reset Device Command, which will set the device state to 0,
* set the device address to 0, and disable all the endpoints except the default
@@ -3762,6 +3764,23 @@ static int xhci_discover_or_reset_device(struct usb_hcd *hcd,
SLOT_STATE_DISABLED)
return 0;
+ if (xhci->quirks & XHCI_ETRON_HOST) {
+ /*
+ * Obtaining a new device slot to inform the xHCI host that
+ * the USB device has been reset.
+ */
+ ret = xhci_disable_slot(xhci, udev->slot_id);
+ xhci_free_virt_device(xhci, udev->slot_id);
+ if (!ret) {
+ ret = xhci_alloc_dev(hcd, udev);
+ if (ret == 1)
+ ret = 0;
+ else
+ ret = -EINVAL;
+ }
+ return ret;
+ }
+
trace_xhci_discover_or_reset_device(slot_ctx);
xhci_dbg(xhci, "Resetting device with slot ID %u\n", slot_id);
diff --git a/drivers/usb/host/xhci.h b/drivers/usb/host/xhci.h
index f0fb696d5619..4f5b732e8944 100644
--- a/drivers/usb/host/xhci.h
+++ b/drivers/usb/host/xhci.h
@@ -1624,6 +1624,7 @@ struct xhci_hcd {
#define XHCI_ZHAOXIN_HOST BIT_ULL(46)
#define XHCI_WRITE_64_HI_LO BIT_ULL(47)
#define XHCI_CDNS_SCTX_QUIRK BIT_ULL(48)
+#define XHCI_ETRON_HOST BIT_ULL(49)
unsigned int num_active_eps;
unsigned int limit_active_eps;
--
2.25.1
This series addresses two issues in spm_cpuidle_register: a missing
device_node release in an error path, and releasing a reference to a
device after its usage.
These issues were found while analyzing the code, and the patches have
been successfully compiled and statically analyzed, but not tested on
real hardware as I don't have access to it. Any volunteering for testing
is always more than welcome.
Signed-off-by: Javier Carrasco <javier.carrasco.cruz(a)gmail.com>
---
Javier Carrasco (2):
cpuidle: qcom-spm: fix device node release in spm_cpuidle_register
cpuidle: qcom-spm: fix platform device release in spm_cpuidle_register
drivers/cpuidle/cpuidle-qcom-spm.c | 7 +++++--
1 file changed, 5 insertions(+), 2 deletions(-)
---
base-commit: 6fb2fa9805c501d9ade047fc511961f3273cdcb5
change-id: 20241029-cpuidle-qcom-spm-cleanup-6f03669bcd70
Best regards,
--
Javier Carrasco <javier.carrasco.cruz(a)gmail.com>
The patch titled
Subject: mm/page_alloc: keep track of free highatomic
has been added to the -mm mm-hotfixes-unstable branch. Its filename is
mm-page_alloc-keep-track-of-free-highatomic.patch
This patch will shortly appear at
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patche…
This patch will later appear in the mm-hotfixes-unstable branch at
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next via the mm-everything
branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there every 2-3 working days
------------------------------------------------------
From: Yu Zhao <yuzhao(a)google.com>
Subject: mm/page_alloc: keep track of free highatomic
Date: Mon, 28 Oct 2024 12:26:53 -0600
OOM kills due to vastly overestimated free highatomic reserves were
observed:
... invoked oom-killer: gfp_mask=0x100cca(GFP_HIGHUSER_MOVABLE), order=0 ...
Node 0 Normal free:1482936kB boost:0kB min:410416kB low:739404kB high:1068392kB reserved_highatomic:1073152KB ...
Node 0 Normal: 1292*4kB (ME) 1920*8kB (E) 383*16kB (UE) 220*32kB (ME) 340*64kB (E) 2155*128kB (UE) 3243*256kB (UE) 615*512kB (U) 1*1024kB (M) 0*2048kB 0*4096kB = 1477408kB
The second line above shows that the OOM kill was due to the following
condition:
free (1482936kB) - reserved_highatomic (1073152kB) = 409784KB < min (410416kB)
And the third line shows there were no free pages in any
MIGRATE_HIGHATOMIC pageblocks, which otherwise would show up as type 'H'.
Therefore __zone_watermark_unusable_free() underestimated the usable free
memory by over 1GB, which resulted in the unnecessary OOM kill above.
The comments in __zone_watermark_unusable_free() warns about the potential
risk, i.e.,
If the caller does not have rights to reserves below the min
watermark then subtract the high-atomic reserves. This will
over-estimate the size of the atomic reserve but it avoids a search.
However, it is possible to keep track of free pages in reserved highatomic
pageblocks with a new per-zone counter nr_free_highatomic protected by the
zone lock, to avoid a search when calculating the usable free memory. And
the cost would be minimal, i.e., simple arithmetics in the highatomic
alloc/free/move paths.
Note that since nr_free_highatomic can be relatively small, using a
per-cpu counter might cause too much drift and defeat its purpose, in
addition to the extra memory overhead.
Link: https://lkml.kernel.org/r/20241028182653.3420139-1-yuzhao@google.com
Signed-off-by: Yu Zhao <yuzhao(a)google.com>
Reported-by: Link Lin <linkl(a)google.com>
Acked-by: David Rientjes <rientjes(a)google.com>
Acked-by: Vlastimil Babka <vbabka(a)suse.cz>
Cc: <stable(a)vger.kernel.org> [6.12+]
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
include/linux/mmzone.h | 1 +
mm/page_alloc.c | 10 +++++++---
2 files changed, 8 insertions(+), 3 deletions(-)
--- a/include/linux/mmzone.h~mm-page_alloc-keep-track-of-free-highatomic
+++ a/include/linux/mmzone.h
@@ -823,6 +823,7 @@ struct zone {
unsigned long watermark_boost;
unsigned long nr_reserved_highatomic;
+ unsigned long nr_free_highatomic;
/*
* We don't know if the memory that we're going to allocate will be
--- a/mm/page_alloc.c~mm-page_alloc-keep-track-of-free-highatomic
+++ a/mm/page_alloc.c
@@ -635,6 +635,8 @@ compaction_capture(struct capture_contro
static inline void account_freepages(struct zone *zone, int nr_pages,
int migratetype)
{
+ lockdep_assert_held(&zone->lock);
+
if (is_migrate_isolate(migratetype))
return;
@@ -642,6 +644,9 @@ static inline void account_freepages(str
if (is_migrate_cma(migratetype))
__mod_zone_page_state(zone, NR_FREE_CMA_PAGES, nr_pages);
+
+ if (is_migrate_highatomic(migratetype))
+ WRITE_ONCE(zone->nr_free_highatomic, zone->nr_free_highatomic + nr_pages);
}
/* Used for pages not on another list */
@@ -3081,11 +3086,10 @@ static inline long __zone_watermark_unus
/*
* If the caller does not have rights to reserves below the min
- * watermark then subtract the high-atomic reserves. This will
- * over-estimate the size of the atomic reserve but it avoids a search.
+ * watermark then subtract the free pages reserved for highatomic.
*/
if (likely(!(alloc_flags & ALLOC_RESERVES)))
- unusable_free += z->nr_reserved_highatomic;
+ unusable_free += READ_ONCE(z->nr_free_highatomic);
#ifdef CONFIG_CMA
/* If allocation can't use CMA areas don't use free CMA pages */
_
Patches currently in -mm which might be from yuzhao(a)google.com are
mm-allow-set-clear-page_type-again.patch
mm-multi-gen-lru-remove-mm_leaf_old-and-mm_nonleaf_total-stats.patch
mm-multi-gen-lru-use-pteppmdp_clear_young_notify.patch
mm-page_alloc-keep-track-of-free-highatomic.patch