The quilt patch titled
Subject: mm: let pte_lockptr() consume a pte_t pointer
has been removed from the -mm tree. Its filename was
mm-let-pte_lockptr-consume-a-pte_t-pointer.patch
This patch was dropped because an updated version will be issued
------------------------------------------------------
From: David Hildenbrand <david(a)redhat.com>
Subject: mm: let pte_lockptr() consume a pte_t pointer
Date: Thu, 25 Jul 2024 20:39:54 +0200
Patch series "mm/hugetlb: fix hugetlb vs. core-mm PT locking".
Working on another generic page table walker that tries to avoid
special-casing hugetlb, I found a page table locking issue with hugetlb
folios that are not mapped using a single PMD/PUD.
For some hugetlb folio sizes, GUP will take different page table locks
when walking the page tables than the hugetlb code takes when modifying
the page tables.
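
To make the mismatch concrete: hugetlb picks its page table lock via
huge_pte_lockptr(), which (roughly, as of this series' base -- a sketch
for illustration, not part of this patch) falls back to
mm->page_table_lock for anything that is not exactly PMD-sized, while
GUP's PTE walk takes the split PTE table lock:

    static inline spinlock_t *
    huge_pte_lockptr(struct hstate *h, struct mm_struct *mm, pte_t *pte)
    {
            /* PMD-sized hugetlb folios use the PMD table lock ... */
            if (huge_page_size(h) == PMD_SIZE)
                    return pmd_lockptr(mm, (pmd_t *) pte);
            /* ... everything else (e.g., cont-PTE) uses the mm-wide lock. */
            VM_BUG_ON(huge_page_size(h) == PAGE_SIZE);
            return &mm->page_table_lock;
    }
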
I did not actually try reproducing an issue, but looking at
follow_pmd_mask(), where we might be rereading a PMD value multiple
times, it's clear that concurrent modifications are unpleasant. In
follow_page_pte() we might fare better in that regard -- ptep_get() does
a READ_ONCE() -- but who knows what else could happen concurrently in some
weird corner cases (e.g., a hugetlb folio getting unmapped and freed).
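
For illustration only (a minimal sketch, not the actual GUP code): the
robust pattern is to take a single lockless snapshot of the entry and
base every subsequent check on that snapshot, which is what the
READ_ONCE()-based accessors give you:

    /* One snapshot; rereading *pmd may observe concurrent changes. */
    pmd_t pmdval = pmdp_get_lockless(pmd);

    if (!pmd_present(pmdval))
            return NULL;
    /* ... from here on, consult only pmdval, never *pmd again ... */
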
This patch (of 2):
pte_lockptr() is the only *_lockptr() function that doesn't consume what
would be expected: it consumes a pmd_t pointer instead of a pte_t pointer.
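
For comparison, the current signatures look roughly like this (a sketch;
pte_lockptr() is the odd one out):

    spinlock_t *pud_lockptr(struct mm_struct *mm, pud_t *pud);
    spinlock_t *pmd_lockptr(struct mm_struct *mm, pmd_t *pmd);
    spinlock_t *pte_lockptr(struct mm_struct *mm, pmd_t *pmd); /* pmd_t! */
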
Let's change that. The two callers in pgtable-generic.c are easily
adjusted. Adjust khugepaged.c:retract_page_tables() to simply do a
pte_offset_map_nolock() to obtain the lock, even though we won't actually
be traversing the page table.
This makes the code more similar to the other variants and avoids other
hacks to make the new pte_lockptr() version happy. pte_lockptr() users
now reside only in pgtable-generic.c.
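
Condensed, the pattern the khugepaged hunk below adopts looks like this
(a sketch only; see the actual diff for the full context):

    pte = pte_offset_map_nolock(mm, pmd, addr, &ptl);
    if (!pte)
            goto unlock_pmd;        /* PTE table is already gone */
    if (ptl != pml)
            spin_lock_nested(ptl, SINGLE_DEPTH_NESTING);
    pte_unmap(pte);                 /* only the lock is needed, not the table */
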
Maybe using pte_offset_map_nolock() is the right thing to do anyway,
because the PTE table could have been removed in the meantime? At least
it sounds more future-proof if we ever have other means of page table
reclaim.
It's not quite clear whether holding the PTE table lock is really
required: what if someone else obtains the lock just after we unlock it?
But we'll leave that as-is for now; maybe there are good reasons.
This is a preparation for adapting hugetlb page table locking logic to
take the same locks as core-mm page table walkers would.
Link: https://lkml.kernel.org/r/20240725183955.2268884-1-david@redhat.com
Link: https://lkml.kernel.org/r/20240725183955.2268884-2-david@redhat.com
Fixes: 9cb28da54643 ("mm/gup: handle hugetlb in the generic follow_page_mask code")
Signed-off-by: David Hildenbrand <david(a)redhat.com>
Cc: Muchun Song <muchun.song(a)linux.dev>
Cc: Oscar Salvador <osalvador(a)suse.de>
Cc: Peter Xu <peterx(a)redhat.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
include/linux/mm.h | 7 ++++---
mm/khugepaged.c | 21 +++++++++++++++------
mm/pgtable-generic.c | 4 ++--
3 files changed, 21 insertions(+), 11 deletions(-)
--- a/include/linux/mm.h~mm-let-pte_lockptr-consume-a-pte_t-pointer
+++ a/include/linux/mm.h
@@ -2915,9 +2915,10 @@ static inline spinlock_t *ptlock_ptr(str
}
#endif /* ALLOC_SPLIT_PTLOCKS */
-static inline spinlock_t *pte_lockptr(struct mm_struct *mm, pmd_t *pmd)
+static inline spinlock_t *pte_lockptr(struct mm_struct *mm, pte_t *pte)
{
- return ptlock_ptr(page_ptdesc(pmd_page(*pmd)));
+ /* PTE page tables don't currently exceed a single page. */
+ return ptlock_ptr(virt_to_ptdesc(pte));
}
static inline bool ptlock_init(struct ptdesc *ptdesc)
@@ -2940,7 +2941,7 @@ static inline bool ptlock_init(struct pt
/*
* We use mm->page_table_lock to guard all pagetable pages of the mm.
*/
-static inline spinlock_t *pte_lockptr(struct mm_struct *mm, pmd_t *pmd)
+static inline spinlock_t *pte_lockptr(struct mm_struct *mm, pte_t *pte)
{
return &mm->page_table_lock;
}
--- a/mm/khugepaged.c~mm-let-pte_lockptr-consume-a-pte_t-pointer
+++ a/mm/khugepaged.c
@@ -1697,12 +1697,13 @@ static void retract_page_tables(struct a
i_mmap_lock_read(mapping);
vma_interval_tree_foreach(vma, &mapping->i_mmap, pgoff, pgoff) {
struct mmu_notifier_range range;
+ bool retracted = false;
struct mm_struct *mm;
unsigned long addr;
pmd_t *pmd, pgt_pmd;
spinlock_t *pml;
spinlock_t *ptl;
- bool skipped_uffd = false;
+ pte_t *pte;
/*
* Check vma->anon_vma to exclude MAP_PRIVATE mappings that
@@ -1739,9 +1740,17 @@ static void retract_page_tables(struct a
mmu_notifier_invalidate_range_start(&range);
pml = pmd_lock(mm, pmd);
- ptl = pte_lockptr(mm, pmd);
+
+ /*
+ * No need to check the PTE table content, but we'll grab the
+ * PTE table lock while we zap it.
+ */
+ pte = pte_offset_map_nolock(mm, pmd, addr, &ptl);
+ if (!pte)
+ goto unlock_pmd;
if (ptl != pml)
spin_lock_nested(ptl, SINGLE_DEPTH_NESTING);
+ pte_unmap(pte);
/*
* Huge page lock is still held, so normally the page table
@@ -1752,20 +1761,20 @@ static void retract_page_tables(struct a
* repeating the anon_vma check protects from one category,
* and repeating the userfaultfd_wp() check from another.
*/
- if (unlikely(vma->anon_vma || userfaultfd_wp(vma))) {
- skipped_uffd = true;
- } else {
+ if (likely(!vma->anon_vma && !userfaultfd_wp(vma))) {
pgt_pmd = pmdp_collapse_flush(vma, addr, pmd);
pmdp_get_lockless_sync();
+ retracted = true;
}
if (ptl != pml)
spin_unlock(ptl);
+unlock_pmd:
spin_unlock(pml);
mmu_notifier_invalidate_range_end(&range);
- if (!skipped_uffd) {
+ if (retracted) {
mm_dec_nr_ptes(mm);
page_table_check_pte_clear_range(mm, addr, pgt_pmd);
pte_free_defer(mm, pmd_pgtable(pgt_pmd));
--- a/mm/pgtable-generic.c~mm-let-pte_lockptr-consume-a-pte_t-pointer
+++ a/mm/pgtable-generic.c
@@ -313,7 +313,7 @@ pte_t *pte_offset_map_nolock(struct mm_s
pte = __pte_offset_map(pmd, addr, &pmdval);
if (likely(pte))
- *ptlp = pte_lockptr(mm, &pmdval);
+ *ptlp = pte_lockptr(mm, pte);
return pte;
}
@@ -371,7 +371,7 @@ again:
pte = __pte_offset_map(pmd, addr, &pmdval);
if (unlikely(!pte))
return pte;
- ptl = pte_lockptr(mm, &pmdval);
+ ptl = pte_lockptr(mm, pte);
spin_lock(ptl);
if (likely(pmd_same(pmdval, pmdp_get_lockless(pmd)))) {
*ptlp = ptl;
_
Patches currently in -mm which might be from david(a)redhat.com are
mm-let-pte_lockptr-consume-a-pte_t-pointer-fix.patch
mm-hugetlb-fix-hugetlb-vs-core-mm-pt-locking.patch
mm-turn-use_split_pte_ptlocks-use_split_pte_ptlocks-into-kconfig-options.patch
mm-hugetlb-enforce-that-pmd-pt-sharing-has-split-pmd-pt-locks.patch
powerpc-8xx-document-and-enforce-that-split-pt-locks-are-not-used.patch
mm-simplify-arch_make_folio_accessible.patch
mm-gup-convert-to-arch_make_folio_accessible.patch
s390-uv-drop-arch_make_page_accessible.patch
Commit 7ba5ca32fe6e ("ALSA: firewire-lib: operate for period elapse event
in process context") removed the process context workqueue from
amdtp_domain_stream_pcm_pointer() and update_pcm_pointers() to remove
its overhead.
With RME Fireface 800, this led to a regression since kernel 5.14.0,
causing an AB/BA deadlock on the substream lock and an eventual
system freeze under ALSA operation:
thread 0:
* (lock A) acquire substream lock by
snd_pcm_stream_lock_irq() in
snd_pcm_status64()
* (lock B) wait for tasklet to finish by calling
tasklet_unlock_spin_wait() in
tasklet_disable_in_atomic() in
ohci_flush_iso_completions() of ohci.c
thread 1:
* (lock B) enter tasklet
* (lock A) attempt to acquire substream lock,
waiting for it to be released:
snd_pcm_stream_lock_irqsave() in
snd_pcm_period_elapsed() in
update_pcm_pointers() in
process_ctx_payloads() in
process_rx_packets() of amdtp-stream.c
? tasklet_unlock_spin_wait
</NMI>
<TASK>
ohci_flush_iso_completions firewire_ohci
amdtp_domain_stream_pcm_pointer snd_firewire_lib
snd_pcm_update_hw_ptr0 snd_pcm
snd_pcm_status64 snd_pcm
? native_queued_spin_lock_slowpath
</NMI>
<IRQ>
_raw_spin_lock_irqsave
snd_pcm_period_elapsed snd_pcm
process_rx_packets snd_firewire_lib
irq_target_callback snd_firewire_lib
handle_it_packet firewire_ohci
context_tasklet firewire_ohci
Restore the process context workqueue to prevent the AB/BA deadlock
on the ALSA substream lock, taken by snd_pcm_stream_lock_irq() in
snd_pcm_status64() and by snd_pcm_stream_lock_irqsave() in
snd_pcm_period_elapsed().
This reverts commit 7ba5ca32fe6e ("ALSA: firewire-lib: operate for period
elapse event in process context").
Replace the inline comments to document the deadlock and prevent its
reintroduction.
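
For reference, the restored deferral looks roughly like this (a sketch of
the pre-7ba5ca32fe6e work handler; see amdtp-stream.c for the real code):

    // Runs in process context, where taking the substream lock is safe.
    static void pcm_period_work(struct work_struct *work)
    {
            struct amdtp_stream *s =
                    container_of(work, struct amdtp_stream, period_work);
            struct snd_pcm_substream *pcm = READ_ONCE(s->pcm);

            if (pcm)
                    snd_pcm_period_elapsed(pcm);
    }
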
Cc: stable(a)vger.kernel.org
Fixes: 7ba5ca32fe6e ("ALSA: firewire-lib: operate for period elapse event in process context")
Reported-by: edmund.raile <edmund.raile(a)proton.me>
Closes: https://lore.kernel.org/r/kwryofzdmjvzkuw6j3clftsxmoolynljztxqwg76hzeo4simn…
Signed-off-by: Edmund Raile <edmund.raile(a)protonmail.com>
---
sound/firewire/amdtp-stream.c | 23 +++++++++--------------
1 file changed, 9 insertions(+), 14 deletions(-)
diff --git a/sound/firewire/amdtp-stream.c b/sound/firewire/amdtp-stream.c
index 31201d506a21..7438999e0510 100644
--- a/sound/firewire/amdtp-stream.c
+++ b/sound/firewire/amdtp-stream.c
@@ -615,16 +615,8 @@ static void update_pcm_pointers(struct amdtp_stream *s,
// The program in user process should periodically check the status of intermediate
// buffer associated to PCM substream to process PCM frames in the buffer, instead
// of receiving notification of period elapsed by poll wait.
- if (!pcm->runtime->no_period_wakeup) {
- if (in_softirq()) {
- // In software IRQ context for 1394 OHCI.
- snd_pcm_period_elapsed(pcm);
- } else {
- // In process context of ALSA PCM application under acquired lock of
- // PCM substream.
- snd_pcm_period_elapsed_under_stream_lock(pcm);
- }
- }
+ if (!pcm->runtime->no_period_wakeup)
+ queue_work(system_highpri_wq, &s->period_work);
}
}
@@ -1864,11 +1856,14 @@ unsigned long amdtp_domain_stream_pcm_pointer(struct amdtp_domain *d,
{
struct amdtp_stream *irq_target = d->irq_target;
- // Process isochronous packets queued till recent isochronous cycle to handle PCM frames.
if (irq_target && amdtp_stream_running(irq_target)) {
- // In software IRQ context, the call causes dead-lock to disable the tasklet
- // synchronously.
- if (!in_softirq())
+ // Use the workqueue to prevent an AB/BA deadlock on the
+ // substream lock: fw_iso_context_flush_completions() is
+ // called with the lock held and waits for the tasklet
+ // (ohci_flush_iso_completions()), while the tasklet's
+ // process_rx_packets() attempts to take the same lock via
+ // snd_pcm_period_elapsed().
+ if (current_work() != &s->period_work)
fw_iso_context_flush_completions(irq_target->context);
}
--
2.45.2
This piece was missing in commit ae678317b95e ("netfs: Remove
deprecated use of PG_private_2 as a second writeback flag").
There is one remaining use of PG_private_2: the function
__fscache_clear_page_bits(), whose only purpose is to clear
PG_private_2. This is done via folio_end_private_2(), which also
releases the folio reference that was supposed to have been taken by
folio_start_private_2() (via ceph_set_page_fscache()).
__fscache_clear_page_bits() is called by __fscache_write_to_cache(),
but only if the parameter using_pgpriv2 is true; the only caller of
that function is ceph_fscache_write_to_cache() which still passes
true.
By calling folio_end_private_2() without a preceding
folio_start_private_2(), the folio reference count becomes unbalanced,
causing trouble such as RCU stalls and general protection faults.
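
Sketched, the pairing that must hold (behavior as described above;
illustration only):

    folio_start_private_2(folio);   /* folio_get() + set PG_private_2   */
    /* ... write-to-cache in flight ... */
    folio_end_private_2(folio);     /* clear PG_private_2 + folio_put() */

    /* Ending without starting puts a reference the caller never took. */
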
Cc: stable(a)vger.kernel.org
Fixes: ae678317b95e ("netfs: Remove deprecated use of PG_private_2 as a second writeback flag")
Link: https://lore.kernel.org/ceph-devel/CAKPOu+_DA8XiMAA2ApMj7Pyshve_YWknw8Hdt1=…
Signed-off-by: Max Kellermann <max.kellermann(a)ionos.com>
---
fs/ceph/addr.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/fs/ceph/addr.c b/fs/ceph/addr.c
index 8c16bc5250ef..aacea3e8fd6d 100644
--- a/fs/ceph/addr.c
+++ b/fs/ceph/addr.c
@@ -512,7 +512,7 @@ static void ceph_fscache_write_to_cache(struct inode *inode, u64 off, u64 len, b
struct fscache_cookie *cookie = ceph_fscache_cookie(ci);
fscache_write_to_cache(cookie, inode->i_mapping, off, len, i_size_read(inode),
- ceph_fscache_write_terminated, inode, true, caching);
+ ceph_fscache_write_terminated, inode, false, caching);
}
#else
static inline void ceph_fscache_write_to_cache(struct inode *inode, u64 off, u64 len, bool caching)
--
2.43.0
The patch below does not apply to the 5.15-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.15.y
git checkout FETCH_HEAD
git cherry-pick -x 943ad0b62e3c21f324c4884caa6cb4a871bca05c
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024072940-parish-shirt-3e49@gregkh' --subject-prefix 'PATCH 5.15.y' HEAD^..
Possible dependencies:
943ad0b62e3c ("kernel: rerun task_work while freezing in get_signal()")
f5d39b020809 ("freezer,sched: Rewrite core freezer logic")
9963e444f71e ("sched: Widen TAKS_state literals")
f9fc8cad9728 ("sched: Add TASK_ANY for wait_task_inactive()")
9204a97f7ae8 ("sched: Change wait_task_inactive()s match_state")
1fbcaa923ce2 ("freezer,umh: Clean up freezer/initrd interaction")
5950e5d574c6 ("freezer: Have {,un}lock_system_sleep() save/restore flags")
0b9d46fc5ef7 ("sched: Rename task_running() to task_on_cpu()")
8386c414e27c ("PM: hibernate: defer device probing when resuming from hibernation")
57b6de08b5f6 ("ptrace: Admit ptrace_stop can generate spuriuos SIGTRAPs")
7b0fe1367ef2 ("ptrace: Document that wait_task_inactive can't fail")
1930a6e739c4 ("Merge tag 'ptrace-cleanups-for-v5.18' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 943ad0b62e3c21f324c4884caa6cb4a871bca05c Mon Sep 17 00:00:00 2001
From: Pavel Begunkov <asml.silence(a)gmail.com>
Date: Wed, 10 Jul 2024 18:58:18 +0100
Subject: [PATCH] kernel: rerun task_work while freezing in get_signal()
io_uring can asynchronously add a task_work while the task is being
frozen. TIF_NOTIFY_SIGNAL will prevent the task from sleeping in
do_freezer_trap(), and since get_signal()'s relock loop doesn't
retry task_work, the task will spin there, unable to sleep, until
the freezing is cancelled / the task is killed / etc.
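
For context, the queueing side looks roughly like this (a minimal sketch
using the generic task_work API, not io_uring's actual call site;
my_tw_cb() and queue_example() are made-up names):

    #include <linux/task_work.h>

    /* Runs from the target task's signal / exit-to-user path. */
    static void my_tw_cb(struct callback_head *head)
    {
    }

    static struct callback_head my_tw;

    static void queue_example(struct task_struct *task)
    {
            init_task_work(&my_tw, my_tw_cb);
            /* TWA_SIGNAL sets TIF_NOTIFY_SIGNAL on @task. */
            task_work_add(task, &my_tw, TWA_SIGNAL);
    }
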
Run task_works in the freezer path. Keep the patch small and simple
so it can be easily backported, but we might need to do some cleanup
afterwards and look at whether there are other places with similar
problems.
Cc: stable(a)vger.kernel.org
Link: https://github.com/systemd/systemd/issues/33626
Fixes: 12db8b690010c ("entry: Add support for TIF_NOTIFY_SIGNAL")
Reported-by: Julian Orth <ju.orth(a)gmail.com>
Acked-by: Oleg Nesterov <oleg(a)redhat.com>
Acked-by: Tejun Heo <tj(a)kernel.org>
Signed-off-by: Pavel Begunkov <asml.silence(a)gmail.com>
Link: https://lore.kernel.org/r/89ed3a52933370deaaf61a0a620a6ac91f1e754d.17206341…
Signed-off-by: Jens Axboe <axboe(a)kernel.dk>
diff --git a/kernel/signal.c b/kernel/signal.c
index 1f9dd41c04be..60c737e423a1 100644
--- a/kernel/signal.c
+++ b/kernel/signal.c
@@ -2600,6 +2600,14 @@ static void do_freezer_trap(void)
spin_unlock_irq(&current->sighand->siglock);
cgroup_enter_frozen();
schedule();
+
+ /*
+ * We could've been woken by task_work, run it to clear
+ * TIF_NOTIFY_SIGNAL. The caller will retry if necessary.
+ */
+ clear_notify_signal();
+ if (unlikely(task_work_pending(current)))
+ task_work_run();
}
static int ptrace_signal(int signr, kernel_siginfo_t *info, enum pid_type type)