The quilt patch titled
Subject: mm: disable CONFIG_PER_VMA_LOCK until its fixed
has been removed from the -mm tree. Its filename was
mm-disable-config_per_vma_lock-until-its-fixed.patch
This patch was dropped because it is obsolete
------------------------------------------------------
From: Suren Baghdasaryan <surenb(a)google.com>
Subject: mm: disable CONFIG_PER_VMA_LOCK until its fixed
Date: Wed, 5 Jul 2023 18:14:00 -0700
A memory corruption was reported in [1], with the bisection pointing to the
patch [2] that enabled per-VMA locks for x86. Disable the per-VMA locks
config to prevent this issue until the fix is confirmed. This is expected
to be a temporary measure.
[1] https://bugzilla.kernel.org/show_bug.cgi?id=217624
[2] https://lore.kernel.org/all/20230227173632.3292573-30-surenb@google.com
Link: https://lkml.kernel.org/r/20230706011400.2949242-3-surenb@google.com
Reported-by: Jiri Slaby <jirislaby(a)kernel.org>
Closes: https://lore.kernel.org/all/dbdef34c-3a07-5951-e1ae-e9c6e3cdf51b@kernel.org/
Reported-by: Jacob Young <jacobly.alt(a)gmail.com>
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=217624
Fixes: 0bff0aaea03e ("x86/mm: try VMA lock-based page fault handling first")
Signed-off-by: Suren Baghdasaryan <surenb(a)google.com>
Cc: David Hildenbrand <david(a)redhat.com>
Cc: Holger Hoffstätte <holger(a)applied-asynchrony.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/Kconfig | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
--- a/mm/Kconfig~mm-disable-config_per_vma_lock-until-its-fixed
+++ a/mm/Kconfig
@@ -1224,8 +1224,9 @@ config ARCH_SUPPORTS_PER_VMA_LOCK
def_bool n
config PER_VMA_LOCK
- def_bool y
+ bool "Enable per-vma locking during page fault handling."
depends on ARCH_SUPPORTS_PER_VMA_LOCK && MMU && SMP
+ depends on BROKEN
help
Allow per-vma locking during page fault handling.
_
Patches currently in -mm which might be from surenb(a)google.com are
mm-lock-a-vma-before-stack-expansion.patch
mm-lock-newly-mapped-vma-which-can-be-modified-after-it-becomes-visible.patch
swap-remove-remnants-of-polling-from-read_swap_cache_async.patch
mm-add-missing-vm_fault_result_trace-name-for-vm_fault_completed.patch
mm-drop-per-vma-lock-when-returning-vm_fault_retry-or-vm_fault_completed.patch
mm-change-folio_lock_or_retry-to-use-vm_fault-directly.patch
mm-handle-swap-page-faults-under-per-vma-lock.patch
mm-handle-userfaults-under-vma-lock.patch
The quilt patch titled
Subject: fork: lock VMAs of the parent process when forking
has been removed from the -mm tree. Its filename was
fork-lock-vmas-of-the-parent-process-when-forking.patch
This patch was dropped because it is obsolete
------------------------------------------------------
From: Suren Baghdasaryan <surenb(a)google.com>
Subject: fork: lock VMAs of the parent process when forking
Date: Wed, 5 Jul 2023 18:13:59 -0700
Patch series "Avoid memory corruption caused by per-VMA locks", v4.
A memory corruption was reported in [1] with bisection pointing to the
patch [2] enabling per-VMA locks for x86. Based on the reproducer
provided in [1] we suspect this is caused by the lack of VMA locking while
forking a child process.
Patch 1/2 in the series implements proper VMA locking during fork. I
tested the fix locally using the reproducer and was unable to reproduce
the memory corruption problem.
This fix can potentially regress some fork-heavy workloads. Kernel build
time did not show a noticeable regression on a 56-core machine, while a
stress test mapping 10000 VMAs and forking 5000 times in a tight loop
showed a ~7% regression. If such a fork-time regression is unacceptable,
disabling CONFIG_PER_VMA_LOCK should restore the performance. Further
optimizations are possible if this regression proves to be problematic.
Patch 2/2 disables per-VMA locks until the fix is tested and verified.
This patch (of 2):
When forking a child process, the parent write-protects an anonymous page
and COW-shares it with the child being forked using copy_present_pte().
The parent's TLB is flushed right before we drop the parent's mmap_lock in
dup_mmap(). If we get a write fault before that TLB flush in the parent,
and we end up replacing that anonymous page in the parent process in
do_wp_page() (because it is COW-shared with the child), this might lead to
some stale writable TLB entries targeting the wrong (old) page. A similar
issue happened in the past with userfaultfd (see the flush_tlb_page() call
inside do_wp_page()).
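A rough sketch of the suspected interleaving (illustrative only; the
thread names and exact ordering are reconstructed from the description
above, not taken from the report):

  parent thread A (dup_mmap)        parent thread B
  --------------------------        ---------------
  copy_present_pte()
    write-protects anon page PTE
                                    write-faults on that page, served
                                    under the per-VMA lock (bypassing
                                    mmap_lock held by thread A)
                                    do_wp_page() replaces the
                                    COW-shared anon page
  flush TLB, drop mmap_lock
                                    <- until that flush, B's CPU may
                                       still hold a stale writable TLB
                                       entry targeting the old page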
Lock VMAs of the parent process when forking a child, which prevents
concurrent page faults during the fork operation and avoids this issue.
This fix can potentially regress some fork-heavy workloads. Kernel build
time did not show a noticeable regression on a 56-core machine, while a
stress test mapping 10000 VMAs and forking 5000 times in a tight loop
showed a ~7% regression. If such a fork-time regression is unacceptable,
disabling CONFIG_PER_VMA_LOCK should restore the performance. Further
optimizations are possible if this regression proves to be problematic.
Link: https://lkml.kernel.org/r/20230706011400.2949242-1-surenb@google.com
Link: https://lkml.kernel.org/r/20230706011400.2949242-2-surenb@google.com
Fixes: 0bff0aaea03e ("x86/mm: try VMA lock-based page fault handling first")
Signed-off-by: Suren Baghdasaryan <surenb(a)google.com>
Suggested-by: David Hildenbrand <david(a)redhat.com>
Reported-by: Jiri Slaby <jirislaby(a)kernel.org>
Closes: https://lore.kernel.org/all/dbdef34c-3a07-5951-e1ae-e9c6e3cdf51b@kernel.org/
Reported-by: Holger Hoffstätte <holger(a)applied-asynchrony.com>
Closes: https://lore.kernel.org/all/b198d649-f4bf-b971-31d0-e8433ec2a34c@applied-as…
Reported-by: Jacob Young <jacobly.alt(a)gmail.com>
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=217624
Reviewed-by: Liam R. Howlett <Liam.Howlett(a)oracle.com>
Acked-by: David Hildenbrand <david(a)redhat.com>
Tested-by: Holger Hoffstätte <holger(a)applied-asynchrony.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
kernel/fork.c | 6 ++++++
1 file changed, 6 insertions(+)
--- a/kernel/fork.c~fork-lock-vmas-of-the-parent-process-when-forking
+++ a/kernel/fork.c
@@ -658,6 +658,12 @@ static __latent_entropy int dup_mmap(str
retval = -EINTR;
goto fail_uprobe_end;
}
+#ifdef CONFIG_PER_VMA_LOCK
+ /* Disallow any page faults before calling flush_cache_dup_mm */
+ for_each_vma(old_vmi, mpnt)
+ vma_start_write(mpnt);
+ vma_iter_set(&old_vmi, 0);
+#endif
flush_cache_dup_mm(oldmm);
uprobe_dup_mmap(oldmm, mm);
/*
_
Patches currently in -mm which might be from surenb(a)google.com are
mm-disable-config_per_vma_lock-until-its-fixed.patch
mm-lock-a-vma-before-stack-expansion.patch
mm-lock-newly-mapped-vma-which-can-be-modified-after-it-becomes-visible.patch
swap-remove-remnants-of-polling-from-read_swap_cache_async.patch
mm-add-missing-vm_fault_result_trace-name-for-vm_fault_completed.patch
mm-drop-per-vma-lock-when-returning-vm_fault_retry-or-vm_fault_completed.patch
mm-change-folio_lock_or_retry-to-use-vm_fault-directly.patch
mm-handle-swap-page-faults-under-per-vma-lock.patch
mm-handle-userfaults-under-vma-lock.patch
Somehow PR_GET_AUXV got added into PR_MCE_KILL's switch when
the patch was applied [1].
Thus, move it out of that switch, to the place the patch originally
added it.
In the recently released v6.4 kernel some user could, in
principle, already be using this feature by mapping the right
page and passing the PR_GET_AUXV constant as a pointer:
prctl(PR_MCE_KILL, PR_GET_AUXV, ...)
So this does change behavior for such users. We could keep the bug,
since the other subcases in PR_MCE_KILL (PR_MCE_KILL_CLEAR and
PR_MCE_KILL_SET) do not overlap with it.
However, v6.4 may be recent enough (2 weeks old) that moving
the lines (rather than just adding a new case) does not break
anybody? Moreover, the documentation in man-pages was just
committed today [2].
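For completeness, a minimal user-space sketch of the intended usage after
this change (the fallback define mirrors include/uapi/linux/prctl.h; the
buffer size is arbitrary, and, going by prctl_get_auxv() in kernel/sys.c,
a successful call returns the full size of the kernel's auxv copy):

  #include <stdio.h>
  #include <sys/prctl.h>

  #ifndef PR_GET_AUXV
  #define PR_GET_AUXV 0x41555856	/* "AUXV" */
  #endif

  int main(void)
  {
  	unsigned long auxv[64];
  	/* arg4 and arg5 must be zero, or the call fails with EINVAL */
  	long ret = prctl(PR_GET_AUXV, auxv, sizeof(auxv), 0, 0);

  	if (ret < 0) {
  		perror("prctl(PR_GET_AUXV)");
  		return 1;
  	}
  	printf("kernel auxv copy is %ld bytes\n", ret);
  	return 0;
  }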
Fixes: ddc65971bb67 ("prctl: add PR_GET_AUXV to copy auxv to userspace")
Cc: stable(a)vger.kernel.org
Link: https://lore.kernel.org/all/d81864a7f7f43bca6afa2a09fc2e850e4050ab42.168061… [1]
Link: https://git.kernel.org/pub/scm/docs/man-pages/man-pages.git/commit/?id=8cf0… [2]
Signed-off-by: Miguel Ojeda <ojeda(a)kernel.org>
---
kernel/sys.c | 10 +++++-----
1 file changed, 5 insertions(+), 5 deletions(-)
diff --git a/kernel/sys.c b/kernel/sys.c
index 339fee3eff6a..a36a27ebac33 100644
--- a/kernel/sys.c
+++ b/kernel/sys.c
@@ -2529,11 +2529,6 @@ SYSCALL_DEFINE5(prctl, int, option, unsigned long, arg2, unsigned long, arg3,
else
return -EINVAL;
break;
- case PR_GET_AUXV:
- if (arg4 || arg5)
- return -EINVAL;
- error = prctl_get_auxv((void __user *)arg2, arg3);
- break;
default:
return -EINVAL;
}
@@ -2688,6 +2683,11 @@ SYSCALL_DEFINE5(prctl, int, option, unsigned long, arg2, unsigned long, arg3,
case PR_SET_VMA:
error = prctl_set_vma(arg2, arg3, arg4, arg5);
break;
+ case PR_GET_AUXV:
+ if (arg4 || arg5)
+ return -EINVAL;
+ error = prctl_get_auxv((void __user *)arg2, arg3);
+ break;
#ifdef CONFIG_KSM
case PR_SET_MEMORY_MERGE:
if (arg3 || arg4 || arg5)
base-commit: 6995e2de6891c724bfeb2db33d7b87775f913ad1
--
2.41.0
Lockdep is certainly right to complain about
(&vma->vm_lock->lock){++++}-{3:3}, at: vma_start_write+0x2d/0x3f
but task is already holding lock:
(&mapping->i_mmap_rwsem){+.+.}-{3:3}, at: mmap_region+0x4dc/0x6db
Invert those to the usual ordering.
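That is, mmap_region() should write-lock the VMA before taking
i_mmap_rwsem; a minimal sketch of the resulting order (paraphrasing the
diff below, not a verbatim excerpt):

  vma_start_write(vma);                 /* vma->vm_lock first */
  if (vma->vm_file)
          i_mmap_lock_write(vma->vm_file->f_mapping);
                                        /* i_mmap_rwsem second */
  vma_iter_store(&vmi, vma);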
Fixes: 33313a747e81 ("mm: lock newly mapped VMA which can be modified after it becomes visible")
Cc: stable(a)vger.kernel.org
Signed-off-by: Hugh Dickins <hughd(a)google.com>
---
mm/mmap.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/mm/mmap.c b/mm/mmap.c
index 84c71431a527..3eda23c9ebe7 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -2809,11 +2809,11 @@ unsigned long mmap_region(struct file *file, unsigned long addr,
if (vma_iter_prealloc(&vmi))
goto close_and_free_vma;
+ /* Lock the VMA since it is modified after insertion into VMA tree */
+ vma_start_write(vma);
if (vma->vm_file)
i_mmap_lock_write(vma->vm_file->f_mapping);
- /* Lock the VMA since it is modified after insertion into VMA tree */
- vma_start_write(vma);
vma_iter_store(&vmi, vma);
mm->map_count++;
if (vma->vm_file) {
--
2.35.3
Dear developer, I am a security researcher at Wuhan University. I recently discovered a vulnerability in the USB core module of the Linux kernel. The vulnerability leads to an infinite loop in the probe process of a USB device, which consumes a lot of system resources. It was found in kernel version 5.6.19 and, by testing, also exists in the newer 6.3.7 kernel. I hope that after your review you will be able to apply for a CVE number to disclose this vulnerability. If you need more detailed vulnerability information, please contact me. Thank you for your help.
With recent changes necessitating mmap_lock to be held for write while
expanding a stack, per-VMA locks should follow the same rules and be
write-locked to prevent page faults into the VMA being expanded. Add
the necessary locking.
Signed-off-by: Suren Baghdasaryan <surenb(a)google.com>
---
mm/mmap.c | 4 ++++
1 file changed, 4 insertions(+)
diff --git a/mm/mmap.c b/mm/mmap.c
index 204ddcd52625..c66e4622a557 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -1977,6 +1977,8 @@ static int expand_upwards(struct vm_area_struct *vma, unsigned long address)
return -ENOMEM;
}
+ /* Lock the VMA before expanding to prevent concurrent page faults */
+ vma_start_write(vma);
/*
* vma->vm_start/vm_end cannot change under us because the caller
* is required to hold the mmap_lock in read mode. We need the
@@ -2064,6 +2066,8 @@ int expand_downwards(struct vm_area_struct *vma, unsigned long address)
return -ENOMEM;
}
+ /* Lock the VMA before expanding to prevent concurrent page faults */
+ vma_start_write(vma);
/*
* vma->vm_start/vm_end cannot change under us because the caller
* is required to hold the mmap_lock in read mode. We need the
--
2.41.0.255.g8b1d071c50-goog
The patch titled
Subject: mm: hugetlb_vmemmap: fix a race between vmemmap pmd split
has been added to the -mm mm-unstable branch. Its filename is
mm-hugetlb_vmemmap-fix-a-race-between-vmemmap-pmd-split.patch
This patch will shortly appear at
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patche…
This patch will later appear in the mm-unstable branch at
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next via the mm-everything
branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there every 2-3 working days
------------------------------------------------------
From: Muchun Song <songmuchun(a)bytedance.com>
Subject: mm: hugetlb_vmemmap: fix a race between vmemmap pmd split
Date: Fri, 7 Jul 2023 11:38:59 +0800
The local variable @page in __split_vmemmap_huge_pmd() is obtained from
the pmd without holding page_table_lock, so it may possibly be the page
table page instead of the huge pmd page.
The effect shows up in set_pte_at(), since we may pass an invalid page
struct; if set_pte_at() wants to access the page struct (e.g. when
CONFIG_PAGE_TABLE_CHECK is enabled), it may crash the kernel.
So fix it. Also inline __split_vmemmap_huge_pmd() since it only has one
user.
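A sketch of the race in the old code (an illustrative interleaving, not
taken from an actual crash report):

  CPU0                                  CPU1
  ----                                  ----
  split_vmemmap_huge_pmd()
    lock; pmd_leaf(*pmd) == true; unlock
                                        splits the same pmd and
                                        installs a pte page table
  __split_vmemmap_huge_pmd()
    page = pmd_page(*pmd)
    /* now the pte table page,
       not the huge head page */
    mk_pte(page + i, ...) on bogus pages

With this patch the pmd_leaf() check and the pmd_page() read happen under
the same page_table_lock acquisition, so @head is either the huge pmd page
or NULL.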
Link: https://lkml.kernel.org/r/20230707033859.16148-1-songmuchun@bytedance.com
Fixes: d8d55f5616cf ("mm: sparsemem: use page table lock to protect kernel pmd operations")
Signed-off-by: Muchun Song <songmuchun(a)bytedance.com>
Cc: Mike Kravetz <mike.kravetz(a)oracle.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/hugetlb_vmemmap.c | 34 ++++++++++++++--------------------
1 file changed, 14 insertions(+), 20 deletions(-)
--- a/mm/hugetlb_vmemmap.c~mm-hugetlb_vmemmap-fix-a-race-between-vmemmap-pmd-split
+++ a/mm/hugetlb_vmemmap.c
@@ -36,14 +36,22 @@ struct vmemmap_remap_walk {
struct list_head *vmemmap_pages;
};
-static int __split_vmemmap_huge_pmd(pmd_t *pmd, unsigned long start)
+static int split_vmemmap_huge_pmd(pmd_t *pmd, unsigned long start)
{
pmd_t __pmd;
int i;
unsigned long addr = start;
- struct page *page = pmd_page(*pmd);
- pte_t *pgtable = pte_alloc_one_kernel(&init_mm);
+ struct page *head;
+ pte_t *pgtable;
+
+ spin_lock(&init_mm.page_table_lock);
+ head = pmd_leaf(*pmd) ? pmd_page(*pmd) : NULL;
+ spin_unlock(&init_mm.page_table_lock);
+ if (!head)
+ return 0;
+
+ pgtable = pte_alloc_one_kernel(&init_mm);
if (!pgtable)
return -ENOMEM;
@@ -53,7 +61,7 @@ static int __split_vmemmap_huge_pmd(pmd_
pte_t entry, *pte;
pgprot_t pgprot = PAGE_KERNEL;
- entry = mk_pte(page + i, pgprot);
+ entry = mk_pte(head + i, pgprot);
pte = pte_offset_kernel(&__pmd, addr);
set_pte_at(&init_mm, addr, pte, entry);
}
@@ -65,8 +73,8 @@ static int __split_vmemmap_huge_pmd(pmd_
* be treated as indepdenent small pages (as they can be freed
* individually).
*/
- if (!PageReserved(page))
- split_page(page, get_order(PMD_SIZE));
+ if (!PageReserved(head))
+ split_page(head, get_order(PMD_SIZE));
/* Make pte visible before pmd. See comment in pmd_install(). */
smp_wmb();
@@ -80,20 +88,6 @@ static int __split_vmemmap_huge_pmd(pmd_
return 0;
}
-static int split_vmemmap_huge_pmd(pmd_t *pmd, unsigned long start)
-{
- int leaf;
-
- spin_lock(&init_mm.page_table_lock);
- leaf = pmd_leaf(*pmd);
- spin_unlock(&init_mm.page_table_lock);
-
- if (!leaf)
- return 0;
-
- return __split_vmemmap_huge_pmd(pmd, start);
-}
-
static void vmemmap_pte_range(pmd_t *pmd, unsigned long addr,
unsigned long end,
struct vmemmap_remap_walk *walk)
_
Patches currently in -mm which might be from songmuchun(a)bytedance.com are
mm-hugetlb_vmemmap-fix-a-race-between-vmemmap-pmd-split.patch
Hi,
Can you please help in backporting the below patch to the linux-5.15.y
stable tree:
commit d8c47cc7bf602ef73384a00869a70148146c1191 ("mm: page_io: fix psi
memory pressure error on cold swapins")
In the absence of this patch we are seeing some user-space tools, like
the Android low-memory killer based on PSI events, become a bit
aggressive, since PSI is accounted even for cold swapins on a device
where swap is mounted on a zram with a slower backing dev.
Thanks,
Charan
From: Rafał Miłecki <rafal(a)milecki.pl>
commit f99e6d7c4ed3be2531bd576425a5bd07fb133bd7 upstream.
While bringing hardware up we should perform a full reset including the
switch bit (BGMAC_BCMA_IOCTL_SW_RESET aka SICF_SWRST). It's what
specification says and what reference driver does.
This seems to be critical for the BCM5358. Without this, the hardware
doesn't get initialized properly and doesn't seem to transmit or receive
any packets.
Originally bgmac called bgmac_chip_reset() before setting the
"has_robosw" property, which resulted in the expected behaviour. That
changed as a side effect of adding platform device support, which
regressed BCM5358 support.
Fixes: f6a95a24957a ("net: ethernet: bgmac: Add platform device support")
Cc: Jon Mason <jdmason(a)kudzu.us>
Signed-off-by: Rafał Miłecki <rafal(a)milecki.pl>
Reviewed-by: Leon Romanovsky <leonro(a)nvidia.com>
Reviewed-by: Florian Fainelli <f.fainelli(a)gmail.com>
Link: https://lore.kernel.org/r/20230227091156.19509-1-zajec5@gmail.com
Signed-off-by: Paolo Abeni <pabeni(a)redhat.com>
---
The upstream commit wasn't backported to 5.4 (and older) because it
couldn't be cherry-picked cleanly: there was a small fuzz caused by the
missing commit 8c7da63978f1 ("bgmac: configure MTU and add support for
frames beyond 8192 byte size").
I've manually cherry-picked the BCM5358 fix to linux-5.4.x.
---
drivers/net/ethernet/broadcom/bgmac.c | 8 ++++++--
drivers/net/ethernet/broadcom/bgmac.h | 2 ++
2 files changed, 8 insertions(+), 2 deletions(-)
diff --git a/drivers/net/ethernet/broadcom/bgmac.c b/drivers/net/ethernet/broadcom/bgmac.c
index 193722334d93..89a63fdbe0e3 100644
--- a/drivers/net/ethernet/broadcom/bgmac.c
+++ b/drivers/net/ethernet/broadcom/bgmac.c
@@ -890,13 +890,13 @@ static void bgmac_chip_reset_idm_config(struct bgmac *bgmac)
if (iost & BGMAC_BCMA_IOST_ATTACHED) {
flags = BGMAC_BCMA_IOCTL_SW_CLKEN;
- if (!bgmac->has_robosw)
+ if (bgmac->in_init || !bgmac->has_robosw)
flags |= BGMAC_BCMA_IOCTL_SW_RESET;
}
bgmac_clk_enable(bgmac, flags);
}
- if (iost & BGMAC_BCMA_IOST_ATTACHED && !bgmac->has_robosw)
+ if (iost & BGMAC_BCMA_IOST_ATTACHED && (bgmac->in_init || !bgmac->has_robosw))
bgmac_idm_write(bgmac, BCMA_IOCTL,
bgmac_idm_read(bgmac, BCMA_IOCTL) &
~BGMAC_BCMA_IOCTL_SW_RESET);
@@ -1489,6 +1489,8 @@ int bgmac_enet_probe(struct bgmac *bgmac)
struct net_device *net_dev = bgmac->net_dev;
int err;
+ bgmac->in_init = true;
+
bgmac_chip_intrs_off(bgmac);
net_dev->irq = bgmac->irq;
@@ -1538,6 +1540,8 @@ int bgmac_enet_probe(struct bgmac *bgmac)
net_dev->hw_features = net_dev->features;
net_dev->vlan_features = net_dev->features;
+ bgmac->in_init = false;
+
err = register_netdev(bgmac->net_dev);
if (err) {
dev_err(bgmac->dev, "Cannot register net device\n");
diff --git a/drivers/net/ethernet/broadcom/bgmac.h b/drivers/net/ethernet/broadcom/bgmac.h
index 40d02fec2747..76930b8353d6 100644
--- a/drivers/net/ethernet/broadcom/bgmac.h
+++ b/drivers/net/ethernet/broadcom/bgmac.h
@@ -511,6 +511,8 @@ struct bgmac {
int irq;
u32 int_mask;
+ bool in_init;
+
/* Current MAC state */
int mac_speed;
int mac_duplex;
--
2.35.3