July 2024 - Linux-stable-mirror

[PATCH net v2 3/4] ipv6: take care of scope when choosing the src addr

by Nicolas Dichtel

When the source address is selected, the scope must be checked. For example, if a loopback address is assigned to the vrf device, it must not be chosen for packets sent outside. CC: stable(a)vger.kernel.org Fixes: afbac6010aec ("net: ipv6: Address selection needs to consider L3 domains") Signed-off-by: Nicolas Dichtel <nicolas.dichtel(a)6wind.com> --- net/ipv6/addrconf.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/net/ipv6/addrconf.c b/net/ipv6/addrconf.c index 5c424a0e7232..4f2c5cc31015 100644 --- a/net/ipv6/addrconf.c +++ b/net/ipv6/addrconf.c @@ -1873,7 +1873,8 @@ int ipv6_dev_get_saddr(struct net *net, const struct net_device *dst_dev, master, &dst, scores, hiscore_idx); - if (scores[hiscore_idx].ifa) + if (scores[hiscore_idx].ifa && + scores[hiscore_idx].scopedist >= 0) goto out; } -- 2.43.1

1 year, 5 months

2
1
0 0

[PATCH v3] mm: Fix khugepaged activation policy

by Ryan Roberts

Since the introduction of mTHP, the docuementation has stated that khugepaged would be enabled when any mTHP size is enabled, and disabled when all mTHP sizes are disabled. There are 2 problems with this; 1. this is not what was implemented by the code and 2. this is not the desirable behavior. Desirable behavior is for khugepaged to be enabled when any PMD-sized THP is enabled, anon or file. (Note that file THP is still controlled by the top-level control so we must always consider that, as well as the PMD-size mTHP control for anon). khugepaged only supports collapsing to PMD-sized THP so there is no value in enabling it when PMD-sized THP is disabled. So let's change the code and documentation to reflect this policy. Further, per-size enabled control modification events were not previously forwarded to khugepaged to give it an opportunity to start or stop. Consequently the following was resulting in khugepaged eroneously not being activated: echo never > /sys/kernel/mm/transparent_hugepage/enabled echo always > /sys/kernel/mm/transparent_hugepage/hugepages-2048kB/enabled Signed-off-by: Ryan Roberts <ryan.roberts(a)arm.com> Fixes: 3485b88390b0 ("mm: thp: introduce multi-size THP sysfs interface") Closes: https://lore.kernel.org/linux-mm/7a0bbe69-1e3d-4263-b206-da007791a5c4@redha… Cc: stable(a)vger.kernel.org --- Hi All, Applies on top of mm-unstable from a couple of days ago (9bb8753acdd8). No regressions observed in mm selftests. When fixing this I also noticed that khugepaged doesn't get (and never has been) activated/deactivated by `shmem_enabled=`. I've concluded that this is definitely a (separate) bug. But I'm waiting for the conclusion of the conversation at [3] before fixing, so will send separately. Changes since v1 [1] ==================== - hugepage_pmd_enabled() now considers CONFIG_READ_ONLY_THP_FOR_FS as part of decision; means that for kernels without this config, khugepaged will not be started when only the top-level control is enabled. Changes since v2 [2] ==================== - Make hugepage_pmd_enabled() out-of-line static in khugepaged.c (per Andrew) - Refactor hugepage_pmd_enabled() for better readability (per Andrew) [1] https://lore.kernel.org/linux-mm/20240702144617.2291480-1-ryan.roberts@arm.… [2] https://lore.kernel.org/linux-mm/20240704091051.2411934-1-ryan.roberts@arm.… [3] https://lore.kernel.org/linux-mm/65c37315-2741-481f-b433-cec35ef1af35@arm.c… Thanks, Ryan Documentation/admin-guide/mm/transhuge.rst | 11 ++++---- include/linux/huge_mm.h | 12 -------- mm/huge_memory.c | 7 +++++ mm/khugepaged.c | 33 +++++++++++++++++----- 4 files changed, 38 insertions(+), 25 deletions(-) diff --git a/Documentation/admin-guide/mm/transhuge.rst b/Documentation/admin-guide/mm/transhuge.rst index 709fe10b60f4..fc321d40b8ac 100644 --- a/Documentation/admin-guide/mm/transhuge.rst +++ b/Documentation/admin-guide/mm/transhuge.rst @@ -202,12 +202,11 @@ PMD-mappable transparent hugepage:: cat /sys/kernel/mm/transparent_hugepage/hpage_pmd_size -khugepaged will be automatically started when one or more hugepage -sizes are enabled (either by directly setting "always" or "madvise", -or by setting "inherit" while the top-level enabled is set to "always" -or "madvise"), and it'll be automatically shutdown when the last -hugepage size is disabled (either by directly setting "never", or by -setting "inherit" while the top-level enabled is set to "never"). +khugepaged will be automatically started when PMD-sized THP is enabled +(either of the per-size anon control or the top-level control are set +to "always" or "madvise"), and it'll be automatically shutdown when +PMD-sized THP is disabled (when both the per-size anon control and the +top-level control are "never") Khugepaged controls ------------------- diff --git a/include/linux/huge_mm.h b/include/linux/huge_mm.h index 4d155c7a4792..107da5c4eba4 100644 --- a/include/linux/huge_mm.h +++ b/include/linux/huge_mm.h @@ -128,18 +128,6 @@ static inline bool hugepage_global_always(void) (1<<TRANSPARENT_HUGEPAGE_FLAG); } -static inline bool hugepage_flags_enabled(void) -{ - /* - * We cover both the anon and the file-backed case here; we must return - * true if globally enabled, even when all anon sizes are set to never. - * So we don't need to look at huge_anon_orders_inherit. - */ - return hugepage_global_enabled() || - READ_ONCE(huge_anon_orders_always) || - READ_ONCE(huge_anon_orders_madvise); -} - static inline int highest_order(unsigned long orders) { return fls_long(orders) - 1; diff --git a/mm/huge_memory.c b/mm/huge_memory.c index 251d6932130f..085f5e973231 100644 --- a/mm/huge_memory.c +++ b/mm/huge_memory.c @@ -502,6 +502,13 @@ static ssize_t thpsize_enabled_store(struct kobject *kobj, } else ret = -EINVAL; + if (ret > 0) { + int err; + + err = start_stop_khugepaged(); + if (err) + ret = err; + } return ret; } diff --git a/mm/khugepaged.c b/mm/khugepaged.c index 409f67a817f1..a5ec03ef8722 100644 --- a/mm/khugepaged.c +++ b/mm/khugepaged.c @@ -413,6 +413,26 @@ static inline int hpage_collapse_test_exit_or_disable(struct mm_struct *mm) test_bit(MMF_DISABLE_THP, &mm->flags); } +static bool hugepage_pmd_enabled(void) +{ + /* + * We cover both the anon and the file-backed case here; file-backed + * hugepages, when configured in, are determined by the global control. + * Anon pmd-sized hugepages are determined by the pmd-size control. + */ + if (IS_ENABLED(CONFIG_READ_ONLY_THP_FOR_FS) && + hugepage_global_enabled()) + return true; + if (test_bit(PMD_ORDER, &huge_anon_orders_always)) + return true; + if (test_bit(PMD_ORDER, &huge_anon_orders_madvise)) + return true; + if (test_bit(PMD_ORDER, &huge_anon_orders_inherit) && + hugepage_global_enabled()) + return true; + return false; +} + void __khugepaged_enter(struct mm_struct *mm) { struct khugepaged_mm_slot *mm_slot; @@ -449,7 +469,7 @@ void khugepaged_enter_vma(struct vm_area_struct *vma, unsigned long vm_flags) { if (!test_bit(MMF_VM_HUGEPAGE, &vma->vm_mm->flags) && - hugepage_flags_enabled()) { + hugepage_pmd_enabled()) { if (thp_vma_allowable_order(vma, vm_flags, TVA_ENFORCE_SYSFS, PMD_ORDER)) __khugepaged_enter(vma->vm_mm); @@ -2462,8 +2482,7 @@ static unsigned int khugepaged_scan_mm_slot(unsigned int pages, int *result, static int khugepaged_has_work(void) { - return !list_empty(&khugepaged_scan.mm_head) && - hugepage_flags_enabled(); + return !list_empty(&khugepaged_scan.mm_head) && hugepage_pmd_enabled(); } static int khugepaged_wait_event(void) @@ -2536,7 +2555,7 @@ static void khugepaged_wait_work(void) return; } - if (hugepage_flags_enabled()) + if (hugepage_pmd_enabled()) wait_event_freezable(khugepaged_wait, khugepaged_wait_event()); } @@ -2567,7 +2586,7 @@ static void set_recommended_min_free_kbytes(void) int nr_zones = 0; unsigned long recommended_min; - if (!hugepage_flags_enabled()) { + if (!hugepage_pmd_enabled()) { calculate_min_free_kbytes(); goto update_wmarks; } @@ -2617,7 +2636,7 @@ int start_stop_khugepaged(void) int err = 0; mutex_lock(&khugepaged_mutex); - if (hugepage_flags_enabled()) { + if (hugepage_pmd_enabled()) { if (!khugepaged_thread) khugepaged_thread = kthread_run(khugepaged, NULL, "khugepaged"); @@ -2643,7 +2662,7 @@ int start_stop_khugepaged(void) void khugepaged_min_free_kbytes_update(void) { mutex_lock(&khugepaged_mutex); - if (hugepage_flags_enabled() && khugepaged_thread) + if (hugepage_pmd_enabled() && khugepaged_thread) set_recommended_min_free_kbytes(); mutex_unlock(&khugepaged_mutex); } -- 2.43.0

1 year, 5 months

2
1
0 0

[merged mm-hotfixes-stable] mm-fix-crashes-from-deferred-split-racing-folio-migration.patch removed from -mm tree

by Andrew Morton

The quilt patch titled Subject: mm: fix crashes from deferred split racing folio migration has been removed from the -mm tree. Its filename was mm-fix-crashes-from-deferred-split-racing-folio-migration.patch This patch was dropped because it was merged into the mm-hotfixes-stable branch of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm ------------------------------------------------------ From: Hugh Dickins <hughd(a)google.com> Subject: mm: fix crashes from deferred split racing folio migration Date: Tue, 2 Jul 2024 00:40:55 -0700 (PDT) Even on 6.10-rc6, I've been seeing elusive "Bad page state"s (often on flags when freeing, yet the flags shown are not bad: PG_locked had been set and cleared??), and VM_BUG_ON_PAGE(page_ref_count(page) == 0)s from deferred_split_scan()'s folio_put(), and a variety of other BUG and WARN symptoms implying double free by deferred split and large folio migration. 6.7 commit 9bcef5973e31 ("mm: memcg: fix split queue list crash when large folio migration") was right to fix the memcg-dependent locking broken in 85ce2c517ade ("memcontrol: only transfer the memcg data for migration"), but missed a subtlety of deferred_split_scan(): it moves folios to its own local list to work on them without split_queue_lock, during which time folio->_deferred_list is not empty, but even the "right" lock does nothing to secure the folio and the list it is on. Fortunately, deferred_split_scan() is careful to use folio_try_get(): so folio_migrate_mapping() can avoid the race by folio_undo_large_rmappable() while the old folio's reference count is temporarily frozen to 0 - adding such a freeze in the !mapping case too (originally, folio lock and unmapping and no swap cache left an anon folio unreachable, so no freezing was needed there: but the deferred split queue offers a way to reach it). Link: https://lkml.kernel.org/r/29c83d1a-11ca-b6c9-f92e-6ccb322af510@google.com Fixes: 9bcef5973e31 ("mm: memcg: fix split queue list crash when large folio migration") Signed-off-by: Hugh Dickins <hughd(a)google.com> Reviewed-by: Baolin Wang <baolin.wang(a)linux.alibaba.com> Cc: Barry Song <baohua(a)kernel.org> Cc: David Hildenbrand <david(a)redhat.com> Cc: Hugh Dickins <hughd(a)google.com> Cc: Kefeng Wang <wangkefeng.wang(a)huawei.com> Cc: Matthew Wilcox (Oracle) <willy(a)infradead.org> Cc: Nhat Pham <nphamcs(a)gmail.com> Cc: Yang Shi <shy828301(a)gmail.com> Cc: Zi Yan <ziy(a)nvidia.com> Cc: <stable(a)vger.kernel.org> Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org> --- mm/memcontrol.c | 11 ----------- mm/migrate.c | 13 +++++++++++++ 2 files changed, 13 insertions(+), 11 deletions(-) --- a/mm/memcontrol.c~mm-fix-crashes-from-deferred-split-racing-folio-migration +++ a/mm/memcontrol.c @@ -7823,17 +7823,6 @@ void mem_cgroup_migrate(struct folio *ol /* Transfer the charge and the css ref */ commit_charge(new, memcg); - /* - * If the old folio is a large folio and is in the split queue, it needs - * to be removed from the split queue now, in case getting an incorrect - * split queue in destroy_large_folio() after the memcg of the old folio - * is cleared. - * - * In addition, the old folio is about to be freed after migration, so - * removing from the split queue a bit earlier seems reasonable. - */ - if (folio_test_large(old) && folio_test_large_rmappable(old)) - folio_undo_large_rmappable(old); old->memcg_data = 0; } --- a/mm/migrate.c~mm-fix-crashes-from-deferred-split-racing-folio-migration +++ a/mm/migrate.c @@ -415,6 +415,15 @@ int folio_migrate_mapping(struct address if (folio_ref_count(folio) != expected_count) return -EAGAIN; + /* Take off deferred split queue while frozen and memcg set */ + if (folio_test_large(folio) && + folio_test_large_rmappable(folio)) { + if (!folio_ref_freeze(folio, expected_count)) + return -EAGAIN; + folio_undo_large_rmappable(folio); + folio_ref_unfreeze(folio, expected_count); + } + /* No turning back from here */ newfolio->index = folio->index; newfolio->mapping = folio->mapping; @@ -433,6 +442,10 @@ int folio_migrate_mapping(struct address return -EAGAIN; } + /* Take off deferred split queue while frozen and memcg set */ + if (folio_test_large(folio) && folio_test_large_rmappable(folio)) + folio_undo_large_rmappable(folio); + /* * Now we know that no one else is looking at the folio: * no turning back from here. _ Patches currently in -mm which might be from hughd(a)google.com are

1 year, 5 months

1
0
0 0

[merged mm-hotfixes-stable] mm-gup-stop-abusing-try_grab_folio.patch removed from -mm tree

by Andrew Morton

The quilt patch titled Subject: mm: gup: stop abusing try_grab_folio has been removed from the -mm tree. Its filename was mm-gup-stop-abusing-try_grab_folio.patch This patch was dropped because it was merged into the mm-hotfixes-stable branch of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm ------------------------------------------------------ From: Yang Shi <yang(a)os.amperecomputing.com> Subject: mm: gup: stop abusing try_grab_folio Date: Fri, 28 Jun 2024 12:14:58 -0700 A kernel warning was reported when pinning folio in CMA memory when launching SEV virtual machine. The splat looks like: [ 464.325306] WARNING: CPU: 13 PID: 6734 at mm/gup.c:1313 __get_user_pages+0x423/0x520 [ 464.325464] CPU: 13 PID: 6734 Comm: qemu-kvm Kdump: loaded Not tainted 6.6.33+ #6 [ 464.325477] RIP: 0010:__get_user_pages+0x423/0x520 [ 464.325515] Call Trace: [ 464.325520] <TASK> [ 464.325523] ? __get_user_pages+0x423/0x520 [ 464.325528] ? __warn+0x81/0x130 [ 464.325536] ? __get_user_pages+0x423/0x520 [ 464.325541] ? report_bug+0x171/0x1a0 [ 464.325549] ? handle_bug+0x3c/0x70 [ 464.325554] ? exc_invalid_op+0x17/0x70 [ 464.325558] ? asm_exc_invalid_op+0x1a/0x20 [ 464.325567] ? __get_user_pages+0x423/0x520 [ 464.325575] __gup_longterm_locked+0x212/0x7a0 [ 464.325583] internal_get_user_pages_fast+0xfb/0x190 [ 464.325590] pin_user_pages_fast+0x47/0x60 [ 464.325598] sev_pin_memory+0xca/0x170 [kvm_amd] [ 464.325616] sev_mem_enc_register_region+0x81/0x130 [kvm_amd] Per the analysis done by yangge, when starting the SEV virtual machine, it will call pin_user_pages_fast(..., FOLL_LONGTERM, ...) to pin the memory. But the page is in CMA area, so fast GUP will fail then fallback to the slow path due to the longterm pinnalbe check in try_grab_folio(). The slow path will try to pin the pages then migrate them out of CMA area. But the slow path also uses try_grab_folio() to pin the page, it will also fail due to the same check then the above warning is triggered. In addition, the try_grab_folio() is supposed to be used in fast path and it elevates folio refcount by using add ref unless zero. We are guaranteed to have at least one stable reference in slow path, so the simple atomic add could be used. The performance difference should be trivial, but the misuse may be confusing and misleading. Redefined try_grab_folio() to try_grab_folio_fast(), and try_grab_page() to try_grab_folio(), and use them in the proper paths. This solves both the abuse and the kernel warning. The proper naming makes their usecase more clear and should prevent from abusing in the future. peterx said: : The user will see the pin fails, for gpu-slow it further triggers the WARN : right below that failure (as in the original report): : : folio = try_grab_folio(page, page_increm - 1, : foll_flags); : if (WARN_ON_ONCE(!folio)) { <------------------------ here : /* : * Release the 1st page ref if the : * folio is problematic, fail hard. : */ : gup_put_folio(page_folio(page), 1, : foll_flags); : ret = -EFAULT; : goto out; : } [1] https://lore.kernel.org/linux-mm/1719478388-31917-1-git-send-email-yangge11… [shy828301(a)gmail.com: fix implicit declaration of function try_grab_folio_fast] Link: https://lkml.kernel.org/r/CAHbLzkowMSso-4Nufc9hcMehQsK9PNz3OSu-+eniU-2Mm-xj… Link: https://lkml.kernel.org/r/20240628191458.2605553-1-yang@os.amperecomputing.… Fixes: 57edfcfd3419 ("mm/gup: accelerate thp gup even for "pages != NULL"") Signed-off-by: Yang Shi <yang(a)os.amperecomputing.com> Reported-by: yangge <yangge1116(a)126.com> Cc: Christoph Hellwig <hch(a)infradead.org> Cc: David Hildenbrand <david(a)redhat.com> Cc: Peter Xu <peterx(a)redhat.com> Cc: <stable(a)vger.kernel.org> [6.6+] Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org> --- mm/gup.c | 289 +++++++++++++++++++++++---------------------- mm/huge_memory.c | 2 mm/internal.h | 4 3 files changed, 156 insertions(+), 139 deletions(-) --- a/mm/gup.c~mm-gup-stop-abusing-try_grab_folio +++ a/mm/gup.c @@ -97,95 +97,6 @@ retry: return folio; } -/** - * try_grab_folio() - Attempt to get or pin a folio. - * @page: pointer to page to be grabbed - * @refs: the value to (effectively) add to the folio's refcount - * @flags: gup flags: these are the FOLL_* flag values. - * - * "grab" names in this file mean, "look at flags to decide whether to use - * FOLL_PIN or FOLL_GET behavior, when incrementing the folio's refcount. - * - * Either FOLL_PIN or FOLL_GET (or neither) must be set, but not both at the - * same time. (That's true throughout the get_user_pages*() and - * pin_user_pages*() APIs.) Cases: - * - * FOLL_GET: folio's refcount will be incremented by @refs. - * - * FOLL_PIN on large folios: folio's refcount will be incremented by - * @refs, and its pincount will be incremented by @refs. - * - * FOLL_PIN on single-page folios: folio's refcount will be incremented by - * @refs * GUP_PIN_COUNTING_BIAS. - * - * Return: The folio containing @page (with refcount appropriately - * incremented) for success, or NULL upon failure. If neither FOLL_GET - * nor FOLL_PIN was set, that's considered failure, and furthermore, - * a likely bug in the caller, so a warning is also emitted. - */ -struct folio *try_grab_folio(struct page *page, int refs, unsigned int flags) -{ - struct folio *folio; - - if (WARN_ON_ONCE((flags & (FOLL_GET | FOLL_PIN)) == 0)) - return NULL; - - if (unlikely(!(flags & FOLL_PCI_P2PDMA) && is_pci_p2pdma_page(page))) - return NULL; - - if (flags & FOLL_GET) - return try_get_folio(page, refs); - - /* FOLL_PIN is set */ - - /* - * Don't take a pin on the zero page - it's not going anywhere - * and it is used in a *lot* of places. - */ - if (is_zero_page(page)) - return page_folio(page); - - folio = try_get_folio(page, refs); - if (!folio) - return NULL; - - /* - * Can't do FOLL_LONGTERM + FOLL_PIN gup fast path if not in a - * right zone, so fail and let the caller fall back to the slow - * path. - */ - if (unlikely((flags & FOLL_LONGTERM) && - !folio_is_longterm_pinnable(folio))) { - if (!put_devmap_managed_folio_refs(folio, refs)) - folio_put_refs(folio, refs); - return NULL; - } - - /* - * When pinning a large folio, use an exact count to track it. - * - * However, be sure to *also* increment the normal folio - * refcount field at least once, so that the folio really - * is pinned. That's why the refcount from the earlier - * try_get_folio() is left intact. - */ - if (folio_test_large(folio)) - atomic_add(refs, &folio->_pincount); - else - folio_ref_add(folio, - refs * (GUP_PIN_COUNTING_BIAS - 1)); - /* - * Adjust the pincount before re-checking the PTE for changes. - * This is essentially a smp_mb() and is paired with a memory - * barrier in folio_try_share_anon_rmap_*(). - */ - smp_mb__after_atomic(); - - node_stat_mod_folio(folio, NR_FOLL_PIN_ACQUIRED, refs); - - return folio; -} - static void gup_put_folio(struct folio *folio, int refs, unsigned int flags) { if (flags & FOLL_PIN) { @@ -203,58 +114,59 @@ static void gup_put_folio(struct folio * } /** - * try_grab_page() - elevate a page's refcount by a flag-dependent amount - * @page: pointer to page to be grabbed - * @flags: gup flags: these are the FOLL_* flag values. + * try_grab_folio() - add a folio's refcount by a flag-dependent amount + * @folio: pointer to folio to be grabbed + * @refs: the value to (effectively) add to the folio's refcount + * @flags: gup flags: these are the FOLL_* flag values * * This might not do anything at all, depending on the flags argument. * * "grab" names in this file mean, "look at flags to decide whether to use - * FOLL_PIN or FOLL_GET behavior, when incrementing the page's refcount. + * FOLL_PIN or FOLL_GET behavior, when incrementing the folio's refcount. * * Either FOLL_PIN or FOLL_GET (or neither) may be set, but not both at the same - * time. Cases: please see the try_grab_folio() documentation, with - * "refs=1". + * time. * * Return: 0 for success, or if no action was required (if neither FOLL_PIN * nor FOLL_GET was set, nothing is done). A negative error code for failure: * - * -ENOMEM FOLL_GET or FOLL_PIN was set, but the page could not + * -ENOMEM FOLL_GET or FOLL_PIN was set, but the folio could not * be grabbed. + * + * It is called when we have a stable reference for the folio, typically in + * GUP slow path. */ -int __must_check try_grab_page(struct page *page, unsigned int flags) +int __must_check try_grab_folio(struct folio *folio, int refs, + unsigned int flags) { - struct folio *folio = page_folio(page); - if (WARN_ON_ONCE(folio_ref_count(folio) <= 0)) return -ENOMEM; - if (unlikely(!(flags & FOLL_PCI_P2PDMA) && is_pci_p2pdma_page(page))) + if (unlikely(!(flags & FOLL_PCI_P2PDMA) && is_pci_p2pdma_page(&folio->page))) return -EREMOTEIO; if (flags & FOLL_GET) - folio_ref_inc(folio); + folio_ref_add(folio, refs); else if (flags & FOLL_PIN) { /* * Don't take a pin on the zero page - it's not going anywhere * and it is used in a *lot* of places. */ - if (is_zero_page(page)) + if (is_zero_folio(folio)) return 0; /* - * Similar to try_grab_folio(): be sure to *also* - * increment the normal page refcount field at least once, + * Increment the normal page refcount field at least once, * so that the page really is pinned. */ if (folio_test_large(folio)) { - folio_ref_add(folio, 1); - atomic_add(1, &folio->_pincount); + folio_ref_add(folio, refs); + atomic_add(refs, &folio->_pincount); } else { - folio_ref_add(folio, GUP_PIN_COUNTING_BIAS); + folio_ref_add(folio, refs * GUP_PIN_COUNTING_BIAS); } - node_stat_mod_folio(folio, NR_FOLL_PIN_ACQUIRED, 1); + node_stat_mod_folio(folio, NR_FOLL_PIN_ACQUIRED, refs); } return 0; @@ -515,6 +427,102 @@ static int record_subpages(struct page * return nr; } + +/** + * try_grab_folio_fast() - Attempt to get or pin a folio in fast path. + * @page: pointer to page to be grabbed + * @refs: the value to (effectively) add to the folio's refcount + * @flags: gup flags: these are the FOLL_* flag values. + * + * "grab" names in this file mean, "look at flags to decide whether to use + * FOLL_PIN or FOLL_GET behavior, when incrementing the folio's refcount. + * + * Either FOLL_PIN or FOLL_GET (or neither) must be set, but not both at the + * same time. (That's true throughout the get_user_pages*() and + * pin_user_pages*() APIs.) Cases: + * + * FOLL_GET: folio's refcount will be incremented by @refs. + * + * FOLL_PIN on large folios: folio's refcount will be incremented by + * @refs, and its pincount will be incremented by @refs. + * + * FOLL_PIN on single-page folios: folio's refcount will be incremented by + * @refs * GUP_PIN_COUNTING_BIAS. + * + * Return: The folio containing @page (with refcount appropriately + * incremented) for success, or NULL upon failure. If neither FOLL_GET + * nor FOLL_PIN was set, that's considered failure, and furthermore, + * a likely bug in the caller, so a warning is also emitted. + * + * It uses add ref unless zero to elevate the folio refcount and must be called + * in fast path only. + */ +static struct folio *try_grab_folio_fast(struct page *page, int refs, + unsigned int flags) +{ + struct folio *folio; + + /* Raise warn if it is not called in fast GUP */ + VM_WARN_ON_ONCE(!irqs_disabled()); + + if (WARN_ON_ONCE((flags & (FOLL_GET | FOLL_PIN)) == 0)) + return NULL; + + if (unlikely(!(flags & FOLL_PCI_P2PDMA) && is_pci_p2pdma_page(page))) + return NULL; + + if (flags & FOLL_GET) + return try_get_folio(page, refs); + + /* FOLL_PIN is set */ + + /* + * Don't take a pin on the zero page - it's not going anywhere + * and it is used in a *lot* of places. + */ + if (is_zero_page(page)) + return page_folio(page); + + folio = try_get_folio(page, refs); + if (!folio) + return NULL; + + /* + * Can't do FOLL_LONGTERM + FOLL_PIN gup fast path if not in a + * right zone, so fail and let the caller fall back to the slow + * path. + */ + if (unlikely((flags & FOLL_LONGTERM) && + !folio_is_longterm_pinnable(folio))) { + if (!put_devmap_managed_folio_refs(folio, refs)) + folio_put_refs(folio, refs); + return NULL; + } + + /* + * When pinning a large folio, use an exact count to track it. + * + * However, be sure to *also* increment the normal folio + * refcount field at least once, so that the folio really + * is pinned. That's why the refcount from the earlier + * try_get_folio() is left intact. + */ + if (folio_test_large(folio)) + atomic_add(refs, &folio->_pincount); + else + folio_ref_add(folio, + refs * (GUP_PIN_COUNTING_BIAS - 1)); + /* + * Adjust the pincount before re-checking the PTE for changes. + * This is essentially a smp_mb() and is paired with a memory + * barrier in folio_try_share_anon_rmap_*(). + */ + smp_mb__after_atomic(); + + node_stat_mod_folio(folio, NR_FOLL_PIN_ACQUIRED, refs); + + return folio; +} #endif /* CONFIG_ARCH_HAS_HUGEPD || CONFIG_HAVE_GUP_FAST */ #ifdef CONFIG_ARCH_HAS_HUGEPD @@ -535,7 +543,7 @@ static unsigned long hugepte_addr_end(un */ static int gup_hugepte(struct vm_area_struct *vma, pte_t *ptep, unsigned long sz, unsigned long addr, unsigned long end, unsigned int flags, - struct page **pages, int *nr) + struct page **pages, int *nr, bool fast) { unsigned long pte_end; struct page *page; @@ -558,9 +566,15 @@ static int gup_hugepte(struct vm_area_st page = pte_page(pte); refs = record_subpages(page, sz, addr, end, pages + *nr); - folio = try_grab_folio(page, refs, flags); - if (!folio) - return 0; + if (fast) { + folio = try_grab_folio_fast(page, refs, flags); + if (!folio) + return 0; + } else { + folio = page_folio(page); + if (try_grab_folio(folio, refs, flags)) + return 0; + } if (unlikely(pte_val(pte) != pte_val(ptep_get(ptep)))) { gup_put_folio(folio, refs, flags); @@ -588,7 +602,7 @@ static int gup_hugepte(struct vm_area_st static int gup_hugepd(struct vm_area_struct *vma, hugepd_t hugepd, unsigned long addr, unsigned int pdshift, unsigned long end, unsigned int flags, - struct page **pages, int *nr) + struct page **pages, int *nr, bool fast) { pte_t *ptep; unsigned long sz = 1UL << hugepd_shift(hugepd); @@ -598,7 +612,8 @@ static int gup_hugepd(struct vm_area_str ptep = hugepte_offset(hugepd, addr, pdshift); do { next = hugepte_addr_end(addr, end, sz); - ret = gup_hugepte(vma, ptep, sz, addr, end, flags, pages, nr); + ret = gup_hugepte(vma, ptep, sz, addr, end, flags, pages, nr, + fast); if (ret != 1) return ret; } while (ptep++, addr = next, addr != end); @@ -625,7 +640,7 @@ static struct page *follow_hugepd(struct ptep = hugepte_offset(hugepd, addr, pdshift); ptl = huge_pte_lock(h, vma->vm_mm, ptep); ret = gup_hugepd(vma, hugepd, addr, pdshift, addr + PAGE_SIZE, - flags, &page, &nr); + flags, &page, &nr, false); spin_unlock(ptl); if (ret == 1) { @@ -642,7 +657,7 @@ static struct page *follow_hugepd(struct static inline int gup_hugepd(struct vm_area_struct *vma, hugepd_t hugepd, unsigned long addr, unsigned int pdshift, unsigned long end, unsigned int flags, - struct page **pages, int *nr) + struct page **pages, int *nr, bool fast) { return 0; } @@ -729,7 +744,7 @@ static struct page *follow_huge_pud(stru gup_must_unshare(vma, flags, page)) return ERR_PTR(-EMLINK); - ret = try_grab_page(page, flags); + ret = try_grab_folio(page_folio(page), 1, flags); if (ret) page = ERR_PTR(ret); else @@ -806,7 +821,7 @@ static struct page *follow_huge_pmd(stru VM_BUG_ON_PAGE((flags & FOLL_PIN) && PageAnon(page) && !PageAnonExclusive(page), page); - ret = try_grab_page(page, flags); + ret = try_grab_folio(page_folio(page), 1, flags); if (ret) return ERR_PTR(ret); @@ -968,8 +983,8 @@ static struct page *follow_page_pte(stru VM_BUG_ON_PAGE((flags & FOLL_PIN) && PageAnon(page) && !PageAnonExclusive(page), page); - /* try_grab_page() does nothing unless FOLL_GET or FOLL_PIN is set. */ - ret = try_grab_page(page, flags); + /* try_grab_folio() does nothing unless FOLL_GET or FOLL_PIN is set. */ + ret = try_grab_folio(page_folio(page), 1, flags); if (unlikely(ret)) { page = ERR_PTR(ret); goto out; @@ -1233,7 +1248,7 @@ static int get_gate_page(struct mm_struc goto unmap; *page = pte_page(entry); } - ret = try_grab_page(*page, gup_flags); + ret = try_grab_folio(page_folio(*page), 1, gup_flags); if (unlikely(ret)) goto unmap; out: @@ -1636,20 +1651,19 @@ next_page: * pages. */ if (page_increm > 1) { - struct folio *folio; + struct folio *folio = page_folio(page); /* * Since we already hold refcount on the * large folio, this should never fail. */ - folio = try_grab_folio(page, page_increm - 1, - foll_flags); - if (WARN_ON_ONCE(!folio)) { + if (try_grab_folio(folio, page_increm - 1, + foll_flags)) { /* * Release the 1st page ref if the * folio is problematic, fail hard. */ - gup_put_folio(page_folio(page), 1, + gup_put_folio(folio, 1, foll_flags); ret = -EFAULT; goto out; @@ -2797,7 +2811,6 @@ EXPORT_SYMBOL(get_user_pages_unlocked); * This code is based heavily on the PowerPC implementation by Nick Piggin. */ #ifdef CONFIG_HAVE_GUP_FAST - /* * Used in the GUP-fast path to determine whether GUP is permitted to work on * a specific folio. @@ -2962,7 +2975,7 @@ static int gup_fast_pte_range(pmd_t pmd, VM_BUG_ON(!pfn_valid(pte_pfn(pte))); page = pte_page(pte); - folio = try_grab_folio(page, 1, flags); + folio = try_grab_folio_fast(page, 1, flags); if (!folio) goto pte_unmap; @@ -3049,7 +3062,7 @@ static int gup_fast_devmap_leaf(unsigned break; } - folio = try_grab_folio(page, 1, flags); + folio = try_grab_folio_fast(page, 1, flags); if (!folio) { gup_fast_undo_dev_pagemap(nr, nr_start, flags, pages); break; @@ -3138,7 +3151,7 @@ static int gup_fast_pmd_leaf(pmd_t orig, page = pmd_page(orig); refs = record_subpages(page, PMD_SIZE, addr, end, pages + *nr); - folio = try_grab_folio(page, refs, flags); + folio = try_grab_folio_fast(page, refs, flags); if (!folio) return 0; @@ -3182,7 +3195,7 @@ static int gup_fast_pud_leaf(pud_t orig, page = pud_page(orig); refs = record_subpages(page, PUD_SIZE, addr, end, pages + *nr); - folio = try_grab_folio(page, refs, flags); + folio = try_grab_folio_fast(page, refs, flags); if (!folio) return 0; @@ -3222,7 +3235,7 @@ static int gup_fast_pgd_leaf(pgd_t orig, page = pgd_page(orig); refs = record_subpages(page, PGDIR_SIZE, addr, end, pages + *nr); - folio = try_grab_folio(page, refs, flags); + folio = try_grab_folio_fast(page, refs, flags); if (!folio) return 0; @@ -3276,7 +3289,8 @@ static int gup_fast_pmd_range(pud_t *pud * pmd format and THP pmd format */ if (gup_hugepd(NULL, __hugepd(pmd_val(pmd)), addr, - PMD_SHIFT, next, flags, pages, nr) != 1) + PMD_SHIFT, next, flags, pages, nr, + true) != 1) return 0; } else if (!gup_fast_pte_range(pmd, pmdp, addr, next, flags, pages, nr)) @@ -3306,7 +3320,8 @@ static int gup_fast_pud_range(p4d_t *p4d return 0; } else if (unlikely(is_hugepd(__hugepd(pud_val(pud))))) { if (gup_hugepd(NULL, __hugepd(pud_val(pud)), addr, - PUD_SHIFT, next, flags, pages, nr) != 1) + PUD_SHIFT, next, flags, pages, nr, + true) != 1) return 0; } else if (!gup_fast_pmd_range(pudp, pud, addr, next, flags, pages, nr)) @@ -3333,7 +3348,8 @@ static int gup_fast_p4d_range(pgd_t *pgd BUILD_BUG_ON(p4d_leaf(p4d)); if (unlikely(is_hugepd(__hugepd(p4d_val(p4d))))) { if (gup_hugepd(NULL, __hugepd(p4d_val(p4d)), addr, - P4D_SHIFT, next, flags, pages, nr) != 1) + P4D_SHIFT, next, flags, pages, nr, + true) != 1) return 0; } else if (!gup_fast_pud_range(p4dp, p4d, addr, next, flags, pages, nr)) @@ -3362,7 +3378,8 @@ static void gup_fast_pgd_range(unsigned return; } else if (unlikely(is_hugepd(__hugepd(pgd_val(pgd))))) { if (gup_hugepd(NULL, __hugepd(pgd_val(pgd)), addr, - PGDIR_SHIFT, next, flags, pages, nr) != 1) + PGDIR_SHIFT, next, flags, pages, nr, + true) != 1) return; } else if (!gup_fast_p4d_range(pgdp, pgd, addr, next, flags, pages, nr)) --- a/mm/huge_memory.c~mm-gup-stop-abusing-try_grab_folio +++ a/mm/huge_memory.c @@ -1331,7 +1331,7 @@ struct page *follow_devmap_pmd(struct vm if (!*pgmap) return ERR_PTR(-EFAULT); page = pfn_to_page(pfn); - ret = try_grab_page(page, flags); + ret = try_grab_folio(page_folio(page), 1, flags); if (ret) page = ERR_PTR(ret); --- a/mm/internal.h~mm-gup-stop-abusing-try_grab_folio +++ a/mm/internal.h @@ -1182,8 +1182,8 @@ int migrate_device_coherent_page(struct /* * mm/gup.c */ -struct folio *try_grab_folio(struct page *page, int refs, unsigned int flags); -int __must_check try_grab_page(struct page *page, unsigned int flags); +int __must_check try_grab_folio(struct folio *folio, int refs, + unsigned int flags); /* * mm/huge_memory.c _ Patches currently in -mm which might be from yang(a)os.amperecomputing.com are

1 year, 5 months

1
0
0 0

buckles ,leather hardware ,bag hardware ,shoe hardware ,cord hardware ,hockey hardware, head pins ,wallets ,trims ,handles,tools ,fashion ,wires supply ?

by qualityfactory＠vip.163.com

Dear Sir : Nice day! This is Su from METAL & HARNESSES factory ,we make D rings , O rings ,Plastic hardware ,clasps ,clips , buckles ,leather hardware ,bag hardware ,shoe hardware ,cord hardware ,hockey hardware, head pins ,wallets ,trims ,handles,tools ,fashion ,wires ,fastening,rivets , grommets,eyelets,buckles,screws , harnesses ,wires ,bolts ,nuts ,clips ,bits ,spurs ,screws,anchors,stainless steel hardware , brass hardware,steel harnesses ,aluminium hardware ,iron hardware ,metal & Harnesses,belt and buckles,straps ,leather Harnesses,tags ,belts,straps ,clips ,rings bars ,hooks , safety buckles,brass buckles, brass rings ,belts,buckles ,Split Ring ,straps ,sliders,snaps ,Bars Rings, Fasteners, magnent buttons,craft hardware ,bindings ,knobs,steps ,straps , bits ,eyelets ,tools , metal gear , metal hardware , metal products, metal Components,S hooks ,bars ,snaps ,webbings , buckles, HOOK,brass buckles, clips , hooks ,straps , accessories ,brass snaps ,brass gear , stainless steel hardware,steel harness ,steel wires, wire rings , wire buckles ,brass gears ,brass products ,iron metal hardware ,Fasteners,safety buckles,brass buckles, brass rings ,belts,buckles ,Ring ,Squares ,straps ,sliders,Bars Rings,Buckles, Cord locks, Hooks & Rings,Tabs, Locks and Slides,magnent buttons,carabiners,trigger ,buttons ,snaps ,fasteners, webbings , webbing buckles, buttons,Hook Loop ，textile accessories,safety buckles,sliders,snaps ,Rings ,chains,suspenders as required for our global clients . We are manufactory, we are the source, our price is very competitive ,you will get the best price , We have profuse designs with series quality grade, and expressly. Our factory always produce customer designs and drawing , if you have any products looking please let me know we could surely make for you Sincerely hope could work with you ! Best regards Su

1 year, 5 months

1
0
0 0

[PATCH v2] net: ks8851: Fix deadlock with the SPI chip variant

by Ronald Wahl

From: Ronald Wahl <ronald.wahl(a)raritan.com> When SMP is enabled and spinlocks are actually functional then there is a deadlock with the 'statelock' spinlock between ks8851_start_xmit_spi and ks8851_irq: watchdog: BUG: soft lockup - CPU#0 stuck for 27s! call trace: queued_spin_lock_slowpath+0x100/0x284 do_raw_spin_lock+0x34/0x44 ks8851_start_xmit_spi+0x30/0xb8 ks8851_start_xmit+0x14/0x20 netdev_start_xmit+0x40/0x6c dev_hard_start_xmit+0x6c/0xbc sch_direct_xmit+0xa4/0x22c __qdisc_run+0x138/0x3fc qdisc_run+0x24/0x3c net_tx_action+0xf8/0x130 handle_softirqs+0x1ac/0x1f0 __do_softirq+0x14/0x20 ____do_softirq+0x10/0x1c call_on_irq_stack+0x3c/0x58 do_softirq_own_stack+0x1c/0x28 __irq_exit_rcu+0x54/0x9c irq_exit_rcu+0x10/0x1c el1_interrupt+0x38/0x50 el1h_64_irq_handler+0x18/0x24 el1h_64_irq+0x64/0x68 __netif_schedule+0x6c/0x80 netif_tx_wake_queue+0x38/0x48 ks8851_irq+0xb8/0x2c8 irq_thread_fn+0x2c/0x74 irq_thread+0x10c/0x1b0 kthread+0xc8/0xd8 ret_from_fork+0x10/0x20 This issue has not been identified earlier because tests were done on a device with SMP disabled and so spinlocks were actually NOPs. Now use spin_(un)lock_bh for TX queue related locking to avoid execution of softirq work synchronously that would lead to a deadlock. Fixes: 3dc5d4454545 ("net: ks8851: Fix TX stall caused by TX buffer overrun") Cc: "David S. Miller" <davem(a)davemloft.net> Cc: Eric Dumazet <edumazet(a)google.com> Cc: Jakub Kicinski <kuba(a)kernel.org> Cc: Paolo Abeni <pabeni(a)redhat.com> Cc: Simon Horman <horms(a)kernel.org> Cc: netdev(a)vger.kernel.org Cc: stable(a)vger.kernel.org # 5.10+ Signed-off-by: Ronald Wahl <ronald.wahl(a)raritan.com> --- V2: - use spin_lock_bh instead of moving netif_wake_queue outside of locked region (doing the same in the start_xmit function) - add missing net: tag drivers/net/ethernet/micrel/ks8851_common.c | 4 ++-- drivers/net/ethernet/micrel/ks8851_spi.c | 4 ++-- 2 files changed, 4 insertions(+), 4 deletions(-) diff --git a/drivers/net/ethernet/micrel/ks8851_common.c b/drivers/net/ethernet/micrel/ks8851_common.c index 6453c92f0fa7..51fb6c27153e 100644 --- a/drivers/net/ethernet/micrel/ks8851_common.c +++ b/drivers/net/ethernet/micrel/ks8851_common.c @@ -352,11 +352,11 @@ static irqreturn_t ks8851_irq(int irq, void *_ks) netif_dbg(ks, intr, ks->netdev, "%s: txspace %d\n", __func__, tx_space); - spin_lock(&ks->statelock); + spin_lock_bh(&ks->statelock); ks->tx_space = tx_space; if (netif_queue_stopped(ks->netdev)) netif_wake_queue(ks->netdev); - spin_unlock(&ks->statelock); + spin_unlock_bh(&ks->statelock); } if (status & IRQ_SPIBEI) { diff --git a/drivers/net/ethernet/micrel/ks8851_spi.c b/drivers/net/ethernet/micrel/ks8851_spi.c index 670c1de966db..818e1ce3227b 100644 --- a/drivers/net/ethernet/micrel/ks8851_spi.c +++ b/drivers/net/ethernet/micrel/ks8851_spi.c @@ -385,7 +385,7 @@ static netdev_tx_t ks8851_start_xmit_spi(struct sk_buff *skb, netif_dbg(ks, tx_queued, ks->netdev, "%s: skb %p, %d@%p\n", __func__, skb, skb->len, skb->data); - spin_lock(&ks->statelock); + spin_lock_bh(&ks->statelock); if (ks->queued_len + needed > ks->tx_space) { netif_stop_queue(dev); @@ -395,7 +395,7 @@ static netdev_tx_t ks8851_start_xmit_spi(struct sk_buff *skb, skb_queue_tail(&ks->txq, skb); } - spin_unlock(&ks->statelock); + spin_unlock_bh(&ks->statelock); if (ret == NETDEV_TX_OK) schedule_work(&kss->tx_work); -- 2.45.2

1 year, 5 months

4
4
0 0

patch "bus: mhi: ep: Do not allocate memory for MHI objects from DMA zone" added to char-misc-next

by gregkh＠linuxfoundation.org

This is a note to let you know that I've just added the patch titled bus: mhi: ep: Do not allocate memory for MHI objects from DMA zone to my char-misc git tree which can be found at git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc.git in the char-misc-next branch. The patch will show up in the next release of the linux-next tree (usually sometime within the next 24 hours during the week.) The patch will also be merged in the next major kernel release during the merge window. If you have any questions about this process, please let me know. From c7d0b2db5bc5e8c0fdc67b3c8f463c3dfec92f77 Mon Sep 17 00:00:00 2001 From: Manivannan Sadhasivam <manivannan.sadhasivam(a)linaro.org> Date: Mon, 3 Jun 2024 22:13:54 +0530 Subject: bus: mhi: ep: Do not allocate memory for MHI objects from DMA zone MHI endpoint stack accidentally started allocating memory for objects from DMA zone since commit 62210a26cd4f ("bus: mhi: ep: Use slab allocator where applicable"). But there is no real need to allocate memory from this naturally limited DMA zone. This also causes the MHI endpoint stack to run out of memory while doing high bandwidth transfers. So let's switch over to normal memory. Cc: <stable(a)vger.kernel.org> # 6.8 Fixes: 62210a26cd4f ("bus: mhi: ep: Use slab allocator where applicable") Reviewed-by: Mayank Rana <quic_mrana(a)quicinc.com> Link: https://lore.kernel.org/r/20240603164354.79035-1-manivannan.sadhasivam@lina… Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam(a)linaro.org> --- drivers/bus/mhi/ep/main.c | 14 +++++++------- 1 file changed, 7 insertions(+), 7 deletions(-) diff --git a/drivers/bus/mhi/ep/main.c b/drivers/bus/mhi/ep/main.c index f8f674adf1d4..4acfac73ca9a 100644 --- a/drivers/bus/mhi/ep/main.c +++ b/drivers/bus/mhi/ep/main.c @@ -90,7 +90,7 @@ static int mhi_ep_send_completion_event(struct mhi_ep_cntrl *mhi_cntrl, struct m struct mhi_ring_element *event; int ret; - event = kmem_cache_zalloc(mhi_cntrl->ev_ring_el_cache, GFP_KERNEL | GFP_DMA); + event = kmem_cache_zalloc(mhi_cntrl->ev_ring_el_cache, GFP_KERNEL); if (!event) return -ENOMEM; @@ -109,7 +109,7 @@ int mhi_ep_send_state_change_event(struct mhi_ep_cntrl *mhi_cntrl, enum mhi_stat struct mhi_ring_element *event; int ret; - event = kmem_cache_zalloc(mhi_cntrl->ev_ring_el_cache, GFP_KERNEL | GFP_DMA); + event = kmem_cache_zalloc(mhi_cntrl->ev_ring_el_cache, GFP_KERNEL); if (!event) return -ENOMEM; @@ -127,7 +127,7 @@ int mhi_ep_send_ee_event(struct mhi_ep_cntrl *mhi_cntrl, enum mhi_ee_type exec_e struct mhi_ring_element *event; int ret; - event = kmem_cache_zalloc(mhi_cntrl->ev_ring_el_cache, GFP_KERNEL | GFP_DMA); + event = kmem_cache_zalloc(mhi_cntrl->ev_ring_el_cache, GFP_KERNEL); if (!event) return -ENOMEM; @@ -146,7 +146,7 @@ static int mhi_ep_send_cmd_comp_event(struct mhi_ep_cntrl *mhi_cntrl, enum mhi_e struct mhi_ring_element *event; int ret; - event = kmem_cache_zalloc(mhi_cntrl->ev_ring_el_cache, GFP_KERNEL | GFP_DMA); + event = kmem_cache_zalloc(mhi_cntrl->ev_ring_el_cache, GFP_KERNEL); if (!event) return -ENOMEM; @@ -438,7 +438,7 @@ static int mhi_ep_read_channel(struct mhi_ep_cntrl *mhi_cntrl, read_offset = mhi_chan->tre_size - mhi_chan->tre_bytes_left; write_offset = len - buf_left; - buf_addr = kmem_cache_zalloc(mhi_cntrl->tre_buf_cache, GFP_KERNEL | GFP_DMA); + buf_addr = kmem_cache_zalloc(mhi_cntrl->tre_buf_cache, GFP_KERNEL); if (!buf_addr) return -ENOMEM; @@ -1481,14 +1481,14 @@ int mhi_ep_register_controller(struct mhi_ep_cntrl *mhi_cntrl, mhi_cntrl->ev_ring_el_cache = kmem_cache_create("mhi_ep_event_ring_el", sizeof(struct mhi_ring_element), 0, - SLAB_CACHE_DMA, NULL); + 0, NULL); if (!mhi_cntrl->ev_ring_el_cache) { ret = -ENOMEM; goto err_free_cmd; } mhi_cntrl->tre_buf_cache = kmem_cache_create("mhi_ep_tre_buf", MHI_EP_DEFAULT_MTU, 0, - SLAB_CACHE_DMA, NULL); + 0, NULL); if (!mhi_cntrl->tre_buf_cache) { ret = -ENOMEM; goto err_destroy_ev_ring_el_cache; -- 2.45.2

1 year, 5 months

1
0
0 0

patch "bus: mhi: ep: Do not allocate memory for MHI objects from DMA zone" added to char-misc-testing

by gregkh＠linuxfoundation.org

This is a note to let you know that I've just added the patch titled bus: mhi: ep: Do not allocate memory for MHI objects from DMA zone to my char-misc git tree which can be found at git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc.git in the char-misc-testing branch. The patch will show up in the next release of the linux-next tree (usually sometime within the next 24 hours during the week.) The patch will be merged to the char-misc-next branch sometime soon, after it passes testing, and the merge window is open. If you have any questions about this process, please let me know. From c7d0b2db5bc5e8c0fdc67b3c8f463c3dfec92f77 Mon Sep 17 00:00:00 2001 From: Manivannan Sadhasivam <manivannan.sadhasivam(a)linaro.org> Date: Mon, 3 Jun 2024 22:13:54 +0530 Subject: bus: mhi: ep: Do not allocate memory for MHI objects from DMA zone MHI endpoint stack accidentally started allocating memory for objects from DMA zone since commit 62210a26cd4f ("bus: mhi: ep: Use slab allocator where applicable"). But there is no real need to allocate memory from this naturally limited DMA zone. This also causes the MHI endpoint stack to run out of memory while doing high bandwidth transfers. So let's switch over to normal memory. Cc: <stable(a)vger.kernel.org> # 6.8 Fixes: 62210a26cd4f ("bus: mhi: ep: Use slab allocator where applicable") Reviewed-by: Mayank Rana <quic_mrana(a)quicinc.com> Link: https://lore.kernel.org/r/20240603164354.79035-1-manivannan.sadhasivam@lina… Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam(a)linaro.org> --- drivers/bus/mhi/ep/main.c | 14 +++++++------- 1 file changed, 7 insertions(+), 7 deletions(-) diff --git a/drivers/bus/mhi/ep/main.c b/drivers/bus/mhi/ep/main.c index f8f674adf1d4..4acfac73ca9a 100644 --- a/drivers/bus/mhi/ep/main.c +++ b/drivers/bus/mhi/ep/main.c @@ -90,7 +90,7 @@ static int mhi_ep_send_completion_event(struct mhi_ep_cntrl *mhi_cntrl, struct m struct mhi_ring_element *event; int ret; - event = kmem_cache_zalloc(mhi_cntrl->ev_ring_el_cache, GFP_KERNEL | GFP_DMA); + event = kmem_cache_zalloc(mhi_cntrl->ev_ring_el_cache, GFP_KERNEL); if (!event) return -ENOMEM; @@ -109,7 +109,7 @@ int mhi_ep_send_state_change_event(struct mhi_ep_cntrl *mhi_cntrl, enum mhi_stat struct mhi_ring_element *event; int ret; - event = kmem_cache_zalloc(mhi_cntrl->ev_ring_el_cache, GFP_KERNEL | GFP_DMA); + event = kmem_cache_zalloc(mhi_cntrl->ev_ring_el_cache, GFP_KERNEL); if (!event) return -ENOMEM; @@ -127,7 +127,7 @@ int mhi_ep_send_ee_event(struct mhi_ep_cntrl *mhi_cntrl, enum mhi_ee_type exec_e struct mhi_ring_element *event; int ret; - event = kmem_cache_zalloc(mhi_cntrl->ev_ring_el_cache, GFP_KERNEL | GFP_DMA); + event = kmem_cache_zalloc(mhi_cntrl->ev_ring_el_cache, GFP_KERNEL); if (!event) return -ENOMEM; @@ -146,7 +146,7 @@ static int mhi_ep_send_cmd_comp_event(struct mhi_ep_cntrl *mhi_cntrl, enum mhi_e struct mhi_ring_element *event; int ret; - event = kmem_cache_zalloc(mhi_cntrl->ev_ring_el_cache, GFP_KERNEL | GFP_DMA); + event = kmem_cache_zalloc(mhi_cntrl->ev_ring_el_cache, GFP_KERNEL); if (!event) return -ENOMEM; @@ -438,7 +438,7 @@ static int mhi_ep_read_channel(struct mhi_ep_cntrl *mhi_cntrl, read_offset = mhi_chan->tre_size - mhi_chan->tre_bytes_left; write_offset = len - buf_left; - buf_addr = kmem_cache_zalloc(mhi_cntrl->tre_buf_cache, GFP_KERNEL | GFP_DMA); + buf_addr = kmem_cache_zalloc(mhi_cntrl->tre_buf_cache, GFP_KERNEL); if (!buf_addr) return -ENOMEM; @@ -1481,14 +1481,14 @@ int mhi_ep_register_controller(struct mhi_ep_cntrl *mhi_cntrl, mhi_cntrl->ev_ring_el_cache = kmem_cache_create("mhi_ep_event_ring_el", sizeof(struct mhi_ring_element), 0, - SLAB_CACHE_DMA, NULL); + 0, NULL); if (!mhi_cntrl->ev_ring_el_cache) { ret = -ENOMEM; goto err_free_cmd; } mhi_cntrl->tre_buf_cache = kmem_cache_create("mhi_ep_tre_buf", MHI_EP_DEFAULT_MTU, 0, - SLAB_CACHE_DMA, NULL); + 0, NULL); if (!mhi_cntrl->tre_buf_cache) { ret = -ENOMEM; goto err_destroy_ev_ring_el_cache; -- 2.45.2

1 year, 5 months

1
0
0 0

[PATCH] kbuild: Update ld-version.sh for change in LLD version output

by Nathan Chancellor

After [1] in upstream LLVM, ld.lld's version output is slightly different when the cmake configuration option LLVM_APPEND_VC_REV is disabled. Before: Debian LLD 19.0.0 (compatible with GNU linkers) After: Debian LLD 19.0.0, compatible with GNU linkers This results in ld-version.sh failing with scripts/ld-version.sh: 19: arithmetic expression: expecting EOF: "10000 * 19 + 100 * 0 + 0," because the trailing comma is included in the patch level part of the expression. Remove the trailing comma when assigning the version variable in the LLD block to resolve the error, resulting in the proper output: LLD 190000 With LLVM_APPEND_VC_REV enabled, there is no issue with the new output because it is treated the same as the prior LLVM_APPEND_VC_REV=OFF version string was. ClangBuiltLinux LLD 19.0.0 (https://github.com/llvm/llvm-project a3c5c83273358a85a4e02f5f76379b1a276e7714), compatible with GNU linkers Cc: stable(a)vger.kernel.org Fixes: 02aff8592204 ("kbuild: check the minimum linker version in Kconfig") Link: https://github.com/llvm/llvm-project/commit/0f9fbbb63cfcd2069441aa2ebef622c… [1] Signed-off-by: Nathan Chancellor <nathan(a)kernel.org> --- scripts/ld-version.sh | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/scripts/ld-version.sh b/scripts/ld-version.sh index a78b804b680c..f2f425322524 100755 --- a/scripts/ld-version.sh +++ b/scripts/ld-version.sh @@ -47,7 +47,9 @@ else done if [ "$1" = LLD ]; then - version=$2 + # LLD after https://github.com/llvm/llvm-project/commit/0f9fbbb63cfcd2069441aa2ebef622c… + # may have a trailing comma on the patch version with LLVM_APPEND_VC_REV=off. + version=${2%,} min_version=$($min_tool_version llvm) name=LLD disp_name=LLD --- base-commit: 22a40d14b572deb80c0648557f4bd502d7e83826 change-id: 20240704-update-ld-version-for-new-lld-ver-str-b7a4afbbd5f1 Best regards, -- Nathan Chancellor <nathan(a)kernel.org>

1 year, 5 months

3
4
0 0

[tip: perf/core] perf/x86/intel/pt: Fix topa_entry base length

by tip-bot2 for Marco Cavenati

The following commit has been merged into the perf/core branch of tip: Commit-ID: 5638bd722a44bbe97c1a7b3fae5b9efddb3e70ff Gitweb: https://git.kernel.org/tip/5638bd722a44bbe97c1a7b3fae5b9efddb3e70ff Author: Marco Cavenati <cavenati.marco(a)gmail.com> AuthorDate: Mon, 24 Jun 2024 23:10:55 +03:00 Committer: Peter Zijlstra <peterz(a)infradead.org> CommitterDate: Thu, 04 Jul 2024 16:00:20 +02:00 perf/x86/intel/pt: Fix topa_entry base length topa_entry->base needs to store a pfn. It obviously needs to be large enough to store the largest possible x86 pfn which is MAXPHYADDR-PAGE_SIZE (52-12). So it is 4 bits too small. Increase the size of topa_entry->base from 36 bits to 40 bits. Note, systems where physical addresses can be 256TiB or more are affected. [ Adrian: Amend commit message as suggested by Dave Hansen ] Fixes: 52ca9ced3f70 ("perf/x86/intel/pt: Add Intel PT PMU driver") Signed-off-by: Marco Cavenati <cavenati.marco(a)gmail.com> Signed-off-by: Adrian Hunter <adrian.hunter(a)intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz(a)infradead.org> Reviewed-by: Adrian Hunter <adrian.hunter(a)intel.com> Cc: stable(a)vger.kernel.org Link: https://lore.kernel.org/r/20240624201101.60186-2-adrian.hunter@intel.com --- arch/x86/events/intel/pt.h | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/arch/x86/events/intel/pt.h b/arch/x86/events/intel/pt.h index 96906a6..f5e46c0 100644 --- a/arch/x86/events/intel/pt.h +++ b/arch/x86/events/intel/pt.h @@ -33,8 +33,8 @@ struct topa_entry { u64 rsvd2 : 1; u64 size : 4; u64 rsvd3 : 2; - u64 base : 36; - u64 rsvd4 : 16; + u64 base : 40; + u64 rsvd4 : 12; }; /* TSC to Core Crystal Clock Ratio */

1 year, 5 months

1
0
0 0

2025

2024

2023

2022

2021

2020

2019

2018

2017

Linux-stable-mirror July 2024