May 2025 - Linux-stable-mirror

[PATCH] f2fs: fix to return correct error number in f2fs_sync_node_pages()

by Chao Yu

If __write_node_folio() failed, it will return AOP_WRITEPAGE_ACTIVATE, the incorrect return value may be passed to userspace in below path, fix it. - sync_filesystem - sync_fs - f2fs_issue_checkpoint - block_operations - f2fs_sync_node_pages - __write_node_folio : return AOP_WRITEPAGE_ACTIVATE Cc: stable(a)vger.kernel.org Reported-by: Christoph Hellwig <hch(a)lst.de> Signed-off-by: Chao Yu <chao(a)kernel.org> --- fs/f2fs/node.c | 8 ++++++-- 1 file changed, 6 insertions(+), 2 deletions(-) diff --git a/fs/f2fs/node.c b/fs/f2fs/node.c index ec74eb9982a5..69308523c34e 100644 --- a/fs/f2fs/node.c +++ b/fs/f2fs/node.c @@ -2092,10 +2092,14 @@ int f2fs_sync_node_pages(struct f2fs_sb_info *sbi, ret = __write_node_folio(folio, false, &submitted, wbc, do_balance, io_type, NULL); - if (ret) + if (ret) { folio_unlock(folio); - else if (submitted) + folio_batch_release(&fbatch); + ret = -EIO; + goto out; + } else if (submitted) { nwritten++; + } if (--wbc->nr_to_write == 0) break; -- 2.49.0

7 months, 2 weeks

2
1
0 0

[PATCH 1/6] f2fs: fix to return correct error number in f2fs_sync_node_pages()

by Christoph Hellwig

From: Chao Yu <chao(a)kernel.org> If __write_node_folio() failed, it will return AOP_WRITEPAGE_ACTIVATE, the incorrect return value may be passed to userspace in below path, fix it. - sync_filesystem - sync_fs - f2fs_issue_checkpoint - block_operations - f2fs_sync_node_pages - __write_node_folio : return AOP_WRITEPAGE_ACTIVATE Cc: stable(a)vger.kernel.org Reported-by: Christoph Hellwig <hch(a)lst.de> Signed-off-by: Chao Yu <chao(a)kernel.org> --- fs/f2fs/node.c | 8 ++++++-- 1 file changed, 6 insertions(+), 2 deletions(-) diff --git a/fs/f2fs/node.c b/fs/f2fs/node.c index ec74eb9982a5..69308523c34e 100644 --- a/fs/f2fs/node.c +++ b/fs/f2fs/node.c @@ -2092,10 +2092,14 @@ int f2fs_sync_node_pages(struct f2fs_sb_info *sbi, ret = __write_node_folio(folio, false, &submitted, wbc, do_balance, io_type, NULL); - if (ret) + if (ret) { folio_unlock(folio); - else if (submitted) + folio_batch_release(&fbatch); + ret = -EIO; + goto out; + } else if (submitted) { nwritten++; + } if (--wbc->nr_to_write == 0) break; -- 2.47.2

7 months, 2 weeks

2
1
0 0

FAILED: patch "[PATCH] mm: page_alloc: speed up fallbacks in rmqueue_bulk()" failed to apply to 6.12-stable tree

by gregkh＠linuxfoundation.org

The patch below does not apply to the 6.12-stable tree. If someone wants it applied there, or to any other stable or longterm tree, then please email the backport, including the original git commit id to <stable(a)vger.kernel.org>. To reproduce the conflict and resubmit, you may use the following commands: git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.12.y git checkout FETCH_HEAD git cherry-pick -x 90abee6d7895d5eef18c91d870d8168be4e76e9d # <resolve conflicts, build, test, etc.> git commit -s git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025042150-hardiness-hunting-0780@gregkh' --subject-prefix 'PATCH 6.12.y' HEAD^.. Possible dependencies: thanks, greg k-h ------------------ original commit in Linus's tree ------------------ From 90abee6d7895d5eef18c91d870d8168be4e76e9d Mon Sep 17 00:00:00 2001 From: Johannes Weiner <hannes(a)cmpxchg.org> Date: Mon, 7 Apr 2025 14:01:53 -0400 Subject: [PATCH] mm: page_alloc: speed up fallbacks in rmqueue_bulk() MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit The test robot identified c2f6ea38fc1b ("mm: page_alloc: don't steal single pages from biggest buddy") as the root cause of a 56.4% regression in vm-scalability::lru-file-mmap-read. Carlos reports an earlier patch, c0cd6f557b90 ("mm: page_alloc: fix freelist movement during block conversion"), as the root cause for a regression in worst-case zone->lock+irqoff hold times. Both of these patches modify the page allocator's fallback path to be less greedy in an effort to stave off fragmentation. The flip side of this is that fallbacks are also less productive each time around, which means the fallback search can run much more frequently. Carlos' traces point to rmqueue_bulk() specifically, which tries to refill the percpu cache by allocating a large batch of pages in a loop. It highlights how once the native freelists are exhausted, the fallback code first scans orders top-down for whole blocks to claim, then falls back to a bottom-up search for the smallest buddy to steal. For the next batch page, it goes through the same thing again. This can be made more efficient. Since rmqueue_bulk() holds the zone->lock over the entire batch, the freelists are not subject to outside changes; when the search for a block to claim has already failed, there is no point in trying again for the next page. Modify __rmqueue() to remember the last successful fallback mode, and restart directly from there on the next rmqueue_bulk() iteration. Oliver confirms that this improves beyond the regression that the test robot reported against c2f6ea38fc1b: commit: f3b92176f4 ("tools/selftests: add guard region test for /proc/$pid/pagemap") c2f6ea38fc ("mm: page_alloc: don't steal single pages from biggest buddy") acc4d5ff0b ("Merge tag 'net-6.15-rc0' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net") 2c847f27c3 ("mm: page_alloc: speed up fallbacks in rmqueue_bulk()") <--- your patch f3b92176f4f7100f c2f6ea38fc1b640aa7a2e155cc1 acc4d5ff0b61eb1715c498b6536 2c847f27c37da65a93d23c237c5 ---------------- --------------------------- --------------------------- --------------------------- %stddev %change %stddev %change %stddev %change %stddev \ | \ | \ | \ 25525364 ± 3% -56.4% 11135467 -57.8% 10779336 +31.6% 33581409 vm-scalability.throughput Carlos confirms that worst-case times are almost fully recovered compared to before the earlier culprit patch: 2dd482ba627d (before freelist hygiene): 1ms c0cd6f557b90 (after freelist hygiene): 90ms next-20250319 (steal smallest buddy): 280ms this patch : 8ms [jackmanb(a)google.com: comment updates] Link: https://lkml.kernel.org/r/D92AC0P9594X.3BML64MUKTF8Z@google.com [hannes(a)cmpxchg.org: reset rmqueue_mode in rmqueue_buddy() error loop, per Yunsheng Lin] Link: https://lkml.kernel.org/r/20250409140023.GA2313@cmpxchg.org Link: https://lkml.kernel.org/r/20250407180154.63348-1-hannes@cmpxchg.org Fixes: c0cd6f557b90 ("mm: page_alloc: fix freelist movement during block conversion") Fixes: c2f6ea38fc1b ("mm: page_alloc: don't steal single pages from biggest buddy") Signed-off-by: Johannes Weiner <hannes(a)cmpxchg.org> Signed-off-by: Brendan Jackman <jackmanb(a)google.com> Reported-by: kernel test robot <oliver.sang(a)intel.com> Reported-by: Carlos Song <carlos.song(a)nxp.com> Tested-by: Carlos Song <carlos.song(a)nxp.com> Tested-by: kernel test robot <oliver.sang(a)intel.com> Closes: https://lore.kernel.org/oe-lkp/202503271547.fc08b188-lkp@intel.com Reviewed-by: Brendan Jackman <jackmanb(a)google.com> Tested-by: Shivank Garg <shivankg(a)amd.com> Acked-by: Zi Yan <ziy(a)nvidia.com> Reviewed-by: Vlastimil Babka <vbabka(a)suse.cz> Cc: <stable(a)vger.kernel.org> [6.10+] Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org> diff --git a/mm/page_alloc.c b/mm/page_alloc.c index 9a219fe8e130..1715e34b91af 100644 --- a/mm/page_alloc.c +++ b/mm/page_alloc.c @@ -2183,23 +2183,15 @@ try_to_claim_block(struct zone *zone, struct page *page, } /* - * Try finding a free buddy page on the fallback list. - * - * This will attempt to claim a whole pageblock for the requested type - * to ensure grouping of such requests in the future. - * - * If a whole block cannot be claimed, steal an individual page, regressing to - * __rmqueue_smallest() logic to at least break up as little contiguity as - * possible. + * Try to allocate from some fallback migratetype by claiming the entire block, + * i.e. converting it to the allocation's start migratetype. * * The use of signed ints for order and current_order is a deliberate * deviation from the rest of this file, to make the for loop * condition simpler. - * - * Return the stolen page, or NULL if none can be found. */ static __always_inline struct page * -__rmqueue_fallback(struct zone *zone, int order, int start_migratetype, +__rmqueue_claim(struct zone *zone, int order, int start_migratetype, unsigned int alloc_flags) { struct free_area *area; @@ -2237,14 +2229,29 @@ __rmqueue_fallback(struct zone *zone, int order, int start_migratetype, page = try_to_claim_block(zone, page, current_order, order, start_migratetype, fallback_mt, alloc_flags); - if (page) - goto got_one; + if (page) { + trace_mm_page_alloc_extfrag(page, order, current_order, + start_migratetype, fallback_mt); + return page; + } } - if (alloc_flags & ALLOC_NOFRAGMENT) - return NULL; + return NULL; +} + +/* + * Try to steal a single page from some fallback migratetype. Leave the rest of + * the block as its current migratetype, potentially causing fragmentation. + */ +static __always_inline struct page * +__rmqueue_steal(struct zone *zone, int order, int start_migratetype) +{ + struct free_area *area; + int current_order; + struct page *page; + int fallback_mt; + bool claim_block; - /* No luck claiming pageblock. Find the smallest fallback page */ for (current_order = order; current_order < NR_PAGE_ORDERS; current_order++) { area = &(zone->free_area[current_order]); fallback_mt = find_suitable_fallback(area, current_order, @@ -2254,25 +2261,28 @@ __rmqueue_fallback(struct zone *zone, int order, int start_migratetype, page = get_page_from_free_area(area, fallback_mt); page_del_and_expand(zone, page, order, current_order, fallback_mt); - goto got_one; + trace_mm_page_alloc_extfrag(page, order, current_order, + start_migratetype, fallback_mt); + return page; } return NULL; - -got_one: - trace_mm_page_alloc_extfrag(page, order, current_order, - start_migratetype, fallback_mt); - - return page; } +enum rmqueue_mode { + RMQUEUE_NORMAL, + RMQUEUE_CMA, + RMQUEUE_CLAIM, + RMQUEUE_STEAL, +}; + /* * Do the hard work of removing an element from the buddy allocator. * Call me with the zone->lock already held. */ static __always_inline struct page * __rmqueue(struct zone *zone, unsigned int order, int migratetype, - unsigned int alloc_flags) + unsigned int alloc_flags, enum rmqueue_mode *mode) { struct page *page; @@ -2291,16 +2301,48 @@ __rmqueue(struct zone *zone, unsigned int order, int migratetype, } } - page = __rmqueue_smallest(zone, order, migratetype); - if (unlikely(!page)) { - if (alloc_flags & ALLOC_CMA) + /* + * First try the freelists of the requested migratetype, then try + * fallbacks modes with increasing levels of fragmentation risk. + * + * The fallback logic is expensive and rmqueue_bulk() calls in + * a loop with the zone->lock held, meaning the freelists are + * not subject to any outside changes. Remember in *mode where + * we found pay dirt, to save us the search on the next call. + */ + switch (*mode) { + case RMQUEUE_NORMAL: + page = __rmqueue_smallest(zone, order, migratetype); + if (page) + return page; + fallthrough; + case RMQUEUE_CMA: + if (alloc_flags & ALLOC_CMA) { page = __rmqueue_cma_fallback(zone, order); - - if (!page) - page = __rmqueue_fallback(zone, order, migratetype, - alloc_flags); + if (page) { + *mode = RMQUEUE_CMA; + return page; + } + } + fallthrough; + case RMQUEUE_CLAIM: + page = __rmqueue_claim(zone, order, migratetype, alloc_flags); + if (page) { + /* Replenished preferred freelist, back to normal mode. */ + *mode = RMQUEUE_NORMAL; + return page; + } + fallthrough; + case RMQUEUE_STEAL: + if (!(alloc_flags & ALLOC_NOFRAGMENT)) { + page = __rmqueue_steal(zone, order, migratetype); + if (page) { + *mode = RMQUEUE_STEAL; + return page; + } + } } - return page; + return NULL; } /* @@ -2312,6 +2354,7 @@ static int rmqueue_bulk(struct zone *zone, unsigned int order, unsigned long count, struct list_head *list, int migratetype, unsigned int alloc_flags) { + enum rmqueue_mode rmqm = RMQUEUE_NORMAL; unsigned long flags; int i; @@ -2323,7 +2366,7 @@ static int rmqueue_bulk(struct zone *zone, unsigned int order, } for (i = 0; i < count; ++i) { struct page *page = __rmqueue(zone, order, migratetype, - alloc_flags); + alloc_flags, &rmqm); if (unlikely(page == NULL)) break; @@ -2948,7 +2991,9 @@ struct page *rmqueue_buddy(struct zone *preferred_zone, struct zone *zone, if (alloc_flags & ALLOC_HIGHATOMIC) page = __rmqueue_smallest(zone, order, MIGRATE_HIGHATOMIC); if (!page) { - page = __rmqueue(zone, order, migratetype, alloc_flags); + enum rmqueue_mode rmqm = RMQUEUE_NORMAL; + + page = __rmqueue(zone, order, migratetype, alloc_flags, &rmqm); /* * If the allocation fails, allow OOM handling and

7 months, 2 weeks

2
2
0 0

[PATCH v2 0/7] accel/ivpu: Add context violation handling for 6.12

by Jacek Lawrynowicz

These patchset adds support for VPU_JSM_STATUS_MVNCI_CONTEXT_VIOLATION_HW message added in recent VPU firmware. Without it the driver will not be able to process any jobs after this message is received and would need to be reloaded. Most patches are as-is from upstream besides these two: - Fix locking order in ivpu_job_submit - Abort all jobs after command queue unregister Both these patches need to be rebased because of missing new CMDQ UAPI changes that should not be backported to stable. Changes since v1: - Documented deviations from the original upstream patches in commit messages Andrew Kreimer (1): accel/ivpu: Fix a typo Andrzej Kacprowski (1): accel/ivpu: Update VPU FW API headers Karol Wachowski (4): accel/ivpu: Use xa_alloc_cyclic() instead of custom function accel/ivpu: Abort all jobs after command queue unregister accel/ivpu: Fix locking order in ivpu_job_submit accel/ivpu: Add handling of VPU_JSM_STATUS_MVNCI_CONTEXT_VIOLATION_HW Tomasz Rusinowicz (1): accel/ivpu: Make DB_ID and JOB_ID allocations incremental drivers/accel/ivpu/ivpu_drv.c | 38 ++-- drivers/accel/ivpu/ivpu_drv.h | 9 + drivers/accel/ivpu/ivpu_job.c | 125 +++++++++--- drivers/accel/ivpu/ivpu_job.h | 1 + drivers/accel/ivpu/ivpu_jsm_msg.c | 3 +- drivers/accel/ivpu/ivpu_mmu.c | 3 +- drivers/accel/ivpu/ivpu_sysfs.c | 5 +- drivers/accel/ivpu/vpu_boot_api.h | 45 +++-- drivers/accel/ivpu/vpu_jsm_api.h | 303 +++++++++++++++++++++++++----- 9 files changed, 412 insertions(+), 120 deletions(-) -- 2.45.1

7 months, 2 weeks

2
14
0 0

[PATCH 6.14.y v2 0/7] ublk: fix race between io_uring_cmd_complete_in_task and ublk_cancel_cmd

by Jared Holzman

This patchset backports a series of ublk fixes from upstream to 6.14-stable. Patch 7 fixes the race that can cause kernel panic when ublk server daemon is exiting. It depends on patches 1-6 which simplifies & improves IO canceling when ublk server daemon is exiting as described here: https://lore.kernel.org/linux-block/20250416035444.99569-1-ming.lei@redhat.… Ming Lei (5): ublk: add helper of ublk_need_map_io() ublk: move device reset into ublk_ch_release() ublk: remove __ublk_quiesce_dev() ublk: simplify aborting ublk request ublk: fix race between io_uring_cmd_complete_in_task and ublk_cancel_cmd Uday Shankar (2): ublk: properly serialize all FETCH_REQs ublk: improve detection and handling of ublk server exit drivers/block/ublk_drv.c | 550 +++++++++++++++++++++------------------ 1 file changed, 291 insertions(+), 259 deletions(-) -- 2.43.0

7 months, 2 weeks

3
15
0 0

[PATCH v6.1] md: move initialization and destruction of 'io_acct_set' to md.c

by Yu Kuai

From: Yu Kuai <yukuai3(a)huawei.com> commit c567c86b90d4715081adfe5eb812141a5b6b4883 upstream. 'io_acct_set' is only used for raid0 and raid456, prepare to use it for raid1 and raid10, so that io accounting from different levels can be consistent. By the way, follow up patches will also use this io clone mechanism to make sure 'active_io' represents in flight io, not io that is dispatching, so that mddev_suspend will wait for io to be done as designed. Signed-off-by: Yu Kuai <yukuai3(a)huawei.com> Reviewed-by: Xiao Ni <xni(a)redhat.com> Signed-off-by: Song Liu <song(a)kernel.org> Link: https://lore.kernel.org/r/20230621165110.1498313-2-yukuai1@huaweicloud.com [Yu Kuai: This is the relied patch for commit 4a05f7ae3371 ("md/raid10: fix missing discard IO accounting"), kernel will panic while issuing discard to raid10 without this patch] Signed-off-by: Yu Kuai <yukuai3(a)huawei.com> --- drivers/md/md.c | 27 ++++++++++----------------- drivers/md/md.h | 2 -- drivers/md/raid0.c | 16 ++-------------- drivers/md/raid5.c | 41 +++++++++++------------------------------ 4 files changed, 23 insertions(+), 63 deletions(-) diff --git a/drivers/md/md.c b/drivers/md/md.c index d5fbccc72810..a9fcfcbc2d11 100644 --- a/drivers/md/md.c +++ b/drivers/md/md.c @@ -5965,6 +5965,13 @@ int md_run(struct mddev *mddev) goto exit_bio_set; } + if (!bioset_initialized(&mddev->io_acct_set)) { + err = bioset_init(&mddev->io_acct_set, BIO_POOL_SIZE, + offsetof(struct md_io_acct, bio_clone), 0); + if (err) + goto exit_sync_set; + } + spin_lock(&pers_lock); pers = find_pers(mddev->level, mddev->clevel); if (!pers || !try_module_get(pers->owner)) { @@ -6142,6 +6149,8 @@ int md_run(struct mddev *mddev) module_put(pers->owner); md_bitmap_destroy(mddev); abort: + bioset_exit(&mddev->io_acct_set); +exit_sync_set: bioset_exit(&mddev->sync_set); exit_bio_set: bioset_exit(&mddev->bio_set); @@ -6374,6 +6383,7 @@ static void __md_stop(struct mddev *mddev) percpu_ref_exit(&mddev->active_io); bioset_exit(&mddev->bio_set); bioset_exit(&mddev->sync_set); + bioset_exit(&mddev->io_acct_set); } void md_stop(struct mddev *mddev) @@ -8744,23 +8754,6 @@ void md_submit_discard_bio(struct mddev *mddev, struct md_rdev *rdev, } EXPORT_SYMBOL_GPL(md_submit_discard_bio); -int acct_bioset_init(struct mddev *mddev) -{ - int err = 0; - - if (!bioset_initialized(&mddev->io_acct_set)) - err = bioset_init(&mddev->io_acct_set, BIO_POOL_SIZE, - offsetof(struct md_io_acct, bio_clone), 0); - return err; -} -EXPORT_SYMBOL_GPL(acct_bioset_init); - -void acct_bioset_exit(struct mddev *mddev) -{ - bioset_exit(&mddev->io_acct_set); -} -EXPORT_SYMBOL_GPL(acct_bioset_exit); - static void md_end_io_acct(struct bio *bio) { struct md_io_acct *md_io_acct = bio->bi_private; diff --git a/drivers/md/md.h b/drivers/md/md.h index 4f0b48097455..1fda5e139beb 100644 --- a/drivers/md/md.h +++ b/drivers/md/md.h @@ -746,8 +746,6 @@ extern void md_error(struct mddev *mddev, struct md_rdev *rdev); extern void md_finish_reshape(struct mddev *mddev); void md_submit_discard_bio(struct mddev *mddev, struct md_rdev *rdev, struct bio *bio, sector_t start, sector_t size); -int acct_bioset_init(struct mddev *mddev); -void acct_bioset_exit(struct mddev *mddev); void md_account_bio(struct mddev *mddev, struct bio **bio); extern bool __must_check md_flush_request(struct mddev *mddev, struct bio *bio); diff --git a/drivers/md/raid0.c b/drivers/md/raid0.c index 7c6a0b4437d8..c50a7abda744 100644 --- a/drivers/md/raid0.c +++ b/drivers/md/raid0.c @@ -377,7 +377,6 @@ static void raid0_free(struct mddev *mddev, void *priv) struct r0conf *conf = priv; free_conf(mddev, conf); - acct_bioset_exit(mddev); } static int raid0_run(struct mddev *mddev) @@ -392,16 +391,11 @@ static int raid0_run(struct mddev *mddev) if (md_check_no_bitmap(mddev)) return -EINVAL; - if (acct_bioset_init(mddev)) { - pr_err("md/raid0:%s: alloc acct bioset failed.\n", mdname(mddev)); - return -ENOMEM; - } - /* if private is not null, we are here after takeover */ if (mddev->private == NULL) { ret = create_strip_zones(mddev, &conf); if (ret < 0) - goto exit_acct_set; + return ret; mddev->private = conf; } conf = mddev->private; @@ -432,15 +426,9 @@ static int raid0_run(struct mddev *mddev) ret = md_integrity_register(mddev); if (ret) - goto free; + free_conf(mddev, conf); return ret; - -free: - free_conf(mddev, conf); -exit_acct_set: - acct_bioset_exit(mddev); - return ret; } /* diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c index 4315dabd3202..6e80a439ec45 100644 --- a/drivers/md/raid5.c +++ b/drivers/md/raid5.c @@ -7770,19 +7770,12 @@ static int raid5_run(struct mddev *mddev) struct md_rdev *rdev; struct md_rdev *journal_dev = NULL; sector_t reshape_offset = 0; - int i, ret = 0; + int i; long long min_offset_diff = 0; int first = 1; - if (acct_bioset_init(mddev)) { - pr_err("md/raid456:%s: alloc acct bioset failed.\n", mdname(mddev)); + if (mddev_init_writes_pending(mddev) < 0) return -ENOMEM; - } - - if (mddev_init_writes_pending(mddev) < 0) { - ret = -ENOMEM; - goto exit_acct_set; - } if (mddev->recovery_cp != MaxSector) pr_notice("md/raid:%s: not clean -- starting background reconstruction\n", @@ -7813,8 +7806,7 @@ static int raid5_run(struct mddev *mddev) (mddev->bitmap_info.offset || mddev->bitmap_info.file)) { pr_notice("md/raid:%s: array cannot have both journal and bitmap\n", mdname(mddev)); - ret = -EINVAL; - goto exit_acct_set; + return -EINVAL; } if (mddev->reshape_position != MaxSector) { @@ -7839,15 +7831,13 @@ static int raid5_run(struct mddev *mddev) if (journal_dev) { pr_warn("md/raid:%s: don't support reshape with journal - aborting.\n", mdname(mddev)); - ret = -EINVAL; - goto exit_acct_set; + return -EINVAL; } if (mddev->new_level != mddev->level) { pr_warn("md/raid:%s: unsupported reshape required - aborting.\n", mdname(mddev)); - ret = -EINVAL; - goto exit_acct_set; + return -EINVAL; } old_disks = mddev->raid_disks - mddev->delta_disks; /* reshape_position must be on a new-stripe boundary, and one @@ -7863,8 +7853,7 @@ static int raid5_run(struct mddev *mddev) if (sector_div(here_new, chunk_sectors * new_data_disks)) { pr_warn("md/raid:%s: reshape_position not on a stripe boundary\n", mdname(mddev)); - ret = -EINVAL; - goto exit_acct_set; + return -EINVAL; } reshape_offset = here_new * chunk_sectors; /* here_new is the stripe we will write to */ @@ -7886,8 +7875,7 @@ static int raid5_run(struct mddev *mddev) else if (mddev->ro == 0) { pr_warn("md/raid:%s: in-place reshape must be started in read-only mode - aborting\n", mdname(mddev)); - ret = -EINVAL; - goto exit_acct_set; + return -EINVAL; } } else if (mddev->reshape_backwards ? (here_new * chunk_sectors + min_offset_diff <= @@ -7897,8 +7885,7 @@ static int raid5_run(struct mddev *mddev) /* Reading from the same stripe as writing to - bad */ pr_warn("md/raid:%s: reshape_position too early for auto-recovery - aborting.\n", mdname(mddev)); - ret = -EINVAL; - goto exit_acct_set; + return -EINVAL; } pr_debug("md/raid:%s: reshape will continue\n", mdname(mddev)); /* OK, we should be able to continue; */ @@ -7922,10 +7909,8 @@ static int raid5_run(struct mddev *mddev) else conf = mddev->private; - if (IS_ERR(conf)) { - ret = PTR_ERR(conf); - goto exit_acct_set; - } + if (IS_ERR(conf)) + return PTR_ERR(conf); if (test_bit(MD_HAS_JOURNAL, &mddev->flags)) { if (!journal_dev) { @@ -8125,10 +8110,7 @@ static int raid5_run(struct mddev *mddev) free_conf(conf); mddev->private = NULL; pr_warn("md/raid:%s: failed to run raid set.\n", mdname(mddev)); - ret = -EIO; -exit_acct_set: - acct_bioset_exit(mddev); - return ret; + return -EIO; } static void raid5_free(struct mddev *mddev, void *priv) @@ -8136,7 +8118,6 @@ static void raid5_free(struct mddev *mddev, void *priv) struct r5conf *conf = priv; free_conf(conf); - acct_bioset_exit(mddev); mddev->to_remove = &raid5_attrs_group; } -- 2.39.2

7 months, 2 weeks

4
3
0 0

FAILED: patch "[PATCH] drm/xe: Invalidate L3 read-only cachelines for geometry" failed to apply to 6.14-stable tree

by gregkh＠linuxfoundation.org

The patch below does not apply to the 6.14-stable tree. If someone wants it applied there, or to any other stable or longterm tree, then please email the backport, including the original git commit id to <stable(a)vger.kernel.org>. To reproduce the conflict and resubmit, you may use the following commands: git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.14.y git checkout FETCH_HEAD git cherry-pick -x e775278cd75f24a2758c28558c4e41b36c935740 # <resolve conflicts, build, test, etc.> git commit -s git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025042247-mounting-playlist-f479@gregkh' --subject-prefix 'PATCH 6.14.y' HEAD^.. Possible dependencies: thanks, greg k-h ------------------ original commit in Linus's tree ------------------ From e775278cd75f24a2758c28558c4e41b36c935740 Mon Sep 17 00:00:00 2001 From: Kenneth Graunke <kenneth(a)whitecape.org> Date: Sun, 30 Mar 2025 12:59:23 -0400 Subject: [PATCH] drm/xe: Invalidate L3 read-only cachelines for geometry streams too Historically, the Vertex Fetcher unit has not been an L3 client. That meant that, when a buffer containing vertex data was written to, it was necessary to issue a PIPE_CONTROL::VF Cache Invalidate to invalidate any VF L2 cachelines associated with that buffer, so the new value would be properly read from memory. Since Tigerlake and later, VERTEX_BUFFER_STATE and 3DSTATE_INDEX_BUFFER have included an "L3 Bypass Enable" bit which userspace drivers can set to request that the vertex fetcher unit snoop L3. However, unlike most true L3 clients, the "VF Cache Invalidate" bit continues to only invalidate the VF L2 cache - and not any associated L3 lines. To handle that, PIPE_CONTROL has a new "L3 Read Only Cache Invalidation Bit", which according to the docs, "controls the invalidation of the Geometry streams cached in L3 cache at the top of the pipe." In other words, the vertex and index buffer data that gets cached in L3 when "L3 Bypass Disable" is set. Mesa always sets L3 Bypass Disable so that the VF unit snoops L3, and whenever it issues a VF Cache Invalidate, it also issues a L3 Read Only Cache Invalidate so that both L2 and L3 vertex data is invalidated. xe is issuing VF cache invalidates too (which handles cases like CPU writes to a buffer between GPU batches). Because userspace may enable L3 snooping, it needs to issue an L3 Read Only Cache Invalidate as well. Fixes significant flickering in Firefox on Meteorlake, which was writing to vertex buffers via the CPU between batches; the missing L3 Read Only invalidates were causing the vertex fetcher to read stale data from L3. Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/4460 Fixes: 6ef3bb60557d ("drm/xe: enable lite restore") Cc: stable(a)vger.kernel.org # v6.13+ Signed-off-by: Kenneth Graunke <kenneth(a)whitecape.org> Reviewed-by: Rodrigo Vivi <rodrigo.vivi(a)intel.com> Link: https://lore.kernel.org/r/20250330165923.56410-1-rodrigo.vivi@intel.com Signed-off-by: Rodrigo Vivi <rodrigo.vivi(a)intel.com> (cherry picked from commit 61672806b579dd5a150a042ec9383be2bbc2ae7e) Signed-off-by: Lucas De Marchi <lucas.demarchi(a)intel.com> diff --git a/drivers/gpu/drm/xe/instructions/xe_gpu_commands.h b/drivers/gpu/drm/xe/instructions/xe_gpu_commands.h index a255946b6f77..8cfcd3360896 100644 --- a/drivers/gpu/drm/xe/instructions/xe_gpu_commands.h +++ b/drivers/gpu/drm/xe/instructions/xe_gpu_commands.h @@ -41,6 +41,7 @@ #define GFX_OP_PIPE_CONTROL(len) ((0x3<<29)|(0x3<<27)|(0x2<<24)|((len)-2)) +#define PIPE_CONTROL0_L3_READ_ONLY_CACHE_INVALIDATE BIT(10) /* gen12 */ #define PIPE_CONTROL0_HDC_PIPELINE_FLUSH BIT(9) /* gen12 */ #define PIPE_CONTROL_COMMAND_CACHE_INVALIDATE (1<<29) diff --git a/drivers/gpu/drm/xe/xe_ring_ops.c b/drivers/gpu/drm/xe/xe_ring_ops.c index 917fc16de866..a7582b097ae6 100644 --- a/drivers/gpu/drm/xe/xe_ring_ops.c +++ b/drivers/gpu/drm/xe/xe_ring_ops.c @@ -137,7 +137,8 @@ emit_pipe_control(u32 *dw, int i, u32 bit_group_0, u32 bit_group_1, u32 offset, static int emit_pipe_invalidate(u32 mask_flags, bool invalidate_tlb, u32 *dw, int i) { - u32 flags = PIPE_CONTROL_CS_STALL | + u32 flags0 = 0; + u32 flags1 = PIPE_CONTROL_CS_STALL | PIPE_CONTROL_COMMAND_CACHE_INVALIDATE | PIPE_CONTROL_INSTRUCTION_CACHE_INVALIDATE | PIPE_CONTROL_TEXTURE_CACHE_INVALIDATE | @@ -148,11 +149,15 @@ static int emit_pipe_invalidate(u32 mask_flags, bool invalidate_tlb, u32 *dw, PIPE_CONTROL_STORE_DATA_INDEX; if (invalidate_tlb) - flags |= PIPE_CONTROL_TLB_INVALIDATE; + flags1 |= PIPE_CONTROL_TLB_INVALIDATE; - flags &= ~mask_flags; + flags1 &= ~mask_flags; - return emit_pipe_control(dw, i, 0, flags, LRC_PPHWSP_FLUSH_INVAL_SCRATCH_ADDR, 0); + if (flags1 & PIPE_CONTROL_VF_CACHE_INVALIDATE) + flags0 |= PIPE_CONTROL0_L3_READ_ONLY_CACHE_INVALIDATE; + + return emit_pipe_control(dw, i, flags0, flags1, + LRC_PPHWSP_FLUSH_INVAL_SCRATCH_ADDR, 0); } static int emit_store_imm_ppgtt_posted(u64 addr, u64 value,

7 months, 2 weeks

3
2
0 0

[PATCH] dm: fix copying after src array boundaries

by Tudor Ambarus

The blammed commit copied to argv the size of the reallocated argv, instead of the size of the old_argv, thus reading and copying from past the old_argv allocated memory. Following BUG_ON was hit: [ 3.038929][ T1] kernel BUG at lib/string_helpers.c:1040! [ 3.039147][ T1] Internal error: Oops - BUG: 00000000f2000800 [#1] SMP ... [ 3.056489][ T1] Call trace: [ 3.056591][ T1] __fortify_panic+0x10/0x18 (P) [ 3.056773][ T1] dm_split_args+0x20c/0x210 [ 3.056942][ T1] dm_table_add_target+0x13c/0x360 [ 3.057132][ T1] table_load+0x110/0x3ac [ 3.057292][ T1] dm_ctl_ioctl+0x424/0x56c [ 3.057457][ T1] __arm64_sys_ioctl+0xa8/0xec [ 3.057634][ T1] invoke_syscall+0x58/0x10c [ 3.057804][ T1] el0_svc_common+0xa8/0xdc [ 3.057970][ T1] do_el0_svc+0x1c/0x28 [ 3.058123][ T1] el0_svc+0x50/0xac [ 3.058266][ T1] el0t_64_sync_handler+0x60/0xc4 [ 3.058452][ T1] el0t_64_sync+0x1b0/0x1b4 [ 3.058620][ T1] Code: f800865e a9bf7bfd 910003fd 941f48aa (d4210000) [ 3.058897][ T1] ---[ end trace 0000000000000000 ]--- [ 3.059083][ T1] Kernel panic - not syncing: Oops - BUG: Fatal exception Fix it by copying the size of src, and not the size of dst, as it was. Fixes: 5a2a6c428190 ("dm: always update the array size in realloc_argv on success") Cc: stable(a)vger.kernel.org Signed-off-by: Tudor Ambarus <tudor.ambarus(a)linaro.org> --- drivers/md/dm-table.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/md/dm-table.c b/drivers/md/dm-table.c index 9e175c5e0634b49b990436898f63c2b1e696febb..6dae73ee49dbb36d89341ff09556876d0973c4ff 100644 --- a/drivers/md/dm-table.c +++ b/drivers/md/dm-table.c @@ -524,9 +524,9 @@ static char **realloc_argv(unsigned int *size, char **old_argv) } argv = kmalloc_array(new_size, sizeof(*argv), gfp); if (argv) { - *size = new_size; if (old_argv) memcpy(argv, old_argv, *size * sizeof(*argv)); + *size = new_size; } kfree(old_argv); --- base-commit: 92a09c47464d040866cf2b4cd052bc60555185fb change-id: 20250506-dm-past-array-boundaries-1fe2f5a1030f Best regards, -- Tudor Ambarus <tudor.ambarus(a)linaro.org>

7 months, 2 weeks

3
2
0 0

[PATCH 6.6 0/2] PCI: imx6: Fix i.MX7D controller_id backport regression

by Ryan Matthews

Hello, This fixes an i.Mx 7D SoC PCIe regression caused by a backport mistake. The regression is broken PCIe initialization and for me a boot hang. I don't know how to organize this. I think a revert and redo best captures what's happening. To complicate things, it looks like the redo patch could also be applied to 5.4, 5.10, 5.15, and 6.1. But those versions don't have the original backport commit. Version 6.12 matches master and needs no change. One conflict resolution is needed to apply the redo patch back to versions 6.1 -> 5.15 -> 5.10. One more resolution to apply back to -> 5.4. Patches against those other versions aren't included here. -- Ryan Richard Zhu (1): PCI: imx6: Skip controller_id generation logic for i.MX7D Ryan Matthews (1): Revert "PCI: imx6: Skip controller_id generation logic for i.MX7D" drivers/pci/controller/dwc/pci-imx6.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) base-commit: 814637ca257f4faf57a73fd4e38888cce88b5911 -- 2.47.2

7 months, 2 weeks

3
10
0 0

FAILED: patch "[PATCH] btrfs: fix the inode leak in btrfs_iget()" failed to apply to 6.14-stable tree

by gregkh＠linuxfoundation.org

The patch below does not apply to the 6.14-stable tree. If someone wants it applied there, or to any other stable or longterm tree, then please email the backport, including the original git commit id to <stable(a)vger.kernel.org>. To reproduce the conflict and resubmit, you may use the following commands: git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.14.y git checkout FETCH_HEAD git cherry-pick -x 48c1d1bb525b1c44b8bdc8e7ec5629cb6c2b9fc4 # <resolve conflicts, build, test, etc.> git commit -s git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025050558-charger-crumpled-6ca4@gregkh' --subject-prefix 'PATCH 6.14.y' HEAD^.. Possible dependencies: thanks, greg k-h ------------------ original commit in Linus's tree ------------------ From 48c1d1bb525b1c44b8bdc8e7ec5629cb6c2b9fc4 Mon Sep 17 00:00:00 2001 From: Penglei Jiang <superman.xpt(a)gmail.com> Date: Mon, 21 Apr 2025 08:40:29 -0700 Subject: [PATCH] btrfs: fix the inode leak in btrfs_iget() [BUG] There is a bug report that a syzbot reproducer can lead to the following busy inode at unmount time: BTRFS info (device loop1): last unmount of filesystem 1680000e-3c1e-4c46-84b6-56bd3909af50 VFS: Busy inodes after unmount of loop1 (btrfs) ------------[ cut here ]------------ kernel BUG at fs/super.c:650! Oops: invalid opcode: 0000 [#1] SMP KASAN NOPTI CPU: 0 UID: 0 PID: 48168 Comm: syz-executor Not tainted 6.15.0-rc2-00471-g119009db2674 #2 PREEMPT(full) Hardware name: QEMU Ubuntu 24.04 PC (i440FX + PIIX, 1996), BIOS 1.16.3-debian-1.16.3-2 04/01/2014 RIP: 0010:generic_shutdown_super+0x2e9/0x390 fs/super.c:650 Call Trace: <TASK> kill_anon_super+0x3a/0x60 fs/super.c:1237 btrfs_kill_super+0x3b/0x50 fs/btrfs/super.c:2099 deactivate_locked_super+0xbe/0x1a0 fs/super.c:473 deactivate_super fs/super.c:506 [inline] deactivate_super+0xe2/0x100 fs/super.c:502 cleanup_mnt+0x21f/0x440 fs/namespace.c:1435 task_work_run+0x14d/0x240 kernel/task_work.c:227 resume_user_mode_work include/linux/resume_user_mode.h:50 [inline] exit_to_user_mode_loop kernel/entry/common.c:114 [inline] exit_to_user_mode_prepare include/linux/entry-common.h:329 [inline] __syscall_exit_to_user_mode_work kernel/entry/common.c:207 [inline] syscall_exit_to_user_mode+0x269/0x290 kernel/entry/common.c:218 do_syscall_64+0xd4/0x250 arch/x86/entry/syscall_64.c:100 entry_SYSCALL_64_after_hwframe+0x77/0x7f </TASK> [CAUSE] When btrfs_alloc_path() failed, btrfs_iget() directly returned without releasing the inode already allocated by btrfs_iget_locked(). This results the above busy inode and trigger the kernel BUG. [FIX] Fix it by calling iget_failed() if btrfs_alloc_path() failed. If we hit error inside btrfs_read_locked_inode(), it will properly call iget_failed(), so nothing to worry about. Although the iget_failed() cleanup inside btrfs_read_locked_inode() is a break of the normal error handling scheme, let's fix the obvious bug and backport first, then rework the error handling later. Reported-by: Penglei Jiang <superman.xpt(a)gmail.com> Link: https://lore.kernel.org/linux-btrfs/20250421102425.44431-1-superman.xpt@gma… Fixes: 7c855e16ab72 ("btrfs: remove conditional path allocation in btrfs_read_locked_inode()") CC: stable(a)vger.kernel.org # 6.13+ Reviewed-by: Qu Wenruo <wqu(a)suse.com> Signed-off-by: Penglei Jiang <superman.xpt(a)gmail.com> Signed-off-by: David Sterba <dsterba(a)suse.com> diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c index 312fa996a987..d295a37fa049 100644 --- a/fs/btrfs/inode.c +++ b/fs/btrfs/inode.c @@ -5682,8 +5682,10 @@ struct btrfs_inode *btrfs_iget(u64 ino, struct btrfs_root *root) return inode; path = btrfs_alloc_path(); - if (!path) + if (!path) { + iget_failed(&inode->vfs_inode); return ERR_PTR(-ENOMEM); + } ret = btrfs_read_locked_inode(inode, path); btrfs_free_path(path);

7 months, 2 weeks

4
3
0 0

2025

2024

2023

2022

2021

2020

2019

2018

2017

Linux-stable-mirror May 2025