The patch below does not apply to the 6.1-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable@vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.1.y
git checkout FETCH_HEAD
git cherry-pick -x 1b151e2435fc3a9b10c8946c6aebe9f3e1938c55
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable@vger.kernel.org>' --in-reply-to '2024012215-drainable-immortal-a01a@gregkh' --subject-prefix 'PATCH 6.1.y' HEAD^..
Possible dependencies:
1b151e2435fc ("block: Remove special-casing of compound pages")
fd363244e883 ("block: Add BIO_PAGE_PINNED and associated infrastructure")
e51bab4e2058 ("block: Replace BIO_NO_PAGE_REF with BIO_PAGE_REFFED with inverted logic")
a450f49708ea ("iomap: Don't get an reference on ZERO_PAGE for direct I/O block zeroing")
7ee4ccf57484 ("block: set FOLL_PCI_P2PDMA in bio_map_user_iov()")
80bd4a7aab4c ("blk-mq: move the srcu_struct used for quiescing to the tagset")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 1b151e2435fc3a9b10c8946c6aebe9f3e1938c55 Mon Sep 17 00:00:00 2001
From: "Matthew Wilcox (Oracle)" <willy(a)infradead.org>
Date: Mon, 14 Aug 2023 15:41:00 +0100
Subject: [PATCH] block: Remove special-casing of compound pages
The special casing was originally added in pre-git history; reproducing
the commit log here:
> commit a318a92567d77
> Author: Andrew Morton <akpm@osdl.org>
> Date: Sun Sep 21 01:42:22 2003 -0700
>
> [PATCH] Speed up direct-io hugetlbpage handling
>
> This patch short-circuits all the direct-io page dirtying logic for
> higher-order pages. Without this, we pointlessly bounce BIOs up to
> keventd all the time.
In the last twenty years, compound pages have come to be used for more
than just hugetlb. Rewrite these functions to operate on folios instead
of pages and remove the special case for hugetlbfs; I don't think
it's needed any more (and if it is, we can put it back in as a call
to folio_test_hugetlb()).
This was found by inspection; as far as I can tell, this bug can lead
to pages used as the destination of a direct I/O read not being marked
as dirty. If those pages are then reclaimed by the MM without being
dirtied for some other reason, they won't be written out. Then when
they're faulted back in, they will not contain the data they should.
It'll take a pretty unusual setup to produce this problem with several
races all going the wrong way.
This problem predates the folio work; it could for example have been
triggered by mmapping a THP in tmpfs and using that as the target of an
O_DIRECT read.
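A hedged sketch of such a trigger (file paths, sizes, and the tmpfs
huge page setup are illustrative assumptions, not taken from this
commit):

#define _GNU_SOURCE
#include <fcntl.h>
#include <stdio.h>
#include <sys/mman.h>
#include <unistd.h>

#define SZ (2UL * 1024 * 1024)	/* one PMD-sized THP */

int main(void)
{
	int tmp = open("/dev/shm/thp-target", O_RDWR | O_CREAT, 0600);
	int src = open("/tmp/source-file", O_RDONLY | O_DIRECT);

	if (tmp < 0 || src < 0 || ftruncate(tmp, SZ) < 0)
		return 1;

	void *buf = mmap(NULL, SZ, PROT_READ | PROT_WRITE, MAP_SHARED,
			 tmp, 0);
	if (buf == MAP_FAILED)
		return 1;
	madvise(buf, SZ, MADV_HUGEPAGE);	/* request THP backing */

	/*
	 * The direct I/O read lands in the THP; with the old
	 * PageCompound() check, the page could escape being dirtied,
	 * so reclaim might later drop the freshly read data.
	 */
	if (read(src, buf, SZ) < 0)
		perror("read");
	return 0;
}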
Fixes: 800d8c63b2e98 ("shmem: add huge pages support")
Cc: <stable@vger.kernel.org>
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
diff --git a/block/bio.c b/block/bio.c
index 816d412c06e9..5eba53ca953b 100644
--- a/block/bio.c
+++ b/block/bio.c
@@ -1145,13 +1145,22 @@ EXPORT_SYMBOL(bio_add_folio);
void __bio_release_pages(struct bio *bio, bool mark_dirty)
{
- struct bvec_iter_all iter_all;
- struct bio_vec *bvec;
+ struct folio_iter fi;
+
+ bio_for_each_folio_all(fi, bio) {
+ struct page *page;
+ size_t done = 0;
- bio_for_each_segment_all(bvec, bio, iter_all) {
- if (mark_dirty && !PageCompound(bvec->bv_page))
- set_page_dirty_lock(bvec->bv_page);
- bio_release_page(bio, bvec->bv_page);
+ if (mark_dirty) {
+ folio_lock(fi.folio);
+ folio_mark_dirty(fi.folio);
+ folio_unlock(fi.folio);
+ }
+ page = folio_page(fi.folio, fi.offset / PAGE_SIZE);
+ do {
+ bio_release_page(bio, page++);
+ done += PAGE_SIZE;
+ } while (done < fi.length);
}
}
EXPORT_SYMBOL_GPL(__bio_release_pages);
@@ -1439,18 +1448,12 @@ EXPORT_SYMBOL(bio_free_pages);
* bio_set_pages_dirty() and bio_check_pages_dirty() are support functions
* for performing direct-IO in BIOs.
*
- * The problem is that we cannot run set_page_dirty() from interrupt context
+ * The problem is that we cannot run folio_mark_dirty() from interrupt context
* because the required locks are not interrupt-safe. So what we can do is to
* mark the pages dirty _before_ performing IO. And in interrupt context,
* check that the pages are still dirty. If so, fine. If not, redirty them
* in process context.
*
- * We special-case compound pages here: normally this means reads into hugetlb
- * pages. The logic in here doesn't really work right for compound pages
- * because the VM does not uniformly chase down the head page in all cases.
- * But dirtiness of compound pages is pretty meaningless anyway: the VM doesn't
- * handle them at all. So we skip compound pages here at an early stage.
- *
* Note that this code is very hard to test under normal circumstances because
* direct-io pins the pages with get_user_pages(). This makes
* is_page_cache_freeable return false, and the VM will not clean the pages.
@@ -1466,12 +1469,12 @@ EXPORT_SYMBOL(bio_free_pages);
*/
void bio_set_pages_dirty(struct bio *bio)
{
- struct bio_vec *bvec;
- struct bvec_iter_all iter_all;
+ struct folio_iter fi;
- bio_for_each_segment_all(bvec, bio, iter_all) {
- if (!PageCompound(bvec->bv_page))
- set_page_dirty_lock(bvec->bv_page);
+ bio_for_each_folio_all(fi, bio) {
+ folio_lock(fi.folio);
+ folio_mark_dirty(fi.folio);
+ folio_unlock(fi.folio);
}
}
EXPORT_SYMBOL_GPL(bio_set_pages_dirty);
@@ -1515,12 +1518,11 @@ static void bio_dirty_fn(struct work_struct *work)
void bio_check_pages_dirty(struct bio *bio)
{
- struct bio_vec *bvec;
+ struct folio_iter fi;
unsigned long flags;
- struct bvec_iter_all iter_all;
- bio_for_each_segment_all(bvec, bio, iter_all) {
- if (!PageDirty(bvec->bv_page) && !PageCompound(bvec->bv_page))
+ bio_for_each_folio_all(fi, bio) {
+ if (!folio_test_dirty(fi.folio))
goto defer;
}
The patch titled
Subject: mm/hugetlb: restore the reservation if needed
has been added to the -mm mm-hotfixes-unstable branch. Its filename is
mm-hugetlb-restore-the-reservation-if-needed.patch
This patch will shortly appear at
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patche…
This patch will later appear in the mm-hotfixes-unstable branch at
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next via the mm-everything
branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there every 2-3 working days
------------------------------------------------------
From: Breno Leitao <leitao@debian.org>
Subject: mm/hugetlb: restore the reservation if needed
Date: Wed, 17 Jan 2024 09:10:57 -0800
Currently there is a bug whereby a huge page can be stolen: when the
original owner later tries to fault it back in, it gets a SIGBUS.
You can reproduce it as follows:
1) Create a single huge page:
echo 1 > /sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages
2) mmap() the page above with MAP_HUGETLB into (void *ptr1).
* This will mark the page as reserved.
3) Touch the page, which causes a page fault and allocates the page.
* This will move the page out of the free list.
* It will also unreserve the page, since there are no more free
pages.
4) madvise(MADV_DONTNEED) the page.
* This will free the page, but not mark it as reserved.
5) Allocate a second page with mmap(MAP_HUGETLB) into (void *ptr2).
* This should fail, since there is no page available.
* But, because the page freed above is not reserved, this mmap()
succeeds.
6) Faulting at ptr1 will cause a SIGBUS.
* It will try to allocate a huge page, but none is available.
A full reproducer is in selftest. See
https://lore.kernel.org/all/20240105155419.1939484-1-leitao@debian.org/
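A minimal sketch of the six steps (assumes a 2MB default huge page
size; condensed from the description above, not the actual selftest):

#define _GNU_SOURCE
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>

#define LEN (2UL * 1024 * 1024)	/* one 2MB huge page */

int main(void)
{
	/* Step 2: map one huge page; this reserves it. */
	void *ptr1 = mmap(NULL, LEN, PROT_READ | PROT_WRITE,
			  MAP_PRIVATE | MAP_ANONYMOUS | MAP_HUGETLB, -1, 0);
	if (ptr1 == MAP_FAILED)
		return 1;

	/* Step 3: fault it in; the reservation is consumed. */
	memset(ptr1, 1, LEN);

	/* Step 4: free the page without restoring the reservation. */
	if (madvise(ptr1, LEN, MADV_DONTNEED))
		return 1;

	/* Step 5: this mapping should fail, but succeeds... */
	void *ptr2 = mmap(NULL, LEN, PROT_READ | PROT_WRITE,
			  MAP_PRIVATE | MAP_ANONYMOUS | MAP_HUGETLB, -1, 0);
	if (ptr2 == MAP_FAILED)
		return 1;
	/* ...and touching it steals the only huge page. */
	memset(ptr2, 2, LEN);

	/* Step 6: refaulting ptr1 raises SIGBUS on an unfixed kernel. */
	memset(ptr1, 3, LEN);
	printf("no SIGBUS: reservation was restored\n");
	return 0;
}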
Fix this by restoring the reservation if necessary: if the VMA being
unmapped has HPAGE_RESV_OWNER set and the address still needs a
reservation, set the restore_reserve flag, which will move the page
from free to reserved.
Link: https://lkml.kernel.org/r/20240117171058.2192286-1-leitao@debian.org
Signed-off-by: Breno Leitao <leitao@debian.org>
Suggested-by: Rik van Riel <riel@surriel.com>
Cc: Lorenzo Stoakes <lstoakes@gmail.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Muchun Song <muchun.song@linux.dev>
Cc: Rik van Riel <riel@surriel.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---
mm/hugetlb.c | 10 ++++++++++
1 file changed, 10 insertions(+)
--- a/mm/hugetlb.c~mm-hugetlb-restore-the-reservation-if-needed
+++ a/mm/hugetlb.c
@@ -5677,6 +5677,16 @@ void __unmap_hugepage_range(struct mmu_g
hugetlb_count_sub(pages_per_huge_page(h), mm);
hugetlb_remove_rmap(page_folio(page));
+ if (is_vma_resv_set(vma, HPAGE_RESV_OWNER) &&
+ vma_needs_reservation(h, vma, start)) {
+ /*
+ * Restore the reservation if needed, otherwise the
+ * backing page could be stolen by someone.
+ */
+ folio_set_hugetlb_restore_reserve(page_folio(page));
+ vma_add_reservation(h, vma, address);
+ }
+
spin_unlock(ptl);
tlb_remove_page_size(tlb, page, huge_page_size(h));
/*
_
Patches currently in -mm which might be from leitao@debian.org are
mm-hugetlb-restore-the-reservation-if-needed.patch
selftests-mm-run_vmtestssh-add-hugetlb_madv_vs_map.patch
selftests-mm-new-test-that-steals-pages.patch
The patch titled
Subject: selftests/mm: ksm_tests should only MADV_HUGEPAGE valid memory
has been added to the -mm mm-hotfixes-unstable branch. Its filename is
selftests-mm-ksm_tests-should-only-madv_hugepage-valid-memory.patch
This patch will shortly appear at
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patche…
This patch will later appear in the mm-hotfixes-unstable branch at
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next via the mm-everything
branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there every 2-3 working days
------------------------------------------------------
From: Ryan Roberts <ryan.roberts@arm.com>
Subject: selftests/mm: ksm_tests should only MADV_HUGEPAGE valid memory
Date: Mon, 22 Jan 2024 12:05:54 +0000
ksm_tests previously mmapped a region of memory, aligned the returned
pointer up to a PMD boundary, and then set MADV_HUGEPAGE, but the
madvise() range extended past the end of the mmapped area because the
length did not account for the pointer alignment. Fix this behaviour.
Up until commit efa7df3e3bb5 ("mm: align larger anonymous mappings on THP
boundaries"), this buggy behavior was (usually) masked because the
alignment difference was always less than PMD-size. But since the
mentioned commit, `ksm_tests -H -s 100` started failing.
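As a hedged illustration of the corrected pattern (a hypothetical
helper, not the selftest code itself):

#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>

#define HPAGE_SIZE (2UL * 1024 * 1024)

/*
 * Map len + HPAGE_SIZE bytes, align the pointer up to a PMD boundary,
 * then madvise() only len bytes from the aligned pointer.  The bug was
 * passing len + HPAGE_SIZE to madvise(), overrunning the mapping by
 * however many bytes the alignment consumed.
 */
static void *map_pmd_aligned(size_t len)
{
	char *orig = mmap(NULL, len + HPAGE_SIZE, PROT_READ | PROT_WRITE,
			  MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (orig == MAP_FAILED)
		return NULL;

	char *aligned = (char *)(((uintptr_t)orig + HPAGE_SIZE - 1) &
				 ~(HPAGE_SIZE - 1));

	/* aligned + len <= orig + len + HPAGE_SIZE, so this stays
	 * inside the mapping. */
	if (madvise(aligned, len, MADV_HUGEPAGE))
		return NULL;
	return aligned;
}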
Link: https://lkml.kernel.org/r/20240122120554.3108022-1-ryan.roberts@arm.com
Fixes: 325254899684 ("selftests: vm: add KSM huge pages merging time test")
Signed-off-by: Ryan Roberts <ryan.roberts@arm.com>
Cc: Pedro Demarchi Gomes <pedrodemargomes@gmail.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---
tools/testing/selftests/mm/ksm_tests.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
--- a/tools/testing/selftests/mm/ksm_tests.c~selftests-mm-ksm_tests-should-only-madv_hugepage-valid-memory
+++ a/tools/testing/selftests/mm/ksm_tests.c
@@ -566,7 +566,7 @@ static int ksm_merge_hugepages_time(int
if (map_ptr_orig == MAP_FAILED)
err(2, "initial mmap");
- if (madvise(map_ptr, len + HPAGE_SIZE, MADV_HUGEPAGE))
+ if (madvise(map_ptr, len, MADV_HUGEPAGE))
err(2, "MADV_HUGEPAGE");
pagemap_fd = open("/proc/self/pagemap", O_RDONLY);
_
Patches currently in -mm which might be from ryan.roberts@arm.com are
selftests-mm-ksm_tests-should-only-madv_hugepage-valid-memory.patch
tools-mm-add-thpmaps-script-to-dump-thp-usage-info.patch
The patch titled
Subject: scs: add CONFIG_MMU dependency for vfree_atomic()
has been added to the -mm mm-hotfixes-unstable branch. Its filename is
scs-add-config_mmu-dependency-for-vfree_atomic.patch
This patch will shortly appear at
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patche…
This patch will later appear in the mm-hotfixes-unstable branch at
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next via the mm-everything
branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there every 2-3 working days
------------------------------------------------------
From: Samuel Holland <samuel.holland@sifive.com>
Subject: scs: add CONFIG_MMU dependency for vfree_atomic()
Date: Mon, 22 Jan 2024 09:52:01 -0800
The shadow call stack implementation fails to build without CONFIG_MMU:
ld.lld: error: undefined symbol: vfree_atomic
>>> referenced by scs.c
>>> kernel/scs.o:(scs_free) in archive vmlinux.a
Link: https://lkml.kernel.org/r/20240122175204.2371009-1-samuel.holland@sifive.com
Fixes: a2abe7cbd8fe ("scs: switch to vmapped shadow stacks")
Signed-off-by: Samuel Holland <samuel.holland@sifive.com>
Reviewed-by: Sami Tolvanen <samitolvanen@google.com>
Cc: Will Deacon <will@kernel.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---
arch/Kconfig | 1 +
1 file changed, 1 insertion(+)
--- a/arch/Kconfig~scs-add-config_mmu-dependency-for-vfree_atomic
+++ a/arch/Kconfig
@@ -668,6 +668,7 @@ config SHADOW_CALL_STACK
bool "Shadow Call Stack"
depends on ARCH_SUPPORTS_SHADOW_CALL_STACK
depends on DYNAMIC_FTRACE_WITH_ARGS || DYNAMIC_FTRACE_WITH_REGS || !FUNCTION_GRAPH_TRACER
+ depends on MMU
help
This option enables the compiler's Shadow Call Stack, which
uses a shadow stack to protect function return addresses from
_
Patches currently in -mm which might be from samuel.holland@sifive.com are
scs-add-config_mmu-dependency-for-vfree_atomic.patch
The patch titled
Subject: Revert "mm:vmscan: fix inaccurate reclaim during proactive reclaim"
has been added to the -mm mm-hotfixes-unstable branch. Its filename is
revert-mm-vmscan-fix-inaccurate-reclaim-during-proactive-reclaim.patch
This patch will shortly appear at
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patche…
This patch will later appear in the mm-hotfixes-unstable branch at
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next via the mm-everything
branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there every 2-3 working days
------------------------------------------------------
From: "T.J. Mercier" <tjmercier(a)google.com>
Subject: Revert "mm:vmscan: fix inaccurate reclaim during proactive reclaim"
Date: Sun, 21 Jan 2024 21:44:12 +0000
This reverts commit 0388536ac29104a478c79b3869541524caec28eb.
Proactive reclaim on the root cgroup is 10x slower after this patch when
MGLRU is enabled, and completion times for proactive reclaim on much
smaller non-root cgroups take ~30% longer (with or without MGLRU). With
root reclaim before the patch, I observe average reclaim rates of ~70k
pages/sec before try_to_free_mem_cgroup_pages starts to fail and the
nr_retries counter starts to decrement, eventually ending the proactive
reclaim attempt. After the patch the reclaim rate is consistently ~6.6k
pages/sec due to the reduced nr_pages value causing scan aborts as soon as
SWAP_CLUSTER_MAX pages are reclaimed. The proactive reclaim doesn't
complete after several minutes because try_to_free_mem_cgroup_pages is
still capable of reclaiming pages in tiny SWAP_CLUSTER_MAX page chunks and
nr_retries is never decremented.
The docs for memory.reclaim say, "the kernel can over or under reclaim
from the target cgroup" which this patch was trying to fix. Revert it
until a less costly solution is found.
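For reference, a condensed sketch of the memory_reclaim() retry loop
being reverted (simplified from the diff below; locking, signal checks,
and other details elided):

	while (nr_reclaimed < nr_to_reclaim) {
		unsigned long reclaimed;

		lru_add_drain_all();
		reclaimed = try_to_free_mem_cgroup_pages(memcg,
				min(nr_to_reclaim - nr_reclaimed, SWAP_CLUSTER_MAX),
				GFP_KERNEL, reclaim_options);
		/*
		 * With MGLRU, each capped call aborts its scan after
		 * roughly SWAP_CLUSTER_MAX pages yet still "succeeds",
		 * so nr_retries never decrements and the loop crawls.
		 */
		if (!reclaimed && !nr_retries--)
			return -EAGAIN;
		nr_reclaimed += reclaimed;
	}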
Link: https://lkml.kernel.org/r/20240121214413.833776-1-tjmercier@google.com
Fixes: 0388536ac291 ("mm:vmscan: fix inaccurate reclaim during proactive reclaim")
Signed-off-by: T.J. Mercier <tjmercier@google.com>
Cc: Efly Young <yangyifei03@kuaishou.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Muchun Song <muchun.song@linux.dev>
Cc: Roman Gushchin <roman.gushchin@linux.dev>
Cc: Shakeel Butt <shakeelb@google.com>
Cc: T.J. Mercier <tjmercier@google.com>
Cc: Yu Zhao <yuzhao@google.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---
mm/memcontrol.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
--- a/mm/memcontrol.c~revert-mm-vmscan-fix-inaccurate-reclaim-during-proactive-reclaim
+++ a/mm/memcontrol.c
@@ -6977,8 +6977,8 @@ static ssize_t memory_reclaim(struct ker
lru_add_drain_all();
reclaimed = try_to_free_mem_cgroup_pages(memcg,
- min(nr_to_reclaim - nr_reclaimed, SWAP_CLUSTER_MAX),
- GFP_KERNEL, reclaim_options);
+ nr_to_reclaim - nr_reclaimed,
+ GFP_KERNEL, reclaim_options);
if (!reclaimed && !nr_retries--)
return -EAGAIN;
_
Patches currently in -mm which might be from tjmercier@google.com are
revert-mm-vmscan-fix-inaccurate-reclaim-during-proactive-reclaim.patch
The patch below does not apply to the 5.15-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable@vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.15.y
git checkout FETCH_HEAD
git cherry-pick -x 1b151e2435fc3a9b10c8946c6aebe9f3e1938c55
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable@vger.kernel.org>' --in-reply-to '2024012218-elevator-shrug-f68a@gregkh' --subject-prefix 'PATCH 5.15.y' HEAD^..
Possible dependencies:
1b151e2435fc ("block: Remove special-casing of compound pages")
fd363244e883 ("block: Add BIO_PAGE_PINNED and associated infrastructure")
e51bab4e2058 ("block: Replace BIO_NO_PAGE_REF with BIO_PAGE_REFFED with inverted logic")
a450f49708ea ("iomap: Don't get an reference on ZERO_PAGE for direct I/O block zeroing")
7ee4ccf57484 ("block: set FOLL_PCI_P2PDMA in bio_map_user_iov()")
80bd4a7aab4c ("blk-mq: move the srcu_struct used for quiescing to the tagset")
e88811bc43b9 ("block: use on-stack page vec for <= UIO_FASTIOV")
480cb846c27b ("block: convert to advancing variants of iov_iter_get_pages{,_alloc}()")
e97424fd4472 ("block: fix leaking page ref on truncated direct io")
34cdb8c825f2 ("block: ensure bio_iov_add_page can't fail")
325347d965e7 ("block: ensure iov_iter advances for added pages")
46754bd05605 ("block: move ->bio_split to the gendisk")
5a97806f7dc0 ("block: change the blk_queue_split calling convention")
8374cfe647a1 ("Merge tag 'for-6.0/dm-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 1b151e2435fc3a9b10c8946c6aebe9f3e1938c55 Mon Sep 17 00:00:00 2001
From: "Matthew Wilcox (Oracle)" <willy(a)infradead.org>
Date: Mon, 14 Aug 2023 15:41:00 +0100
Subject: [PATCH] block: Remove special-casing of compound pages
The special casing was originally added in pre-git history; reproducing
the commit log here:
> commit a318a92567d77
> Author: Andrew Morton <akpm@osdl.org>
> Date: Sun Sep 21 01:42:22 2003 -0700
>
> [PATCH] Speed up direct-io hugetlbpage handling
>
> This patch short-circuits all the direct-io page dirtying logic for
> higher-order pages. Without this, we pointlessly bounce BIOs up to
> keventd all the time.
In the last twenty years, compound pages have come to be used for more
than just hugetlb. Rewrite these functions to operate on folios instead
of pages and remove the special case for hugetlbfs; I don't think
it's needed any more (and if it is, we can put it back in as a call
to folio_test_hugetlb()).
This was found by inspection; as far as I can tell, this bug can lead
to pages used as the destination of a direct I/O read not being marked
as dirty. If those pages are then reclaimed by the MM without being
dirtied for some other reason, they won't be written out. Then when
they're faulted back in, they will not contain the data they should.
It'll take a pretty unusual setup to produce this problem with several
races all going the wrong way.
This problem predates the folio work; it could for example have been
triggered by mmapping a THP in tmpfs and using that as the target of an
O_DIRECT read.
Fixes: 800d8c63b2e98 ("shmem: add huge pages support")
Cc: <stable@vger.kernel.org>
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
diff --git a/block/bio.c b/block/bio.c
index 816d412c06e9..5eba53ca953b 100644
--- a/block/bio.c
+++ b/block/bio.c
@@ -1145,13 +1145,22 @@ EXPORT_SYMBOL(bio_add_folio);
void __bio_release_pages(struct bio *bio, bool mark_dirty)
{
- struct bvec_iter_all iter_all;
- struct bio_vec *bvec;
+ struct folio_iter fi;
+
+ bio_for_each_folio_all(fi, bio) {
+ struct page *page;
+ size_t done = 0;
- bio_for_each_segment_all(bvec, bio, iter_all) {
- if (mark_dirty && !PageCompound(bvec->bv_page))
- set_page_dirty_lock(bvec->bv_page);
- bio_release_page(bio, bvec->bv_page);
+ if (mark_dirty) {
+ folio_lock(fi.folio);
+ folio_mark_dirty(fi.folio);
+ folio_unlock(fi.folio);
+ }
+ page = folio_page(fi.folio, fi.offset / PAGE_SIZE);
+ do {
+ bio_release_page(bio, page++);
+ done += PAGE_SIZE;
+ } while (done < fi.length);
}
}
EXPORT_SYMBOL_GPL(__bio_release_pages);
@@ -1439,18 +1448,12 @@ EXPORT_SYMBOL(bio_free_pages);
* bio_set_pages_dirty() and bio_check_pages_dirty() are support functions
* for performing direct-IO in BIOs.
*
- * The problem is that we cannot run set_page_dirty() from interrupt context
+ * The problem is that we cannot run folio_mark_dirty() from interrupt context
* because the required locks are not interrupt-safe. So what we can do is to
* mark the pages dirty _before_ performing IO. And in interrupt context,
* check that the pages are still dirty. If so, fine. If not, redirty them
* in process context.
*
- * We special-case compound pages here: normally this means reads into hugetlb
- * pages. The logic in here doesn't really work right for compound pages
- * because the VM does not uniformly chase down the head page in all cases.
- * But dirtiness of compound pages is pretty meaningless anyway: the VM doesn't
- * handle them at all. So we skip compound pages here at an early stage.
- *
* Note that this code is very hard to test under normal circumstances because
* direct-io pins the pages with get_user_pages(). This makes
* is_page_cache_freeable return false, and the VM will not clean the pages.
@@ -1466,12 +1469,12 @@ EXPORT_SYMBOL(bio_free_pages);
*/
void bio_set_pages_dirty(struct bio *bio)
{
- struct bio_vec *bvec;
- struct bvec_iter_all iter_all;
+ struct folio_iter fi;
- bio_for_each_segment_all(bvec, bio, iter_all) {
- if (!PageCompound(bvec->bv_page))
- set_page_dirty_lock(bvec->bv_page);
+ bio_for_each_folio_all(fi, bio) {
+ folio_lock(fi.folio);
+ folio_mark_dirty(fi.folio);
+ folio_unlock(fi.folio);
}
}
EXPORT_SYMBOL_GPL(bio_set_pages_dirty);
@@ -1515,12 +1518,11 @@ static void bio_dirty_fn(struct work_struct *work)
void bio_check_pages_dirty(struct bio *bio)
{
- struct bio_vec *bvec;
+ struct folio_iter fi;
unsigned long flags;
- struct bvec_iter_all iter_all;
- bio_for_each_segment_all(bvec, bio, iter_all) {
- if (!PageDirty(bvec->bv_page) && !PageCompound(bvec->bv_page))
+ bio_for_each_folio_all(fi, bio) {
+ if (!folio_test_dirty(fi.folio))
goto defer;
}